WO2011041834A1 - Reconstruction of a recorded sound field - Google Patents

Reconstruction of a recorded sound field

Info

Publication number
WO2011041834A1
Authority
WO
WIPO (PCT)
Prior art keywords
plane
hoa
matrix
domain
wave
Application number
PCT/AU2010/001312
Inventor
Craig Jin
Andre Van Schaik
Nicolas Epain
Original Assignee
The University Of Sydney
Priority claimed from AU2009904871A
Application filed by The University Of Sydney
Priority to AU2010305313A (patent AU2010305313B2)
Priority to EP10821476.8A (patent EP2486561B1)
Priority to US13/500,045 (patent US9113281B2)
Priority to JP2012532418A (patent JP5773540B2)
Publication of WO2011041834A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S7/00: Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30: Control circuits for electronic adaptation of the sound field
    • H04S1/00: Two-channel systems
    • H04S2400/00: Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S2400/15: Aspects of sound capture and related signal processing for recording or reproduction

Definitions

  • the present disclosure relates, generally, to reconstruction of a recorded sound field and, more particularly, to equipment for, and a method of, recording and then reconstructing a sound field using techniques related to at least one of compressive sensing and independent component analysis.
  • HOA HOA-constrained acoustic sensor array
  • the small sweet spot phenomenon refers to the fact that the sound field is only accurate for a small region of space.
  • Reconstructing a sound field refers, in addition to reproducing a recorded sound field, to using a set of analysis plane-wave directions to determine a set of plane-wave source signals and their associated source directions.
  • analysis is done in association with a dense set of plane-wave source directions to obtain a vector, g, of plane-wave source signals in which each entry of g is clearly matched to an associated source direction.
  • HRTFs: Head-related transfer functions
  • HRIRs: Head-related impulse responses
  • HOA-domain and HOA-domain Fourier Expansion refer to any mathematical basis set that may be used for analysis and synthesis for Higher Order Ambisonics such as the Fourier-Bessel system, circular harmonics, and so forth. Signals can be expressed in terms of their components based on their expansion in the HOA-domain mathematical basis set. When signals are expressed in terms of these components, it is said that the signals are expressed in the “HOA-domain”. Signals in the HOA-domain can be represented in both the frequency and time domain in a manner similar to other signals.
  • HOA refers to Higher Order Ambisonics which is a general term encompassing sound field representation and manipulation in the HOA-domain.
  • Compressive Sampling or “Compressed Sensing” or “Compressive Sensing” all refer to a set of techniques that analyse signals in a sparse domain (defined below).
  • “Sparsity Domain” or “Sparse Domain” is a compressive sampling term that refers to the fact that a vector of sampled observations y can be written as a matrix- vector product, e.g., as:
  • pinv refers to a pseudo-inverse, a regularised pseudo-inverse or a
  • the L2-norm of a vector x is denoted by
  • the L1-L2 norm of a matrix A is denoted by
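The two norms named above can be illustrated with a short sketch. The defining formulas are missing from this text, so the row-wise grouping used here for the L1-L2 norm (the sum over rows of each row's L2 norm) is an assumption, and the function names `l2_norm` and `l1_l2_norm` are hypothetical.

```python
import math

def l2_norm(x):
    """Euclidean (L2) norm of a vector of real or complex numbers."""
    return math.sqrt(sum(abs(v) ** 2 for v in x))

def l1_l2_norm(A):
    """Mixed L1-L2 norm: sum of the L2 norms of the rows of A.

    Grouping by rows is an assumption; the source text does not show the formula.
    """
    return sum(l2_norm(row) for row in A)

x = [3.0, 4.0]
A = [[3.0, 4.0],
     [0.0, 0.0],
     [6.0, 8.0]]
print(l2_norm(x))
print(l1_l2_norm(A))
```

A mixed norm of this kind is small when only a few rows carry energy, which is why it is commonly used to favour solutions in which only a few directions are active across a whole time frame.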
  • ICA: Independent Component Analysis, a mathematical method that provides, for example, a means to estimate a mixing matrix and an unmixing matrix for a given set of mixed signals. It also provides a set of separated source signals for the set of mixed signals.
  • the "sparsity" of a recorded sound field provides a measure of the extent to which a small number of sources dominate the sound field.
  • Dominant components of a vector or matrix refer to components of the vector or matrix that are much larger in relative value than some of the other components. For example, for a vector x, we can measure the relative value of component x_i compared to x_j by computing the ratio of x_i to x_j or the logarithm of that ratio. If the ratio or log-ratio exceeds some particular threshold value, x_i may be considered a dominant component compared to x_j.
  • “Cleaning a vector or matrix” refers to searching for dominant components (as defined above) in the vector or matrix and then modifying the vector or matrix by removing or setting to zero some of its components which are not dominant components.
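As an illustration of dominant components and cleaning, the sketch below applies a log-ratio test to a vector. The base-10 logarithm, the one-decade threshold, and the rule of comparing every component with the largest-magnitude component are assumptions chosen for the example; the function names are hypothetical.

```python
import math

def is_dominant(x_i, x_j, log_threshold=1.0):
    """True if |x_i| exceeds |x_j| by more than log_threshold decades."""
    if x_j == 0:
        return x_i != 0
    if x_i == 0:
        return False
    return math.log10(abs(x_i) / abs(x_j)) > log_threshold

def clean_vector(x, log_threshold=1.0):
    """Set to zero every component that the largest component dominates."""
    peak = max(x, key=abs)
    return [0.0 if is_dominant(peak, v, log_threshold) else v for v in x]

x = [0.01, 5.0, 0.02, -4.0]
print(clean_vector(x))
```

With these inputs the two small entries are zeroed while 5.0 and -4.0 survive, because neither of the large entries exceeds the other by a full decade.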
  • “Reducing a matrix M” refers to an operation that may remove columns of M that contain all zeros and/or an operation that may remove columns that do not have a Dominant Component. Instead, “Reducing a matrix M” may refer to removing columns of the matrix M depending on some vector x. In this case, the columns of the matrix M that do not correspond to Dominant Components of the vector x are removed. Still further, “Reducing a matrix M” may refer to removing columns of the matrix M depending on some other matrix N. In this case, the columns of the matrix M must correspond somehow to the columns or rows of the matrix N. When there is this correspondence, “Reducing the matrix M” refers to removing the columns of the matrix M that correspond to columns or rows of the matrix N which do not have a Dominant Component.
  • “Expanding a matrix M” refers to an operation that may insert into the matrix M a set of columns that contains all zeros.
  • An example of when such an operation may be required is when the columns of matrix M correspond to a smaller set of basis functions and it is required to express the matrix M in a manner that is suited to a larger set of basis functions.
  • “Expanding a vector of time signals x(t)” refers to an operation that may insert into the vector of time signals x(t) , signals that contain all zeros.
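The reducing and expanding operations defined above can be sketched as follows. Matrices are stored as lists of rows and signals as lists of samples; the helper names `reduce_columns` and `expand_signals` are hypothetical, and the example treats every non-zero entry of an already cleaned vector as a dominant component, which is a simplifying assumption.

```python
def reduce_columns(M, keep):
    """Keep only the columns of M (a list of rows) whose indices are in keep."""
    return [[row[j] for j in keep] for row in M]

def expand_signals(reduced, keep, total):
    """Insert all-zero signals so that `reduced` matches a basis of size `total`."""
    length = len(reduced[0]) if reduced else 0
    full = [[0.0] * length for _ in range(total)]
    for signal, j in zip(reduced, keep):
        full[j] = list(signal)
    return full

# Cleaned vector x: entries 1 and 3 are the dominant components.
x = [0.0, 2.0, 0.0, 1.5]
keep = [j for j, v in enumerate(x) if v != 0.0]

M = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
M_reduced = reduce_columns(M, keep)

# Two time signals that belong to basis entries 1 and 3.
g_reduced = [[0.1, 0.2, 0.3],
             [0.4, 0.5, 0.6]]
g_full = expand_signals(g_reduced, keep, total=4)
print(M_reduced)
print(g_full)
```

Keeping the index list `keep` alongside the reduced data is what makes the later expansion possible, since the reduced matrix alone no longer records which basis entries were removed.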
  • FFT means a Fast Fourier Transform
  • IFFT means an Inverse Fast Fourier Transform.
  • a “baffled spherical microphone array” refers to a spherical array of microphones which are mounted on a rigid baffle, such as a solid sphere. This is in contrast to an open spherical array of microphones which does not have a baffle.
  • Time domain and frequency domain vectors are sometimes expressed using the following notation: A vector of time domain signals is written as x(t). In the frequency domain, this vector is written as x. In other words, x is the FFT of x(t). To avoid confusion with this notation, all vectors of time signals are explicitly written out as x(t).
  • Matrices and vectors are expressed using bold-type. Matrices are expressed using capital letters in bold-type and vectors are expressed using lower-case letters in bold-type.
  • a matrix of filters is expressed using a capital letter with bold-type and with an explicit time component such as M(t) when expressed in the time domain or with an explicit frequency component such as M(ω) when expressed in the frequency domain.
  • the matrix of filters is expressed in the time domain. Each entry of the matrix is then itself a finite impulse response filter.
  • the column index of the matrix M(t) is an index that corresponds to the index of some vector of time signals that is to be filtered by the matrix.
  • the row index of the matrix M(t) corresponds to the index of the group of output signals.
  • the "multiplication operator" is the convolution operator described in more detail below.
  • ⊗ is a mathematical operator which denotes convolution. It may be used to express convolution of a matrix of filters (represented as a general matrix) with a vector of time signals. For example, $y(t) = M(t) \otimes x(t)$ represents the convolution of the matrix of filters M(t) with the vector of time signals x(t).
  • x(t) may correspond to a set of microphone signals
  • y(t) may correspond to a set of HOA-domain time signals. In this case, the equation indicates that the microphone signals are
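The convolution of a matrix of filters with a vector of time signals can be sketched as below, following the row and column conventions stated above: entry M[i][j] is a finite impulse response filter, column index j selects the input signal, and row index i selects the output signal. The function names and the toy data are hypothetical, and all filters (and all signals) are assumed to share one length.

```python
def convolve(h, x):
    """Full linear convolution of FIR filter taps h with signal x."""
    y = [0.0] * (len(h) + len(x) - 1)
    for i, hv in enumerate(h):
        for j, xv in enumerate(x):
            y[i + j] += hv * xv
    return y

def filter_matrix_convolve(M, x):
    """Output i is the sum over j of M[i][j] convolved with x[j]."""
    outputs = []
    for row in M:
        acc = None
        for h, signal in zip(row, x):
            y = convolve(h, signal)
            acc = y if acc is None else [a + b for a, b in zip(acc, y)]
        outputs.append(acc)
    return outputs

# 2 x 2 matrix of two-tap filters applied to two input signals.
M = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.5, 0.5], [1.0, -1.0]]]
x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
y = filter_matrix_convolve(M, x)
print(y)
```

The structure mirrors an ordinary matrix-vector product, with each scalar multiplication replaced by a convolution.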
  • Step 1.A.2.B.1 indicates that in the first step, there is an alternative operational path A, which has a second step, which has an alternative operational path B, which has a first step.
  • equipment for reconstructing a recorded sound field including
  • a signal processing module in communication with the sensing arrangement and which processes the recorded data for the purposes of at least one of (a) estimating the sparsity of the recorded sound field and (b) obtaining plane-wave signals and their associated source directions to enable the recorded sound field to be reconstructed.
  • the sensing arrangement may comprise a microphone array.
  • the microphone array may be one of a baffled array and an open spherical microphone array.
  • the signal processing module may be configured to estimate the sparsity of the recorded data according to the method of one of aspects three and four below.
  • the signal processing module may be configured to analyse the recorded sound field, using the methods of aspects five to seven below, to obtain a set of plane-wave signals that separate the sources in the sound field and identify the source locations and allow the sound field to be reconstructed.
  • the signal processing module may be configured to modify the set of plane-wave signals to reduce unwanted artifacts such as reverberations and/or unwanted sound sources. To reduce reverberations, the signal processing module may reduce the signal values of some of the signals in the plane-wave signals. To separate sound sources in the sound field reconstruction so that the unwanted sound sources can be reduced, the signal processing module may be operative to set to zero some of the signals in the set of plane-wave signals.
  • the equipment may include a playback device for playing back the reconstructed sound field.
  • the playback device may be one of a loudspeaker array and headphones.
  • the signal processing module may be operative to modify the recorded data depending on which playback device is to be used for playing back the reconstructed sound field.
  • a method of reconstructing a recorded sound field including
  • the method may include recording a time frame of audio of the sound field to obtain the recorded data in the form of a set of signals, s_mic(t), using an acoustic sensing arrangement.
  • the acoustic sensing arrangement comprises a microphone array.
  • the microphone array may be a baffled or open spherical microphone array.
  • the method may include estimating the sparsity of the recorded sound field by applying ICA in an HOA-domain to calculate the sparsity of the recorded sound field.
  • the method may include analysing the recorded sound field in the HOA domain to obtain a vector of HOA-domain time signals, b_HOA(t), and computing from b_HOA(t) a mixing matrix, M_ICA, using signal processing techniques.
  • the method may include using instantaneous Independent Component Analysis as the signal processing technique.
  • the method may include projecting the mixing matrix, M_ICA, on the HOA direction vectors associated with a set of plane-wave basis directions by computing $V_{source} = \hat{Y}_{plw-HOA}^{T} M_{ICA}$, where $\hat{Y}_{plw-HOA}^{T}$ is the transpose (Hermitian conjugate) of the real-valued (complex-valued) HOA direction matrix associated with the plane-wave basis directions and the hat-operator on $\hat{Y}_{plw-HOA}$ indicates it has been truncated to an HOA-order M.
  • the method may include estimating the sparsity, S, of the recorded data by first determining the number, N_source, of dominant plane-wave directions represented by V_source and then computing S from N_source and N_plw, where N_plw is the number of analysis plane-wave basis directions.
  • the method may include estimating the sparsity of the recorded sound field by analysing recorded data using compressed sensing or convex optimization techniques to calculate the sparsity of the recorded sound field.
  • the method may include analysing the recorded sound field in the HOA domain to obtain a vector of HOA-domain time signals, b_HOA(t), and sampling the vector of HOA-domain time signals over a given time frame, L, to obtain a collection of time samples at time instances t_1 to t_N, giving a set of HOA-domain vectors at each time instant, b_HOA(t_1), b_HOA(t_2), ..., b_HOA(t_N), expressed as a matrix, B_HOA, by: $B_{HOA} = [\, b_{HOA}(t_1) \;\; b_{HOA}(t_2) \;\; \dots \;\; b_{HOA}(t_N) \,]$
  • the method may include applying singular value decomposition to B_HOA to obtain a matrix decomposition: $B_{HOA} = U S V^{T}$
  • the method may include forming a matrix S_reduced by keeping only the first m columns of S, where m is the number of rows of B_HOA and forming a matrix, ⁇ , given by
  • the method may include solving the following convex programming problem for a matrix ⁇ :
  • Y_plw is the matrix (truncated to a high spherical harmonic order) whose columns are the values of the spherical harmonic functions for the set of directions corresponding to some set of analysis plane waves
  • ε_1 is a non-negative real number.
  • the method may include obtaining G_plw from ⁇ using:
  • V^T is obtained from the matrix decomposition of B_HOA.
  • the method may include obtaining an unmixing matrix for the L-th time frame, by calculating:
  • the method may include obtaining G_plw-smooth using:
  • the method may include obtaining the vector of plane-wave signals, g_plw-cs(t), from the collection of plane-wave time samples, G_plw-smooth, using standard overlap-add techniques. Instead, when obtaining the vector of plane-wave signals, the method may include obtaining g_plw-cs(t) from the collection of plane-wave time samples, G_plw, without smoothing, using standard overlap-add techniques.
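The overlap-add step mentioned above can be sketched as follows. The sketch assumes that each time frame has already been windowed so that overlapping windows sum to one, and that consecutive frames are offset by a fixed hop size; the hop size, the toy frames, and the function names are hypothetical.

```python
def overlap_add(frames, hop):
    """Sum equal-length frames placed `hop` samples apart into one signal."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for k, frame in enumerate(frames):
        for n, v in enumerate(frame):
            out[k * hop + n] += v
    return out

def overlap_add_matrix(G_frames, hop):
    """Apply overlap-add row by row.

    G_frames[k][i] holds the time samples of plane wave i in frame k.
    """
    n_rows = len(G_frames[0])
    return [overlap_add([G[i] for G in G_frames], hop) for i in range(n_rows)]

# Two frames of two plane-wave signals, four samples each, 50 % overlap.
G_frames = [
    [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]],
    [[2.0, 2.0, 2.0, 2.0], [1.0, 1.0, 1.0, 1.0]],
]
g = overlap_add_matrix(G_frames, hop=2)
print(g)
```

The overlapping region of the first row sums to 3.0, which shows how contributions from adjacent frames are combined sample by sample.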
  • the method may include estimating the sparsity of the recorded data by first computing the number, N comp , of dominant components of and then
  • the method may include reconstructing the recorded sound field, using frequency-domain techniques to analyse the recorded data in the sparse domain; and obtaining the plane-wave signals from the frequency-domain techniques to enable the recorded sound field to be reconstructed.
  • the method may include transforming the set of signals, to the frequency domain using an FFT to obtain
  • the method may include analysing the recorded sound field in the frequency domain using plane-wave analysis to produce a vector of plane-wave amplitudes,
  • the method may include conducting the plane-wave analysis of the recorded sound field by solving the following convex programming problem for the vector of plane-wave amplitudes,
  • T_plw/mic is a transfer matrix between plane-waves and the microphones
  • s_mic is the set of signals recorded by the microphone array
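The convex programme itself is missing from this text, so the following is only a rough illustration of sparse plane-wave analysis, not the method as claimed. It swaps in a different but related technique: the penalised least-squares form with an L1 penalty, solved by iterative soft-thresholding (ISTA), on a real-valued toy problem. The matrix `T`, the penalty weight `lam`, the step size, and the function names are all assumptions made for the example.

```python
import math

def soft_threshold(v, t):
    """Shrink v towards zero by t (proximal operator of the L1 norm)."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def ista(T, s, lam, step, iterations=5000):
    """Minimise 0.5 * ||T g - s||^2 + lam * ||g||_1 by iterative soft-thresholding."""
    n_rows, n_cols = len(T), len(T[0])
    g = [0.0] * n_cols
    for _ in range(iterations):
        residual = [sum(T[i][j] * g[j] for j in range(n_cols)) - s[i]
                    for i in range(n_rows)]
        grad = [sum(T[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_cols)]
        g = [soft_threshold(g[j] - step * grad[j], step * lam)
             for j in range(n_cols)]
    return g

# Toy transfer matrix: 2 sensors, 3 candidate plane-wave directions.
r = math.sqrt(0.5)
T = [[1.0, r, 0.0],
     [0.0, r, 1.0]]
s = [r, r]  # sensor data produced by a unit source in direction 1 only

# step = 0.5 is the reciprocal of the largest eigenvalue of T^T T for this T.
g = ista(T, s, lam=0.01, step=0.5)
print(g)
```

In this toy case the sensor data could also be explained by two weaker sources in directions 0 and 2, but the L1 penalty makes the single-source explanation cheaper, so the estimate concentrates on direction 1.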
  • the method may include conducting the plane-wave analysis of the recorded sound field by solving the following convex programming problem for the vector of plane-wave amplitudes, g_plw-cs:
  • T_plw/mic is a transfer matrix between the plane-waves and the microphones
  • s_mic is the set of signals recorded by the microphone array
  • ε_1 is a non-negative real number
  • T_plw/HOA is a transfer matrix between the plane-waves and the HOA-domain Fourier expansion
  • ε_2 is a non-negative real number.
  • the method may include conducting the plane-wave analysis of the recorded sound field by solving the following convex programming problem for the vector of plane-wave amplitudes, g_plw-cs:
  • T_plw/mic is a transfer matrix between plane-waves and the microphones
  • T_mic/HOA is a transfer matrix between the microphones and the HOA-domain Fourier expansion
  • the method may include conducting the plane-wave analysis of the recorded sound field by solving the following convex programming problem for the vector of plane-wave amplitudes, g_plw-cs:
  • T_plw/mic is a transfer matrix between plane-waves and the microphones
  • T_plw/HOA is a transfer matrix between the plane-waves and the HOA-domain
  • ε_2 is a non-negative real number.
  • the method may include setting ε_1 based on the resolution of the spatial division of a set of directions corresponding to the set of analysis plane-waves and setting the value of ε_2 based on the computed sparsity of the sound field. Further, the method may include transforming g_plw-cs back to the time-domain using an inverse FFT to obtain g_plw-cs(t).
  • the method may include using a time domain technique to analyse recorded data in the sparse domain and obtaining parameters generated from the selected time domain technique to enable the recorded sound field to be reconstructed.
  • the method may include analysing the recorded sound field in the time domain using plane-wave analysis according to a set of basis plane-waves to produce a set of plane-wave signals, g_plw-cs(t).
  • the method may include analysing the recorded sound field in the HOA domain to obtain a vector of HOA-domain time signals, b_HOA(t), and sampling the vector of HOA-domain time signals over a given time frame, L, to obtain a collection of time samples at time instances t_1 to t_N, giving a set of HOA-domain vectors at each time instant, b_HOA(t_1), b_HOA(t_2), ..., b_HOA(t_N), expressed as a matrix, B_HOA.
  • the method may include computing a correlation vector,
  • the method may include solving the following convex programming problem for a vector of plane-wave gains,
  • T_plw/HOA is a transfer matrix between the plane-waves and the HOA-domain
  • ε_1 is a non-negative real number.
  • the method may include solving the following convex programming problem for a vector of plane-wave gains, p_plw-cs:
  • T_plw/HOA is a transfer matrix between the plane-waves and the HOA-domain
  • ε_1 is a non-negative real number
  • ε_2 is a non-negative real number.
  • the method may include setting ε_1 based on the resolution of the spatial division of a set of directions corresponding to the set of analysis plane-waves and setting the value of ε_2 based on the computed sparsity of the sound field.
  • the method may include thresholding and cleaning p_plw-cs to set some of its small components to zero.
  • the method may include forming a matrix, according to the plane- wave basis and then reducing Y to by keeping only the columns
  • HOA is an HOA direction matrix for the plane-wave basis and the hat-operator on A indicates it
  • the method may include computing
  • the method may include solving the following convex programming problem for a matrix
  • Y_plw is a matrix (truncated to a high spherical harmonic order) whose columns are the values of the spherical harmonic functions for the set of directions corresponding to some set of analysis plane waves
  • ε_1 is a non-negative real number.
  • the method may include obtaining an unmixing matrix, Π_L, for the L-th time frame, by calculating:
  • Π_{L-1} refers to the unmixing matrix for the L-1 time frame
  • α is a forgetting factor between 0 and 1.
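The update formula for the unmixing matrix is missing from this text. One common way to use a forgetting factor is exponential smoothing between the previous frame's matrix and the matrix estimated for the current frame, sketched below. Which of the two terms the forgetting factor weights is an assumption, as are the function name `smooth_unmixing` and the toy matrices.

```python
def smooth_unmixing(previous, current, alpha):
    """Blend two equally sized matrices: (1 - alpha) * previous + alpha * current.

    alpha = 1 ignores the history; alpha = 0 keeps the previous matrix unchanged.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    return [[(1.0 - alpha) * p + alpha * c for p, c in zip(p_row, c_row)]
            for p_row, c_row in zip(previous, current)]

previous = [[1.0, 0.0],
            [0.0, 1.0]]
current = [[0.0, 2.0],
           [2.0, 0.0]]
smoothed = smooth_unmixing(previous, current, alpha=0.25)
print(smoothed)
```

Smoothing of this kind limits frame-to-frame jumps in the unmixing matrix, which would otherwise be audible as artifacts when the frames are recombined.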
  • the method may include applying singular value decomposition to B_HOA to obtain a matrix decomposition: $B_{HOA} = U S V^{T}$
  • the method may include forming a matrix S_reduced by keeping only the first m columns of S, where m is the number of rows of B_HOA and forming a matrix, ⁇ , given by
  • the method may include solving the following convex programming problem for a matrix ⁇ :
  • the method may include obtaining G_plw from ⁇ using:
  • V^T is obtained from the matrix decomposition of B_HOA.
  • the method may include obtaining an unmixing matrix, Π_L, for the L-th time frame, by calculating:
  • Π_{L-1} is an unmixing matrix for the L-1 time frame
  • the method may include obtaining G_plw-smooth using:
  • the method may include obtaining the vector of plane-wave signals, ,
  • the method may include obtaining, from the collection of plane-wave time samples, without smoothing using standard overlap-add techniques.
  • the method may include modifying g_plw-cs(t) to reduce unwanted artifacts such as reverberations and/or unwanted sound sources. Further, the method may include, to reduce reverberations, reducing the signal values of some of the signals in the signal vector, g_plw-cs(t). The method may include, to separate sound sources in the sound
  • the method may include modifying dependent on the means
  • the method may include modifying as follows:
  • the method may include converting back to the HOA-
  • the method may include decoding
  • the method may include modifying ) to determine headphone gains as follows:
  • 0 is a head-related impulse response matrix of filters corresponding to
  • the method may include using time-domain techniques of Independent Component Analysis (ICA) in the HOA-domain to analyse recorded data in a sparse domain, and obtaining parameters from the selected time domain technique to enable the recorded sound field to be reconstructed.
  • ICA Independent Component Analysis
  • the method may include computing from b_HOA(t) a mixing matrix, M_ICA, using signal processing techniques.
  • the method may include using instantaneous Independent Component Analysis as the signal processing technique.
  • the method may include projecting the mixing matrix, M_ICA, on the HOA direction vectors associated with a set of plane-wave basis directions by computing $V_{source} = \hat{Y}_{plw-HOA}^{T} M_{ICA}$, where $\hat{Y}_{plw-HOA}^{T}$ is the transpose (Hermitian conjugate) of the HOA direction matrix associated with the plane-wave basis directions.
  • the method may include using thresholding techniques to identify the columns of V_source that indicate a dominant source direction. These columns may be identified on the basis of having a single dominant component.
  • the method may include reducing the matrix Y
  • the method may include, for each frequency, reducing a transfer matrix, between the plane-waves and the microphones, T_plw/mic, to obtain a matrix, T_plw/mic-reduced, by removing the columns in T_plw/mic that do not correspond to dominant source directions associated with matrix V_source.
  • the method may include estimating d by computing:
  • the method may include expanding g_plw-ica-reduced(t) to obtain g_plw-ica(t) by inserting rows of time signals of zeros so that g_plw-ica(t) matches the plane-wave basis.
  • the method may include computing from b_HOA a mixing matrix, M_ICA, and a set of separated source signals, g_ica(t), using signal processing techniques.
  • the method may include using instantaneous Independent Component Analysis as the signal processing technique.
  • the method may include projecting the mixing matrix, M_ICA, on the HOA direction vectors associated with a set of plane-wave basis directions by computing $V_{source} = \hat{Y}_{plw-HOA}^{T} M_{ICA}$, where $\hat{Y}_{plw-HOA}^{T}$ is the transpose (Hermitian conjugate) of the real-valued (complex-valued) HOA direction matrix associated with the plane-wave basis directions.
  • the method may include using thresholding techniques to identify from V_source the dominant plane-wave directions. Further, the method may include cleaning g_ica(t) to obtain a vector which retains the signals corresponding to the dominant plane-wave directions.
  • the method may include modifying g_plw-ica(t) to reduce unwanted artifacts such as reverberations and/or unwanted sound sources.
  • the method may include, to reduce reverberations, reducing the signal values of some of the signals in the signal vector, g_plw-ica(t). Further, the method may include, to separate sound sources in the sound field reconstruction so that the unwanted sound sources can be reduced, setting to zero some of the signals in the signal vector, g_plw-ica(t).
  • the method may include modifying g_plw-ica(t) dependent on the means of playback of the reconstructed sound field.
  • the method may include modifying g_plw-ica(t) as follows:
  • P_plw/spk is a loudspeaker panning matrix.
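The equation that applies the loudspeaker panning matrix is missing from this text. The sketch below assumes the simplest reading: each loudspeaker feed is an instantaneous weighted sum of the plane-wave signals, with the weights taken from the panning matrix. The matrix values, the three-loudspeaker layout, and the function name `pan_to_loudspeakers` are hypothetical.

```python
def pan_to_loudspeakers(P, g):
    """Mix plane-wave signals into loudspeaker feeds.

    P[k][j] is the gain from plane wave j to loudspeaker k;
    g[j] is the list of time samples of plane wave j.
    """
    n_samples = len(g[0])
    return [[sum(P[k][j] * g[j][n] for j in range(len(g)))
             for n in range(n_samples)]
            for k in range(len(P))]

# Two plane waves rendered over three loudspeakers; the middle
# loudspeaker shares both sources equally.
P = [[1.0, 0.0],
     [0.5, 0.5],
     [0.0, 1.0]]
g = [[1.0, 2.0],
     [3.0, 4.0]]
s_spk = pan_to_loudspeakers(P, g)
print(s_spk)
```

If the panning were frequency dependent, the scalar gains would become filters and the product would become the convolution of a matrix of filters with the signal vector.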
  • the method may include converting g_plw-ica(t) back to the HOA-domain by computing:
  • Ŷ_plw-HOA is an HOA direction matrix for a plane-wave basis and the hat-operator on Ŷ_plw-HOA indicates it has been truncated to some HOA-order M.
  • the method may include decoding using HOA
  • the method may include modifying g_plw-cs(t) to determine headphone gains as follows:
  • P_plw/hph(t) is a head-related impulse response matrix of filters corresponding to the set of plane wave directions.
  • the disclosure extends to a computer when programmed to perform the method as described above.
  • the disclosure also extends to a computer readable medium to enable a computer to perform the method as described above.
  • Fig. 1 shows a block diagram of an embodiment of equipment for reconstructing a recorded sound field and also estimating the sparsity of the recorded sound field;
  • Figs. 2-5 show flow charts of the steps involved in estimating the sparsity of a recorded sound field using the equipment of Fig. 1;
  • Figs. 6-23 show flow charts of embodiments of reconstructing a recorded sound field using the equipment of Fig. 1;
  • Figs. 24A-24C show a first example of, respectively, a photographic representation of an HOA solution to reconstructing a recorded sound field, the original sound field and the solution offered by the present disclosure.
  • Figs. 25A-25C show a second example of, respectively, a photographic representation of an HOA solution to reconstructing a recorded sound field, the original sound field and the solution offered by the present disclosure.
  • reference numeral 10 generally designates an embodiment of equipment for reconstructing a recorded sound field and/or estimating the sparsity of the sound field.
  • the equipment 10 includes a sensing arrangement 12 for measuring the sound field to obtain recorded data.
  • the sensing arrangement 12 is connected to a signal processing module 14, such as a microprocessor, which processes the recorded data to obtain plane-wave signals enabling the recorded sound field to be reconstructed and/or processes the recorded data to obtain the sparsity of the sound field.
  • the sparsity of the sound field, the separated plane-wave sources and their associated source directions are provided via an output port 24.
  • the signal processing module 14 is referred to below, for the sake of conciseness, as the SPM 14.
  • a data accessing module 16 is connected to the SPM 14.
  • the data accessing module 16 is a memory module in which data are stored.
  • the SPM 14 accesses the memory module to retrieve the required data from the memory module as and when required.
  • the data accessing module 16 is a connection module, such as, for example, a modem or the like, to enable the SPM 14 to retrieve the data from a remote location.
  • the equipment 10 includes a playback module 18 for playing back the reconstructed sound field.
  • the playback module 18 comprises a loudspeaker array 20 and/or one or more headphones 22.
  • the sensing arrangement 12 is a baffled spherical microphone array for recording the sound field to produce recorded data in the form of a set of signals, s_mic(t).
  • the SPM 14 analyses the recorded data relating to the sound field using plane-wave analysis to produce a vector of plane-wave signals, g_plw(t).
  • Producing the vector of plane-wave signals, g_plw(t), is to be understood as also obtaining the associated set of plane-wave source directions.
  • g_plw(t) is referred to more specifically as g_plw-cs(t) if Compressed Sensing techniques are used or g_plw-ica(t) if ICA techniques are used.
  • the SPM 14 is also used to modify g_plw(t), if desired.
  • Once the SPM 14 has performed its analysis, it produces output data for the output port 24 which may include the sparsity of the sound field, the separated plane-wave source signals and the associated source directions of the plane-wave source signals. In addition, once the SPM 14 has performed its analysis, it generates signals, s_out(t), for rendering the determined g_plw(t) as audio to be replayed over the loudspeaker array 20 and/or the one or more headphones 22.
  • the SPM 14 performs a series of operations on the set of signals, s_mic(t), after the signals have been recorded by the microphone array 12, to enable the signals to be reconstructed into a sound field closely approximating the recorded sound field.
  • a set of matrices that characterise the microphone array 12 are defined. These matrices may be computed as needed by the SPM 14 or may be retrieved as needed from data storage using the data accessing module 16. When one of these matrices is referred to, it will be described as “one of the defined matrices”.
  • j_m is the spherical Bessel function of order m
  • j'_m and h_m^(2)' are the derivatives of j_m and h_m^(2), respectively.
  • the hat-operator on W mic indicates that it has been truncated to some order M.
  • T_sph/mic is similar to T̂_sph/mic except it has been truncated to a much higher order
  • Y_plw is the matrix (truncated to the higher order M') whose columns are the values of the spherical harmonic functions for the set of directions corresponding to some set of analysis plane waves.
  • Ŷ_plw is similar to Y_plw except it has been truncated to the lower order
  • T_plw/HOA is a transfer matrix between the analysis plane waves and the HOA-estimated spherical harmonic expansion (derived from the microphone array 12) as:
  • T_plw/mic is a transfer matrix between the analysis plane waves and the microphone array 12 as:
  • T_sph/mic is as defined above.
  • E_mic/HOA(t) is a matrix of filters that implements, via a convolution operation, the transformation between the time signals of the microphone array 12 and the HOA-domain time signals and is defined as:
  • the operations performed on the set of signals, s_mic(t), are now described with reference to the flow charts illustrated in Figs. 2-16 of the drawings.
  • the flow chart shown in Fig. 2 provides an overview of the flow of operations to estimate the sparsity, S, of a recorded sound field. This flow chart is broken down into higher levels of detail in Figs. 3-5.
  • the flow chart shown in Fig. 6 provides an overview of the flow of operations to reconstruct a recorded sound field.
  • the flow chart of Fig. 6 is broken down into higher levels of detail in Figs. 7-16.
  • the SPM 14 calculates a vector of HOA-domain time signals b_HOA(t) as:
  • At Step 2.2, there are two different options available: Step 2.2.A and Step 2.2.B.
  • At Step 2.2.A, the SPM 14 estimates the sparsity of the sound field by applying ICA in the HOA-domain. Instead, at Step 2.2.B, the SPM 14 estimates the sparsity of the sound field using a Compressed Sampling technique.
  • At Step 2.2.A.1, the SPM 14 determines a mixing matrix, M_ICA, using Independent Component Analysis techniques.
  • the SPM 14 projects the mixing matrix, M_ICA, on the HOA direction vectors associated with a set of plane-wave basis directions. This projection is obtained by computing V_source from M_ICA and the transpose of one of the defined matrices.
  • V_source is a matrix which is ideally composed of columns which either have all components as zero or contain a single dominant component corresponding to a specific plane wave direction with the rest of the column's components being zero. Thresholding techniques are applied to ensure that V_source takes its ideal format. That is to say, columns of V_source which contain a dominant value compared to the rest of the column's components are thresholded so that all components less than the dominant component are set to zero. Also, columns of V_source which do not have a dominant component have all of their components set to zero.
  • the SPM 14 computes the sparsity of the sound field. It does this by calculating the number, N_source, of dominant plane wave directions in V_source-clean.
  • the SPM 14 then computes the sparsity, S, of the sound field from N_source and N_plw, where N_plw is the number of analysis plane-wave basis directions.
  • At Step 2.2.B.1, the SPM 14 calculates the matrix B_HOA from the vector of HOA signals b_HOA(t) by setting each signal in b_HOA(t) to run along the rows of B_HOA, so that time runs along the rows of the matrix B_HOA and the various HOA orders run along the columns of the matrix B_HOA. More specifically, the SPM 14 samples b_HOA(t) over a given time frame, labelled by L, to obtain a collection of time samples at the time instances t_1 to t_N. The SPM 14 thus obtains a set of HOA-domain vectors at each time instant: b_HOA(t_1), b_HOA(t_2), ..., b_HOA(t_N).
  • the SPM 14 calculates a correlation vector, γ, as
  • b_omni is the omni-directional HOA-component of b_HOA(t) expressed as a
  • the SPM 14 solves the following convex programming problem to obtain the vector of plane-wave gains, p_plw-cs:
  • T_plw/HOA is one of the defined matrices and ε_1 is a non-negative real number.
  • the SPM 14 estimates the sparsity of the sound field. It does this by applying a thresholding technique to p_plw-cs in order to estimate the number,
  • the SPM 14 then computes the sparsity, S, of the sound field, where N_plw is the number of analysis plane-wave basis directions.
  • Step 1 and Step 2 are the same as in the flow chart of Fig. 2 which has been described above. However, in the operational flow of Fig. 6, Step 2 is optional and is therefore represented by a dashed box.
  • the SPM 14 estimates the parameters, in the form of plane-wave signals g_plw(t), that allow the sound field to be reconstructed.
  • the plane-wave signals are expressed either as g_plw-cs(t) or g_plw-ica(t), depending on the method of estimation.
  • At Step 4, there is an optional step (represented by a dashed box) in which the estimated parameters are modified by the SPM 14 to reduce reverberation and/or separate unwanted sounds.
  • At Step 5, the SPM 14 estimates the plane-wave signals, g_plw(t) (possibly modified), that are used to reconstruct and play back the sound field.
  • Step 1 and Step 2 having been previously described, the flow of operations contained in Step 3 is now described.
  • the flow chart of Fig. 7 provides an overview of the operations required for Step 3 of the flow chart shown in Fig. 6. It shows that there are four different paths available: Step 3.A, Step 3.B, Step 3.C and Step 3.D.
  • At Step 3.A, the SPM 14 estimates the plane-wave signals using a Compressive Sampling technique in the time-domain.
  • At Step 3.B, the SPM 14 estimates the plane-wave signals using a Compressive Sampling technique in the frequency-domain.
  • At Step 3.C, the SPM 14 estimates the plane-wave signals using ICA in the HOA-domain.
  • At Step 3.D, the SPM 14 estimates the plane-wave signals using Compressive Sampling in the time domain using a multiple measurement vector technique.
  • At Step 3.A.1, b_HOA(t) and B_HOA are determined by the SPM 14 as described above for Step 2.1 and Step 2.2.B.1 respectively.
  • At Step 3.A.2, the correlation vector, γ, is determined by the SPM 14 as described above for Step 2.2.B.2.
  • At Step 3.A.3, there are two options: Step 3.A.3.A and Step 3.A.3.B.
  • At Step 3.A.3.A, the SPM 14 solves a convex programming problem to determine plane-wave direction gains, p_plw-cs.
  • This convex programming problem does not include a sparsity constraint. More specifically, the following convex programming problem is solved:
  • γ is as defined above and T_plw/HOA is one of the Defined Matrices,
  • ε_1 is a non-negative real number.
  • At Step 3.A.3.B, the SPM 14 solves a convex programming problem to determine plane-wave direction gains, p_plw-cs, only this time a sparsity constraint is included.
  • γ and ε_1 are as defined above,
  • ε_2 is a non-negative real number.
  • ε_1 may be set by the SPM 14
  • ε_2 may be set by the SPM 14 based on the computed sparsity of the sound field (optional Step 2).
  • the SPM 14 applies thresholding techniques to clean p_plw-cs so that some of its small components are set to zero.
  • the SPM 14 forms a matrix, Ŷ_plw-HOA, according to the plane-wave basis and then reduces Ŷ_plw-HOA to Ŷ_plw-reduced by keeping only the columns corresponding to the non-zero components in p_plw-cs, where Ŷ_plw-HOA is an HOA direction matrix for the plane-wave basis and the hat-operator on Ŷ_plw-HOA indicates it has been truncated to some HOA-order M.
  • At Step 3.A.6, the SPM 14 calculates g_plw-cs-reduced(t) as:
  • the SPM 14 expands g_plw-cs-reduced(t) to obtain g_plw-cs(t) by inserting rows of time signals of zeros to match the plane-wave basis that has been used for the analyses.
  • An alternative to Step 3.A is Step 3.B.
  • the flow chart of Fig. 9 details Step 3.B.
  • the SPM 14 calculates an FFT, s_mic, of s_mic(t) and/or an FFT, b_HOA, of b_HOA(t).
  • the SPM 14 solves one of four optional convex programming problems.
  • the convex programming problem shown at Step 3.B.2.A operates on s_mic and does not use a sparsity constraint. More precisely, the SPM 14 solves the following convex programming problem to determine g_plw-cs:
  • T_plw/mic is one of the Defined Matrices,
  • s_mic is as defined above, and
  • ε_1 is a non-negative real number.
  • the convex programming problem shown at Step 3.B.2.B operates on s_mic and includes a sparsity constraint. More precisely, the SPM 14 solves the following convex programming problem to determine g_plw-cs:
  • T_plw/mic and T_plw/HOA are each one of the Defined Matrices,
  • ε_2 is a non-negative real number.
  • the convex programming problem shown at Step 3.B.2.C operates on b_HOA and does not use a sparsity constraint. More precisely, the SPM 14 solves the following convex programming problem to determine g_plw-cs:
  • b_HOA and ε_1 are as defined above.
  • the convex programming problem shown at Step 3.B.2.D operates on b_HOA and includes a sparsity constraint. More precisely, the SPM 14 solves the following convex programming problem to determine g_plw-cs:
  • b_HOA, ε_1, and ε_2 are as defined above.
  • the SPM 14 computes an inverse FFT of g_plw-cs to obtain g_plw-cs(t).
  • overlap-and-add procedures are followed.
  • A further option to Step 3.A or Step 3.B is Step 3.C.
  • the flow chart of Fig. 10 provides an overview of Step 3.C.
  • the SPM 14 computes b_HOA(t) as
  • At Step 3.C.2, there are two options, Step 3.C.2.A and Step 3.C.2.B.
  • At Step 3.C.2.A, the SPM 14 uses ICA in the HOA-domain to estimate a mixing matrix which is then used to obtain g_plw-ica(t).
  • At Step 3.C.2.B, the SPM 14 uses ICA in the HOA-domain to estimate a mixing matrix and also a set of separated source signals.
  • Both the mixing matrix and the separated source signals are then used by the SPM 14 to obtain g_plw-ica(t).
  • the SPM 14 applies ICA to the vector of signals b_HOA(t) to obtain the mixing matrix, M_ICA.
  • At Step 3.C.2.A.3, the SPM 14 applies thresholding techniques to V_source to identify the dominant plane-wave directions in V_source. This is achieved similarly to the operation described above with reference to Step 2.2.A.3.
  • At Step 3.C.2.A.4, there are two options, Step 3.C.2.A.4.A and Step 3.C.2.A.4.B.
  • At Step 3.C.2.A.4.A, the SPM 14 uses the HOA-domain matrix, Ŷ_plw, to compute g_plw-ica(t). Instead, at Step 3.C.2.A.4.B, the SPM 14 uses the
  • the SPM 14 reduces the matrix Ŷ_plw to obtain the matrix, Ŷ_plw-reduced, by removing the plane-wave direction vectors in Ŷ_plw that do not correspond to dominant source directions associated with matrix V_source.
  • the SPM 14 calculates g_plw-ica-reduced(t) as:
  • An alternative to Step 3.C.2.A.4.A is Step 3.C.2.A.4.B.
  • At Step 3.C.2.A.4.B.1, the SPM 14 calculates an FFT, s_mic, of s_mic(t).
  • At Step 3.C.2.A.4.B.3, the SPM 14 calculates g_plw-ica-reduced as:
  • T_plw/mic-reduced and s_mic are as defined above.
  • At Step 3.C.2.A.4.B.4, the SPM 14 calculates g_plw-ica-reduced(t) as the IFFT of g_plw-ica-reduced.
  • At Step 3.C.2.A.5, the SPM 14 expands g_plw-ica-reduced(t) to obtain g_plw-ica(t) by inserting rows of time signals of zeros to match the plane-wave basis that has been used for the analyses.
  • An alternative to Step 3.C.2.A is Step 3.C.2.B.
  • the flow chart of Fig. 14 describes the details of Step 3.C.2.B.
  • the SPM 14 applies ICA to the vector of signals b_HOA(t) to obtain the mixing matrix, M_ICA, and a set of separated source signals, g_ica(t).
  • the SPM 14 projects the mixing matrix, M_ICA, on the HOA direction vectors associated with a set of plane-wave basis directions as described for Step 2.2.A.2, i.e. the projection is obtained by computing V_source = Ŷ_plw-HOA^T M_ICA, where Ŷ_plw-HOA^T is the transpose of the defined matrix Ŷ_plw-HOA.
  • the SPM 14 applies thresholding techniques to V_source to identify the dominant plane-wave directions in V_source. This is achieved similarly to the operation described above for Step 2.2.A.3. Once the dominant plane-wave directions in V_source have been identified, the SPM 14 cleans g_ica(t) to obtain g_plw-ica(t), which retains the signals corresponding to the dominant plane-wave directions in V_source and sets the other signals to zero.
  • A further option to Steps 3.A, 3.B and 3.C is Step 3.D.
  • the flow chart of Figure 15 provides an overview of Step 3.D.
  • At Step 3.D.1, the SPM 14 computes b_HOA(t) as
  • the SPM 14 calculates the matrix B_HOA from the vector of HOA signals b_HOA(t) by setting each signal in b_HOA(t) to run along the rows of B_HOA, so that time runs along the rows of the matrix B_HOA and the various HOA orders run along the columns of the matrix B_HOA.
  • the SPM 14 samples b_HOA(t) over a given time frame, L, to obtain a collection of time samples at the time instances t_1 to t_N.
  • the SPM 14 thus obtains a set of HOA-domain vectors at each time instant: b_HOA(t_1), b_HOA(t_2), ..., b_HOA(t_N).
  • the SPM 14 forms the matrix, B_HOA, by:
  • At Step 3.D.2, there are two options, Step 3.D.2.A and Step 3.D.2.B.
  • At Step 3.D.2.A, the SPM 14 computes g_plw-cs using a multiple measurement vector technique applied directly on B_HOA.
  • At Step 3.D.2.B, the SPM 14 computes g_plw-cs using a multiple measurement vector technique based on the singular value decomposition of B_HOA.
  • At Step 3.D.2.A.1, the SPM 14 solves the following convex programming problem to determine G_plw:
  • Y_plw is one of the Defined Matrices,
  • ε_1 is a non-negative real number.
  • At Step 3.D.2.A.2, there are two options, i.e. Step 3.D.2.A.2.A and Step 3.D.2.A.2.B.
  • At Step 3.D.2.A.2.A, the SPM 14 computes g_plw-cs(t) directly from G_plw using an overlap-add technique. Instead, at Step 3.D.2.A.2.B, the SPM 14 computes g_plw-cs(t) using a smoothed version of G_plw and an overlap-add technique.
  • the SPM 14 calculates an unmixing matrix, Π_L, for the L-th time frame, by calculating:
  • Π_{L-1} refers to the unmixing matrix for the (L-1)-th time frame, α is a forgetting factor such that 0 < α < 1, and B_HOA is as defined above.
  • the SPM 14 calculates G_plw-smooth as:
  • Π_L and B_HOA are as defined above.
  • the SPM 14 calculates g_plw-cs(t) from G_plw-smooth using an overlap-add technique.
  • An alternative to Step 3.D.2.A is Step 3.D.2.B.
  • the flow chart of Fig. 18 describes the details of Step 3.D.2.B.
  • the SPM 14 computes the singular value decomposition of B_HOA to obtain the matrix decomposition:
  • the SPM 14 calculates the matrix, S_reduced, by keeping only the first m columns of S, where m is the number of rows of B_HOA.
  • the SPM 14 calculates matrix Ω as:
  • At Step 3.D.2.B.4, the SPM 14 solves the following convex programming problem for matrix Γ:
  • Y_plw is one of the defined matrices,
  • Ω is as defined above, and
  • ε_1 is a non-negative real number.
  • At Step 3.D.2.B.5, there are two options, Step 3.D.2.B.5.A and Step 3.D.2.B.5.B.
  • At Step 3.D.2.B.5.A, the SPM 14 calculates G_plw from Γ and V^T, where V^T is obtained from the matrix decomposition of B_HOA as described above. The SPM 14 then computes g_plw-cs(t) directly from G_plw using an overlap-add technique.
  • At Step 3.D.2.B.5.B, the SPM 14 calculates g_plw-cs(t) using a smoothed version of G_plw and an overlap-add technique.
  • Fig. 19 shows the details of Step 3.D.2.B.5.B.
  • the SPM 14 calculates an unmixing matrix, Π_L, for the L-th time frame, by calculating:
  • Π_{L-1} refers to the unmixing matrix for the (L-1)-th time frame, α is a forgetting factor such that 0 < α < 1, and Γ and Ω are as defined above.
  • At Step 3.D.2.B.5.B.2, the SPM 14 calculates G_plw-smooth as:
  • the SPM 14 calculates g_plw-cs(t) from G_plw-smooth using an overlap-add technique.
  • At Step 4 of the flow chart of Fig. 6, the SPM 14 controls the amount of reverberation present in the sound field reconstruction by reducing the signal values of some of the signals in the signal vector g_plw(t). Instead, or in addition, the SPM 14 removes
  • At Step 5 of the flow chart of Fig. 6, the parameters g_plw(t) are used to play back the sound field.
  • the flow chart of Fig. 20 shows three optional paths for play back of the sound field: Step 5.A, Step 5.B, and Step 5.C.
  • the flow chart of Fig. 21 describes the details of Step 5.A.
  • the SPM 14 computes or retrieves from data storage the loudspeaker panning matrix, P_plw/spk, in order to enable loudspeaker playback of the reconstructed sound field over the loudspeaker array 20.
  • the panning matrix, P_plw/spk, can be derived using any of the various panning techniques such as, for example, Vector Based Amplitude Panning (VBAP).
  • the SPM 14 calculates the loudspeaker signals by applying the panning matrix, P_plw/spk, to the plane-wave signals, g_plw(t).
  • At Step 5.B.1, the SPM 14 computes b_HOA-highres(t) in order to enable loudspeaker playback of the reconstructed sound field over the loudspeaker array 20.
  • b_HOA-highres(t) is a high-resolution HOA-domain representation of g_plw(t) that is capable of expansion to an arbitrary HOA-domain order.
  • the SPM 14 calculates b_HOA-highres(t) as
  • Ŷ_plw is one of the Defined Matrices and the hat-operator on Ŷ_plw indicates it has been truncated to some HOA-order M.
  • the SPM 14 decodes b_HOA-highres(t) using HOA decoding techniques.
  • An alternative to loudspeaker play back is headphone play back.
  • the operations for headphone play back are shown at Step 5.C of the flow chart of Fig. 20.
  • the flow chart of Fig. 23 describes the details of Step 5.C.
  • the SPM 14 computes or retrieves from data storage the head-related impulse response matrix of filters, P_plw/hph(t), corresponding to the set of analysis plane-wave directions in order to enable headphone playback of the reconstructed sound field over one or more of the headphones 22.
  • the head-related impulse response (HRIR) matrix of filters, P_plw/hph(t), is derived from HRTF measurements.
  • the SPM 14 calculates the headphone signals from P_plw/hph(t) and g_plw(t) using a filter convolution operation.
  • N_spk is the number of loudspeakers,
  • Y_spk is the transpose of the matrix whose columns are the values of the spherical harmonic functions, Y where are the spherical coordinates for the
  • b_HOA represents the play back signals in the HOA-domain.
  • the basic HOA decoding in three dimensions is a spherical-harmonic-based method that possesses a number of advantages which include the ability to reconstruct the sound field easily using various and arbitrary loudspeaker configurations.
  • it will be appreciated by those skilled in the art that it also suffers from limitations related to both the encoding and decoding process. Firstly, as a finite number of sensors is used to observe the sound field, the encoding suffers from spatial aliasing at high frequencies (see N. Epain and J. Daniel, "Improving spherical microphone arrays," in the Proceedings of the AES 124th Convention, May 2008).
  • the distance between the standard HOA solution and the compressive sampling solution may be controlled using, for example, the constraint. When ε_2 is zero, the compressive sampling solution
  • the SPM 14 may dynamically set the value of ε_2 according to the computed sparsity of the sound field.
  • the microphone array 12 is a 4 cm radius rigid sphere with thirty two omnidirectional microphones evenly distributed on the surface of the sphere.
  • the sound fields are reconstructed using a ring of forty eight loudspeakers with a radius of 1 m.
  • the microphone gains are HOA-encoded up to order 4.
  • the compressive sampling plane-wave analysis is performed using a frequency-domain technique which includes a sparsity constraint and using a basis of 360 plane waves evenly distributed in the horizontal plane.
  • the values of ε_1 and ε_2 have been fixed to

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)

Abstract

Equipment (10) for reconstructing a recorded sound field includes a sensing arrangement (12) for measuring the sound field to obtain recorded data. A signal processing module (14) is in communication with the sensing arrangement (12) and processes the recorded data for the purposes of at least one of (a) estimating the sparsity of the recorded sound field and (b) obtaining plane-wave signals to enable the recorded sound field to be reconstructed.

Description

"Reconstruction of a recorded sound field"
Cross-Reference to Related Applications
The present application claims priority from Australian Provisional Patent Application No. 2009904871 filed on 7 October 2009, the contents of which are incorporated herein by reference in their entirety.
Field
The present disclosure relates, generally, to reconstruction of a recorded sound field and, more particularly, to equipment for, and a method of, recording and then reconstructing a sound field using techniques related to at least one of compressive sensing and independent component analysis.
Background
Various means exist for recording and then reproducing a sound field using microphones and loudspeakers (or headphones). The focus of this disclosure is accurate sound field reconstruction and/or reproduction compared with artistic sound field reproduction where creative modifications are allowed. Currently, there are two primary and state-of-the-art techniques used for accurately recording and reproducing a sound field: higher order ambisonics (HOA) and wave-field synthesis (WFS). The WFS technique generally requires a spot microphone for each sound source. In addition, the location of each sound source must be determined and recorded. The recording from each spot microphone is then rendered using the mathematical machinery of WFS. Sometimes spot microphones are not available for each sound source or spot microphones may not be convenient to use. In such cases, one generally uses a more compact microphone array such as a linear, circular, or spherical array. Currently, the best available technique for reconstructing a sound field from a compact microphone array is HOA. However, HOA suffers from two major problems: (1) a small sweet spot and (2) degradation in the reconstruction when the mathematical system is under-constrained (for example, when too many loudspeakers are used). The small sweet spot phenomenon refers to the fact that the sound field is only accurate for a small region of space.
Several terms relating to this disclosure are defined below.
"Reconstructing a sound field" refers, in addition to reproducing a recorded sound field, to using a set of analysis plane-wave directions to determine a set of plane-wave source signals and their associated source directions. Typically, analysis is done in association with a dense set of plane-wave source directions to obtain a vector, g, of plane-wave source signals in which each entry of g is clearly matched to an associated source direction.
"Head-related transfer functions" (HRTFs) or "Head-related impulse responses" (HRIRs) refer to transfer functions that mathematically specify the directional acoustic properties of the human auditory periphery including the outer ear, head, shoulders, and torso as a linear system. HRTFs express the transfer functions in the frequency domain and HRIRs express the transfer functions in the time domain.
"HOA-domain" and "HOA-domain Fourier Expansion" refer to any mathematical basis set that may be used for analysis and synthesis for Higher Order Ambisonics such as the Fourier-Bessel system, circular harmonics, and so forth. Signals can be expressed in terms of their components based on their expansion in the HOA-domain mathematical basis set. When signals are expressed in terms of these components, it is said that the signals are expressed in the "HOA-domain". Signals in the HOA-domain can be represented in both the frequency and time domain in a manner similar to other signals.
"HOA" refers to Higher Order Ambisonics which is a general term encompassing sound field representation and manipulation in the HOA-domain.
"Compressive Sampling" or "Compressed Sensing" or "Compressive Sensing" all refer to a set of techniques that analyse signals in a sparse domain (defined below).
"Sparsity Domain" or "Sparse Domain" is a compressive sampling term that refers to the fact that a vector of sampled observations y can be written as a matrix- vector product, e.g., as:
y = Ψx
where Ψ is a basis of elementary functions and nearly all coefficients in x are null. If S coefficients in x are non-null, we say the observed phenomenon is S-sparse in the sparsity domain Ψ.
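By way of illustration only, the following Python sketch builds an observation vector y = Ψx from a small basis Ψ and a coefficient vector x with two non-null entries, so that the observation is 2-sparse in Ψ. The matrix values and the helper names `matvec` and `count_nonzero` are assumptions made for this example and do not come from the disclosure.

```python
# Hypothetical basis PSI: 3 observations, 6 elementary functions (illustrative values).
PSI = [
    [1.0, 0.0, 0.5, 0.0, 1.0, 0.2],
    [0.0, 1.0, 0.5, 0.3, 0.0, 0.2],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.6],
]

def matvec(matrix, vector):
    """Return the matrix-vector product as a list."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def count_nonzero(vector, tol=0.0):
    """Count the coefficients whose magnitude exceeds tol."""
    return sum(1 for v in vector if abs(v) > tol)

x = [0.0, 2.0, 0.0, 0.0, -1.0, 0.0]  # only two non-null coefficients
y = matvec(PSI, x)                    # sampled observations, y = PSI x
S = count_nonzero(x)                  # the observation is S-sparse in PSI
```

Here three observations are explained by two active basis functions out of six, which is the situation compressive sampling techniques are designed to exploit.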
The function "pinv" refers to a pseudo-inverse, a regularised pseudo-inverse or a Moore-Penrose inverse of a matrix.
The L1-norm of a vector x is denoted by ||x||_1 and is given by ||x||_1 = Σ_i |x_i|.
The L2-norm of a vector x is denoted by ||x||_2 and is given by ||x||_2 = sqrt(Σ_i |x_i|^2).
The L1-L2 norm of a matrix A is denoted by ||A||_1-2 and is given by:
||A||_1-2 = ||u||_1,
where u[i] = sqrt(Σ_j |A[i, j]|^2), u[i] is the i-th element of u, and A[i, j] is the element in the i-th row and j-th column of A.
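The three norms defined above may be sketched in Python as follows. This is a minimal sketch which assumes that the L1-L2 norm is the L1-norm of the vector u whose i-th element is the L2-norm of the i-th row of A; the function names are hypothetical.

```python
import math

def l1_norm(x):
    """Sum of the absolute values of the entries of x."""
    return sum(abs(v) for v in x)

def l2_norm(x):
    """Square root of the sum of the squared magnitudes of the entries of x."""
    return math.sqrt(sum(abs(v) ** 2 for v in x))

def l1_l2_norm(a):
    """L1-norm of u, where u[i] is the L2-norm of row i of the matrix a (assumed form)."""
    u = [l2_norm(row) for row in a]
    return l1_norm(u)

x = [3.0, -4.0]
A = [[3.0, 4.0],
     [0.0, 0.0],
     [6.0, 8.0]]

norm_1 = l1_norm(x)        # 3 + 4
norm_2 = l2_norm(x)        # sqrt(9 + 16)
norm_12 = l1_l2_norm(A)    # 5 + 0 + 10
```

Under this assumption an all-zero row contributes nothing to the norm, so minimising it favours matrices in which only a few rows are active.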
"ICA" is Independent Component Analysis which is a mathematical method that provides, for example, a means to estimate a mixing matrix and an unmixing matrix for a given set of mixed signals. It also provides a set of separated source signals for the set of mixed signals.
The "sparsity" of a recorded sound field provides a measure of the extent to which a small number of sources dominate the sound field.
"Dominant components" of a vector or matrix refer to components of the vector or matrix that are much larger in relative value than some of the other components. For example, for a vector x, we can measure the relative value of component x_i compared to x_j by computing the ratio, x_i/x_j, or the logarithm of the ratio, log(x_i/x_j). If the ratio or log-ratio exceeds some particular threshold value, say θ_th, x_i may be considered a dominant component compared to x_j.
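The dominance test described above may be sketched as follows. The sketch compares magnitudes, uses a base-10 logarithm for the log-ratio, and treats a component as dominant only if it dominates every other component; all three choices, and the function names, are assumptions made for illustration.

```python
import math

def is_dominant(x_i, x_j, theta_th, use_log=False):
    """Return True if x_i is dominant compared to x_j for the threshold theta_th."""
    if x_j == 0:
        return x_i != 0              # any non-zero value dominates an exact zero
    ratio = abs(x_i) / abs(x_j)      # comparing magnitudes is an assumption
    if use_log:
        return ratio > 0 and math.log10(ratio) > theta_th
    return ratio > theta_th

def dominant_indices(x, theta_th):
    """Indices of the components that dominate every other component of x."""
    return [i for i, xi in enumerate(x)
            if all(is_dominant(xi, xj, theta_th)
                   for j, xj in enumerate(x) if j != i)]

x = [0.1, 5.0, -0.2]
dominant = dominant_indices(x, theta_th=10.0)
log_check = is_dominant(100.0, 1.0, theta_th=1.5, use_log=True)
```

A looser rule, such as dominance over only some of the other components, would fit the definition equally well; the threshold θ_th controls how pronounced a component must be.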
"Cleaning a vector or matrix" refers to searching for dominant components (as defined above) in the vector or matrix and then modifying the vector or matrix by removing or setting to zero some of its components which are not dominant components.
"Reducing a matrix M" refers to an operation that may remove columns of M that contain all zeros and/or an operation that may remove columns that do not have a Dominant Component. Instead, "Reducing a matrix M" may refer to removing columns of the matrix M depending on some vector x. In this case, the columns of the matrix M that do not correspond to Dominant Components of the vector x are removed. Still further, "Reducing a matrix M" may refer to removing columns of the matrix M depending on some other matrix N. In this case, the columns of the matrix M must correspond somehow to the columns or rows of the matrix N. When there is this correspondence, "Reducing the matrix M" refers to removing the columns of the matrix M that correspond to columns or rows of the matrix N which do not have a Dominant Component.
"Expanding a matrix M" refers to an operation that may insert into the matrix M a set of columns that contains all zeros. An example of when such an operation may be required is when the columns of matrix M correspond to a smaller set of basis functions and it is required to express the matrix M in a manner that is suited to a larger set of basis functions. "Expanding a vector of time signals x(t) " refers to an operation that may insert into the vector of time signals x(t) , signals that contain all zeros. An example of when such an operation may be required is when the entries of x(t) correspond to time signals that match a smaller set of basis functions and it is required to express the vector of time signals x(t) in a manner that is suited to a larger set of basis functions.
"FFT" means a Fast Fourier Transform.
"IFFT" means an Inverse Fast Fourier Transform.
A "baffled spherical microphone array" refers to a spherical array of microphones which are mounted on a rigid baffle, such as a solid sphere. This is in contrast to an open spherical array of microphones which does not have a baffle.
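Returning to the cleaning, reducing and expanding operations defined above, the following Python sketch shows one way in which they may fit together. It assumes that cleaning sets to zero every component that is dominated by the largest-magnitude component, and that reduction keeps the columns matching the non-zero entries of the cleaned vector. The function names and sample values are hypothetical.

```python
def clean(x, theta_th):
    """Zero the components of x that the largest-magnitude component dominates."""
    peak = max(abs(v) for v in x)
    # A component survives when the ratio peak/|v| does not exceed theta_th.
    return [v if v != 0 and abs(v) * theta_th >= peak else 0.0 for v in x]

def reduce_columns(M, x):
    """Remove the columns of M whose matching entry of x is zero."""
    keep = [j for j, v in enumerate(x) if v != 0]
    return [[row[j] for j in keep] for row in M], keep

def expand_signals(signals, keep, n_basis):
    """Insert all-zero signals so that there is one row per basis function."""
    n_samples = len(signals[0]) if signals else 0
    expanded = [[0.0] * n_samples for _ in range(n_basis)]
    for row, j in zip(signals, keep):
        expanded[j] = list(row)
    return expanded

p = clean([0.05, 0.9, 0.02, 0.8], theta_th=5.0)   # gains over 4 basis directions
Y = [[1.0, 2.0, 3.0, 4.0],
     [5.0, 6.0, 7.0, 8.0]]                        # one column per basis direction
Y_reduced, keep = reduce_columns(Y, p)
g_reduced = [[0.3, -0.1],                         # signals for the kept directions
             [0.2, 0.4]]
g_full = expand_signals(g_reduced, keep, n_basis=len(p))
```

The list `keep` records which basis functions survive the reduction, so that the expansion can place each retained signal back in its original position.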
Several notations related to this disclosure are described below:
Time domain and frequency domain vectors are sometimes expressed using the following notation: A vector of time domain signals is written as x(t). In the frequency domain, this vector is written as x. In other words, x is the FFT of x(t). To avoid confusion with this notation, all vectors of time signals are explicitly written out as x(t).
Matrices and vectors are expressed using bold-type. Matrices are expressed using capital letters in bold-type and vectors are expressed using lower-case letters in bold-type.
A matrix of filters is expressed using a capital letter with bold-type and with an explicit time component such as M(t) when expressed in the time domain or with an explicit frequency component such as M(ω) when expressed in the frequency domain.
For the remainder of this definition we assume that the matrix of filters is expressed in the time domain. Each entry of the matrix is then itself a finite impulse response filter. The column index of the matrix M(t) is an index that corresponds to the index of some vector of time signals that is to be filtered by the matrix. The row index of the matrix M(t) corresponds to the index of the group of output signals. As a matrix of filters operates on a vector of time signals, the "multiplication operator" is the convolution operator described in more detail below.
"⊗" is a mathematical operator which denotes convolution. It may be used to express convolution of a matrix of filters (represented as a general matrix) with a vector of time signals. For example, y(t) = M(t) ⊗ x(t) represents the convolution of the matrix of filters M(t) with the corresponding vector of time signals in x(t). Each entry of M(t) is a filter and the entries running along each column of M(t) correspond to the time signals contained in the vector of time signals x(t). The filters running along each row of M(t) correspond to the different time signals in the vector of output signals y(t). As a concrete example, x(t) may correspond to a set of microphone signals, while y(t) may correspond to a set of HOA-domain time signals. In this case, the equation y(t) = M(t) ⊗ x(t) indicates that the microphone signals are filtered with a set of filters given by each row of M(t) and then added together to give a time signal corresponding to one of the HOA-domain component signals in y(t).
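The convolution of a matrix of filters with a vector of time signals, as described above, may be sketched in Python as follows. Each filter is represented as a list of finite impulse response taps; the two-input, two-output filter matrix and the signal values are hypothetical.

```python
def convolve(h, x):
    """Full linear convolution of an FIR filter h with a signal x."""
    out = [0.0] * (len(h) + len(x) - 1)
    for i, hv in enumerate(h):
        for j, xv in enumerate(x):
            out[i + j] += hv * xv
    return out

def filter_matrix_convolve(M, x):
    """Each row of M filters every input signal and sums the results into one output."""
    outputs = []
    for row in M:
        length = max(len(h) + len(sig) - 1 for h, sig in zip(row, x))
        acc = [0.0] * length
        for h, sig in zip(row, x):
            for k, v in enumerate(convolve(h, sig)):
                acc[k] += v
        outputs.append(acc)
    return outputs

# M[i][j] is the FIR filter from input signal j to output signal i.
M = [[[1.0, 0.5], [0.0, 1.0]],
     [[1.0, 0.0], [-1.0, 0.0]]]
x = [[1.0, 2.0, 3.0],     # e.g. a first microphone signal
     [1.0, 0.0, 0.0]]     # e.g. a second microphone signal
y = filter_matrix_convolve(M, x)
```

The structure mirrors an ordinary matrix-vector product, with each scalar multiplication replaced by a convolution and each output formed by summing along a row.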
Flow charts of signal processing operations are expressed using numbers to indicate a particular step number and letters to indicate one of several different operational paths. Thus, for example, Step 1.A.2.B.1 indicates that in the first step, there is an alternative operational path A, which has a second step, which has an alternative operational path B, which has a first step.
Summary
In a first aspect there is provided equipment for reconstructing a recorded sound field, the equipment including
a sensing arrangement for measuring the sound field to obtain recorded data; and
a signal processing module in communication with the sensing arrangement and which processes the recorded data for the purposes of at least one of (a) estimating the sparsity of the recorded sound field and (b) obtaining plane-wave signals and their associated source directions to enable the recorded sound field to be reconstructed.
The sensing arrangement may comprise a microphone array. The microphone array may be one of a baffled array and an open spherical microphone array.
The signal processing module may be configured to estimate the sparsity of the recorded data according to the method of one of aspects three and four below.
Further, the signal processing module may be configured to analyse the recorded sound field, using the methods of aspects five to seven below, to obtain a set of plane-wave signals that separate the sources in the sound field and identify the source locations and allow the sound field to be reconstructed.
The signal processing module may be configured to modify the set of plane-wave signals to reduce unwanted artifacts such as reverberations and/or unwanted sound sources. To reduce reverberations, the signal processing module may reduce the signal values of some of the signals in the plane-wave signals. To separate sound sources in the sound field reconstruction so that the unwanted sound sources can be reduced, the signal processing module may be operative to set to zero some of the signals in the set of plane-wave signals.
The equipment may include a playback device for playing back the reconstructed sound field. The playback device may be one of a loudspeaker array and headphones. The signal processing module may be operative to modify the recorded data depending on which playback device is to be used for playing back the reconstructed sound field.
In a second aspect, there is provided, a method of reconstructing a recorded sound field, the method including
analysing recorded data in a sparse domain using one of a time domain technique and a frequency domain technique; and
obtaining plane-wave signals and their associated source directions generated from the selected technique to enable the recorded sound field to be reconstructed.
The method may include recording a time frame of audio of the sound field to obtain the recorded data in the form of a set of signals, s_mic(t), using an acoustic sensing arrangement. Preferably, the acoustic sensing arrangement comprises a microphone array. The microphone array may be a baffled or open spherical microphone array.
In a third aspect, the method may include estimating the sparsity of the recorded sound field by applying ICA in an HOA-domain to calculate the sparsity of the recorded sound field.
The method may include analysing the recorded sound field in the HOA domain to obtain a vector of HOA-domain time signals, b_HOA(t), and computing from b_HOA(t) a mixing matrix, M_ICA, using signal processing techniques. The method may include using instantaneous Independent Component Analysis as the signal processing technique.
The method may include projecting the mixing matrix, M_ICA, on the HOA direction vectors associated with a set of plane-wave basis directions by computing V_source = Ŷ_plw-HOA^T M_ICA, where Ŷ_plw-HOA^T is the transpose (Hermitian conjugate) of the real-valued (complex-valued) HOA direction matrix associated with the plane-wave basis directions and the hat-operator on Ŷ_plw-HOA indicates it has been truncated to an HOA-order M.
The method may include estimating the sparsity, S, of the recorded data by first determining the number, $N_{\text{source}}$, of dominant plane-wave directions represented by $V_{\text{source}}$ and then computing $S = 1 - N_{\text{source}} / N_{\text{plw}}$, where $N_{\text{plw}}$ is the number of analysis plane-wave basis directions.
In a fourth aspect, the method may include estimating the sparsity of the recorded sound field by analysing recorded data using compressed sensing or convex optimization techniques to calculate the sparsity of the recorded sound field.
The method may include analysing the recorded sound field in the HOA domain to obtain a vector of HOA-domain time signals, $b_{\text{HOA}}(t)$, and sampling the vector of HOA-domain time signals over a given time frame, L, to obtain a collection of time samples at time instances $t_1$ to $t_N$, giving a set of HOA-domain vectors, one per time instant, $b_{\text{HOA}}(t_1), b_{\text{HOA}}(t_2), \ldots, b_{\text{HOA}}(t_N)$, expressed as a matrix, $B_{\text{HOA}}$, by:
$$B_{\text{HOA}} = [\, b_{\text{HOA}}(t_1) \;\; b_{\text{HOA}}(t_2) \;\; \cdots \;\; b_{\text{HOA}}(t_N) \,]$$
The method may include applying singular value decomposition to $B_{\text{HOA}}$ to obtain a matrix decomposition:
$$B_{\text{HOA}} = U S V^{T}$$
The method may include forming a matrix $S_{\text{reduced}}$ by keeping only the first m columns of $S$, where m is the number of rows of $B_{\text{HOA}}$, and forming a matrix, $\Omega$, given by
$$\Omega = U S_{\text{reduced}}$$
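The decomposition and reduction just described can be sketched numerically. This is a minimal illustration, not the patented implementation: the matrix sizes and random test data are hypothetical, and the form $\Omega = U S_{\text{reduced}}$ is assumed because the defining equation is not legible in this text.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n_samples = 4, 64                      # hypothetical: 4 HOA channels, 64 time samples
B_hoa = rng.standard_normal((m, n_samples))

# Full SVD: B_hoa = U @ S @ Vt with U (m x m), S (m x N), Vt (N x N).
U, s, Vt = np.linalg.svd(B_hoa, full_matrices=True)
S = np.zeros((m, n_samples))
S[:m, :m] = np.diag(s)

S_reduced = S[:, :m]                      # keep only the first m columns of S
Omega = U @ S_reduced                     # assumed form of Omega (m x m)

# Only the first m rows of Vt pair with the retained columns of S,
# so the frame is recovered exactly from the reduced factors.
B_rebuilt = Omega @ Vt[:m, :]
```

Working with the m-by-m matrix `Omega` instead of the m-by-N frame keeps the subsequent optimisation small, since N is typically much larger than m.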
The method may include solving the following convex programming problem for a matrix Γ :
Figure imgf000008_0004
where $Y_{\text{plw}}$ is the matrix (truncated to a high spherical harmonic order) whose columns are the values of the spherical harmonic functions for the set of directions corresponding to some set of analysis plane waves,
and
$\varepsilon_1$ is a non-negative real number.
The method may include obtaining $G_{\text{plw}}$ from $\Gamma$ using:
$$G_{\text{plw}} = \Gamma V^{T}$$
where $V^{T}$ is obtained from the matrix decomposition of $B_{\text{HOA}}$.
The method may include obtaining an unmixing matrix, $\Pi_L$, for the L-th time frame, by calculating:
Figure imgf000008_0005
where:
$\Pi_{L-1}$ is an unmixing matrix for the (L-1)-th time frame, and
$\alpha$ is a forgetting factor such that $0 < \alpha < 1$.
The method may include obtaining $G_{\text{plw-smooth}}$ using:
Figure imgf000009_0002
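A forgetting-factor update of this kind can be sketched as an exponentially weighted blend of the previous unmixing matrix and a per-frame estimate. The update rule is not legible in this text, so the form used below, $\Pi_L = (1-\alpha)\,\Pi_{L-1} + \alpha\,\Gamma\,\mathrm{pinv}(\Omega)$, is an assumption. It is chosen because $\Gamma\,\mathrm{pinv}(\Omega)$ maps HOA-domain samples to plane-wave samples in a way that is consistent with a decomposition of the frame into $\Omega$ and the leading rows of $V^{T}$. All sizes and data are hypothetical.

```python
import numpy as np

def update_unmixing(Pi_prev, Gamma, Omega, alpha):
    """Blend the previous unmixing matrix with the current frame's estimate.

    Assumed rule: Pi_L = (1 - alpha) * Pi_{L-1} + alpha * Gamma @ pinv(Omega).
    """
    Pi_frame = Gamma @ np.linalg.pinv(Omega)   # per-frame HOA -> plane-wave map
    return (1.0 - alpha) * Pi_prev + alpha * Pi_frame

rng = np.random.default_rng(0)
n_plw, m, n_samples = 6, 4, 32                 # hypothetical sizes
Gamma = rng.standard_normal((n_plw, m))
Omega = rng.standard_normal((m, m))
B_hoa = rng.standard_normal((m, n_samples))

Pi = np.zeros((n_plw, m))
for _ in range(200):                           # same frame repeated, so Pi settles
    Pi = update_unmixing(Pi, Gamma, Omega, alpha=0.25)

G_smooth = Pi @ B_hoa                          # smoothed plane-wave time samples
```

A small forgetting factor favours continuity between frames, while a value close to 1 tracks each new frame more closely.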
The method may include obtaining the vector of plane-wave signals, $g_{\text{plw-cs}}(t)$, from the collection of plane-wave time samples, $G_{\text{plw-smooth}}$, using standard overlap-add techniques. Alternatively, the method may include obtaining $g_{\text{plw-cs}}(t)$ from the collection of plane-wave time samples, $G_{\text{plw}}$, without smoothing, using standard overlap-add techniques.
The method may include estimating the sparsity of the recorded data by first computing the number, Ncomp , of dominant components of and then
Figure imgf000009_0007
computing is the number of analysis plane-wave basis
Figure imgf000009_0003
directions.
In a fifth aspect the method may include reconstructing the recorded sound field, using frequency-domain techniques to analyse the recorded data in the sparse domain; and obtaining the plane-wave signals from the frequency-domain techniques to enable the recorded sound field to be reconstructed.
The method may include transforming the set of signals, $s_{\text{mic}}(t)$, to the frequency domain using an FFT to obtain $s_{\text{mic}}$.
The method may include analysing the recorded sound field in the frequency domain using plane-wave analysis to produce a vector of plane-wave amplitudes, $g_{\text{plw-cs}}$.
In a first embodiment of the fifth aspect, the method may include conducting the plane-wave analysis of the recorded sound field by solving the following convex programming problem for the vector of plane-wave amplitudes,
Figure imgf000009_0004
Figure imgf000009_0001
where:
$T_{\text{plw/mic}}$ is a transfer matrix between the plane-waves and the microphones,
$s_{\text{mic}}$ is the set of signals recorded by the microphone array, and
$\varepsilon_1$ is a non-negative real number.
In a second embodiment of the fifth aspect, the method may include conducting the plane-wave analysis of the recorded sound field by solving the following convex programming problem for the vector of plane-wave amplitudes, $g_{\text{plw-cs}}$:
Figure imgf000010_0006
p
minimise
Figure imgf000010_0004
subject to
Figure imgf000010_0001
IPmic II2
where:
Figure imgf000010_0002
$T_{\text{plw/mic}}$ is a transfer matrix between the plane-waves and the microphones,
$s_{\text{mic}}$ is the set of signals recorded by the microphone array,
$\varepsilon_1$ is a non-negative real number,
$T_{\text{plw/HOA}}$ is a transfer matrix between the plane-waves and the HOA-domain Fourier expansion,
$b_{\text{HOA}}$ is a set of HOA-domain Fourier coefficients given by $b_{\text{HOA}} = T_{\text{mic/HOA}}\, s_{\text{mic}}$, where $T_{\text{mic/HOA}}$ is a transfer matrix between the microphones and the HOA-domain Fourier expansion, and
$\varepsilon_2$ is a non-negative real number.
In a third embodiment of the fifth aspect, the method may include conducting the plane-wave analysis of the recorded sound field by solving the following convex programming problem for the vector of plane-wave amplitudes, gpIw.cs :
minimise
Figure imgf000010_0005
Figure imgf000010_0003
where:
$T_{\text{plw/mic}}$ is a transfer matrix between the plane-waves and the microphones,
$T_{\text{mic/HOA}}$ is a transfer matrix between the microphones and the HOA-domain Fourier expansion,
$b_{\text{HOA}}$ is a set of HOA-domain Fourier coefficients given by $b_{\text{HOA}} = T_{\text{mic/HOA}}\, s_{\text{mic}}$, and
$\varepsilon_1$ is a non-negative real number.
In a fourth embodiment of the fifth aspect, the method may include conducting the plane-wave analysis of the recorded sound field by solving the following convex programming problem for the vector of plane-wave amplitudes, gplw.os :
Figure imgf000011_0002
Figure imgf000011_0003
Figure imgf000011_0004
where:
$T_{\text{plw/mic}}$ is a transfer matrix between the plane-waves and the microphones,
$\varepsilon_1$ is a non-negative real number,
$T_{\text{plw/HOA}}$ is a transfer matrix between the plane-waves and the HOA-domain Fourier expansion,
$b_{\text{HOA}}$ is a set of HOA-domain Fourier coefficients given by $b_{\text{HOA}} = T_{\text{mic/HOA}}\, s_{\text{mic}}$, where $T_{\text{mic/HOA}}$ is a transfer matrix between the microphones and the HOA-domain Fourier expansion, and
$\varepsilon_2$ is a non-negative real number.
The method may include setting $\varepsilon_1$ based on the resolution of the spatial division of a set of directions corresponding to the set of analysis plane-waves and setting the value of $\varepsilon_2$ based on the computed sparsity of the sound field. Further, the method may include transforming $g_{\text{plw-cs}}$ back to the time-domain using an inverse FFT to obtain $g_{\text{plw-cs}}(t)$. The method may include identifying source directions with each entry of $g_{\text{plw-cs}}$ or $g_{\text{plw-cs}}(t)$.
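To give a feel for sparse plane-wave analysis at a single frequency bin, the sketch below recovers a few active plane-wave amplitudes from microphone data with an L1 penalty. It is not the constrained convex program of the embodiments above: it substitutes the penalised (Lagrangian) form $\tfrac{1}{2}\lVert T g - s\rVert_2^2 + \lambda \lVert g \rVert_1$, solved with plain iterative soft-thresholding (ISTA). The random transfer matrix, the sizes, the value of `lam` and the iteration count are all hypothetical.

```python
import numpy as np

def soft_threshold(x, tau):
    """Complex soft-thresholding: shrink magnitudes by tau, keep phases."""
    mag = np.abs(x)
    scale = np.maximum(1.0 - tau / np.maximum(mag, 1e-12), 0.0)
    return scale * x

def ista_l1(T, s, lam, n_iter=500):
    """Approximately minimise 0.5*||T g - s||_2^2 + lam*||g||_1 by ISTA."""
    step = 1.0 / np.linalg.norm(T, 2) ** 2      # 1 / (largest singular value)^2
    g = np.zeros(T.shape[1], dtype=complex)
    for _ in range(n_iter):
        grad = T.conj().T @ (T @ g - s)
        g = soft_threshold(g - step * grad, step * lam)
    return g

rng = np.random.default_rng(1)
n_mics, n_plw = 32, 48                          # hypothetical array and basis sizes
T = (rng.standard_normal((n_mics, n_plw))
     + 1j * rng.standard_normal((n_mics, n_plw))) / np.sqrt(2)

g_true = np.zeros(n_plw, dtype=complex)
g_true[[7, 30]] = [1.0, -0.8j]                  # two active plane-wave directions
s = T @ g_true                                  # noiseless microphone spectrum

lam = 0.05 * np.max(np.abs(T.conj().T @ s))
g_hat = ista_l1(T, s, lam)

support = sorted(int(i) for i in np.argsort(np.abs(g_hat))[-2:])
rel_residual = float(np.linalg.norm(T @ g_hat - s) / np.linalg.norm(s))
```

In this surrogate, `lam` plays a role loosely comparable to the tolerance $\varepsilon_1$: a larger value gives a sparser solution with a larger data misfit.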
In a sixth aspect, the method may include using a time domain technique to analyse recorded data in the sparse domain and obtaining parameters generated from the selected time domain technique to enable the recorded sound field to be reconstructed.
The method may include analysing the recorded sound field in the time domain using plane-wave analysis according to a set of basis plane-waves to produce a set of plane-wave signals, $g_{\text{plw-cs}}(t)$. The method may include analysing the recorded sound field in the HOA domain to obtain a vector of HOA-domain time signals, $b_{\text{HOA}}(t)$, and sampling the vector of HOA-domain time signals over a given time frame, L, to obtain a collection of time samples at time instances $t_1$ to $t_N$, giving a set of HOA-domain vectors, one per time instant, $b_{\text{HOA}}(t_1), b_{\text{HOA}}(t_2), \ldots, b_{\text{HOA}}(t_N)$, expressed as a matrix, $B_{\text{HOA}}$, by:
$$B_{\text{HOA}} = [\, b_{\text{HOA}}(t_1) \;\; b_{\text{HOA}}(t_2) \;\; \cdots \;\; b_{\text{HOA}}(t_N) \,]$$
The method may include computing a correlation vector, $\gamma = B_{\text{HOA}}\, b_{\text{omni}}$, where $b_{\text{omni}}$ is an omni-directional HOA-component of $b_{\text{HOA}}(t)$.
In a first embodiment of the sixth aspect, the method may include solving the following convex programming problem for a vector of plane- wave gains,
Figure imgf000012_0011
minimise
Figure imgf000012_0004
subject to
Figure imgf000012_0002
where:
Figure imgf000012_0005
$T_{\text{plw/HOA}}$ is a transfer matrix between the plane-waves and the HOA-domain Fourier expansion, and
$\varepsilon_1$ is a non-negative real number.
In a second embodiment, of the sixth aspect, the method may include solving the following convex programming problem for a vector of plane-wave gains, pplw.os :
minimise subject to
Figure imgf000012_0003
Figure imgf000012_0001
where:
$\gamma = B_{\text{HOA}}\, b_{\text{omni}}$,
$T_{\text{plw/HOA}}$ is a transfer matrix between the plane-waves and the HOA-domain Fourier expansion,
$\varepsilon_1$ is a non-negative real number, and
$\varepsilon_2$ is a non-negative real number.
The method may include setting $\varepsilon_1$ based on the resolution of the spatial division of a set of directions corresponding to the set of analysis plane-waves and setting the value of $\varepsilon_2$ based on the computed sparsity of the sound field. The method may include thresholding and cleaning $\beta_{\text{plw-cs}}$ to set some of its small components to zero.
The method may include forming a matrix, $\hat{Y}_{\text{plw-HOA}}$, according to the plane-wave basis and then reducing $\hat{Y}_{\text{plw-HOA}}$ to $\hat{Y}_{\text{plw-reduced}}$ by keeping only the columns corresponding to the non-zero components in $\beta_{\text{plw-cs}}$, where $\hat{Y}_{\text{plw-HOA}}$ is an HOA direction matrix for the plane-wave basis and the hat-operator on $\hat{Y}_{\text{plw-HOA}}$ indicates it has been truncated to some HOA-order M.
The method may include computing $g_{\text{plw-cs-reduced}}(t)$ as $g_{\text{plw-cs-reduced}}(t) = \mathrm{pinv}(\hat{Y}_{\text{plw-reduced}})\, b_{\text{HOA}}(t)$. Further, the method may include expanding $g_{\text{plw-cs-reduced}}(t)$ to obtain $g_{\text{plw-cs}}(t)$ by inserting rows of time signals of zeros so that $g_{\text{plw-cs}}(t)$ matches the plane-wave basis.
In a third embodiment of the sixth aspect, the method may include solving the following convex programming problem for a matrix
Figure imgf000013_0011
Figure imgf000013_0005
where $Y_{\text{plw}}$ is a matrix (truncated to a high spherical harmonic order) whose columns are the values of the spherical harmonic functions for the set of directions corresponding to some set of analysis plane waves, and
$\varepsilon_1$ is a non-negative real number.
The method may include obtaining an unmixing matrix, $\Pi_L$, for the L-th time frame, by calculating:
Figure imgf000013_0001
where
$\Pi_{L-1}$ refers to the unmixing matrix for the (L-1)-th time frame and
$\alpha$ is a forgetting factor such that $0 \le \alpha \le 1$.
In a fourth embodiment of the fifth aspect, the method may include applying singular value decomposition to $B_{\text{HOA}}$ to obtain a matrix decomposition:
$$B_{\text{HOA}} = U S V^{T}$$
The method may include forming a matrix $S_{\text{reduced}}$ by keeping only the first m columns of $S$, where m is the number of rows of $B_{\text{HOA}}$, and forming a matrix, $\Omega$, given by
$$\Omega = U S_{\text{reduced}}$$
The method may include solving the following convex programming problem for a matrix Γ :
m
s
Figure imgf000013_0002
where $\varepsilon_1$ and $Y_{\text{plw}}$ are as defined above.
The method may include obtaining $G_{\text{plw}}$ from $\Gamma$ using:
$$G_{\text{plw}} = \Gamma V^{T}$$
where $V^{T}$ is obtained from the matrix decomposition of $B_{\text{HOA}}$.
The method may include obtaining an unmixing matrix, $\Pi_L$, for the L-th time frame, by calculating:
Figure imgf000014_0002
where:
$\Pi_{L-1}$ is an unmixing matrix for the (L-1)-th time frame, and
$\alpha$ is a forgetting factor such that $0 < \alpha < 1$.
The method may include obtaining $G_{\text{plw-smooth}}$ using:
Figure imgf000014_0001
The method may include obtaining the vector of plane-wave signals, $g_{\text{plw-cs}}(t)$, from the collection of plane-wave time samples, $G_{\text{plw-smooth}}$, using standard overlap-add techniques. Alternatively, the method may include obtaining $g_{\text{plw-cs}}(t)$ from the collection of plane-wave time samples, $G_{\text{plw}}$, without smoothing, using standard overlap-add techniques. The method may include identifying source directions with each entry of $g_{\text{plw-cs}}(t)$.
The method may include modifying $g_{\text{plw-cs}}(t)$ to reduce unwanted artifacts such as reverberations and/or unwanted sound sources. Further, the method may include, to reduce reverberations, reducing the signal values of some of the signals in the signal vector, $g_{\text{plw-cs}}(t)$. The method may include, to separate sound sources in the sound field reconstruction so that the unwanted sound sources can be reduced, setting to zero some of the signals in the signal vector, $g_{\text{plw-cs}}(t)$.
Further, the method may include modifying $g_{\text{plw-cs}}(t)$ dependent on the means of playback of the reconstructed sound field. When the reconstructed sound field is to be played back over loudspeakers, in one embodiment, the method may include modifying $g_{\text{plw-cs}}(t)$ as follows:
Figure imgf000014_0003
where:
$P_{\text{plw/spk}}$ is a loudspeaker panning matrix.
In another embodiment, when the reconstructed sound field is to be played back over loudspeakers, the method may include converting $g_{\text{plw-cs}}(t)$ back to the HOA-domain by computing:
$$b_{\text{HOA-highres}}(t) = \hat{Y}_{\text{plw-HOA}}\, g_{\text{plw-cs}}(t)$$
where $b_{\text{HOA-highres}}(t)$ is a high-resolution HOA-domain representation of $g_{\text{plw-cs}}(t)$ capable of expansion to arbitrary HOA-domain order, and where $\hat{Y}_{\text{plw-HOA}}$ is an HOA direction matrix for a plane-wave basis and the hat-operator on $\hat{Y}_{\text{plw-HOA}}$ indicates it has been truncated to some HOA-order M. The method may include decoding $b_{\text{HOA-highres}}(t)$ using HOA decoding techniques.
When the reconstructed sound field is to be played back over headphones, the method may include modifying $g_{\text{plw-cs}}(t)$ to determine headphone gains as follows:
Figure imgf000015_0003
where:
$P_{\text{plw/hph}}(t)$ is a head-related impulse response matrix of filters corresponding to the set of plane wave directions.
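As an illustration of headphone rendering with a matrix of head-related impulse responses, the sketch below filters each plane-wave signal with the left-ear and right-ear response for its direction and sums the results per ear. The array layout, the function name `render_binaural` and the toy impulse responses are hypothetical. The exact rendering equation is not legible in this text, so this shows only the general filter-and-sum idea.

```python
import numpy as np

def render_binaural(hrirs, g_plw):
    """Filter-and-sum rendering.

    hrirs: (2, n_dirs, n_taps) impulse responses per ear and direction.
    g_plw: (n_dirs, n_samples) plane-wave signals.
    Returns (2, n_samples + n_taps - 1) headphone signals.
    """
    n_ears, n_dirs, n_taps = hrirs.shape
    n_samples = g_plw.shape[1]
    out = np.zeros((n_ears, n_samples + n_taps - 1))
    for ear in range(n_ears):
        for d in range(n_dirs):
            out[ear] += np.convolve(hrirs[ear, d], g_plw[d])
    return out

rng = np.random.default_rng(0)
hrirs = rng.standard_normal((2, 3, 4))     # toy responses: 3 directions, 4 taps
g_plw = np.zeros((3, 8))
g_plw[1, 0] = 1.0                          # unit impulse arriving from direction 1 only

s_out = render_binaural(hrirs, g_plw)      # equals the direction-1 responses, zero-padded
```

Because only a few plane-wave signals are expected to be non-zero in a sparse sound field, directions with all-zero signals could be skipped to save computation.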
In a seventh aspect, the method may include using time-domain techniques of Independent Component Analysis (ICA) in the HOA-domain to analyse recorded data in a sparse domain, and obtaining parameters from the selected time domain technique to enable the recorded sound field to be reconstructed.
The method may include analysing the recorded sound field in the HOA-domain to obtain a vector of HOA-domain time signals, $b_{\text{HOA}}(t)$. The method may include analysing the HOA-domain time signals using ICA signal processing to produce a set of plane-wave source signals, $g_{\text{plw-ica}}(t)$.
In a first embodiment of the seventh aspect, the method may include computing from $b_{\text{HOA}}(t)$ a mixing matrix, $M_{\text{ICA}}$, using signal processing techniques. The method may include using instantaneous Independent Component Analysis as the signal processing technique. The method may include projecting the mixing matrix, $M_{\text{ICA}}$, on the HOA direction vectors associated with a set of plane-wave basis directions by computing $V_{\text{source}} = \hat{Y}^{T}_{\text{plw-HOA}} M_{\text{ICA}}$, where $\hat{Y}^{T}_{\text{plw-HOA}}$ is the transpose (Hermitian conjugate) of the real-valued (complex-valued) HOA direction matrix associated with the plane-wave basis and the hat-operator on $\hat{Y}_{\text{plw-HOA}}$ indicates it has been truncated to some HOA-order M.
The method may include using thresholding techniques to identify the columns of Vsource that indicate a dominant source direction. These columns may be identified on the basis of having a single dominant component.
The method may include reducing the matrix $\hat{Y}_{\text{plw-HOA}}$ to obtain a matrix, $\hat{Y}_{\text{plw-HOA-reduced}}$, by removing the plane-wave direction vectors in $\hat{Y}_{\text{plw-HOA}}$ that do not correspond to dominant source directions associated with matrix $V_{\text{source}}$.
The method may include estimating $g_{\text{plw-ica-reduced}}(t)$ as $g_{\text{plw-ica-reduced}}(t) = \mathrm{pinv}(\hat{Y}_{\text{plw-HOA-reduced}})\, b_{\text{HOA}}(t)$. Alternatively, the method may include estimating $g_{\text{plw-ica-reduced}}(t)$ by working in the frequency domain and computing $s_{\text{mic}}$ as the FFT of $s_{\text{mic}}(t)$.
The method may include, for each frequency, reducing a transfer matrix between the plane-waves and the microphones, $T_{\text{plw/mic}}$, to obtain a matrix, $T_{\text{plw/mic-reduced}}$, by removing the columns in $T_{\text{plw/mic}}$ that do not correspond to dominant source directions associated with matrix $V_{\text{source}}$.
The method may include estimating d by computing:
Figure imgf000016_0006
( ) Md transforming gplw,ca,educei back to the time-
Figure imgf000016_0007
domain using an inverse FFT to obtain gpIw-ica.reduced (t) · The method may include expanding gplw.ica.reduoed (t) to obtain gplw.ica (t) by inserting rows of time signals of zeros so that gplw.ica (t) matches the plane-wave basis.
In a second embodiment of the seventh aspect, the method may include computing from bH0A a mixing matrix, MICA , and a set of separated source signals, gica (t) using signal processing techniques. The method may include using instantaneous Independent Component Analysis as the signal processing technique. The method may include projecting the mixing matrix, MICA , on the HOA direction vectors associated with a set of plane-wave basis directions by computing ¾ where is the transpose (Hermitian conjugate) of the real-
Figure imgf000016_0001
Figure imgf000016_0002
value (complex-valued) HOA direction matrix associated with the plane-wave basis and the hat-operator on Y indicates it has been truncated to some HOA-order M.
Figure imgf000016_0005
The method may include using thresholding techniques to identify from $V_{\text{source}}$ the dominant plane-wave directions. Further, the method may include cleaning $g_{\text{ica}}(t)$ to obtain $g_{\text{plw-ica}}(t)$, which retains the signals corresponding to the dominant plane-wave directions in $V_{\text{source}}$ and sets the other signals to zero.
The method may include modifying $g_{\text{plw-ica}}(t)$ to reduce unwanted artifacts such as reverberations and/or unwanted sound sources. The method may include, to reduce reverberations, reducing the signal values of some of the signals in the signal vector, $g_{\text{plw-ica}}(t)$. Further, the method may include, to separate sound sources in the sound field reconstruction so that the unwanted sound sources can be reduced, setting to zero some of the signals in the signal vector, $g_{\text{plw-ica}}(t)$.
Still further, the method may include modifying $g_{\text{plw-ica}}(t)$ dependent on the means of playback of the reconstructed sound field. When the reconstructed sound field is to be played back over loudspeakers, in one embodiment the method may include modifying $g_{\text{plw-ica}}(t)$ as follows:
Figure imgf000017_0003
where:
$P_{\text{plw/spk}}$ is a loudspeaker panning matrix.
In another embodiment, when the reconstructed sound field is to be played back over loudspeakers, the method may include converting $g_{\text{plw-ica}}(t)$ back to the HOA-domain by computing:
Figure imgf000017_0002
where:
$b_{\text{HOA-highres}}(t)$ is a high-resolution HOA-domain representation of $g_{\text{plw-ica}}(t)$ capable of expansion to arbitrary HOA-domain order,
$\hat{Y}_{\text{plw-HOA}}$ is an HOA direction matrix for a plane-wave basis and the hat-operator on $\hat{Y}_{\text{plw-HOA}}$ indicates it has been truncated to some HOA-order M.
The method may include decoding $b_{\text{HOA-highres}}(t)$ using HOA decoding techniques.
When the reconstructed sound field is to be played back over headphones, the method may include modifying gplw-cs (t) to determine headphone gains as follows:
Figure imgf000017_0001
where:
$P_{\text{plw/hph}}(t)$ is a head-related impulse response matrix of filters corresponding to the set of plane wave directions.
The disclosure extends to a computer when programmed to perform the method as described above.
The disclosure also extends to a computer readable medium to enable a computer to perform the method as described above.
Brief Description of Drawings
Fig. 1 shows a block diagram of an embodiment of equipment for reconstructing a recorded sound field and also estimating the sparsity of the recorded sound field;
Figs. 2-5 show flow charts of the steps involved in estimating the sparsity of a recorded sound field using the equipment of Fig. 1;
Figs. 6-23 show flow charts of embodiments of reconstructing a recorded sound field using the equipment of Fig. 1;
Figs. 24A-24C show a first example of, respectively, a photographic representation of an HOA solution to reconstructing a recorded sound field, the original sound field and the solution offered by the present disclosure; and
Figs. 25A-25C show a second example of, respectively, a photographic representation of an HOA solution to reconstructing a recorded sound field, the original sound field and the solution offered by the present disclosure.
Detailed Description of Exemplary Embodiments
In Fig. 1 of the drawings, reference numeral 10 generally designates an embodiment of equipment for reconstructing a recorded sound field and/or estimating the sparsity of the sound field. The equipment 10 includes a sensing arrangement 12 for measuring the sound field to obtain recorded data. The sensing arrangement 12 is connected to a signal processing module 14, such as a microprocessor, which processes the recorded data to obtain plane-wave signals enabling the recorded sound field to be reconstructed and/or processes the recorded data to obtain the sparsity of the sound field. The sparsity of the sound field, the separated plane-wave sources and their associated source directions are provided via an output port 24. The signal processing module 14 is referred to below, for the sake of conciseness, as the SPM 14.
A data accessing module 16 is connected to the SPM 14. In one embodiment the data accessing module 16 is a memory module in which data are stored. The SPM 14 accesses the memory module to retrieve the required data from the memory module as and when required. In another embodiment, the data accessing module 16 is a connection module, such as, for example, a modem or the like, to enable the SPM 14 to retrieve the data from a remote location.
The equipment 10 includes a playback module 18 for playing back the reconstructed sound field. The playback module 18 comprises a loudspeaker array 20 and/or one or more headphones 22.
The sensing arrangement 12 is a baffled spherical microphone array for recording the sound field to produce recorded data in the form of a set of signals, $s_{\text{mic}}(t)$.
The SPM 14 analyses the recorded data relating to the sound field using plane-wave analysis to produce a vector of plane-wave signals, $g_{\text{plw}}(t)$. Producing the vector of plane-wave signals, $g_{\text{plw}}(t)$, is to be understood as also obtaining the associated set of plane-wave source directions. Depending on the particular method used to produce the vector of plane-wave amplitudes, $g_{\text{plw}}(t)$ is referred to more specifically as $g_{\text{plw-cs}}(t)$ if Compressed Sensing techniques are used or $g_{\text{plw-ica}}(t)$ if ICA techniques are used. As will be described in greater detail below, the SPM 14 is also used to modify $g_{\text{plw}}(t)$, if desired.
Once the SPM 14 has performed its analysis, it produces output data for the output port 24 which may include the sparsity of the sound field, the separated plane- wave source signals and the associated source directions of the plane-wave source signals. In addition, once the SPM 14 has performed its analysis, it generates signals, sout (t) , for rendering the determined gplw (t) as audio to be replayed over the loudspeaker array 20 and/or the one or more headphones 22.
The SPM 14 performs a series of operations on the set of signals, smic (t) , after the signals have been recorded by the microphone array 12, to enable the signals to be reconstructed into a sound field closely approximating the recorded sound field.
In order to describe the signal processing operations concisely, a set of matrices that characterise the microphone array 12 are defined. These matrices may be computed as needed by the SPM 14 or may be retrieved as needed from data storage using the data accessing module 16. When one of these matrices is referred to, it will be described as "one of the defined matrices".
The following is a list of Defined Matrices that may be computed or retrieved as required:
is a transfer matrix between the spherical harmonic domain and the
Figure imgf000019_0008
microphone signals, the matrix Tsph/mic being truncated to order M, as: where:
Figure imgf000019_0002
is the transpose of the matrix whose columns are the values of the spherical harmonic functions, Y where ( are the spherical coordinates for the 1-
Figure imgf000019_0005
Figure imgf000019_0003
th microphone and the hat-operator on indicates it has been truncated to some
Figure imgf000019_0006
order M; and
is the diagonal matrix whose coefficients are defined by
Figure imgf000019_0007
^ where R is the radius of the sphere of the
Figure imgf000019_0001
microphone array, is the spherical Hankel function of the second kind of order m, jm is the spherical Bessel function of order m, j'm and
Figure imgf000019_0004
are the derivatives of jm and ¾2) , respectively. Once again, the hat-operator on Wmic indicates that it has been truncated to some order M.
$T_{\text{sph/mic}}$ is similar to $\hat{T}_{\text{sph/mic}}$ except it has been truncated to a much higher order M' (with M' > M).
$Y_{\text{plw}}$ is the matrix (truncated to the higher order M') whose columns are the values of the spherical harmonic functions for the set of directions corresponding to some set of analysis plane waves.
$\hat{Y}_{\text{plw}}$ is similar to $Y_{\text{plw}}$ except it has been truncated to the lower order M (with M < M').
$T_{\text{plw/HOA}}$ is a transfer matrix between the analysis plane waves and the HOA-estimated spherical harmonic expansion (derived from the microphone array 12) as:
Figure imgf000020_0001
$T_{\text{plw/mic}}$ is a transfer matrix between the analysis plane waves and the microphone array 12 as:
Figure imgf000020_0002
where:
Tsph/mic is as defined above.
$E_{\text{mic/HOA}}(t)$ is a matrix of filters that implements, via a convolution operation, the transformation between the time signals of the microphone array 12 and the HOA-domain time signals and is defined as:
Figure imgf000020_0003
where:
each frequency component of $E_{\text{mic/HOA}}(\omega)$ is given by $E_{\text{mic/HOA}} = \mathrm{pinv}(\hat{T}_{\text{sph/mic}})$.
The operations performed on the set of signals, $s_{\text{mic}}(t)$, are now described with reference to the flow charts illustrated in Figs. 2-16 of the drawings. The flow chart shown in Fig. 2 provides an overview of the flow of operations to estimate the sparsity, S, of a recorded sound field. This flow chart is broken down into higher levels of detail in Figs. 3-5. The flow chart shown in Fig. 6 provides an overview of the flow of operations to reconstruct a recorded sound field. The flow chart of Fig. 6 is broken down into higher levels of detail in Figs. 7-16.
The operations performed on the set of signals, $s_{\text{mic}}(t)$, by the SPM 14 to determine the sparsity, S, of the sound field are now described with reference to the flow charts of Figs. 2-5. In Fig. 2, at Step 1, the microphone array 12 is used to record a set of signals, $s_{\text{mic}}(t)$. At Step 2, the SPM 14 estimates the sparsity of the sound field. The flow chart shown in Fig. 3 describes the details of the calculations for Step 2. At Step 2.1, the SPM 14 calculates a vector of HOA-domain time signals, $b_{\text{HOA}}(t)$, by convolving the microphone signals with the matrix of filters $E_{\text{mic/HOA}}(t)$:
$$b_{\text{HOA}}(t) = E_{\text{mic/HOA}}(t) \otimes s_{\text{mic}}(t)$$
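The encoding of Step 2.1 can be sketched in the frequency domain, where the convolution with the filter matrix becomes a per-bin matrix product with the pseudo-inverse of a transfer matrix. The sketch makes several simplifying assumptions that are not taken from the disclosure: the transfer matrices are synthetic and real-valued, the convolution is treated as circular over one block, and the function name `encode_to_hoa` and all sizes are hypothetical.

```python
import numpy as np

def encode_to_hoa(s_mic, T_sph_mic):
    """Apply E(w) = pinv(T(w)) to the microphone spectra, bin by bin.

    s_mic:     (n_mics, n_samples) microphone time signals.
    T_sph_mic: (n_bins, n_mics, n_hoa) transfer matrix per rfft bin.
    Returns    (n_hoa, n_samples) HOA-domain time signals.
    """
    n_samples = s_mic.shape[1]
    S_mic = np.fft.rfft(s_mic, axis=1)                # (n_mics, n_bins)
    E = np.linalg.pinv(T_sph_mic)                     # batched: (n_bins, n_hoa, n_mics)
    B = np.einsum("khm,mk->hk", E, S_mic)             # (n_hoa, n_bins)
    return np.fft.irfft(B, n=n_samples, axis=1)

rng = np.random.default_rng(0)
n_mics, n_hoa, n_samples = 6, 4, 32                   # hypothetical sizes
n_bins = n_samples // 2 + 1

# Synthetic, frequency-dependent transfer matrices (illustrative only).
A = rng.standard_normal((n_mics, n_hoa))
T = A[None, :, :] * (1.0 + 0.1 * np.arange(n_bins))[:, None, None]

# Simulate microphone signals from known HOA signals, then encode them back.
b_true = rng.standard_normal((n_hoa, n_samples))
S_sim = np.einsum("kmh,hk->mk", T, np.fft.rfft(b_true, axis=1))
s_mic = np.fft.irfft(S_sim, n=n_samples, axis=1)

b_hoa = encode_to_hoa(s_mic, T)
```

In practice the pseudo-inverse would usually be regularised at frequencies where the transfer matrix is poorly conditioned, and block processing would use overlap-add to avoid circular wrap-around.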
At Step 2.2, there are two different options available: Step 2.2.A and Step 2.2.B.
At Step 2.2.A, the SPM 14 estimates the sparsity of the sound field by applying ICA in the HOA-domain. Instead, at Step 2.2.B the SPM 14 estimates the sparsity of the sound field using a Compressed Sampling technique.
The flow chart of Fig. 4 describes the details of Step 2.2. A. At Step 2.2.A.1, the SPM 14 determines a mixing matrix, MICA , using Independent Component Analysis techniques.
At Step 2.2.A.2, the SPM 14 projects the mixing matrix, $M_{\text{ICA}}$, on the HOA direction vectors associated with a set of plane-wave basis directions. This projection is obtained by computing $V_{\text{source}} = \hat{Y}^{T}_{\text{plw}} M_{\text{ICA}}$, where $\hat{Y}^{T}_{\text{plw}}$ is the transpose of the Defined Matrix $\hat{Y}_{\text{plw}}$.
At Step 2.2.A.3, the SPM 14 applies thresholding techniques to clean $V_{\text{source}}$ in order to obtain $V_{\text{source-clean}}$. The operation of cleaning $V_{\text{source}}$ occurs as follows. Firstly, the ideal format for $V_{\text{source}}$ is defined. $V_{\text{source}}$ is a matrix which is ideally composed of columns which either have all components equal to zero or contain a single dominant component corresponding to a specific plane-wave direction, with the rest of the column's components being zero. Thresholding techniques are applied to ensure that $V_{\text{source}}$ takes its ideal format. That is to say, columns of $V_{\text{source}}$ which contain a dominant value compared to the rest of the column's components are thresholded so that all components less than the dominant component are set to zero. Also, each column of $V_{\text{source}}$ which does not have a dominant component has all of its components set to zero.
Applying the above thresholding operations to Vsource gives Vsource.clean .
At Step 2.2.A.4, the SPM 14 computes the sparsity of the sound field. It does this by calculating the number, $N_{\text{source}}$, of dominant plane-wave directions in $V_{\text{source-clean}}$.
The SPM 14 then computes the sparsity, S, of the sound field as $S = 1 - N_{\text{source}} / N_{\text{plw}}$, where $N_{\text{plw}}$ is the number of analysis plane-wave basis directions.
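The cleaning and counting of Steps 2.2.A.3 and 2.2.A.4 can be sketched as follows. The dominance test (the peak must hold at least a given share of the column energy), the `dominance` parameter, and the toy matrix are hypothetical choices, since the text does not specify the thresholding rule. The sparsity formula $S = 1 - N_{\text{source}}/N_{\text{plw}}$ is likewise assumed here, because the equation is not legible in this text.

```python
import numpy as np

def clean_v_source(V_source, dominance=0.8):
    """Keep only each column's dominant entry; zero columns without one.

    A column counts as having a dominant component when its largest entry
    holds at least `dominance` of the column energy (hypothetical criterion).
    """
    V_clean = np.zeros_like(V_source)
    for j in range(V_source.shape[1]):
        col = np.abs(V_source[:, j])
        energy = np.sum(col ** 2)
        peak = int(np.argmax(col))
        if energy > 0 and col[peak] ** 2 / energy >= dominance:
            V_clean[peak, j] = V_source[peak, j]
    return V_clean

def sparsity(V_clean):
    """Assumed formula: S = 1 - N_source / N_plw (rows index plane-wave directions)."""
    n_plw = V_clean.shape[0]
    n_source = int(np.count_nonzero(np.any(V_clean != 0, axis=1)))
    return 1.0 - n_source / n_plw

# Rows: 5 analysis directions; columns: 3 ICA components (toy values).
V_source = np.array([
    [0.05, 0.40,  0.01],
    [0.95, 0.50,  0.02],
    [0.02, 0.45,  0.00],
    [0.01, 0.40,  0.03],
    [0.03, 0.50, -0.90],
])
V_clean = clean_v_source(V_source)
S = sparsity(V_clean)      # columns 0 and 2 are directional, column 1 is diffuse
```

With this convention a value of S near 1 indicates that only a few of the analysis directions carry dominant sources.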
The flow chart of Fig. 5 describes the details of Step 2.2.B in Fig. 3, Step 2.2.B being an alternative to Step 2.2.A. At Step 2.2.B.1, the SPM 14 calculates the matrix $B_{\text{HOA}}$ from the vector of HOA signals $b_{\text{HOA}}(t)$ by setting each signal in $b_{\text{HOA}}(t)$ to run along a row of $B_{\text{HOA}}$, so that time runs along the rows of the matrix $B_{\text{HOA}}$ and the various HOA orders run along the columns of the matrix $B_{\text{HOA}}$. More specifically, the SPM 14 samples $b_{\text{HOA}}(t)$ over a given time frame, labelled by L, to obtain a collection of time samples at the time instances $t_1$ to $t_N$. The SPM 14 thus obtains a set of HOA-domain vectors, one per time instant: $b_{\text{HOA}}(t_1), b_{\text{HOA}}(t_2), \ldots, b_{\text{HOA}}(t_N)$. The SPM 14 then forms the matrix, $B_{\text{HOA}}$, by:
$$B_{\text{HOA}} = [\, b_{\text{HOA}}(t_1) \;\; b_{\text{HOA}}(t_2) \;\; \cdots \;\; b_{\text{HOA}}(t_N) \,]$$
At Step 2.2.B.2, the SPM 14 calculates a correlation vector, $\gamma$, as
$$\gamma = B_{\text{HOA}}\, b_{\text{omni}}$$
where $b_{\text{omni}}$ is the omni-directional HOA-component of $b_{\text{HOA}}(t)$ expressed as a column vector.
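Steps 2.2.B.1 and 2.2.B.2 amount to slicing one frame out of the HOA signals and correlating every channel with the omni-directional channel. The sketch below assumes that the omni-directional component is stored in row 0 and uses the unnormalised product $\gamma = B_{\text{HOA}}\, b_{\text{omni}}$; both are assumptions, and the function names and toy signal are hypothetical.

```python
import numpy as np

def frame_matrix(b_hoa, start, n_frame):
    """Columns are the HOA vectors b_HOA(t_1) ... b_HOA(t_N) of one time frame."""
    return b_hoa[:, start:start + n_frame]

def correlation_vector(B_hoa, omni_row=0):
    """gamma = B_HOA @ b_omni, with b_omni the omni channel as a column vector."""
    b_omni = B_hoa[omni_row, :]
    return B_hoa @ b_omni

# Toy HOA signals: each channel is a scaled copy of one source signal.
source = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 0.0])
b_hoa = np.vstack([source, 0.5 * source, -source, 0.0 * source])

B_hoa = frame_matrix(b_hoa, start=1, n_frame=4)   # samples 1, 2, 3, 4
gamma = correlation_vector(B_hoa)                 # each entry is a channel gain times 30
```

For a single plane wave, the entries of `gamma` are proportional to that wave's HOA direction vector, which is why the correlation vector can stand in for the full frame when estimating direction gains.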
At Step 2.2.B.3, the SPM 14 solves the following convex programming problem to obtain the vector of plane-wave gains, Pplw.cs :
Figure imgf000022_0001
where $T_{\text{plw/HOA}}$ is one of the defined matrices and $\varepsilon_1$ is a non-negative real number.
At Step 2.2.B.4, the SPM 14 estimates the sparsity of the sound field. It does this by applying a thresholding technique to $\beta_{\text{plw-cs}}$ in order to estimate the number, $N_{\text{comp}}$, of its Dominant Components. The SPM 14 then computes the sparsity, S, of the sound field as $S = 1 - N_{\text{comp}} / N_{\text{plw}}$, where $N_{\text{plw}}$ is the number of analysis plane-wave basis directions.
The operations performed on the set of signals, $s_{\text{mic}}(t)$, by the SPM 14 to reconstruct the sound field are now described and illustrated using the flow charts of Figs. 6-23.
In Fig. 6, Step 1 and Step 2 are the same as in the flow chart of Fig. 2 which has been described above. However, in the operational flow of Fig. 6, Step 2 is optional and is therefore represented by a dashed box.
At Step 3, the SPM 14 estimates the parameters, in the form of plane-wave signals $g_{\text{plw}}(t)$, that allow the sound field to be reconstructed. The plane-wave signals are expressed either as $g_{\text{plw-cs}}(t)$ or $g_{\text{plw-ica}}(t)$, depending on the method of derivation. At Step 4 there is an optional step (represented by a dashed box) in which the estimated parameters are modified by the SPM 14 to reduce reverberation and/or separate unwanted sounds. At Step 5, the SPM 14 estimates the plane-wave signals, $g_{\text{plw}}(t)$ (possibly modified), that are used to reconstruct and play back the sound field.
The operations of Step 1 and Step 2 having been previously described, the flow of operations contained in Step 3 are now described.
The flow chart of Fig. 7 provides an overview of the operations required for Step 3 of the flow chart shown in Fig. 6. It shows that there are four different paths available: Step 3. A, Step 3.B, Step 3.C and Step 3.D.
At Step 3. A, the SPM 14 estimates the plane- wave signals using a Compressive
Sampling technique in the time-domain. At Step 3.B, the SPM 14 estimates the plane- wave signals using a Compressive Sampling technique in the frequency-domain. At Step 3.C, the SPM 14 estimates the plane- wave signals using ICA in the HOA-domain. At Step 3.D, the SPM 14 estimates the plane-wave signals using Compressive Sampling in the time domain using a multiple measurement vector technique.
The flow chart shown in Fig. 8 describes the details of Step 3.A. At Step 3.A.1, bHOA(t) and BHOA are determined by the SPM 14 as described above for Step 2.1 and Step 2.2.B.1, respectively.
At Step 3.A.2 the correlation vector, γ , is determined by the SPM 14 as described above for Step 2.2.B.2.
At Step 3.A.3 there are two options: Step 3.A.3.A and Step 3.A.3.B. At Step 3.A.3.A, the SPM 14 solves a convex programming problem to determine plane-wave direction gains, pplw-cs. This convex programming problem does not include a sparsity constraint. More specifically, the following convex programming problem is solved:
Figure imgf000023_0001
where:
γ is as defined above and is one of the Defined Matrices,
Figure imgf000023_0004
ε1 is a non-negative real number.
At Step 3.A.3.B, the SPM 14 solves a convex programming problem to determine plane-wave direction gains, pplw-cs, only this time a sparsity constraint is included in the convex programming problem. More specifically, the following convex programming problem is solved to determine pplw-cs:
Figure imgf000023_0002
where:
Figure imgf000024_0001
γ and ε1 are as defined above,
Tplw/HOA is one of the Defined Matrices, and
ε2 is a non-negative real number.
For the convex programming problems at Step 3.A.3, ε1 may be set by the SPM 14 based on the resolution of the spatial division of a set of directions corresponding to the set of analysis plane waves. Further, the value of ε2 may be set by the SPM 14 based on the computed sparsity of the sound field (optional Step 2).
At Step 3.A.4, the SPM 14 applies thresholding techniques to clean pplw-cs so that some of its small components are set to zero.
At Step 3.A.5, the SPM 14 forms a matrix, Ŷplw-HOA, according to the plane-wave basis and then reduces Ŷplw-HOA to Ŷplw-reduced by keeping only the columns corresponding to the non-zero components in pplw-cs, where Ŷplw-HOA is an HOA direction matrix for the plane-wave basis and the hat-operator on Ŷplw-HOA indicates it has been truncated to some HOA-order M.
At Step 3.A.6, the SPM 14 calculates gplw-cs-reduced(t) as:
Figure imgf000024_0002
where Ŷplw-reduced and bHOA(t) are as defined above.
At Step 3.A.7, the SPM 14 expands gplw-cs-reduced(t) to obtain gplw-cs(t) by inserting rows of time signals of zeros to match the plane-wave basis that has been used for the analyses.
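By way of non-limiting illustration, Steps 3.A.4 to 3.A.7 may be sketched in Python with NumPy as follows. The sketch assumes that the calculation at Step 3.A.6 is a least-squares fit using the pseudo-inverse of the reduced matrix, and that the thresholding rule keeps gains above a fixed fraction of the largest gain. The function name, the threshold value and the array sizes are hypothetical and are not taken from the disclosure.

```python
import numpy as np

def reconstruct_from_gains(p_plw, Y_plw, b_hoa, rel_threshold=0.1):
    """Threshold the gains, reduce the basis, solve, and re-expand with zero rows."""
    # Step 3.A.4 (assumed rule): drop gains below a fraction of the largest gain.
    keep = np.abs(p_plw) >= rel_threshold * np.max(np.abs(p_plw))
    # Step 3.A.5: keep only the columns for the surviving plane-wave directions.
    Y_reduced = Y_plw[:, keep]
    # Step 3.A.6 (assumed form): least-squares solution via the pseudo-inverse.
    g_reduced = np.linalg.pinv(Y_reduced) @ b_hoa
    # Step 3.A.7: insert zero rows so the result matches the full plane-wave basis.
    g_full = np.zeros((Y_plw.shape[1], b_hoa.shape[1]))
    g_full[keep, :] = g_reduced
    return g_full, keep

rng = np.random.default_rng(0)
Y_plw = rng.standard_normal((9, 24))                 # hypothetical order-2 basis, 24 directions
g_true = np.zeros((24, 50))
g_true[[3, 17], :] = rng.standard_normal((2, 50))    # two active plane waves
b_hoa = Y_plw @ g_true                               # HOA-domain signals
p_plw = np.zeros(24)
p_plw[[3, 17]] = [1.0, 0.6]                          # gains as if returned by Step 3.A.3
g_plw_cs, active = reconstruct_from_gains(p_plw, Y_plw, b_hoa)
```

In this toy case the two active directions are retained and the expanded signals equal the signals that generated bHOA(t), because the reduced system is over-determined.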
As indicated above, an alternative to Step 3.A is Step 3.B. The flow chart of Fig. 9 details Step 3.B. At Step 3.B.1, the SPM 14 computes bHOA(t) as bHOA(t) = Emic/HOA(t) ⊗ smic(t). Further, at Step 3.B.1, the SPM 14 calculates an FFT, smic, of smic(t) and/or an FFT, bHOA, of bHOA(t).
At Step 3.B.2, the SPM 14 solves one of four optional convex programming problems. The convex programming problem shown at Step 3.B.2.A operates on smic and does not use a sparsity constraint. More precisely, the SPM 14 solves the following convex programming problem to determine
Figure imgf000025_0005
Figure imgf000025_0003
where:
Tplw/mic is one of the Defined Matrices,
smic is as defined above, and
ε1 is a non-negative real number.
The convex programming problem shown at Step 3.B.2.B operates on smic and includes a sparsity constraint. More precisely, the SPM 14 solves the following convex programming problem to determine gplw-cs:
Figure imgf000025_0001
where:
Tplw/mic and Tplw/HOA are each one of the Defined Matrices,
smic, bHOA and ε1 are as defined above, and
ε2 is a non-negative real number.
The convex programming problem shown at Step 3.B.2.C operates on bHOA and does not use a sparsity constraint. More precisely, the SPM 14 solves the following convex programming problem to determine
Figure imgf000025_0004
Figure imgf000025_0002
where:
Figure imgf000025_0006
are each one of the Defined Matrices, and
bHOA and ε1 are as defined above.
The convex programming problem shown at Step 3.B.2.D operates on bHOA and includes a sparsity constraint. More precisely, the SPM 14 solves the following convex programming problem to determine gplw-cs:
Figure imgf000026_0001
where:
Figure imgf000026_0003
are each one of the Defined Matrices, and
bHOA, ε1, and ε2 are as defined above.
At Step 3.B.3, the SPM 14 computes an inverse FFT of gplw-cs to obtain gplw-cs(t). When operating on multiple time frames, overlap-and-add procedures are followed.
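As a hedged illustration of Step 3.B.3, the following Python sketch inverse-transforms a sequence of frequency-domain frames for a single plane-wave channel and combines them by overlap-add. The periodic Hann window, the 50% overlap and the function name are assumptions made for the example; the disclosure does not specify the window or the hop size.

```python
import numpy as np

def overlap_add(freq_frames, frame_len, hop):
    """Inverse-FFT each frame and sum the results at multiples of `hop`."""
    out = np.zeros(hop * (len(freq_frames) - 1) + frame_len)
    for k, spectrum in enumerate(freq_frames):
        out[k * hop : k * hop + frame_len] += np.fft.irfft(spectrum, n=frame_len)
    return out

frame_len, hop = 16, 8
# Periodic Hann window: copies shifted by half a frame sum to one.
window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame_len) / frame_len)
x = np.sin(2 * np.pi * 3 * np.arange(64) / 64)        # stand-in for one channel
starts = range(0, len(x) - frame_len + 1, hop)
frames = [np.fft.rfft(window * x[s : s + frame_len]) for s in starts]
y = overlap_add(frames, frame_len, hop)
```

With no processing applied between the forward and inverse transforms, the interior of `y` reproduces `x`; only the first and last half-frames, which are covered by a single window, differ.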
A further option to Step 3.A or Step 3.B is Step 3.C. The flow chart of Fig. 10 provides an overview of Step 3.C. At Step 3.C.1, the SPM 14 computes bHOA(t) as
Figure imgf000026_0002
At Step 3.C.2 there are two options, Step 3.C.2.A and Step 3.C.2.B. At Step 3.C.2.A, the SPM 14 uses ICA in the HOA-domain to estimate a mixing matrix which is then used to obtain gplw-ica(t). Instead, at Step 3.C.2.B, the SPM 14 uses ICA in the HOA-domain to estimate a mixing matrix and also a set of separated source signals. Both the mixing matrix and the separated source signals are then used by the SPM 14 to obtain gplw-ica(t).
The flow chart of Fig. 11 describes the details of Step 3.C.2.A. At Step 3.C.2.A.1, the SPM 14 applies ICA to the vector of signals bHOA(t) to obtain the mixing matrix, MICA.
At Step 3.C.2.A.2, the SPM 14 projects the mixing matrix, MICA, on the HOA direction vectors associated with a set of plane-wave basis directions as described at Step 2.2.A.2. That is to say, the projection is obtained by computing Vsource = Ŷplw^T MICA, where Ŷplw^T is the transpose of the defined matrix Ŷplw.
At Step 3.C.2.A.3, the SPM 14 applies thresholding techniques to Vsource to identify the dominant plane-wave directions in Vsource. This is achieved similarly to the operation described above with reference to Step 2.2.A.3.
At Step 3.C.2.A.4, there are two options, Step 3.C.2.A.4.A and Step 3.C.2.A.4.B. At Step 3.C.2.A.4.A, the SPM 14 uses the HOA-domain matrix, Ŷplw^T, to compute gplw-ica-reduced(t). Instead, at Step 3.C.2.A.4.B, the SPM 14 uses the microphone signals smic(t) and the matrix Tplw/mic to compute gplw-ica-reduced(t).
The flow chart of Fig. 12 describes the details of Step 3.C.2.A.4.A. At Step 3.C.2.A.4.A.1, the SPM 14 reduces the matrix Ŷplw^T to obtain the matrix, Ŷplw-reduced^T, by removing the plane-wave direction vectors in Ŷplw that do not correspond to dominant source directions associated with matrix Vsource.
At Step 3.C.2.A.4.A.2, the SPM 14 calculates gplw-ica-reduced(t) as:
Figure imgf000027_0001
where Ŷplw-reduced and bHOA(t) are as defined above.
An alternative to Step 3.C.2.A.4.A is Step 3.C.2.A.4.B. The flow chart of Fig. 13 details Step 3.C.2.A.4.B.
At Step 3.C.2.A.4.B.1, the SPM 14 calculates an FFT, smic, of smic(t). At Step 3.C.2.A.4.B.2, the SPM 14 reduces the matrix Tplw/mic to obtain the matrix, Tplw/mic-reduced, by removing the plane-wave directions in Tplw/mic that do not correspond to dominant source directions associated with matrix Vsource.
At Step 3.C.2.A.4.B.3, the SPM 14 calculates gplw-ica-reduced as:
Figure imgf000027_0002
where Tplw/mic-reduced and smic are as defined above.
At Step 3.C.2.A.4.B.4, the SPM 14 calculates gplw-ica-reduced(t) as the IFFT of gplw-ica-reduced.
Reverting to Fig. 11, at Step 3.C.2.A.5, the SPM 14 expands gplw-ica-reduced(t) to obtain gplw-ica(t) by inserting rows of time signals of zeros to match the plane-wave basis that has been used for the analyses.
An alternative to Step 3.C.2.A is Step 3.C.2.B. The flow chart of Fig. 14 describes the details of Step 3.C.2.B.
At Step 3.C.2.B.1, the SPM 14 applies ICA to the vector of signals bHOA(t) to obtain the mixing matrix, MICA, and a set of separated source signals gica(t).
At Step 3.C.2.B.2, the SPM 14 projects the mixing matrix, MICA, on the HOA direction vectors associated with a set of plane-wave basis directions as described for Step 2.2.A.2, i.e. the projection is obtained by computing
Figure imgf000027_0003
where
Figure imgf000027_0004
is the transpose of the defined matrix
Figure imgf000027_0005
At Step 3.C.2.B.3, the SPM 14 applies thresholding techniques to Vsource to identify the dominant plane-wave directions in Vsource. This is achieved similarly to the operation described above for Step 2.2.A.3. Once the dominant plane-wave directions in Vsource have been identified, the SPM 14 cleans gica(t) to obtain gplw-ica(t), which retains the signals corresponding to the dominant plane-wave directions in Vsource and sets the other signals to zero.
As described above, a further option to Steps 3.A, 3.B and 3.C is Step 3.D. The flow chart of Figure 15 provides an overview of Step 3.D.
At Step 3.D.1, the SPM 14 computes bHOA(t) as
Figure imgf000028_0003
The SPM 14 then calculates the matrix, BHOA, from the vector of HOA signals bHOA(t) by setting each signal in bHOA(t) to run along the rows of BHOA so that time runs along the rows of the matrix BHOA and the various HOA orders run along the columns of the matrix BHOA. More specifically, the SPM 14 samples bHOA(t) over a given time frame, L, to obtain a collection of time samples at the time instances t1 to tN. The SPM 14 thus obtains a set of HOA-domain vectors at each time instant: bHOA(t1), bHOA(t2), ..., bHOA(tN). The SPM 14 forms the matrix, BHOA, by:
Figure imgf000028_0002
At Step 3.D.2 there are two options, Step 3.D.2.A and Step 3.D.2.B. At Step 3.D.2.A, the SPM 14 computes gplw-cs using a multiple measurement vector technique applied directly on BHOA. Instead, at Step 3.D.2.B, the SPM 14 computes gplw-cs using a multiple measurement vector technique based on the singular value decomposition of BHOA.
The flow chart of Fig. 16 describes the details of Step 3.D.2.A. At Step 3.D.2.A.1, the SPM 14 solves the following convex programming problem to determine Gplw:
Figure imgf000028_0001
where:
Yplw is one of the Defined Matrices,
BHOA is as defined above, and
ε1 is a non-negative real number.
At Step 3.D.2.A.2, there are two options, i.e. Step 3.D.2.A.2.A and Step 3.D.2.A.2.B. At Step 3.D.2.A.2.A, the SPM 14 computes gplw-cs(t) directly from Gplw using an overlap-add technique. Instead, at Step 3.D.2.A.2.B, the SPM 14 computes gplw-cs(t) using a smoothed version of Gplw and an overlap-add technique.
The flow chart of Fig. 17 describes Step 3.D.2.A.2.B in greater detail.
At Step 3.D.2.A.2.B.1, the SPM 14 calculates an unmixing matrix, ΠL, for the L-th time frame, by calculating:
Figure imgf000029_0005
where ΠL-1 refers to the unmixing matrix for the (L-1)-th time frame, α is a forgetting factor such that 0 < α ≤ 1, and BHOA is as defined above.
At Step 3.D.2.A.2.B.2, the SPM 14 calculates Gplw-smooth as:
Figure imgf000029_0004
where ΠL and BHOA are as defined above.
At Step 3.D.2.A.2.B.3, the SPM 14 calculates gplw-cs(t) from Gplw-smooth using an overlap-add technique.
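The smoothing of Step 3.D.2.A.2.B may be sketched as below. The exact update equations appear in figures that are not reproduced in this text, so the sketch assumes one plausible form: the per-frame unmixing matrix is a least-squares map from BHOA to Gplw, it is blended with the previous frame's matrix using the forgetting factor α, and the smoothed signals are obtained by applying the blended matrix to BHOA. The function name and array sizes are hypothetical.

```python
import numpy as np

def smooth_plane_waves(G_plw, B_hoa, Pi_prev, alpha):
    """Assumed update: blend the previous unmixing matrix with this frame's estimate."""
    Pi_frame = G_plw @ np.linalg.pinv(B_hoa)             # least-squares map B_hoa -> G_plw
    Pi_L = (1.0 - alpha) * Pi_prev + alpha * Pi_frame    # exponential forgetting
    return Pi_L, Pi_L @ B_hoa                            # unmixing matrix, smoothed signals

rng = np.random.default_rng(1)
B_hoa = rng.standard_normal((9, 64))                     # 9 HOA components, 64 samples
A = rng.standard_normal((24, 9))                         # hypothetical ground-truth unmixing
G_plw = A @ B_hoa
Pi_prev = np.zeros((24, 9))
Pi_L, G_smooth = smooth_plane_waves(G_plw, B_hoa, Pi_prev, alpha=1.0)
```

With α = 1 the previous frame is ignored and the smoothed signals coincide with Gplw in this example; smaller values of α give more weight to earlier frames.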
An alternative to Step 3.D.2.A is Step 3.D.2.B. The flow chart of Fig. 18 describes the details of Step 3.D.2.B.
At Step 3.D.2.B.1, the SPM 14 computes the singular value decomposition of BHOA to obtain the matrix decomposition:
Figure imgf000029_0003
At Step 3.D.2.B.2, the SPM 14 calculates the matrix, Sreduced, by keeping only the first m columns of S, where m is the number of rows of BHOA.
At Step 3.D.2.B.3, the SPM 14 calculates matrix Ω as:
Ω = U Sreduced.
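Steps 3.D.2.B.1 to 3.D.2.B.3 may be illustrated with NumPy as follows. The array sizes and variable names are hypothetical, and the sketch assumes that BHOA has at least as many columns (time samples) as rows (HOA components).

```python
import numpy as np

rng = np.random.default_rng(0)
m, n_samples = 9, 64                       # hypothetical: order-2 HOA, 64 time samples
B_hoa = rng.standard_normal((m, n_samples))

# Thin SVD: U is m x m, s holds the m singular values, Vt is m x n_samples.
U, s, Vt = np.linalg.svd(B_hoa, full_matrices=False)

# Keeping the first m columns of S leaves the m x m diagonal block.
S_reduced = np.diag(s)
Omega = U @ S_reduced                      # m x m matrix used in place of B_hoa
```

The matrix Ω has only m columns, so the subsequent convex programming problem is solved for m columns rather than for every time sample, and BHOA is recovered as Ω multiplied by the retained rows of VT.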
At Step 3.D.2.B.4, the SPM 14 solves the following convex programming problem for matrix Γ :
minimize
subject to
Figure imgf000029_0001
where:
Yplw is one of the Defined Matrices,
Ω is as defined above, and
ε1 is a non-negative real number.
At Step 3.D.2.B.5, there are two options, Step 3.D.2.B.5.A and Step 3.D.2.B.5.B. At Step 3.D.2.B.5.A, the SPM 14 calculates Gplw from Γ using:
Figure imgf000029_0002
where VT is obtained from the matrix decomposition of BHOA as described above. The SPM 14 then computes gplw-cs(t) directly from Gplw using an overlap-add technique.
Instead, at Step 3.D.2.B.5.B, the SPM 14 calculates gplw-cs(t) using a smoothed version of Gplw and an overlap-add technique.
The flow chart of Fig. 19 shows the details of Step 3.D.2.B.5.B.
At Step 3.D.2.B.5.B.1, the SPM 14 calculates an unmixing matrix, ΠL, for the L-th time frame, by calculating:
Figure imgf000030_0002
where ΠL-1 refers to the unmixing matrix for the (L-1)-th time frame, α is a forgetting factor such that 0 < α < 1, and Γ and Ω are as defined above.
At Step 3.D.2.B.5.B.2, the SPM 14 calculates Gplw-smooth as:
Figure imgf000030_0001
where and A are as defined above.
Figure imgf000030_0003
At Step 3.D.2.B.5.B.3, the SPM 14 calculates gplw-cs(t) from Gplw-smooth using an overlap-add technique.
As described above, an optional step of reducing unwanted artifacts is shown at Step 4 of the flow chart of Fig. 6. The SPM 14 controls the amount of reverberation present in the sound field reconstruction by reducing the signal values of some of the signals in the signal vector gplw(t). Instead, or in addition, the SPM 14 removes undesired sound sources in the sound field reconstruction by setting to zero some of the signals in the signal vector gplw(t).
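A minimal sketch of Step 4 is given below. It assumes that the rows of gplw(t) to be attenuated or removed have already been selected by some means; the function name, the dictionary of attenuation gains and the chosen row indices are hypothetical and serve only to illustrate the two operations described.

```python
import numpy as np

def modify_plane_wave_signals(g_plw, attenuate=None, remove=()):
    """Scale selected plane-wave signals and zero others, leaving the input untouched."""
    g_mod = np.array(g_plw, dtype=float, copy=True)
    for row, gain in (attenuate or {}).items():
        g_mod[row] *= gain          # reduce reverberant directions
    for row in remove:
        g_mod[row] = 0.0            # discard unwanted sources
    return g_mod

g_plw = np.ones((4, 3))                                  # 4 directions, 3 time samples
g_clean = modify_plane_wave_signals(g_plw, attenuate={1: 0.5}, remove=[2])
```

Here direction 1 is halved in level and direction 2 is removed, while the remaining directions pass through unchanged.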
In Step 5 of the flow chart of Fig. 6, the parameters gplw(t) are used to play back the sound field. The flow chart of Fig. 20 shows three optional paths for play back of the sound field: Step 5.A, Step 5.B, and Step 5.C. The flow chart of Fig. 21 describes the details of Step 5.A.
At Step 5.A.1, the SPM 14 computes or retrieves from data storage the loudspeaker panning matrix, Pplw/spk, in order to enable loudspeaker playback of the reconstructed sound field over the loudspeaker array 20. The panning matrix, Pplw/spk, can be derived using any of the various panning techniques such as, for example, Vector Based Amplitude Panning (VBAP). At Step 5.A.2, the SPM 14 calculates the loudspeaker signals as
Figure imgf000030_0005
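Step 5.A may be illustrated as follows, on the assumption that the loudspeaker signals are obtained by multiplying the panning matrix by the plane-wave signals. For brevity the sketch does not implement VBAP; it substitutes a much simpler nearest-loudspeaker rule on a horizontal ring, and the function name and the example angles are hypothetical.

```python
import numpy as np

def nearest_speaker_panning(plw_az_deg, spk_az_deg):
    """Panning matrix (n_spk x n_plw) sending each plane wave to its nearest loudspeaker."""
    plw = np.asarray(plw_az_deg, dtype=float)[np.newaxis, :]
    spk = np.asarray(spk_az_deg, dtype=float)[:, np.newaxis]
    dist = np.abs((plw - spk + 180.0) % 360.0 - 180.0)    # wrapped angular distance
    P = np.zeros(dist.shape)
    P[np.argmin(dist, axis=0), np.arange(plw.shape[1])] = 1.0
    return P

plw_az = [10, 80, 100, 200, 350]           # analysis plane-wave azimuths (degrees)
spk_az = [0, 90, 180, 270]                 # loudspeaker azimuths (degrees)
P_plw_spk = nearest_speaker_panning(plw_az, spk_az)
g_plw = np.arange(15, dtype=float).reshape(5, 3)          # 5 directions, 3 samples
g_spk = P_plw_spk @ g_plw                  # assumed form of the Step 5.A.2 calculation
```

Each column of the panning matrix sums to one, so every plane-wave signal is routed to exactly one loudspeaker; a VBAP matrix would instead share each plane wave between adjacent loudspeakers.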
Another option is shown in the flow chart of Fig. 22 which describes the details of Step 5.B. At Step 5.B.1, the SPM 14 computes bHOA-highres(t) in order to enable loudspeaker playback of the reconstructed sound field over the loudspeaker array 20. bHOA-highres(t) is a high-resolution HOA-domain representation of gplw(t) that is capable of expansion to an arbitrary HOA-domain order. The SPM 14 calculates bHOA-highres(t) as
Figure imgf000031_0001
where Yplw is one of the Defined Matrices and the hat-operator on Yplw indicates it has been truncated to some HOA-order M.
At Step 5.B.2, the SPM 14 decodes bHOA-highres(t) using HOA decoding techniques.
An alternative to loudspeaker play back is headphone play back. The operations for headphone play back are shown at Step 5.C of the flow chart of Fig. 20. The flow chart of Fig. 23 describes the details of Step 5.C.
At Step 5.C.1, the SPM 14 computes or retrieves from data storage the head-related impulse response matrix of filters, Pplw/hph(t), corresponding to the set of analysis plane-wave directions in order to enable headphone playback of the reconstructed sound field over one or more of the headphones 22. The head-related impulse response (HRIR) matrix of filters, Pplw/hph(t), is derived from HRTF measurements.
At Step 5.C.2, the SPM 14 calculates the headphone signals as
Figure imgf000031_0006
using a filter convolution operation.
Figure imgf000031_0005
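The filter convolution of Step 5.C.2 may be sketched as below, assuming that each ear signal is the sum, over all analysis plane-wave directions, of the corresponding HRIR convolved with the plane-wave signal for that direction. The function name, the array layout (ear, direction, tap) and the toy filters are hypothetical and are not measured HRIRs.

```python
import numpy as np

def render_headphones(hrir, g_plw):
    """Sum over directions of HRIR convolved with the matching plane-wave signal."""
    n_ears, n_plw, n_taps = hrir.shape
    out = np.zeros((n_ears, g_plw.shape[1] + n_taps - 1))
    for ear in range(n_ears):
        for d in range(n_plw):
            out[ear] += np.convolve(hrir[ear, d], g_plw[d])
    return out

hrir = np.zeros((2, 2, 3))                 # 2 ears, 2 directions, 3 taps
hrir[0, 0] = [1.0, 0.0, 0.0]               # left ear, direction 0: direct
hrir[0, 1] = [0.0, 1.0, 0.0]               # left ear, direction 1: one-sample delay
hrir[1, 0] = [0.0, 0.5, 0.0]               # right ear, direction 0: delayed, attenuated
hrir[1, 1] = [1.0, 0.0, 0.0]               # right ear, direction 1: direct
g_plw = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
g_hph = render_headphones(hrir, g_plw)
```

For long signals the same operation is usually carried out with FFT-based fast convolution, which gives identical results at lower cost.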
It will be appreciated by those skilled in the art that the basic HOA decoding for loudspeakers is given (in the frequency domain) by:
Figure imgf000031_0002
where:
Nspk is the number of loudspeakers,
Ŷspk^T is the transpose of the matrix whose columns are the values of the spherical harmonic functions,
Figure imgf000031_0004
where
Figure imgf000031_0003
are the spherical coordinates for the k-th loudspeaker and the hat-operator on Ŷspk^T indicates it has been truncated to order M, and
bHOA is the vector of playback signals represented in the HOA-domain.
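The basic decoding can be illustrated numerically. The sketch below assumes the commonly used form in which the loudspeaker signals equal the transposed, truncated harmonic matrix applied to bHOA and divided by Nspk, which is consistent with the symbols listed above. To keep the example short it uses circular harmonics on a horizontal ring as a two-dimensional stand-in for the spherical harmonic functions; the function name and the sizes are hypothetical.

```python
import numpy as np

def circular_harmonics(azimuths, order):
    """Rows: harmonic components up to `order`; columns: directions (2D stand-in)."""
    az = np.asarray(azimuths, dtype=float)
    rows = [np.ones_like(az)]
    for m in range(1, order + 1):
        rows.append(np.sqrt(2.0) * np.cos(m * az))
        rows.append(np.sqrt(2.0) * np.sin(m * az))
    return np.vstack(rows)

order, n_spk = 2, 8
spk_az = 2 * np.pi * np.arange(n_spk) / n_spk
Y_spk = circular_harmonics(spk_az, order)                 # (2*order + 1) x n_spk
b_hoa = circular_harmonics([spk_az[3]], order)[:, 0]      # encode one unit plane wave
g_spk = Y_spk.T @ b_hoa / n_spk                           # assumed basic decoding
```

The decoded gains peak at the loudspeaker lying in the source direction but are non-zero at every other loudspeaker, which illustrates the limited spatial resolution of a low-order decode.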
The basic HOA decoding in three dimensions is a spherical-harmonic-based method that possesses a number of advantages which include the ability to reconstruct the sound field easily using various and arbitrary loudspeaker configurations. However, it will be appreciated by those skilled in the art that it also suffers from limitations related to both the encoding and decoding process. Firstly, as a finite number of sensors is used to observe the sound field, the encoding suffers from spatial aliasing at high frequencies (see N. Epain and J. Daniel, "Improving spherical microphone arrays," in the Proceedings of the AES 124th Convention, May 2008). Secondly, when the number of loudspeakers that are used for playback is larger than the number of spherical harmonic components used in the sound field description, one generally finds deterioration in the fidelity of the constructed sound field (see A. Solvang, "Spectral impairment of two dimensional higher-order ambisonics," in the Journal of the Audio Engineering Society, volume 56, April 2008, pp. 267-279).
In both cases, the limitations are related to the fact that an under-determined problem is solved using the pseudo-inverse method. In the case of the present disclosure, these limitations are circumvented in some instances using general principles of compressive sampling or ICA. With regard to compressive sampling, the applicants have found that using a plane-wave basis as a sparsity domain for the sound field and then solving one of the several convex programming problems defined above leads to a surprisingly accurate reconstruction of a recorded sound field. The plane-wave description is contained in the defined matrix
Figure imgf000032_0002
The distance between the standard HOA solution and the compressive sampling solution may be controlled using, for example, the constraint
Figure imgf000032_0001
When ε2 is zero, the compressive sampling solution is the same as the standard HOA solution. The SPM 14 may dynamically set the value of ε2 according to the computed sparsity of the sound field.
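The general idea of seeking a sparse set of plane-wave amplitudes can be sketched without a convex-programming package. The example below does not solve the constrained problems defined above; it substitutes the closely related unconstrained form (a least-squares data term plus an l1 penalty weighted by a hypothetical parameter `lam`) and minimises it by iterative soft thresholding (ISTA) on real-valued data. The random transfer matrix, the sizes and all names are assumptions made for the illustration.

```python
import numpy as np

def objective(T, s, g, lam):
    """Least-squares data term plus l1 sparsity penalty."""
    return 0.5 * np.sum((T @ g - s) ** 2) + lam * np.sum(np.abs(g))

def ista(T, s, lam, n_iter=500):
    """Minimise the objective above by iterative soft thresholding."""
    step = 1.0 / np.linalg.norm(T, 2) ** 2         # 1 / Lipschitz constant of the gradient
    g = np.zeros(T.shape[1])
    for _ in range(n_iter):
        z = g - step * (T.T @ (T @ g - s))         # gradient step on the data term
        g = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)   # soft threshold
    return g

rng = np.random.default_rng(2)
T = rng.standard_normal((16, 64))                  # stand-in transfer matrix: 16 sensors, 64 directions
g_true = np.zeros(64)
g_true[[5, 40]] = [1.0, -0.8]                      # two active plane waves
s = T @ g_true                                     # noise-free measurements
lam = 0.01
g_hat = ista(T, s, lam)
```

The system is under-determined (16 measurements, 64 unknowns), yet the l1 penalty steers the iteration towards solutions with few non-zero plane-wave amplitudes, whereas a pseudo-inverse would spread energy over all directions.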
With regard to applying ICA in the HOA-domain, the applicants have found that the application of statistical independence benefits greatly from the fact that the HOA-domain provides an instantaneous mixture of the recorded signals. Further, the application of statistical independence seems similar to compressive sampling in that it also appears to impose a sparsity on the solution.
As described above, it is possible to estimate the sparsity of the sound field using techniques of compressive sampling or techniques of ICA in the HOA-domain.
In Figs. 24 and 25 simulation results are shown that demonstrate the power of sound field reconstruction using the present disclosure. In the simulations, the microphone array 12 is a 4 cm radius rigid sphere with thirty two omnidirectional microphones evenly distributed on the surface of the sphere. The sound fields are reconstructed using a ring of forty eight loudspeakers with a radius of 1 m.
In the HOA case, the microphone gains are HOA-encoded up to order 4. The compressive sampling plane-wave analysis is performed using a frequency-domain technique which includes a sparsity constraint and using a basis of 360 plane waves evenly distributed in the horizontal plane. The values of ε1 and ε2 have been fixed to 10^-3 and 2, respectively. In every case, the directions of the sound sources that define the sound field have been randomly chosen in the horizontal plane.
Example 1
Referring to Fig. 24, in this simulation four sound sources at 2 kHz were used. The HOA solution is shown in Fig. 24A; the original sound field is shown in Fig. 24B; and the solution using the technique of the present disclosure is shown in Fig. 24C. Clearly, the method as described performs better than a standard HOA method.
Example 2
Referring to Fig. 25, in this simulation twelve sound sources at 16 kHz were used. As before, the HOA solution is shown in Fig. 25A; the original sound field is shown in Fig. 25B; and the solution using the technique of the present disclosure is shown in Fig. 25C. It will be appreciated by those skilled in the art that the results for Fig. 25 are obtained outside of the Shannon-Nyquist spatial aliasing limit of the microphone array but still provide an accurate reconstruction of the sound field.
It is an advantage of the described embodiments that an improved and more robust reconstruction of a sound field is provided so that the sweet spot is larger; there is little, if any, degradation in the quality of the reconstruction when parameters defining the system are under-constrained; and the accuracy of the reconstruction improves as the number of the loudspeakers increases.
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the disclosure as shown in the specific embodiments without departing from the scope of the disclosure as broadly described.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.

Claims

CLAIMS:
1. Equipment for reconstructing a recorded sound field, the equipment including a sensing arrangement for measuring the sound field to obtain recorded data; and
a signal processing module in communication with the sensing arrangement and which processes the recorded data for the purposes of at least one of (a) estimating the sparsity of the recorded sound field and (b) obtaining plane-wave signals and their associated source directions to enable the recorded sound field to be reconstructed.
2. The equipment of claim 1 in which the sensing arrangement comprises a microphone array.
3. The equipment of claim 2 in which the microphone array is one of a baffled array and an open spherical microphone array.
4. The equipment of any one of the preceding claims in which the signal processing module is configured to estimate the sparsity of the recorded data.
5. The equipment of any one of the preceding claims in which the signal processing module is configured to analyse the recorded sound field to obtain a set of plane-wave signals that separate the sources in the sound field and identify the source directions and allow the sound field to be reconstructed.
6. The equipment of claim 5 in which the signal processing module is configured to modify the set of plane-wave signals to reduce unwanted artifacts.
7. The equipment of any one of the preceding claims which includes a playback device for playing back the reconstructed sound field.
8. The equipment of claim 7 in which the signal processing module is operative to modify the recorded data depending on which playback device is to be used for playing back the reconstructed sound field.
9. A method of reconstructing a recorded sound field, the method including
analysing recorded data in a sparse domain using one of a time domain technique and a frequency domain technique; and obtaining plane-wave signals and their associated source directions generated from the selected technique to enable the recorded sound field to be reconstructed.
10. The method of claim 9 which includes recording a time frame of audio of the sound field to obtain the recorded data in the form of a set of signals, smic (t) , using an acoustic sensing arrangement.
11. The method of claim 9 or claim 10 which includes estimating the sparsity of the recorded sound field by applying ICA in an HOA-domain to calculate the sparsity of the recorded sound field.
12. The method of claim 11 which includes analysing the recorded sound field in the HOA domain to obtain a vector of HOA-domain time signals, bHOA(t), and computing from bHOA(t) a mixing matrix, MICA, using signal processing techniques.
13. The method of claim 12 which includes projecting the mixing matrix, MICA, on the HOA direction vectors associated with a set of plane-wave basis directions by computing Vsource = Ŷplw^T MICA, where Ŷplw^T is the transpose (Hermitian conjugate) of the real-valued (complex-valued) HOA direction matrix associated with the plane-wave basis directions and the hat-operator on Ŷplw^T indicates it has been truncated to an HOA-order M.
14. The method of claim 13 which includes estimating the sparsity, S, of the recorded data by first determining the number, Nsource, of dominant plane-wave directions represented by Vsource and then computing
Figure imgf000035_0001
where Nplw is the number of analysis plane-wave basis directions.
15. The method of claim 9 or claim 10 which includes estimating the sparsity of the recorded sound field by analysing recorded data using compressed sensing or convex optimization techniques to calculate the sparsity of the recorded sound field.
16. The method of claim 15 which includes analysing the recorded sound field in the HOA domain to obtain a vector of HOA-domain time signals, bHOA(t), and sampling the vector of HOA-domain time signals over a given time frame, L, to obtain a collection of time samples at time instances t1 to tN to obtain a set of HOA-domain vectors at each time instant: bHOA(t1), bHOA(t2), ..., bHOA(tN), expressed as a matrix, BHOA, by:
Figure imgf000036_0001
17. The method of claim 16 which includes applying singular value decomposition to BHOA to obtain a matrix decomposition:
Figure imgf000036_0004
18. The method of claim 17 which includes forming a matrix Sreduced by keeping only the first m columns of S, where m is the number of rows of BHOA, and forming a matrix, Ω, given by
Figure imgf000036_0005
19. The method of claim 17 which includes solving the following convex programming problem for a matrix Γ :
Figure imgf000036_0002
where
Yplw is the matrix (truncated to a high spherical harmonic order) whose columns are the values of the spherical harmonic functions for the set of directions corresponding to some set of analysis plane waves,
and
ε1 is a non-negative real number.
20. The method of claim 19 which includes obtaining Gplw from Γ using:
Figure imgf000036_0006
where VT is obtained from the matrix decomposition of BHOA.
21. The method of claim 20 which includes obtaining an unmixing matrix, ΠL, for the L-th time frame, by calculating:
Figure imgf000036_0003
where:
ΠL-1 is an unmixing matrix for the (L-1)-th time frame, and
α is a forgetting factor such that 0 ≤ α ≤ 1.
22. The method of claim 21 which includes obtaining Gplw-smooth using:
Figure imgf000037_0001
23. The method of claim 22 which includes obtaining the vector of plane-wave signals, gplw-cs(t), from the collection of plane-wave time samples, Gplw-smooth, using standard overlap-add techniques.
24. The method of claim 22 which includes obtaining gplw-cs(t) from the collection of plane-wave time samples, Gplw, without smoothing, using standard overlap-add techniques.
25. The method of claim 24 which includes estimating the sparsity of the recorded data by first computing the number,
Figure imgf000037_0007
of dominant components of gplw-cs(t) and then computing
Figure imgf000037_0003
where
Figure imgf000037_0008
is the number of analysis plane-wave basis directions.
26. The method of any one of claims 9 to 25 which includes reconstructing the recorded sound field, using frequency-domain techniques to analyse the recorded data in the sparse domain; and obtaining the plane-wave signals from the frequency-domain techniques to enable the recorded sound field to be reconstructed.
27. The method of claim 26 which includes transforming the set of signals, smic(t), to the frequency domain using an FFT to obtain smic.
28. The method of claim 27 which includes analysing the recorded sound field in the frequency domain using plane-wave analysis to produce a vector of plane-wave amplitudes, gplw-cs.
29. The method of claim 28 which includes conducting the plane-wave analysis of the recorded sound field by solving the following convex programming problem for the vector of plane-wave amplitudes,
Figure imgf000037_0009
Figure imgf000037_0002
where:
Tplw/mic is a transfer matrix between plane-waves and the microphones,
smic is the set of signals recorded by the microphone array, and
ε1 is a non-negative real number.
30. The method of claim 29 which includes conducting the plane-wave analysis of the recorded sound field by solving the following convex programming problem for the vector of plane-wave amplitudes, :
Figure imgf000038_0005
minimise
Figure imgf000038_0003
subject to
Figure imgf000038_0004
Figure imgf000038_0001
where:
Tplw/mic is a transfer matrix between the plane-waves and the microphones,
smic is the set of signals recorded by the microphone array,
ε1 is a non-negative real number,
Tplw/HOA is a transfer matrix between the plane-waves and the HOA-domain Fourier expansion,
bHOA is a set of HOA-domain Fourier coefficients given by bHOA = Tmic/HOA smic, where Tmic/HOA is a transfer matrix between the microphones and the HOA-domain Fourier expansion, and
ε2 is a non-negative real number.
31. The method of claim 28 which includes conducting the plane-wave analysis of the recorded sound field by solving the following convex programming problem for the vector of plane-wave amplitudes, gplw-cs:
Figure imgf000038_0002
where:
Tplw/mic is a transfer matrix between plane-waves and the microphones,
Tmic/HOA is a transfer matrix between the microphones and the HOA-domain Fourier expansion,
bHOA is a set of HOA-domain Fourier coefficients given by
Figure imgf000039_0003
and ε1 is a non-negative real number.
32. The method of claim 28 which includes conducting the plane-wave analysis of the recorded sound field by solving the following convex programming problem for the vector of plane-wave amplitudes,
Figure imgf000039_0002
Figure imgf000039_0001
where:
Tplw/mic is a transfer matrix between plane-waves and the microphones,
ε1 is a non-negative real number,
Tplw/HOA is a transfer matrix between the plane-waves and the HOA-domain Fourier expansion,
bHOA is a set of HOA-domain Fourier coefficients given by
Figure imgf000039_0004
where Tmic/HOA is a transfer matrix between the microphones and the HOA-domain Fourier expansion, and
ε2 is a non-negative real number.
33. The method of claim 32 which includes setting ε1 based on the resolution of the spatial division of a set of directions corresponding to the set of analysis plane-waves and setting the value of ε2 based on the computed sparsity of the sound field.
34. The method of any one of claims 28 to 33 which includes transforming
Figure imgf000039_0005
back to the time-domain using an inverse FFT to obtain gplw-cs (t) .
35. The method of any one of claims 9 to 25 which includes using a time domain technique to analyse recorded data in the sparse domain and obtaining parameters generated from the selected time domain technique to enable the recorded sound field to be reconstructed.
36. The method of claim 35 which includes analysing the recorded sound field in the time domain using plane-wave analysis according to a set of basis plane-waves to produce a set of plane-wave signals, gplw-cs(t).
37. The method of claim 36 which includes analysing the recorded sound field in the HOA domain to obtain a vector of HOA-domain time signals, bHOA(t), and sampling the vector of HOA-domain time signals over a given time frame, L, to obtain a collection of time samples at time instances t1 to tN to obtain a set of HOA-domain vectors at each time instant: bHOA(t1), bHOA(t2), ..., bHOA(tN), expressed as a matrix, BHOA, by:
Figure imgf000040_0004
38. The method of claim 37 which includes computing a correlation vector, γ, as
Figure imgf000040_0010
where
Figure imgf000040_0011
is an omni-directional HOA-component of
Figure imgf000040_0012
39. The method of claim 38 which includes solving the following convex programming problem for a vector of plane-wave gains, pplw-cs:
minimise
Figure imgf000040_0007
|| ||
Figure imgf000040_0001
where:
Figure imgf000040_0008
Tplw/HOA is a transfer matrix between the plane-waves and the HOA-domain Fourier expansion,
ε1 is a non-negative real number.
40. The method of claim 38 which includes solving the following convex programming problem for a vector of plane-wave gains,
Figure imgf000040_0009
minimise
Figure imgf000040_0005
subject to
Figure imgf000040_0003
)
to
Figure imgf000040_0002
where:
Figure imgf000041_0001
is a transfer matrix between the plane-waves and the HOA-domain
Figure imgf000041_0013
Fourier expansion,
$\varepsilon_1$ is a non-negative real number,
$\varepsilon_2$ is a non-negative real number.
41. The method of claim 40 which includes setting $\varepsilon_1$ based on the resolution of the spatial division of a set of directions corresponding to the set of analysis plane-waves and setting the value of $\varepsilon_2$ based on the computed sparsity of the sound field.
42. The method of claim 40 or claim 41 which includes thresholding and cleaning $P_{\text{plw-cs}}$ to set some of its small components to zero.
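One simple way to threshold and clean a gain vector is sketched below. The claim does not specify the rule, so the relative threshold (a fraction of the largest magnitude), the default value of 0.05 and the function name are all assumptions.

```python
import numpy as np

def threshold_gains(gains, rel_threshold=0.05):
    """Zero every component whose magnitude is below rel_threshold * largest magnitude."""
    gains = np.asarray(gains, dtype=float)
    peak = np.max(np.abs(gains))
    cleaned = gains.copy()
    cleaned[np.abs(gains) < rel_threshold * peak] = 0.0
    return cleaned

# With a peak of 0.9 the cut-off is 0.045, so 0.01 and 0.002 are removed.
cleaned = threshold_gains([0.9, 0.01, -0.4, 0.002, 0.0])
```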
43. The method of claim 42 which includes forming a matrix, $\hat{Y}_{\text{plw-HOA}}$, according to the plane-wave basis and then reducing $\hat{Y}_{\text{plw-HOA}}$ to
Figure imgf000041_0005
by keeping only the columns corresponding to the non-zero components in $P_{\text{plw-cs}}$, where $\hat{Y}_{\text{plw-HOA}}$ is an HOA direction matrix for the plane-wave basis and the hat-operator on $\hat{Y}_{\text{plw-HOA}}$ indicates it has been truncated to some HOA-order M.
44. The method of claim 43 which includes computing
Figure imgf000041_0009
as
Figure imgf000041_0003
45. The method of claim 44 which includes expanding
Figure imgf000041_0010
to obtain $g_{\text{plw-cs}}(t)$ by inserting rows of time signals of zeros so that $g_{\text{plw-cs}}(t)$ matches the plane-wave basis.
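The column reduction and the later zero-row expansion are bookkeeping operations on the set of active plane-wave directions. A minimal sketch, with hypothetical function names and toy data, might look like this:

```python
import numpy as np

def reduce_columns(Y, gains):
    """Keep only the columns of Y whose plane-wave gain is non-zero."""
    keep = np.flatnonzero(gains)
    return Y[:, keep], keep

def expand_rows(g_reduced, keep, n_directions):
    """Insert all-zero rows so the signals line up with the full plane-wave basis."""
    g_full = np.zeros((n_directions, g_reduced.shape[1]), dtype=g_reduced.dtype)
    g_full[keep, :] = g_reduced
    return g_full

gains = np.array([0.9, 0.0, -0.4, 0.0])      # cleaned gains for 4 basis directions
Y = np.arange(12.0).reshape(3, 4)            # toy 3 x 4 direction matrix
Y_red, keep = reduce_columns(Y, gains)       # keeps columns 0 and 2
g_red = np.array([[1.0, 2.0], [3.0, 4.0]])   # 2 active directions, 2 time samples
g_full = expand_rows(g_red, keep, n_directions=4)
```

Carrying the index array `keep` from the reduction to the expansion is what guarantees that each signal returns to the row of its original direction.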
46. The method of claim 38 which includes solving the following convex programming problem for a matrix
Figure imgf000041_0012
Figure imgf000041_0002
where $Y_{\text{plw}}$ is a matrix (truncated to a high spherical harmonic order) whose columns are the values of the spherical harmonic functions for the set of directions corresponding to some set of analysis plane waves, and $\varepsilon_1$ is a non-negative real number.
47. The method of claim 46 which includes obtaining an unmixing matrix, $\Pi_L$, for the L-th time frame, by calculating:
Figure imgf000042_0001
where
$\Pi_{L-1}$ refers to the unmixing matrix for the $L-1$ time frame and
$\alpha$ is a forgetting factor such that $0 < \alpha \le 1$.
48. The method of claim 47 which includes applying singular value decomposition to $B_{\text{HOA}}$ to obtain a matrix decomposition:
$$B_{\text{HOA}} = U S V^{T}$$
49. The method of claim 48 which includes forming a matrix $S_{\text{reduced}}$ by keeping only the first m columns of S, where m is the number of rows of $B_{\text{HOA}}$, and forming a matrix, Ω, given by
Figure imgf000042_0003
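The singular value decomposition and the reduction of S can be sketched with NumPy as follows. The expression for Ω is contained in an equation image that is not reproduced in this text; the product of the left singular vectors with the reduced singular-value matrix used below is an assumption, chosen because it lets the frame matrix be rebuilt from Ω and the first m rows of the transposed right singular vectors.

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 32))       # m = 4 HOA channels, N = 32 time samples

# Economy SVD: U is m x m, s holds m singular values, Vt is m x N.
U, s, Vt = np.linalg.svd(B, full_matrices=False)
S_reduced = np.diag(s)                 # the first m columns of the full m x N matrix S
Omega = U @ S_reduced                  # ASSUMPTION: Omega = U @ S_reduced

# Under that assumption the frame matrix is recovered as Omega @ Vt.
B_rebuilt = Omega @ Vt
```

Working with the m x m matrix Ω instead of the m x N frame matrix reduces the size of the subsequent optimisation when N is much larger than m.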
50. The method of claim 49 which includes solving the following convex programming problem for a matrix Γ:
Figure imgf000042_0004
where $\varepsilon_1$ and $Y_{\text{plw}}$ are as defined above.
51. The method of claim 50 which includes obtaining $G_{\text{plw}}$ from Γ using:
Figure imgf000042_0005
where $V^{T}$ is obtained from the matrix decomposition of $B_{\text{HOA}}$.
52. The method of claim 51 which includes obtaining an unmixing matrix, $\Pi_L$, for the L-th time frame, by calculating:
Figure imgf000042_0006
where:
$\Pi_{L-1}$ is an unmixing matrix for the $L-1$ time frame,
$\alpha$ is a forgetting factor such that $0 < \alpha \le 1$.
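The update formula for the unmixing matrix is contained in an equation image that is not reproduced in this text. The sketch below shows one common form of recursive smoothing with a forgetting factor, in which the previous matrix is blended with an estimate computed from the current frame by a pseudo-inverse. Both the blending rule and the per-frame estimate are assumptions, and the names are hypothetical.

```python
import numpy as np

def update_unmixing(pi_prev, G_plw, B_hoa, alpha):
    """ASSUMED update: (1 - alpha) * previous matrix + alpha * current-frame estimate."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must satisfy 0 < alpha <= 1")
    pi_frame = G_plw @ np.linalg.pinv(B_hoa)   # maps HOA samples to plane-wave samples
    return (1.0 - alpha) * pi_prev + alpha * pi_frame

rng = np.random.default_rng(3)
B_hoa = rng.standard_normal((4, 32))           # 4 HOA channels, 32 samples
pi_true = rng.standard_normal((6, 4))          # toy unmixing matrix for 6 directions
G_plw = pi_true @ B_hoa                        # plane-wave samples for this frame
pi_L = update_unmixing(np.zeros((6, 4)), G_plw, B_hoa, alpha=1.0)
```

With a forgetting factor of 1 the previous frame is ignored entirely, which is why the toy example recovers `pi_true` exactly; smaller values trade responsiveness for smoother transitions between frames.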
53. The method of claim 52 which includes obtaining $G_{\text{plw-smooth}}$ using:
Figure imgf000042_0007
54. The method of claim 53 which includes obtaining the vector of plane-wave signals, $g_{\text{plw-cs}}(t)$, from the collection of plane-wave time samples, $G_{\text{plw-smooth}}$, using standard overlap-add techniques.
55. The method of claim 52 or claim 53 which includes obtaining $g_{\text{plw-cs}}(t)$ from the collection of plane-wave time samples, $G_{\text{plw}}$, without smoothing, using standard overlap-add techniques.
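A standard overlap-add reconstruction can be sketched as below. The frame length, hop size and Hann window are illustrative assumptions, since the claims only refer to standard overlap-add techniques, and the function name is hypothetical.

```python
import numpy as np

def overlap_add(frames, hop):
    """Sum windowed frames (n_frames x n_channels x frame_len) at a fixed hop size."""
    n_frames, n_channels, frame_len = frames.shape
    out = np.zeros((n_channels, hop * (n_frames - 1) + frame_len))
    for k in range(n_frames):
        out[:, k * hop:k * hop + frame_len] += frames[k]
    return out

# Toy check: periodic Hann windows at 50% overlap sum to one away from the edges.
frame_len, hop = 8, 4
window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame_len) / frame_len)
frames = np.tile(window, (5, 1, 1))    # 5 frames, 1 channel, 8 samples each
signal = overlap_add(frames, hop)
```

In practice each frame would hold the windowed plane-wave samples for every direction, so the channel axis corresponds to the plane-wave basis.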
56. The method of any one of claims 34, 36-45 or 54-55 which includes modifying $g_{\text{plw-cs}}(t)$ to reduce unwanted artifacts such as reverberations and/or unwanted sound sources.
57. The method of claim 56 which includes, to reduce reverberations, reducing the signal values of some of the signals in the signal vector, $g_{\text{plw-cs}}(t)$.
58. The method of claim 56 or claim 57 which includes, to separate sound sources in the sound field reconstruction so that the unwanted sound sources can be reduced, setting to zero some of the signals in the signal vector, $g_{\text{plw-cs}}(t)$.
59. The method of any one of claims 56 to 58 which includes modifying $g_{\text{plw-cs}}(t)$ dependent on the means of playback of the reconstructed sound field.
60. The method of claim 59 which includes modifying $g_{\text{plw-cs}}(t)$ as follows:
Figure imgf000043_0002
where:
$P_{\text{plw/spk}}$ is a loudspeaker panning matrix.
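The playback modification is given in an equation image that is not reproduced in this text. A natural reading, assumed here and not confirmed by the text, is that the loudspeaker feeds are obtained by multiplying the plane-wave signals by the panning matrix. The function name and the toy values are hypothetical.

```python
import numpy as np

def pan_to_loudspeakers(P_plw_spk, g_plw):
    """ASSUMPTION: loudspeaker feeds = panning matrix @ plane-wave signals."""
    return P_plw_spk @ g_plw

# Toy case: 3 plane-wave directions panned onto 2 loudspeakers.
P = np.array([[1.0, 0.5, 0.0],
              [0.0, 0.5, 1.0]])
g_plw = np.array([[1.0, 2.0],
                  [2.0, 0.0],
                  [0.0, 4.0]])          # 3 directions x 2 time samples
g_spk = pan_to_loudspeakers(P, g_plw)   # 2 loudspeakers x 2 time samples
```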
61. The method of claim 59 which includes converting $g_{\text{plw-cs}}(t)$ back to the HOA-domain by computing:
Figure imgf000043_0001
where $b_{\text{HOA-highres}}(t)$ is a high-resolution HOA-domain representation of $g_{\text{plw-cs}}(t)$ capable of expansion to arbitrary HOA-domain order, $\hat{Y}_{\text{plw-HOA}}$ is an HOA direction matrix for a plane-wave basis and the hat-operator on $\hat{Y}_{\text{plw-HOA}}$ indicates it has been truncated to some HOA-order M.
62. The method of claim 61 which includes decoding
Figure imgf000044_0002
Figure imgf000044_0008
HOA decoding techniques.
63. The method of claim 59 which includes modifying $g_{\text{plw-cs}}(t)$ to determine headphone gains as follows:
Figure imgf000044_0001
where:
Figure imgf000044_0006
is a head-related impulse response matrix of filters corresponding to the set of plane wave directions.
64. The method of any one of claims 9 to 20 which includes using time-domain techniques of Independent Component Analysis (ICA) in the HOA-domain to analyse recorded data in a sparse domain, and obtaining parameters from the selected time domain technique to enable the recorded sound field to be reconstructed.
65. The method of claim 64 which includes analysing the recorded sound field in the HOA-domain to obtain a vector of HOA-domain time signals $b_{\text{HOA}}(t)$.
66. The method of claim 65 which includes analysing the HOA-domain time signals using ICA signal processing to produce a set of plane-wave source signals,
Figure imgf000044_0010
67. The method of claim 66 or claim 67 which includes computing from $b_{\text{HOA}}(t)$ a mixing matrix, $M_{\text{ICA}}$, using signal processing techniques.
68. The method of claim 67 which includes projecting the mixing matrix, $M_{\text{ICA}}$, on the HOA direction vectors associated with a set of plane-wave basis directions by computing
Figure imgf000044_0004
where
Figure imgf000044_0005
is the transpose (Hermitian conjugate) of the real-value (complex-valued) HOA direction matrix associated with the plane-wave basis and the hat-operator on
Figure imgf000044_0007
indicates it has been truncated to some HOA-order M.
69. The method of claim 68 which includes using thresholding techniques to identify the columns of $V_{\text{source}}$ that indicate a dominant source direction.
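The projection and thresholding steps can be sketched as follows. The projection formula is contained in an equation image that is not reproduced in this text, so the product of the transposed direction matrix with the mixing matrix is an assumption, as is the particular rule used here: a column counts as dominant when its largest normalised projection reaches a fixed threshold. Real-valued data and non-zero columns are also assumed, and the names are hypothetical.

```python
import numpy as np

def dominant_columns(Y_hat, M_ica, threshold=0.8):
    """Return {column index: direction index} for columns with one clear peak."""
    V_source = Y_hat.T @ M_ica          # ASSUMED projection, real-valued case
    V_norm = np.abs(V_source) / np.linalg.norm(V_source, axis=0, keepdims=True)
    peaks = V_norm.max(axis=0)          # largest normalised projection per column
    return {int(j): int(V_norm[:, j].argmax())
            for j in np.flatnonzero(peaks >= threshold)}

Y_hat = np.eye(3)                       # toy direction matrix: 3 HOA rows x 3 directions
M_ica = np.array([[1.0, 1.0],
                  [0.0, 1.0],
                  [0.0, 1.0]])          # column 0 is directional, column 1 is diffuse
found = dominant_columns(Y_hat, M_ica)
```

In the toy example only column 0 passes the threshold, because the energy of column 1 is spread evenly over all three directions.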
70. The method of claim 69 which includes reducing the matrix
Figure imgf000045_0010
to obtain a matrix
Figure imgf000045_0009
by removing the plane-wave direction vectors in
Figure imgf000045_0011
that do not correspond to dominant source directions associated with matrix $V_{\text{source}}$.
71. The method of claim 70 which includes estimating as
Figure imgf000045_0005
Figure imgf000045_0001
72. The method of claim 70 which includes estimating $g_{\text{plw-ica-reduced}}(t)$ by working in the frequency domain and computing
Figure imgf000045_0006
73. The method of claim 72 which includes, for each frequency, reducing a transfer matrix, $T_{\text{plw/mic}}$, to obtain a matrix,
Figure imgf000045_0007
by removing the columns in $T_{\text{plw/mic}}$ that do not correspond to dominant source directions associated with matrix $V_{\text{source}}$.
74. The method of claim 73 which includes estimating $g_{\text{plw-ica-reduced}}$ by computing:
Figure imgf000045_0002
and transforming $g_{\text{plw-ica-reduced}}$ back to the time-domain using an inverse FFT to obtain $g_{\text{plw-ica-reduced}}(t)$.
75. The method of claim 74 which includes expanding
Figure imgf000045_0013
to obtain $g_{\text{plw-ica}}(t)$ by inserting rows of time signals of zeros so that $g_{\text{plw-ica}}(t)$ matches the plane-wave basis.
76. The method of claim 65 or claim 66 which includes computing from $b_{\text{HOA}}$ a mixing matrix, $M_{\text{ICA}}$, and a set of separated source signals,
Figure imgf000045_0014
using signal processing techniques.
77. The method of claim 76 which includes projecting the mixing matrix, MICA , on the HOA direction vectors associated with a set of plane-wave basis directions by computing
Figure imgf000045_0004
where is the transpose (Hermitian conjugate)
Figure imgf000045_0003
of the real-value (complex-valued) HOA direction matrix associated with the plane-wave basis and the hat-operator on $\hat{Y}_{\text{plw-HOA}}$ indicates it has been truncated to some
HOA-order M.
78. The method of claim 77 which includes using thresholding techniques to identify from $V_{\text{source}}$ the dominant plane-wave directions.
79. The method of claim 78 which includes cleaning $g_{\text{ica}}(t)$ to obtain
Figure imgf000046_0006
which retains the signals corresponding to the dominant plane-wave directions in $V_{\text{source}}$ and sets the other signals to zero.
80. The method of claim 79 which includes modifying $g_{\text{plw-ica}}(t)$ to reduce unwanted artifacts such as reverberations and/or unwanted sound sources.
81. The method of claim 80 which includes, to reduce reverberations, reducing the signal values of some of the signals in the signal vector, $g_{\text{plw-ica}}(t)$.
82. The method of claim 80 or claim 81 which includes, to separate sound sources in the sound field reconstruction so that the unwanted sound sources can be reduced, setting to zero some of the signals in the signal vector, $g_{\text{plw-ica}}(t)$.
83. The method of any one of claims 80 to 82 which includes modifying $g_{\text{plw-ica}}(t)$ dependent on the means of playback of the reconstructed sound field.
84. The method of claim 83 which includes modifying $g_{\text{plw-ica}}(t)$ as follows:
Figure imgf000046_0002
where:
Figure imgf000046_0011
is a loudspeaker panning matrix.
85. The method of claim 83 which includes converting $g_{\text{plw-ica}}(t)$ back to the HOA-domain by computing:
Figure imgf000046_0003
where:
Figure imgf000046_0004
is a high-resolution HOA-domain representation of $g_{\text{plw-ica}}(t)$ capable of expansion to arbitrary HOA-domain order, $\hat{Y}_{\text{plw-HOA}}$ is an HOA direction matrix for a plane-wave basis and the hat-operator on $\hat{Y}_{\text{plw-HOA}}$ indicates it has been truncated to some HOA-order M.
86. The method of claim 85 which includes decoding
Figure imgf000047_0003
Figure imgf000047_0004
HOA decoding techniques.
87. The method of claim 83 which includes modifying $g_{\text{plw-ica}}(t)$ to determine headphone gains as follows:
Figure imgf000047_0001
where:
Figure imgf000047_0005
is a head-related impulse response matrix of filters corresponding to the set of plane wave directions.
88. A computer when programmed to perform the method of any one of claims 9 to 87.
89. A computer readable medium to enable a computer to perform the method of any one of claims 9 to 87.
PCT/AU2010/001312 2009-10-07 2010-10-06 Reconstruction of a recorded sound field WO2011041834A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
AU2010305313A AU2010305313B2 (en) 2009-10-07 2010-10-06 Reconstruction of a recorded sound field
EP10821476.8A EP2486561B1 (en) 2009-10-07 2010-10-06 Reconstruction of a recorded sound field
US13/500,045 US9113281B2 (en) 2009-10-07 2010-10-06 Reconstruction of a recorded sound field
JP2012532418A JP5773540B2 (en) 2009-10-07 2010-10-06 Reconstructing the recorded sound field

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2009904871A AU2009904871A0 (en) 2009-10-07 Reconstruction of a recorded sound field
AU2009904871 2009-10-07

Publications (1)

Publication Number Publication Date
WO2011041834A1 (en) 2011-04-14

Family

ID=43856294

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2010/001312 WO2011041834A1 (en) 2009-10-07 2010-10-06 Reconstruction of a recorded sound field

Country Status (5)

Country Link
US (1) US9113281B2 (en)
EP (1) EP2486561B1 (en)
JP (1) JP5773540B2 (en)
AU (1) AU2010305313B2 (en)
WO (1) WO2011041834A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016509812A (en) * 2013-02-08 2016-03-31 Thomson Licensing Method and apparatus for determining the direction of uncorrelated sound sources in higher-order ambisonic representations of sound fields
CN109410965A (en) * 2012-12-12 2019-03-01 Dolby International AB Method and apparatus for compressing and decompressing higher order ambisonic representations of a sound field

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5742340B2 (en) * 2011-03-18 2015-07-01 ソニー株式会社 Mastication detection device and mastication detection method
EP2541547A1 (en) * 2011-06-30 2013-01-02 Thomson Licensing Method and apparatus for changing the relative positions of sound objects contained within a higher-order ambisonics representation
US9558762B1 (en) * 2011-07-03 2017-01-31 Reality Analytics, Inc. System and method for distinguishing source from unconstrained acoustic signals emitted thereby in context agnostic manner
EP3629605B1 (en) 2012-07-16 2022-03-02 Dolby International AB Method and device for rendering an audio soundfield representation
US9736609B2 (en) * 2013-02-07 2017-08-15 Qualcomm Incorporated Determining renderers for spherical harmonic coefficients
US9883310B2 (en) * 2013-02-08 2018-01-30 Qualcomm Incorporated Obtaining symmetry information for higher order ambisonic audio renderers
US9609452B2 (en) * 2013-02-08 2017-03-28 Qualcomm Incorporated Obtaining sparseness information for higher order ambisonic audio renderers
US10178489B2 (en) * 2013-02-08 2019-01-08 Qualcomm Incorporated Signaling audio rendering information in a bitstream
EP2782094A1 (en) * 2013-03-22 2014-09-24 Thomson Licensing Method and apparatus for enhancing directivity of a 1st order Ambisonics signal
US20140358565A1 (en) 2013-05-29 2014-12-04 Qualcomm Incorporated Compression of decomposed representations of a sound field
US9466305B2 (en) * 2013-05-29 2016-10-11 Qualcomm Incorporated Performing positional analysis to code spherical harmonic coefficients
EP3073766A4 (en) 2013-11-19 2017-07-05 Sony Corporation Sound field re-creation device, method, and program
EP2879408A1 (en) * 2013-11-28 2015-06-03 Thomson Licensing Method and apparatus for higher order ambisonics encoding and decoding using singular value decomposition
US9602923B2 (en) * 2013-12-05 2017-03-21 Microsoft Technology Licensing, Llc Estimating a room impulse response
EP3090574B1 (en) * 2014-01-03 2019-06-26 Samsung Electronics Co., Ltd. Method and apparatus for improved ambisonic decoding
US10020000B2 (en) 2014-01-03 2018-07-10 Samsung Electronics Co., Ltd. Method and apparatus for improved ambisonic decoding
US9922656B2 (en) 2014-01-30 2018-03-20 Qualcomm Incorporated Transitioning of ambient higher-order ambisonic coefficients
US9489955B2 (en) 2014-01-30 2016-11-08 Qualcomm Incorporated Indicating frame parameter reusability for coding vectors
WO2015145782A1 (en) 2014-03-26 2015-10-01 Panasonic Corporation Apparatus and method for surround audio signal processing
US10134403B2 (en) * 2014-05-16 2018-11-20 Qualcomm Incorporated Crossfading between higher order ambisonic signals
US9852737B2 (en) 2014-05-16 2017-12-26 Qualcomm Incorporated Coding vectors decomposed from higher-order ambisonics audio signals
US10770087B2 (en) 2014-05-16 2020-09-08 Qualcomm Incorporated Selecting codebooks for coding vectors decomposed from higher-order ambisonic audio signals
US9620137B2 (en) 2014-05-16 2017-04-11 Qualcomm Incorporated Determining between scalar and vector quantization in higher order ambisonic coefficients
EP3149971B1 (en) 2014-05-30 2018-08-29 Qualcomm Incorporated Obtaining sparseness information for higher order ambisonic audio renderers
ES2696930T3 (en) * 2014-05-30 2019-01-18 Qualcomm Inc Obtaining symmetry information for higher order ambisonic audio renderers
US9747910B2 (en) 2014-09-26 2017-08-29 Qualcomm Incorporated Switching between predictive and non-predictive quantization techniques in a higher order ambisonics (HOA) framework
WO2018053050A1 (en) * 2016-09-13 2018-03-22 VisiSonics Corporation Audio signal processor and generator
CN112437392B (en) * 2020-12-10 2022-04-19 iFlytek (Suzhou) Technology Co., Ltd. Sound field reconstruction method and device, electronic equipment and storage medium
CN113345448B (en) * 2021-05-12 2022-08-05 Peking University HOA signal compression method based on independent component analysis

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2007050593A2 (en) * 2005-10-25 2007-05-03 William Marsh Rice University Method and apparatus for signal detection, classification, and estimation from compressive measurements
US20070269063A1 (en) 2006-05-17 2007-11-22 Creative Technology Ltd Spatial audio coding based on universal spatial cues
US20080215651A1 (en) * 2005-02-08 2008-09-04 Nippon Telegraph And Telephone Corporation Signal Separation Device, Signal Separation Method, Signal Separation Program and Recording Medium
WO2009059279A1 (en) * 2007-11-01 2009-05-07 University Of Maryland Compressive sensing system and method for bearing estimation of sparse sources in the angle domain

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
NZ502603A (en) * 2000-02-02 2002-09-27 Ind Res Ltd Multitransducer microphone arrays with signal processing for high resolution sound field recording
US7333622B2 (en) * 2002-10-18 2008-02-19 The Regents Of The University Of California Dynamic binaural sound capture and reproduction
US20080056517A1 (en) * 2002-10-18 2008-03-06 The Regents Of The University Of California Dynamic binaural sound capture and reproduction in focued or frontal applications
EP1858296A1 (en) * 2006-05-17 2007-11-21 SonicEmotion AG Method and system for producing a binaural impression using loudspeakers

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
EPAIN N. ET AL.: "The Application of Compressive Sampling to the Analysis and Synthesis of Spatial Sound Fields", AUDIO ENGINEERING SOCIETY CONVENTION PAPER 7857. PROCEEDINGS OF THE 127TH AES CONVENTION, 9 October 2009 (2009-10-09) - 12 October 2009 (2009-10-12), NEW YORK, NY, USA, pages 1 - 12, XP040509138 *
RAFAELY B.: "Analysis and Design of Spherical Microphone Arrays", IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, vol. 13, no. 1, January 2005 (2005-01-01), pages 135 - 143, XP011123592 *
See also references of EP2486561A4 *
WARD D. ET AL.: "Reproduction of a Plane-Wave Sound Field using an Array of Loudspeakers", IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, vol. 9, no. 6, September 2001 (2001-09-01), pages 697 - 707, XP008160687 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109410965A (en) * 2012-12-12 2019-03-01 Dolby International AB Method and apparatus for compressing and decompressing higher order ambisonic representations of a sound field
CN109410965B (en) * 2012-12-12 2023-10-31 Dolby International AB Method and apparatus for compressing and decompressing higher order ambisonic representations of a sound field
JP2016509812A (en) * 2013-02-08 2016-03-31 Thomson Licensing Method and apparatus for determining the direction of uncorrelated sound sources in higher-order ambisonic representations of sound fields
TWI647961B (en) * 2013-02-08 2019-01-11 Dolby International AB Method and apparatus for determining directions of uncorrelated sound sources in a higher order ambisonics representation of a sound field

Also Published As

Publication number Publication date
AU2010305313B2 (en) 2015-05-28
EP2486561A1 (en) 2012-08-15
JP5773540B2 (en) 2015-09-02
EP2486561B1 (en) 2016-03-30
US20120259442A1 (en) 2012-10-11
AU2010305313A1 (en) 2012-05-03
EP2486561A4 (en) 2013-04-24
US9113281B2 (en) 2015-08-18
JP2013507796A (en) 2013-03-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10821476

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2012532418

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 2010821476

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2010305313

Country of ref document: AU

ENP Entry into the national phase

Ref document number: 2010305313

Country of ref document: AU

Date of ref document: 20101006

Kind code of ref document: A

WWE Wipo information: entry into national phase

Ref document number: 13500045

Country of ref document: US