WO2022034795A1 - Signal processing device and method, noise suppression device, and program - Google Patents

Signal processing device and method, noise suppression device, and program

Info

Publication number
WO2022034795A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal processing
signal
speaker
spatial frequency
microphone
Prior art date
Application number
PCT/JP2021/027823
Other languages
English (en)
Japanese (ja)
Inventor
徹徳 板橋
直毅 村田
悠 前野
Original Assignee
ソニーグループ株式会社
Priority date
Filing date
Publication date
Application filed by ソニーグループ株式会社
Publication of WO2022034795A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00 Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers

Definitions

  • The present technology relates to signal processing devices and methods, noise canceling devices, and programs, and in particular to signal processing devices and methods, noise canceling devices, and programs capable of reducing delay times.
  • For example, spatial noise canceling (hereinafter also referred to as spatial NC (Noise Canceling)), which performs noise canceling using wave field synthesis technology, is known.
  • In order to realize such spatial NC, a method of performing operations such as filtering in the spatial frequency domain is conceivable.
  • A technique for performing operations in the spatial frequency domain has been proposed in order to realize wave field synthesis (see, for example, Non-Patent Document 1), and if the arithmetic in the spatial frequency domain is used, higher spatial NC performance can be realized in consideration of the correlation between the channels corresponding to the plurality of speakers.
  • In such a system, however, a temporal Fourier transform is performed on the microphone signals obtained by all the microphones, and a spatial Fourier transform must then be performed on the resulting signals to convert them into signals in the spatial frequency domain.
  • This technology was made in view of such a situation, and makes it possible to reduce the delay time.
  • The signal processing device of the first aspect of the present technology has one or a plurality of signal processing units that perform signal processing in the spatial frequency domain, and the signal processing unit performs the signal processing on a signal converted into the spatial frequency domain based on microphone signals obtained by sound collection by a plurality of microphones.
  • The signal processing method or program of the first aspect of the present technology is a signal processing method or program of a signal processing apparatus having one or more signal processing units that perform signal processing in the spatial frequency domain, and includes a step in which the one or more signal processing units perform the signal processing on a signal converted into the spatial frequency domain based on the microphone signals obtained by sound collection by a plurality of microphones.
  • In the first aspect of the present technology, the one or more signal processing units perform the signal processing on the signal converted into the spatial frequency domain based on the microphone signals obtained by sound collection by the plurality of microphones.
  • The noise canceling device of the second aspect of the present technology includes a plurality of microphones, one or a plurality of signal processing units that perform signal processing in the spatial frequency domain, and a plurality of speakers that output sound based on the noise canceling signal generated by the signal processing, and the signal processing unit generates the noise canceling signal by performing the signal processing on the signal converted into the spatial frequency domain based on the microphone signals obtained by sound collection by the plurality of microphones.
  • In the second aspect of the present technology, a plurality of microphones, one or a plurality of signal processing units that perform signal processing in the spatial frequency domain, and a plurality of speakers that output sound based on the noise canceling signal generated by the signal processing are provided, and the signal processing unit generates the noise canceling signal by performing the signal processing on the signal converted into the spatial frequency domain based on the microphone signals obtained by sound collection by the plurality of microphones.
  • This technology realizes spatial NC that requires neither a time-frequency transform nor its inverse, by directly performing a spatial frequency transform on the time-domain microphone signals and converting them into signals in the spatial frequency domain. As a result, the delay time can be reduced and higher spatial NC performance can be obtained.
  • In a conventional system, the temporal Fourier transform is performed on the microphone signals obtained by all the microphones, and the spatial Fourier transform is then performed on the resulting signals. After filtering is performed on the signals in the spatial frequency domain obtained by the spatial Fourier transform to generate speaker signals, the inverse spatial Fourier transform and the inverse temporal Fourier transform are performed on all the speaker signals to obtain speaker signals in the time domain.
  • In other words, the spatial Fourier transform, its inverse transform, and the filtering in the spatial frequency domain are performed using the signals obtained from the microphone signals of all the microphones.
  • In the present technology, the microphones and speakers used for spatial NC are divided into multiple groups and processing is performed for each group, so that processing that could only be performed by one device in a general spatial NC system can be shared by multiple devices or arithmetic units. As a result, the amount of calculation of each device or arithmetic unit can be reduced, and the number of input/output lines required for one device can also be reduced.
  • FIG. 1 is a diagram showing a configuration of a multi-input multi-output system that realizes noise canceling with general headphones or the like, that is, a parallel SISO (Single Input Single Output) system.
  • The parallel SISO system shown in FIG. 1 has microphones 11-1 to 11-6, SISO filters 12-1 to 12-6, and speakers 13-1 to 13-6.
  • The microphones 11-1 to 11-6 pick up the ambient sound and supply the resulting microphone signals to the SISO filters 12-1 to 12-6.
  • The SISO filters 12-1 to 12-6 filter the microphone signals supplied from the microphones 11-1 to 11-6 with SISO filters in the time domain, and supply the resulting speaker signals to the speakers 13-1 to 13-6.
  • Here, a speaker signal is a speaker drive signal for outputting sound (hereinafter also referred to as noise canceling sound) from a speaker so that the noise is canceled in a target area (position), that is, so that noise canceling is realized.
  • In other words, the speaker signal is a noise canceling signal for outputting a noise canceling sound from the speaker.
  • The speakers 13-1 to 13-6 output sound based on the speaker signals supplied from the SISO filters 12-1 to 12-6, thereby realizing noise canceling.
  • Hereinafter, when it is not necessary to distinguish the microphones 11-1 to 11-6, they are also simply referred to as microphones 11, and when it is not necessary to distinguish the SISO filters 12-1 to 12-6, they are also simply referred to as SISO filters 12. Further, when it is not necessary to distinguish the speakers 13-1 to 13-6, they are also simply referred to as speakers 13.
  • In FIG. 1, the system consisting of the microphone 11-1, the SISO filter 12-1, and the speaker 13-1 is a SISO system corresponding to one channel, and the parallel SISO system is configured by arranging a plurality of such SISO systems in parallel.
  • Since each channel, that is, each SISO system, operates independently, the amount of calculation in each SISO system can be kept small.
  • However, since the correlation between channels is not taken into consideration, the higher the frequency, the greater the influence of the phase shifts between the sounds output by the speakers 13, and the lower the noise canceling effect.
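  • For illustration only, a minimal numpy sketch of such a parallel SISO structure is shown below; the channel count, the filter length, and the placeholder coefficients are assumptions and not values taken from the patent.
```python
import numpy as np

# Hypothetical sketch of the parallel SISO system of FIG. 1: each channel is an
# independent microphone -> FIR filter -> speaker path, so only M filters
# (not M*M) are needed, but inter-channel correlation is ignored.
M = 6          # number of channels (microphone/speaker pairs), assumed
N_taps = 128   # FIR filter length, assumed
rng = np.random.default_rng(0)
w_siso = rng.standard_normal((M, N_taps)) * 0.01   # one placeholder filter per channel

def parallel_siso(mic_block: np.ndarray) -> np.ndarray:
    """mic_block: (M, T) time-domain microphone signals -> (M, T) speaker signals."""
    T = mic_block.shape[1]
    out = np.zeros((M, T))
    for m in range(M):  # each channel is filtered independently
        out[m] = np.convolve(mic_block[m], w_siso[m], mode="full")[:T]
    return out

speaker_block = parallel_siso(rng.standard_normal((M, 1024)))
print(speaker_block.shape)  # (6, 1024); only M = 6 convolutions were needed
```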
  • FIG. 2 is a diagram showing a configuration of a multi-point control MIMO (Multi Input Multi Output) system.
  • In FIG. 2, the same reference numerals are given to the portions corresponding to those in FIG. 1, and the description thereof will be omitted as appropriate.
  • The multipoint control MIMO system shown in FIG. 2 has microphones 11-1 to 11-6, a MIMO filter 41, and speakers 13-1 to 13-6.
  • In this system, the microphone signals obtained from all the microphones 11 are input to one MIMO filter 41.
  • The MIMO filter 41 performs MIMO filtering on the microphone signals supplied from the microphones 11 to generate a speaker signal for each channel, and outputs the speaker signal of each channel to the speaker 13 corresponding to that channel.
  • In this case, since the time-domain filter calculation is performed between every microphone 11 and every speaker 13, the correlation between channels is also taken into consideration, and a good noise canceling effect can be obtained even at high frequencies.
  • However, the larger the number of channels, the larger the amount of calculation in the MIMO filter 41, in proportion to the square of the number of channels.
  • For example, when the number of channels is 48, an amount of filtering calculation for 48 channels is sufficient in the parallel SISO system, whereas in the multipoint control MIMO system filtering corresponding to 48 × 48 microphone-speaker combinations is required.
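  • As a purely illustrative sketch (not taken from the patent), the quadratic growth can be seen in code: with M channels, the multipoint control MIMO filter needs an FIR filter for every microphone-speaker pair, that is, M × M convolutions per block. All names, sizes, and the random coefficients below are assumptions.
```python
import numpy as np

# Hypothetical sketch of multipoint-control MIMO filtering in the time domain.
# Every (microphone, speaker) pair has its own FIR filter, so the number of
# filters grows with the square of the channel count M.
M = 6          # number of microphones / speakers (channels), assumed
N_taps = 128   # FIR filter length, assumed
rng = np.random.default_rng(0)

# w_mimo[s, m, :] is the FIR filter from microphone m to speaker s.
w_mimo = rng.standard_normal((M, M, N_taps)) * 0.01

def mimo_filter(mic_block: np.ndarray) -> np.ndarray:
    """mic_block: (M, T) time-domain microphone signals -> (M, T) speaker signals."""
    T = mic_block.shape[1]
    out = np.zeros((M, T))
    for s in range(M):           # for each speaker channel
        for m in range(M):       # sum contributions from every microphone
            out[s] += np.convolve(mic_block[m], w_mimo[s, m], mode="full")[:T]
    return out

speaker_block = mimo_filter(rng.standard_normal((M, 1024)))
print(speaker_block.shape)  # (6, 1024); M*M = 36 convolutions were needed
```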
  • On the other hand, it is known that the amount of calculation can be significantly reduced by performing processing in the spatial frequency domain with a spatial frequency domain processing system such as the one shown in FIG. 3.
  • In FIG. 3, the same reference numerals are given to the portions corresponding to those in FIG. 1, and the description thereof will be omitted as appropriate.
  • The spatial frequency domain processing system shown in FIG. 3 has microphones 11-1 to 11-6, time FFT (Fast Fourier Transform) units 71-1 to 71-6, a spatial FFT unit 72, a filter processing unit 73, a spatial inverse FFT unit 74, time inverse FFT units 75-1 to 75-6, and speakers 13-1 to 13-6.
  • The time FFT units 71-1 to 71-6 perform time FFT processing on the microphone signals supplied from the microphones 11-1 to 11-6, and supply the resulting time-frequency-domain signals to the spatial FFT unit 72.
  • The spatial FFT unit 72 performs spatial FFT processing on the signals supplied from the time FFT units 71-1 to 71-6, and supplies the resulting spatial-frequency-domain signals (signals for each frequency bin) to the filter processing unit 73.
  • Hereinafter, when it is not necessary to distinguish the time FFT units 71-1 to 71-6, they are also simply referred to as time FFT units 71.
  • In the time FFT unit 71, for example, an STFT (Short Time Fourier Transform) or another FFT along the time axis (temporal Fourier transform) is performed as the time FFT processing, and the microphone signal, which is a time signal, is converted into a signal in the time-frequency domain.
  • The spatial FFT unit 72 performs an FFT along the spatial axis (spatial Fourier transform) as the spatial FFT processing on the time-frequency-domain signals obtained by the time FFT units 71, and thereby obtains time-frequency signals in the spatial frequency domain.
  • The filter processing unit 73 filters the signals supplied from the spatial FFT unit 72 in the spatial frequency domain, and supplies the resulting speaker signal for each frequency bin to the spatial inverse FFT unit 74.
  • The filtering in the filter processing unit 73 is a simple multiplication on the frequency axis, so the amount of calculation can be reduced compared with the multipoint control MIMO system of FIG. 2.
  • The spatial inverse FFT unit 74 performs spatial inverse FFT processing, that is, the inverse transform of the spatial FFT processing, on the spatial-frequency-domain speaker signals supplied from the filter processing unit 73, and supplies the resulting time-frequency-domain speaker signal of each channel to the time inverse FFT units 75-1 to 75-6.
  • The time inverse FFT units 75-1 to 75-6 perform time inverse FFT processing, that is, the inverse transform of the time FFT processing, on the time-frequency-domain speaker signals supplied from the spatial inverse FFT unit 74, and supply the resulting time-domain speaker signal of each channel to the speakers 13-1 to 13-6.
  • Hereinafter, when it is not necessary to distinguish the time inverse FFT units 75-1 to 75-6, they are also simply referred to as time inverse FFT units 75.
  • Such a spatial frequency domain processing system can be used when the microphones 11 and the speakers 13 are arranged in an array of a specific shape such as a ring, and can realize spatial NC with a large number of channels and a wide frequency band at a low calculation cost.
  • In this case, the correlation between channels is taken into consideration, so higher spatial NC performance can be obtained up to higher frequencies than in the parallel SISO system, and the amount of calculation can be kept lower than in the multipoint control MIMO system.
  • Furthermore, a wide region can be controlled, that is, a desired wavefront can be formed with high accuracy over a wide region, so spatial NC can be performed for a wide region.
  • In addition, the adaptive processing of the filter used in the filter processing unit 73 converges quickly, so spatial NC that follows environmental changes can be performed.
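  • For comparison with the low-delay variant introduced later, here is a hedged numpy sketch of this conventional frame-based pipeline; the sizes and the placeholder filter W_prime are assumptions, not the patent's values, and the frame-based time FFT/IFFT is what introduces the system delay discussed below.
```python
import numpy as np

# Hypothetical sketch of the conventional spatial frequency domain processing
# system of FIG. 3: time FFT per microphone, spatial FFT across microphones,
# per-bin multiplication by a filter W'[l, k], then both inverse transforms.
M, N = 6, 256                      # microphones/speakers and time FFT length, assumed
rng = np.random.default_rng(0)
W_prime = rng.standard_normal((M, N // 2 + 1)) + 1j * rng.standard_normal((M, N // 2 + 1))

def process_frame(mic_frame: np.ndarray) -> np.ndarray:
    """mic_frame: (M, N) time-domain samples -> (M, N) time-domain speaker frame."""
    X = np.fft.rfft(mic_frame, axis=1)   # time FFT per microphone: bins k
    Xs = np.fft.fft(X, axis=0)           # spatial FFT across microphones: bins l
    Ys = W_prime * Xs                    # filtering = per-(l, k)-bin multiplication
    Y = np.fft.ifft(Ys, axis=0)          # inverse spatial FFT -> per-speaker spectra
    return np.fft.irfft(Y, n=N, axis=1)  # inverse time FFT -> time-domain frame

speaker_frame = process_frame(rng.standard_normal((M, N)))
print(speaker_frame.shape)  # (6, 256); the N-sample frame is the source of latency
```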
  • Now, consider a microphone array consisting of a plurality of microphones arranged at equal intervals in a ring centered on the origin of an xy coordinate system, which is a predetermined two-dimensional Cartesian coordinate system set in space, and a speaker array consisting of a plurality of speakers arranged at equal intervals in a ring centered on the same origin.
  • Let the number of elements of the microphone array and the speaker array, that is, the number of microphones constituting the microphone array and the number of speakers constituting the speaker array, each be M.
  • Also, let the radius of the microphone array, that is, the distance from the center position (origin) of the microphone array to each microphone, be R_mic, and let the radius of the speaker array be R_spc.
  • The coordinates indicating the arrangement position of each speaker constituting the speaker array can be expressed in the same manner as the coordinates indicating the arrangement position of each microphone.
  • Let n be the discrete time index (time index), and let x[m, n] be the time-domain microphone signal obtained by the m-th microphone constituting the microphone array.
  • Similarly, let y[m, n] be the time-domain speaker signal of the m-th speaker constituting the speaker array.
  • Here, X[m, k] denotes the signal obtained by performing the temporal Fourier transform on the microphone signal x[m, n], k is an index indicating the temporal frequency, and N in equation (2) denotes the temporal Fourier transform length.
  • The spatial Fourier transform is defined in the same way as the temporal Fourier transform.
  • That is, in the temporal Fourier transform the Fourier transform is performed with respect to the time index n, whereas in the spatial Fourier transform the Fourier transform is performed with respect to the microphone index m.
  • Let l be the index indicating the spatial frequency, that is, the index of the spatial frequency bin, and let X'[l, k] be the signal in the spatial frequency domain obtained by performing the spatial Fourier transform on the microphone signal X[m, k].
  • The relationship between the microphone signal X[m, k] and the signal X'[l, k] is shown in the following equation (3).
  • Similarly, if the speaker signal in the spatial frequency domain obtained by performing the spatial Fourier transform on the speaker signal Y[m, k] is Y'[l, k], the relationship between the speaker signal Y[m, k] and the signal Y'[l, k] can also be expressed by an equation of the same form as equation (3).
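  • The equations referred to here appear only as images in the original publication. Based on the surrounding definitions, equations (2) and (3) are presumably the standard temporal and spatial DFTs, roughly of the following form (any normalization factor is an assumption):
```latex
X[m,k] \;=\; \sum_{n=0}^{N-1} x[m,n]\, e^{-j 2\pi k n / N} \qquad \text{(cf. equation (2))}

X'[l,k] \;=\; \sum_{m=0}^{M-1} X[m,k]\, e^{-j 2\pi l m / M} \qquad \text{(cf. equation (3))}
```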
  • In the spatial frequency domain processing system, filtering in the spatial frequency domain is performed on the signal X'[l, k]. If the filter in the spatial frequency domain is denoted W'[l, k], this filtering is expressed by the following equation (4). That is, the speaker signal Y'[l, k] can be obtained by calculating equation (4) based on the signal X'[l, k] and the filter W'[l, k].
  • In the filter processing unit 73, the filtering represented by equation (4) is performed to generate a speaker signal in the spatial frequency domain.
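  • Equation (4) is likewise shown only as an image; from the description it is presumably the per-bin multiplication of the filter and the signal:
```latex
Y'[l,k] \;=\; W'[l,k]\, X'[l,k] \qquad \text{(cf. equation (4))}
```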
  • Here, w[m', n'] represents the time-domain filter for the microphone with microphone index m' corresponding to the filter W'[l, k], and (P)_Q denotes P mod Q.
  • That is, equation (5) expresses the relationship between the time-domain filter w[m, n], the microphone signal x[m, n], and the time-domain speaker signal y[m, n].
  • Filtering based on equation (5) can be used for spatial NC because it is not affected by the system delay caused by the temporal Fourier transform, but its computational cost is the same as that of the multipoint control MIMO system described above, and a large number of operations are required.
  • The multipoint control MIMO system is described in detail in, for example, "C. Hansen, et al., 'Active Control of Noise and Vibration', CRC Press, 2012" (hereinafter referred to as Reference 2).
  • In contrast, in the present technology, the signal obtained by performing, on the time-domain microphone signal x[m, n], the spatial Fourier transform whose DFT point length is the total number M of microphones, that is, the conversion into a signal in the spatial frequency domain (spatial frequency conversion), is defined as x'[l, n].
  • In this case, only the process of converting the microphone signal x[m, n] into the frequency domain in the spatial direction is performed, and therefore the signal x'[l, n] obtained by equation (6) can be said to be a time signal in the spatial frequency domain.
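  • Under the same caveat, equation (6) is presumably the M-point DFT taken over the microphone index only, applied independently at every time sample n:
```latex
x'[l,n] \;=\; \sum_{m=0}^{M-1} x[m,n]\, e^{-j 2\pi l m / M} \qquad \text{(cf. equation (6))}
```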
  • Similarly, the time-domain filter w[m, n] is not subjected to the temporal Fourier transform (temporal frequency conversion) but only to the spatial Fourier transform (spatial frequency conversion), and the resulting filter in the spatial frequency domain (for each spatial frequency bin) is defined as w'[l, n].
  • Likewise, the speaker signal in the spatial frequency domain (for each spatial frequency bin) obtained by performing only the spatial Fourier transform (spatial frequency conversion), without the temporal Fourier transform (temporal frequency conversion), on the time-domain speaker signal y[m, n] is defined as y'[l, n]. More precisely, the speaker signal y'[l, n] is not itself a speaker drive signal that drives a speaker; rather, the noise canceling sound output from each speaker is calculated from this speaker signal y'[l, n].
  • At this time, the following equation (7) can be obtained by performing only the inverse temporal Fourier transform, without performing the inverse spatial Fourier transform, on both sides of equation (4).
  • That is, the speaker signal y'[l, n] can be obtained by filtering the signal x'[l, n] with the filter w'[l, n] of filter length N_f, that is, by a convolution of the filter w'[l, n] and the signal x'[l, n] in the time direction.
  • The filtering operation shown in equation (7) is a process in the spatial frequency domain, but it does not require spatial convolution; that is, it is performed independently for each spatial frequency index (frequency bin) l, and only a convolution in the time direction is needed.
  • Therefore, apart from multiplication by a constant, the actual amount of computation for obtaining the speaker signal y'[l, n] is the same as that of the parallel SISO system described above.
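  • And equation (7), as described, is presumably a time-direction convolution carried out independently for each spatial frequency bin l:
```latex
y'[l,n] \;=\; \sum_{n'=0}^{N_f-1} w'[l,n']\, x'[l,\,n-n'] \qquad \text{(cf. equation (7))}
```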
  • Hereinafter, a system that generates a speaker signal as a noise canceling signal by the spatial frequency conversion shown in equation (6) and the filtering shown in equation (7) is also referred to as a low-delay spatial frequency domain processing system.
  • In the low-delay spatial frequency domain processing system, the filtering operation is a convolution rather than a simple multiplication as in the spatial frequency domain processing system, but since SISO filtering is sufficient, the amount of computation is significantly reduced compared with the multipoint control MIMO system.
  • In addition, since only the spatial FFT (spatial frequency conversion) is performed and no temporal Fourier transform is required, the delay time (latency) generated in the system is extremely small.
  • As a result, the low-delay spatial frequency domain processing system is a realistic system with low delay and a small amount of computation, and can realize higher-performance spatial NC.
  • FIG. 4 is a diagram showing a configuration example of a noise canceling device which is an example of an embodiment of a low delay spatial frequency domain processing system to which the present technology is applied.
  • the noise canceling device 101 shown in FIG. 4 has a microphone array 111, a signal processing device 112, and a speaker array 113.
  • the microphone array 111 is a microphone array such as an annular microphone array obtained by arranging the microphones 121-1 to 121-16 in a predetermined shape such as an annular shape or a rectangular shape.
  • the microphones 121-1 to 121-16 collect ambient sounds including noise to be canceled, and supply the resulting microphone signal to the signal processing device 112.
  • Hereinafter, when it is not necessary to distinguish the microphones 121-1 to 121-16, they are also simply referred to as microphones 121.
  • The signal processing device 112 is composed of, for example, a personal computer having one or more arithmetic units, generates time-domain speaker signals for spatial NC based on the microphone signals supplied from the microphone array 111, and outputs them to the speaker array 113.
  • The time-domain speaker signal is a noise canceling signal for spatial NC, and is a speaker drive signal that drives the speakers constituting the speaker array 113 so as to output a noise canceling sound.
  • The signal processing device 112 has a signal processing unit 131 composed of one arithmetic unit such as a DSP (Digital Signal Processor) or an FPGA (Field Programmable Gate Array).
  • The signal processing unit 131 has a spatial frequency conversion unit 141, filter processing units 142-1 to 142-16, and a spatial frequency synthesis unit 143.
  • The spatial frequency conversion unit 141 performs spatial frequency conversion on the time-domain microphone signals, that is, the time signals, supplied from the microphones 121-1 to 121-16, and supplies the resulting spatial-frequency-domain signals to the filter processing units 142-1 to 142-16. In other words, the spatial frequency conversion unit 141 converts the time-domain microphone signals into the spatial frequency domain.
  • Specifically, the DFT shown in equation (6) is performed as the spatial frequency conversion based on the microphone signals supplied from all the microphones 121.
  • In this example, the total number M of the microphones 121 is "16", and the signal x'[l, n] for each of the 16 spatial frequency bins l corresponding to the filter processing units 142-1 to 142-16 is calculated.
  • The filter processing units 142-1 to 142-16 generate speaker signals in the spatial frequency domain by performing signal processing in the spatial frequency domain on the signals supplied from the spatial frequency conversion unit 141, and supply them to the spatial frequency synthesis unit 143.
  • Specifically, the filter processing units 142-1 to 142-16 hold SISO filters for spatial NC, and filtering of the spatial-frequency-domain signals from the spatial frequency conversion unit 141 by those SISO filters is performed as the signal processing in the spatial frequency domain. More specifically, the process of convolving the filter coefficients constituting the SISO filter with the signal in the spatial frequency domain is performed as the filtering by the SISO filter.
  • That is, the filter processing units 142-1 to 142-16 hold the above-described filter w'[l, n] as the SISO filter, perform the calculation represented by equation (7) as the filtering, and generate the speaker signal y'[l, n].
  • Hereinafter, when it is not necessary to distinguish the filter processing units 142-1 to 142-16, they are also simply referred to as filter processing units 142.
  • In this example, one filter processing unit 142 performs the filtering for one spatial frequency bin l.
  • The SISO filter held by the filter processing unit 142 is, for example, an FIR (Finite Impulse Response) filter generated in advance by LMS (Least Mean Squares) or the like based on the shape of the microphone array 111, the total number of microphones 121, and so on.
  • Note that a SISO filter prepared in advance may be used continuously, or the SISO filter may be updated sequentially based on a microphone signal obtained by sound collection with a microphone or the like installed at a control point.
  • The spatial frequency synthesis unit 143 generates a time-domain speaker signal for each speaker by performing spatial frequency synthesis on the spatial-frequency-domain speaker signals supplied from the filter processing units 142, and supplies them to the speaker array 113.
  • Here, the inverse transform of the spatial frequency conversion performed by the spatial frequency conversion unit 141 is performed as the spatial frequency synthesis. Therefore, for example, when the DFT (spatial Fourier transform) shown in equation (6) is performed by the spatial frequency conversion unit 141, the spatial frequency synthesis unit 143 performs the IDFT (Inverse Discrete Fourier Transform), that is, the inverse spatial Fourier transform, corresponding to equation (6).
  • The speaker array 113 is a speaker array such as an annular speaker array obtained by arranging the speakers 151-1 to 151-16, which are speaker units, in a predetermined shape such as an annular shape or a rectangular shape.
  • The speakers 151-1 to 151-16 are driven based on the time-domain speaker signals supplied from the spatial frequency synthesis unit 143 and output noise canceling sound. As a result, the noise sound is canceled in the predetermined target area, and spatial NC is realized.
  • Hereinafter, when it is not necessary to distinguish the speakers 151-1 to 151-16, they are also simply referred to as speakers 151.
  • In FIG. 5, the parts corresponding to those in FIG. 4 are designated by the same reference numerals, and the description thereof will be omitted. Further, in FIG. 5, in order to make the figure easier to see, the reference numerals of the individual microphones 121 and speakers 151 are not shown.
  • In this example, the user U11, who is a listener listening to content or the like, is in a predetermined region R11, and this region R11 is the region targeted by spatial NC (the cancellation area).
  • The speakers 151 constituting the speaker array 113 are arranged in a ring so as to surround the region R11, which is the cancellation area, and form an annular speaker array.
  • Further, the microphones 121 constituting the microphone array 111 are arranged in a ring outside the speaker array 113 so as to surround the speaker array 113, and form an annular microphone array.
  • In particular, the speaker array 113 and the microphone array 111 are arranged so that their centers coincide with the center position of the circular region R11.
  • In the noise canceling device 101, the microphone array 111 arranged outside the speaker array 113 as viewed from the region R11 collects noise (noise sound) that is generated outside the microphone array 111 and propagates toward the region R11.
  • Then, a speaker signal is generated based on the microphone signals obtained by the sound collection, and noise canceling sound based on the speaker signal is output from each speaker 151 constituting the speaker array 113 in the direction of the region R11.
  • The wavefronts of the noise canceling sound output from the speakers 151 are combined to form a wavefront that cancels the noise sound in the region R11.
  • In this way, spatial NC by wave field synthesis is realized.
  • Note that the numbers of microphones 121 and speakers 151, and the shapes of the microphone array 111 and the speaker array 113, do not necessarily have to be the same, and may be different numbers and shapes.
  • In such a case, the spatial frequency conversion unit 141 or the spatial frequency synthesis unit 143 may perform upsampling or downsampling on the signal in the spatial frequency domain according to those numbers.
  • Furthermore, the number of microphones 121 and speakers 151 may be any number, and the shapes (array shapes) of the microphone array 111 and the speaker array 113 may be any shape.
  • In step S11, each microphone 121 of the microphone array 111 collects ambient sound and supplies the resulting time-domain microphone signal to the spatial frequency conversion unit 141.
  • In step S12, the spatial frequency conversion unit 141 performs spatial frequency conversion on the time-domain microphone signals supplied from the microphones 121, and supplies the resulting spatial-frequency-domain signals to each filter processing unit 142. For example, in step S12, the calculation of equation (6) above is performed, and signals in the spatial frequency domain are generated.
  • In step S13, the filter processing units 142 filter the spatial-frequency-domain signals supplied from the spatial frequency conversion unit 141 with the SISO filters they hold, and supply the resulting spatial-frequency-domain speaker signals to the spatial frequency synthesis unit 143. For example, in step S13, the calculation of equation (7) is performed as the filtering.
  • In step S14, the spatial frequency synthesis unit 143 performs spatial frequency synthesis on the spatial-frequency-domain speaker signals supplied from the filter processing units 142, and generates time-domain speaker signals.
  • In step S15, the spatial frequency synthesis unit 143 supplies the speaker signals obtained in step S14 to the speakers 151 of the speaker array 113 to output sound (noise canceling sound).
  • As described above, the noise canceling device 101 performs spatial frequency conversion on the time-domain microphone signals without performing time-frequency conversion, and generates the speaker signals based on the resulting signals in the spatial frequency domain.
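  • As an illustration only (not the patented implementation itself), a minimal numpy sketch of this low-delay pipeline could look as follows; M, the filter length, and the placeholder coefficients are assumptions, and a real device would process the signals sample by sample or in small blocks rather than in one large block.
```python
import numpy as np

M = 16           # microphones and speakers (as in FIG. 4), assumed equal here
N_F = 64         # SISO filter length per spatial frequency bin (assumed)
rng = np.random.default_rng(0)
# w_prime[l, :] stands in for the per-bin SISO filter w'[l, n] (placeholder coefficients).
w_prime = rng.standard_normal((M, N_F)) * 0.01

def low_delay_spatial_nc(mic_block: np.ndarray) -> np.ndarray:
    """mic_block: (M, T) time-domain microphone signals -> (M, T) speaker signals."""
    T = mic_block.shape[1]
    # Step S12: spatial frequency conversion (DFT over the microphone index only).
    x_prime = np.fft.fft(mic_block, axis=0)                 # shape (M, T), complex
    # Step S13: independent time-direction convolution per spatial frequency bin l.
    y_prime = np.zeros_like(x_prime)
    for l in range(M):
        y_prime[l] = np.convolve(x_prime[l], w_prime[l], mode="full")[:T]
    # Step S14: spatial frequency synthesis (inverse DFT over the bin index).
    speakers = np.fft.ifft(y_prime, axis=0)
    # With filters derived from a real w[m, n] the imaginary part would vanish;
    # here we simply keep the real part for the placeholder coefficients.
    return speakers.real

speaker_block = low_delay_spatial_nc(rng.standard_normal((M, 1024)))
print(speaker_block.shape)  # (16, 1024); no framed time FFT, hence no frame delay
```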
  • <Second embodiment> <Processing in the spatial frequency domain>
  • By the way, in the noise canceling device 101 shown in FIG. 4, the spatial frequency conversion unit 141 first performs the spatial frequency conversion, and the time-domain microphone signals are converted into signals in the spatial frequency domain.
  • Therefore, the outputs (microphone signals) of all the microphones 121 constituting the microphone array 111 must be input to the signal processing unit 131 for the spatial frequency conversion (DFT). Further, even after the filtering in the spatial frequency domain, it is necessary to perform the spatial frequency synthesis using the outputs of all the filter processing units 142.
  • As a result, one arithmetic unit serving as the signal processing unit 131 must perform the spatial frequency conversion, the signal processing (filtering) in the spatial frequency domain, and the spatial frequency synthesis. That is, it is not possible to divide the hardware that performs these processes and share the processing among a plurality of pieces of hardware (arithmetic units), that is, to distribute the processing.
  • For example, if the upper limit of the frequency targeted for noise canceling is set to 1 kHz and the cancellation area (region R11) is to be a region 2 m in diameter, a large number of microphones 121 and speakers 151, and hence a large number of channels, is required in order to obtain sufficient spatial NC performance.
  • In such a case, it may be physically impossible for a single arithmetic unit such as one DSP or FPGA to serve as the signal processing unit 131 because of the number of PINs (the number of input terminals and output terminals) provided on the arithmetic unit.
  • In addition, one signal processing unit 131 may be unable to perform the calculation (processing) because the amount of calculation is too large, and it may not be possible to realize spatial NC.
  • Therefore, the plurality of microphones 121 and speakers 151 constituting the microphone array 111 and the speaker array 113 may be divided into a plurality of groups, and the spatial frequency conversion, the filtering in the spatial frequency domain, and the spatial frequency synthesis may be performed for each of the divided groups.
  • In this way, the calculation for spatial NC can be shared among a plurality of arithmetic units, so high spatial NC performance can be obtained while reducing the number of PINs and the amount of computation required of any one arithmetic unit.
  • Here, as shown in FIG. 7, consider a case where a noise sound is generated with a position P11 outside the microphone array 111 as the sound source position of a single point sound source.
  • In FIG. 7, the same reference numerals are given to the portions corresponding to those in FIG. 5, and the description thereof will be omitted as appropriate.
  • In the noise canceling device 101, all the microphones 121 constituting the microphone array 111 and all the speakers 151 constituting the speaker array 113 are used to perform spatial NC.
  • However, the degree (contribution rate) to which each microphone 121 and speaker 151 is used for sound collection and sound output, that is, its importance for realizing spatial NC, differs from microphone to microphone and from speaker to speaker.
  • Specifically, the importance is high for the microphones 121 and speakers 151 arranged at positions close to the position P11 where the noise source is located, and conversely the microphones 121 and speakers 151 arranged at positions far from the position P11 are less important.
  • Therefore, as shown in FIG. 8, for example, it is also possible to perform spatial NC using only the four microphones 121 and the twelve speakers 151 near the position P11 where the noise source is located.
  • In FIG. 8, the parts corresponding to those in FIG. 7 are designated by the same reference numerals, and the description thereof will be omitted as appropriate.
  • In the example of FIG. 8, the number of microphones 121 used for spatial NC is four, while the number of speakers 151 used for spatial NC is twelve. Therefore, in order to generate speaker signals for spatial NC by the same calculation as in the noise canceling device 101, microphone signals for eight additional microphones 121 are required; if such inputs are supplied, the speaker signals of each of the twelve speakers 151 can be generated in the same manner as in the noise canceling device 101.
  • In this way, the noise sound generated at the position P11 can be sufficiently canceled without using all the microphones 121 and speakers 151.
  • However, in the example of FIG. 8, the noise sound generation position (noise source position) must be near the position P11, and noise sound from all directions cannot be handled.
  • Therefore, if the microphones 121 constituting the microphone array 111 and the speakers 151 constituting the speaker array 113 are each divided into four groups and a speaker signal is generated for each group, it becomes possible to handle noise sound from all directions.
  • In the figure illustrating this grouping, the parts corresponding to those in FIG. 7 are designated by the same reference numerals, and the description thereof will be omitted as appropriate.
  • In this example, the 16 microphones 121 constituting the microphone array 111 and the 16 speakers 151 constituting the speaker array 113 are each divided into four groups.
  • Hereinafter, a group of microphones 121 will also be referred to as a microphone group, and a group of speakers 151 will also be referred to as a speaker group.
  • Specifically, the 16 microphones 121 constituting the microphone array 111 are divided into four microphone groups as indicated by arrows Q21 to Q24.
  • Here, the grouping is performed so that one microphone 121 belongs to only one microphone group and microphones 121 arranged adjacent to each other belong to the same microphone group.
  • In this example, one microphone group consists of four microphones 121.
  • For example, one microphone group is formed by the four microphones 121 arranged adjacent to each other on the right front side as viewed from the user U11.
  • Similarly, microphone groups are formed by the four microphones 121 arranged adjacent to each other in each of the right rear, left rear, and left front directions as viewed from the user U11.
  • Further, for the microphone group on the right front side as viewed from the user U11, a speaker group consisting of the twelve speakers 151 arranged adjacent to each other around the position on the right front side of the user U11 is formed.
  • Similarly, for the microphone groups in the right rear, left rear, and left front directions as viewed from the user U11, speaker groups each consisting of twelve speakers 151 arranged adjacent to each other around the position in the corresponding direction are formed.
  • In this case, since the speaker groups are formed so that twelve speakers 151 arranged adjacent to each other belong to one speaker group, one speaker 151 belongs to three speaker groups.
  • If the speaker signals of the corresponding speaker group are generated for each microphone group in this way, the entire processing for spatial NC can be divided into four. That is, for example, by providing one arithmetic unit corresponding to the signal processing unit 131 for each corresponding pair of a microphone group and a speaker group, the hardware is divided into four, and the processing for spatial NC can be distributed over the plurality of arithmetic units.
  • In other words, the plurality of microphones 121 constituting the microphone array 111 are divided into four microphone groups so that four microphones 121 adjacent to each other belong to the same microphone group. That is, the microphones 121 used for one filtering are selected while shifting by four microphones 121 at a time.
  • Then, for each filtering, the speakers 151 that are the output destinations of the speaker signals obtained by that filtering are selected, and the speaker signals are supplied to the selected speakers 151.
  • By performing grouping in this way so as to divide the entire processing part by part, and adding the speaker signals in the parts where the filtering output destinations overlap to obtain the final speaker signals, spatial NC can be performed using all the microphones 121 and speakers 151. This makes it possible to handle noise sound from all directions.
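  • A small, purely illustrative Python sketch of this grouping (the index convention is an assumption chosen to match the arrangement of the noise canceling device described next) shows that with disjoint microphone groups of four and speaker groups of twelve, every speaker ends up covered by exactly three groups:
```python
# Minimal sketch (assumed 0-based indexing) of the grouping described above: 16 microphones
# in 4 disjoint groups of 4, and for each group a speaker group of the 12 speakers
# centered on it, so that every speaker belongs to exactly 3 speaker groups.
M = 16
mic_groups = [list(range(g * 4, g * 4 + 4)) for g in range(4)]
speaker_groups = []
for g in range(4):
    center = g * 4 + 2                       # rough center of the microphone group
    speaker_groups.append([(center - 6 + i) % M for i in range(12)])

coverage = [sum(s in grp for grp in speaker_groups) for s in range(M)]
print(mic_groups)   # [[0,1,2,3], [4,5,6,7], [8,9,10,11], [12,13,14,15]]
print(coverage)     # [3, 3, ..., 3] -> each speaker is driven by 3 groups
```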
  • In such a case, the noise canceling device is configured as shown in FIG. 10, for example.
  • In FIG. 10, the same reference numerals are given to the portions corresponding to those in FIG. 4, and the description thereof will be omitted as appropriate.
  • The noise canceling device 191 shown in FIG. 10 has a microphone array 111, a signal processing device 201, and a speaker array 113.
  • In the noise canceling device 191, the microphone array 111 and the speaker array 113 are each divided into four groups.
  • That is, the microphones 121-1 to 121-4 form one group, and similarly each of the microphones 121-5 to 121-8, the microphones 121-9 to 121-12, and the microphones 121-13 to 121-16 forms one group.
  • Further, the speakers 151-1 to 151-12 form one group, and the speakers 151-5 to 151-16 also form one group.
  • In addition, the speakers 151-9 to 151-16 and the speakers 151-1 to 151-4 form one group, and the speakers 151-13 to 151-16 and the speakers 151-1 to 151-8 form one group.
  • The signal processing device 201 corresponds to the signal processing device 112 of FIG. 4, and is composed of, for example, a personal computer having one or a plurality of arithmetic units.
  • The signal processing device 201 has signal processing units 211-1 to 211-4 and addition units 212-1 to 212-16.
  • Each of the signal processing units 211-1 to 211-4 is composed of one arithmetic unit such as a DSP or FPGA, and corresponds to the signal processing unit 131 of FIG. 4.
  • For example, the signal processing unit 211-1 performs the same processing as the signal processing unit 131 based on the microphone signals supplied from the microphones 121-1 to 121-4 and predetermined eight zero signals treated as microphone signals, and generates speaker signals.
  • Specifically, the signal processing unit 211-1 generates speaker signals for twelve channels, that is, speaker signals whose output destinations are the speakers 151-13 to 151-16 and the speakers 151-1 to 151-8.
  • Then, the signal processing unit 211-1 supplies the generated speaker signals to the addition units of the corresponding channels, that is, to the addition units 212-13 to 212-16 and the addition units 212-1 to 212-8.
  • Similarly, the signal processing units 211-2 to 211-4 also generate and output twelve channels of speaker signals based on the microphone signals from four microphones 121 and eight zero signals.
  • That is, the signal processing unit 211-2 receives the microphone signals from the microphones 121-5 to 121-8 and supplies speaker signals to the addition units 212-1 to 212-12.
  • The signal processing unit 211-3 receives the microphone signals from the microphones 121-9 to 121-12 and supplies speaker signals to the addition units 212-5 to 212-16.
  • The signal processing unit 211-4 receives the microphone signals from the microphones 121-13 to 121-16 and supplies speaker signals to the addition units 212-9 to 212-16 and the addition units 212-1 to 212-4.
  • Hereinafter, when it is not necessary to distinguish the signal processing units 211-1 to 211-4, they are also simply referred to as signal processing units 211.
  • In the noise canceling device 191, for each microphone group, the signal processing unit 211 to which the microphone signals obtained by the sound collection of the microphones 121 of that microphone group are input is determined in advance.
  • Each signal processing unit 211 performs filtering by SISO filters based on the microphone signals supplied from all the microphones 121 belonging to one microphone group, and generates the speaker signals of a part of the speakers 151 of the speaker array 113, that is, the speakers 151 belonging to the speaker group corresponding to that microphone group.
  • The addition units 212-1 to 212-16 add the speaker signals of the same channel supplied from a plurality of signal processing units 211 to obtain the final speaker signal, and supply the final speaker signal to the speaker 151 of the corresponding channel.
  • For example, the addition units 212-1 to 212-4 receive the supply of speaker signals from the signal processing unit 211-1, the signal processing unit 211-2, and the signal processing unit 211-4, and the addition units 212-5 to 212-8 receive the supply of speaker signals from the signal processing units 211-1 to 211-3.
  • Similarly, the addition units 212-9 to 212-12 receive the supply of speaker signals from the signal processing units 211-2 to 211-4, and the addition units 212-13 to 212-16 receive the supply of speaker signals from the signal processing unit 211-1, the signal processing unit 211-3, and the signal processing unit 211-4.
  • Hereinafter, when it is not necessary to distinguish the addition units 212-1 to 212-16, they are also simply referred to as addition units 212.
  • In this way, one addition unit 212 is provided for each of the plurality of speakers 151 constituting the speaker array 113, and each addition unit 212 adds the speaker signals for the same speaker 151 obtained by two or more signal processing units 211 and outputs the result.
  • Note that the signal processing units 211 may each be provided in a plurality of signal processing devices different from each other.
  • FIG. 11 is a diagram showing a configuration example of the signal processing unit 211 of the noise canceling device 191.
  • The signal processing unit 211 has a spatial frequency conversion unit 241, filter processing units 242-1 to 242-12, and a spatial frequency synthesis unit 243.
  • The spatial frequency conversion unit 241, the filter processing units 242-1 to 242-12, and the spatial frequency synthesis unit 243 correspond to the spatial frequency conversion unit 141, the filter processing units 142, and the spatial frequency synthesis unit 143 shown in FIG. 4, respectively.
  • The spatial frequency conversion unit 241 performs spatial frequency conversion based on the time-domain microphone signals supplied from each of the four microphones 121 and the eight zero signals supplied as dummy microphone signals.
  • Here, the same DFT as in equation (6) is performed as the spatial frequency conversion, but the DFT point length is set to 12, and the signal x'[l, n] for each of the twelve spatial frequency bins l corresponding to the filter processing units 242-1 to 242-12 is calculated.
  • The spatial frequency conversion unit 241 supplies the spatial-frequency-domain signals obtained by the spatial frequency conversion to the filter processing units 242-1 to 242-12.
  • The filter processing units 242-1 to 242-12 generate speaker signals in the spatial frequency domain by performing signal processing in the spatial frequency domain on the signals supplied from the spatial frequency conversion unit 241, and supply them to the spatial frequency synthesis unit 243.
  • Specifically, the filter processing units 242-1 to 242-12 hold SISO filters for spatial NC.
  • The filter processing units 242-1 to 242-12 generate the speaker signals by filtering the spatial-frequency-domain signals from the spatial frequency conversion unit 241 with the SISO filters they hold, as the signal processing.
  • This SISO filter is, for example, the above-described filter w'[l, n], and the calculation of equation (7) is performed as the filtering.
  • Hereinafter, when it is not necessary to distinguish the filter processing units 242-1 to 242-12, they are also simply referred to as filter processing units 242.
  • The spatial frequency synthesis unit 243 generates a time-domain speaker signal for each speaker 151 by performing spatial frequency synthesis on the spatial-frequency-domain speaker signals supplied from the filter processing units 242, and supplies them to the speaker array 113.
  • In the spatial frequency synthesis unit 243, the inverse transform of the spatial frequency conversion performed by the spatial frequency conversion unit 241 is performed as the spatial frequency synthesis.
  • As described above, in the noise canceling device 191, the outputs of the plurality of microphones 121 constituting the microphone array 111 are divided into four and input to the respective signal processing units 211.
  • In each signal processing unit 211, the microphone signals supplied from the four microphones 121 are input to the four input terminals in the center of the twelve input terminals, and a zero signal, which is a dummy microphone signal, is input to each of the remaining eight input terminals, four at each of the left and right ends.
  • In the spatial frequency conversion unit 241, a DFT with a point length of "12", for example, is performed as the spatial frequency conversion based on the microphone signals input from the input terminals, and then, in each filter processing unit 242, filtering by the SISO filter is performed on the DFT output.
  • Further, in the spatial frequency synthesis unit 243, an IDFT is performed as the spatial frequency synthesis on the outputs of the filter processing units 242, and a time-domain speaker signal of each channel is generated.
  • The speaker signal of each channel generated in this way is input to the speaker 151 corresponding to that channel, but before that, in each addition unit 212, the speaker signals of the same channel from three adjacent signal processing units 211 are added.
  • Note that the addition processing of the speaker signals of the same channel may be performed in an amplifier or before the speaker signals are input to the amplifier, and the addition processing may be performed in either a digital or an analog state.
  • In each signal processing unit 211, the input/output of the spatial frequency conversion unit 241 and the spatial frequency synthesis unit 243, that is, the point length of the DFT and IDFT, is "12", which is smaller than the point length "16" in the case of the signal processing unit 131 shown in FIG. 4.
  • Therefore, the number of PINs (the number of input/output terminals) of the signal processing unit 211 can be reduced compared with the signal processing unit 131, and the amount of calculation (signal processing) performed by the signal processing unit 211 can also be reduced.
  • Thus, with the noise canceling device 191, it is possible to reduce the number of PINs and the amount of calculation of one signal processing unit 211 while also reducing the delay time, and to obtain high spatial NC performance in real time. Moreover, it is possible to handle noise sound from all directions.
  • Note that the point length (number of divisions) in each signal processing unit can be set arbitrarily according to the specifications of the signal processing units (arithmetic units), such as the number of PINs and the number of MIPS (Million Instructions Per Second); for example, a 256-channel microphone signal may be divided into groups of 12 channels for processing.
  • Since the process of step S51 is the same as the process of step S11 of FIG. 6, the description thereof will be omitted. However, in step S51, the microphone signal obtained by each microphone 121 is supplied to the spatial frequency conversion unit 241 of the corresponding signal processing unit 211.
  • In step S52, the spatial frequency conversion unit 241 of each signal processing unit 211 performs spatial frequency conversion on the time-domain microphone signals supplied from the four microphones 121 and the eight zero signals, and supplies the resulting spatial-frequency-domain signals to each filter processing unit 242. In step S52, the same calculation as in equation (6) above is performed.
  • In step S53, the filter processing units 242 filter the spatial-frequency-domain signals supplied from the spatial frequency conversion unit 241 with the SISO filters they hold, and supply the resulting spatial-frequency-domain speaker signals to the spatial frequency synthesis unit 243. For example, in step S53, the same calculation as in equation (7) is performed as the filtering.
  • In step S54, the spatial frequency synthesis unit 243 performs spatial frequency synthesis on the spatial-frequency-domain speaker signals supplied from the filter processing units 242, and supplies the resulting time-domain speaker signals to the addition units 212.
  • In step S55, the addition units 212 perform addition processing to add the speaker signals of the same channel supplied from the spatial frequency synthesis units 243 of the three corresponding signal processing units 211, and obtain the final speaker signals.
  • Then, each addition unit 212 supplies the speaker signal obtained in step S55 to the corresponding speaker 151 of the speaker array 113 to output sound (noise canceling sound), and the noise canceling process ends.
  • As described above, the noise canceling device 191 divides the output of the microphone array 111 into four and inputs it to the signal processing units 211, and each signal processing unit 211 generates speaker signals by signal processing in the spatial frequency domain. By doing so, it is possible to reduce the number of PINs and the amount of calculation of one signal processing unit 211 while also reducing the delay time, and to realize in real time a high-performance spatial NC that can handle all directions.
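  • Purely as an illustration under the same assumptions as the earlier sketches (placeholder filters, block processing, an assumed index convention), the grouped processing of the noise canceling device 191 could be sketched as follows: each of the four units zero-pads its four microphone signals to twelve channels, performs a 12-point spatial DFT, per-bin FIR filtering, and a 12-point inverse DFT, and the overlapping twelve-channel outputs are summed by the addition units into the sixteen speaker channels.
```python
import numpy as np

M, G, P = 16, 4, 12          # total channels, groups, per-group DFT point length
N_F, T = 64, 1024            # per-bin FIR length and block length (assumed)
rng = np.random.default_rng(0)
w_prime = rng.standard_normal((G, P, N_F)) * 0.01   # placeholder SISO filters per group/bin

def speaker_channels(g):
    # Speaker channels driven by group g: 12 consecutive speakers centered on its 4 mics
    # (e.g. the group holding mics 0-3 drives speakers 12-15 and 0-7, as in FIG. 10).
    return [(g * 4 - 4 + i) % M for i in range(P)]

def group_unit(g, mic4):
    """One signal processing unit 211: (4, T) mic signals -> (12, T) speaker signals."""
    padded = np.zeros((P, T))
    padded[4:8] = mic4                        # 4 real inputs in the middle, 8 zero signals
    x_prime = np.fft.fft(padded, axis=0)      # 12-point spatial DFT (step S52)
    y_prime = np.zeros_like(x_prime)
    for l in range(P):                        # per-bin time-direction FIR (step S53)
        y_prime[l] = np.convolve(x_prime[l], w_prime[g, l], mode="full")[:T]
    return np.fft.ifft(y_prime, axis=0).real  # 12-point inverse DFT (step S54)

mic_block = rng.standard_normal((M, T))
speakers = np.zeros((M, T))
for g in range(G):
    out12 = group_unit(g, mic_block[g * 4:(g + 1) * 4])
    for row, ch in enumerate(speaker_channels(g)):
        speakers[ch] += out12[row]            # addition units 212: overlapping outputs summed
print(speakers.shape)  # (16, 1024)
```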
  • In the example described above, the addition units 212 are provided after the signal processing units 211 in order to share the processing among the plurality of signal processing units 211.
  • However, as shown in FIGS. 13 and 14, if the number of microphones 121 belonging to each microphone group is increased and the outputs of those microphones 121 are input to a plurality of adjacent signal processing units 211 (arithmetic units) in an overlapping manner, a configuration without the addition units 212 is also possible.
  • In FIG. 13 and FIG. 14, the parts corresponding to those in FIG. 7 are designated by the same reference numerals, and the description thereof will be omitted as appropriate.
  • In this configuration, the noise sound from the position P11 is signal-processed, that is, filtered, using twelve microphones 121, and of the resulting speaker signals, only the speaker signals of four channels are used, so that sound is output from the corresponding four speakers 151.
  • Further, in this case, the microphones 121 and the speakers 151 are each divided into four groups.
  • That is, in this example, one microphone group is formed by the twelve microphones 121 arranged adjacent to each other around the position on the right front side of the user U11.
  • Similarly, microphone groups are formed by the twelve microphones 121 arranged adjacent to each other around the positions in the right rear, left rear, and left front directions of the user U11.
  • In this case, since the microphone groups are formed so that twelve microphones 121 arranged adjacent to each other belong to one microphone group, one microphone 121 belongs to three microphone groups.
  • Further, for the microphone group on the right front side of the user U11, a speaker group consisting of the four speakers 151 arranged adjacent to each other around the position on the right front side of the user U11 is formed.
  • Similarly, for the microphone groups in the right rear, left rear, and left front directions of the user U11, speaker groups each consisting of four speakers 151 arranged adjacent to each other around the position in the corresponding direction are formed.
  • In this case, the speaker groups are formed so that one speaker 151 belongs to only one speaker group and speakers 151 arranged adjacent to each other belong to the same speaker group.
  • <Configuration example of noise canceling device>
  • the noise canceling device is configured as shown in FIG. 15, for example.
  • In FIG. 15, the parts corresponding to those in FIG. 10 are designated by the same reference numerals, and their description will be omitted as appropriate.
  • The noise canceling device 281 shown in FIG. 15 has a microphone array 111, a signal processing device 201, and a speaker array 113. Further, the signal processing device 201 has signal processing units 211-1 to 211-4.
  • The configuration of the noise canceling device 281 differs from that of the noise canceling device 191 in that no addition unit 212 is provided, and is otherwise the same as the configuration of the noise canceling device 191.
  • In addition, the noise canceling device 281 and the noise canceling device 191 differ in the input/output relationships among the signal processing units 211, the microphones 121, and the speakers 151.
  • That is, the microphones 121-1 to 121-8 and the microphones 121-13 to 121-16 are grouped together, and the microphone signals of these microphones 121 are supplied to the signal processing unit 211-1.
  • the microphones 121-1 to 121-12 are grouped together, and the microphone signals of these microphones 121 are supplied to the signal processing unit 211-2.
  • Microphones 121-5 to 121-16 are grouped together, and the microphone signals of these microphones 121 are supplied to the signal processing unit 211-3.
  • Microphones 121-9 to 121-16 and microphones 121-1 to 121-4 are grouped together, and the microphone signals of these microphones 121 are supplied to the signal processing unit 211-4.
  • In this example, the output of one microphone 121 is input to two or more, more specifically three, signal processing units 211 predetermined for that microphone 121 (microphone group). Therefore, no dummy microphone signal (zero signal) is supplied to the spatial frequency conversion unit 241 of each signal processing unit 211; instead, the microphone signals of twelve microphones 121 are input.
  • the speakers 151-1 to 151-4 are grouped together, and the speakers 151-5 to 151-8 are also grouped together.
  • Similarly, the speakers 151-9 to 151-12 are grouped together, and the speakers 151-13 to 151-16 are grouped together.
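  • The grouping just described can be written down as index lists, as in the sketch below; the ring indexing 1 to 16 and the ordering within each list are assumptions for illustration, and only the group membership matters.

    # Overlapping microphone groups of twelve, one per signal processing unit 211.
    mic_groups = [
        list(range(1, 9)) + [13, 14, 15, 16],   # -> signal processing unit 211-1
        list(range(1, 13)),                     # -> signal processing unit 211-2
        list(range(5, 17)),                     # -> signal processing unit 211-3
        list(range(9, 17)) + [1, 2, 3, 4],      # -> signal processing unit 211-4
    ]

    # Disjoint speaker groups of four; each unit drives only its own group.
    speaker_groups = [
        [1, 2, 3, 4],      # outputs of unit 211-1
        [5, 6, 7, 8],      # outputs of unit 211-2
        [9, 10, 11, 12],   # outputs of unit 211-3
        [13, 14, 15, 16],  # outputs of unit 211-4
    ]

    # Each microphone appears in exactly three microphone groups, while each
    # speaker appears in exactly one speaker group.
    assert all(sum(m in g for g in mic_groups) == 3 for m in range(1, 17))
    assert all(sum(s in g for g in speaker_groups) == 1 for s in range(1, 17))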
  • In each signal processing unit 211, filtering by a SISO filter or the like is performed based on the microphone signals, and speaker signals are generated for some of the speakers 151 of the speaker array 113, that is, for all the speakers 151 belonging to the speaker group corresponding to the microphone group.
  • In the spatial frequency synthesis unit 243, twelve channels of speaker signals, the same number as the inputs of the spatial frequency conversion unit 241, that is, speaker signals corresponding to each of the twelve speakers 151, are obtained. However, among these speaker signals, only the speaker signals for four channels, that is, the speaker signals of some of the twelve speakers 151, are actually output to the speakers 151.
  • That is, a speaker signal is output from each of the four output terminals at the center of the twelve output terminals to the speaker 151 connected to that output terminal. On the other hand, since no speaker 151 is connected to the remaining eight output terminals, four at each of the left and right ends, no speaker signal is supplied from these output terminals to a speaker 151.
  • In this way, the speaker signal is output from only four of the twelve output terminals, and the remaining eight output terminals are not used. Therefore, for example, part of the output of the spatial frequency synthesis (IDFT) may be omitted.
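  • If the spatial frequency synthesis is taken to be a plain M-point inverse DFT (M = 12 here), omitting the unused outputs simply means evaluating the synthesis sum only at the connected output indices. The minimal sketch below assumes this, and assumes the four connected terminals are the four center indices.

    import numpy as np

    def partial_spatial_synthesis(spk_spectrum, wanted_outputs):
        # spk_spectrum   : complex array (M,), the spatial-frequency-domain speaker
        #                  signal for one frame.
        # wanted_outputs : indices of the output terminals that actually have a
        #                  speaker connected.
        M = len(spk_spectrum)
        k = np.arange(M)
        out = np.empty(len(wanted_outputs), dtype=complex)
        for i, n in enumerate(wanted_outputs):
            # Evaluate the IDFT sum for output index n only.
            out[i] = np.sum(spk_spectrum * np.exp(2j * np.pi * k * n / M)) / M
        return out

    # Only 4 of the 12 IDFT outputs are computed; the result matches the full IDFT
    # at those indices.
    spectrum = np.fft.fft(np.random.randn(12))
    center_four = [4, 5, 6, 7]                    # assumed connected terminals
    assert np.allclose(partial_spatial_synthesis(spectrum, center_four),
                       np.fft.ifft(spectrum)[center_four])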
  • Note that the calculation for spatial NC performed by the noise canceling device 281 is completely equivalent to the calculation for spatial NC performed by the noise canceling device 191.
  • Therefore, either the configuration of the noise canceling device 281 or the configuration of the noise canceling device 191 may be selected. Depending on the case, the configuration of the noise canceling device 281 may be adopted, or the configuration of the noise canceling device 191 may be adopted.
  • In the noise canceling device 281 described above as well, basically, the noise canceling process described with reference to FIG. 6 is performed.
  • That is, in step S11, the microphone signal obtained by each microphone 121 is supplied to the spatial frequency conversion unit 241 of the signal processing unit 211.
  • In step S12, the spatial frequency conversion unit 241 of each signal processing unit 211 performs spatial frequency conversion, and the resulting signal is supplied to the filter processing units 242-1 to 242-12 of each signal processing unit 211.
  • In step S13, filtering is performed by the filter processing units 242 of each signal processing unit 211, and the spatial frequency domain speaker signal obtained as a result is supplied to the spatial frequency synthesis unit 243 of each signal processing unit 211.
  • In step S14, spatial frequency synthesis is performed by the spatial frequency synthesis unit 243 of each signal processing unit 211, and in step S15, the resulting time domain speaker signal is supplied to the speakers 151, whereby spatial NC is realized.
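  • Put together, one signal processing unit 211 of this configuration can be sketched roughly as below. The sketch assumes the spatial frequency conversion and synthesis are a DFT/IDFT across the twelve channels, that the time-frequency analysis is a simple one-frame FFT (a real system would use proper block convolution such as overlap-save), and that the SISO filters are held as one complex frequency response per spatial bin; all function and variable names are illustrative, not the patent's implementation.

    import numpy as np

    def process_unit_281(mic_frames, filter_coeffs, local_speaker_rows):
        # mic_frames         : array (12, frame_len), the twelve microphone signals
        #                      routed to this unit (no zero padding).
        # filter_coeffs      : complex array (12, frame_len // 2 + 1), one SISO
        #                      frequency response per spatial frequency bin.
        # local_speaker_rows : the four synthesis-output rows whose terminals
        #                      actually have speakers connected.

        # Step S12: per-channel time-frequency analysis, then a spatial DFT across
        # the twelve channels into the spatial frequency domain.
        spec = np.fft.rfft(mic_frames, axis=1)      # (12, frame_len // 2 + 1)
        spatial = np.fft.fft(spec, axis=0)          # spatial frequency domain

        # Step S13: independent SISO filtering of each spatial frequency bin.
        filtered = spatial * filter_coeffs

        # Step S14: spatial frequency synthesis (inverse spatial DFT); only the
        # rows feeding connected speakers need to be kept (see the partial IDFT
        # sketch above), followed by the return to the time domain.
        synth = np.fft.ifft(filtered, axis=0)
        speaker_frames = np.fft.irfft(synth[local_speaker_rows],
                                      n=mic_frames.shape[1], axis=1)

        # Step S15: these four time-domain channels drive this unit's speakers.
        return speaker_frames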
  • the series of processes described above can be executed by hardware or software.
  • When the series of processes is executed by software, the programs constituting the software are installed on a computer.
  • the computer includes a computer embedded in dedicated hardware and, for example, a general-purpose personal computer capable of executing various functions by installing various programs.
  • FIG. 16 is a block diagram showing a configuration example of the hardware of a computer that executes the above-described series of processes by means of a program.
  • In the computer, a CPU (Central Processing Unit) 501, a ROM (Read Only Memory) 502, and a RAM (Random Access Memory) 503 are connected to each other by a bus 504.
  • An input/output interface 505 is further connected to the bus 504.
  • An input unit 506, an output unit 507, a recording unit 508, a communication unit 509, and a drive 510 are connected to the input/output interface 505.
  • the input unit 506 includes a keyboard, a mouse, a microphone, an image pickup element, and the like.
  • the output unit 507 includes a display, a speaker, and the like.
  • the recording unit 508 includes a hard disk, a non-volatile memory, and the like.
  • the communication unit 509 includes a network interface and the like.
  • the drive 510 drives a removable recording medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
  • The CPU 501 loads the program recorded in the recording unit 508 into the RAM 503 via the input/output interface 505 and the bus 504 and executes it, whereby the above-described series of processes is performed.
  • The program executed by the computer (CPU 501) can be recorded and provided on a removable recording medium 511 as a package medium or the like, for example.
  • the program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
  • The program can be installed in the recording unit 508 via the input/output interface 505 by mounting the removable recording medium 511 in the drive 510. Further, the program can be received by the communication unit 509 via a wired or wireless transmission medium and installed in the recording unit 508. In addition, the program can be installed in advance in the ROM 502 or the recording unit 508.
  • The program executed by the computer may be a program in which processing is performed in chronological order according to the order described in the present specification, or may be a program in which processing is performed in parallel or at a necessary timing, such as when a call is made.
  • the embodiment of the present technology is not limited to the above-described embodiment, and various changes can be made without departing from the gist of the present technology.
  • this technology can take a cloud computing configuration in which one function is shared by multiple devices via a network and processed jointly.
  • each step described in the above flowchart can be executed by one device or shared by a plurality of devices.
  • Further, when a plurality of processes are included in one step, the plurality of processes included in that one step can be executed by one device or shared by a plurality of devices.
  • this technology can also have the following configurations.
  • (1) A signal processing device including one or more signal processing units that perform signal processing in the spatial frequency domain, in which the signal processing unit performs the signal processing on a signal converted into the spatial frequency domain based on microphone signals obtained by sound collection by a plurality of microphones.
  • (2) The signal processing device according to (1), in which the signal processing unit generates a noise canceling signal by performing the signal processing.
  • (3) The signal processing device further including: a spatial frequency conversion unit that performs spatial frequency conversion on the plurality of time domain microphone signals; and a spatial frequency synthesis unit that performs spatial frequency synthesis on the spatial frequency domain signal obtained by the signal processing.
  • (4) The signal processing device according to any one of (1) to (3), in which the signal processing unit performs the signal processing by taking, as inputs, a plurality of signals based on the plurality of microphone signals obtained by the plurality of microphones, and outputs a plurality of signals.
  • (5) The signal processing device according to any one of (1) to (4), in which the signal processing unit has a plurality of filter processing units and performs filtering by the filter processing units as the signal processing.
  • (6) The signal processing device according to any one of (1) to (5), including a plurality of the signal processing units and an addition unit, in which the signal processing unit performs the signal processing on a signal based on the microphone signals obtained by all the microphones belonging to one group when the plurality of microphones are divided into a plurality of groups, and the addition unit adds the speaker signals for the speaker corresponding to the addition unit obtained by two or more of the plurality of signal processing units and outputs the final speaker signal obtained by the addition to the corresponding speaker.
  • (7) The signal processing device according to (6), in which the plurality of microphones are divided into a predetermined number of the groups so that microphones adjacent to each other belong to the same group, and, for each of the predetermined number of groups, the predetermined number of the signal processing units to which the microphone signals obtained by the microphones belonging to that group are input is defined.
  • (8) The signal processing device according to any one of (1) to (5), including a plurality of the signal processing units, in which, for each of the plurality of microphones, the microphone signal obtained by one microphone is input to two or more predetermined signal processing units among the plurality of signal processing units, and the signal processing unit performs the signal processing on a signal based on the microphone signals input from the plurality of microphones to generate speaker signals corresponding to each of a plurality of speakers and outputs the speaker signals to some of the plurality of speakers.
  • (11) A noise canceling device including a plurality of microphones, in which a signal processing unit performs noise canceling by generating a noise canceling signal through the signal processing on a signal converted into the spatial frequency domain based on the microphone signals obtained by sound collection by the plurality of microphones.

Landscapes

  • Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The present technology relates to a signal processing device and method, a noise suppression device, and a program configured to make it possible to reduce a delay time. The signal processing device includes at least one signal processing unit that performs signal processing in the spatial frequency domain. The signal processing units perform the signal processing on signals converted into the spatial frequency domain based on microphone signals obtained by collecting sounds with a plurality of microphones. This technology can be applied to noise suppression devices.
PCT/JP2021/027823 2020-08-11 2021-07-28 Dispositif et procédé de traitement de signal, dispositif de suppression de bruit et programme WO2022034795A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP2020135544 2020-08-11
JP2020-135544 2020-08-11
JP2020145742 2020-08-31
JP2020-145742 2020-08-31

Publications (1)

Publication Number Publication Date
WO2022034795A1 true WO2022034795A1 (fr) 2022-02-17

Family

ID=80247161

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/027823 WO2022034795A1 (fr) 2020-08-11 2021-07-28 Dispositif et procédé de traitement de signal, dispositif de suppression de bruit et programme

Country Status (1)

Country Link
WO (1) WO2022034795A1 (fr)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2017118189A (ja) * 2015-12-21 2017-06-29 日本電信電話株式会社 収音信号推定装置、収音信号推定方法、プログラム
WO2018163810A1 (fr) * 2017-03-07 2018-09-13 ソニー株式会社 Dispositif et procédé de traitement de signal, et programme
WO2019198557A1 (fr) * 2018-04-09 2019-10-17 ソニー株式会社 Dispositif de traitement de signal, procédé de traitement de signal et programme de traitement de signal

Similar Documents

Publication Publication Date Title
JP7028238B2 (ja) 信号処理装置および方法、並びにプログラム
US20060188111A1 (en) Microphone apparatus
JP2012169895A (ja) 多重極スピーカ群とその配置方法と、音響信号出力装置とその方法と、その方法を用いたアクティブノイズコントロール装置と音場再生装置と、それらの方法とプログラム
KR102514060B1 (ko) 빔 형성 어레이의 드라이버 장치들을 위한 음향 빔 형성 방법 및 음향 장치
JP4787727B2 (ja) 音声収音装置、その方法、そのプログラム、およびその記録媒体
JP5734329B2 (ja) 音場収音再生装置、方法及びプログラム
WO2022034795A1 (fr) Dispositif et procédé de traitement de signal, dispositif de suppression de bruit et programme
WO2020085117A1 (fr) Dispositif, procédé et programme de traitement de signal
WO2017201603A1 (fr) Synthèse de champ d'onde par synthèse d'une fonction de transfert spatial sur une zone d'écoute
Zhuang et al. Study on the cone programming reformulation of active noise control filter design in the frequency domain
JP5603307B2 (ja) 音場収音再生装置、方法及びプログラム
WO2020171081A1 (fr) Dispositif de traitement de signal, procédé de traitement de signal et programme
JP5826712B2 (ja) マルチチャネルエコー消去装置、マルチチャネルエコー消去方法、およびプログラム
US9288577B2 (en) Preserving phase shift in spatial filtering
CN110637466B (zh) 扬声器阵列与信号处理装置
JP5628219B2 (ja) 音場収音再生装置、方法及びプログラム
CN114915875B (zh) 一种可调波束形成方法、电子设备及存储介质
JP2016092562A (ja) 音声処理装置および方法、並びにプログラム
CN113766396B (zh) 扬声器控制
Rafaely Bessel nulls recovery in spherical microphone arrays for time-limited signals
JP2015027046A (ja) 音場収音再生装置、方法及びプログラム
JP5741866B2 (ja) 音場収音再生装置、方法及びプログラム
JP2014007543A (ja) 音場収音再生装置、方法及びプログラム
JP5749221B2 (ja) 音場収音再生装置、方法及びプログラム
Zotter et al. Higher-order ambisonic microphones and the wave equation (linear, lossless)

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21855874

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21855874

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: JP