EP2988302A1 - System and method for separation of sound sources in a three-dimensional space - Google Patents

System and method for separation of sound sources in a three-dimensional space

Publication number
EP2988302A1
Authority
EP
European Patent Office
Prior art keywords
microphone
microphones
sound
computer
calculating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP14461562.2A
Other languages
German (de)
French (fr)
Inventor
Jacek Paczkowski
Krzysztof Kramek
Tomasz Nalewa
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Patents Factory Ltd Sp zoo
Original Assignee
Patents Factory Ltd Sp zoo
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Patents Factory Ltd Sp zoo
Priority to EP14461562.2A
Publication of EP2988302A1
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G10L21/028 Voice signal separating using properties of sound source
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones
    • H04R3/005 Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H04R1/406 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2201/00 Details of transducers, loudspeakers or microphones covered by H04R1/00 but not provided for in any of its subgroups
    • H04R2201/40 Details of arrangements for obtaining desired directional characteristic by combining a number of identical transducers covered by H04R1/40 but not provided for in any of its subgroups
    • H04R2201/405 Non-uniform arrays of transducers or a plurality of uniform arrays with different transducer spacing

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

A method for sound source separation using a linear microphone array, the method comprising the steps of: calculating the distance from each microphone of the microphone array to a known location of a target that is to be sampled; for each microphone, calculating the delay with which sound reaches the given microphone from the given location in space; identifying the microphone having the lowest delay value and subtracting this value from the delay values of all other microphones; and calculating a sound sample for the given location in space by adding sound samples from all microphones while taking into account the respective delays.

Description

  • The present invention relates to a system and method for separation of sound sources in a three-dimensional space.
  • The prior art includes US 7047189 B2, entitled "Sound source separation using convolutional mixing and a priori sound source knowledge", which discloses sound source separation, without permutation, using convolutional mixing independent component analysis based on a priori knowledge of the target sound source. The target sound source can be a human speaker. The reconstruction filters used in the sound source separation take into account the a priori knowledge of the target sound source, such as an estimate of the spectra of the target sound source. The filters may be generally constructed based on a speech recognition system. Matching the words of the dictionary of the speech recognition system to a reconstructed signal indicates whether proper separation has occurred. More specifically, the filters may be constructed based on a vector quantization codebook of vectors representing typical sound source patterns. Matching the vectors of the codebook to a reconstructed signal indicates whether proper separation has occurred. The vectors may be linear prediction vectors, among others.
  • Publication US 20110075860 A1, entitled "Sound source separation and display method, and system thereof", discloses that a measurement system using a microphone array, which is a combination of a plurality of microphones, is widely used to identify and visualize the incoming directions of sound and the sound sources. The measurement system can be configured with only a single microphone array, or can also use several reference signal sensors such as a microphone and a vibration pickup. A microphone array by itself is used to equally evaluate sound sources lying in the intended direction of the microphone array. For example, a microphone array of planar shape is intended to analyze sound sources in the front direction. A spherical microphone array is intended to analyze sound sources in all directions around the sphere. If target sounds have high sound pressure levels and show sufficient S/N ratios with respect to other background noise, the locations of the sound sources or the incoming directions can be analyzed without a reference signal. Digital signal processing can be applied for mechanical determination.
  • The aim of the present invention is an improved, more accurate, and more resource- and cost-effective system and method for separation of sound sources in a three-dimensional space.
  • SUMMARY AND OBJECTS OF THE INVENTION
  • An object of the present invention is a method for sound source separation using a linear microphone array, the method comprising the steps of: calculating the distance from each microphone of the microphone array to a known location of a target that is to be sampled; for each microphone, calculating the delay with which sound reaches the given microphone from the given location in space; identifying the microphone having the lowest delay value and subtracting this value from the delay values of all microphones; and calculating a sound sample for the given location in space by adding sound samples from all microphones while taking into account the respective delays.
  • Preferably, the step of calculating distance is executed according to the formula:

    $$dl = \sqrt{(x_i - x_t)^2 + (y_i - y_t)^2 + (z_i - z_t)^2}$$

    wherein the index $i$ denotes a microphone and the index $t$ denotes the target; $x_i$, $y_i$ and $z_i$ are the x, y and z coordinates of the i-th microphone, and $x_t$, $y_t$ and $z_t$ are the x, y and z coordinates of the target.
  • Preferably, the step of calculating the delay is executed according to the formula:

    $$dt = \frac{dl}{Vs}$$

    wherein $Vs$ is the speed of sound, and

    $$dt2 = dt \cdot Fs$$

    wherein $Fs$ is a sampling frequency.
  • Preferably, the step of calculating a sound sample for a given location in space is executed according to the formula:

    $$M_t = \sum_{i=1}^{17} M_{i,\, t + dt2_i}$$

    wherein $M_t$ is a sound sample, $i$ is the microphone index and $t$ is a sample index for a reference microphone.
  • Preferably, the linear microphone array comprises a plurality of microphones and the microphones are located in at least two groups of at least two microphones, wherein each group has a different spacing of the respective microphones.
  • Preferably, there are five groups of microphones, each comprising at least two microphones, wherein the spacing of the microphones in each subsequent group is twice that of the preceding group.
  • Preferably, there are five groups of microphones and the first group comprises seventeen microphones, while the remaining four groups comprise eight microphones each.
  • Preferably, the linear microphone is an arrangement that comprises three linear microphone arrays according to the present invention, wherein first ends of all three microphone arrays, comprising the same arrangement of microphones, are in proximity or adjacent to each other; and the separate microphone arrays are positioned in different planes in three-dimensional space.
  • Preferably, the other ends of the microphone arrays extend linearly along the X, Y and Z axes respectively.
  • Another object of the present invention is a computer program comprising program code means for performing all the steps of the computer-implemented method according to the present invention when said program is run on a computer.
  • Another object of the present invention is a computer readable medium storing computer-executable instructions performing all the steps of the computer-implemented method according to the present invention when executed on a computer.
  • These and other objects of the invention presented herein are accomplished by providing a system and method for separation of sound from a selected place in a three-dimensional space. Further details and features of the present invention, its nature and various advantages will become more apparent from the following detailed description of the preferred embodiments shown in a drawing, in which:
    • Fig. 1 shows a microphone array;
    • Figs. 2A-B depict a microphone array system;
    • Fig. 3 presents a diagram of the method according to the present invention;
    • Fig. 4 presents a diagram of the system according to the present invention;
    • Fig. 5 shows an installation of the system in a room;
    • Fig. 6A presents exemplary acoustic signals;
    • Fig. 6B presents signals received by the microphones; and
    • Fig. 6C depicts the original signals and signals separated using the method according to the present invention.
  • NOTATION AND NOMENCLATURE
  • Some portions of the detailed description which follows are presented in terms of data processing procedures, steps or other symbolic representations of operations on data bits that can be performed on computer memory. Therefore, a computer executes such logical steps thus requiring physical manipulations of physical quantities.
  • Usually these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated in a computer system. For reasons of common usage, these signals are referred to as bits, packets, messages, values, elements, symbols, characters, terms, numbers, or the like.
  • Additionally, all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. Terms such as "processing" or "creating" or "transferring" or "executing" or "determining" or "detecting" or "obtaining" or "selecting" or "calculating" or "generating" or the like, refer to the action and processes of a computer system that manipulates and transforms data represented as physical (electronic) quantities within the computer's registers and memories into other data similarly represented as physical quantities within the memories or registers or other such information storage.
  • A computer-readable (storage) medium, such as referred to herein, typically may be non-transitory and/or comprise a non-transitory device. In this context, a non-transitory storage medium may include a device that may be tangible, meaning that the device has a concrete physical form, although the device may change its physical state. Thus, for example, non-transitory refers to a device remaining tangible despite a change in state.
  • DESCRIPTION OF EMBODIMENTS
  • A microphone array according to the present invention comprises, as shown in Fig. 1, a supporting body 101 and linearly arranged, spatially separated microphones 102A-L, wherein the microphones are located in, for example, two groups 103A-C of, for example, two microphones each, and each group can have a different spacing of the respective microphones.
  • It is to be noted however that for facilitating sound separation it is sufficient that the microphones of a microphone array are linearly placed and spaced from each other. In the broadest possible embodiment, the microphones may be spaced by equal distances or may be spaced by irregular, different distances.
  • The microphones 102 are preferably located on a straight line such that the first group of microphones comprises microphones spaced by, for example, 6.25 mm, the second group by, for example, 12.5 mm, the third group by, for example, 25 mm, the fourth group by, for example, 50 mm and the fifth group by, for example, 100 mm. Therefore, there are five groups, each comprising at least two microphones, wherein the spacing of the microphones in each subsequent group is, for example, twice that of the preceding group.
  • Preferably, the first group comprises 17 microphones, while the remaining four groups comprise eight microphones each. This number is a preferred arrangement as shown by experiments and evaluation of response curve at different numbers of microphones in arrays.
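
The example spacings and microphone counts above can be turned into microphone offsets along the supporting body. The following is a minimal sketch under a stated assumption that the text does not confirm: each group continues outward from the last microphone of the preceding group, so no microphone is shared between groups. The function name `array_positions_mm` and its parameters are hypothetical.

```python
def array_positions_mm(first_count=17, other_count=8,
                       base_spacing_mm=6.25, groups=5):
    """Return microphone offsets (in mm) along the line, nearest end first."""
    # First group: first_count microphones at the base spacing.
    positions = [i * base_spacing_mm for i in range(first_count)]
    spacing = base_spacing_mm
    for _ in range(groups - 1):
        spacing *= 2  # each subsequent group doubles the spacing
        last = positions[-1]
        # Assumption: the group continues from the previous group's last microphone.
        positions.extend(last + k * spacing for k in range(1, other_count + 1))
    return positions


positions = array_positions_mm()
print(len(positions), positions[16], positions[-1])
```

Under this assumption the array holds 49 microphones: the first group ends 100 mm from the start and the whole array spans 1600 mm. A different way of joining the groups would give different totals.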
  • A single linear microphone array according to the present invention allows good separation quality to be obtained. In order to obtain good quality independent of the placement of the sound source relative to the microphone array, it is necessary to apply at least three microphone arrays.
  • The microphone arrays must be spaced by, for example, 90 degrees, wherein the first ends of all microphone arrays (each comprising the same arrangement of microphones) are in proximity or adjacent to a virtual center of a circle, as shown in Fig. 2A. Fig. 2A shows a view in a single plane, but the separate microphone arrays must be positioned in different planes in 3D space. Preferably, the other ends of the microphone arrays extend linearly along the X, Y and Z axes respectively (for example forming three edges of a cube, as shown in Fig. 2B). Such a microphone system may be located in a corner of a room near the ceiling.
  • Such a microphone system is able to obtain good separation quality for each sound source, independent of the placement of the sound sources relative to the three linear arrays.
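
The three-array arrangement can be expressed as a list of 3D microphone coordinates. The sketch below is illustrative only: it assumes the common corner is the coordinate origin and that each array starts a small gap away from it (the text says only that the first ends are "in proximity or adjacent"). The name `three_axis_arrangement` and the parameter `start_gap_m` are hypothetical.

```python
def three_axis_arrangement(offsets_m, start_gap_m=0.01):
    """Return (x, y, z) coordinates for three identical arrays on the X, Y and Z axes.

    offsets_m: microphone offsets (in metres) along one array, nearest end first.
    start_gap_m: assumed distance from the common corner to the first microphone.
    """
    mics = []
    for axis in range(3):  # 0 -> X axis, 1 -> Y axis, 2 -> Z axis
        for offset in offsets_m:
            point = [0.0, 0.0, 0.0]
            point[axis] = start_gap_m + offset
            mics.append(tuple(point))
    return mics


# Three microphones per array, spaced 6.25 mm apart, purely for illustration.
mics = three_axis_arrangement([0.0, 0.00625, 0.0125])
for mic in mics:
    print(mic)
```

The resulting coordinate list has the same form as the per-microphone coordinates used in the distance formula of the method.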
  • It is to be noted, however, that for facilitating sound separation it is sufficient that a single microphone array is used in the broadest possible embodiment. However, the configuration depicted in the Fig. 3 embodiment will provide improved sound separation quality. Such an embodiment is preferred but not mandatory.
  • Fig. 3 presents a diagram of the method according to the present invention. The method starts at step 301 by calculating the distance from each microphone of the microphone array to a known location of a target that is to be sampled:

    $$dl = \sqrt{(x_i - x_t)^2 + (y_i - y_t)^2 + (z_i - z_t)^2}$$

    wherein the index $i$ denotes a microphone and the index $t$ denotes the target; $x_i$, $y_i$ and $z_i$ are the x, y and z coordinates of microphone $i$, and $x_t$, $y_t$ and $z_t$ are the x, y and z coordinates of the target.
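
Step 301 is a plain Euclidean distance. A minimal sketch, assuming coordinates in metres and using a hypothetical helper name `distance_to_target`:

```python
import math


def distance_to_target(mic, target):
    """Distance dl between a microphone (xi, yi, zi) and the target (xt, yt, zt)."""
    xi, yi, zi = mic
    xt, yt, zt = target
    return math.sqrt((xi - xt) ** 2 + (yi - yt) ** 2 + (zi - zt) ** 2)


# Two illustrative microphones on the X axis and a target 5 m from the first one.
mics = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
target = (0.0, 3.0, 4.0)
distances = [distance_to_target(mic, target) for mic in mics]
print(distances)
```

The standard library function `math.dist` (Python 3.8 and later) computes the same quantity; the explicit form is used here only to mirror the formula term by term.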
  • Subsequently, at step 302, the delay with which sound reaches each of the microphones from the given location in space is calculated from the distance dl:

    $$dt = \frac{dl}{Vs}$$

    wherein $Vs$ is the speed of sound, and

    $$dt2 = dt \cdot Fs$$

    wherein $Fs$ is a sampling frequency.
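
Step 302 converts a distance into a delay in seconds and then into a delay in samples. The sketch below assumes a speed of sound of 343 m/s and a sampling frequency of 48 kHz; neither value is given in the text. Rounding to a whole number of samples is also an assumption, made so that the result can be used as an array index. The function names are hypothetical.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for Vs (air at roughly 20 degrees C)
SAMPLING_RATE_HZ = 48_000   # assumed value for Fs


def delay_seconds(dl_m, vs=SPEED_OF_SOUND_M_S):
    """dt = dl / Vs: propagation time in seconds over a distance dl_m in metres."""
    return dl_m / vs


def delay_samples(dl_m, vs=SPEED_OF_SOUND_M_S, fs=SAMPLING_RATE_HZ):
    """dt2 = dt * Fs, rounded to the nearest whole sample (rounding is an assumption)."""
    return round(delay_seconds(dl_m, vs) * fs)


print(delay_seconds(343.0))  # one second for 343 m at the assumed speed
print(delay_samples(3.43))   # delay in samples for a 3.43 m path
```

Rounding limits the alignment accuracy to half a sample period; fractional-delay interpolation would be a possible refinement, but the text does not describe one.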
  • Next, at step 303, the microphone having the lowest value of dt2 is identified and this value is subtracted from all values of dt2. The resulting dt2 values thus represent the differences in the time of arrival of a signal from the given location in space at the individual microphones.
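
Step 303 amounts to subtracting the minimum from a list of delays. A minimal sketch, with a hypothetical function name and made-up delay values:

```python
def relative_delays(dt2):
    """Subtract the smallest delay so the closest microphone has delay 0."""
    earliest = min(dt2)
    return [d - earliest for d in dt2]


# Illustrative absolute delays (in samples) for three microphones.
offsets = relative_delays([700, 712, 705])
print(offsets)  # [0, 12, 5]
```

After this step the closest microphone serves as the time reference and every other value is a non-negative offset.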
  • Subsequently, at step 304, a sound sample for the given location in space is calculated by adding sound samples from all microphones while taking into account the respective delays. The addition sums the sample values from the respective microphones: the sample from the microphone closest to the target is taken without delay, whereas samples from the remaining microphones are taken with their dt2 delay (corrected by the delay of the microphone closest to the target).
  • For the microphone closest to the given location in space the delay equals 0, and for the remaining microphones it is derived from the difference in distance from the given location in space with respect to the closest microphone:

    $$M_t = \sum_{i=1}^{17} M_{i,\, t + dt2_i}$$

    wherein $M_t$ is a sound sample, $i$ is the microphone index and $t$ is a sample index for a reference microphone. The $dt2_i$ values may be different for different microphones.
  • For the i-th microphone, the sample that enters the sum is the one delayed by $dt2_i$ samples with respect to the microphone closest to the target.
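
The summation of step 304 can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: signals are plain lists of samples, the offsets are whole numbers of samples that have already been normalized so the closest microphone has offset 0, and output samples are produced only where every microphone has a sample available. The name `delay_and_sum` is hypothetical.

```python
def delay_and_sum(signals, offsets):
    """Compute M_t as the sum over i of M_{i, t + dt2_i}.

    signals[i][n] is sample n of microphone i.
    offsets[i] is the relative delay dt2_i of microphone i, in samples.
    """
    if len(signals) != len(offsets):
        raise ValueError("one offset per microphone is required")
    # Only indices t for which t + offsets[i] exists for every microphone.
    length = min(len(s) - o for s, o in zip(signals, offsets))
    return [
        sum(s[t + o] for s, o in zip(signals, offsets))
        for t in range(max(length, 0))
    ]


# An impulse reaches the near microphone at sample 2 and the far one at sample 5.
near = [0, 0, 1, 0, 0, 0, 0, 0]
far = [0, 0, 0, 0, 0, 1, 0, 0]
output = delay_and_sum([near, far], [0, 3])
print(output)  # [0, 0, 2, 0, 0]
```

With the correct offsets the two impulses line up and add to 2; with offsets of `[0, 0]` they would stay apart as two separate impulses of height 1.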
  • As a result, a set of sound samples is obtained. This is equivalent to a directional microphone directed at the given location in space. The sound from this point will be amplified, while the sounds from other locations will be attenuated.
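
Why the aligned sum favours the steered location can be sketched for a single tone. The derivation below is an illustration and is not part of the original text. It assumes $N$ microphones, the same amplitude $A$ at every microphone (distance attenuation is neglected), a tone of frequency $f$, and a residual misalignment $\varepsilon_i$, in samples, that remains for microphone $i$ after the dt2 correction; $N$, $A$, $f$ and $\varepsilon_i$ are symbols introduced here.

```latex
\[
M_t = \sum_{i=1}^{N} A \sin\!\left( 2\pi f \, \frac{t + \varepsilon_i}{Fs} \right),
\qquad
\max_t |M_t| = A \left| \sum_{i=1}^{N} e^{\, j 2\pi f \varepsilon_i / Fs} \right| \le N A .
\]
```

For the steered location all $\varepsilon_i$ are zero, so the terms add in phase and the peak reaches $NA$. For a source elsewhere the $\varepsilon_i$ generally differ, the phasors partly cancel, and the peak stays below $NA$.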
  • Fig. 6A presents exemplary acoustic signals: 601, a 1 kHz signal positioned at an angle of -45° with respect to the center of the linear microphone array, and 602, a 2.2 kHz signal positioned at an angle of +45° with respect to the center of the linear microphone array. The X axis denotes the sample number and the Y axis denotes the signal amplitude.
  • Fig. 6B presents signals received by the microphones. In this arrangement 16 microphones are used: 603 is the signal from the first microphone, 604 is the signal from the 8th microphone and 605 is the signal from the 16th microphone.
  • Fig. 6C depicts the original signals 606-607 and signals separated 608-609 using the method according to the present invention.
  • The method according to the present invention allows a sound from a given location in space to be separated from other sounds. Three microphone arrays, installed for example as shown in Fig. 5, increase the separation quality because some of the microphones are always in a good position with respect to the monitored location. Of course, the number of microphones and the number of groups of microphones may differ from those in the presented example.
  • Fig. 4 presents a diagram of the system according to the present invention. The system comprises the microphone array arrangement 402 shown in Fig. 2A-B and an appropriate sampling module 403 managed by a controller 405.
  • The system may be realized using dedicated components or custom made FPGA or ASIC circuits. The system comprises a data bus 401 communicatively coupled to a memory 404. Additionally, other components of the system are communicatively coupled to the system bus 401 so that they may be managed by the controller 405.
  • The memory 404 may store computer program or programs executed by the controller 405 in order to execute steps of the method according to the present invention.
  • Therefore, the controller 405 is configured to execute the steps of the method described with reference to Fig. 3.
  • The present invention results in a useful sound separation that may for example be used in surveillance systems. Such results are concrete and tangible thus not abstract. Therefore, the invention provides a useful, concrete and tangible result.
  • According to the present invention, data acquired by different microphones are processed within a dedicated machine. Hence, the machine or transformation test is fulfilled and the invention is not abstract.
  • It can be easily recognized, by one skilled in the art, that the aforementioned method for separation of sound sources in a three-dimensional space may be performed and/or controlled by one or more computer programs. Such computer programs are typically executed by utilizing the computing resources in a computing device. Applications are stored on a non-transitory medium. An example of a non-transitory medium is a non-volatile memory, for example a flash memory, or a volatile memory, for example RAM. The computer instructions are executed by a processor. These memories are exemplary recording media for storing computer programs comprising computer-executable instructions performing all the steps of the computer-implemented method according to the technical concept presented herein.
  • While the invention presented herein has been depicted, described, and has been defined with reference to particular preferred embodiments, such references and examples of implementation in the foregoing specification do not imply any limitation on the invention. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader scope of the technical concept. The presented preferred embodiments are exemplary only, and are not exhaustive of the scope of the technical concept presented herein.
  • Accordingly, the scope of protection is not limited to the preferred embodiments described in the specification, but is only limited by the claims that follow.
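As an illustration only, the summation of delayed samples given by the formula for Mt in the description can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation of the invention: the names delay_and_sum and delayed_copy, the list-based signal layout and the three-microphone test pulse are hypothetical, and the delays dt2i are assumed to be non-negative whole numbers of samples.

```python
def delay_and_sum(signals, delays):
    """Sum samples from all microphones, reading microphone i at index t + dt2_i.

    signals: one list of samples per microphone (hypothetical layout).
    delays:  non-negative integer sample delays dt2_i, 0 for the closest microphone.
    """
    if len(signals) != len(delays):
        raise ValueError("one delay per microphone is required")
    # Keep only output indices t for which t + dt2_i exists in every signal.
    n_out = max(0, min(len(s) - d for s, d in zip(signals, delays)))
    return [sum(s[t + d] for s, d in zip(signals, delays)) for t in range(n_out)]


def delayed_copy(pulse, delay, length):
    """Hypothetical helper: the pulse as seen by a microphone `delay` samples later."""
    return [0.0] * delay + pulse + [0.0] * (length - delay - len(pulse))


pulse = [0.0, 1.0, 0.0, 0.0]
delays = [0, 2, 5]  # assumed dt2_i values for three microphones
signals = [delayed_copy(pulse, d, 12) for d in delays]

steered = delay_and_sum(signals, delays)       # samples aligned on the target location
unsteered = delay_and_sum(signals, [0, 0, 0])  # samples summed without alignment

print(max(steered), max(unsteered))
```

With the delays matched to the target, the three copies of the pulse add up in the same output sample, whereas without alignment they stay spread over different samples. This is the amplification of the monitored location relative to other locations that the description refers to.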

Claims (12)

  1. A method for sound source separation using a linear microphone array, the method being characterized in that it comprises the steps of:
    • calculating distance (301) of each microphone of the microphone array to a known location of a target that is to be sampled;
    • for each microphone calculating a delay (302) with which sound reaches the given microphone from a given location in space;
    • identifying (303) a microphone having the lowest value of the delay and subtracting this value from all values of delays of all microphones;
    • calculating (304) a sound sample for a given location in space by adding sound samples from all microphones while taking into account the respective delays.
  2. The method according to claim 1 characterized in that the step of calculating distance (301) is executed according to the formula of:
    $$\mathit{dl} = \sqrt{(x_i - x_t)^2 + (y_i - y_t)^2 + (z_i - z_t)^2}$$

    wherein the i index denotes a microphone and t index denotes the target. xi, yi and zi mean x, y and z coordinates of the i-th microphone and xt, yt and zt mean x, y and z coordinates of the target.
  3. The method according to claim 1 characterized in that the step of calculating the delay (302) is executed according to the formula of:
    $$\mathit{dt} = \frac{\mathit{dl}}{\mathit{Vs}}$$

    wherein Vs is the speed of sound.
    $$\mathit{dt2} = \mathit{dt} \cdot \mathit{Fs}$$

    wherein Fs is a sampling frequency.
  4. The method according to claim 1 characterized in that the step of calculating (304) a sound sample for a given location in space is executed according to the formula of:
    $$M_t = \sum_{i=1}^{17} M_{i,\,t+\mathit{dt2}_i}$$

    wherein Mt is a sound sample, i is the microphone index and t is a sample index for a reference microphone.
  5. The method according to claim 1 characterized in that the linear microphone comprises a plurality of microphones and the microphones are located in at least two groups (103A-C) of at least two microphones whereas each group has a different spacing of the respective microphones.
  6. The method according to claim 5 characterized in that there are five groups of microphones each comprising at least two microphones wherein spacing of respective microphones in groups is such that in a subsequent group the spacing is twice of that of the preceding group.
  7. The method according to claim 5 characterized in that there are five groups of microphones and that the first group comprises seventeen microphones, while the remaining four groups comprise eight microphones each.
  8. The method according to claim 5 characterized in that the linear microphone is an arrangement that comprises three linear microphone arrays according to claim 5, wherein first ends of all three microphone arrays, comprising the same arrangement of microphones, are in proximity or adjacent to each other; and the separate microphone arrays are positioned in different planes in three-dimensional space.
  9. The method according to claim 8 characterized in that the other ends of the microphone arrays linearly extend on X, Y and Z axis respectively.
  10. A computer program comprising program code means for performing all the steps of the computer-implemented method according to claim 1 when said program is run on a computer.
  11. A computer readable medium storing computer-executable instructions performing all the steps of the computer-implemented method according to claim 1 when executed on a computer.
  12. A system for sound source localization comprising
    • a microphone array;
    • a data bus (401) communicatively coupling components of the system;
    • a memory (404) for storing data;
    • a controller (405);
    • a sampling module (403);
    the system being characterized in that it comprises:
    • the microphone array system (402) is a linear microphone array;
    • whereas the controller (405) is configured to control the sampling module (403) and to execute all steps of the method according to claim 1.
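
For illustration, steps 301 to 303 of claim 1, with the formulas of claims 2 and 3, might be sketched in Python as follows. This is a sketch under stated assumptions rather than the claimed implementation: the function name sample_delays, the coordinates in meters, the value of 343 m/s for Vs and the rounding of delays to whole samples are all assumptions that the source does not specify.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for Vs, which the text does not fix


def sample_delays(mics, target, fs, vs=SPEED_OF_SOUND):
    """Return per-microphone delays in samples, relative to the closest microphone."""
    # Step 301: distance dl from each microphone to the target.
    dl = [math.dist(m, target) for m in mics]
    # Step 302: dt = dl / Vs in seconds, then dt2 = dt * Fs in samples.
    dt2 = [d / vs * fs for d in dl]
    # Step 303: subtract the lowest delay so the closest microphone gets 0.
    lowest = min(dt2)
    # Rounding to whole samples is an assumption of this sketch.
    return [round(v - lowest) for v in dt2]


# Hypothetical geometry: three microphones on the X axis, target on the same axis.
mics = [(0.0, 0.0, 0.0), (0.343, 0.0, 0.0), (0.686, 0.0, 0.0)]
target = (-1.0, 0.0, 0.0)
delays = sample_delays(mics, target, fs=1000.0)
print(delays)
```

The returned list can serve as the dt2i values used in step 304, where the sample of the i-th microphone is read dt2i samples later than the sample of the closest microphone.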
EP14461562.2A 2014-08-21 2014-08-21 System and method for separation of sound sources in a three-dimensional space Withdrawn EP2988302A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP14461562.2A EP2988302A1 (en) 2014-08-21 2014-08-21 System and method for separation of sound sources in a three-dimensional space

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP14461562.2A EP2988302A1 (en) 2014-08-21 2014-08-21 System and method for separation of sound sources in a three-dimensional space

Publications (1)

Publication Number Publication Date
EP2988302A1 true EP2988302A1 (en) 2016-02-24

Family

ID=51359348

Family Applications (1)

Application Number Title Priority Date Filing Date
EP14461562.2A Withdrawn EP2988302A1 (en) 2014-08-21 2014-08-21 System and method for separation of sound sources in a three-dimensional space

Country Status (1)

Country Link
EP (1) EP2988302A1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7047189B2 (en) 2000-04-26 2006-05-16 Microsoft Corporation Sound source separation using convolutional mixing and a priori sound source knowledge
US20110075860A1 (en) 2008-05-30 2011-03-31 Hiroshi Nakagawa Sound source separation and display method, and system thereof
US20120155703A1 (en) * 2010-12-16 2012-06-21 Sony Computer Entertainment, Inc. Microphone array steering with image-based source location

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
IVAN HIMAWAN ET AL: "Dealing with uncertainty in microphone placement in a microphone array speech recognition system", ACOUSTICS, SPEECH AND SIGNAL PROCESSING, 2008. ICASSP 2008. IEEE INTERNATIONAL CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, 31 March 2008 (2008-03-31), pages 1565 - 1568, XP031250864, ISBN: 978-1-4244-1483-3 *
PATTARAPONG ROJANASTHIEN: "Microphone Array and Beamforming", ECE5525 SPEECH PROCESSING FINAL PRESENTATION, 8 December 2008 (2008-12-08), XP055163602, Retrieved from the Internet <URL:http://my.fit.edu/~vkepuska/ece5525/Projects/Fall2008/Pattarapong Rojanasthien/Rojanasthien.doc> [retrieved on 20150120] *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110544486A (en) * 2019-09-02 2019-12-06 上海其高电子科技有限公司 Speech enhancement method and system based on microphone array
CN110544486B (en) * 2019-09-02 2021-11-02 上海其高电子科技有限公司 Speech enhancement method and system based on microphone array

Similar Documents

Publication Publication Date Title
EP2988527A1 (en) System and method for detecting location of sound sources in a three-dimensional space
US10602265B2 (en) Coprime microphone array system
JP6635903B2 (en) Sound source position estimating apparatus, sound source position estimating method, and program
JP2020500492A5 (en) Methods, programs and systems for spatial ambient-aware personal audio supply devices
EP4375952A3 (en) Systems and methods for reducing data density in large datasets
EP3078210B1 (en) Estimating a room impulse response for acoustic echo cancelling
US9813832B2 (en) Mating assurance system and method
MX2018005090A (en) Apparatus, method or computer program for generating a sound field description.
KR20140135349A (en) Apparatus and method for asynchronous speech recognition using multiple microphones
JP2006194700A (en) Sound source direction estimation system, sound source direction estimation method and sound source direction estimation program
JP2010212818A (en) Method of processing multi-channel signals received by a plurality of microphones
EP2988302A1 (en) System and method for separation of sound sources in a three-dimensional space
KR101685084B1 (en) A method to estimate the shape of towed array sonar and an apparatus thereof
CN112750455A (en) Audio processing method and device
CN107843871B (en) Sound source orientation method and device and electronic equipment
EP3182734B1 (en) Method for using a mobile device equipped with at least two microphones for determining the direction of loudspeakers in a setup of a surround sound system
KR20090128221A (en) Method for sound source localization and system thereof
KR101442172B1 (en) Real-time SRP-PHAT sound source localization system and control method using a search space clustering method
WO2012164448A1 (en) Method for self - calibrating a set of acoustic sensors, and corresponding system
US9612310B2 (en) Method and apparatus for determining the direction of arrival of a sonic boom
JP2008089312A (en) Signal arrival direction estimation apparatus and method, signal separation apparatus and method, and computer program
KR20200066891A (en) Apparatus and method for three-dimensional sound source position detection using a two-dimensional microphone array
JP2018006826A5 (en) Audio processing apparatus and audio processing method
Ghamdan et al. Position estimation of binaural sound source in reverberant environments
JP6433630B2 (en) Noise removing device, echo canceling device, abnormal sound detecting device, and noise removing method

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20150504

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20170301