EP4064734A1 - Audio processing - Google Patents

Audio processing

Info

Publication number
EP4064734A1
EP4064734A1 (application EP21165215.1A)
Authority
EP
European Patent Office
Prior art keywords
microphone
user
obstructed
audio input
audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP21165215.1A
Other languages
German (de)
English (en)
Inventor
Arto Juhani Lehtiniemi
Miikka Tapani Vilermo
Mikko Olavi Heikkinen
Antti Johannes Eronen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Technologies Oy
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy
Priority to EP21165215.1A
Priority to CN202210306872.5A
Publication of EP4064734A1
Legal status: Pending

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R29/00Monitoring arrangements; Testing arrangements
    • H04R29/004Monitoring arrangements; Testing arrangements for microphones
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208Noise filtering
    • G10L21/0216Noise filtering characterised by the method used for estimating noise
    • G10L21/0232Processing in the frequency domain
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00Details of transducers, loudspeakers or microphones
    • H04R1/20Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/22Arrangements for obtaining desired frequency or directional characteristics for obtaining desired frequency characteristic only 
    • H04R1/24Structural combinations of separate transducers or of two parts of the same transducer and responsive respectively to two or more frequency ranges
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00Details of transducers, loudspeakers or microphones
    • H04R1/20Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/22Arrangements for obtaining desired frequency or directional characteristics for obtaining desired frequency characteristic only 
    • H04R1/28Transducer mountings or enclosures modified by provision of mechanical or acoustic impedances, e.g. resonator, damping means
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00Details of transducers, loudspeakers or microphones
    • H04R1/20Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H04R1/406Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2201/00Details of transducers, loudspeakers or microphones covered by H04R1/00 but not provided for in any of its subgroups
    • H04R2201/40Details of arrangements for obtaining desired directional characteristic by combining a number of identical transducers covered by H04R1/40 but not provided for in any of its subgroups
    • H04R2201/4012D or 3D arrays of transducers
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00Signal processing covered by H04R, not provided for in its groups
    • H04R2430/03Synergistic effects of band splitting and sub-band processing
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2430/00Signal processing covered by H04R, not provided for in its groups
    • H04R2430/20Processing of the output signals of the acoustic transducers of an array for obtaining a desired directivity characteristic
    • H04R2430/25Array processing for suppression of unwanted side-lobes in directivity characteristics, e.g. a blocking matrix
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R2499/00Aspects covered by H04R or H04S not otherwise provided for in their subgroups
    • H04R2499/10General applications
    • H04R2499/11Transducers incorporated or for use in hand-held devices, e.g. mobile phones, PDA's, camera's
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R29/00Monitoring arrangements; Testing arrangements
    • H04R29/004Monitoring arrangements; Testing arrangements for microphones
    • H04R29/005Microphone arrays
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00Circuits for transducers, loudspeakers or microphones
    • H04R3/005Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones

Definitions

  • Embodiments of the present disclosure relate to audio processing.
  • Spatial audio enables the capturing of audio from an audio source while retaining information about the relative position of the audio source from an origin.
  • the audio can then be rendered to a listener at the same (or a different) relative position from the listener. It is also possible to separately attenuate or amplify specific audio sources or even remove them entirely from the rendered audio. This provides a 'focus' on a specific audio source or specific audio sources.
  • One way of capturing audio while retaining information about the position of the audio source is to use an array of microphones.
  • the microphones have known fixed positional differences and therefore audio from a particular audio source can reach each microphone with a different time delay. This phase information can accurately position the audio source if the array is carefully designed.
  • Two omnidirectional microphones can position a stationary audio source as lying on an intersect (e.g. circle) centered on the axis between the microphones.
  • a third omnidirectional microphone positions the stationary audio source at either of two points on the intersect (circle) on either side of the plane shared by the three microphones.
  • a fourth microphone can be used to position the stationary audio source at a single point.
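  • To make this time-delay geometry concrete, here is a minimal sketch (illustrative only; the helper names and values are assumptions, not part of the patent text): it estimates the inter-microphone delay by cross-correlation and converts it into the cone angle on which the source must lie.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air at roughly room temperature


def estimate_delay(sig_a, sig_b, fs):
    """Estimate the inter-microphone time delay (seconds) as the lag
    that maximizes the cross-correlation of two captures."""
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag = np.argmax(corr) - (len(sig_b) - 1)
    return lag / fs


def cone_angle_deg(delay_s, mic_spacing_m):
    """A single delay confines a source to a cone around the axis
    between two microphones (a circle at any fixed distance), since
    delay = spacing * cos(angle) / speed_of_sound."""
    cos_angle = np.clip(delay_s * SPEED_OF_SOUND / mic_spacing_m, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_angle)))
```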
  • an apparatus comprising means for:
  • the frequency dependent filter is configured as a spatially dependent filter that differentially amplifies or attenuates audio sources at different spatial positions.
  • the apparatus is configured to detect as the user input, a spatially specific hand gesture that provides a spatially dependent, at least partial, user-controlled obstruction of audio reaching the at least one obstructed microphone.
  • the apparatus comprises a camera and means for detecting user input indicating a presence of an, at least partial, user-controlled obstruction, comprising means for processing an output from the camera to recognize a presence of a hand and movement of the hand as a user hand gesture.
  • the frequency dependent filter is based on a difference in spectrum of the audio input between the at least one obstructed microphone and the at least one unobstructed microphone caused by acoustic shadowing by the, at least partial, user-controlled obstruction between the at least one obstructed microphone and a first region.
  • the apparatus comprises means for:
  • the frequency dependent filter selectively provides a gain to spectral components corresponding to spectral components of the obstructed audio signal.
  • the frequency dependent filter selectively provides a gain to lower frequency harmonics of spectral components that have a harmonic structure and that are attenuated by the, at least partial, user-controlled obstruction.
  • the gain is controlled by the user.
  • the frequency dependent filter is a time-variable filter configured to fade over time.
  • the apparatus is configured to prompt the user to repeat creation of the frequency dependent filter, comprising:
  • the frequency dependent filter is based on a difference in spectrum of the audio input from the obstructed microphone and the unobstructed microphone caused by acoustic shadowing by a hand of the user.
  • the frequency dependent filter provides frequency dependent gain, and the gain is controlled by the user to be attenuation or amplification.
  • an apparatus comprising means for:
  • a group of objects or features can be identified using a reference numeral without a sub-script. Particular members of the group (if more than one) can be (but are not necessarily) identified using a reference numeral with a sub-script.
  • the reference 22' will be used to reference user-obstructed audio input to differentiate it from other audio input 22.
  • the reference 22 will be used to reference audio input in the absence of a user-obstruction 30.
  • the audio input in the absence of a user-obstruction 30 can be for the purpose of comparison to create a filter 50 (the audio input 22 is described at this stage as 'unobstructed' audio input to distinguish it from the obstructed audio input 22').
  • the audio input in the absence of a deliberate user-obstruction 30 can be for the purpose of filtration by the created filter 50.
  • an apparatus 100 comprises means for:
  • the frequency dependent filter 50 is based on a difference in spectrum between the audio inputs 22 from the at least one unobstructed microphone and the at least one obstructed microphone 20 caused by acoustic shadowing by the at least partial, obstruction 30 between the at least one obstructed microphone 20 and the first region 16.
  • the at least partial, obstruction 30 is a hand 132 of the user 130.
  • the frequency dependent filter 50 provides frequency dependent gain.
  • the gain can be negative (attenuation) or positive (amplification).
  • the gain is controlled by the user 130 to be attenuation or, alternatively, amplification.
  • the user provides that control via a gesture 134 of their hand 132.
  • Figs 1A, 1B, 2A, 2B, 2C illustrate one or more spatially distributed audio sources 10 i which produce respective audio 12 i .
  • the audio sources 10 i illustrated and their arrangement are only examples. There can be more or fewer audio sources 10, for example there may be a single audio source 10.
  • the positions of the audio sources 10 can be different than illustrated. Also, the audio sources 10 can be positioned within a three-dimensional space.
  • the audio sources 10 can have different spatial extent than illustrated.
  • an audio source can be an ambient audio source (for example background noise).
  • An audio source 10 can be a localized audio source (for example a human speaking).
  • a frequency dependent filter 50 (not illustrated in these FIGs) is created based on audio input 22 from a single microphone 20.
  • the comparison of obstructed audio input 22' ( FIG 1B ) and the unobstructed audio input 22 ( FIG 1A ) is based on audio input 22, 22' that is captured by the microphone 20 at different times.
  • the comparison is a time-divided comparison.
  • a frequency dependent filter 50 (not illustrated in these FIGs) is created based on audio input 22 from two microphones 20 1 , 20 2 that are distinct and separated in space.
  • the comparison of obstructed audio input 22' ( FIG 2B ) and the unobstructed audio input 22 ( FIG 2A ) can be based on audio input 22, 22' that is captured by the microphone 20 at different times as described for FIGs 1A, 1B .
  • the comparison of obstructed audio input 22 2 ' ( FIG 2C ) and the unobstructed audio input 22 1 ( FIG 2C ) can be based on audio input 22 1 , 22 2 ' that is captured by different microphones 20 1 , 20 2 at the same time (e.g. simultaneously or contemporaneously).
  • the comparison is space-divided as the different microphones 20 1 , 20 2 are at different positions.
  • the microphone 20 provides, for further processing, unobstructed audio input 22 captured at a time (offset time) when the user is not providing the, at least partial, obstruction 30 between the at least one microphone 20 and the first region 16.
  • the user provides an obstruction 30 between the at least one microphone 20 and a first region 16.
  • the obstruction 30 can be a complete or partial obstruction.
  • the obstruction 30 obstructs the audio 12 2 produced by the audio source 10 2 .
  • the audio from the audio source 10 2 that reaches the microphone 20 (if any) is obstructed audio 12 2 '.
  • the microphone 20 captures the obstructed audio 12 2 ' and unobstructed audio 12 1 , 12 3 from other audio sources 10 1 , 10 3 (if any) to produce the obstructed audio input 22' at a time (reference time) when the user is providing the, at least partial, obstruction 30 between the microphone 20 and a first region 16.
  • the microphone 20 provides for further processing the obstructed audio input 22'.
  • an apparatus 100 which may or may not comprise the microphone 20 compares the obstructed audio input 22' and the unobstructed audio input 22 and creates a frequency dependent filter 50 in dependence upon the comparison. After the user-provided obstruction 30 has been removed, the created frequency dependent filter 50 can then be used to filter received audio input 22 from the microphone 20 to create filtered audio that amplifies or attenuates an audio source in the first region 16.
  • the offset time is a time offset relative to the reference time.
  • the offset time can be before or after the reference time.
  • the offset time can be immediately before or immediately after the reference time.
  • the user creates a spatially dependent, at least partial, obstruction 30 of audio reaching the at least one microphone.
  • a user input that indicates the reference time can be used to indicate a presence of the user-controlled obstruction 30 at the reference time.
  • the apparatus 100 can be configured to detect user input that determines the reference time, compare audio input 22 from the microphone 20 received at the reference time with audio input 22 from the microphone 20 received at the offset time (a time offset relative to the reference time), and create the frequency dependent filter 50 in dependence upon the comparison.
  • at the offset time ( FIG 1A ), the microphone 20 is an unobstructed microphone 20.
  • at the reference time ( FIG 1B ), the microphone 20 is an obstructed microphone 20.
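  • A minimal sketch of this time-division comparison (an illustrative assumption, not text from the patent): equal-width band spectra, as in FIGs 5A and 5B, are computed for a reference-time frame and an offset-time frame from the same microphone 20, and subtracted to obtain the difference 40.

```python
import numpy as np


def band_energies(frame, fs, n_bands=16, fmax=8000.0):
    """Mean spectral magnitude in equal-width frequency bands
    (cf. FIGs 5A/5B); assumes the frame is long enough that every
    band contains at least one FFT bin."""
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), 1.0 / fs)
    edges = np.linspace(0.0, fmax, n_bands + 1)
    return np.array([mag[(freqs >= lo) & (freqs < hi)].mean()
                     for lo, hi in zip(edges[:-1], edges[1:])])


def time_division_difference(reference_frame, offset_frame, fs):
    """The same microphone 20 captures the obstructed input 22' at the
    reference time and the unobstructed input 22 at the offset time;
    subtracting the band spectra yields the difference 40."""
    return band_energies(offset_frame, fs) - band_energies(reference_frame, fs)
```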
  • microphones 20 provide, for further processing, unobstructed audio input 22 1 , 22 2 captured at a time (offset time) when the user is not providing the, at least partial, obstruction 30 between the at least one microphone 20 and the first region 16.
  • the offset time is a time when the user is not providing the, at least partial, obstruction 30 between the at least one microphone 20 and the first region 16.
  • a pair of microphones 20 1 , 20 2 is used, but more could be used.
  • the user provides an obstruction 30 between the pair of microphones 20 and the first region 16.
  • the obstruction 30 can be a complete or partial obstruction.
  • the obstruction 30 obstructs the audio 12 2 produced by the audio source 10 2 .
  • the audio from the audio source 10 2 that reaches the microphone 20 1 (if any) is obstructed audio 12 2 '.
  • the microphone 20 1 captures the obstructed audio 12 2 ' and unobstructed audio 12 1 , 12 3 from other audio sources 10 1 , 10 3 (if any) to produce the obstructed audio input 22 1 ' at a time (reference time) when the user is providing the, at least partial, obstruction 30 between the microphone 20 1 and the first region 16.
  • the microphone 20 1 provides the obstructed audio input 22 1 ' for further processing.
  • the audio from the audio source 10 2 that reaches the microphone 20 2 (if any) is obstructed audio 12 2 '.
  • the microphone 20 2 captures the obstructed audio 12 2 ' and unobstructed audio 12 1 , 12 3 from other audio sources 10 1 , 10 3 (if any) to produce the obstructed audio input 22 2 ' at the time (reference time) when the user is providing the, at least partial, obstruction 30 between the microphone 20 2 and the first region 16.
  • the microphone 20 2 provides the obstructed audio input 22 2 ' for further processing.
  • an apparatus 100 which may or may not comprise the microphone(s) 20 compares the obstructed audio input 22' (obstructed audio input 22 1 ' and obstructed audio input 22 2 ') and the unobstructed audio input 22 (unobstructed audio input 22 1 and unobstructed audio input 22 2 ) and creates a frequency dependent filter 50 in dependence upon the comparison.
  • a combination (e.g. sum) of the obstructed audio input 22 1 ' and the obstructed audio input 22 2 ' is compared to a combination (e.g. sum) of the unobstructed audio input 22 1 and the unobstructed audio input 22 2 .
  • the created frequency dependent filter 50 can then be used to filter received audio input 22 from the microphone 20 to create filtered audio that amplifies or attenuates the audio source 10 2 in the first region 16.
  • the offset time is a time offset relative to the reference time.
  • the offset time can be before or after the reference time.
  • the offset time can be immediately before or immediately after the reference time.
  • the user creates a spatially dependent, at least partial, obstruction 30 of audio reaching the microphones 20.
  • a user input that indicates the reference time can be used to indicate a presence of the user-controlled obstruction 30 at the reference time.
  • the apparatus 100 can be configured to detect the user input that determines the reference time, compare audio input 22 from the microphones 20 received at the reference time with audio input 22 from the microphones 20 received at the offset time that has a time offset relative to the reference time, and create the frequency dependent filter 50 in dependence upon the comparison.
  • at the offset time ( FIG 2A ), the microphones 20 1 , 20 2 are unobstructed microphones 20.
  • at the reference time ( FIG 2B ), the microphones 20 1 , 20 2 are obstructed microphones 20.
  • microphones 20 provide, for further processing, obstructed and unobstructed audio input.
  • a pair of microphones 20 1 , 20 2 is used, but more could be used.
  • the user provides an obstruction 30 between the microphone 20 2 and a first region 16 (but not between the microphone 20 1 and the first region 16).
  • the obstruction 30 can be a complete or partial obstruction.
  • the obstruction 30 obstructs the audio 12 2 produced by the audio source 10 2 and captured by the microphone 20 2 .
  • microphone 20 1 provides for further processing unobstructed audio input 22 1 and microphone 20 2 provides for further processing obstructed audio input 22 2 '.
  • the user-controlled obstruction 30 is between the microphone 20 2 and the first region 16 but not between the microphone 20 1 and the first region 16. Consequently, microphone 20 1 provides for further processing unobstructed audio input 22 1 captured when the user is not providing the, at least partial, obstruction 30 between the microphone 20 1 and the first region 16 and microphone 20 2 provides for further processing obstructed audio input 22 2 ' captured when the user is providing the, at least partial, obstruction 30 between the microphone 20 2 and the first region 16.
  • the audio from the audio source 10 2 that reaches the microphone 20 1 (if any) is unobstructed audio 12 2 .
  • the microphone 20 1 captures the unobstructed audio 12 2 and unobstructed audio 12 1 , 12 3 from other audio sources 10 1 , 10 3 (if any) to produce the unobstructed audio input 22 1 at a time (reference time) when the user is providing the, at least partial, obstruction 30 between the microphone 20 2 and the first region 16.
  • the microphone 20 1 provides the unobstructed audio input 22 1 for further processing.
  • the audio from the audio source 10 2 that reaches the microphone 20 2 (if any) is obstructed audio 12 2 '.
  • the microphone 20 2 captures the obstructed audio 12 2 ' and unobstructed audio 12 1 , 12 3 from other audio sources 10 1 , 10 3 (if any) to produce the obstructed audio input 22 2 ' at the time (reference time) when the user is providing the, at least partial, obstruction 30 between the microphone 20 2 and the first region 16.
  • the microphone 20 2 provides for further processing the obstructed audio input 22 2 '.
  • an apparatus 100 which may or may not comprise some or all of the microphones 20 compares the obstructed audio input 22' (obstructed audio input 22 2 ') and the unobstructed audio input 22 (unobstructed audio input 22 1 ) and creates a frequency dependent filter 50 in dependence upon the comparison.
  • the created frequency dependent filter 50 can then be used to filter received audio input 22 from the microphone(s) 20 to create filtered audio that amplifies or attenuates an audio source in the first region 16.
  • the user creates a spatially dependent, at least partial, obstruction 30 of audio reaching a microphone 20 2 .
  • a user input indicating a reference time can be used to indicate a presence of the user-controlled obstruction 30 at the reference time.
  • the apparatus 100 can be configured to detect the user input that determines the reference time, compare audio input 22 2 from the microphone 20 2 received at the reference time with audio input 22 1 from the microphone 20 1 received at the reference time, and create the frequency dependent filter 50 in dependence upon the comparison.
  • the microphone 20 2 is an obstructed microphone 20 and the microphone 20 1 is an unobstructed microphone 20.
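  • The space-division variant then reduces to a one-line comparison (again an assumption for illustration; it reuses the band_energies helper sketched earlier, and a practical implementation would first match the two microphones' responses).

```python
def space_division_difference(frame_mic1, frame_mic2, fs):
    """FIG 2C: microphone 20 1 is unobstructed while microphone 20 2
    is hand-shadowed at the same instant, so a single simultaneous
    capture yields the difference 40 without any offset time."""
    return band_energies(frame_mic1, fs) - band_energies(frame_mic2, fs)
```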
  • FIG 3 illustrates a method 200.
  • the method 200 creates a frequency-dependent filter 50 (not illustrated).
  • the method 200 comprises receiving obstructed audio input 22' from at least one obstructed microphone 20 when a user is providing an, at least partial, obstruction 30 between the at least one obstructed microphone 20 and a first region 16.
  • the method 200 comprises receiving unobstructed audio input 22 from at least one unobstructed microphone 20 when the user is not providing the, at least partial, obstruction 30 between the at least one unobstructed microphone 20 and the first region 16.
  • the method 200 comprises comparing the obstructed audio input 22' and the unobstructed audio input 22.
  • the method 200 comprises creating a frequency dependent filter 50 in dependence upon the comparison.
  • the method 200 comprises filtering received audio input from the at least one microphone 20 to create filtered audio that amplifies or attenuates an audio source in the first region 16.
  • the filtering at block 218 occurs after removal of the obstruction 30 referenced in block 210.
  • the at least one obstructed microphone 20 and the at least one unobstructed microphone 20 can be the same set of one or more microphones 20 at different times (time division).
  • the at least one obstructed microphone 20 and the at least one unobstructed microphone 20 can be different sets of one or more microphones 20 at the same time (spatial division).
  • the frequency dependent filter 50 can, for example, be based on a difference in spectrum of the audio input 22 from the at least one microphone 20 caused by acoustic shadowing by a hand of the user.
  • the difference can be a change over time for one or more microphones 20 (time division).
  • the difference can be a difference between microphones 20 at the same time (spatial division).
  • FIG 4 illustrates a method 201 similar to method 200.
  • the method 201 creates a frequency-dependent filter 50 (not illustrated).
  • the method 201 comprises detecting user input that determines a reference time when a user is providing an, at least partial, obstruction 30 between the at least one obstructed microphone 20 and a first region 16 but not providing an obstruction 30 between the at least one unobstructed microphone 20 and the first region 16.
  • the method 201 comprises receiving obstructed audio input 22' from at least one obstructed microphone 20 when a user is providing an, at least partial, obstruction 30 between the at least one obstructed microphone 20 and a first region 16.
  • the method 201 comprises receiving unobstructed audio input 22 from at least one unobstructed microphone 20 when the user is not providing the, at least partial, obstruction 30 between the at least one unobstructed microphone 20 and the first region 16.
  • the method 201 comprises comparing the obstructed audio input 22' and the unobstructed audio input 22.
  • the method 201 comprises creating a frequency dependent filter 50 in dependence upon the comparison.
  • the method 201 comprises filtering received audio input from the at least one microphone 20 to create filtered audio that amplifies or attenuates an audio source in the first region 16.
  • the filtering at block 218 occurs after removal of the obstruction 30 referenced in block 210.
  • the at least one obstructed microphone 20 and the at least one unobstructed microphone 20 can be the same set of one or more microphones 20 at different times (time division).
  • the method 201 comprises receiving obstructed audio input 22' captured at the reference time by the microphone(s) 20 when a user is providing an, at least partial, obstruction 30 between the at least one obstructed microphone 20 and a first region 16 and at block 212, the method 201 comprises receiving unobstructed audio input 22 captured at an offset time (a time offset relative to the reference time) by the microphone(s) 20 when the user is not providing the, at least partial, obstruction 30 between the at least one unobstructed microphone 20 and the first region 16.
  • the at least one obstructed microphone 20 and the at least one unobstructed microphone 20 can be different sets of one or more microphones 20 at the same time (spatial division).
  • the method 201 comprises receiving obstructed audio input 22' captured at the reference time by a first set of one or more microphones 20 when a user is providing an, at least partial, obstruction 30 between the first set of microphones 20 and a first region 16 and at block 212, the method 201 comprises receiving unobstructed audio input 22 captured at the reference time by a second set of one or more microphones 20 when the user is not providing the, at least partial, obstruction 30 between the second set of microphones 20 and the first region 16.
  • Fig 5A is an example of a spectral representation of unobstructed audio input 22 produced by microphone(s) 20 that capture unobstructed audio 12.
  • the example is simplified for the purpose of explanation.
  • the FIG illustrates the energy (y-axis) that the unobstructed audio input 22 has within frequency bands (x-axis).
  • the frequency bands are of equal size; in other examples the frequency bands can be of different sizes. There can also be more or fewer frequency bands.
  • FIG 5A therefore represents a frequency spectrum of the unobstructed audio input 22.
  • Fig 5B is an example of a spectral representation of obstructed audio input 22' produced by microphone(s) 20 that capture obstructed audio 12' that is an obstructed variant of the unobstructed audio 12.
  • FIG 5B therefore represents a frequency spectrum of the obstructed audio input 22'.
  • Fig 5C illustrates a difference 40 between the frequency spectrum of the unobstructed audio input 22 ( FIG 5A ) and the frequency spectrum of the obstructed audio input 22' ( FIG 5B ).
  • the frequency spectrum of the obstructed audio input 22' ( FIG 5B ) is subtracted from the frequency spectrum of the unobstructed audio input 22 ( FIG 5A ).
  • the difference 40 represents a frequency spectrum of the audio from the first region 16 that has been obstructed.
  • the frequency spectrum of the audio from the first region 16 that has been obstructed has a range 42.
  • the comparison previously described can, for example, determine the difference 40 and use it to create the filter 50.
  • the frequency spectrum of the audio from the first region 16 that has been obstructed (e.g. Fig 5C ), is converted to a frequency filter 50 in FIG 6A and 6B .
  • the filter applies a gain to the range 42 of the frequency spectrum.
  • in FIG 6A the gain is positive (>1) within the range 42 and negative (<1) outside the range 42.
  • FIG 6A therefore illustrates an amplification filter 50 that is configured to preferentially amplify the frequency spectrum of the audio from the first region 16 that has been obstructed.
  • in FIG 6B the gain is negative (<1) within the range 42 and positive (>1) outside the range 42.
  • FIG 6B therefore illustrates an attenuation filter 50 that is configured to preferentially attenuate the frequency spectrum of the audio from the first region 16 that has been obstructed.
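  • One way (assumed for illustration; the threshold and gain values are not from the patent) to convert the difference 40 into the per-band gains of FIGs 6A and 6B:

```python
import numpy as np


def make_filter_gains(difference, amplify=True, boost=2.0, cut=0.5):
    """Bands where the hand removed energy form the range 42 and get
    one gain; all other bands get the opposite treatment, mirroring
    the amplification filter (FIG 6A) and attenuation filter (FIG 6B)."""
    in_range = difference > 0.1 * np.max(difference)  # assumed threshold
    if amplify:  # FIG 6A: gain > 1 inside the range 42, < 1 outside
        return np.where(in_range, boost, cut)
    return np.where(in_range, cut, boost)  # FIG 6B: the reverse
```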
  • the frequency dependent filter 50 is configured as a spatially dependent filter 50 that differentially amplifies or attenuates audio sources 10 at different spatial positions.
  • the frequency dependent filter 50 is based on a difference 40 in spectrum of the obstructed audio input 22' and the unobstructed audio input 22 caused by acoustic shadowing by an obstruction 30.
  • the obstruction 30 can for example be a hand 132 of the user 130, that provides the, at least partial, obstruction 30 between the obstructed microphone 20 and the first region 16.
  • the frequency-dependent filter 50 selectively provides a gain to spectral components in the range 42 corresponding to spectral components of the audio signal that has been obstructed and not captured.
  • the gain can be controlled by a user 130.
  • the user 130 can select whether the filter is an amplification filter or an attenuation filter.
  • the user 130 can determine the magnitude of the gain, for example, the magnitude of amplification/ attenuation.
  • the user input can, in some examples, be via a gesture 134 of a hand 132 (see FIG 7 ). That gesture can be part of, or separate from, a gesture used to place the hand 132 as the obstruction 30.
  • the frequency dependent filter 50 can, for example, be based on a difference in spectrum of the audio input 22 from the at least one microphone 20 caused by acoustic shadowing by a hand 132 of the user 130.
  • the difference 40 can be a change over time for one or more microphones 20 (time division).
  • the difference 40 can be a difference between microphones 20 at the same time (spatial division).
  • FIG 7 illustrates an example of an apparatus 100 comprising:
  • the apparatus 100 can also comprise means, frequency-dependent filter 50, for filtering received audio input 22 from the at least one microphone 20 to create filtered audio 24 that amplifies or attenuates an audio source in the first region 16.
  • the apparatus 100 additionally comprises, in this example, spectral analysis means 102 that performs spectral analysis of the received obstructed audio input 22' from the obstructed microphone(s) 20 and spectral analysis of the received unobstructed audio input 22 from the unobstructed microphone(s) 20.
  • the spectral analysis means can for example be a spectrum analyzer that is configured to create an unobstructed frequency spectrum of the unobstructed audio input 22 by converting the received unobstructed audio input 22 from the time domain to the frequency domain (e.g. as shown in FIG 5A ) and to create an obstructed frequency spectrum of the obstructed audio input 22' by converting the received obstructed audio input 22' from the time domain to the frequency domain (e.g. as shown in FIG 5B ).
  • the frequency-dependent filter 50 is generated at block 106 based on the difference 40 between the obstructed and unobstructed frequency spectrums.
  • the apparatus 100 is configured to assume an obstruction 30 at the user-indicated reference time and no obstruction 30 at an offset time.
  • the obstructed spectrum is obtained by analyzing the audio input from microphone(s) 20 captured at the reference time and the unobstructed spectrum is obtained by analyzing the audio input from those microphone(s) 20 captured at the offset time.
  • the control block 110 can be configured to control when the described methods are used or not used.
  • the control block 110 can be configured to recognize a user input.
  • a sensor 112 can be provided to detect a user input.
  • the sensor 112 (not a microphone 20) can detect motion 134 of a hand 132 of a user 130.
  • the sensor 112 can locate the hand 132 of a user 130 relative to the microphones 20. Any suitable sensor 112 can be used.
  • control block 110 is configured to recognize that the obstruction 30 is a hand 132 of the user 130.
  • the differential hand masking causes frequency differentiation in the audio spectrum.
  • the gain is controlled by the user 130 to be attenuation or, alternatively, amplification.
  • the user provides that control via gesture 134 of their hand 132.
  • the user 130 may control the preferred behavior for the obstruction by performing a secondary gesture 134 with the obstructing hand 132.
  • An example may be first placing the hand as an obstruction 30 and, once the apparatus 100 provides an output acknowledging the action, the user may either perform an amplification gesture (e.g. a pinch-out gesture performed by moving thumb and first finger away from each other) or an attenuation gesture (e.g. a pinch-in gesture performed by moving thumb and first finger towards each other) to control, respectively, whether the filter 50 should be an amplification filter or an attenuation filter.
  • the extent of amplification/attenuation could be controlled by the size of the amplification or attenuation gesture (e.g. how far the thumb and first finger are moved).
  • the apparatus 100 can, for example provide feedback on the amplification/attenuation gesture to the user 130.
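  • One possible mapping (an assumption, not the patent's specification) from the recognized secondary gesture to a linear gain, with the gesture size scaling the magnitude:

```python
def gesture_to_gain(kind, size, max_db=12.0):
    """Pinch-out selects amplification, pinch-in attenuation; the
    normalized gesture size (0..1) sets the magnitude on an assumed
    dB scale, returned as a linear gain factor."""
    db = max_db * min(max(size, 0.0), 1.0)
    if kind == "pinch_in":  # attenuation gesture
        db = -db
    return 10.0 ** (db / 20.0)  # e.g. ("pinch_out", 1.0) -> ~4.0
```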
  • the sensor 112 is a camera.
  • other sensors 112 can be a proximity sensor, a hover sensor, a positioning sensor or some other sensor.
  • the camera 112 can for example be a camera that records a visual part of an audio-visual scene.
  • the microphones 20 can simultaneously record the audio part of the audio-visual scene. That audio part can be filtered by the frequency dependent filter 50 as described.
  • the sensor 112 can be configured to detect as the user input, a spatially specific hand gesture 134 that provides a spatially dependent, at least partial, obstruction 30 of audio reaching the at least one microphone.
  • control block 110 can, in some examples, be configured to process an output from the camera 112 to recognize a presence of a hand 132, a position of the hand 132 and movement of the hand 132 as a user hand gesture 134.
  • the control block 110 can, for example, discriminate different hand gestures 134 as different user inputs.
  • Computer vision processing can be used to differentiate a hand 132 from other objects and to differentiate different hand gestures 134.
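  • One possible realization of that computer vision step, sketched with the OpenCV and MediaPipe libraries (the patent does not prescribe any particular vision stack; the thumb-to-index pinch metric is our assumption):

```python
import cv2
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1)
THUMB = mp.solutions.hands.HandLandmark.THUMB_TIP
INDEX = mp.solutions.hands.HandLandmark.INDEX_FINGER_TIP


def pinch_distance(frame_bgr):
    """Return the normalized thumb-to-index distance of the first
    detected hand, or None; tracking how this value changes across
    frames lets pinch-in be discriminated from pinch-out."""
    results = hands.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if not results.multi_hand_landmarks:
        return None
    lm = results.multi_hand_landmarks[0].landmark
    dx = lm[THUMB].x - lm[INDEX].x
    dy = lm[THUMB].y - lm[INDEX].y
    return (dx * dx + dy * dy) ** 0.5
```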
  • a user 130 can indicate a preferred audio focus direction by placing his/her hand between a microphone 20 and that direction.
  • Hand gestures 134 provide an intuitive way of selecting where to focus (i.e. selecting the first region 16), even during audio/video capture.
  • the focus direction selection can be performed without manual interaction with the apparatus 100, which is especially convenient, for example, while wearing winter gloves.
  • the apparatus 100 continues to perform spectral analysis of the audio input 22 to detect a change in a configuration of the audio sources 10, for example a change, addition, loss or movement of an audio source 10.
  • the spectral analysis can for example be limited to the range 42 of the frequency spectrum of the unobstructed audio input 22 to detect a substantial change in that region of the frequency spectrum.
  • the apparatus can then prompt the user to perform a recalibration process.
  • the recalibration process is, for example, a repeat of the original process used to create the frequency-dependent filter 50.
  • the apparatus 100 can vary the frequency-dependent filter over time or stop using the frequency-dependent filter 50 automatically.
  • the frequency dependent filter 50 can be a time-variable filter 50 configured to fade over time.
  • the magnitude of the gain difference provided by the filter 50 can be time-dependent and reduce over time (e.g. 10s) to zero.
  • the diminution of the filter 50 can for example be shown visually by the apparatus 100. The user then has to re-perform the process of creating the frequency-dependent filter 50, if it is still required.
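  • A sketch of such a fade (the linear schedule is an assumption; the 10 s constant comes from the example above):

```python
import numpy as np

FADE_SECONDS = 10.0  # example duration mentioned above


def faded_gains(base_gains, elapsed_s, fade_s=FADE_SECONDS):
    """Blend the per-band gains linearly back toward unity (all-pass),
    so that the filter 50 has fully faded once fade_s has elapsed."""
    w = max(0.0, 1.0 - elapsed_s / fade_s)
    return 1.0 + w * (np.asarray(base_gains) - 1.0)
```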
  • the apparatus 100 can therefore be configured to prompt the user to perform 're-calibration' by repeating the process of creating the frequency dependent filter 50.
  • This can for example comprise:
  • the apparatus 100 is a hand-portable apparatus of a size that can fit into a jacket pocket.
  • the apparatus 100 is a flat screen tablet apparatus such as a mobile telephone, tablet computer, personal digital assistant or similar.
  • FIG 8 illustrates an example of an apparatus 100 that uses the frequency-dependent filter 50 to filter received audio input 22 from the microphone(s) 20 to create filtered audio 24 that amplifies or attenuates an audio source in the first region 16.
  • the same frequency-dependent filter 50 can be used to filter all the received audio inputs 22 from the microphones 20 to create filtered audio 24 that amplifies or attenuates an audio source in the first region 16.
  • the same frequency-dependent filter 50 can be used to filter a sub-set of the received audio inputs 22 from the microphones 20 to create filtered audio 24 that amplifies or attenuates an audio source in the first region 16.
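  • A sketch of this filtering stage (an illustrative assumption): the per-band gains are applied in the short-time Fourier domain, so the same filter 50 works for live or stored audio input 22.

```python
import numpy as np
from scipy.signal import stft, istft


def apply_filter(x, fs, band_edges, band_gains, nperseg=1024):
    """Scale every STFT bin by the gain of the frequency band it falls
    in, then resynthesize the filtered audio 24."""
    f, _, Z = stft(x, fs=fs, nperseg=nperseg)
    idx = np.clip(np.digitize(f, band_edges) - 1, 0, len(band_gains) - 1)
    Z = Z * np.asarray(band_gains)[idx, None]
    _, y = istft(Z, fs=fs, nperseg=nperseg)
    return y
```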
  • the FIG illustrates that the received audio inputs 22 can be pre-processed at pre-processing block 120 before being filtered.
  • the received audio inputs 22 can also be pre-processed before spectral analysis 102 is performed ( FIG 7 ).
  • the pre-processing can, for example, comprise noise reduction, equalization, spatialization of the microphone signals, wind noise reduction etc.
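  • As one plausible pre-processing step (the description only names the categories; this high-pass wind-rumble filter is an assumption):

```python
import numpy as np
from scipy.signal import butter, lfilter


def preprocess(x, fs, cutoff_hz=80.0):
    """High-pass the microphone signal to suppress low-frequency wind
    rumble before spectral analysis and filtering."""
    b, a = butter(4, cutoff_hz, btype="highpass", fs=fs)
    return lfilter(b, a, np.asarray(x, dtype=float))
```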
  • the audio input 22 that is filtered can be 'live', that is real-time or can be accessed from a memory.
  • the audio output, the filtered audio 24, can be rendered 'live', that is in real-time or can be recorded in a memory for future access.
  • FIG 9A illustrates an example of a controller 70 for the apparatus 100.
  • Implementation of a controller 70 may be as controller circuitry.
  • the controller 70 may be implemented in hardware alone, have certain aspects in software including firmware alone or can be a combination of hardware and software (including firmware).
  • controller 70 may be implemented using instructions that enable hardware functionality, for example, by using executable instructions of a computer program 76 in a general-purpose or special-purpose processor 72 that may be stored on a computer readable storage medium (disk, memory etc.) to be executed by such a processor 72.
  • the processor 72 is configured to read from and write to the memory 74.
  • the processor 72 may also comprise an output interface via which data and/or commands are output by the processor 72 and an input interface via which data and/or commands are input to the processor 72.
  • the memory 74 stores a computer program 76 comprising computer program instructions (computer program code) that controls the operation of the apparatus 100 when loaded into the processor 72.
  • the computer program instructions of the computer program 76 provide the logic and routines that enable the apparatus to perform the methods illustrated in the Figs.
  • the processor 72 by reading the memory 74 is able to load and execute the computer program 76.
  • the apparatus 100 therefore comprises:
  • the computer program 76 may arrive at the apparatus 100 via any suitable delivery mechanism 78.
  • the delivery mechanism 78 may be, for example, a machine readable medium, a computer-readable medium, a non-transitory computer-readable storage medium, a computer program product, a memory device, a record medium such as a Compact Disc Read-Only Memory (CD-ROM) or a Digital Versatile Disc (DVD) or a solid state memory, an article of manufacture that comprises or tangibly embodies the computer program 76.
  • the delivery mechanism may be a signal configured to reliably transfer the computer program 76.
  • the apparatus 100 may propagate or transmit the computer program 76 as a computer data signal.
  • Computer program instructions for causing an apparatus to perform at least the following or for performing at least the following:
  • the computer program instructions may be comprised in a computer program, a non-transitory computer readable medium, a computer program product, a machine readable medium. In some but not necessarily all examples, the computer program instructions may be distributed over more than one computer program.
  • memory 74 is illustrated as a single component/circuitry it may be implemented as one or more separate components/circuitry some or all of which may be integrated/removable and/or may provide permanent/semi-permanent/ dynamic/cached storage.
  • processor 72 is illustrated as a single component/circuitry it may be implemented as one or more separate components/circuitry some or all of which may be integrated/removable.
  • the processor 72 may be a single core or multi-core processor.
  • FIGs 10A, 10B, 10C, 11A, 11B are equivalent to previous FIGs 5A, 5B, 5C, 6A, 6B .
  • FIGs 10A, 10B, 10C, 11A illustrate that the frequency dependent filter 50 (illustrated in FIG 6A ) can be extended to apply amplification outside the range 42 at lower harmonic frequencies.
  • FIGs 10A, 10B, 10C, 11B illustrate that the frequency dependent filter 50 (illustrated in FIG 6B ) can be extended to apply attenuation outside the range 42 at lower harmonic frequencies.
  • Fig 10A is an example of an unobstructed frequency spectrum that is a frequency spectrum of an unobstructed audio input 22 produced by microphone(s) 20 that captures unobstructed audio 12.
  • the unobstructed frequency spectrum comprises a harmonic structure (H).
  • Fig 10B is an example of an obstructed frequency spectrum that is a frequency spectrum of an obstructed audio input 22' produced by microphone(s) 20 that capture obstructed audio 12'.
  • Fig 10C illustrates a difference 40 between the unobstructed frequency spectrum of the unobstructed audio input 22 ( FIG 10A ) and the obstructed frequency spectrum of the obstructed audio input 22' ( FIG 10B ).
  • the obstructed frequency spectrum is subtracted from the unobstructed frequency spectrum.
  • the difference 40 represents a frequency spectrum of the audio from the first region 16 that has been obstructed.
  • the frequency spectrum of the audio from the first region 16 that has been obstructed has a range 42.
  • the frequency spectrum of the audio from the first region 16 that has been obstructed is converted to a frequency filter 50 in FIG 11A and 11B .
  • the filter applies a gain to the range 42 of the frequency spectrum and to the harmonics H outside the range 42.
  • in FIG 11A the gain is positive (>1) within the range 42 and at the harmonics H that have a lower frequency than the range 42, and is otherwise negative (<1): it is negative outside the range 42, including at the harmonics H that have a higher frequency than the range 42.
  • FIG 11A therefore illustrates an amplification filter 50 that is configured to preferentially amplify the frequency spectrum of the audio from the first region 16 that has been obstructed (and its lower frequency harmonics).
  • in FIG 11B the gain is negative (<1) within the range 42 and at the harmonics H that have a lower frequency than the range 42, and is otherwise positive (>1): it is positive outside the range 42, including at the harmonics H that have a higher frequency than the range 42.
  • FIG 11B therefore illustrates an attenuation filter 50 that is configured to preferentially attenuate the frequency spectrum of the audio from the first region 16 that has been obstructed (and its lower frequency harmonics).
  • the frequency dependent filter 50 is configured as a spatially dependent filter 50 that differentially amplifies or attenuates audio sources 10 at different spatial positions (and their lower frequency harmonics).
  • the frequency dependent filter 50 is based on a difference 40 in spectrum of the obstructed audio input 22' and the unobstructed audio input 22 caused by acoustic shadowing by an obstruction 30.
  • the obstruction 30 can for example be a hand 132 of the user 130, that provides the, at least partial, obstruction 30 between the obstructed microphone 20 and the first region 16.
  • the gain can be controlled by a user 130.
  • the user can select whether the filter 50 is an amplification filter or an attenuation filter.
  • the user 130 can determine the magnitude of the gain, for example, the magnitude of amplification/ attenuation.
  • the user input can, in some examples, be a gesture 134 of a hand 132 (see FIG 7 ). That gesture 134 can be part of, or separate from, a gesture used to place the hand 132 as the obstruction 30.
  • the frequency dependent filter 50 can, for example, be based on a difference in spectrum of the audio input 22 from the at least one microphone 20 caused by acoustic shadowing by a hand of the user.
  • the difference can be a change over time for one or more microphones 20 (time division).
  • the difference can be a difference between microphones 20 at the same time (spatial division).
  • the example described in relation to FIGs 10A-10C and FIGs 11A & 11B can be useful if the target audio at the first region 16 contains very low frequencies.
  • the very low frequencies may not be affected by the acoustic shadowing caused by the hand.
  • the extension of the filter 50 to lower frequency harmonics can occur, for example, if the unobstructed frequency spectrum ( FIG 10A ) has a harmonic structure that is absent from the obstructed frequency spectrum ( FIG 10B ).
  • the harmonic frequencies 200, 300, 400, 500, 600, 700, 800, 900, and 1000Hz are substantially attenuated by hand shadowing.
  • the apparatus 100 can determine that the target audio is most likely a harmonic sound having the fundamental frequency of 100Hz. Even if the frequency 100Hz is not affected by the hand shadowing the system can nevertheless pick that frequency to be included in the frequency-dependent filter 50.
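  • A simplified heuristic for that harmonic extension (assumed for illustration): infer the fundamental from the spacing of the shadowed harmonics and prepend it to the set of frequencies the filter 50 acts on.

```python
import numpy as np


def extend_to_fundamental(shadowed_hz, tolerance_hz=5.0):
    """If the hand-shadowed frequencies form a near-uniform harmonic
    series (e.g. 200, 300, ... 1000 Hz), take the spacing as the
    fundamental (here 100 Hz) and include it, even though the hand
    shadowing itself barely affects that low frequency."""
    shadowed = np.sort(np.asarray(shadowed_hz, dtype=float))
    spacings = np.diff(shadowed)
    if spacings.size and np.ptp(spacings) < tolerance_hz:
        f0 = float(np.median(spacings))
        if f0 < shadowed[0]:  # only add a fundamental below the series
            return [f0] + shadowed.tolist()
    return shadowed.tolist()
```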
  • frequency can refer either to a single frequency or to a frequency band with a certain width.
  • references to 'computer-readable storage medium', 'computer program product', 'tangibly embodied computer program' etc. or a 'controller', 'computer', 'processor' etc. should be understood to encompass not only computers having different architectures such as single /multi- processor architectures and sequential (Von Neumann)/parallel architectures but also specialized circuits such as field-programmable gate arrays (FPGA), application specific circuits (ASIC), signal processing devices and other processing circuitry.
  • References to computer program, instructions, code etc. should be understood to encompass software for a programmable processor or firmware such as, for example, the programmable content of a hardware device whether instructions for a processor, or configuration settings for a fixed-function device, gate array or programmable logic device etc.
  • circuitry may refer to one or more or all of the following:
  • the blocks illustrated in the Figs may represent steps in a method and/or sections of code in the computer program 76.
  • the illustration of a particular order to the blocks does not necessarily imply that there is a required or preferred order for the blocks and the order and arrangement of the blocks may be varied. Furthermore, it may be possible for some blocks to be omitted.
  • the recording of data may comprise only temporary recording, or it may comprise permanent recording, or it may comprise both temporary recording and permanent recording.
  • Temporary recording implies the recording of data temporarily. This may, for example, occur during sensing or image capture, occur at a dynamic memory, occur at a buffer such as a circular buffer, a register, a cache or similar.
  • Permanent recording implies that the data is in the form of an addressable data structure that is retrievable from an addressable memory space and can therefore be stored and retrieved until deleted or over-written, although long-term storage may or may not occur.
  • the use of the term 'capture' in relation to an image or audio relates to temporary recording of the data.
  • the use of the term 'record' or 'store' in relation to an image or audio relates to permanent recording of the data.
  • the above described examples find application as enabling components of: automotive systems; telecommunication systems; electronic systems including consumer electronic products; distributed computing systems; media systems for generating or rendering media content including audio, visual and audio visual content and mixed, mediated, virtual and/or augmented reality; personal systems including personal health systems or personal fitness systems; navigation systems; user interfaces also known as human machine interfaces; networks including cellular, non-cellular, and optical networks; ad-hoc networks; the internet; the internet of things; virtualized networks; and related software and services.
  • a property of the instance can be a property of only that instance or a property of the class or a property of a sub-class of the class that includes some but not all of the instances in the class. It is therefore implicitly disclosed that a feature described with reference to one example but not with reference to another example, can where possible be used in that other example as part of a working combination but does not necessarily have to be used in that other example.
  • 'a' or 'the' is used in this document with an inclusive not an exclusive meaning. That is, any reference to X comprising a/the Y indicates that X may comprise only one Y or may comprise more than one Y unless the context clearly indicates the contrary. If it is intended to use 'a' or 'the' with an exclusive meaning then it will be made clear in the context. In some circumstances the use of 'at least one' or 'one or more' may be used to emphasize an inclusive meaning but the absence of these terms should not be taken to infer any exclusive meaning.
  • the presence of a feature (or combination of features) in a claim is a reference to that feature or (combination of features) itself and also to features that achieve substantially the same technical effect (equivalent features).
  • the equivalent features include, for example, features that are variants and achieve substantially the same result in substantially the same way.
  • the equivalent features include, for example, features that perform substantially the same function, in substantially the same way to achieve substantially the same result.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Otolaryngology (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Circuit For Audible Band Transducer (AREA)
EP21165215.1A 2021-03-26 2021-03-26 Audio processing Pending EP4064734A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21165215.1A EP4064734A1 (fr) 2021-03-26 2021-03-26 Traitement audio
CN202210306872.5A CN115132216A (zh) 2021-03-26 2022-03-25 音频处理

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP21165215.1A EP4064734A1 (fr) 2021-03-26 2021-03-26 Traitement audio

Publications (1)

Publication Number Publication Date
EP4064734A1 true EP4064734A1 (fr) 2022-09-28

Family

ID=75252408

Family Applications (1)

Application Number Title Priority Date Filing Date
EP21165215.1A Pending EP4064734A1 (fr) 2021-03-26 2021-03-26 Traitement audio

Country Status (2)

Country Link
EP (1) EP4064734A1 (fr)
CN (1) CN115132216A (fr)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120057733A1 (en) * 2009-04-28 2012-03-08 Keiko Morii Hearing aid device and hearing aid method
US20150312691A1 (en) * 2012-09-10 2015-10-29 Jussi Virolainen Automatic microphone switching
US20160373872A1 (en) * 2013-09-16 2016-12-22 Huawei Device Co., Ltd. Sound effect control method and apparatus

Also Published As

Publication number Publication date
CN115132216A (zh) 2022-09-30

Similar Documents

Publication Publication Date Title
CN110970057B (zh) Sound processing method, apparatus and device
CN111724823B (zh) Information processing method and apparatus
US9596437B2 (en) Audio focusing via multiple microphones
US9913027B2 (en) Audio signal beam forming
CN106960670B (zh) Recording method and electronic device
US10667049B2 (en) Detecting the presence of wind noise
US11348288B2 (en) Multimedia content
EP4113514A1 (fr) Methods, apparatuses and computer programs for noise reduction
CN104301596A (zh) Video processing method and apparatus
US20170188140A1 (en) Controlling audio beam forming with video stream data
WO2020234015A1 (fr) Apparatus and associated methods for spatial audio capture
EP4064734A1 (fr) Audio processing
US20220260664A1 (en) Audio processing
CN113676687A (zh) Information processing method and electronic device
US11805312B2 (en) Multi-media content modification
US11882401B2 (en) Setting a parameter value
WO2019159050A1 (fr) Audio data arrangement
EP3917160A1 (fr) Content capture
US20220360925A1 (en) Image and Audio Apparatus and Method
US20230185396A1 (en) Tactile audio display
US20240062769A1 (en) Apparatus, Methods and Computer Programs for Audio Focusing
CN117953870A (zh) Speech sensing method, system and terminal based on built-in sensors of a mobile device
JP2019029981A (ja) Video and audio signal processing apparatus, method and program

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN PUBLISHED

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20230323

RBV Designated contracting states (corrected)

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20240429