CN112242148B - Headset-based wind noise suppression method and device - Google Patents

Info

Publication number
CN112242148B
Authority
CN
China
Prior art keywords
wind noise
function
microphone
determining
beam forming
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011258693.6A
Other languages
Chinese (zh)
Other versions
CN112242148A (en)
Inventor
邱锋海
项京朋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sound+ Technology Co ltd
Original Assignee
Beijing Sound+ Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sound+ Technology Co ltd
Priority to CN202011258693.6A
Publication of CN112242148A
Application granted
Publication of CN112242148B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/10 Earpieces; Attachments therefor; Earphones; Monophonic headphones
    • H04R1/1083 Reduction of ambient noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L2021/02087 Noise filtering the noise being separate speech, e.g. cocktail party
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The embodiment of the invention discloses a headset-based method and device for suppressing wind noise and environmental noise. The method is applied to a headset comprising M microphones and includes the following steps: receiving M sound signals, forming microphone pairs that each consist of two different microphones, and using the microphone pairs for joint judgment; performing complex coherence function analysis and sound source localization angle analysis on the sound signals received by each microphone pair, and constructing a wind noise judgment function in one-to-one correspondence with each microphone pair; determining the wind noise judgment function with the smallest function value among those of all microphone pairs as a first wind noise judgment function, and determining the wind noise level according to the first wind noise judgment function; and determining, according to the wind noise level, the microphone pair to be used and the corresponding beamformer and control parameters, to obtain a beamforming output signal. The embodiment of the invention effectively suppresses wind noise and environmental noise in the sound signal and improves voice communication and voice recognition performance.

Description

Headset-based wind noise suppression method and device
Technical Field
The invention relates to the technical field of voice enhancement, in particular to a headset-based wind noise and environmental noise suppression method and device.
Background
At present, earphones have become an indispensable electronic product for daily entertainment and voice communication. In practical applications, the microphones on an earphone tend to pick up various noises, such as subway noise, road noise, interference noise, and wind noise. These noises not only severely degrade voice call quality but also affect voice wake-up, voice recognition, and the like. A headset offers a good sound field and good wearing comfort, and because the ear cup housing is larger, more microphones can be placed on it, so better call performance can be obtained.
In outdoor scenes, wind noise is highly random, has strong low-frequency energy, and covers a wide frequency band. It seriously degrades call quality and results in a poor user experience.
In current earphone application scenarios, there are three common wind noise suppression methods: (1) designing an earphone with a special physical structure to weaken the influence of wind noise on the microphone; although this achieves good wind noise suppression, each earphone must be designed individually, so the approach lacks generality and is costly; (2) noise reduction based on an auxiliary sensor, in which the wearer's voice is picked up by a vibration sensor or an acceleration sensor (such as a bone conduction sensor); the auxiliary sensor provides non-acoustic information to assist the decision, so accuracy is high, but the wearer must wear the earphone correctly and the cost is high; (3) methods based on array signal processing and post-filtering, which suppress the wind noise component in the sound signal received by the microphone according to the differences between the spectral characteristics of speech and wind noise; in practical applications, however, the robustness and the wind noise suppression performance of such algorithms are problematic.
Disclosure of Invention
The embodiment of the invention provides a headset-based method and device for suppressing wind noise and environmental noise, which effectively suppress wind noise and environmental noise in sound signals, preserve the target voice signal, and improve voice communication and voice recognition performance. The technical scheme is as follows:
In a first aspect, an embodiment of the present invention provides a headset-based method for suppressing wind noise and environmental noise, where the method is applied to a headset that includes M microphones, and the method includes:
receiving M sound signals; wherein the sound signals at least comprise a target voice signal, wind noise and environmental noise;
performing complex coherence function analysis and sound source localization angle analysis on the sound signals received by each microphone pair, and constructing, according to the complex coherence function analysis result and the sound source localization angle analysis result, a wind noise judgment function in one-to-one correspondence with each microphone pair; wherein each microphone pair consists of two different microphones, and a plurality of microphone pairs are used for joint judgment;
determining, among the wind noise judgment functions corresponding to all microphone pairs, the wind noise judgment function with the smallest function value, and taking it as a first wind noise judgment function;
and determining the wind noise level according to the first wind noise judgment function, determining the microphone pair to be used and the corresponding beamformer and control parameters according to the wind noise level, and suppressing wind noise and environmental noise in the sound signal to obtain a beamforming output signal.
In one possible implementation, determining the wind noise level according to the first wind noise judgment function, and determining the microphone pair to be used and the corresponding beamformer and control parameters according to the wind noise level, includes:
when the function value of the first wind noise judgment function is smaller than a first preset threshold, determining that the wind noise in the sound signals received by the microphone pair corresponding to the first wind noise judgment function is small wind noise, selecting the corresponding microphone pair according to the first wind noise judgment function, forming a first beamformer directed at the wearer's mouth, and determining control parameters of the first beamformer; or, when the function value of the first wind noise judgment function is greater than or equal to the first preset threshold and smaller than a second preset threshold, determining that the wind noise in the sound signals received by the microphone pair corresponding to the first wind noise judgment function is moderate wind noise, selecting the corresponding microphone pair according to the first wind noise judgment function, forming a second beamformer directed at the wearer's mouth, and determining control parameters of the second beamformer; or, when the function value of the first wind noise judgment function is greater than or equal to the second preset threshold, determining that the wind noise in the sound signals received by the microphone pair corresponding to the first wind noise judgment function is large wind noise, and performing fixed beamforming with all microphone pairs to form a third beamformer directed at the wearer's mouth.
In one possible implementation, the first beamformer is an adaptive beamformer whose step size is fixed and greater than a preset step size threshold; the second beamformer is an adaptive beamformer whose step size changes dynamically; the third beamformer is a fixed beamformer.
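The selection logic described above can be sketched as follows. This is only an illustrative sketch: the function name, the enum labels, and the example threshold values 0.2 and 0.6 are hypothetical, since the text does not give concrete thresholds.

```python
from enum import Enum


class Beamformer(Enum):
    ADAPTIVE_FIXED_STEP = 1     # first beamformer (small wind noise)
    ADAPTIVE_VARIABLE_STEP = 2  # second beamformer (middle range)
    FIXED = 3                   # third beamformer (large wind noise)


def select_pair_and_beamformer(f_values, thr1, thr2):
    """Pick the microphone pair with the smallest judgment value and a beamformer type.

    f_values maps a pair identifier to the current value of that pair's
    wind noise judgment function. thr1 < thr2 are the two preset thresholds
    (their values are assumptions, not taken from the patent).
    Returns (pair_id, beamformer); pair_id is None when all pairs are used.
    """
    best_pair = min(f_values, key=f_values.get)  # first wind noise judgment function
    f_min = f_values[best_pair]
    if f_min < thr1:
        return best_pair, Beamformer.ADAPTIVE_FIXED_STEP
    if f_min < thr2:
        return best_pair, Beamformer.ADAPTIVE_VARIABLE_STEP
    # Large wind noise: fixed beamforming over all microphone pairs.
    return None, Beamformer.FIXED


# Hypothetical judgment values for four pairs and hypothetical thresholds.
choice = select_pair_and_beamformer({1: 0.35, 2: 0.10, 3: 0.50, 4: 0.42}, 0.2, 0.6)
```

Taking the minimum first means that the least wind-affected pair drives the decision, so the fixed beamformer over all pairs is used only when even the best pair is heavily contaminated.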
In one possible implementation, the method further comprises:
judging the wind noise direction according to the relationship among the wind noise judgment functions of the microphone pairs, and judging the upper limit of the wind noise bandwidth of the sound signals received by the corresponding microphone pair according to the magnitude of the first wind noise judgment function.
In a second aspect, an embodiment of the present invention further provides a headset-based wind noise and environmental noise suppression device, where the device is located on a headset that includes M microphones, and the device includes:
a signal receiving module, used for receiving M sound signals; wherein the sound signals at least comprise a target voice signal, wind noise and environmental noise;
a function construction module, used for performing complex coherence function analysis and sound source localization angle analysis on the sound signals received by each microphone pair, and constructing, according to the complex coherence function analysis result and the sound source localization angle analysis result, a wind noise judgment function in one-to-one correspondence with each microphone pair; wherein each microphone pair consists of two different microphones, and a plurality of microphone pairs are used for joint judgment;
a function determining module, used for determining, among the wind noise judgment functions corresponding to all microphone pairs, the wind noise judgment function with the smallest function value, and taking it as a first wind noise judgment function;
and an output obtaining module, used for determining the wind noise level according to the first wind noise judgment function, determining the microphone pair to be used and the corresponding beamformer and control parameters according to the wind noise level, and suppressing wind noise and environmental noise in the sound signal to obtain a beamforming output signal.
In a third aspect, an embodiment of the present invention proposes a headset based wind noise and environmental noise suppression device comprising at least one processor for executing a program stored in a memory, which when executed causes the device to perform the steps of the method of the first aspect and in various possible implementations.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method of the first aspect and in various possible implementations.
As can be seen from the above technical solutions, in the embodiment of the present invention, complex coherence function analysis and sound source localization angle analysis are performed on the two sound signals received by each microphone pair to construct a wind noise judgment function in one-to-one correspondence with each microphone pair; the wind noise judgment function with the smallest function value is then determined as the first wind noise judgment function; next, the microphone pair, the beamformer directed at the wearer's mouth, and its control parameters are determined according to the first wind noise judgment function, which effectively suppresses wind noise and environmental noise in the sound signal; finally, single-channel speech enhancement is applied to the beamforming output signal to further enhance the target voice signal.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings that are necessary for the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention and that other drawings can be obtained from these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic block diagram of a headset-based wind noise and environmental noise suppression method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of wind noise and environmental noise in a sound signal received by a microphone according to an embodiment of the present invention;
FIGS. 3(a)-3(e) show, as the wind noise increases from small to large, the spectrograms of the sound signals received by two microphones on the same side, the corresponding complex coherence function magnitude spectrum, the smoothed sound source localization angle deviation characteristic function, and the wind noise judgment function F(l) of the microphone pair formed by microphones Mic1 and Mic3, according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a microphone array configuration according to an embodiment of the present invention;
FIGS. 5(a)-5(b) are spectrograms of the sound signals received by the left microphone and the right microphone when wind noise is incident from the left side, according to an embodiment of the present invention;
FIGS. 6(a)-6(b) are schematic diagrams of switching the output microphone pair according to changes in the instantaneous microphone pair, according to an embodiment of the present invention;
FIGS. 7(a)-7(f) show the spectrograms of the sound signals received by the left microphone and the right microphone, the corresponding wind noise judgment functions, the minimum wind noise judgment function of the four microphone pairs, and the corresponding microphone pair, when the wind noise changes from left incidence to right incidence, according to an embodiment of the present invention;
FIG. 8 is a schematic flow chart of a beamforming method based on the wind noise judgment function according to an embodiment of the present invention;
FIGS. 9(a)-9(e) are spectrograms of the sound signals received by the four microphones (front right, rear right, front left and rear left) and of the signal processed by the algorithm, under wind noise of different directions and intensities;
FIG. 10 is a flowchart of a headset-based wind noise and environmental noise suppression method according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of a headset-based wind noise and environmental noise suppression device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the following detailed description of the embodiments of the present invention is given with reference to the accompanying drawings.
At present, in practical wind noise and environmental noise suppression scenarios, the performance of single-channel speech enhancement algorithms degrades sharply under strong wind noise, whereas multi-channel speech enhancement algorithms both improve noise reduction performance and enhance robustness, and are therefore widely used in current earphone speech enhancement schemes. Current wind noise and environmental noise suppression methods are mostly designed for in-ear or semi-in-ear earphones (for example, methods based on an auxiliary sensor), and research on such methods for headsets is limited. Compared with in-ear or semi-in-ear earphones, a headset has a large aperture, multiple microphones, and flexible microphone placement, and can therefore achieve better wind noise and environmental noise suppression. Accordingly, the embodiment of the invention provides a headset-based method and device for suppressing wind noise and environmental noise. In one possible implementation, the method and device can be applied both to real-time voice and audio communication systems and to non-real-time speech enhancement and voice wake-up scenarios. A schematic block diagram of the headset-based wind noise and environmental noise suppression method is shown in fig. 1. The coherence function and the wind noise function mentioned in fig. 1 are, respectively, the complex coherence function and the wind noise judgment function described below.
It should be noted that, the method for suppressing wind noise and environmental noise based on the headset provided by the embodiment of the invention is applied to the headset, and the headset includes M microphones.
In one possible implementation, each of the M microphones on the headset receives a sound signal that includes at least a target voice signal, wind noise, and environmental noise. Because wind noise is prominent when the headset is worn outdoors, an outdoor application scenario is considered, and the sound signal x_i(n) received by the i-th microphone is assumed to be:
x_i(n) = s_i(n) + d_i(n) + w_i(n)    (1)
where s_i(n), d_i(n) and w_i(n) are the target voice signal (i.e., the speech of the wearer), the environmental noise, and the wind noise received by the i-th microphone, respectively; i = 1, 2, …, M. Fig. 2 shows a schematic diagram of wind noise and environmental noise in a sound signal received by a microphone. In practical applications, X_i(k, l) in formula (2) can be obtained by applying the short-time Fourier transform (STFT) to x_i(n); the frequency-domain expression is:
X_i(k, l) = S_i(k, l) + D_i(k, l) + W_i(k, l)    (2)
where X_i(k, l), S_i(k, l), D_i(k, l) and W_i(k, l) are the short-time spectra of x_i(n), s_i(n), d_i(n) and w_i(n), respectively, at the k-th frequency bin of the l-th frame. As an example, this patent uses a frame length of 512 points (32 ms) and a frame shift of 256 points (16 ms) at a sampling rate of 16 kHz.
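The framing parameters quoted above can be checked with a short sketch. The helper names below are hypothetical, and the sketch covers only the framing arithmetic that precedes the STFT (no window or transform is applied).

```python
def frame_duration_ms(num_samples, sample_rate_hz):
    """Duration in milliseconds of num_samples at the given sampling rate."""
    return 1000.0 * num_samples / sample_rate_hz


def bin_spacing_hz(frame_len, sample_rate_hz):
    """Frequency spacing between adjacent STFT bins for a frame_len-point transform."""
    return sample_rate_hz / frame_len


def split_into_frames(x, frame_len=512, hop=256):
    """Split a sample sequence into overlapping frames; a trailing partial frame is dropped."""
    frames = []
    start = 0
    while start + frame_len <= len(x):
        frames.append(x[start:start + frame_len])
        start += hop
    return frames


frame_ms = frame_duration_ms(512, 16000)   # frame length in ms
hop_ms = frame_duration_ms(256, 16000)     # frame shift in ms
spacing = bin_spacing_hz(512, 16000)       # Hz per frequency bin
frames = split_into_frames([0.0] * 1024)   # 1024 samples of silence
```

With these values each frame spans 32 ms, consecutive frames overlap by half, and adjacent frequency bins k are 31.25 Hz apart.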
In practical application scenarios, wind noise energy is concentrated at low frequencies and is highly random, so the magnitude of the complex coherence function is low for wind noise and high for the target voice signal; this difference can be used for wind noise detection. Therefore, complex coherence function analysis and sound source localization angle analysis are performed on the sound signals received by each microphone pair, and a wind noise judgment function in one-to-one correspondence with each microphone pair is constructed from the analysis results. It should be noted that, in the embodiment of the present invention, each microphone pair is composed of two different microphones, and a plurality of microphone pairs are used to make a joint decision.
First, the complex coherence function analysis of the two sound signals received by each microphone pair is described. In the embodiment of the invention, the first-order smoothed auto-power spectra of the two microphones in each microphone pair and the first-order smoothed cross-power spectrum between the two microphones are determined according to the two sound signals received by the microphone pair and a smoothing factor; the complex coherence function between the sound signals of each microphone pair is then determined from these auto-power spectra and the cross-power spectrum. As a preferred embodiment of the present invention, the complex coherence function between the sound signals received by the two microphones in each microphone pair is defined as:
Figure SMS_1
where
Figure SMS_2
is the first-order smoothed power spectrum, γ is the smoothing factor, X_i(k, l) is the frequency-domain signal of the received sound signal, and i and j denote the i-th and j-th microphones, respectively. Fig. 3(a) and Fig. 3(b) show the spectrograms of the sound signals received by two microphones on the same side of the headset as the wind noise gradually increases from small to large. As can be seen from the figures, the greater the wind noise (i.e., the darker the color in the figure), the stronger the coherence and the greater the coverage bandwidth. Fig. 3(c) shows the corresponding magnitude spectrum of the complex coherence function.
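Since the defining equations are not reproduced in this text, the following is only a sketch of one common way to compute a complex coherence function from first-order recursively smoothed spectra. The recursion form, the class name `PairCoherence`, the default value of the smoothing factor `gamma`, and the small constant `eps` are all assumptions rather than the patent's exact definition.

```python
import math


class PairCoherence:
    """Running complex coherence estimate for one microphone pair (illustrative sketch)."""

    def __init__(self, num_bins, gamma=0.9, eps=1e-12):
        self.gamma = gamma  # smoothing factor (value assumed, not from the patent)
        self.eps = eps      # guards against division by zero
        self.phi_ii = [0.0] * num_bins  # smoothed auto-power spectrum, microphone i
        self.phi_jj = [0.0] * num_bins  # smoothed auto-power spectrum, microphone j
        self.phi_ij = [0j] * num_bins   # smoothed cross-power spectrum

    def update(self, X_i, X_j):
        """Consume one frame of spectra and return the complex coherence per bin."""
        g = self.gamma
        coherence = []
        for k, (xi, xj) in enumerate(zip(X_i, X_j)):
            xi, xj = complex(xi), complex(xj)
            self.phi_ii[k] = g * self.phi_ii[k] + (1 - g) * abs(xi) ** 2
            self.phi_jj[k] = g * self.phi_jj[k] + (1 - g) * abs(xj) ** 2
            self.phi_ij[k] = g * self.phi_ij[k] + (1 - g) * xi * xj.conjugate()
            denom = math.sqrt(self.phi_ii[k] * self.phi_jj[k]) + self.eps
            coherence.append(self.phi_ij[k] / denom)
        return coherence


# Identical spectra on both microphones give a coherence magnitude close to 1.
same = PairCoherence(num_bins=2)
coh_same = same.update([1 + 2j, 3 - 1j], [1 + 2j, 3 - 1j])

# A phase flip between frames lowers the smoothed coherence magnitude.
flip = PairCoherence(num_bins=1, gamma=0.5)
flip.update([1.0], [1.0])
coh_flip = flip.update([1.0], [-1.0])
```

The second example illustrates why a signal whose inter-microphone phase changes from frame to frame, as wind noise does, ends up with a low coherence magnitude after smoothing.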
Next, sound source localization angle analysis is performed on the two sound signals received by each microphone pair. In a headset application scenario, the position of the mouth relative to the microphones is essentially fixed, so the target voice signal is a typical directional sound source. The time delay difference between the two microphones, estimated with a generalized cross-correlation (GCC) time delay estimation method, is therefore relatively fixed, and the sound source localization angle is continuous from frame to frame. Because the two sound signals received by a microphone pair differ in phase, the delay can be estimated with the GCC algorithm, and the sound source localization angle can be obtained from the delay. In contrast, wind noise is not a directional sound source and is highly random, so its sound source localization angle is not continuous from frame to frame. Meanwhile, the wind noise intensity affects the wind noise bandwidth: the stronger the wind noise, the higher the upper limit of its bandwidth. Based on these characteristics, the embodiment of the invention determines the estimated sound source localization angle according to the phase difference between the two sound signals received by the microphone pair, and determines the sound source localization angle deviation according to the estimated angle and an a priori preset angle. As a preferred embodiment of the present invention, the sound source localization angle deviation characteristic function (as shown in fig. 3(d)) of the two sound signals received by the microphone pair is defined as:
Figure SMS_3
where B is the number of subbands, θ(b, l) is the estimated incidence angle of the sound source obtained by the time delay estimation method in the b-th subband of the l-th frame, and
Figure SMS_4
is the a priori preset angle. In the embodiment of the invention, considering that wind noise mainly affects the 0-4 kHz range, this range is divided into 4 subbands of 1 kHz bandwidth each for processing.
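Because the defining equation is not reproduced in this text, the following is only a sketch of how a per-frame angle deviation feature could be formed from subband angle estimates. Two assumptions are made: the delay is converted to an angle with a far-field model, and the deviation is the mean absolute difference from the prior angle across the B subbands. The function names, the microphone spacing, and the example angles are hypothetical.

```python
import math


def delay_to_angle_deg(tau_s, mic_distance_m, speed_of_sound=343.0):
    """Far-field incidence angle (degrees) from an inter-microphone delay in seconds."""
    ratio = speed_of_sound * tau_s / mic_distance_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against estimation noise
    return math.degrees(math.acos(ratio))


def angle_deviation(theta_deg, prior_deg):
    """Mean absolute deviation of the subband angle estimates from the prior angle."""
    return sum(abs(t - prior_deg) for t in theta_deg) / len(theta_deg)


# Zero delay corresponds to the direction perpendicular to the microphone axis.
broadside = delay_to_angle_deg(0.0, 0.15)
# A delay equal to the acoustic travel time between the microphones is endfire.
endfire = delay_to_angle_deg(0.15 / 343.0, 0.15)

# Four 1 kHz subbands, with a hypothetical prior angle of 90 degrees.
steady = angle_deviation([90.0, 90.0, 90.0, 90.0], 90.0)
scattered = angle_deviation([80.0, 100.0, 90.0, 70.0], 90.0)
```

A directional source such as the wearer's voice keeps the subband estimates near the prior angle, while randomly scattered estimates produce a larger deviation.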
In the embodiment of the invention, the GCC algorithm can be used to estimate the time delay difference and the sound source incidence angle between the sound signals received by each microphone pair. As shown in the schematic diagram of the microphone array configuration in fig. 4, the microphones Mic1 and Mic2 on opposite sides at the front form microphone pair 3, and the microphones Mic3 and Mic4 on opposite sides at the rear form microphone pair 4; for these pairs the a priori preset angle is the vertical direction, namely
Figure SMS_5
degrees. The microphones Mic1 and Mic3 on the same side on the left form microphone pair 1, and the microphones Mic2 and Mic4 on the same side on the right form microphone pair 2; for these pairs the a priori preset angle is the endfire direction, namely
Figure SMS_6
degrees. It can be appreciated that, in practical applications, because different people wear the headset differently, the actual
Figure SMS_7
may deviate, and thus
Figure SMS_8
can be calibrated online. In order to ensure the continuity of the sound source localization results and the ability to track wind noise quickly, the embodiment of the invention applies a "fast rise, slow fall" method to smooth the sound source localization angle deviation characteristic function F_bia(l), specifically:
Figure SMS_9
where α and β are the first and second smoothing factors, respectively; in one possible implementation, α = 0.5 and β = 0.05. Fig. 3(d) shows the smoothed sound source localization angle deviation characteristic function F_bia_sm(l) for wind noise of different intensities. The results show that, as the wind noise increases, the smoothed sound source localization angle deviation characteristic function also gradually increases.
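A "fast rise, slow fall" smoother with these two factors can be sketched as below. The exact recursion is not reproduced in this text, so the form used here (α weights the new value when the feature rises, β when it falls) is an assumption, chosen because α = 0.5 then tracks increases quickly and β = 0.05 decays slowly. The function name is hypothetical.

```python
def smooth_fast_rise_slow_fall(f_bia, prev_sm, alpha=0.5, beta=0.05):
    """One smoothing step of the angle deviation feature.

    f_bia is the current raw feature value and prev_sm the previous smoothed value.
    Assumed form: move toward the new value by alpha when rising, by beta when falling.
    """
    coeff = alpha if f_bia > prev_sm else beta
    return prev_sm + coeff * (f_bia - prev_sm)


# Rising input: the smoothed value jumps halfway toward the new value.
rise = smooth_fast_rise_slow_fall(1.0, 0.0)
# Falling input: the smoothed value decays by only 5 percent of the gap.
fall = smooth_fast_rise_slow_fall(0.0, 1.0)
```

The asymmetry lets the feature respond within a few frames when a gust starts while avoiding rapid toggling when the wind briefly drops.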
In summary, as a preferred embodiment of the present invention, a wind noise judgment function F(l) that characterizes the wind noise intensity in the two sound signals received by a microphone pair is constructed according to the complex coherence function analysis result and the sound source localization angle analysis result (i.e., according to the complex coherence function and the smoothed sound source localization angle deviation characteristic function):
F(l) = F_C(l) · F_bia_sm(l)    (6)
where F_C(l) is a function based on the complex coherence function magnitude spectrum. In one possible implementation, F_C(l) is expressed as:
Figure SMS_10
where
Figure SMS_11
k_L and k_H respectively denote the frequency bins corresponding to the lower and upper boundaries of the frequency band in which wind noise may occur; in one possible implementation, k_L and k_H are the frequency bins corresponding to 300 Hz and 3 kHz, respectively. Fig. 3 (e) shows the wind noise judgment function F(l) of the microphone pair formed by Mic1 and Mic3. The result shows that F(l) increases markedly as the wind noise strengthens, so the relative wind noise intensity at the selected microphone pair can be identified effectively. Here, the abscissa of figs. 3 (a) to 3 (e) represents time in seconds.
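As a rough illustration of formula (6), the Python sketch below combines a band-limited coherence feature with the smoothed angle deviation feature. The expression for F_C(l) is not reproduced in this text; the sketch ASSUMES that F_C(l) is one minus the mean coherence magnitude over the bins from k_L to k_H, on the grounds that wind noise is weakly coherent between microphones. The 16 kHz sampling rate, the 512-point FFT and all function names are likewise assumptions.

```python
def freq_to_bin(freq_hz, sample_rate, n_fft):
    """Map a frequency in Hz to the nearest FFT bin index."""
    return int(round(freq_hz * n_fft / sample_rate))


def wind_noise_function(coh_mag, f_bia_sm, sample_rate=16000, n_fft=512,
                        f_low=300.0, f_high=3000.0):
    """Return F(l) = F_C(l) * F_bia_sm(l) for one frame.

    coh_mag:  magnitudes of the complex coherence, one value per FFT bin.
    f_bia_sm: smoothed angle deviation feature for the same frame.
    F_C is an ASSUMED form: 1 - mean coherence magnitude over [k_L, k_H].
    """
    k_low = freq_to_bin(f_low, sample_rate, n_fft)
    k_high = freq_to_bin(f_high, sample_rate, n_fft)
    band = coh_mag[k_low:k_high + 1]
    f_c = 1.0 - sum(band) / len(band)
    return f_c * f_bia_sm
```

Under this assumption a fully coherent frame (speech only) gives F(l) close to 0, while an incoherent frame with a large angle deviation gives a value close to the angle feature itself.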
In an outdoor environment, because of head shadowing and differences in wind direction and strength, the energy and frequency band distribution of wind noise differ greatly among the sound signals received by the microphones. In contrast, the low- and mid-frequency energy of the target voice signal is essentially the same at all microphones, while its high-frequency energy is affected by head shadowing and is partially attenuated relative to the low and mid frequencies. In addition, a microphone close to the mouth receives stronger high-frequency energy than a microphone far from the mouth. Fig. 5 (a) shows the spectrogram of the sound signal received by the left microphone when wind noise is incident from the left, and fig. 5 (b) shows the spectrogram of the sound signal received by the right microphone under the same condition; the dark regions in fig. 5 (a) indicate particularly strong wind noise energy. It can be seen that the wind noise energy at the left microphone is significantly stronger than at the right microphone, and the wind noise bandwidth at the left microphone is also significantly wider. Similar differences appear in the spectrograms of the microphones when wind noise is incident from the right, the front or the rear, and are not described in detail in the embodiments of the present invention. Based on this analysis, the wind noise direction can be judged from the relationship among the wind noise judgment functions of the microphone pairs, and the upper limit of the wind noise bandwidth in the sound signals received by the corresponding microphone pair can be judged from the magnitude of the first wind noise judgment function. As can be seen from figs. 5 (a) and 5 (b), the wind noise in the sound signal received by the left microphone occupies a wider frequency band than that in the sound signal received by the right microphone. The wider band and darker color in fig. 5 (a) mean that the wind noise is incident from the direction in which that microphone pair is located, i.e., from the left. In this case, the sound signals received by that microphone pair are not used.
Based on the above analysis, directly selecting the microphone pair with less wind noise effectively reduces the influence of wind noise. The embodiment of the invention therefore provides a method for detecting the microphone pair with less wind noise and judging the corresponding wind noise bandwidth. The specific steps are as follows:
1) As a preferred embodiment of the present invention, taking the aforementioned M = 4 as an example, formula (6) is used to calculate the wind noise judgment function F(l) of each of the 4 microphone pairs (left, right, front and back) shown in fig. 4 according to the microphone position distribution, where the wind noise judgment function of microphone pair Mic1, Mic3 is denoted F_L(l), that of microphone pair Mic2, Mic4 is denoted F_R(l), that of microphone pair Mic1, Mic2 is denoted F_F(l), and that of microphone pair Mic3, Mic4 is denoted F_B(l).
2) As a preferred embodiment of the present invention, the minimum of F_L(l), F_R(l), F_F(l) and F_B(l) is taken as the first wind noise judgment function, denoted F_min(l), indicating that wind noise is smallest in that direction. When F_min(l) is smaller than a first preset threshold, the headset is considered to be in a windless or small wind noise state; when F_min(l) is greater than or equal to a second preset threshold, it is considered to be in a large wind noise state, in which the wind noise at all microphones is strong and wide-band; when F_min(l) lies between the first and second preset thresholds, the larger F_min(l) is, the greater the wind noise energy and the wider its frequency band. In the embodiment of the present invention, typical values of the first and second preset thresholds are 0.1 and 0.4. According to actual test results, when F_min(l) is not more than 0.1, the upper limit of the wind noise frequency band is below 1 kHz; when F_min(l) is not more than 0.4, the upper limit is below 3 kHz; when F_min(l) is greater than 0.4, the upper limit exceeds 3 kHz and can reach 4 kHz or more.
3) In practical applications, wind noise is highly uncertain and random, and its bandwidth can change greatly over time, which makes F_min(l) fluctuate strongly; a long-term smoothed statistic can therefore be used to obtain an approximate wind noise bandwidth. To prevent repeated switching within a short time when channels are switched (i.e., when a microphone pair is selected) according to F_min(l), a long-term smoothed statistic is also needed to assist the decision: the channel is switched according to F_min(l) only when the long-term smoothed statistic changes steadily. One possible implementation is as follows: the microphone pair indicated by F_min(l) is recorded for every frame in the past 0.5 s; if more than 80% of these frames indicate the same microphone pair and that pair differs from the one currently in use, the channel is switched. A specific example is shown in figs. 6 (a) and 6 (b).
As shown in fig. 6 (a), the index of the microphone pair corresponding to the instantaneous minimum F_min(l) fluctuates during the first 11 frames: it is pair 1 most of the time and occasionally jumps to pair 2, so the corresponding output microphone pair (fig. 6 (b)) remains pair 1. After frame 11 the instantaneous microphone pair changes steadily, and the headset detects that it stays at pair 2. As shown in fig. 6 (b), about 0.4 s after frame 11 (i.e., from frame 36), the output microphone pair is switched from pair 1 to pair 2.
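The selection and switching logic of steps 1) to 3) can be sketched in Python as follows. This is a minimal illustration, not the implementation of the embodiment: the class name is hypothetical, pairs are indexed from 0, and the window of 31 frames assumes a 16 ms frame hop (0.5 s / 16 ms), a hop that is not stated in the text but is consistent with the 25 frames for about 0.4 s in the example above.

```python
from collections import Counter, deque


class PairSelector:
    """Switch the output microphone pair only when the instantaneous choice is stable."""

    def __init__(self, window_frames=31, ratio=0.8, initial_pair=0):
        # window_frames is an ASSUMPTION: about 0.5 s at a 16 ms frame hop.
        self.history = deque(maxlen=window_frames)
        self.ratio = ratio
        self.current = initial_pair

    def update(self, f_values):
        """f_values: wind noise judgment values of one frame, one per microphone pair.

        Returns (F_min, index of the microphone pair currently used for output).
        """
        f_min = min(f_values)
        self.history.append(f_values.index(f_min))
        if len(self.history) == self.history.maxlen:
            pair, count = Counter(self.history).most_common(1)[0]
            # Switch only if more than `ratio` of recent frames agree on a new pair.
            if pair != self.current and count > self.ratio * len(self.history):
                self.current = pair
        return f_min, self.current
```

With these settings a new pair must win at least 25 of the last 31 frames before the output switches, so occasional jumps of the instantaneous minimum do not change the channel.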
Figs. 7 (a) and 7 (b) respectively show the spectrograms of the sound signals received by the left and right microphones when the wind noise changes from left incidence to right incidence. Figs. 7 (c) and 7 (d) respectively show the wind noise judgment functions of the left and right microphone pairs. Figs. 7 (e) and 7 (f) respectively show the minimum wind noise judgment function F_min(l) of the four microphone pairs and the corresponding microphone pair. In fig. 7 (f), a value of 1 indicates that the left microphone pair data are used and a value of 0 indicates that the right microphone pair data are used; a delay of about 0.5 s is required to avoid misjudgment. The results show that, as the wind noise direction changes, the wind noise judgment function effectively identifies the microphone pair with the least wind noise. Subsequent processing selects the microphone pair with the least wind noise for beamforming. Here, the abscissa of figs. 7 (a) to 7 (f) represents time in seconds.
In practical headset application scenarios, wind noise and environmental noise often exist at the same time. Common environmental noise suppression methods include adaptive beamforming, fixed beamforming and post-filtering. However, wind noise degrades the performance of beamforming and post-filtering; in particular, when a microphone with strong wind noise is used for adaptive filtering, the frequency bins dominated by strong wind noise easily cause the filter coefficients to diverge, which impairs filtering performance and can even degrade the voice quality of the processed signal. To solve these problems, an embodiment of the present invention provides a beamforming method based on the wind noise judgment function. According to the result of comparing F_min(l) with the first and second preset thresholds, a beamformer pointing to the human mouth is constructed, and the update of the beamformer control parameters is controlled by F_min(l), so that both environmental noise and wind noise are suppressed effectively. As shown in fig. 8, the beamforming method based on the wind noise judgment function specifically comprises the following steps:
1) When F_min(l) is smaller than the first preset threshold, the wind noise at the microphone pair corresponding to F_min(l) is small; a first beamformer pointing to the human mouth is constructed for that microphone pair, and the control parameters of the first beamformer are determined. Typical beamformers include an endfire beamformer formed by microphones Mic1 and Mic3 or a broadside beamformer formed by microphones Mic1 and Mic2. Taking the endfire beamformer of Mic1 and Mic3 as an example, since the wind noise is small, an adaptive beamformer is used with a fixed step size larger than a preset step-size threshold, which eliminates environmental noise effectively. A typical update expression for an adaptive beamformer based on the generalized sidelobe canceller (GSC) is:
Figure SMS_12
where u_const is a fixed step-size control parameter, which can be set to 0.2 in practice; P_X(k, l) and P_E(k, l) are, respectively, the power spectrum of the main microphone signal X_1(k, l) (the main microphone being the microphone of the selected pair that is closer to the mouth) and the power spectrum of the residual signal E(k, l) at the k-th frequency bin of the l-th frame; the residual signal is E(k, l) = X_1(k, l) - W(k, l-1)·R(k, l); and R(k, l) is the reference signal at the k-th frequency bin of the l-th frame. The target voice from the human mouth should be removed from the reference signal as far as possible; here the difference between the two sound signals after phase compensation is used as the reference signal, specifically:
Figure SMS_13
where c is the speed of sound, f_s is the sampling rate, N_FFT is the length of the fast Fourier transform (FFT), and τ_0 is the delay difference to be compensated, calculated from the position of the human mouth relative to Mic1 and Mic3. Because the wind noise is small in this case, formula (8) is applied over the frequency range 0-4 kHz. The received sound signal is filtered with the filter W(k, l) obtained in this scenario, giving the filter output Y(k, l) = X_1(k, l) - W(k, l)·R(k, l).
2) When F_min(l) is greater than or equal to the first preset threshold and smaller than the second preset threshold, the wind noise at the microphone pair corresponding to F_min(l) is moderate; a second beamformer pointing to the human mouth is constructed for that microphone pair, and the control parameters of the second beamformer are determined. At this level the wind noise already affects voice call quality noticeably, so noise reduction is required to ensure voice quality. Specifically, an adaptive beamformer can be used, with a variable step size in the frequency band strongly affected by wind noise and a fixed step size larger than the preset step-size threshold in the frequency band less affected by wind noise. As a preferred embodiment of the present invention, a typical update expression for an adaptive beamformer based on the generalized sidelobe canceller (GSC) is:
Figure SMS_14
Wherein k is L And k 4000 The frequency points are respectively corresponding to the demarcation frequency points and the 4 kHz. u (F) min (l) Step length control parameter, the value is generally controlled between 0.01 and 0.2, F min (l) The larger the u (F) min (l) The smaller the wind noise, the smaller the step size, and the slower the filter update speed. P (P) X (k, l) and P E (k, l) are the signals X of the principal microphone (i.e. the microphone close to the mouth of the selected microphone pair) corresponding to the kth frequency point of the first frame 1 Power spectrum of (k, l) and power spectrum of residual signal E (k, l), residual signal E (k, l) =x 1 (k, l) -W (k, l-1). R (k, l), R (k, l) is a reference signal corresponding to the kth frequency point of the first frame, and is defined as the same as the formula (9). At the same time F min (l) Also determines the frequency range using equation (10), F min (l) The smaller k L (F min (l) The smaller). The filter designed by utilizing the thought can quickly update the filter coefficient at the frequency point with small wind noise, thereby effectively suppressingMaking environmental noise; the filter coefficient is updated slowly or is not updated basically at the frequency point with large wind noise, so that the condition that the filter performance is reduced or even the voice quality is deteriorated due to the divergence of the filter is avoided; at the middle-high frequency point which is not affected by wind noise, the filter coefficient is kept to be updated in a faster fixed step length, and the environmental noise can be effectively eliminated. Specifically, F min (l) And u (F) min (l) Corresponding relation of F) min (l) And k is equal to L (F min (l) The correspondence relation of the above is adjustable according to the actual scene.
In the embodiment of the present invention, F min (l) And u (F) min (l) Corresponding relation of F) min (l) And k is equal to L The corresponding relation of (2) is:
Figure SMS_15
Figure SMS_16
wherein k is 1000 And k 3000 Representing the frequency points corresponding to 1kHz and 3kHz respectively.
The received sound signal is filtered by the filter w (k, l) obtained by the scene to obtain a filtered output Y (k, l) =x 1 (k,l)-W(k,l)·R(k,l)。
3) When F_min(l) is greater than or equal to the second preset threshold, the wind noise at the microphone pair corresponding to F_min(l) is large. In this case the wind noise in the sound signals received by all 4 microphone pairs is strong, and the beam performance of a single microphone pair can hardly meet application requirements. Fixed beamforming using the 4 microphone pairs is therefore required, forming a third beamformer pointing to the human mouth. Because an adaptive beamformer diverges easily under large wind noise, the embodiment of the present invention uses a fixed beamformer for speech enhancement. The control parameters of the fixed beamformer are fixed and do not depend on the received sound signals. The filter coefficients can be written as:
Figure SMS_17
where f_s is the sampling rate, N_FFT is the length of the fast Fourier transform (FFT), and τ_1, τ_2, ..., τ_M are the relative delays from the human mouth to each microphone, which can be set in advance from the relative positions of the microphones. The final beamforming output is Y(k, l) = w^H(k, l)·X(k, l), where X(k, l) = [X_1(k, l), X_2(k, l), ..., X_M(k, l)]^T is the received signal vector formed from the sound signals received by all microphones. Finally, the beamforming output signal of the adaptive or fixed beamformer is further denoised with a single-channel speech enhancement algorithm, and the enhanced voice signal y(t) is obtained after the inverse short-time Fourier transform (ISTFT) and overlap-add. Figs. 9 (a)-9 (e) show, in order, the voice signals received by the front right, rear right, front left and rear left microphones and the signal processed by the algorithm, under wind noise of different directions and intensities. Wind noise is incident from the front during the first 10 s, from the left during 10-20 s, from the right during 20-30 s, and from the rear during 30-40 s. The abscissa of figs. 9 (a) to 9 (e) represents time in seconds. As can be seen from figs. 9 (a)-9 (e), the headset-based wind noise and environmental noise suppression method of the embodiments of the present invention effectively suppresses wind noise under different wind noise conditions while retaining the target voice signal.
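The fixed beamformer coefficients are given only as a figure in this text. The Python sketch below therefore substitutes a standard delay-and-sum beamformer built from the same quantities (f_s, N_FFT and the delays τ_1 to τ_M) and applies it as Y = w^H·X for one frequency bin. The sign convention, the 1/M normalization and the function names are assumptions.

```python
import cmath
import math


def delay_and_sum_weights(k, delays, sample_rate, n_fft):
    """Delay-and-sum weights for frequency bin k (ASSUMED form of the fixed beamformer).

    delays: relative delays in seconds from the mouth to each microphone.
    """
    omega = 2.0 * math.pi * k * sample_rate / n_fft  # angular frequency of bin k
    m = len(delays)
    return [cmath.exp(-1j * omega * tau) / m for tau in delays]


def apply_beamformer(weights, x):
    """Compute Y = w^H x for one frequency bin of one frame."""
    return sum(w.conjugate() * x_m for w, x_m in zip(weights, x))
```

Because the weights depend only on the geometry, they can be computed once per bin and reused for every frame, which is why this beamformer cannot diverge under strong wind noise.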
It should be noted that the invention can be used directly to control wind noise and environmental noise suppression during calls, and can also provide effective support for controlling the feedback filter in an active noise reduction earphone. The wind noise judgment function provided by the invention detects the wind noise intensity at each microphone pair. When wind noise is strong, the filter coefficients in the active noise reduction earphone are not updated or are updated slowly; when wind noise is weak, they are updated quickly.
Fig. 10 shows a flowchart of a method for suppressing wind noise and environmental noise based on a headphone according to an embodiment of the present invention, where the flowchart includes: S1001-S1004; the method is applied to a headset comprising M microphones.
S1001, receiving M sound signals; wherein the sound signal includes at least a target speech signal, wind noise, and ambient noise.
In an embodiment of the present invention, the M microphones on the headset each receive a sound signal. The sound signal includes at least a target voice signal (i.e., the sound of a person speaking), wind noise, and environmental noise.
S1002, performing complex coherence function analysis and sound source localization angle analysis on the sound signals received by each microphone pair, and constructing a wind noise judgment function in one-to-one correspondence with each microphone pair according to the complex coherence function analysis result and the sound source localization angle analysis result; wherein each microphone pair consists of two different microphones, and a plurality of microphone pairs are used for joint decision.
In the embodiment of the invention, there is coherence between the two sound signals received by each microphone pair, and complex coherence function analysis and sound source localization angle analysis are performed on these two signals. Specifically, the first-order smoothed auto-power spectra of the two microphones in each pair and the first-order smoothed cross-power spectrum between them are determined from the received sound signals and the smoothing factor; the complex coherence function between the sound signals received by each microphone pair is determined from these auto- and cross-power spectra; the estimated sound source localization angle is determined from the phase difference between the sound signals received by each microphone pair; and the sound source localization angle deviation is determined from the estimated angle and the a priori preset angle. Then, to further reduce the deviation, the sound source localization angle deviation is smoothed. Finally, the wind noise judgment functions in one-to-one correspondence with the M microphone pairs are constructed from the complex coherence function and the smoothed sound source localization angle deviation.
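The spectral smoothing and coherence computation described above can be sketched in Python as follows. This is a minimal illustration using the textbook definition of complex coherence (cross-power spectrum divided by the geometric mean of the auto-power spectra); the class name, the smoothing factor of 0.8 and the small regularization constant are assumptions not taken from the embodiment.

```python
import cmath
import math


class CoherenceTracker:
    """First-order smoothed auto/cross power spectra for one microphone pair."""

    def __init__(self, n_bins, lam=0.8):
        self.lam = lam  # ASSUMED smoothing factor
        self.p11 = [0.0] * n_bins
        self.p22 = [0.0] * n_bins
        self.p12 = [0j] * n_bins

    def update(self, x1, x2, eps=1e-12):
        """x1, x2: complex STFT bins of the two microphones for one frame.

        Returns the complex coherence for every bin of this frame.
        """
        lam = self.lam
        gamma = []
        for k, (a, b) in enumerate(zip(x1, x2)):
            self.p11[k] = lam * self.p11[k] + (1.0 - lam) * abs(a) ** 2
            self.p22[k] = lam * self.p22[k] + (1.0 - lam) * abs(b) ** 2
            self.p12[k] = lam * self.p12[k] + (1.0 - lam) * a * b.conjugate()
            denom = math.sqrt(self.p11[k] * self.p22[k]) + eps
            gamma.append(self.p12[k] / denom)
        return gamma


def phase_difference(gamma):
    """Per-bin phase difference between the two signals, in radians."""
    return [cmath.phase(g) for g in gamma]
```

The magnitude of the returned coherence feeds the coherence-based feature, while the per-bin phase difference is the quantity from which a delay, and hence an incidence angle, can be estimated.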
S1003, determining, among the wind noise judgment functions of all microphone pairs, the wind noise judgment function with the smallest function value, and taking it as a first wind noise judgment function.
In the embodiment of the present invention, a wind noise judgment function with the smallest function value among the wind noise judgment functions corresponding to each microphone pair constructed in S1002 is determined as a first wind noise judgment function.
S1004, determining the wind noise level according to the first wind noise judgment function, determining the microphone pair to be used and the corresponding beamformer and control parameters according to the wind noise level, and suppressing wind noise and environmental noise in the sound signal to obtain a beamforming output signal.
In the embodiment of the invention, the wind noise level is determined according to the first wind noise judgment function, the microphone pair to be used and the corresponding beamformer and control parameters are determined according to the wind noise level, and the wind noise and environmental noise in the sound signals received in S1001 are suppressed to obtain the beamforming output signal. Specifically, when the value of the first wind noise judgment function is smaller than a first preset threshold, the wind noise in the sound signals received by the corresponding microphone pair is determined to be small; that microphone pair is selected, a first beamformer pointing to the human mouth is formed, and the control parameters of the first beamformer are determined. When the value is greater than or equal to the first preset threshold and smaller than a second preset threshold, the wind noise in the sound signals received by the corresponding microphone pair is determined to be moderate; that microphone pair is selected, a second beamformer pointing to the human mouth is formed, and the control parameters of the second beamformer are determined. When the value is greater than or equal to the second preset threshold, the wind noise in the sound signals received by the corresponding microphone pair is determined to be large, and fixed beamforming is performed using all microphones to form a third beamformer pointing to the human mouth.
As a preferred embodiment of the present invention, the first beamformer is an adaptive beamformer with a fixed step size greater than a preset step-size threshold; the second beamformer is an adaptive beamformer whose step size changes dynamically; and the third beamformer is a fixed beamformer.
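The step-size behaviour of the first and second beamformers can be sketched in Python as below. The mappings from F_min(l) to the step size and to the boundary frequency are given only as figures in this text, so the sketch ASSUMES linear interpolation between the stated end points (step size 0.2 to 0.01, boundary 1 kHz to 3 kHz, thresholds 0.1 and 0.4). It also swaps in a plain NLMS update normalized by the reference power instead of the P_X and P_E based normalization of the embodiment. All function names are hypothetical.

```python
def step_size(f_min, t1=0.1, t2=0.4, u_max=0.2, u_min=0.01):
    """Step size that shrinks as F_min grows (ASSUMED linear mapping)."""
    if f_min < t1:
        return u_max
    if f_min >= t2:
        return u_min
    frac = (f_min - t1) / (t2 - t1)
    return u_max + frac * (u_min - u_max)


def boundary_hz(f_min, t1=0.1, t2=0.4, f_lo=1000.0, f_hi=3000.0):
    """Upper edge of the wind-affected band (ASSUMED linear mapping)."""
    frac = min(max((f_min - t1) / (t2 - t1), 0.0), 1.0)
    return f_lo + frac * (f_hi - f_lo)


def bin_step(freq_hz, f_min, u_const=0.2):
    """Variable step below the boundary frequency, fixed step above it."""
    if freq_hz < boundary_hz(f_min):
        return step_size(f_min)
    return u_const


def nlms_update(w_prev, x1, r, mu, eps=1e-12):
    """One per-bin NLMS step using the residual E = X1 - W(l-1) * R.

    Normalization by |R|^2 is a swapped-in textbook choice, not the
    normalization of the embodiment.
    """
    e = x1 - w_prev * r
    w = w_prev + mu * e * r.conjugate() / (abs(r) ** 2 + eps)
    return w, e
```

When F_min(l) is below the first threshold every bin uses the fixed step 0.2, which reproduces the behaviour of the first beamformer; as F_min(l) rises, the low-frequency bins slow down first while the bins above the boundary keep the fast fixed step.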
To further enhance the voice signal, a single-channel voice enhancement process is performed on the beamformed output signal obtained in S1004.
The wind noise direction is determined according to the wind noise determination function relation of each microphone pair, and the wind noise bandwidth upper limit of the sound signal received by the corresponding microphone pair is determined according to the magnitude of the first wind noise determination function.
Fig. 11 shows a schematic structural diagram of a headset-based wind noise and environmental noise suppression device according to an embodiment of the present invention, where the schematic structural diagram includes: a signal receiving module 1101, a function constructing module 1102, a function determining module 1103, an output obtaining module 1104; the device is located in a headset comprising M microphones.
A signal receiving module 1101 for receiving M sound signals; wherein the sound signal at least comprises a target voice signal, wind noise and environmental noise;
The function construction module 1102 is configured to perform complex coherence function analysis and sound source localization angle analysis on the sound signals received by each microphone pair, and to construct a wind noise judgment function in one-to-one correspondence with each microphone pair according to the complex coherence function analysis result and the sound source localization angle analysis result; wherein each microphone pair consists of two different microphones, and a plurality of microphone pairs are used for joint decision;
the function determining module 1103 is configured to determine, among the wind noise judgment functions of all microphone pairs, the wind noise judgment function with the smallest function value, and to take it as a first wind noise judgment function;
and the output obtaining module 1104 is configured to determine the wind noise level according to the first wind noise judgment function, determine the microphone pair to be used and the corresponding beamformer and control parameters according to the wind noise level, and suppress wind noise and environmental noise in the sound signal to obtain a beamforming output signal.
The embodiment of the invention also provides a headset-based wind noise and environmental noise suppression device, which comprises at least one processor, wherein the processor is used for executing a program stored in a memory, and when the program is executed, the device is caused to execute the following steps:
Receiving M sound signals, wherein the sound signals include at least a target voice signal, wind noise and environmental noise; performing complex coherence function analysis and sound source localization angle analysis on the sound signals received by each microphone pair, and constructing a wind noise judgment function in one-to-one correspondence with each microphone pair according to the complex coherence function analysis result and the sound source localization angle analysis result, wherein each microphone pair consists of two different microphones, and a plurality of microphone pairs are used for joint decision; determining, among the wind noise judgment functions of all microphone pairs, the wind noise judgment function with the smallest function value, and taking it as a first wind noise judgment function; and determining the wind noise level according to the first wind noise judgment function, determining the microphone pair to be used and the corresponding beamformer and control parameters according to the wind noise level, and suppressing wind noise and environmental noise in the sound signal to obtain a beamforming output signal.
The embodiment of the invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the following steps when being executed by a processor:
Receiving M sound signals, wherein the sound signals include at least a target voice signal, wind noise and environmental noise; performing complex coherence function analysis and sound source localization angle analysis on the sound signals received by each microphone pair, and constructing a wind noise judgment function in one-to-one correspondence with each microphone pair according to the complex coherence function analysis result and the sound source localization angle analysis result, wherein each microphone pair consists of two different microphones, and a plurality of microphone pairs are used for joint decision; determining, among the wind noise judgment functions of all microphone pairs, the wind noise judgment function with the smallest function value, and taking it as a first wind noise judgment function; and determining the wind noise level according to the first wind noise judgment function, determining the microphone pair to be used and the corresponding beamformer and control parameters according to the wind noise level, and suppressing wind noise and environmental noise in the sound signal to obtain a beamforming output signal.
It should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (6)

1. A headset-based wind noise suppression method, the method being applied to a headset, the headset including M microphones, comprising:
receiving M sound signals; wherein the sound signal at least comprises a target voice signal and wind noise;
performing complex coherence function analysis and sound source localization angle analysis on the sound signals received by each microphone pair, and constructing a wind noise judgment function in one-to-one correspondence with each microphone pair according to the complex coherence function analysis result and the sound source localization angle analysis result; wherein each microphone pair consists of two different microphones, and a plurality of microphone pairs are used for joint decision;
determining, among the wind noise judgment functions of all microphone pairs, the wind noise judgment function with the smallest function value, and taking it as a first wind noise judgment function;
determining the wind noise level according to the first wind noise judgment function, determining the microphone pair to be used and the corresponding beam forming device and control parameters according to the wind noise level, and suppressing wind noise in the sound signal to obtain a beam forming output signal;
wherein the wind noise judgment function is F(l) = F_C(l)·F_bia_sm(l), F_C(l) representing a function based on the magnitude spectrum of the complex coherence function,
Figure FDA0004233625420000011
Figure FDA0004233625420000012
Figure FDA0004233625420000013
is the complex coherence function between the sound signals received by the two microphones of each microphone pair, i and j denote the i-th and j-th microphones respectively, l and k denote the k-th frequency bin of the l-th frame of the sound signal, k_L and k_H respectively denote the frequency bins corresponding to the lower and upper boundaries of the frequency band in which wind noise may occur,
Figure FDA0004233625420000014
α and β are the first and second smoothing factors, respectively,
Figure FDA0004233625420000015
B is the number of sub-bands, θ(b, l) is the estimated angle of sound source incidence obtained by a delay estimation method for the b-th sub-band of the l-th frame, and
Figure FDA0004233625420000016
is the a priori preset angle;
wherein determining the wind noise level according to the first wind noise judging function, and determining the microphone pair to be used and the corresponding beamformer and control parameters according to the wind noise level, comprises:
when the function value of the first wind noise judging function is smaller than a first preset threshold, determining that the wind noise in the sound signals received by the microphone pair corresponding to the first wind noise judging function is small wind noise, selecting the corresponding microphone pair according to the first wind noise judging function, forming a first beamformer directed at the person's mouth, and determining control parameters of the first beamformer; or
when the function value of the first wind noise judging function is greater than or equal to the first preset threshold and smaller than a second preset threshold, determining that the wind noise in the sound signals received by the microphone pair corresponding to the first wind noise judging function is medium wind noise, selecting the corresponding microphone pair according to the first wind noise judging function, forming a second beamformer directed at the person's mouth, and determining control parameters of the second beamformer; or
when the function value of the first wind noise judging function is greater than or equal to the second preset threshold, determining that the wind noise in the sound signals received by the microphone pair corresponding to the first wind noise judging function is large wind noise, and performing fixed beamforming using all microphone pairs to form a third beamformer directed at the person's mouth.
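The selection logic recited in claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the threshold values `T1` and `T2`, the tier labels, the pair keys, and the function name `select_beamformer` are hypothetical, and the judging-function values are assumed to have been computed already for each microphone pair.

```python
from typing import Dict, List, Tuple

Pair = Tuple[int, int]

# Hypothetical thresholds; the claim only requires two preset thresholds with T1 < T2.
T1 = 0.3
T2 = 0.7


def select_beamformer(f_values: Dict[Pair, float],
                      t1: float = T1,
                      t2: float = T2) -> Tuple[str, str, List[Pair]]:
    """Map per-pair judging values to (wind tier, beamformer, pairs to use)."""
    if not f_values:
        raise ValueError("at least one microphone pair is required")
    # The "first" judging function is the one with the smallest value.
    best_pair = min(f_values, key=f_values.get)
    f_min = f_values[best_pair]
    if f_min < t1:
        # Small wind noise: first beamformer on the selected pair.
        return "small", "first", [best_pair]
    if f_min < t2:
        # Intermediate wind noise: second beamformer on the selected pair.
        return "medium", "second", [best_pair]
    # Large wind noise: fixed beamforming over all pairs.
    return "large", "third", sorted(f_values)
```

For example, `select_beamformer({(0, 1): 0.2, (0, 2): 0.5})` selects pair `(0, 1)` and the first beamformer, because its value is the smallest and lies below `T1`.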
2. The method of claim 1, wherein the first beamformer is an adaptive beamformer with a fixed step size greater than a preset step-size threshold; the second beamformer is an adaptive beamformer with a dynamically varying step size; and the third beamformer is a fixed beamformer.
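The three step-size regimes of claim 2 can be illustrated with a small helper. The claim does not say how the dynamic step size is computed, so the linear interpolation below is purely an assumption made for illustration; the function name `adaptation_step`, the default thresholds, and the step-size values are likewise hypothetical.

```python
def adaptation_step(mode: str,
                    f_value: float,
                    t1: float = 0.3,
                    t2: float = 0.7,
                    mu_large: float = 0.1,
                    mu_floor: float = 0.0) -> float:
    """Return an adaptive-filter step size for the chosen beamformer."""
    if mode == "first":
        # Fixed step size, assumed to exceed the preset step-size threshold.
        return mu_large
    if mode == "second":
        # Assumed rule (not from the claims): shrink the step linearly
        # as the judging value moves from t1 towards t2.
        frac = (t2 - f_value) / (t2 - t1)
        frac = min(1.0, max(0.0, frac))
        return mu_floor + frac * (mu_large - mu_floor)
    if mode == "third":
        # Fixed beamformer: no adaptation at all.
        return 0.0
    raise ValueError(f"unknown beamformer mode: {mode!r}")
```

Under this assumed rule the step size is continuous at the first threshold: a judging value just above `t1` yields a step close to `mu_large`, so switching between the first and second beamformers does not cause a jump in adaptation speed.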
3. The method according to claim 1 or 2, characterized in that the method further comprises:
determining the wind noise direction according to the relationship among the wind noise judging functions of the microphone pairs, and determining the upper limit of the wind noise bandwidth of the sound signals received by the corresponding microphone pair according to the magnitude of the first wind noise judging function.
4. A headset-based wind noise suppression device, the device being located in a headset, the headset comprising M microphones, comprising:
a signal receiving module, configured to receive M sound signals; wherein each sound signal comprises at least a target voice signal and wind noise;
a function construction module, configured to perform complex coherence function analysis and sound source localization angle analysis on the sound signals received by the microphones, and to construct, from the results of the complex coherence function analysis and the sound source localization angle analysis, wind noise judging functions in one-to-one correspondence with the microphone pairs; wherein each microphone pair consists of two different microphones, and a plurality of microphone pairs are used for joint judgment;
a function determining module, configured to determine, among the wind noise judging functions corresponding to all microphone pairs, the one with the smallest function value, and to take it as a first wind noise judging function;
an output obtaining module, configured to determine the wind noise level according to the first wind noise judging function, to determine the microphone pair to be used and the corresponding beamformer and control parameters according to the wind noise level, and to suppress the wind noise in the sound signals to obtain a beamforming output signal;
wherein the wind noise judging function is $F(l) = F_C(l) \cdot F_{bia\_sm}(l)$, where $F_C(l)$ denotes a function based on the magnitude spectrum of the complex coherence function:
[Equation image: Figure FDA0004233625420000021]
where the symbol shown in [Figure FDA0004233625420000022] is the complex coherence function between the sound signals received by the two microphones of each microphone pair; $i$ and $j$ denote the $i$-th and $j$-th microphones, respectively; $l$ and $k$ denote the $l$-th frame and the $k$-th frequency bin of the sound signal; $k_L$ and $k_H$ denote the frequency bins at the lower and upper boundaries, respectively, of the frequency band in which wind noise may occur;
[Equation image: Figure FDA0004233625420000023]
where $\alpha$ and $\beta$ are the first and second smoothing factors, respectively;
[Equation image: Figure FDA0004233625420000024]
where $B$ is the number of sub-bands; $\theta(b, l)$ is the estimated angle of incidence of the sound source for the $b$-th sub-band of the $l$-th frame, obtained by a delay estimation method; and the symbol shown in [Figure FDA0004233625420000025] is an a priori preset angle;
wherein determining the wind noise level according to the first wind noise judging function, and determining the microphone pair to be used and the corresponding beamformer and control parameters according to the wind noise level, comprises:
when the function value of the first wind noise judging function is smaller than a first preset threshold, determining that the wind noise in the sound signals received by the microphone pair corresponding to the first wind noise judging function is small wind noise, selecting the corresponding microphone pair according to the first wind noise judging function, forming a first beamformer directed at the person's mouth, and determining control parameters of the first beamformer; or
when the function value of the first wind noise judging function is greater than or equal to the first preset threshold and smaller than a second preset threshold, determining that the wind noise in the sound signals received by the microphone pair corresponding to the first wind noise judging function is medium wind noise, selecting the corresponding microphone pair according to the first wind noise judging function, forming a second beamformer directed at the person's mouth, and determining control parameters of the second beamformer; or
when the function value of the first wind noise judging function is greater than or equal to the second preset threshold, determining that the wind noise in the sound signals received by the microphone pair corresponding to the first wind noise judging function is large wind noise, and performing fixed beamforming using all microphone pairs to form a third beamformer directed at the person's mouth.
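The complex coherence function analysis performed by the function construction module can be illustrated with the textbook magnitude-coherence estimate for one microphone pair. The sketch below is an assumption-laden illustration only: the claimed formulas are given as images and are not reproduced here, so the recursive smoothing with a single factor `alpha`, the class name `PairCoherence`, and the helper `band_mean` are hypothetical stand-ins. The inputs are assumed to be per-frame complex spectra (for example from a short-time Fourier transform).

```python
import math
from typing import List, Sequence


class PairCoherence:
    """Recursively smoothed magnitude coherence for one microphone pair."""

    def __init__(self, n_bins: int, alpha: float = 0.8) -> None:
        self.alpha = alpha              # smoothing factor (assumed value)
        self.p_ii = [0.0] * n_bins      # auto-spectrum of microphone i
        self.p_jj = [0.0] * n_bins      # auto-spectrum of microphone j
        self.p_ij = [0j] * n_bins       # cross-spectrum between i and j

    def update(self, x_i: Sequence[complex], x_j: Sequence[complex]) -> List[float]:
        """Feed one frame of spectra; return |coherence| for every bin."""
        a = self.alpha
        mags = []
        for k, (xi, xj) in enumerate(zip(x_i, x_j)):
            self.p_ii[k] = a * self.p_ii[k] + (1 - a) * abs(xi) ** 2
            self.p_jj[k] = a * self.p_jj[k] + (1 - a) * abs(xj) ** 2
            self.p_ij[k] = a * self.p_ij[k] + (1 - a) * xi * xj.conjugate()
            denom = math.sqrt(self.p_ii[k] * self.p_jj[k])
            mags.append(abs(self.p_ij[k]) / denom if denom > 0 else 0.0)
        return mags


def band_mean(values: Sequence[float], k_low: int, k_high: int) -> float:
    """Average values over bins k_low..k_high inclusive."""
    band = values[k_low:k_high + 1]
    if not band:
        raise ValueError("empty frequency band")
    return sum(band) / len(band)
```

A coherent source such as speech keeps a stable phase relationship between the two microphones, so the magnitude stays near 1. Wind noise is generally only weakly correlated between microphones, which drives the smoothed magnitude down. Averaging over the band from `k_low` to `k_high` restricts the measurement to the frequencies where wind noise may occur.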
5. A headset-based wind noise suppression device, comprising at least one processor for executing a program stored in a memory, which when executed, causes the device to perform the method of any of claims 1-3.
6. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the method according to any of claims 1-3.
CN202011258693.6A 2020-11-12 2020-11-12 Headset-based wind noise suppression method and device Active CN112242148B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011258693.6A CN112242148B (en) 2020-11-12 2020-11-12 Headset-based wind noise suppression method and device

Publications (2)

Publication Number Publication Date
CN112242148A CN112242148A (en) 2021-01-19
CN112242148B true CN112242148B (en) 2023-06-16

Family

ID=74166706

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011258693.6A Active CN112242148B (en) 2020-11-12 2020-11-12 Headset-based wind noise suppression method and device

Country Status (1)

Country Link
CN (1) CN112242148B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11521633B2 (en) * 2021-03-24 2022-12-06 Bose Corporation Audio processing for wind noise reduction on wearable devices
CN113132845A (en) * 2021-04-06 2021-07-16 北京安声科技有限公司 Signal processing method and device, computer readable storage medium and earphone
CN113490093B (en) * 2021-06-28 2023-11-07 北京安声浩朗科技有限公司 TWS earphone
CN114040309B (en) * 2021-09-24 2024-03-19 北京小米移动软件有限公司 Wind noise detection method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106448696A (en) * 2016-12-20 2017-02-22 成都启英泰伦科技有限公司 Adaptive high-pass filtering speech noise reduction method based on background noise estimation
CN109215677A (en) * 2018-08-16 2019-01-15 北京声加科技有限公司 Wind noise detection and suppression method and device suitable for voice and audio
CN111163391A (en) * 2020-04-03 2020-05-15 恒玄科技(北京)有限公司 Method for noise reduction of headphones and noise reduction headphones

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101168002B1 (en) * 2004-09-16 2012-07-26 프랑스 텔레콤 Method of processing a noisy sound signal and device for implementing said method
TWI412023B (en) * 2010-12-14 2013-10-11 Univ Nat Chiao Tung A microphone array structure and method for noise reduction and enhancing speech
US8929564B2 (en) * 2011-03-03 2015-01-06 Microsoft Corporation Noise adaptive beamforming for microphone arrays
JP6074263B2 (en) * 2012-12-27 2017-02-01 キヤノン株式会社 Noise suppression device and control method thereof
US10176823B2 (en) * 2014-05-09 2019-01-08 Apple Inc. System and method for audio noise processing and noise reduction
TWI823334B (en) * 2016-10-24 2023-11-21 美商艾孚諾亞公司 Automatic noise cancellation using multiple microphones
US10706868B2 (en) * 2017-09-06 2020-07-07 Realwear, Inc. Multi-mode noise cancellation for voice detection

Also Published As

Publication number Publication date
CN112242148A (en) 2021-01-19


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant