CN110931032A - Dynamic echo cancellation method and device
- Publication number: CN110931032A
- Application number: CN201911133606.1A
- Authority: CN (China)
- Prior art keywords: threshold, current frame, signal, sequence, frequency domain
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- All entries fall under G (Physics) > G10 (Musical instruments; acoustics) > G10L (Speech analysis techniques or speech synthesis; speech recognition; speech or voice processing techniques; speech or audio coding or decoding) > G10L21/00 (Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility) > G10L21/02 (Speech enhancement, e.g. noise reduction or echo cancellation)
- G10L21/0208: Noise filtering
- G10L21/0232: Processing in the frequency domain (under G10L21/0216: Noise filtering characterised by the method used for estimating noise)
- G10L21/0264: Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
- G10L2021/02082: Noise filtering the noise being echo, reverberation of the speech
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)
Abstract
The invention discloses a dynamic echo cancellation method and device. The method comprises: acquiring a near-end signal picked up by a microphone and a far-end signal played by a loudspeaker; computing frequency-domain sub-band signals for both signals; updating the corresponding sub-band thresholds from the far-end and near-end sub-band signals; determining delay values from the sub-band signals and sub-band thresholds of the far-end and near-end signals; collecting the delay values over a period of time and smoothing them to determine the current delay value; and adjusting an adaptive filter using the near-end signal and the far-end signal aligned by the current delay, so that the estimated echo signal is subtracted from the signal at the receiving end, thereby achieving dynamic echo cancellation. The invention improves the efficiency of the algorithm, prevents abrupt changes in the delay estimate, keeps the delay estimate accurate, and improves the echo cancellation effect.
Description
Technical Field
The present invention relates to echo cancellation methods and apparatuses, and in particular, to a dynamic echo cancellation method and apparatus.
Background
Acoustic echo results from coupling between the loudspeaker and the microphone. As smart devices proliferate, echo interference arises in applications such as remote audio conferencing systems, Bluetooth headsets, smart speakers and smartphones, so echo cancellation is of great importance for voice devices.
An echo cancellation algorithm builds an echo estimation model for the far-end signal, based mainly on the correlation between the known far-end signal of the device and the multipath echo that it produces. An adaptive filter approximates the echo signal, and the estimated echo is subtracted from the near-end signal to cancel the echo. The algorithm also compares the signal received by the microphone with the history of the far-end signal in order to cancel acoustic echoes that arrive after multiple reflections with various delays. Because the historical values of the far-end signal must be stored during the calculation, the algorithm is computationally expensive and poorly suited to embedded devices with limited computing power. Abrupt changes in the delay parameter can also occur and degrade the cancellation. Existing echo cancellation methods take the point of maximum similarity between the near-end and far-end signals as the delay estimate, shift the far-end signal accordingly, update the adaptive filter coefficients to obtain a signal close to the echo contained in the near-end signal, and subtract that signal to cancel the echo. These schemes still suffer from abrupt changes in the delay estimate, limited accuracy of the delay value, and poor efficiency. Since the accuracy of the far-end delay estimate strongly affects the echo cancellation effect, that accuracy still needs to be improved.
Disclosure of Invention
The present invention provides a dynamic echo cancellation method and device to solve the problem in the prior art that inaccurate estimation of the far-end signal delay leads to a poor echo cancellation effect.
To achieve this, the invention adopts the following technical solution:
A dynamic echo cancellation method, performed according to the following steps:
Step 1: obtain the marker sequence $T(m)$ of the current $m$-th frame signal $X(m)$, where $m$ is a positive integer with initial value 1 and the marker sequence $T(1)$ does not exist; the current-frame marker sequence $T(m)$ comprises the current-frame near-end marker sequence $T_n(m)$ and the current-frame far-end marker sequence $T_f(m)$;
The current-frame near-end marker sequence $T_n(m)$ is input to step 3; after the current-frame far-end marker sequence $T_f(m)$ is saved, step 2 is performed;
Obtaining the marker sequence $T(m)$ of the current $m$-th frame signal $X(m)$ specifically includes:
Step 1.1: obtain the current $m$-th frame signal $X(m)$ and perform a short-time Fourier transform on it to obtain the current-frame frequency-domain sub-band signal $Y(m)$, where $X(m)$ comprises the current-frame near-end signal $X_n(m)$ and the current-frame far-end signal $X_f(m)$, and $Y(m)$ comprises the current-frame near-end frequency-domain sub-band signal $Y_n(m)$ and the current-frame far-end frequency-domain sub-band signal $Y_f(m)$; the subscript $n$ denotes the near end and $f$ the far end;
When $m = 1$, after the current-frame frequency-domain sub-band signal $Y(1)$ is stored, set $m = m + 1$ and return to step 1.1;
Step 1.2: obtain the current-frame frequency-domain sub-band threshold $P(m)$ from the $L$ historical frame frequency-domain sub-band signals $\{Y(m-1), Y(m-2), \ldots, Y(m-l), \ldots, Y(m-L) \mid l = 1, 2, \ldots, L\}$ and the current-frame frequency-domain sub-band signal $Y(m)$, where $P(m)$ comprises the current-frame near-end frequency-domain sub-band threshold $P_n(m)$ and the current-frame far-end frequency-domain sub-band threshold $P_f(m)$;
Here $1 \le L < m$, and $Y(m-l)$ denotes the frequency-domain sub-band signal obtained in the $l$-th execution of step 1.1 before the present one;
Step 1.3: obtain the current-frame marker sequence $T(m)$, which specifically includes:
The current-frame near-end frequency-domain sub-band signal $Y_n(m)$ obtained in step 1.1 is taken as the first sequence and the current-frame near-end frequency-domain sub-band threshold $P_n(m)$ obtained in step 1.2 as the second sequence; both are input to the difference marking method to obtain the current-frame near-end marker sequence $T_n(m)$;
The current-frame far-end frequency-domain sub-band signal $Y_f(m)$ obtained in step 1.1 is taken as the first sequence and the current-frame far-end frequency-domain sub-band threshold $P_f(m)$ obtained in step 1.2 as the second sequence; both are input to the difference marking method to obtain the current-frame far-end marker sequence $T_f(m)$;
The difference marking method specifically comprises the following steps:
taking the element-wise difference of the first sequence and the second sequence to obtain a difference sequence;
binarizing the difference sequence to obtain the marker sequence;
during binarization, if an element of the difference sequence is greater than or equal to 0, the element at the same position in the marker sequence is 1; otherwise it is 0;
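The difference marking method can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `difference_mark`, the use of NumPy, the subtraction order (first sequence minus second, i.e. signal minus threshold) and the sample values are all assumptions.

```python
import numpy as np

def difference_mark(first_seq, second_seq):
    """Binarize the element-wise difference of two equal-length sequences."""
    first = np.asarray(first_seq, dtype=float)
    second = np.asarray(second_seq, dtype=float)
    if first.shape != second.shape:
        raise ValueError("sequences must have the same shape")
    diff = first - second  # assumed order: signal minus threshold
    # Elements >= 0 become 1, all others become 0.
    return (diff >= 0).astype(np.uint8)

# Hypothetical near-end sub-band values and thresholds for one frame.
y_n = [0.9, 0.2, 0.5, 0.1]
p_n = [0.5, 0.4, 0.5, 0.3]
t_n = difference_mark(y_n, p_n)
```

The same function would be called once with the near-end pair and once with the far-end pair to produce the two marker sequences of a frame.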
Step 2: obtain the far-end marker sequences of the $S$ historical frames saved before the present execution of step 1, $\{T_f(m-1), T_f(m-2), \ldots, T_f(m-s), \ldots, T_f(m-S) \mid s = 1, 2, \ldots, S\}$, and then perform step 3, where $1 \le S < m$ and $T_f(m-s)$ denotes the far-end marker sequence obtained in the $s$-th execution of step 1 before the present one;
Step 3: perform a bitwise exclusive-or between the current-frame near-end marker sequence $T_n(m)$ obtained in step 1 and each of the $S$ historical-frame far-end marker sequences $\{T_f(m-1), T_f(m-2), \ldots, T_f(m-s), \ldots, T_f(m-S) \mid s = 1, 2, \ldots, S\}$ obtained in step 2, yielding $S$ binary sequences;
count the number of 1s in each binary sequence, and take the historical-frame far-end marker sequence whose binary sequence contains the most 1s as the marker signal sequence;
compute the time difference between the marker signal sequence and the current-frame near-end signal sequence to obtain the preliminary delay value of the current frame, and store it;
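Steps 2 and 3 can be illustrated with the following sketch. It assumes that the marker sequences are 0/1 arrays, that the history list is ordered from the most recent frame backwards, and that the preliminary delay is expressed as a lag in frames; the function name `preliminary_delay` and the `pick` parameter are hypothetical. The default follows the text literally (most 1s after the exclusive-or); since a 1 in a plain XOR result marks a mismatching bit, the selection rule is kept as a parameter so that the opposite convention can be tried.

```python
import numpy as np

def preliminary_delay(t_near, far_history, pick=np.argmax):
    """Return (lag, counts), where lag is in frames, 1..S.

    far_history[0] holds T_f(m-1), far_history[1] holds T_f(m-2), etc.
    pick=np.argmax selects the XOR result with the most 1s, as the text
    states; pass np.argmin to select the fewest mismatching bits instead.
    """
    t_near = np.asarray(t_near, dtype=np.uint8)
    counts = []
    for t_far in far_history:
        xor = np.bitwise_xor(t_near, np.asarray(t_far, dtype=np.uint8))
        counts.append(int(np.count_nonzero(xor)))
    return int(pick(counts)) + 1, counts

# Hypothetical marker sequences with S = 3 stored far-end frames.
t_n = [1, 0, 1, 1]
far_hist = [[1, 0, 1, 1], [0, 1, 0, 0], [1, 1, 1, 1]]
lag, counts = preliminary_delay(t_n, far_hist)
```

Multiplying the returned lag by the frame hop duration would convert it to a time difference.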
Step 4: obtain the preliminary delay values of the $N$ historical frames stored before the present execution of step 3, and take the most frequently occurring value among them as the delay estimate $n_1$, where $n_1 \le S$.
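Step 4 amounts to taking the statistical mode of the stored preliminary delays. A minimal sketch, assuming integer delay values held in a list; the function name `smoothed_delay` and the tie-breaking rule (first value seen wins) are assumptions, since the text does not say how ties are resolved.

```python
from collections import Counter

def smoothed_delay(preliminary_delays):
    """Most frequent value among the stored preliminary delays."""
    if not preliminary_delays:
        raise ValueError("need at least one preliminary delay")
    # most_common(1) returns [(value, count)]; ties keep first-seen order.
    return Counter(preliminary_delays).most_common(1)[0][0]

# Hypothetical preliminary delays (in frames) for N = 7 stored frames.
history = [3, 3, 7, 3, 4, 3, 3]
n1 = smoothed_delay(history)
```

Isolated outliers such as the 7 and 4 above do not move the estimate, which is how this step suppresses abrupt delay changes.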
Step 5: compute the time-domain correlation between the current-frame near-end signal $X_n(m)$ obtained in step 1.1 and the far-end signal sequence at delay $n_1$, giving the correlation value for the present execution of step 5;
compare this correlation value with the one obtained in the previous execution of step 5, and select the far-end signal corresponding to the larger correlation value as the aligned far-end signal of the current frame;
Step 6: perform echo cancellation using the aligned far-end signal of the current frame obtained in step 5 and the current-frame near-end signal $X_n(m)$ obtained in step 1.1 to obtain the echo-cancelled signal of the current frame; then set $m = m + 1$ and return to step 1, ending when the current frame signal is the last frame signal.
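Steps 5 and 6 can be sketched as follows. The text specifies only a time-domain correlation and an adaptive filter; the normalized zero-lag correlation and the NLMS update used here are assumptions chosen for illustration, as are the function names and the parameters `taps`, `mu` and `eps`.

```python
import numpy as np

def correlation(near, far):
    """Normalized zero-lag correlation of two equal-length frames."""
    near = np.asarray(near, dtype=float)
    far = np.asarray(far, dtype=float)
    denom = np.linalg.norm(near) * np.linalg.norm(far)
    return float(np.dot(near, far) / denom) if denom > 0 else 0.0

def choose_aligned(near, far_new, far_prev, corr_prev):
    """Keep whichever far-end candidate has the larger correlation."""
    corr_new = correlation(near, far_new)
    if corr_new >= corr_prev:
        return far_new, corr_new
    return far_prev, corr_prev

def nlms_cancel(near, far_aligned, taps=8, mu=0.5, eps=1e-8):
    """Subtract an NLMS echo estimate from the near-end samples."""
    near = np.asarray(near, dtype=float)
    far = np.asarray(far_aligned, dtype=float)
    w = np.zeros(taps)    # adaptive filter coefficients
    buf = np.zeros(taps)  # most recent far-end samples, newest first
    out = np.empty_like(near)
    for i in range(len(near)):
        buf = np.roll(buf, 1)
        buf[0] = far[i]
        echo_est = np.dot(w, buf)
        out[i] = near[i] - echo_est  # echo-cancelled sample
        w += mu * out[i] * buf / (np.dot(buf, buf) + eps)
    return out
```

With a well-aligned far-end signal a short filter is enough to model the residual echo path, which is the practical benefit of the delay estimation in steps 1 to 5.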
Further, step 1.2 uses Formula I to obtain the current-frame frequency-domain sub-band threshold $P(m)$:
[Formula I: equation not reproduced in the source text]
where $\mathrm{win}()$ denotes a weighting function, $0 < \mathrm{win}() < 1$.
Further, step 1.2 specifically includes:
Step A: use Formula II to obtain a first threshold $P_1(m)$, which is either a near-end first threshold or a far-end first threshold:
[Formula II: equation not reproduced in the source text]
where $\mathrm{win}()$ denotes a weighting function, $0 < \mathrm{win}() < 1$;
Step B: use Formula III to obtain a second threshold $P_2(m)$, which is either a near-end second threshold or a far-end second threshold:
$P_2(m) = P_2(m-1) \times \mathrm{gamma}_1 + Y(m) \times (1 - \mathrm{gamma}_1)$ (Formula III)
where $P_2(m-1)$ is the second threshold of the frequency-domain sub-band signal of frame $m-1$ (near-end or far-end, respectively), and $\mathrm{gamma}_1$ is the first weight, $0 < \mathrm{gamma}_1 < 1$;
Step C: use Formula IV to obtain the current-frame frequency-domain sub-band threshold $P(m)$, which is either the current-frame near-end threshold $P_n(m)$ or the current-frame far-end threshold $P_f(m)$:
$P(m) = P_1(m) \times \mathrm{gamma}_2 + P_2(m) \times (1 - \mathrm{gamma}_2)$ (Formula IV)
where $\mathrm{gamma}_2$ is the second weight, $0 < \mathrm{gamma}_2 < 1$.
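Formulas III and IV can be sketched as a per-frame update. Because Formula II is not available in this text, the first threshold $P_1(m)$ is simply passed in as an argument; treating $Y(m)$ as a vector of real, non-negative sub-band values is also an assumption, as are the function name `update_threshold` and the default weights.

```python
import numpy as np

def update_threshold(y_m, p1_m, p2_prev, gamma1=0.9, gamma2=0.5):
    """Return (P(m), P2(m)) for one frame of sub-band values."""
    y_m = np.asarray(y_m, dtype=float)
    p1_m = np.asarray(p1_m, dtype=float)
    p2_prev = np.asarray(p2_prev, dtype=float)
    # Formula III: recursive smoothing of the current sub-band signal.
    p2_m = p2_prev * gamma1 + y_m * (1.0 - gamma1)
    # Formula IV: weighted combination of the two thresholds.
    p_m = p1_m * gamma2 + p2_m * (1.0 - gamma2)
    return p_m, p2_m

# Hypothetical values for two sub-bands.
p_m, p2_m = update_threshold([1.0, 2.0], [0.5, 1.0], [0.0, 1.0])
```

The returned `p2_m` would be stored and passed back as `p2_prev` for the next frame.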
Further, step 1.2 specifically includes:
Step I: use Formula II to obtain a first threshold $P_1(m)$, which is either a near-end first threshold or a far-end first threshold:
[Formula II: equation not reproduced in the source text]
where $\mathrm{win}()$ denotes a weighting function, $0 < \mathrm{win}() < 1$;
Step II: use Formula III to obtain a second threshold $P_2(m)$, which is either a near-end second threshold or a far-end second threshold:
$P_2(m) = P_2(m-1) \times \mathrm{gamma}_1 + Y(m) \times (1 - \mathrm{gamma}_1)$ (Formula III)
where $P_2(m-1)$ is the second threshold of the frequency-domain sub-band signal of frame $m-1$ (near-end or far-end, respectively), and $\mathrm{gamma}_1$ is the first weight, $0 < \mathrm{gamma}_1 < 1$;
Step III: use Formula V to obtain the current-frame frequency-domain sub-band threshold $P(m)$, which is either the current-frame near-end threshold $P_n(m)$ or the current-frame far-end threshold $P_f(m)$:
$P(m) = \max(P_1(m) \times \mathrm{gamma}_3,\ P_2(m) \times \mathrm{gamma}_4)$ (Formula V)
where $\max()$ denotes taking the maximum value, $\mathrm{gamma}_3$ is the third weight, $0 < \mathrm{gamma}_3 < 1$, and $\mathrm{gamma}_4$ is the fourth weight, $0 < \mathrm{gamma}_4 < 1$.
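Formula V can be illustrated with an element-wise maximum. The sketch assumes that both thresholds are already available as arrays of per-sub-band values; the function name `max_threshold` and the default weights are hypothetical.

```python
import numpy as np

def max_threshold(p1_m, p2_m, gamma3=0.8, gamma4=0.6):
    """Formula V: per-sub-band maximum of the two scaled thresholds."""
    p1_m = np.asarray(p1_m, dtype=float)
    p2_m = np.asarray(p2_m, dtype=float)
    return np.maximum(p1_m * gamma3, p2_m * gamma4)

# Hypothetical first and second thresholds for two sub-bands.
p_m = max_threshold([1.0, 0.5], [0.5, 2.0])
```

Compared with the weighted sum of Formula IV, the maximum lets whichever threshold is currently larger dominate in each sub-band.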
A dynamic echo cancellation device, comprising a current-frame marker sequence obtaining module, a historical-frame marker sequence obtaining module, a preliminary delay value obtaining module, a current delay value obtaining module, a delay value judging module and an echo cancellation module;
The current-frame marker sequence obtaining module comprises a signal obtaining sub-module, a threshold obtaining sub-module and a marker sequence obtaining sub-module;
The signal obtaining sub-module is used to obtain the current $m$-th frame signal $X(m)$, where $m$ is an integer with initial value 1, and to perform a short-time Fourier transform on the current frame signal to obtain the current-frame frequency-domain sub-band signal $Y(m)$, which comprises the current-frame near-end frequency-domain sub-band signal $Y_n(m)$ and the current-frame far-end frequency-domain sub-band signal $Y_f(m)$; the subscript $n$ denotes the near end and $f$ the far end;
When $m = 1$, after the current-frame frequency-domain sub-band signal $Y(1)$ is stored, set $m = m + 1$ and return to the signal obtaining sub-module;
The threshold obtaining sub-module is used to obtain the current-frame frequency-domain sub-band threshold $P(m)$ from the $L$ historical frame frequency-domain sub-band signals $\{Y(m-1), Y(m-2), \ldots, Y(m-l), \ldots, Y(m-L) \mid l = 1, 2, \ldots, L\}$ and the current-frame frequency-domain sub-band signal $Y(m)$, where $1 \le L < m$ and $Y(m-l)$ denotes the frequency-domain sub-band signal obtained for frame $m-l$; $P(m)$ comprises the current-frame near-end frequency-domain sub-band threshold $P_n(m)$ and the current-frame far-end frequency-domain sub-band threshold $P_f(m)$;
The marker sequence obtaining sub-module is used to obtain the current-frame marker sequence $T(m)$, which specifically includes:
The current-frame near-end frequency-domain sub-band signal $Y_n(m)$ is taken as the first sequence and the current-frame near-end frequency-domain sub-band threshold $P_n(m)$ as the second sequence; both are input to the difference marking method to obtain the current-frame near-end marker sequence $T_n(m)$;
The current-frame far-end frequency-domain sub-band signal $Y_f(m)$ is taken as the first sequence and the current-frame far-end frequency-domain sub-band threshold $P_f(m)$ as the second sequence; both are input to the difference marking method to obtain the current-frame far-end marker sequence $T_f(m)$;
The difference marking method specifically comprises the following steps:
taking the element-wise difference of the first sequence and the second sequence to obtain a difference sequence;
binarizing the difference sequence to obtain the marker sequence;
during binarization, if an element of the difference sequence is greater than or equal to 0, the element at the same position in the marker sequence is 1; otherwise it is 0;
The historical-frame marker sequence obtaining module is used to obtain the far-end marker sequences of the $S$ previously saved historical frames, $\{T_f(m-1), T_f(m-2), \ldots, T_f(m-s), \ldots, T_f(m-S) \mid s = 1, 2, \ldots, S\}$, where $1 \le S < m$ and $T_f(m-s)$ denotes the far-end marker sequence obtained in the $s$-th execution before the present one;
The preliminary delay value obtaining module is used to perform a bitwise exclusive-or between the current-frame near-end marker sequence $T_n(m)$ and each of the $S$ historical-frame far-end marker sequences $\{T_f(m-1), T_f(m-2), \ldots, T_f(m-s), \ldots, T_f(m-S) \mid s = 1, 2, \ldots, S\}$, yielding $S$ binary sequences;
among the $S$ binary sequences, the one containing the most 1s is identified, and the historical-frame far-end signal sequence corresponding to its historical-frame far-end marker sequence is taken as the marker signal sequence;
the time difference between the marker signal sequence and the current-frame near-end signal sequence is computed to obtain the preliminary delay value of the current frame, which is stored;
The current delay value obtaining module is used to obtain the previously stored preliminary delay values of $N$ historical frames, and to take the most frequently occurring value among the preliminary delay values of the current frame and the $N$ historical frames as the current-frame delay value $n_1$, where $n_1 \le S$;
The delay value judging module is used to compute the time-domain correlation between the current-frame near-end signal $X_n(m)$ and the far-end signal sequence corresponding to the current-frame delay value, giving the current correlation value;
the current correlation value is compared with the previous correlation value, and the far-end signal corresponding to the larger correlation value is selected as the aligned far-end signal of the current frame;
The echo cancellation module is used to obtain adaptive filter coefficients from the aligned far-end signal of the current frame and the current-frame near-end signal $X_n(m)$;
to obtain an echo signal from the adaptive filter coefficients and the aligned far-end signal of the current frame;
and to subtract the echo signal from the current-frame near-end signal $X_n(m)$ to obtain the echo-cancelled signal of the current frame; then set $m = m + 1$ and return to the current-frame marker sequence obtaining module, ending when the current frame signal is the last frame signal.
Further, the threshold obtaining sub-module uses Formula I to obtain the current-frame frequency-domain sub-band threshold $P(m)$:
[Formula I: equation not reproduced in the source text]
where $\mathrm{win}()$ denotes a weighting function, $0 < \mathrm{win}() < 1$.
Further, the threshold obtaining sub-module includes a first threshold obtaining unit, a second threshold obtaining unit and a threshold obtaining unit;
The first threshold obtaining unit is used to obtain a first threshold $P_1(m)$ using Formula II, where $P_1(m)$ is either a near-end first threshold or a far-end first threshold:
[Formula II: equation not reproduced in the source text]
where $\mathrm{win}()$ denotes a weighting function, $0 < \mathrm{win}() < 1$;
The second threshold obtaining unit is used to obtain a second threshold $P_2(m)$ using Formula III, where $P_2(m)$ is either a near-end second threshold or a far-end second threshold:
$P_2(m) = P_2(m-1) \times \mathrm{gamma}_1 + Y(m) \times (1 - \mathrm{gamma}_1)$ (Formula III)
where $P_2(m-1)$ is the second threshold of the frequency-domain sub-band signal of frame $m-1$ (near-end or far-end, respectively), and $\mathrm{gamma}_1$ is the first weight, $0 < \mathrm{gamma}_1 < 1$;
The threshold obtaining unit is used to obtain the current-frame frequency-domain sub-band threshold $P(m)$ using Formula IV, where $P(m)$ is either the current-frame near-end threshold $P_n(m)$ or the current-frame far-end threshold $P_f(m)$:
$P(m) = P_1(m) \times \mathrm{gamma}_2 + P_2(m) \times (1 - \mathrm{gamma}_2)$ (Formula IV)
where $\mathrm{gamma}_2$ is the second weight, $0 < \mathrm{gamma}_2 < 1$.
Further, the threshold obtaining sub-module includes a first threshold obtaining unit, a second threshold obtaining unit and a threshold obtaining unit;
The first threshold obtaining unit is used to obtain a first threshold $P_1(m)$ using Formula II, where $P_1(m)$ is either a near-end first threshold or a far-end first threshold:
[Formula II: equation not reproduced in the source text]
where $\mathrm{win}()$ denotes a weighting function, $0 < \mathrm{win}() < 1$;
The second threshold obtaining unit is used to obtain a second threshold $P_2(m)$ using Formula III, where $P_2(m)$ is either a near-end second threshold or a far-end second threshold:
$P_2(m) = P_2(m-1) \times \mathrm{gamma}_1 + Y(m) \times (1 - \mathrm{gamma}_1)$ (Formula III)
where $P_2(m-1)$ is the second threshold of the frequency-domain sub-band signal of frame $m-1$ (near-end or far-end, respectively), and $\mathrm{gamma}_1$ is the first weight, $0 < \mathrm{gamma}_1 < 1$;
The threshold obtaining unit is used to obtain the current-frame frequency-domain sub-band threshold $P(m)$ using Formula V, where $P(m)$ is either the current-frame near-end threshold $P_n(m)$ or the current-frame far-end threshold $P_f(m)$:
$P(m) = \max(P_1(m) \times \mathrm{gamma}_3,\ P_2(m) \times \mathrm{gamma}_4)$ (Formula V)
where $\max()$ denotes taking the maximum value, $\mathrm{gamma}_3$ is the third weight, $0 < \mathrm{gamma}_3 < 1$, and $\mathrm{gamma}_4$ is the fourth weight, $0 < \mathrm{gamma}_4 < 1$.
Compared with the prior art, the invention has the following technical effects:
1. After the frequency-domain sub-band signals are obtained, the dynamic echo cancellation method and device use only the frequency-bin range of interest, which removes interference from other frequency components and improves the accuracy of delay detection;
2. The sub-band threshold is obtained by weighted summation, averaging, maximum selection and similar operations on the sub-band signal energy of several frames or of two adjacent frames of the spectrum, which effectively suppresses abrupt energy changes and jitter, makes the spectral and sub-band energy more stable and more robust to interference, and improves the accuracy of subsequent delay detection;
3. The method and device add a decision step for updating the delay value, so the delay value is smoothed and abrupt changes and jitter of the delay value are avoided;
4. The method and device provide an intra-frame margin calculation, which effectively improves the precision of delay alignment.
Drawings
Fig. 1 is a schematic diagram of a dynamic echo cancellation scenario provided by the present invention.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings and examples, so that those skilled in the art can better understand it. Note that in the following description, detailed descriptions of known functions and designs are omitted where they might obscure the subject matter of the present invention.
The following definitions or conceptual connotations relating to the present invention are provided for illustration:
Example One
This embodiment discloses a dynamic echo cancellation method, performed according to the following steps:
Step 1: obtain the marker sequence $T(m)$ of the current $m$-th frame signal $X(m)$, where $m$ is a positive integer with initial value 1 and the marker sequence $T(1)$ does not exist; the current-frame marker sequence $T(m)$ comprises the current-frame near-end marker sequence $T_n(m)$ and the current-frame far-end marker sequence $T_f(m)$;
The current-frame near-end marker sequence $T_n(m)$ is input to step 3; after the current-frame far-end marker sequence $T_f(m)$ is saved, step 2 is performed;
Obtaining the marker sequence $T(m)$ of the current $m$-th frame signal $X(m)$ specifically includes:
Step 1.1: obtain the current $m$-th frame signal $X(m)$, where $m$ is an integer with initial value 1, and perform a short-time Fourier transform on the current frame signal to obtain the current-frame frequency-domain sub-band signal $Y(m)$, which comprises the current-frame near-end frequency-domain sub-band signal $Y_n(m)$ and the current-frame far-end frequency-domain sub-band signal $Y_f(m)$; the subscript $n$ denotes the near end and $f$ the far end;
When the current frame signal is the first frame signal $X(1)$, after the current-frame frequency-domain sub-band signal $Y(1)$ is stored, set $m = m + 1$ and return to step 1;
In this embodiment, the current-frame near-end signal and the current-frame far-end signal are obtained first;
FIG. 1 illustrates the source of the echo in one scenario. The near-end microphone often picks up far-end speech that has been reflected by indoor walls or other obstacles. These reflected signals, produced when sound played by the loudspeaker bounces off an obstacle, are generally called echoes, as shown at 420 in FIG. 1. The time difference between the far-end signal being sent and its arrival at the near end as an echo is recorded as the delay of the far-end signal.
In this step, the near-end signal picked up by the microphone and the far-end signal played by the loudspeaker are obtained; the near-end signal comprises the target output signal and an echo signal, which is a delayed version of the far-end signal.
In this embodiment, a short-time Fourier transform is performed on the far-end signal and the near-end signal separately to obtain their frequency-domain sub-band signals.
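Obtaining the sub-band signal of one frame can be sketched as a windowed FFT restricted to a range of bins. The frame length, the Hann window, the use of magnitudes and the bin range `band` are all assumptions; the text only specifies a short-time Fourier transform, and the function name `subband_frame` is hypothetical.

```python
import numpy as np

def subband_frame(frame, n_fft=256, band=(4, 64)):
    """Magnitude spectrum of one windowed frame, limited to a bin range."""
    frame = np.asarray(frame, dtype=float)
    if len(frame) != n_fft:
        raise ValueError("frame length must equal n_fft")
    spectrum = np.fft.rfft(frame * np.hanning(n_fft))
    lo, hi = band  # assumed range of frequency bins of interest
    return np.abs(spectrum[lo:hi])

# Hypothetical test frame: a sinusoid centred on FFT bin 16.
k = np.arange(256)
tone = np.sin(2 * np.pi * 16 * k / 256)
y = subband_frame(tone)
```

The same function would be applied to the near-end frame and the far-end frame to obtain $Y_n(m)$ and $Y_f(m)$.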
Step 1.2: obtain the current-frame frequency-domain sub-band threshold $P(m)$ from the $L$ historical frame frequency-domain sub-band signals $\{Y(m-1), Y(m-2), \ldots, Y(m-l), \ldots, Y(m-L) \mid l = 1, 2, \ldots, L\}$ and the current-frame frequency-domain sub-band signal $Y(m)$, where $P(m)$ comprises the current-frame near-end frequency-domain sub-band threshold $P_n(m)$ and the current-frame far-end frequency-domain sub-band threshold $P_f(m)$;
Here $1 \le L < m$, and $Y(m-l)$ denotes the frequency-domain sub-band signal obtained in the $l$-th execution of step 1.1 before the present execution of step 1;
The invention provides three methods of obtaining the frequency-domain sub-band threshold.
Optionally, step 1.2 uses Formula I to obtain the current-frame frequency-domain sub-band threshold $P(m)$:
[Formula I: equation not reproduced in the source text]
where $\mathrm{win}()$ denotes a weighting function, $0 < \mathrm{win}() < 1$.
In this embodiment, taking the near-end signal as an example, the calculation is as follows:
the historical values are directly weighted and summed to obtain the sub-band threshold;
that is, the historical near-end frequency-domain information $Y_n(m-l)$, $l \in [1, L]$, is weighted to form the sub-band threshold:
[equation not reproduced in the source text]
where $P_n(m)$ is the sub-band threshold of the near-end signal and $\mathrm{win}(l)$, $l \in [1, L]$, is the weight;
The threshold obtaining method of this embodiment takes full account of the influence of historical frame data on the current frame data, and thus yields a more accurate threshold.
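The weighting of historical frames described above can be illustrated as follows. Since the equation itself is missing from this text, the plain weighted sum $\sum_{l=1}^{L} \mathrm{win}(l)\,Y_n(m-l)$ used here is an assumption about its form, as are the function name `weighted_history_threshold` and the sample values.

```python
import numpy as np

def weighted_history_threshold(history, win):
    """Weighted sum over L historical frames of sub-band values.

    history: shape (L, K); row l-1 holds Y_n(m-l) for K sub-bands.
    win: length-L weights, each strictly between 0 and 1.
    """
    history = np.asarray(history, dtype=float)
    win = np.asarray(win, dtype=float)
    if history.shape[0] != win.shape[0]:
        raise ValueError("one weight per historical frame is required")
    if not np.all((win > 0) & (win < 1)):
        raise ValueError("weights must lie strictly between 0 and 1")
    return win @ history  # sum over l of win(l) * Y_n(m-l)

# Hypothetical history of L = 2 frames with K = 2 sub-bands.
p_n = weighted_history_threshold([[1.0, 2.0], [3.0, 4.0]], [0.75, 0.25])
```

Giving larger weights to more recent frames, as in this example, makes the threshold track the signal faster; flatter weights make it smoother.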
Optionally, step 1.2 specifically includes:
step A, adopting formula II to obtain a first threshold P_1(m), said first threshold P_1(m) including a near-end first threshold or a far-end first threshold:
where win() represents a weighting function, 0 < win() < 1;
step B, adopting formula III to obtain a second threshold P_2(m), said P_2(m) including a near-end second threshold or a far-end second threshold:
P_2(m) = P_2(m-1) × gamma_1 + Y(m) × (1 - gamma_1)    Formula III
wherein P_2(m-1) is the second threshold of the previous frame frequency domain sub-band signal, P_2(m-1) including a near-end second threshold or a far-end second threshold of the previous frame frequency domain sub-band signal, and gamma_1 is a first weighting value, 0 < gamma_1 < 1;
Step C, obtaining the current frame frequency domain sub-band threshold P(m) by adopting formula IV, wherein P(m) includes the current frame near-end frequency domain sub-band threshold P_n(m) or the current frame far-end frequency domain sub-band threshold P_f(m):
P(m) = P_1(m) × gamma_2 + P_2(m) × (1 - gamma_2)    Formula IV
wherein gamma_2 is a second weighting value, 0 < gamma_2 < 1.
In this embodiment, the sub-band threshold is obtained from the weighted sum of the historical values and the recursively smoothed value of the current frame, taking the near-end signal as an example:
step A, the historical near-end frequency domain information Y_n(m-l), l ∈ [1, L], is weighted to form a sub-band threshold:
step B, the current near-end frequency domain information Y_n(m) is smoothed to obtain a sub-band threshold:
wherein the smoothing uses the near-end second threshold of the previous frame frequency domain sub-band signal, and gamma_1 is a smoothing factor;
wherein gamma_2 is a weighting value.
The threshold obtaining method provided in this embodiment can fully reflect the difference between the current signal and the threshold when the signal component is complex, so as to obtain a more accurate threshold.
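A minimal sketch of the second method, applying formula III and formula IV per sub-band. The first threshold P_1(m) is passed in as given values, because formula II is not reproduced in this text; function names and numbers are illustrative assumptions.

```python
def smoothed_threshold(prev_p2, current, gamma1):
    """Formula III: P_2(m) = P_2(m-1) * gamma_1 + Y(m) * (1 - gamma_1)."""
    return [p * gamma1 + y * (1.0 - gamma1) for p, y in zip(prev_p2, current)]


def blended_threshold(p1, p2, gamma2):
    """Formula IV: P(m) = P_1(m) * gamma_2 + P_2(m) * (1 - gamma_2)."""
    return [a * gamma2 + b * (1.0 - gamma2) for a, b in zip(p1, p2)]


p1 = [2.5, 1.5]          # first threshold P_1(m), assumed already computed
prev_p2 = [2.0, 1.0]     # second threshold of the previous frame, P_2(m-1)
y_now = [4.0, 3.0]       # current frame sub-band values Y(m)

p2 = smoothed_threshold(prev_p2, y_now, gamma1=0.5)
p = blended_threshold(p1, p2, gamma2=0.5)
```

A gamma_1 close to 1 makes P_2 track the signal slowly; a value close to 0 makes it follow the current frame almost directly.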
Optionally, step 1.2 specifically includes:
step I, adopting formula II to obtain a first threshold P_1(m), said first threshold P_1(m) including a near-end first threshold or a far-end first threshold:
where win() represents a weighting function, 0 < win() < 1;
step II, adopting formula III to obtain a second threshold P_2(m), said P_2(m) including a near-end second threshold or a far-end second threshold:
P_2(m) = P_2(m-1) × gamma_1 + Y(m) × (1 - gamma_1)    Formula III
wherein P_2(m-1) is the second threshold of the previous frame frequency domain sub-band signal, P_2(m-1) including a near-end second threshold or a far-end second threshold of the previous frame frequency domain sub-band signal, and gamma_1 is a first weighting value, 0 < gamma_1 < 1;
Step III, obtaining the current frame frequency domain sub-band threshold P(m) by adopting formula V, wherein P(m) includes the current frame near-end frequency domain sub-band threshold P_n(m) or the current frame far-end frequency domain sub-band threshold P_f(m):
P(m) = max(P_1(m) × gamma_3, P_2(m) × gamma_4)    Formula V
where max() denotes taking the maximum value, gamma_3 is a third weighting value, 0 < gamma_3 < 1, and gamma_4 is a fourth weighting value, 0 < gamma_4 < 1.
In this embodiment, the third method follows the same calculation as the second method for step A and step B; taking the near-end signal as an example, in the third step the near-end first threshold and the near-end second threshold are each weighted, and the maximum of the two weighted values is taken as the sub-band threshold of the near-end signal;
wherein max denotes the larger of the two values, and gamma_3 and gamma_4 are weighting values.
The threshold obtaining method provided in this embodiment can fully reflect the difference between the current signal and the threshold when the signal component is complex, so as to obtain a more accurate threshold.
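The third method differs from the second only in its last step, which can be sketched per sub-band as below. The input thresholds and weights are illustrative values, and the function name is hypothetical.

```python
def max_threshold(p1, p2, gamma3, gamma4):
    """Formula V: P(m) = max(P_1(m) * gamma_3, P_2(m) * gamma_4), per sub-band."""
    return [max(a * gamma3, b * gamma4) for a, b in zip(p1, p2)]


p1 = [4.0, 1.0]   # first threshold P_1(m)
p2 = [2.0, 8.0]   # second threshold P_2(m)
p = max_threshold(p1, p2, gamma3=0.5, gamma4=0.25)
```

In the first sub-band the weighted first threshold is larger, in the second sub-band the weighted second threshold is larger, so each sub-band takes whichever estimate is currently higher.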
In this step, the sub-band threshold is obtained by weighted summation, averaging, maximum selection and similar operations on the sub-band signal energy of the spectra of multiple frames or of two adjacent frames. This effectively suppresses abrupt energy changes and jitter, makes the spectral and sub-band energy more stable, and strengthens robustness to interference, which greatly improves the stability and accuracy of subsequent calculations.
Step 1.3, obtaining a current frame marker sequence T(m), specifically including:
the current frame near-end frequency domain sub-band signal Y_n(m) obtained in step 1.1 is recorded as the first sequence, the current frame near-end frequency domain sub-band threshold P_n(m) obtained in step 1.2 is recorded as the second sequence, and both are input to the difference marking method to obtain the current frame near-end marker sequence T_n(m);
the current frame far-end frequency domain sub-band signal Y_f(m) obtained in step 1.1 is recorded as the first sequence, the current frame far-end frequency domain sub-band threshold P_f(m) obtained in step 1.2 is recorded as the second sequence, and both are input to the difference marking method to obtain the current frame far-end marker sequence T_f(m);
The difference marking method specifically comprises the following steps:
subtracting the second sequence from the first sequence element by element to obtain a difference sequence;
binarizing the difference sequence to obtain a marker sequence;
during binarization, if an element of the difference sequence is greater than or equal to 0, the element at the same position in the marker sequence is 1; otherwise it is 0;
in this step, the similarity is determined by the coherence between the energy of the spectral sub-band signal and the sub-band threshold in the frequency domain. Taking the near-end signal as an example, the calculation is as follows:
the marker of each frequency bin is determined by subtracting the sub-band threshold from the energy of the current spectral sub-band signal of the near-end signal, or by dividing that energy by the sub-band threshold: if the difference is greater than zero, or the ratio is greater than 1, the sub-band marker is 1; otherwise it is 0.
The correlation is calculated as follows:
T_n(m, k) = 1 if Y_n(m, k) - P_n(m, k) > 0, and T_n(m, k) = 0 otherwise
or
T_n(m, k) = 1 if Y_n(m, k) / P_n(m, k) > 1, and T_n(m, k) = 0 otherwise
wherein T_n(m, k) is the sub-band marker, m is the frame index, and k is the frequency bin index. T_n(m, k) is a sequence of 0s and 1s.
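The difference marking method can be sketched in a few lines. The sketch uses the "greater than or equal to 0" rule stated for the binarization; the per-bin description above uses a strict comparison, which would only change the result for bins that sit exactly on the threshold. The function name and values are illustrative.

```python
def mark_sequence(subband, threshold):
    """Return 1 where subband - threshold >= 0 and 0 elsewhere."""
    return [1 if y - p >= 0 else 0 for y, p in zip(subband, threshold)]


y_n = [3.0, 1.0, 2.0, 0.5]   # current frame near-end sub-band values Y_n(m, k)
p_n = [2.0, 1.5, 2.0, 1.0]   # current frame near-end thresholds P_n(m, k)
t_n = mark_sequence(y_n, p_n)
```

The same function applied to the far-end sub-band signal and far-end threshold yields T_f(m).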
Step 2, obtaining the far-end marker sequences {T_f(m-1), T_f(m-2), …, T_f(m-s), …, T_f(m-S) | s = 1, 2, …, S} of the S historical frames saved before the current execution of step 1, and then performing step 3, where 1 ≤ S < m and T_f(m-s) represents the far-end marker sequence obtained when step 1 was executed for the s-th time before the current execution;
Step 3, performing a bitwise exclusive-NOR (XNOR) operation between the current frame near-end marker sequence T_n(m) obtained in step 1 and each of the S historical far-end marker sequences {T_f(m-1), T_f(m-2), …, T_f(m-s), …, T_f(m-S) | s = 1, 2, …, S} obtained in step 2, to obtain S binary sequences;
taking, as the marker signal sequence, the historical frame far-end signal sequence corresponding to the historical frame far-end marker sequence whose binary sequence contains the most 1s among the S binary sequences;
calculating the time difference between the marker signal sequence and the current frame near-end signal sequence to obtain and store the preliminary delay value of the current frame;
In this embodiment, S = 3 is taken as an example:
the current frame near-end marker sequence is:
[1,0,1,1,0,1];
the historical far-end marker sequence of the frame immediately preceding the current frame is:
[0,0,1,1,1,0];
the historical far-end marker sequence of the frame two frames before the current frame is:
[1,0,0,0,0,1];
the historical far-end marker sequence of the frame three frames before the current frame is:
[1,1,1,1,1,0];
first, a bitwise exclusive-NOR operation is performed on the current frame near-end marker sequence and the historical far-end marker sequence of the preceding frame, i.e. on [1,0,1,1,0,1] and [0,0,1,1,1,0], to obtain sequence 1: [0,1,1,1,0,0];
similarly, a bitwise exclusive-NOR operation is performed on the current frame near-end marker sequence and the historical far-end marker sequence of the frame two frames before the current frame, to obtain sequence 2: [1,1,0,0,1,1];
similarly, a bitwise exclusive-NOR operation is performed on the current frame near-end marker sequence and the historical far-end marker sequence of the frame three frames before the current frame, to obtain sequence 3: [1,0,1,1,0,0];
comparing sequence 1, sequence 2 and sequence 3, the value 1 occurs most often in sequence 2 (four times, against three times in each of the other two), so the far-end signal sequence of the historical frame two frames before the current frame, which corresponds to sequence 2, is used as the marker signal sequence, from which the preliminary delay value is obtained.
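The worked example above can be reproduced with the short sketch below. The function names are illustrative, and breaking ties in favour of the smallest lag is an assumption; the source does not say how ties are resolved.

```python
def xnor_bits(a, b):
    """Bitwise exclusive-NOR: 1 where the two markers agree, 0 where they differ."""
    return [1 if x == y else 0 for x, y in zip(a, b)]


def best_matching_lag(near_marks, far_history):
    """far_history[s-1] is the far-end marker sequence of frame m-s.

    Returns the lag s (in frames) whose XNOR sequence has the most 1s,
    together with the count of 1s for every lag.
    """
    scores = [sum(xnor_bits(near_marks, far)) for far in far_history]
    best = max(range(len(scores)), key=scores.__getitem__)  # first maximum wins
    return best + 1, scores


near = [1, 0, 1, 1, 0, 1]
far_history = [
    [0, 0, 1, 1, 1, 0],   # frame m-1
    [1, 0, 0, 0, 0, 1],   # frame m-2
    [1, 1, 1, 1, 1, 0],   # frame m-3
]
lag, scores = best_matching_lag(near, far_history)
```

Here `lag` is 2 and `scores` is `[3, 4, 3]`, matching sequence 2 in the example. Multiplying the lag by the frame hop would give a preliminary delay in samples.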
Step 4, obtaining the preliminary delay values of the N historical frames stored before the current execution of step 3, and averaging the preliminary delay value of the current frame together with the preliminary delay values of the N historical frames to obtain the delay value of the current frame, where N ≤ S;
the current delay value is determined by smoothing the historical delay values in order to prevent a sudden delay change from degrading the accuracy of the subsequent echo cancellation result. A sliding window is used for the smoothing, so that the delay value of each current frame can be obtained in real time; the width of the sliding window is N, and at each update the new value is placed at the end of the window while the first value in the original window overflows and is discarded. The sliding window keeps the delay value update real-time while eliminating delay jitter, abrupt delay changes and similar problems.
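A sliding-window average of this kind can be sketched with a bounded deque, which drops the oldest value automatically when a new one is appended. The class name and the window length used here are illustrative assumptions.

```python
from collections import deque


class DelaySmoother:
    """Running mean of the most recent preliminary delay values."""

    def __init__(self, window):
        # deque(maxlen=...) discards the oldest entry once the window is full.
        self._values = deque(maxlen=window)

    def update(self, preliminary_delay):
        self._values.append(preliminary_delay)
        return sum(self._values) / len(self._values)


smoother = DelaySmoother(window=3)
# A single outlier of 5 frames is damped instead of being used directly.
smoothed = [smoother.update(d) for d in [2, 2, 5, 2]]
```

The smoothed values are fractional; rounding them to whole frames or samples before alignment is a design choice the source leaves open.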
Step 5, computing the time-domain correlation between the current frame near-end signal X_n(m) obtained in step 1.1 and the far-end signal sequence corresponding to the current frame delay value, to obtain the correlation value of the current execution of step 5;
comparing the correlation value obtained this time with the correlation value obtained in the previous execution of step 5, and selecting the far-end signal corresponding to the larger correlation value as the aligned far-end signal of the current frame;
in this embodiment, to further ensure the accuracy of the delay, the present invention decides whether to update the current delay value by a time-domain correlation method. The time-domain correlation between the near-end signal of the current frame and the far-end signal corresponding to the current frame delay is computed, as is the time-domain correlation between the speech signal of the current frame and the speech signal corresponding to the previous frame delay; the two correlation values are compared, and the delay with the larger correlation value is taken as the current delay value.
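The decision between the new and the previous delay can be sketched as follows. The source does not specify how the correlation is normalized, so the zero-lag normalized cross-correlation used here is an assumption, as are the function names and the convention that delays are measured in samples on a shared time axis.

```python
import math


def normalized_correlation(x, y):
    """Zero-lag normalized cross-correlation of two equal-length sequences."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den > 0.0 else 0.0


def choose_delay(near_frame, far_buffer, start, new_delay, old_delay):
    """Keep whichever delay aligns the far-end buffer better with the near frame.

    far_buffer shares its time axis with the near-end signal; the near frame
    begins at index `start`, and start - delay must not be negative.
    """
    n = len(near_frame)

    def far_segment(delay):
        return far_buffer[start - delay:start - delay + n]

    new_corr = normalized_correlation(near_frame, far_segment(new_delay))
    old_corr = normalized_correlation(near_frame, far_segment(old_delay))
    return new_delay if new_corr >= old_corr else old_delay


far_buffer = [0.0, 0.0, 1.0, -1.0, 2.0, -2.0, 0.0, 0.0, 0.0, 0.0]
near_frame = [1.0, -1.0, 2.0, -2.0]   # pure echo of the far end, delayed by 2 samples
chosen = choose_delay(near_frame, far_buffer, start=4, new_delay=2, old_delay=3)
```

With these values the delay of 2 samples gives a perfect match, so it is kept in preference to the previous delay of 3.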
Step 6, obtaining adaptive filter coefficients by using the aligned far-end signal of the current frame obtained in step 5 and the current frame near-end signal X_n(m) obtained in step 1.1;
obtaining an echo signal according to the adaptive filter coefficients and the aligned far-end signal of the current frame;
subtracting the echo signal from the current frame near-end signal X_n(m) obtained in step 1.1 to obtain the echo-cancelled signal of the current frame, setting m = m + 1 and returning to step 1, until the current frame signal is the last frame signal, at which point the method ends.
In this embodiment, the intra-frame delay margin may cause the filter coefficients to converge poorly, thereby degrading the echo cancellation effect. The intra-frame delay margin is therefore obtained from the filter coefficients: the filter is divided into segments of equal length and the variance of the coefficients in each segment is computed; the segment with the largest variance corresponds to the position of the intra-frame delay.
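The segment-variance search can be sketched as below. The use of the population variance, the number of segments and the coefficient values are illustrative assumptions; any coefficients left over when the filter length is not a multiple of the segment count are ignored in this sketch.

```python
from statistics import pvariance


def intra_frame_delay_segment(coeffs, num_segments):
    """Split the filter into equal segments and return the index of the
    segment whose coefficients have the largest variance, plus all variances."""
    seg_len = len(coeffs) // num_segments
    variances = [
        pvariance(coeffs[i * seg_len:(i + 1) * seg_len])
        for i in range(num_segments)
    ]
    best = max(range(num_segments), key=variances.__getitem__)
    return best, variances


# Twelve taps whose energy is concentrated in the middle third.
coeffs = [0.0, 0.01, 0.0, -0.01,
          0.5, -0.4, 0.3, -0.2,
          0.02, 0.0, -0.01, 0.0]
segment, variances = intra_frame_delay_segment(coeffs, num_segments=3)
```

Here the middle segment (index 1) is selected, suggesting that the dominant echo path starts about `segment * seg_len` taps into the filter.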
The coefficients of the adaptive filter are adjusted according to the processed far-end signal and the processed near-end signal to obtain an estimate of the echo signal, which is subtracted from the signal at the receiving end, thereby realizing dynamic echo cancellation of the output signal. After the delay has been removed, the far-end signal aligned with the echo contained in the near-end signal is input to the adaptive filter, which estimates the echo signal contained in the near-end signal. The convergence speed of the adaptive filter is improved by resetting the filter coefficients, moving the sliding window, and similar measures. Dynamic echo cancellation is realized by subtracting from the near-end signal the echo signal estimated by the adaptive filter.
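The source does not name the adaptation algorithm. As one common choice, the sketch below uses a normalized least-mean-squares (NLMS) filter in the time domain to estimate the echo from the aligned far-end signal and subtract it from the near-end signal. The function name, step size, tap count and the synthetic two-tap echo path are all assumptions made for illustration.

```python
import random


def nlms_cancel(near, far_aligned, taps=4, mu=0.5, eps=1e-8):
    """Return (echo-cancelled samples, final filter coefficients)."""
    w = [0.0] * taps          # adaptive filter coefficients
    buf = [0.0] * taps        # most recent far-end samples, newest first
    out = []
    for d, x in zip(near, far_aligned):
        buf = [x] + buf[:-1]
        echo_hat = sum(wi * xi for wi, xi in zip(w, buf))   # estimated echo
        e = d - echo_hat                                    # residual after cancellation
        norm = sum(xi * xi for xi in buf) + eps
        w = [wi + mu * e * xi / norm for wi, xi in zip(w, buf)]
        out.append(e)
    return out, w


# Synthetic test: the near-end signal is a pure echo through a two-tap path.
rng = random.Random(0)
far = [rng.uniform(-1.0, 1.0) for _ in range(500)]
near = [0.6 * far[i] + (0.3 * far[i - 1] if i > 0 else 0.0) for i in range(500)]
residual, w = nlms_cancel(near, far)
```

With no near-end speech or noise in this synthetic case, the coefficients approach the echo path (0.6, 0.3, 0, 0) and the residual decays towards zero. A practical canceller would additionally need double-talk handling, which this sketch omits.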
Example two
A dynamic echo cancellation device, comprising a current frame marker sequence obtaining module, a historical frame marker sequence obtaining module, a preliminary delay value obtaining module, a current delay value obtaining module, a delay value judging module and an echo cancellation module;
the current frame marking sequence obtaining module comprises a signal obtaining sub-module, a threshold obtaining sub-module and a marking sequence obtaining sub-module;
the signal obtaining sub-module is used for obtaining the current m-th frame signal X(m), where m is an integer with an initial value of 1, and performing a short-time Fourier transform on the current frame signal to obtain the current frame frequency domain sub-band signal Y(m), wherein the current frame frequency domain sub-band signal Y(m) comprises the current frame near-end frequency domain sub-band signal Y_n(m) and the current frame far-end frequency domain sub-band signal Y_f(m), where n denotes the near end and f denotes the far end;
when m = 1, after the current frame frequency domain sub-band signal Y(1) is stored, m is set to m + 1 and control returns to the signal obtaining sub-module;
the threshold obtaining sub-module is used for obtaining the current frame frequency domain sub-band threshold P(m) according to L historical frame frequency domain sub-band signals {Y(m-1), Y(m-2), …, Y(m-l), …, Y(m-L) | l = 1, 2, …, L} and the current frame frequency domain sub-band signal Y(m), where 1 ≤ L < m and Y(m-l) represents the frequency domain sub-band signal obtained at the (m-l)-th time, wherein the current frame frequency domain sub-band threshold comprises the current frame near-end frequency domain sub-band threshold P_n(m) and the current frame far-end frequency domain sub-band threshold P_f(m);
The marker sequence obtaining sub-module is used for obtaining the current frame marker sequence T(m),
wherein obtaining the current frame marker sequence T(m) specifically includes:
the current frame near-end frequency domain sub-band signal Y_n(m) obtained in step 1.1 is recorded as the first sequence, the current frame near-end frequency domain sub-band threshold P_n(m) obtained in step 1.2 is recorded as the second sequence, and both are input to the difference marking method to obtain the current frame near-end marker sequence T_n(m);
the current frame far-end frequency domain sub-band signal Y_f(m) obtained in step 1.1 is recorded as the first sequence, the current frame far-end frequency domain sub-band threshold P_f(m) obtained in step 1.2 is recorded as the second sequence, and both are input to the difference marking method to obtain the current frame far-end marker sequence T_f(m);
The difference marking method specifically comprises the following steps:
subtracting the second sequence from the first sequence element by element to obtain a difference sequence;
binarizing the difference sequence to obtain a marker sequence;
during binarization, if an element of the difference sequence is greater than or equal to 0, the element at the same position in the marker sequence is 1; otherwise it is 0;
the historical frame marker sequence obtaining module is used for obtaining the far-end marker sequences {T_f(m-1), T_f(m-2), …, T_f(m-s), …, T_f(m-S) | s = 1, 2, …, S} of the S previously saved historical frames, after which step 3 is performed, where 1 ≤ S < m and T_f(m-s) represents the far-end marker sequence obtained at the s-th time before the current execution;
the preliminary delay value obtaining module is used for performing a bitwise exclusive-NOR (XNOR) operation between the current frame near-end marker sequence T_n(m) and each of the S historical far-end marker sequences {T_f(m-1), T_f(m-2), …, T_f(m-s), …, T_f(m-S) | s = 1, 2, …, S}, to obtain S binary sequences;
taking, as the marker signal sequence, the historical frame far-end signal sequence corresponding to the historical frame far-end marker sequence whose binary sequence contains the most 1s among the S binary sequences;
calculating the time difference between the marker signal sequence and the current frame near-end signal sequence to obtain and store the preliminary delay value of the current frame;
the current delay value obtaining module is used for obtaining the preliminary delay values of the N previously stored historical frames, and averaging the preliminary delay value of the current frame together with the preliminary delay values of the N historical frames to obtain the delay value of the current frame, where N ≤ S;
the delay value judging module is used for computing the time-domain correlation between the current frame near-end signal X_n(m) and the far-end signal sequence corresponding to the current frame delay value, to obtain the current correlation value;
comparing the current correlation value with the previous correlation value, and selecting the far-end signal corresponding to the larger correlation value as the aligned far-end signal of the current frame;
the echo cancellation module is used for obtaining adaptive filter coefficients by using the aligned far-end signal of the current frame and the current frame near-end signal X_n(m);
obtaining an echo signal according to the adaptive filter coefficients and the aligned far-end signal of the current frame;
subtracting the echo signal from the current frame near-end signal X_n(m) to obtain the echo-cancelled signal of the current frame, setting m = m + 1 and returning to the current frame marker sequence obtaining module, until the current frame signal is the last frame signal, at which point processing ends.
Optionally, the threshold obtaining sub-module obtains the current frame frequency domain sub-band threshold P(m) by using formula I:
where win() represents a weighting function, 0 < win() < 1.
Optionally, the threshold obtaining sub-module includes a first threshold obtaining unit, a second threshold obtaining unit, and a threshold obtaining unit;
a first threshold obtaining unit for obtaining a first threshold P_1(m) by using formula II, the first threshold P_1(m) including a near-end first threshold or a far-end first threshold:
where win() represents a weighting function, 0 < win() < 1;
a second threshold obtaining unit for obtaining a second threshold P_2(m) by using formula III, P_2(m) including a near-end second threshold or a far-end second threshold:
P_2(m) = P_2(m-1) × gamma_1 + Y(m) × (1 - gamma_1)    Formula III
wherein P_2(m-1) is the second threshold of the (m-1)-th frame frequency domain sub-band signal, P_2(m-1) including a near-end second threshold or a far-end second threshold of the (m-1)-th frame frequency domain sub-band signal, and gamma_1 is a first weighting value, 0 < gamma_1 < 1;
The threshold obtaining unit is used for obtaining the current frame frequency domain sub-band threshold P(m) by adopting formula IV, wherein P(m) comprises the current frame near-end frequency domain sub-band threshold P_n(m) or the current frame far-end frequency domain sub-band threshold P_f(m):
P(m) = P_1(m) × gamma_2 + P_2(m) × (1 - gamma_2)    Formula IV
wherein gamma_2 is a second weighting value, 0 < gamma_2 < 1.
Optionally, the threshold obtaining sub-module includes a first threshold obtaining unit, a second threshold obtaining unit, and a threshold obtaining unit;
a first threshold obtaining unit for obtaining a first threshold P_1(m) by using formula II, the first threshold P_1(m) including a near-end first threshold or a far-end first threshold:
where win() represents a weighting function, 0 < win() < 1;
a second threshold obtaining unit for obtaining a second threshold P_2(m) by using formula III, P_2(m) including a near-end second threshold or a far-end second threshold:
P_2(m) = P_2(m-1) × gamma_1 + Y(m) × (1 - gamma_1)    Formula III
wherein P_2(m-1) is the second threshold of the (m-1)-th frame frequency domain sub-band signal, P_2(m-1) including the second threshold of the (m-1)-th frame near-end frequency domain sub-band signal or the second threshold of the (m-1)-th frame far-end frequency domain sub-band signal, and gamma_1 is a first weighting value, 0 < gamma_1 < 1;
The threshold obtaining unit is used for obtaining the current frame frequency domain sub-band threshold P(m) by adopting formula V, wherein P(m) comprises the current frame near-end frequency domain sub-band threshold P_n(m) or the current frame far-end frequency domain sub-band threshold P_f(m):
P(m) = max(P_1(m) × gamma_3, P_2(m) × gamma_4)    Formula V
where max() denotes taking the maximum value, gamma_3 is a third weighting value, 0 < gamma_3 < 1, and gamma_4 is a fourth weighting value, 0 < gamma_4 < 1.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present invention may be implemented by software plus necessary general hardware, and certainly may also be implemented by hardware, but in many cases, the former is a better embodiment. Based on such understanding, the technical solutions of the present invention may be substantially implemented or a part of the technical solutions contributing to the prior art may be embodied in the form of a software product, where the computer software product is stored in a readable storage medium, such as a floppy disk, a hard disk, or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Claims (8)
1. A dynamic echo cancellation method, characterized in that said method is performed according to the following steps:
step 1, obtaining a marker sequence T(m) of the current m-th frame signal X(m), where m is a positive integer with an initial value of 1 and the marker sequence T(1) does not exist, the current frame marker sequence T(m) comprising a current frame near-end marker sequence T_n(m) and a current frame far-end marker sequence T_f(m);
inputting the current frame near-end marker sequence T_n(m) to step 3, and performing step 2 after the current frame far-end marker sequence T_f(m) has been saved;
the obtaining of the marker sequence T(m) of the current m-th frame signal X(m) specifically includes:
step 1.1, obtaining the current m-th frame signal X(m) and performing a short-time Fourier transform on the m-th frame signal X(m) to obtain the current frame frequency domain sub-band signal Y(m), wherein the current m-th frame signal X(m) comprises a current frame near-end signal X_n(m) and a current frame far-end signal X_f(m); the current frame frequency domain sub-band signal Y(m) comprises a current frame near-end frequency domain sub-band signal Y_n(m) and a current frame far-end frequency domain sub-band signal Y_f(m), where n denotes the near end and f denotes the far end;
when m = 1, after the current frame frequency domain sub-band signal Y(1) is stored, setting m = m + 1 and returning to step 1.1;
step 1.2, obtaining a current frame frequency domain sub-band threshold P(m) by using L historical frame frequency domain sub-band signals {Y(m-1), Y(m-2), …, Y(m-l), …, Y(m-L) | l = 1, 2, …, L} and the current frame frequency domain sub-band signal Y(m), wherein the current frame frequency domain sub-band threshold includes a current frame near-end frequency domain sub-band threshold P_n(m) and a current frame far-end frequency domain sub-band threshold P_f(m);
wherein 1 ≤ L < m, and Y(m-l) represents the frequency domain sub-band signal obtained when step 1.1 was executed for the l-th time before the current execution of step 1.1;
step 1.3, obtaining a current frame marker sequence T(m), specifically including:
the current frame near-end frequency domain sub-band signal Y_n(m) obtained in step 1.1 is recorded as the first sequence, the current frame near-end frequency domain sub-band threshold P_n(m) obtained in step 1.2 is recorded as the second sequence, and both are input to the difference marking method to obtain the current frame near-end marker sequence T_n(m);
the current frame far-end frequency domain sub-band signal Y_f(m) obtained in step 1.1 is recorded as the first sequence, the current frame far-end frequency domain sub-band threshold P_f(m) obtained in step 1.2 is recorded as the second sequence, and both are input to the difference marking method to obtain the current frame far-end marker sequence T_f(m);
The difference marking method specifically comprises the following steps:
subtracting the second sequence from the first sequence element by element to obtain a difference sequence;
binarizing the difference sequence to obtain a marker sequence;
during binarization, if an element of the difference sequence is greater than or equal to 0, the element at the same position in the marker sequence is 1; otherwise it is 0;
step 2, obtaining the far-end marker sequences {T_f(m-1), T_f(m-2), …, T_f(m-s), …, T_f(m-S) | s = 1, 2, …, S} of the S historical frames saved before the current execution of step 1, and then performing step 3, where 1 ≤ S < m and T_f(m-s) represents the far-end marker sequence obtained when step 1 was executed for the s-th time before the current execution;
step 3, performing a bitwise exclusive-NOR (XNOR) operation between the current frame near-end marker sequence T_n(m) obtained in step 1 and each of the S historical far-end marker sequences {T_f(m-1), T_f(m-2), …, T_f(m-s), …, T_f(m-S) | s = 1, 2, …, S} obtained in step 2, to obtain S binary sequences;
counting the number of 1s in each binary sequence, and taking the historical frame far-end marker sequence whose binary sequence contains the most 1s as the marker signal sequence;
calculating the time difference between the marker signal sequence and the current frame near-end signal sequence to obtain and store the preliminary delay value of the current frame;
step 4, obtaining the preliminary delay values of the N historical frames stored before the current execution of step 3, and taking the delay value that occurs most frequently among the preliminary delay values of the N historical frames to obtain a delay value estimate n1, where n1 ≤ S;
step 5, computing the time-domain correlation between the current frame near-end signal X_n(m) obtained in step 1.1 and the far-end signal sequence whose delay value is n1, to obtain the correlation value of the current execution of step 5;
comparing the correlation value obtained this time with the correlation value obtained in the previous execution of step 5, and selecting the far-end signal corresponding to the larger correlation value as the aligned far-end signal of the current frame;
step 6, performing echo cancellation by using the aligned far-end signal of the current frame obtained in step 5 and the current frame near-end signal X_n(m) obtained in step 1.1, and after the echo-cancelled signal of the current frame is obtained, setting m = m + 1 and returning to step 1, until the current frame signal is the last frame signal, at which point the method ends.
3. The dynamic echo cancellation method according to claim 1, wherein said step 1.2 specifically comprises:
step A, adopting formula II to obtain a signal first threshold P_1(m), said signal first threshold P_1(m) including a first threshold of the near-end signal or a first threshold of the far-end signal:
where win() represents a weighting function, 0 < win() < 1;
step B, adopting formula III to obtain a second threshold P_2(m), said P_2(m) including a near-end second threshold or a far-end second threshold:
P_2(m) = P_2(m-1) × gamma_1 + Y(m) × (1 - gamma_1)    Formula III
wherein P_2(m-1) is the second threshold of the (m-1)-th frame frequency domain sub-band signal, P_2(m-1) including a near-end second threshold or a far-end second threshold of the (m-1)-th frame frequency domain sub-band signal, and gamma_1 is a first weighting value, 0 < gamma_1 < 1;
step C, obtaining the current frame frequency domain sub-band threshold P(m) by adopting formula IV, wherein P(m) includes the current frame near-end frequency domain sub-band threshold P_n(m) or the current frame far-end frequency domain sub-band threshold P_f(m):
P(m) = P_1(m) × gamma_2 + P_2(m) × (1 - gamma_2)    Formula IV
wherein gamma_2 is a second weighting value, 0 < gamma_2 < 1.
4. The dynamic echo cancellation method according to claim 1, wherein said step 1.2 specifically comprises:
step I, adopting formula II to obtain a first threshold P_1(m), said first threshold P_1(m) including a near-end first threshold or a far-end first threshold:
where win() represents a weighting function, 0 < win() < 1;
step II, adopting formula III to obtain a second threshold P_2(m), said P_2(m) including a near-end second threshold or a far-end second threshold:
P_2(m) = P_2(m-1) × gamma_1 + Y(m) × (1 - gamma_1)    Formula III
wherein P_2(m-1) is the second threshold of the (m-1)-th frame frequency domain sub-band signal, P_2(m-1) including the second threshold of the (m-1)-th frame near-end frequency domain sub-band signal or the second threshold of the (m-1)-th frame far-end frequency domain sub-band signal, and gamma_1 is a first weighting value, 0 < gamma_1 < 1;
step III, obtaining the current frame frequency domain sub-band threshold P(m) by adopting formula V, wherein P(m) includes the current frame near-end frequency domain sub-band threshold P_n(m) or the current frame far-end frequency domain sub-band threshold P_f(m):
P(m) = max(P_1(m) × gamma_3, P_2(m) × gamma_4)    Formula V
where max() denotes taking the maximum value, gamma_3 is a third weighting value, 0 < gamma_3 < 1, and gamma_4 is a fourth weighting value, 0 < gamma_4 < 1.
5. A dynamic echo cancellation device is characterized by comprising a current frame marking sequence obtaining module, a historical frame marking sequence obtaining module, a preliminary delay value obtaining module, a current delay value obtaining module, a delay value judging module and an echo cancellation module;
the current frame marking sequence obtaining module comprises a signal obtaining sub-module, a threshold obtaining sub-module and a marking sequence obtaining sub-module;
the signal obtaining sub-module is used for obtaining the current m-th frame signal X(m), where m is an integer with an initial value of 1, and performing a short-time Fourier transform on the current frame signal to obtain the current frame frequency domain sub-band signal Y(m), wherein the current frame frequency domain sub-band signal Y(m) comprises the current frame near-end frequency domain sub-band signal Y_n(m) and the current frame far-end frequency domain sub-band signal Y_f(m), where n denotes the near end and f denotes the far end;
when m = 1, after the current frame frequency domain sub-band signal Y(1) is stored, m is set to m + 1 and control returns to the signal obtaining sub-module;
the threshold obtaining sub-module is used for obtaining the current frame frequency domain sub-band threshold P(m) according to the historical frame frequency domain sub-band signals {Y(m-1), Y(m-2), …, Y(m-l), …, Y(m-L) | l = 1, 2, …, L} and the current frame frequency domain sub-band signal Y(m), where 1 ≤ L < m and Y(m-l) represents the frequency domain sub-band signal obtained at the (m-l)-th time, wherein the current frame frequency domain sub-band threshold comprises the current frame near-end frequency domain sub-band threshold P_n(m) and the current frame far-end frequency domain sub-band threshold P_f(m);
the marker sequence obtaining sub-module is configured to obtain the current-frame marker sequence $T(m)$,
wherein obtaining the current-frame marker sequence $T(m)$ specifically comprises:
taking the current-frame near-end frequency-domain sub-band signal $Y_n(m)$ as a first sequence and the current-frame near-end frequency-domain sub-band threshold $P_n(m)$ as a second sequence, and inputting them into the difference marking method to obtain the current-frame near-end marker sequence $T_n(m)$;
taking the current-frame far-end frequency-domain sub-band signal $Y_f(m)$ as a first sequence and the current-frame far-end frequency-domain sub-band threshold $P_f(m)$ as a second sequence, and inputting them into the difference marking method to obtain the current-frame far-end marker sequence $T_f(m)$;
the difference marking method specifically comprises:
computing the element-wise difference of the first sequence and the second sequence to obtain a difference sequence;
binarizing the difference sequence to obtain a marker sequence;
during binarization, if an element of the difference sequence is greater than or equal to 0, the element at the same position in the marker sequence is 1, and otherwise it is 0;
the historical-frame marker sequence obtaining module is configured to obtain the previously stored far-end marker sequences of S historical frames $\{T_f(m-1), T_f(m-2), \ldots, T_f(m-s), \ldots, T_f(m-S) \mid s = 1, 2, \ldots, S\}$, and step 3 is performed, where $1 \le S < m$ and $T_f(m-s)$ denotes the far-end marker sequence obtained the s-th time before the present execution;
the preliminary delay value obtaining module is configured to perform a bitwise exclusive-OR operation between the current-frame near-end marker sequence $T_n(m)$ and each of the far-end marker sequences of the S historical frames $\{T_f(m-1), T_f(m-2), \ldots, T_f(m-s), \ldots, T_f(m-S) \mid s = 1, 2, \ldots, S\}$ to obtain S binary sequences;
to take, as the marker signal sequence, the historical-frame far-end signal sequence corresponding to the historical-frame far-end marker sequence whose binary sequence contains the most 1s among the S binary sequences;
and to compute the time difference between the marker signal sequence and the current-frame near-end signal sequence, obtaining and storing the preliminary delay value of the current frame;
the current delay value obtaining module is configured to obtain the previously stored preliminary delay values of N historical frames, and to find the delay value that occurs most often among the preliminary delay value of the current frame and the preliminary delay values of the N historical frames, yielding the current-frame delay value N1, where N1 ≤ S;
the delay value judging module is configured to compute the time-domain correlation between the current-frame near-end signal $X_n(m)$ and the far-end signal sequence corresponding to the current-frame delay value, obtaining a current correlation value;
and to compare the current correlation value with the previous correlation value and select the far-end signal corresponding to the larger correlation value as the aligned far-end signal of the current frame;
the echo cancellation module is configured to obtain adaptive filter coefficients from the aligned far-end signal of the current frame and the current-frame near-end signal $X_n(m)$;
to obtain an echo signal from the adaptive filter coefficients and the aligned far-end signal of the current frame;
and to subtract the echo signal from the current-frame near-end signal $X_n(m)$ to obtain the echo-cancelled signal of the current frame, then to set m to m + 1 and return to the current-frame marker sequence obtaining module, ending when the current frame signal is the last frame signal.
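The marking and delay-selection steps of the device in claim 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the difference is taken as first sequence minus second sequence, the delay is expressed in frames (the index s of the selected historical frame), ties are resolved by first occurrence, and all function names and sample values are hypothetical. The selection rule mirrors the claim wording literally (most 1s in the exclusive-OR result).

```python
from collections import Counter


def mark_sequence(signal, threshold):
    """Difference marking: 1 where signal - threshold >= 0, otherwise 0."""
    return [1 if s - t >= 0 else 0 for s, t in zip(signal, threshold)]


def preliminary_delay(near_marks, far_history):
    """Return s such that T_f(m-s) XOR T_n(m) contains the most 1s.

    far_history[0] is T_f(m-1), far_history[1] is T_f(m-2), and so on.
    """
    ones = [sum(a ^ b for a, b in zip(near_marks, far)) for far in far_history]
    return ones.index(max(ones)) + 1  # first maximum wins on ties (assumption)


def current_delay(preliminary_now, preliminary_history):
    """Most frequent value among the current and the N stored preliminary delays."""
    counts = Counter([preliminary_now] + list(preliminary_history))
    return counts.most_common(1)[0][0]


# Illustrative values only (assumptions, not taken from the patent).
near_marks = mark_sequence([0.9, 0.1, 0.7, 0.2], [0.5, 0.5, 0.5, 0.5])
far_history = [
    [1, 0, 1, 0],  # T_f(m-1)
    [0, 1, 0, 1],  # T_f(m-2)
    [1, 1, 0, 0],  # T_f(m-3)
]
delay_now = preliminary_delay(near_marks, far_history)
delay_voted = current_delay(delay_now, [2, 3, 2, 1])
print(near_marks)   # [1, 0, 1, 0]
print(delay_now)    # 2
print(delay_voted)  # 2
```

Voting over the stored preliminary delays keeps a single outlier frame from changing the delay value used for alignment.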
7. The dynamic echo cancellation device of claim 5, wherein the threshold obtaining sub-module comprises a first threshold obtaining unit, a second threshold obtaining unit, and a threshold obtaining unit;
the first threshold obtaining unit is configured to obtain a first threshold $P_1(m)$ using Formula II, where $P_1(m)$ comprises a near-end first threshold or a far-end first threshold,
where $\mathrm{win}(\cdot)$ denotes a weighting function, $0 < \mathrm{win}(\cdot) < 1$;
the second threshold obtaining unit is configured to obtain a second threshold $P_2(m)$ using Formula III, where $P_2(m)$ comprises a near-end second threshold or a far-end second threshold:
$P_2(m) = P_2(m-1) \times \gamma_1 + Y(m) \times (1 - \gamma_1)$ (Formula III)
where $P_2(m-1)$ is the second threshold of the frequency-domain sub-band signal of the (m-1)-th frame, comprising either the near-end second threshold or the far-end second threshold of the frequency-domain sub-band signal of the (m-1)-th frame, and $\gamma_1$ is a first weight, $0 < \gamma_1 < 1$;
the threshold obtaining unit is configured to obtain the current-frame frequency-domain sub-band threshold $P(m)$ using Formula IV, where $P(m)$ comprises the current-frame near-end frequency-domain sub-band threshold $P_n(m)$ or the current-frame far-end frequency-domain sub-band threshold $P_f(m)$:
$P(m) = P_1(m) \times \gamma_2 + P_2(m) \times (1 - \gamma_2)$ (Formula IV)
where $\gamma_2$ is a second weight, $0 < \gamma_2 < 1$.
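Formula IV blends the two thresholds with a single weight. A minimal Python sketch, assuming each threshold is a list with one real value per sub-band; the function name and the sample values are illustrative and not from the patent:

```python
def combine_thresholds_weighted(p1, p2, gamma2):
    """Formula IV: convex combination of the first and second thresholds."""
    if not 0.0 < gamma2 < 1.0:
        raise ValueError("gamma2 must lie strictly between 0 and 1")
    return [a * gamma2 + b * (1.0 - gamma2) for a, b in zip(p1, p2)]


# Illustrative values only (assumptions, not taken from the patent).
p1_cur = [4.0, 1.0]  # P1(m) for two sub-bands
p2_cur = [2.0, 3.0]  # P2(m) for the same sub-bands
p_cur = combine_thresholds_weighted(p1_cur, p2_cur, gamma2=0.25)
print(p_cur)  # [2.5, 2.5]
```

Because the weights sum to 1, the result always lies between P1(m) and P2(m) in every sub-band.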
8. The dynamic echo cancellation device of claim 5, wherein the threshold obtaining sub-module comprises a first threshold obtaining unit, a second threshold obtaining unit, and a threshold obtaining unit;
the first threshold obtaining unit is configured to obtain a first threshold $P_1(m)$ using Formula II, where $P_1(m)$ comprises a near-end first threshold or a far-end first threshold,
where $\mathrm{win}(\cdot)$ denotes a weighting function, $0 < \mathrm{win}(\cdot) < 1$;
the second threshold obtaining unit is configured to obtain a second threshold $P_2(m)$ using Formula III, where $P_2(m)$ comprises a near-end second threshold or a far-end second threshold:
$P_2(m) = P_2(m-1) \times \gamma_1 + Y(m) \times (1 - \gamma_1)$ (Formula III)
where $P_2(m-1)$ is the second threshold of the frequency-domain sub-band signal of the (m-1)-th frame, comprising either the second threshold of the near-end frequency-domain sub-band signal of the (m-1)-th frame or the second threshold of the far-end frequency-domain sub-band signal of the (m-1)-th frame, and $\gamma_1$ is a first weight, $0 < \gamma_1 < 1$;
the threshold obtaining unit is configured to obtain the current-frame frequency-domain sub-band threshold $P(m)$ using Formula V, where $P(m)$ comprises the current-frame near-end frequency-domain sub-band threshold $P_n(m)$ or the current-frame far-end frequency-domain sub-band threshold $P_f(m)$:
$P(m) = \max(P_1(m) \times \gamma_3,\ P_2(m) \times \gamma_4)$ (Formula V)
where $\max(\cdot)$ takes the maximum value, $\gamma_3$ is a third weight, $0 < \gamma_3 < 1$, and $\gamma_4$ is a fourth weight, $0 < \gamma_4 < 1$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911133606.1A CN110931032B (en) | 2019-11-19 | 2019-11-19 | Dynamic echo cancellation method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110931032A (en) | 2020-03-27 |
CN110931032B (en) | 2022-08-02 |
Family
ID=69853545
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911133606.1A Active CN110931032B (en) | 2019-11-19 | 2019-11-19 | Dynamic echo cancellation method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110931032B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111785290A (en) * | 2020-05-18 | 2020-10-16 | 深圳市东微智能科技股份有限公司 | Microphone array voice signal processing method, device, equipment and storage medium |
CN112397082A (en) * | 2020-11-17 | 2021-02-23 | 北京达佳互联信息技术有限公司 | Method, apparatus, electronic device and storage medium for estimating echo delay |
CN112489670A (en) * | 2020-12-01 | 2021-03-12 | 广州华多网络科技有限公司 | Time delay estimation method and device, terminal equipment and computer readable storage medium |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100067712A1 (en) * | 2008-09-12 | 2010-03-18 | Yuuji Maeda | Echo Cancelling Device, Signal Processing Device and Method Thereof, and Program |
CN102505966A (en) * | 2011-11-23 | 2012-06-20 | 太原海斯特电子有限公司 | Accurate positioning method based on echo coherence effect |
CN104395957A (en) * | 2012-04-30 | 2015-03-04 | 创新科技有限公司 | A universal reconfigurable echo cancellation system |
CN104778950A (en) * | 2014-01-15 | 2015-07-15 | 华平信息技术股份有限公司 | Microphone signal delay compensation control method based on echo cancellation |
CN105472189A (en) * | 2014-09-30 | 2016-04-06 | 想象技术有限公司 | Detection of acoustic echo cancellation |
CN105847611A (en) * | 2016-03-21 | 2016-08-10 | 腾讯科技(深圳)有限公司 | Echo time delay detection method, echo elimination chip and terminal device |
CN105872275A (en) * | 2016-03-22 | 2016-08-17 | Tcl集团股份有限公司 | Speech signal time delay estimation method and system used for echo cancellation |
CN106506872A (en) * | 2016-11-02 | 2017-03-15 | 腾讯科技(深圳)有限公司 | Talking state detection method and device |
CN107333018A (en) * | 2017-05-24 | 2017-11-07 | 华南理工大学 | A kind of echo delay time estimation and method for tracing |
CN107331406A (en) * | 2017-07-03 | 2017-11-07 | 福建星网智慧软件有限公司 | A kind of method of dynamic adjustment Echo-delay |
US20180329671A1 (en) * | 2017-05-15 | 2018-11-15 | MIXHalo Corp. | Systems and methods for providing real-time audio and data |
CN109727607A (en) * | 2017-10-31 | 2019-05-07 | 腾讯科技(深圳)有限公司 | Delay time estimation method, device and electronic equipment |
CN110138990A (en) * | 2019-05-14 | 2019-08-16 | 浙江工业大学 | A method of eliminating mobile device voip phone echo |
Non-Patent Citations (4)
Title |
---|
R.J. TREMBLAY ET AL: "Development and analysis of echo classification using time delays", IEEE JOURNAL OF OCEANIC ENGINEERING (Volume 21, Issue 2, Apr 1996), 30 April 1996 (1996-04-30) * |
Yao Li et al. (姚力等): "An improved WebRTC-based echo cancellation algorithm for VoIP" (VoIP中一种基于WebRTC的回声消除改进算法), Computer Science (《计算机科学》), 15 June 2017 (2017-06-15) * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111785290A (en) * | 2020-05-18 | 2020-10-16 | 深圳市东微智能科技股份有限公司 | Microphone array voice signal processing method, device, equipment and storage medium |
CN111785290B (en) * | 2020-05-18 | 2023-12-26 | 深圳市东微智能科技股份有限公司 | Microphone array voice signal processing method, device, equipment and storage medium |
CN112397082A (en) * | 2020-11-17 | 2021-02-23 | 北京达佳互联信息技术有限公司 | Method, apparatus, electronic device and storage medium for estimating echo delay |
CN112397082B (en) * | 2020-11-17 | 2024-05-14 | 北京达佳互联信息技术有限公司 | Method, device, electronic equipment and storage medium for estimating echo delay |
CN112489670A (en) * | 2020-12-01 | 2021-03-12 | 广州华多网络科技有限公司 | Time delay estimation method and device, terminal equipment and computer readable storage medium |
CN112489670B (en) * | 2020-12-01 | 2023-08-18 | 广州华多网络科技有限公司 | Time delay estimation method, device, terminal equipment and computer readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN110931032B (en) | 2022-08-02 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110931032B (en) | Dynamic echo cancellation method and device | |
EP3468162B1 (en) | Method and device for tracking echo delay | |
CN111899752B (en) | Noise suppression method and device for rapidly calculating voice existence probability, storage medium and terminal | |
JP4842583B2 (en) | Method and apparatus for multisensory speech enhancement | |
WO2019080552A1 (en) | Echo cancellation method and apparatus based on time delay estimation | |
WO2010052749A1 (en) | Noise suppression device | |
US10771621B2 (en) | Acoustic echo cancellation based sub band domain active speaker detection for audio and video conferencing applications | |
CN109688284B (en) | Echo delay detection method | |
CN110211602B (en) | Intelligent voice enhanced communication method and device | |
TW201913646A (en) | Continuous updating method and device for finite impulse response filter coefficient vector | |
JP2008544328A (en) | Multisensory speech enhancement using clean speech prior distribution | |
US20120158401A1 (en) | Music detection using spectral peak analysis | |
CN112309417B (en) | Method, device, system and readable medium for processing audio signal with wind noise suppression | |
CN108010536A (en) | Echo cancel method, device, system and storage medium | |
CN106571147A (en) | Method for suppressing acoustic echo of network telephone | |
CN109991520A (en) | A kind of cable oscillation wave partial discharge detecting system velocity of wave New calculating method | |
WO2022218254A1 (en) | Voice signal enhancement method and apparatus, and electronic device | |
CN111223492A (en) | Echo path delay estimation method and device | |
CN105654959B (en) | Adaptive filtering coefficient updating method and device | |
WO2020252629A1 (en) | Residual acoustic echo detection method, residual acoustic echo detection device, voice processing chip, and electronic device | |
CN111355855B (en) | Echo processing method, device, equipment and storage medium | |
EP2716023A1 (en) | Control of adaptation step size and suppression gain in acoustic echo control | |
KR20110061781A (en) | Apparatus and method for subtracting noise based on real-time noise estimation | |
CN116106826A (en) | Sound source positioning method, related device and medium | |
JP3720795B2 (en) | Sound source receiving position estimation method, apparatus, and program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||