CN109243482B

CN109243482B - Micro-array voice noise reduction method based on improved ACRANC and beamforming

Info

Publication number: CN109243482B
Application number: CN201811275824.4A
Authority: CN
Inventors: 曾庆宁; 罗瀛; 方韶劻; 林凤梅; 谢先明; 龙超
Original assignee: Shenzhen Aangsi Science & Technology Co ltd
Current assignee: Shenzhen Aangsi Science & Technology Co ltd
Priority date: 2018-10-30
Filing date: 2018-10-30
Publication date: 2022-03-18
Anticipated expiration: 2038-10-30
Also published as: CN109243482A

Abstract

The invention discloses a micro-array voice noise reduction method based on improved ACRANC and beamforming. It relates to the technical field of voice signal processing and addresses the technical problem of how to further improve the noise suppression performance of the ACRANC method. The method comprises the following steps: (I) improving the ACRANC method, specifically: (1) obtaining multiple channels of noise-reduced but distorted voice signals through multi-channel adaptive noise cancellation; (2) using the multi-channel distorted voice signals as the input of the recovery filter in the ACRANC system, thereby obtaining noise-reduced voice; (II) beamforming, specifically: (1) establishing a plurality of improved ACRANC subsystems and an adaptive mode control (AMC) subsystem to obtain multi-channel noise-reduced voice; (2) applying beamforming to the multi-channel noise-reduced voice to obtain a further improved noise-reduced voice. The invention improves the quality of the output voice and further improves the voice noise reduction effect.

Description

Micro-array voice noise reduction method based on improved ACRANC and beamforming

Technical Field

The invention relates to the technical field of voice signal processing, and in particular to a micro-array voice noise reduction method based on improved ACRANC and beamforming.

Background

Voice noise reduction technology can effectively improve voice quality and the recognition rate of voice recognition systems, and micro-array voice noise reduction is an effective approach. A micro microphone array is an array with a small aperture, usually within 5 cm, and a small number of array elements. A micro-array is easy to embed in various application devices and therefore has wide application value. The generalized sidelobe cancellation (GSC) method based on a voice activity detector (VAD), abbreviated VAD-GSC, is a common and effective micro-array voice noise reduction method. Array Crosstalk Resistant Adaptive Noise Cancellation (abbreviated ACRANC) is also an effective micro-array voice noise reduction method; in many situations, especially when the voice source is close to the array, ACRANC reduces noise better than VAD-GSC and many of its improved variants.

In ACRANC, the input of the second-stage filter is a single signal, namely the distorted speech signal output by the first-stage filter. The function of the second-stage filter is to recover a pure speech signal from this distorted speech signal, i.e. to make the output of the second-stage filter approach the pure speech signal in the main microphone. Because audio propagation in a real environment is complex and the first-stage ACRANC filter distorts the audio signal, the quality of the speech recovered by the second-stage filter is still insufficient.

Disclosure of Invention

In view of the deficiencies of the prior art, the technical problem solved by the invention is how to further improve the noise suppression performance of the ACRANC method.

To solve the above technical problem, the technical solution adopted by the invention is a micro-array voice noise reduction method based on improved ACRANC and beamforming, which performs voice noise reduction by inputting multiple channels of distorted voice into the recovery filter and combining this with DAS (Delay And Sum) beamforming, and comprises the following steps:

(I) Improving the ACRANC method, with the following specific steps:

(1) Multiple channels of noise-reduced but distorted voice signals are obtained through multi-channel adaptive noise cancellation. The specific process is as follows:

Suppose the speech signal is s(k) and the noise signal is n(k). They reach microphone M_i through multiple paths and are converted into the signals s_i(k) and n_i(k). The impulse responses from the speech source and the noise source to microphone M_i are assumed to be h_si(k) and h_ni(k). The signal actually picked up by microphone M_i is denoted x_i(k)＝s_i(k)+n_i(k), where i＝1,2,…,N and k＝0,1,2,…; N is the number of microphones in the array and k is the discrete time index. Then:

x_i(k)＝s_i(k)+n_i(k) (1)

s_i(k)＝h_si(k)*s(k) (2)

n_i(k)＝h_ni(k)*n(k), i＝1,2,…,N (3)

where * is the convolution operator;
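The signal model of equations (1) to (3) can be illustrated with a minimal sketch. The helper names `convolve` and `microphone_signals`, the toy impulse responses and the truncation of each convolution to the input length are assumptions made for illustration only; they are not part of the patent.

```python
def convolve(h, x):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(h) + len(x) - 1)
    for a, ha in enumerate(h):
        for b, xb in enumerate(x):
            out[a + b] += ha * xb
    return out


def microphone_signals(s, n, h_s, h_n):
    """Return x_i(k) = h_si * s + h_ni * n for every microphone.

    s and n must have the same length; each result is truncated to it.
    """
    length = len(s)
    signals = []
    for hs_i, hn_i in zip(h_s, h_n):
        s_i = convolve(hs_i, s)[:length]   # equation (2)
        n_i = convolve(hn_i, n)[:length]   # equation (3)
        signals.append([a + b for a, b in zip(s_i, n_i)])  # equation (1)
    return signals


# Toy data: a unit speech pulse at k=0 and a noise pulse at k=1.
s = [1.0, 0.0, 0.0, 0.0]
n = [0.0, 2.0, 0.0, 0.0]
h_s = [[1.0], [0.0, 0.5]]   # hypothetical speech paths to M_1 and M_2
h_n = [[0.5], [1.0]]        # hypothetical noise paths to M_1 and M_2
x = microphone_signals(s, n, h_s, h_n)
print(x)
```

In this toy case the second microphone receives the speech one sample later and attenuated, while receiving the noise at full strength, which is the kind of crosstalk the method is designed to handle.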

Let the impulse response from speech signal s_i to speech signal s_j be h^s_ij(k), and let the propagation impulse response from noise signal n_i to noise signal n_j be h^n_ij(k). Then:

s_j(k)＝h^s_ij(k)*s_i(k) (4)

n_j(k)＝h^n_ij(k)*n_i(k) (5)

In this sub-step, for each microphone M_i, the signal x_i(k) obtained by microphone M_i is taken as the main-channel signal, and the signals x_j(k) (j＝1,…,i-1,i+1,…,N) obtained by the other N-1 microphones are taken as reference signals. In the global silence stage, i.e. the stage in which every channel is silent, filter A_i adaptively cancels the noise in the main channel using the noise in the multi-channel reference signals. In the non-global-silence stage, the coefficients of filter A_i are kept unchanged and only the filtered output is produced. In this way, multiple channels of distorted speech signals are obtained. The reasoning is as follows:

Since the speech signals s_i(k)＝0, i＝1,2,…,N, in the global silence stage, we have:

x_i(k)＝y_i1(k)+e_i1(k) (6)

n_i(k)＝w_i \mathbf{n}_i(k)+err_i(k) (7)

where x_i(k)＝n_i(k), e_i1(k)＝err_i(k) is the prediction error, y_i1(k)＝w_i \mathbf{n}_i(k) is the output of filter A_i, and w_i is the coefficient row vector of filter A_i, of dimension 1×(N-1)(L+1), i.e.:

w_i＝(w_i1,…,w_i(i-1),w_i(i+1),…,w_iN) (8)

where w_ij＝(w_ij0,w_ij1,…,w_ijL), and \mathbf{n}_i(k) is the noise signal column vector of dimension (N-1)(L+1)×1:

\mathbf{n}_i(k)＝[\mathbf{n}_i1(k),…,\mathbf{n}_i(i-1)(k),\mathbf{n}_i(i+1)(k),…,\mathbf{n}_iN(k)]^T (9)

where \mathbf{n}_ij(k)＝[n_ij(k),n_ij(k-1),…,n_ij(k-L)]^T and L is the number of delay samples of the reference-channel noise signals;

Let the minimum error power be P[err_i^0(k)], and let the corresponding optimal coefficient vector be:

w_i^0＝(w_i1^0,…,w_i(i-1)^0,w_i(i+1)^0,…,w_iN^0) (10)

To obtain the above w_i^0 and P[err_i^0(k)], it is only necessary to adjust filter A_i so that the sum of squares of e_i1 is minimal;

In the stage immediately following the global silence stage, under the assumption that the noise environment is constant or slowly varying, the optimal coefficients of filter A_i are kept unchanged and only the filtered output is produced, so that:

y_i1(k)＝w_i^0 \mathbf{x}_i(k)＝w_i^0 \mathbf{s}_i(k)+w_i^0 \mathbf{n}_i(k) (11)

where \mathbf{x}_i(k) and \mathbf{s}_i(k) denote the picked-up noisy speech vector and the pure speech vector, respectively. From equations (6) and (11):

e_i1(k)＝x_i(k)-y_i1(k)＝p_i(k)+err_i^0(k) (12)

where:

p_i(k)＝s_i(k)-w_i^0 \mathbf{s}_i(k) (13)

Here e_i1(k) is distorted speech containing residual noise, and p_i(k) is the distorted speech within it, which, as equation (13) shows, is formed by distortion of the N channels of clean speech signals;

Letting i run from 1 to N, with each channel in turn used as the main signal and the remaining channels as reference signals, yields N channels of distorted speech signals e_j1(k) (j＝1,2,…,N) containing residual noise.
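The first stage can be sketched with an LMS update, which the description later names as one possible adaptive algorithm. This is a minimal sketch under stated assumptions: the function name `first_stage`, the tap count, the step size `mu` and the two-microphone toy mixture are all hypothetical, and the silence flags are given directly instead of coming from a detector.

```python
import math
import random


def first_stage(primary, references, silent, taps=4, mu=0.01):
    """Multi-reference LMS canceller playing the role of filter A_i.

    primary    : samples x_i(k) of the main channel
    references : list of the other channels x_j(k), j != i
    silent     : silent[k] is True when every channel is speech-free
    Returns e_i1(k) = x_i(k) - y_i1(k).
    """
    w = [[0.0] * taps for _ in references]   # one tap vector per reference
    out = []
    for k in range(len(primary)):
        # Delayed reference samples x_j(k), x_j(k-1), ...; zero before start.
        u = [[ref[k - d] if k - d >= 0 else 0.0 for d in range(taps)]
             for ref in references]
        y = sum(wj[d] * uj[d] for wj, uj in zip(w, u) for d in range(taps))
        e = primary[k] - y
        if silent[k]:                        # adapt only in global silence
            for wj, uj in zip(w, u):
                for d in range(taps):
                    wj[d] += mu * e * uj[d]
        out.append(e)                        # coefficients held otherwise
    return out


rng = random.Random(0)
K = 6000
noise = [rng.gauss(0.0, 1.0) for _ in range(K)]
speech = [0.0 if k < 4000 else math.sin(0.2 * k) for k in range(K)]
silent = [k < 4000 for k in range(K)]
# Hypothetical two-microphone mixture with speech crosstalk into the reference.
x1 = [speech[k] + 0.6 * noise[k] + (0.3 * noise[k - 1] if k else 0.0)
      for k in range(K)]
x2 = [0.5 * speech[k] + noise[k] for k in range(K)]
e11 = first_stage(x1, [x2], silent)
```

Once speech starts, the frozen filter also subtracts a filtered copy of the speech that leaked into the reference channel, so `e11` is distorted speech rather than clean speech, which is why a recovery stage follows.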

(2) The multi-channel distorted voice signals are used as the input of the recovery filter in the ACRANC system to obtain the noise-reduced voice signal. The specific process is as follows:

The multi-channel distorted speech signals e_j1(k) (j＝1,2,…,N) are input into the second-stage filter B_i of the ACRANC system. In the stages other than the global silence stage, filter B_i is adjusted so that the sum of squares of its output error e_i2(k) is minimal, where:

||e_i2(k)||²＝||x_i(k)-y_i2(k)||²

＝||s_i(k)+n_i(k)-y_i2(k)||²

＝||n_i(k)||²+||s_i(k)-y_i2(k)||²+2n_i(k)[s_i(k)-y_i2(k)] (14)

Taking expectations of equation (14), with the noise n_i(k) uncorrelated with s_i(k)-y_i2(k), gives:

E[e_i2²(k)]＝E[n_i²(k)]+E[(s_i(k)-y_i2(k))²] (15)

As can be seen from formula (15), minimizing E[e_i2²(k)] is equivalent to minimizing E[(s_i(k)-y_i2(k))²], i.e. minimizing the mean-square difference between y_i2(k) and the speech s_i(k), so that the output y_i2(k) of filter B_i approaches the clean speech signal s_i(k). Because the input of filter B_i is multi-channel rather than single-channel distorted speech, a better voice noise reduction effect than ACRANC is obtained; the resulting noise-reduced speech signal is the output y_i2(k) of filter B_i.
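The decomposition in equation (14) can be checked numerically. The sketch below is only an illustration: the Gaussian test signals and the stand-in filter output `y = 0.7 * s` are assumptions, chosen so that the output is independent of the noise.

```python
import random

rng = random.Random(1)
K = 20000
s = [rng.gauss(0, 1) for _ in range(K)]   # stand-in for speech s_i(k)
n = [rng.gauss(0, 1) for _ in range(K)]   # noise n_i(k), independent of s
y = [0.7 * v for v in s]                  # stand-in for an imperfect y_i2(k)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Left-hand side of (14): power of e_i2(k) = s_i(k) + n_i(k) - y_i2(k).
lhs = mean((si + ni - yi) ** 2 for si, ni, yi in zip(s, n, y))
# The three terms on the right-hand side of (14).
noise_power = mean(ni ** 2 for ni in n)
speech_error = mean((si - yi) ** 2 for si, yi in zip(s, y))
cross = mean(2 * ni * (si - yi) for si, ni, yi in zip(s, n, y))

print(lhs, noise_power + speech_error + cross, cross)
```

The identity holds sample by sample, and the cross term averages towards zero when the noise is uncorrelated with the speech error, which is why minimizing the power of e_i2(k) amounts to minimizing the speech recovery error.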

(II) Beamforming: the voice noise reduction effect is further improved by combining the improved ACRANC with beamforming. The specific steps are as follows:

(1) A plurality of improved ACRANC subsystems and an adaptive mode control (AMC) subsystem are established to obtain multi-channel noise-reduced voice. The specific process is as follows:

With each channel in turn taken as the main signal and the remaining channels as reference signals, an improved ACRANC is established, so that N subsystems are established in total.

In each improved ACRANC, the input of filter B_i is the outputs of all filters A_i (i＝1,2,…,N) rather than the output of a single filter A_i. The adaptive mode control (AMC) determines when the filters in these subsystems update their coefficients and when they hold them fixed;

In a silence period without voice, i.e. an NVP period, the optimal coefficients of filter A_i can be adjusted to compensate for errors caused by changes in environmental factors. To this end, a global silence stage, i.e. the ONVP stage, is defined, and the first filter A_i of each subsystem adjusts its optimal coefficients only during the ONVP;

The set of NVPs of the i-th noisy speech signal x_i(k) picked up by microphone M_i is denoted NVP(i). It consists of a series of discrete intervals, namely:

NVP(i)＝∪_j [k'_ij,k''_ij] (16)

where the discrete interval

[k'_ij,k''_ij]＝{k'_ij,k'_ij+1,…,k''_ij}

is the j-th NVP of x_i(k). Obviously NVP(i₁) is not necessarily equal to NVP(i₂) for i₁≠i₂, i₁,i₂∈{1,2,…,N}, but NVP(i₁) is simply a translation of NVP(i₂) along the time axis;

ONVP is defined as:

ONVP＝∩_{i＝1}^{N} NVP(i) (17)

From this it is easy to prove that:

ONVP＝∪_j [k'_j,k''_j] (18)

where:

k'_j＝max_i k'_ij, k''_j＝min_i k''_ij

If k''_j＜k'_j, then [k'_j,k''_j]＝φ is defined in formula (18);
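The global silence stage is the set of samples at which every channel is silent, so it can be computed as an intersection of per-channel interval lists. The following is a minimal sketch; the function names `intersect` and `onvp` and the example intervals are hypothetical, and intervals are assumed to be sorted, non-overlapping and closed.

```python
def intersect(a, b):
    """Intersect two sorted lists of closed integer intervals (start, end)."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:                 # an interval with hi < lo is empty
            out.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


def onvp(nvp_per_channel):
    """Global silence: samples that are silent in every channel."""
    result = nvp_per_channel[0]
    for nvp in nvp_per_channel[1:]:
        result = intersect(result, nvp)
    return result


nvp = [
    [(0, 100), (300, 400)],    # NVP(1)
    [(2, 102), (302, 402)],    # NVP(2): NVP(1) shifted by 2 samples
    [(5, 105), (305, 405)],    # NVP(3): NVP(1) shifted by 5 samples
]
global_silence = onvp(nvp)
print(global_silence)
```

Because the per-channel sets are translations of one another, each resulting interval starts at the latest start and ends at the earliest end among the channels.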

When adjusting the optimal coefficients of filter A_i, none of the channels may contain a voice signal; otherwise the voice would be cancelled together with the noise. Therefore the coefficients of filter A_i are adjusted only in the following L-ONVP stage:

L-ONVP＝∪_j [k'_j+L,k''_j] (19)

where L is the number of delay samples of the reference signals input to filter A_i, and:

[k'_j+L,k''_j]＝{k'_j+L,k'_j+L+1,…,k''_j} (20)

If k''_j＜k'_j+L, then [k'_j+L,k''_j]＝φ is likewise defined in formula (20);

In the L-ONVP stage, all signals and all the delayed samples used belong to the silence stage and contain no voice signal, so the optimal coefficients of filter A_i can be adjusted in the L-ONVP stage. The NVP stage mentioned above refers to the L-ONVP or a part of it;
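The trimming described above can be sketched as follows: each global-silence interval loses its first L samples, so that every delayed sample the filter reads is also silent, and intervals that become empty are dropped. The function name `l_onvp` and the example intervals are hypothetical.

```python
def l_onvp(onvp_intervals, L):
    """Shift each interval start by L samples and drop empty intervals."""
    trimmed = []
    for start, end in onvp_intervals:
        if end >= start + L:          # otherwise the trimmed interval is empty
            trimmed.append((start + L, end))
    return trimmed


usable = l_onvp([(5, 100), (305, 400), (500, 510)], L=24)
print(usable)
```

The third interval is shorter than the filter memory, so no sample in it has a fully silent history and it contributes nothing to adaptation.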

The optimal coefficients of filter A_i are adjusted during the (Δ,Δ')-ONVP stage, which is constructed from the discrete time intervals [k'_{i₀j},k''_{i₀j}] that constitute NVP(i₀) of the i₀-th channel signal. Here Δ' is a positive integer that can be chosen freely according to the accuracy of the VD decision, the aim being to ensure that the time interval used is a pure-noise interval; Δ is also a selectable positive integer, but it should satisfy:

Δ≥L+δ+Δ' (22)

where δ is the time delay, counted in delay samples, for noise to propagate from the other microphones of the array to the i₀-th microphone. The maximum number of delay samples is:

δ＝max_i (d_i·f/c) (23)

where d_i is the distance between microphone M_{i₀} and microphone M_i, f is the sampling frequency of the array, and c is the propagation speed of the audio signal in air;
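A short numerical sketch of the delay bound and of the constraint Δ≥L+δ+Δ' follows. The speed of sound of 343 m/s, the rounding up to whole samples, the microphone distances and the margin Δ'＝80 are all assumptions chosen for illustration; the patent text does not fix them.

```python
import math


def max_delay_samples(distances_m, fs_hz, c_m_per_s=343.0):
    """Largest inter-microphone propagation delay, rounded up to samples."""
    return max(math.ceil(d * fs_hz / c_m_per_s) for d in distances_m)


def smallest_delta(L, delta, delta_prime):
    """Smallest Delta allowed by Delta >= L + delta + Delta'."""
    return L + delta + delta_prime


# Hypothetical distances (metres) from the i0-th microphone to the others.
delta = max_delay_samples([0.02, 0.04, 0.05], fs_hz=8000)
Delta = smallest_delta(L=24, delta=delta, delta_prime=80)
print(delta, Delta)
```

With an aperture of a few centimetres the propagation term is only a sample or two, so in this sketch the filter memory L and the detector margin Δ' dominate the choice of Δ.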

In the stages outside the (Δ,Δ')-ONVP, the optimal coefficients of filter A_i of each subsystem are kept unchanged, and filter A_i is used only for filtering.

The optimal coefficients of all filters B_i are adaptively adjusted in the stages other than the global silence stage; for simplicity, B_i may also be adapted continuously from beginning to end;
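The role of the adaptive mode control can be sketched as a small decision function. This is a simplified sketch: the function name `amc_modes`, the mode labels and the per-sample boolean voice flags are hypothetical, and the safety margins Δ and Δ' are ignored.

```python
def amc_modes(vad_flags, always_adapt_b=False):
    """Map per-channel voice flags for one sample to filter modes.

    vad_flags[i] is True when channel i is judged to contain speech.
    Filters A_i adapt only in global silence; filters B_i adapt outside it,
    or all the time when always_adapt_b is set (the simplified option).
    """
    global_silence = not any(vad_flags)
    a_mode = "adapt" if global_silence else "hold"
    b_mode = "adapt" if (always_adapt_b or not global_silence) else "hold"
    return {"A": a_mode, "B": b_mode}


print(amc_modes([False, False, False]))
print(amc_modes([False, True, False]))
```

A single flag raised on any channel is enough to freeze every filter A_i, which reflects the requirement that no channel may contain speech while the noise canceller adapts.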

(2) The final noise-reduced voice is obtained through delay-and-sum (DAS) beamforming. The specific process is as follows:

The output of each subsystem is one channel of noise-reduced voice signal. All N outputs can be fed into a beamformer to obtain a better voice noise reduction effect. If an ordinary DAS beamformer is used, its output combines the N subsystem outputs after each has been compensated by its delay τ_i, where τ_i is the delay with which the voice reaches microphone M_i relative to a selected reference microphone M_{i₀} in the array. Any microphone in the array may be chosen as the reference microphone M_{i₀}; typically the microphone at or near the center of the array is selected;

The delay time τ_i may be calculated by the cross-correlation method or the generalized cross-correlation method, or by the following method:

1) select a (Δ,T)-OVP discrete time interval [k',k''], where k''≥k'+Δ and k''-(k'+Δ) is as small as possible;

2) find the τ_i that satisfies the following condition:

If the array aperture is small and the sampling frequency of the array signals is not very high, all τ_i can be taken as 0.
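A minimal sketch of delay estimation by plain cross-correlation followed by delay-and-sum is given below. The function names, the integer-sample delays, the sign convention (a positive delay means the channel lags the reference) and the division by the number of channels are assumptions for illustration, not the patent's own formula.

```python
def estimate_delay(ref, sig, max_lag):
    """Lag (in samples) of sig relative to ref maximising cross-correlation."""
    def corr(lag):
        return sum(ref[k] * sig[k + lag]
                   for k in range(len(ref)) if 0 <= k + lag < len(sig))
    return max(range(-max_lag, max_lag + 1), key=corr)


def delay_and_sum(channels, delays):
    """Advance channel i by delays[i] samples, then average the channels."""
    length = len(channels[0])
    out = []
    for k in range(length):
        total = 0.0
        for ch, tau in zip(channels, delays):
            if 0 <= k + tau < length:
                total += ch[k + tau]
        out.append(total / len(channels))
    return out


ref = [0, 0, 1, 3, 1, 0, 0, 0]
late = [0, 0, 0, 1, 3, 1, 0, 0]      # the same pulse, one sample later
taus = [0, estimate_delay(ref, late, max_lag=2)]
aligned = delay_and_sum([ref, late], taus)
print(taus, aligned)
```

When all delays are taken as 0, as suggested for small apertures and moderate sampling rates, `delay_and_sum` reduces to a plain average of the subsystem outputs.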

Compared with the prior art, the invention has the beneficial effects that:

Compared with the original method, in which only one channel of distorted voice is input into the recovery filter, the improved ACRANC method achieves a better voice noise reduction effect than the ordinary ACRANC method, and combining the improved ACRANC method with beamforming further improves the noise reduction effect.

Drawings

FIG. 1 is a schematic flow diagram of the present invention;

FIG. 2 is a schematic diagram of speech and noise propagation and crosstalk;

FIG. 3 is a schematic diagram of the improved ACRANC system;

FIG. 4 is a schematic diagram of the improved ACRANC combined with beamforming.

Detailed Description

The following further describes the embodiments of the present invention with reference to the drawings, but the present invention is not limited thereto.

Fig. 1 shows a micro-array voice noise reduction method based on improved ACRANC and beamforming, which performs voice noise reduction by inputting multiple channels of distorted voice into the recovery filter of ACRANC and combining this with beamforming. It comprises the following steps:

Suppose the speech signal is s(k) and the noise signal is n(k). As shown in FIG. 2, they reach microphone M_i through multiple paths and are converted into the signals s_i(k) and n_i(k). The impulse responses from the speech source and the noise source to microphone M_i are assumed to be h_si(k) and h_ni(k). The signal actually picked up by microphone M_i is denoted x_i(k)＝s_i(k)+n_i(k), where i＝1,2,…,N and k＝0,1,2,…; N is the number of microphones in the array and k is the discrete time index. Then:

x_i(k)＝s_i(k)+n_i(k) (1)

s_i(k)＝h_si(k)*s(k) (2)

n_i(k)＝h_ni(k)*n(k), i＝1,2,…,N (3)

where * is the convolution operator;

Let the impulse response from speech signal s_i to speech signal s_j be h^s_ij(k), and let the propagation impulse response from noise signal n_i to noise signal n_j be h^n_ij(k). Then:

s_j(k)＝h^s_ij(k)*s_i(k) (4)

n_j(k)＝h^n_ij(k)*n_i(k) (5)

In this sub-step, for each microphone M_i, the signal x_i(k) obtained by microphone M_i is taken as the main-channel signal, and the signals x_j(k) (j＝1,…,i-1,i+1,…,N) obtained by the other N-1 microphones are taken as reference signals. In the global silence stage, i.e. the stage in which every channel is silent, filter A_i, as shown in FIG. 3, adaptively cancels the noise in the main channel using the noise in the multi-channel reference signals. In the non-global-silence stage, the coefficients of filter A_i are kept unchanged and only the filtered output is produced. In this way, multiple channels of distorted speech signals are obtained. The reasoning is as follows:

x_i(k)＝y_i1(k)+e_i1(k) (6)

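n_i(k)＝w_i \mathbf{n}_i(k)+err_i(k) (7)

w_i＝(w_i1,…,w_i(i-1),w_i(i+1),…,w_iN) (8)

\mathbf{n}_i(k)＝[\mathbf{n}_i1(k),…,\mathbf{n}_i(i-1)(k),\mathbf{n}_i(i+1)(k),…,\mathbf{n}_iN(k)]^T (9)

Let the minimum error power be P[err_i^0(k)] and the corresponding optimal coefficient vector be w_i^0. To obtain w_i^0 and P[err_i^0(k)], it is only necessary to adjust filter A_i so that the sum of squares of e_i1 is minimal. With the optimal coefficients then held fixed:

e_i1(k)＝p_i(k)+err_i^0(k) (12)

where:

p_i(k)＝s_i(k)-w_i^0 \mathbf{s}_i(k) (13)

and \mathbf{s}_i(k) is the pure speech vector of the reference channels. Here e_i1(k) is distorted speech containing residual noise, and p_i(k) is the distorted speech within it, which, as equation (13) shows, is formed by distortion of the N channels of clean speech signals;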

The multi-channel distorted speech signals e_j1(k) (j＝1,2,…,N) are input into the second-stage filter B_i of the ACRANC system. In the stages other than the global silence stage, filter B_i is adjusted so that the sum of squares of its output error e_i2(k) is minimal, where:

||e_i2(k)||²＝||x_i(k)-y_i2(k)||²

＝||s_i(k)+n_i(k)-y_i2(k)||²

＝||n_i(k)||²+||s_i(k)-y_i2(k)||²+2n_i(k)[s_i(k)-y_i2(k)] (14)

Taking expectations of equation (14), with the noise n_i(k) uncorrelated with s_i(k)-y_i2(k), gives:

E[e_i2²(k)]＝E[n_i²(k)]+E[(s_i(k)-y_i2(k))²] (15)

As can be seen from formula (15), minimizing E[e_i2²(k)] is equivalent to minimizing E[(s_i(k)-y_i2(k))²], i.e. minimizing the mean-square difference between y_i2(k) and the speech s_i(k), so that the output y_i2(k) of filter B_i approaches the clean speech signal s_i(k).

The input of filter B_i is the N channels of signals e_j1(k) (j＝1,2,…,N), all of which are, by equation (13), distorted speech signals formed from the N channels of speech. The output approximation produced by this multi-channel input is therefore better than that produced by inputting only the single channel e_i1(k). In theory, if all the coefficients that filter B_i applies to the other input signals e_j1(k) (j＝1,…,i-1,i+1,…,N) take the value 0, the N-channel input degenerates to the case in which only the single signal e_i1(k) is input. The improved ACRANC method is therefore no worse, and generally better, than the existing ACRANC method; the resulting noise-reduced speech signal is the output y_i2(k) of filter B_i.

In each improved ACRANC, the input of filter B_i is the outputs of all filters A_i (i＝1,2,…,N) rather than the output of a single filter A_i. As shown in fig. 4, the adaptive mode control (AMC) determines when the filters in these subsystems update their coefficients and when they hold them fixed;

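The set of NVPs of the i-th channel is NVP(i)＝∪_j [k'_ij,k''_ij], where the discrete interval is:

[k'_ij,k''_ij]＝{k'_ij,k'_ij+1,…,k''_ij}

ONVP is defined as:

ONVP＝∩_{i＝1}^{N} NVP(i) (17)

From this it is easy to prove that:

ONVP＝∪_j [k'_j,k''_j] (18)

where:

k'_j＝max_i k'_ij, k''_j＝min_i k''_ij

If k''_j＜k'_j, then [k'_j,k''_j]＝φ is defined in formula (18);

The coefficients of filter A_i are adjusted only in the L-ONVP stage, L-ONVP＝∪_j [k'_j+L,k''_j], where L is the number of delay samples of the reference signals input to filter A_i, and:

[k'_j+L,k''_j]＝{k'_j+L,k'_j+L+1,…,k''_j} (20)

If k''_j＜k'_j+L, then [k'_j+L,k''_j]＝φ is likewise defined in formula (20);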

In the (Δ,Δ')-ONVP stage, [k'_{i₀j},k''_{i₀j}] is a discrete time interval constituting NVP(i₀) of the i₀-th channel signal. Δ' is a positive integer that can be chosen freely according to the accuracy of the VD decision, the aim being to ensure that the time interval used is a pure-noise interval; Δ is also a selectable positive integer, but it should satisfy:

Δ≥L+δ+Δ' (22)

where δ is bounded by the largest inter-microphone propagation delay in samples, d_i is the distance between microphone M_{i₀} and microphone M_i, f is the sampling frequency of the array, and c is the propagation speed of the audio signal in air;

The delay time τ_i may be calculated by the cross-correlation method or the generalized cross-correlation method, or as follows:

2) find the τ_i that satisfies the following condition:

For example, if the distance from any microphone M_i in the array to the reference microphone M_{i₀} is less than 2 cm and the sampling frequency of the array is 8000 Hz, the maximum delay will be less than half the sampling interval, so all τ_i may be taken as 0.
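The half-sample claim in this example can be checked with a line of arithmetic. The speed of sound of 343 m/s is an assumption (the text only refers to the propagation speed in air), and the function name is hypothetical.

```python
def delay_in_samples(distance_m, fs_hz, c_m_per_s=343.0):
    """Propagation delay over distance_m, expressed in sampling intervals."""
    return distance_m * fs_hz / c_m_per_s


worst = delay_in_samples(0.02, 8000)   # 2 cm at 8000 Hz
negligible = worst < 0.5               # less than half a sampling interval
print(round(worst, 3), negligible)
```

At 2 cm and 8000 Hz the delay is a little under half a sample, so rounding every τ_i to 0 loses less than half a sampling interval of alignment.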

(3) Computational complexity

Fig. 4 shows the voice noise reduction process combining ACRANC with DAS beamforming. The computational load of both the AMC and the DAS beamformer is small, and the AMC can be implemented by a VAD (Voice Activity Detector). The computational complexity of the method therefore depends mainly on the computational load of the improved ACRANC algorithm in the N subsystems, which in turn depends on the adaptive algorithm used by filters A_i and B_i of each improved ACRANC. If the LMS adaptive algorithm is adopted, it is not difficult to show that the computational load of the improved ACRANC algorithm in the N subsystems does not exceed

(2_A+3_M)[(L+1)(N-1)+(L_B+1)N]Nf (26)

where 2_A denotes 2 additions, 3_M denotes 3 multiplications, L is the number of delay samples used by the reference signals, which determines the order of filter A_i in equation (10), N is the number of microphones in the array, L_B is the corresponding quantity determining the order of filter B_i, and f is the sampling rate of the microphone array. Since many chips can complete an addition and a multiplication in a single operation, the real computation time is much shorter than that implied by equation (26).

For example, if the length determining filter A_i is L＝24, the length determining filter B_i is L_B, the sampling frequency is f＝8000 Hz and the array consists of N＝5 microphones, then by equation (26) the computation involved is no more than 41 MFLOPS.
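The bound of equation (26) is easy to evaluate directly. In the sketch below, the function name is hypothetical and the value L_B＝20 is an inference, not a figure stated in the text: it is the value for which the formula gives exactly 41 million operations per second with L＝24, N＝5 and f＝8000.

```python
def lms_ops_per_second(L, L_B, N, f, adds=2, mults=3):
    """Upper bound (2_A + 3_M)[(L+1)(N-1) + (L_B+1)N] N f of equation (26)."""
    taps = (L + 1) * (N - 1) + (L_B + 1) * N   # coefficients per subsystem
    return (adds + mults) * taps * N * f


# L_B = 20 is an assumed value chosen to reproduce the 41 MFLOPS figure.
ops = lms_ops_per_second(L=24, L_B=20, N=5, f=8000)
print(ops / 1e6, "million operations per second")
```

The count grows roughly with the square of the number of microphones, because each of the N subsystems runs filters whose total length is itself proportional to N.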

Compared with the prior art, the invention has the beneficial effects that:

The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the described embodiments. It will be apparent to those skilled in the art that various changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention.

Claims

1. A micro-array voice noise reduction method based on improved ACRANC and beamforming, characterized in that voice noise reduction is performed by inputting multiple channels of distorted voice into a recovery filter and combining this with beamforming, the method comprising the following steps:

suppose the speech signal is s(k) and the noise signal is n(k); they reach microphone M_i through multiple paths and are converted into the signals s_i(k) and n_i(k); the impulse responses from the speech source and the noise source to microphone M_i are assumed to be h_si(k) and h_ni(k); the signal actually picked up by microphone M_i is denoted x_i(k)＝s_i(k)+n_i(k), where i＝1,2,…,N and k＝0,1,2,…, N being the number of microphones in the array and k the discrete time index; then:

x_i(k)＝s_i(k)+n_i(k) (1)

s_i(k)＝h_si(k)*s(k) (2)

n_i(k)＝h_ni(k)*n(k), i＝1,2,…,N (3)

where * is the convolution operator;

let the impulse response from speech signal s_i to speech signal s_j be h^s_ij(k), and let the propagation impulse response from noise signal n_i to noise signal n_j be h^n_ij(k); then:

s_j(k)＝h^s_ij(k)*s_i(k) (4)

n_j(k)＝h^n_ij(k)*n_i(k) (5)

in this sub-step, for each microphone M_i, the signal x_i(k) obtained by microphone M_i is taken as the main-channel signal, and the signals x_j(k) (j＝1,…,i-1,i+1,…,N) obtained by the other N-1 microphones are taken as reference signals; in the global silence stage, i.e. the stage in which every channel is silent, filter A_i adaptively cancels the noise in the main channel using the noise in the multi-channel reference signals; in the non-global-silence stage, the coefficients of filter A_i are kept unchanged and only the filtered output is produced; in this way, multiple channels of distorted speech signals are obtained; the reasoning is as follows:

x_i(k)＝y_i1(k)+e_i1(k) (6)

n_i(k)＝w_i \mathbf{n}_i(k)+err_i(k) (7)

w_i＝(w_i1,…,w_i(i-1),w_i(i+1),…,w_iN) (8)

\mathbf{n}_i(k)＝[\mathbf{n}_i1(k),…,\mathbf{n}_i(i-1)(k),\mathbf{n}_i(i+1)(k),…,\mathbf{n}_iN(k)]^T (9)

let the minimum error power be P[err_i^0(k)] and the corresponding optimal coefficient vector be w_i^0; to obtain w_i^0 and P[err_i^0(k)], it is only necessary to adjust filter A_i so that the sum of squares of e_i1 is minimal;

where, with the optimal coefficients held fixed, e_i1(k)＝p_i(k)+err_i^0(k) and p_i(k)＝s_i(k)-w_i^0 \mathbf{s}_i(k), \mathbf{s}_i(k) being the pure speech vector of the reference channels;

letting i run from 1 to N, with each channel in turn used as the main signal and the remaining channels as reference signals, yields N channels of distorted speech signals e_j1(k) (j＝1,2,…,N) containing residual noise;

(2) the multi-channel distorted voice signals are used as the input of the recovery filter in the ACRANC system so as to obtain noise-reduced voice, the specific process being as follows:

||e_i2(k)||²＝||x_i(k)-y_i2(k)||²

＝||s_i(k)+n_i(k)-y_i2(k)||²

＝||n_i(k)||²+||s_i(k)-y_i2(k)||²+2n_i(k)[s_i(k)-y_i2(k)] (14)

taking expectations of equation (14), with the noise n_i(k) uncorrelated with s_i(k)-y_i2(k), gives:

E[e_i2²(k)]＝E[n_i²(k)]+E[(s_i(k)-y_i2(k))²] (15)

as can be seen from formula (15), minimizing E[e_i2²(k)] is equivalent to minimizing E[(s_i(k)-y_i2(k))²], i.e. minimizing the mean-square difference between y_i2(k) and the speech s_i(k), so that the output y_i2(k) of filter B_i approaches the clean speech signal s_i(k); because the input of filter B_i is multi-channel rather than single-channel distorted speech, a better voice noise reduction effect than ACRANC is obtained, the resulting noise-reduced speech signal being the output y_i2(k) of filter B_i;

(1) a plurality of improved ACRANC subsystems and an adaptive mode control (AMC) subsystem are established to obtain multi-channel noise-reduced voice, the specific process being as follows:

with each channel in turn taken as the main signal and the remaining channels as reference signals, an improved ACRANC is established, so that N subsystems are established;

in a silence period without voice, i.e. an NVP period, the optimal coefficients of filter A_i can be adjusted to compensate for errors caused by changes in environmental factors; to this end, a global silence stage, i.e. the ONVP stage, is defined, and the first filter A_i of each subsystem adjusts its optimal coefficients only during the ONVP;

the set of NVPs of the i-th channel is NVP(i)＝∪_j [k'_ij,k''_ij], where the discrete interval is:

[k'_ij,k''_ij]＝{k'_ij,k'_ij+1,…,k''_ij}

ONVP is defined as:

ONVP＝∩_{i＝1}^{N} NVP(i) (17)

from this it is easy to prove that:

ONVP＝∪_j [k'_j,k''_j] (18)

where:

k'_j＝max_i k'_ij, k''_j＝min_i k''_ij

if k''_j＜k'_j, then [k'_j,k''_j]＝φ is defined in formula (18);

when adjusting the optimal coefficients of filter A_i, none of the channels may contain a voice signal, otherwise the voice would be cancelled together with the noise; therefore the coefficients of filter A_i are adjusted only in the L-ONVP stage, L-ONVP＝∪_j [k'_j+L,k''_j], where:

[k'_j+L,k''_j]＝{k'_j+L,k'_j+L+1,…,k''_j} (20)

if k''_j＜k'_j+L, then [k'_j+L,k''_j]＝φ is likewise defined in formula (20);

in the (Δ,Δ')-ONVP stage, Δ should satisfy:

Δ≥L+δ+Δ' (22)

where δ is the time delay, counted in delay samples, for noise to propagate from the other microphones of the array to the i₀-th microphone, the maximum number of delay samples being determined by d_i, the distance between microphone M_{i₀} and microphone M_i;

in the stages outside the (Δ,Δ')-ONVP, the optimal coefficients of filter A_i of each subsystem are kept unchanged, and filter A_i is used only for filtering;

the optimal coefficients of all filters B_i are adaptively adjusted in the stages other than the global silence stage;

any microphone in the array may be chosen as the reference microphone; typically the microphone at or near the center of the array is selected as the reference microphone.

2. The method of claim 1, wherein the delay time τ_i is calculated by the cross-correlation method or the generalized cross-correlation method, or as follows:

2) find the τ_i that satisfies the following condition: