CN1822092A

CN1822092A - Method and device for eliminating background noise in speech input

Info

Publication number: CN1822092A
Application number: CNA2006100115725A
Authority: CN
Inventors: 杨作兴
Original assignee: Vimicro Corp
Current assignee: Vimicro Corp
Priority date: 2006-03-28
Filing date: 2006-03-28
Publication date: 2006-08-23
Anticipated expiration: 2026-03-28
Also published as: CN1822092B

Abstract

The invention discloses a method and a device for eliminating background noise in speech input. After an analog input speech signal is received, an A/D conversion module converts it into a sequence of digital speech samples. An amplitude statistics module computes the energy value of each sample in the current sample block and derives a statistical value from them. If the statistical value is lower than a set noise threshold, a noise elimination module attenuates the sample sequence of the current block, and the attenuated sequence is converted to an analog speech signal and output by a D/A conversion module; otherwise, the original sample sequence is converted to an analog speech signal and output by the D/A conversion module.

Description

A method and device for eliminating background noise in speech input

Technical field

The present invention relates to a method and device for eliminating background noise in speech input.

Background technology

Two main solutions currently exist for the background noise problem in voice calls: eliminating background noise with an analog filter, and eliminating it with a digital signal processor (DSP), wherein:

In the analog filter method, shown in Fig. 1, the analog filter mainly comprises a high-pass filter and a low-pass filter. The high-pass filter removes the low-frequency part of the noise that is of no interest in a voice call, such as the part below 200 Hz. The low-pass filter removes the high-frequency part of the noise that is of no interest in a voice call, such as the part above 4 kHz.

The advantage of the analog filter method is that it is simple to implement, with low hardware cost and low power consumption. Its obvious disadvantage is that its effect is very limited and its range of application is narrow, because it can only remove the low-frequency and high-frequency parts of the noise and is powerless against noise in the same frequency band as speech. Unfortunately, background noise is also mainly concentrated in the mid-frequency band, to which hearing is more sensitive.

In the digital signal processor (DSP) method, shown in Fig. 2, the analog signal output by the microphone is converted into a digital signal by an analog-to-digital (AD) converter; after the DSP processes it according to a certain algorithm, the digital signal is sent to a digital-to-analog (DA) converter; the DA converter restores the digital signal to an analog signal, which is sent to the speaker's mobile phone.

The advantage of the DSP method is that it is flexible and widely applicable: it can adopt different software algorithms according to the characteristics of the noise (such as spectral distribution, amplitude distribution and other statistical features), and can achieve a fairly good noise elimination effect. Its disadvantage is that it is complex to implement, with high hardware cost (AD, DA and DSP are all required) and high power consumption.

Summary of the invention

The technical problem to be solved by the present invention is to provide a method and device for eliminating background noise in speech input that can achieve a better performance-to-cost ratio.

To solve the above technical problem, the invention provides a method for eliminating background noise in speech input, comprising the following steps:

(a) after receiving an analog input speech signal, the system converts the signal into a sequence of digital speech samples;

(b) the energy value of each sample in the current sample block is calculated and statistics are computed to obtain a statistical value;

(c) if the statistical value is lower than a set noise threshold, the samples of the current sample block are noise-attenuated, and the attenuated sample sequence is then converted to an analog speech signal and output; otherwise, the original sample sequence is converted directly to an analog speech signal and output.

Further, the above method may also have the following feature: in step (a), when the signal is converted into a digital speech signal, the input analog speech signal is first converted into a 1-bit oversampled signal, and this signal is then subjected to band-compression filtering to become a multi-bit digital signal at 1x the sampling rate, thereby filtering out the high-frequency part of the noise.

Further, the above method may also have the following feature: in step (a), after the multi-bit digital signal at 1x the sampling rate is obtained, the signal is also passed through a high-pass filter to filter out the low-frequency part of the noise.

Further, the above method may also have the following feature: the energy value of a sample is represented by the amplitude value or the power value of the sample.

Further, the above method may also have the following feature: the statistical value obtained in step (b) is the statistical maximum of the amplitudes or powers of the samples in the current sample block.

Further, the above method may also have the following feature: when the samples of the current sample block are noise-attenuated in step (c), the amplitude of each output sample is adjusted to its original amplitude multiplied by the statistical maximum of the amplitude or power and then divided by the noise threshold value.
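Written out, the attenuation rule above amounts to a per-block gain below 1. The symbols x(n), y(n), g and T are notation introduced here for illustration only: x(n) is an original sample, y(n) the output sample, E_max the statistical maximum, and T the noise threshold value.

```latex
y(n) = x(n)\cdot\frac{E_{\max}}{T},
\qquad
g = \frac{E_{\max}}{T} < 1 \quad \text{whenever } E_{\max} < T .
```

Because g shrinks as E_max falls further below T, quieter blocks are attenuated more strongly, and a block whose E_max is just under T is left almost unchanged.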

Further, the above method may also have the following feature: the set noise threshold is 10 to 20 mV.

The invention also provides a device for eliminating background noise in speech input, comprising an analog-to-digital conversion module, an amplitude statistics module, a noise elimination module and a digital-to-analog conversion module, characterized in that:

The analog-to-digital conversion module is used to convert the input analog speech signal into a digital speech signal and output it to the noise elimination module and the amplitude statistics module respectively;

The amplitude statistics module is used to calculate the energy value of each sample in the current sample block, compute statistics to obtain a statistical value, and output it to the noise elimination module;

The noise elimination module is used to judge whether the statistical value is lower than a set noise threshold; if so, it noise-attenuates the samples of the current sample block and then outputs them to the digital-to-analog conversion module; otherwise, it outputs the original sample sequence directly to the digital-to-analog conversion module;

The digital-to-analog conversion module is used to convert the input sample sequence, whether attenuated or not, into an analog speech signal and output it.

Further, the above device may also have the following feature: the analog-to-digital conversion module further comprises a sigma-delta conversion module and a band-compression filtering module, wherein:

The sigma-delta conversion module is used to convert the input analog speech signal, by sigma-delta conversion, into a 1-bit oversampled digital signal and output it to the band-compression filtering module;

The band-compression filtering module is used to convert the above 1-bit signal into a multi-bit digital signal at 1x the sampling rate and output it.

Further, the above device may also have the following feature: the analog-to-digital conversion module further comprises a high-pass filtering module, used to receive the digital signal output by the band-compression filtering module and filter out the low-frequency part of the noise.

Further, the above device may also have the following feature: the statistical value obtained by the amplitude statistics module is the statistical maximum of the amplitudes or powers of the samples in the current sample block.

Further, the above device may also have the following feature: when performing noise attenuation, the noise elimination module adjusts the amplitude of each output sample to its original amplitude multiplied by the statistical maximum of the amplitude or power and then divided by the noise threshold value.

In summary, with the method and device of the present invention for eliminating background noise in speech input, both the performance and the cost of the invention lie between those of the two existing schemes, but within the scope of application specified by the invention a better performance-to-cost ratio than the earlier schemes can be achieved.

Description of drawings

Fig. 1 is a schematic diagram of an existing device that uses an analog filter to eliminate background noise;

Fig. 2 is a schematic diagram of an existing device that uses a DSP to eliminate background noise;

Fig. 3 is a schematic diagram of the system to which the device of the embodiment of the invention is applied;

Fig. 4 is a schematic diagram of the AD conversion module of the embodiment of the invention.

Embodiment

The main purpose of the invention is to filter out ambient noise when the person is not speaking. Because the amplitude of background noise is generally smaller than that of the speech signal, a signal of very small amplitude is taken to be background noise, and small-amplitude signals are attenuated in amplitude, thereby achieving the purpose of eliminating noise.

This embodiment uses digital technology and implements noise elimination in hardware. As shown in Fig. 3, the device of this embodiment comprises an AD conversion module, an amplitude statistics module, a noise elimination module and a DA conversion module, wherein:

The AD conversion module is used to convert the input analog speech signal into a digital speech signal and output it to the noise elimination module and the amplitude statistics module respectively.

The amplitude statistics module is used to calculate the amplitude of each sample in the current sample block, determine the statistical maximum of the amplitude (this maximum is not necessarily the actual maximum, hence the term statistical maximum), and output it to the noise elimination module.

The noise elimination module is used to compare whether the statistical maximum of the amplitude is lower than a set noise threshold; if so, it attenuates the amplitude of the samples of the current sample block and then outputs them to the DA conversion module; otherwise, it outputs the original sample sequence directly to the DA conversion module.

The DA conversion module is used to convert the input sample sequence, whether attenuated or not, into an analog speech signal and output it.

As shown in Fig. 4, the AD conversion module of this embodiment also has the function of filtering out the low-frequency and high-frequency parts of the noise, and further comprises the following units:

A sigma-delta (SIGMA-DELTA) conversion unit, used to convert the input analog speech signal, by SIGMA-DELTA conversion, into a 1-bit (1BIT) digital signal at 128 times the sampling rate (64 times, 256 times and so on are also possible) and output it to the band-compression filtering unit;

A band-compression filtering unit, used to convert the above 1BIT signal into a 16BIT digital signal at 1x the sampling rate (the width can be set as required, for example 24BIT) and output it to the high-pass filtering unit;

A high-pass filtering unit, used to filter out the low-frequency part of the noise.

After the speech signal is processed by the above sigma-delta conversion unit and band-compression filtering unit, the high-band noise is pushed into the frequency range above 0.5 times the sampling rate and filtered out, which gives the AD conversion module a good low-pass characteristic.
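The two-stage conversion described above can be illustrated with a minimal Python sketch. It is not the circuit of the embodiment: it assumes a first-order modulator, holds each input value for one full oversampling period, and uses a plain averaging (boxcar) filter as a stand-in for the band-compression filter. The function names `sigma_delta_1bit` and `decimate_to_16bit` are hypothetical.

```python
def sigma_delta_1bit(samples, osr=128):
    """First-order sigma-delta modulator (illustrative assumption).

    Each input value in [-1, 1] is held for `osr` clock ticks and
    encoded as a stream of +1/-1 bits.
    """
    integrator = 0.0
    bits = []
    for x in samples:
        for _ in range(osr):
            bit = 1 if integrator >= 0.0 else -1
            # Integrate the difference between the input and the fed-back bit.
            integrator += x - bit
            bits.append(bit)
    return bits


def decimate_to_16bit(bits, osr=128):
    """Average each group of `osr` bits (a crude low-pass filter) and
    scale the result to a signed 16-bit integer at 1x the sampling rate."""
    out = []
    for i in range(0, len(bits) - osr + 1, osr):
        mean = sum(bits[i:i + osr]) / osr
        out.append(max(-32768, min(32767, round(mean * 32767))))
    return out


pcm = decimate_to_16bit(sigma_delta_1bit([0.5, -0.25, 0.0]))
```

Averaging 128 one-bit values recovers each held input level to within a few percent of full scale; a practical decimation filter would be considerably sharper than this boxcar.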

The method of eliminating background noise in this embodiment is applied to the speech processing system shown in Fig. 3 and comprises the following steps:

Step 1: after receiving the input analog speech signal, the system performs AD conversion on it to obtain a sequence of digital speech samples, and filters out the high-frequency and low-frequency parts of the noise;

In this embodiment, the input analog speech signal is first converted by SIGMA-DELTA conversion into a 1BIT oversampled digital signal (128 times the sampling rate), and this 1BIT signal is then converted into a 16BIT digital signal at 1x the sampling rate.

Step 2: the amplitudes of the samples in the current sample block are calculated and statistics are computed to obtain the statistical maximum Emax of the amplitude;

In this embodiment, the following algorithm is used to obtain the amplitude statistical maximum Emax of the current sample block, although any other algorithm may also be used.

Let e(n) be the amplitude sequence corresponding to the sample sequence x(n), where x(n) is the current 16BIT data, n = 0, 1, ..., L-1, and L is the number of samples in a sample block; in this embodiment L = 1024.

Let e(0) = α|x(0)| and e(n) = α|x(n)| + (1-α)e(n-1);

When |x(n)| > e(n-1), the signal is in the fast-rise (attack) stage and α takes the fast-rise coefficient α_attack (which the user can set through a register); otherwise α takes the non-fast-rise coefficient α_non_attack (which the user can also set through a register);

Then calculate the statistical maximum of the sample amplitudes of the block, Emax = Max(e(n)).
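The recursion for e(n) and Emax can be sketched in Python as follows. The default coefficient values are placeholders chosen for illustration (the text only says they are set through registers), and the text does not say which coefficient applies at n = 0, so the attack coefficient is assumed there. The function name `statistical_max` is hypothetical.

```python
def statistical_max(block, alpha_attack=0.5, alpha_non_attack=0.01):
    """Return Emax = max(e(n)) for one block, using
    e(n) = alpha*|x(n)| + (1 - alpha)*e(n-1)."""
    e_prev = None
    e_max = 0.0
    for x in block:
        mag = abs(x)
        if e_prev is None:
            # e(0) = alpha*|x(0)|; the choice of alpha at n = 0 is an assumption.
            e = alpha_attack * mag
        else:
            # Fast-rise coefficient when the sample exceeds the previous
            # smoothed amplitude, otherwise the non-fast-rise coefficient.
            alpha = alpha_attack if mag > e_prev else alpha_non_attack
            e = alpha * mag + (1.0 - alpha) * e_prev
        e_max = max(e_max, e)
        e_prev = e
    return e_max


print(statistical_max([0, 0, 1000, 0, 0]))  # 500.0 with the default coefficients
```

With these placeholder coefficients a single isolated spike of 1000 yields Emax = 500, which shows why the result is a statistical maximum rather than the actual peak: short spikes are smoothed before the maximum is taken.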

Step 3: if the Emax obtained is lower than a set threshold noise_threshold, the samples of the current sample block are attenuated in amplitude and the attenuated sample sequence is then converted to an analog speech signal and output; otherwise, the original sample sequence is converted directly to an analog speech signal and output.

In this embodiment, if Emax < noise_threshold (noise_threshold is the set noise threshold, which the user can set; a range of 10 to 20 mV is preferred), the amplitude of each output sample is adjusted to its original amplitude multiplied by Emax and divided by noise_threshold, thereby attenuating the noise.
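A minimal Python sketch of this step is given below. It takes Emax as an input rather than computing it, and it assumes the threshold is expressed on the same scale as the samples. Because the text gives the threshold in mV but not the converter's full-scale voltage, the helper `millivolts_to_code` uses a hypothetical full scale of 1000 mV; both function names are illustrative.

```python
def attenuate_block(block, e_max, noise_threshold):
    """Scale a block by Emax / noise_threshold when Emax is below the
    threshold; otherwise return the samples unchanged."""
    if e_max >= noise_threshold:
        return list(block)
    gain = e_max / noise_threshold  # always in [0, 1) on this branch
    return [int(round(x * gain)) for x in block]


def millivolts_to_code(mv, full_scale_mv=1000.0, max_code=32767):
    """Map a threshold in mV onto the 16-bit sample scale.

    full_scale_mv is a hypothetical ADC full-scale value, not taken
    from the text.
    """
    return mv / full_scale_mv * max_code


quiet = attenuate_block([100, -200], e_max=50, noise_threshold=100)
loud = attenuate_block([100, -200], e_max=150, noise_threshold=100)
print(quiet, loud)  # [50, -100] [100, -200]
```

The gain depends on how far Emax sits below the threshold, so a block that is only slightly below it is barely changed, while a very quiet block is scaled close to zero.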

In another embodiment of the invention, when Emax < noise_threshold, the digital signal after noise elimination can be set to 0. However, this processing cuts off the tail of the speaker's speech abruptly, which is unpleasant for the listener, whereas the noise attenuation method above lets the tail of the speaker's speech fade out gradually and is a more user-friendly design.

On the basis of the above embodiment, various other variations are also possible. For example, the power of the samples can be calculated instead, and the decision made on signal power rather than signal amplitude; the effect is the same, since both in fact reflect the energy of the signal. In addition, the invention is not limited to the decision rule of the embodiment. For example, the amplitudes (or powers) of the M samples of largest amplitude in the digital speech sample sequence can be averaged and the average compared with a threshold, and the signal amplitude attenuated if the average is below that threshold; M can be a fixed number, or a proportion of the sample block length, and so on. When attenuating, the amplitude can also be reduced to some other fixed fraction of the original amplitude that is less than 1, such as 1/4, although the adaptivity is then somewhat poorer.
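The alternative decision rule and the fixed attenuation factor can be sketched in Python as follows. The sketch assumes, as in the main embodiment, that attenuation applies when the statistic falls below the threshold; the function names `top_m_mean` and `gate_fixed` and the parameter names are hypothetical.

```python
def top_m_mean(block, m, use_power=False):
    """Mean amplitude (or power) of the m largest-magnitude samples."""
    values = sorted((abs(x) for x in block), reverse=True)[:m]
    if use_power:
        values = [v * v for v in values]
    return sum(values) / len(values) if values else 0.0


def gate_fixed(block, m, threshold, factor=0.25):
    """Attenuate by a fixed factor when the top-m mean amplitude is
    below the threshold; otherwise pass the block through unchanged."""
    if top_m_mean(block, m) < threshold:
        return [int(round(x * factor)) for x in block]
    return list(block)


print(top_m_mean([1, -5, 3, 2], 2))   # 4.0
print(gate_fixed([4, -8, 0], 2, 10))  # [1, -2, 0]
```

A fixed factor such as 1/4 is simpler than scaling by Emax / noise_threshold, but it produces a step in level at the threshold instead of a gradual fade, which is the loss of adaptivity the text mentions.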

Claims

1. A method for eliminating background noise in speech input, comprising the following steps:

2. The method of claim 1, characterized in that, in step (a), when the signal is converted into a digital speech signal, the input analog speech signal is first converted into a 1-bit oversampled signal, and this signal is then subjected to band-compression filtering to become a multi-bit digital signal at 1x the sampling rate, thereby filtering out the high-frequency part of the noise.

3. The method of claim 2, characterized in that, in step (a), after the multi-bit digital signal at 1x the sampling rate is obtained, the signal is also passed through a high-pass filter to filter out the low-frequency part of the noise.

4. The method of claim 1, characterized in that the energy value of a sample is represented by the amplitude value or the power value of the sample.

5. The method of claim 4, characterized in that the statistical value obtained in step (b) is the statistical maximum of the amplitudes or powers of the samples in the current sample block.

6. The method of claim 5, characterized in that, when the samples of the current sample block are noise-attenuated in step (c), the amplitude of each output sample is adjusted to its original amplitude multiplied by the statistical maximum of the amplitude or power and then divided by the noise threshold value.

7. The method of claim 5, characterized in that the set noise threshold is 10 to 20 mV.

8. A device for eliminating background noise in speech input, comprising an analog-to-digital conversion module, an amplitude statistics module, a noise elimination module and a digital-to-analog conversion module, characterized in that:

9. The device of claim 8, characterized in that the analog-to-digital conversion module further comprises a sigma-delta conversion module and a band-compression filtering module, wherein:

10. The device of claim 9, characterized in that the analog-to-digital conversion module further comprises a high-pass filtering module, used to receive the digital signal output by the band-compression filtering module and filter out the low-frequency part of the noise.

11. The method of claim 8, characterized in that the statistical value obtained by the amplitude statistics module is the statistical maximum of the amplitudes or powers of the samples in the current sample block.

12. The method of claim 8, characterized in that, when performing noise attenuation, the noise elimination module adjusts the amplitude of each output sample to its original amplitude multiplied by the statistical maximum of the amplitude or power and then divided by the noise threshold value.