CA1332626C - Noise reduction - Google Patents
Noise reduction
- Publication number
- CA1332626C CA000588588A
- Authority
- CA
- Canada
- Prior art keywords
- signal
- linear
- spectral
- spectral components
- noise
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03G—CONTROL OF AMPLIFICATION
- H03G7/00—Volume compression or expansion in amplifiers
- H03G7/007—Volume compression or expansion in amplifiers of digital or coded signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Noise Elimination (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
A noise reduction system for enhancing noisy speech signals by performing a spectral decomposition on the signal, passing each spectral component through a non-linear stage which progressively attenuates lower intensity spectral components (uncorrelated noise) but passes higher intensity spectral components (correlated speech) relatively unattenuated, and reconstituting the signal. Frames of noisy signal are transformed into the frequency domain by an FFT device, with windowing.
Each transformed frame is then processed to effect a non-linear transfer characteristic, which is linear above a soft "knee" region, and rolls off below, and transformed back to a reconstituted time-domain signal with reduced noise by an IFFT device (with overlapping). A level control matches the signal to the characteristic. In further embodiments, the characteristic may vary between frequency bands, and may be matched to speech formants by tracking formants using an LSP technique.
Description
NOISE REDUCTION
This invention relates to a method of reducing the level of noise in a signal, and to apparatus for reducing noise using this method; particularly but not exclusively this invention relates to a method of reducing noise in a speech signal, and to apparatus for thus producing a speech signal with enhanced intelligibility.
A signal will often acquire broadband noise so that the time-average noise power is spread across a portion of the noise spectrum. In a speech system, noise may cause a listener severe fatigue or discomfort.
It is obviously desirable to reduce noise, and many methods of doing so are known; in speech systems, some types of noise are more perceptually acceptable than others. Especially desirable are methods which may be used with existing transmission equipment, and preferably are easily added at the receiver end.
It is known to reduce noise in high noise environments (-6 to +6 dB signal-to-noise ratio) by so-called spectral subtraction techniques, in which the signal is processed by transforming it into the frequency domain, then subtracting an estimate of the noise power in each spectral band, then re-transforming into the time domain. This technique suffers from several drawbacks, however. Firstly, it is necessary to measure the noise power in each spectral line; this involves identifying 'non-speech' periods, which can be complicated and unreliable. Secondly, it requires the assumption that the noise spectrum is stationary between the instants at which the noise power is measured; this is not necessarily the case. Thirdly, if an estimate of noise power made in one non-speech period is applied to the next non-speech period correctly, there will be a total absence of background noise during non-speech periods, and this modulation of the background noise sounds unpleasant to a listener.
According to the invention, there is provided a noise reduction apparatus comprising: first conversion means for receiving a time-varying signal and producing therefrom output signals representing the magnitude of spectral components thereof, processing means for receiving the output of the first conversion means, the processing means having a non-linear transfer characteristic such that in use low magnitude inputs thereto are attenuated relative to high magnitude inputs, the transfer characteristic being linear for high magnitude spectral components and non-linear for low magnitude spectral components, wherein the average slope of the non-linear region represented on identical logarithmic axes does not exceed 10 at detectable signal levels, and second conversion means for receiving the output of the processing means and reconstituting therefrom a time-varying signal.
Preferably, the transition between the linear and non-linear regions of the characteristic is gradual and substantially without discontinuities in slope, so as to progressively roll off lower magnitude (noise) spectral components.
Preferably, a level adjusting operation is performed so that the signal is maintained in a predetermined relation to the transfer characteristic, which may be an automatic gain control operation on the signal.
Preferably the first conversion operates on frames of the signal and uses a one dimensional or complex transform to produce a series of transform coefficients, and the second conversion applies the inverse transform to reconstitute the signal. In a preferred embodiment a Fast Fourier transform is utilized. Where such a transform is employed, it will be advantageous to provide shaping of each frame using a window function, so as to reduce frequency 'leakage' when the frame is transformed. Where such a window function is employed, the sampled data frames are preferably overlapped.
In a second embodiment, several different transfer characteristics are employed within the processing so that a more severe attenuation is effected in certain spectral regions. Where the signal is a speech signal, these regions may be assigned on a fixed basis, employing knowledge of the spectral position of speech formant bands for an average speaker, or may be derived by the apparatus for each speaker by initially measuring formant band time-averaged positions.
In a third embodiment, several different transfer characteristics are employed, and the spectral positions of the dominant bands of the signal continuously tracked so that a more severe attenuation may be effected in spectral regions where there are no significant components of the signal. This is advantageously achieved by using a Line Spectral Pair (LSP) technique with a filter of suitable order to track the formants of a speech signal.
A transmission channel may be positioned either before or after the processing means, so that the apparatus may comprise a transform coding transmission system. In these aspects, also provided are a transmitter including such processing means and, separately, a receiver including such processing means (in any such system, only one end needs the processing means).
According to another aspect of the invention there is provided a method of reducing noise in a time-varying signal comprising the steps of: converting the signal into a plurality of signals representing the magnitude of spectral components of the signal; processing each such signal so that low magnitude spectral components are attenuated relative to high magnitude spectral components, leaving the relationship between such high magnitude spectral components undistorted; and converting the signals thus processed so as to produce a reconstituted time-varying signal having an attenuated noise content.
Brief Description of the Drawings:
These embodiments of the invention will now be described by way of example with reference to the drawings, in which:
- Figure 1 shows schematically the method of the invention, and the operation of the apparatus of the invention;
- Figures 2a-b show schematically transfer characteristics in accordance with the invention drawn on logarithmic axes;
- Figure 3a-e shows schematically how a noisy triangular signal is processed by various stages of the invention;
- Figure 4 shows schematically apparatus according to a first embodiment of the invention;
- Figure 5 shows schematically the form of a window function for use in accordance with one embodiment of the invention;
- Figure 6a-b shows the effect of overlapping frames of data in accordance with one embodiment of the invention;
- Figure 7a shows schematically a second embodiment of the invention;
- Figure 7b shows schematically a further modification of this second embodiment; and
- Figure 8 shows schematically a third embodiment of the invention.
Description of Drawings:
Referring to Figure 1, a signal which includes noise is received and resolved into a series of signals representing the magnitude of the various components present; this first conversion operation could for example simply comprise filtering the signal through a plurality of parallel band pass filters, but will preferably comprise performing a one dimensional or complex transform operation such as the Discrete Fourier Transform (DFT) or the Discrete Cosine Transform (DCT) on frames of samples of the signal.
The transform operation may be performed by a suitably programmed general purpose computer, or by separate conversion means such as one of the many dedicated Fast Fourier Transform chip packages currently available.
The output may comprise parallel signals, as indicated, or these may be multiplexed into serial frames of spectral component data. These data are then processed in a manner which attenuates low magnitude spectral components relative to high magnitude spectral components.
If the output data from the first conversion stage comprises a frame of analogue representations of spectral components then the processing may be simply achieved by providing an element with a non-linear transfer characteristic (as hereinafter described); if the output data from the first conversion comprises a number of parallel analogue representations then a bank of such elements may be provided.
If the output from the first conversion stage is in digital form, it may readily be processed by general-purpose or dedicated digital data processing means programmed to provide a non-linear response, as hereinafter described, for example by providing a look-up table of output levels for given inputs or a polynomial approximating to the desired characteristic.
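The look-up table option mentioned above can be sketched as follows. This is only an illustrative sketch: the table range (0 to 80 dB), the 1 dB step, the linear interpolation and the placeholder characteristic are all assumptions, not values taken from the specification.

```python
import bisect

def build_table(characteristic, lo_db=0.0, hi_db=80.0, step_db=1.0):
    """Tabulate output level (dB) against input level (dB) on a uniform grid."""
    n = int(round((hi_db - lo_db) / step_db)) + 1
    inputs = [lo_db + i * step_db for i in range(n)]
    outputs = [characteristic(x) for x in inputs]
    return inputs, outputs

def lookup(table, level_db):
    """Return the tabulated output for an input level, interpolating linearly."""
    inputs, outputs = table
    if level_db <= inputs[0]:
        return outputs[0]                      # clamp below the table
    if level_db >= inputs[-1]:
        # Above the table the characteristic is linear: extend with unity slope.
        return outputs[-1] + (level_db - inputs[-1])
    i = bisect.bisect_right(inputs, level_db) - 1
    frac = (level_db - inputs[i]) / (inputs[i + 1] - inputs[i])
    return outputs[i] + frac * (outputs[i + 1] - outputs[i])

# Placeholder characteristic (assumption): unity slope at or above 40 dB,
# slope 2 on log/log axes below it. A practical table would hold a smooth knee.
def example_characteristic(x_db, knee_db=40.0, slope=2.0):
    return x_db if x_db >= knee_db else knee_db - slope * (knee_db - x_db)

TABLE = build_table(example_characteristic)
```

A table of this kind trades memory for speed; the step size sets how closely the stored points follow the intended curve between entries.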
Referring to Figure 2a, which shows a typical non-linear characteristic exhibited by the processing stage, it will be evident that a signal representing a spectral component having a magnitude larger than the top of the non-linear portion of the characteristic (in this case, labelled X dB) will be treated linearly by the processing stage, since the slope of the log/log representation of the characteristic is unity (it will be understood that on log/log axes, a non-linear function may be represented by a non-unity slope and references to 'non-linear' herein refer to normal rather than logarithmic axes). The relationship between the magnitudes of all spectral components having a magnitude larger than X dB is therefore undisturbed by the processing stage, since all such components are amplified or attenuated by an equal factor.
Although the non-linear portion of the curve shown in Figure 2a could theoretically follow any smooth curve between a straight line with unity slope and a vertical straight line, it will always be a compromise between these extremes, as the first is ineffective and the second (which corresponds to gating in the frequency domain) will generally introduce unacceptable distortion. The processed signal produced by the invention is thus a compromise between a reduced level of noise and an introduced level of distortion, and the acceptability of the result is strongly dependent upon the shape of the non-linear portion of the transfer characteristic, and on the position of the knee region relative to the signal level.
Below the X dB point is a smooth 'knee' region, where the non-linear portion of the characteristic joins the linear portion without discontinuities in slope. Immediately below the knee region is a non-linear portion, which on the log/log plot in Figure 2a has an average slope of approximately 2.2 for most of its length. The shape of the non-linear portion at very low input levels is not particularly important, provided it continues to have a positive slope: the important features of the characteristic as a whole are that above the knee there is a linear portion so that the harmonic relationship of components above this level is undisturbed, that the non-linear portion should fall away steeply enough to attenuate noise below the knee region, and that the knee region itself should be a smooth curve so that the listener does not perceive any significant difference as a spectral component moves through the knee region with time.
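One way to picture such a characteristic is the sketch below, which works on log/log (dB) axes. The softplus blend, the 40 dB knee position, the 3 dB knee width and the 20 log10 amplitude convention are assumptions chosen for illustration; only the slope of about 2.2 comes from the description of Figure 2a. Note that this curve is linear above the knee only asymptotically, so it approximates rather than reproduces the characteristic described.

```python
import math

def soft_knee_db(x_db, knee_db=40.0, slope=2.2, width_db=3.0):
    """Map an input level to an output level, both in dB.

    Well above knee_db the slope tends to 1 (linear region); well below it
    the slope tends to `slope`; the slope changes continuously in between.
    """
    d = (knee_db - x_db) / width_db
    # Numerically stable softplus: width * ln(1 + e^d)
    softplus = width_db * (max(d, 0.0) + math.log1p(math.exp(-abs(d))))
    return x_db - (slope - 1.0) * softplus

def gain(magnitude, ref=1.0, **params):
    """Linear gain factor to apply to a spectral component of given magnitude."""
    if magnitude <= 0.0:
        return 0.0
    x_db = 20.0 * math.log10(magnitude / ref)
    return 10.0 ** ((soft_knee_db(x_db, **params) - x_db) / 20.0)
```

Raising `slope` towards 10 gives the more severe roll-off suited to poor signal-to-noise ratios, while a value close to 1 leaves the signal almost untouched.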
If the signal to noise ratio is high, a non-linear portion which deviates only slightly from linearity will be preferred so as to introduce the minimum signal distortion. For low signal to noise ratio conditions on the other hand, a greater deviation from linearity is required. Figure 2b shows an extreme example of a characteristic according to the invention in which on the log/log axes the non-linear portion has a slope of approximately 10 below the knee region down to the limit of audibility (labelled '0 dB'). Although noise is effectively reduced by this characteristic, the quality of a speech signal is distorted to a normally unacceptable (though intelligible) level so that for most speech signal purposes (for example telephone subscriber services) this represents the extreme limit to the severity of the non-linear portion.
Such a characteristic may be derived, for example, by iterative techniques. Equally, the production of an analogue device having such a transfer function is straightforward to one skilled in the art.
Finally, if the signals representing the spectral components are in fact simply those spectral components (as when a bank of band pass filters is used) then the transfer function of the processing means must be non-linear with regard to the peak or average value of each component, rather than to its instantaneous value, or the signal will be distorted. The processing means is thus akin to an audio compander.
After processing, major components of the signal will therefore have been passed by the processing means with linear amplification or attenuation, but noise in regions of the spectrum where there are no major components of the signal will have been relatively attenuated by a greater amount (as of course will weak components of the signal). It will be seen that noise is not altogether removed, but merely relatively attenuated, and this gives a more natural sounding result during non-speech periods.
Referring again to Figure 1, the signals representing the spectral components are then reconverted back to an intelligible time-varying signal by a second conversion stage which simply performs the inverse operation of the first conversion stage. In the case of a system employing a Discrete Fourier Transform as its first stage, for example, the second conversion performs the Inverse Discrete Fourier Transform (IDFT).
Referring now to Figure 3a-e, an input signal illustrated in this case by a triangular wave for simplicity is corrupted by random noise (see Figure 3a).
The input is resolved into its spectral components, so that for the triangular signal the signal power is concentrated in spectral components at odd multiples of the fundamental frequency of the signal.
The magnitude of the noise signal in any frequency interval, on the other hand, is (for white noise) proportional to the width of that frequency interval, so that the noise power is spread over the spectrum.
This is illustrated (diagrammatically) in Figure 3b (where it is apparent that the harmonic at 7 times the fundamental frequency is below the level of the noise in that spectral region).
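The line spectrum of the triangular signal can be checked numerically. The sketch below uses a plain discrete Fourier transform written with the standard library; the frame length of 256 samples and the choice of four whole periods per frame are assumptions made so that each harmonic falls exactly on a transform bin.

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..N/2 of a real frame, normalised by N."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) / n
            for k in range(n // 2 + 1)]

def triangle(n_samples, periods):
    """Symmetric triangular wave, peak amplitude 1, whole number of periods."""
    out = []
    for t in range(n_samples):
        phase = (t * periods / n_samples) % 1.0
        out.append(4.0 * abs(phase - 0.5) - 1.0)
    return out

# Four periods in 256 samples: harmonic n of the fundamental lands in bin 4 * n.
mags = dft_magnitudes(triangle(256, 4))
odd_lines = {n: mags[4 * n] for n in (1, 3, 5, 7)}
even_lines = {n: mags[4 * n] for n in (2, 4, 6)}
```

The even harmonics vanish and the odd ones fall off roughly as 1/n², so the seventh harmonic is only about one fiftieth of the fundamental, which is why it can sit below the noise in Figure 3b.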
The processing stage characteristic shown in Figure 3c has a knee region at a point above the level of the noise (note that the transfer characteristic is illustrated for convenience with its axes reversed relative to Figure 2a and 2b, and with linear rather than logarithmic scales). If the slope of the linear portion of the characteristic on identical linear axes is 45 degrees, for example, any signal above the knee region will be passed unattenuated and any signal below will be attenuated. In this case, the first three lines (n=1, 3 and 5) of the spectrum of the triangular signal are passed unattenuated and the noise spectrum (together with higher order lines of the signal spectrum) are strongly attenuated (see Figure 3d).
The second conversion stage will then reconstruct a time-domain signal as indicated in Figure 3e, with the noise level strongly reduced, and some minor distortion of the signal produced by the attenuation of higher harmonics of the signal.
Figure 4 shows a specific embodiment of the invention in which each stage of signal processing is performed by discrete means. The first conversion stage is effected by a conversion means 1, which comprises a Fast Fourier Transform device of known type. Such a device is arranged to receive data input in frames of sampled values. For a speech signal, the length of such a frame should at any rate be shorter than the length of a syllable, and to maintain accuracy should preferably be as short as possible (a further factor is the possibility that unacceptable delays may be introduced by long frames). On the other hand, to obtain a reasonable transform it is desirable to sample a large number of points which requires fairly long frames. In practice, frames of between 128 and 1024 points have been found practicable.
When using short frames and hence limited numbers of samples, the effects of the shape and size of the frame are evident in the transform as frequency "leakage" of the spectral components of the signal. The sampled frame is in effect the product of multiplying the input signal with a rectangular window function having a value of 1 during the sampling period and 0 before and afterwards.
It will be evident to one skilled in the art that the spectrum produced by the transform is therefore the convolution of the true signal spectrum with the transform of the rectangular window function, which will of course introduce extra unwanted frequency components (as explained for example in "Introduction to Digital Filtering" edited by R E Bogner and A G Constantinides, published by John Wiley & Sons, at p. 134). This problem can be to some extent compensated by the use of a non-rectangular window function to weight the sampled data. A great many functions of this type are known in the art.
Accordingly, conversion means 1 includes a window function means 1a, which multiplies received data points in a frame by windowing coefficients. Preferably, a Hanning function is employed. Figure 5 illustrates the general form of such a function.
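The effect of the window on leakage can be illustrated with a short experiment. In this sketch the frame length (64), the test tone (10.5 cycles per frame, deliberately between two bins) and the bin inspected (30) are arbitrary assumptions; the Hanning coefficients use the common raised-cosine form 0.5 - 0.5 cos(2πt/N).

```python
import cmath
import math

def dft_magnitudes(frame):
    """Magnitudes of DFT bins 0..N/2 of a real frame."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]

def hanning(n):
    """Raised-cosine window coefficients for a frame of n samples."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * t / n) for t in range(n)]

N = 64
# A tone that does not complete a whole number of cycles within the frame.
tone = [math.cos(2.0 * math.pi * 10.5 * t / N) for t in range(N)]

rect_spectrum = dft_magnitudes(tone)                      # rectangular window
hann_spectrum = dft_magnitudes([s * w for s, w in zip(tone, hanning(N))])

FAR_BIN = 30                        # a bin well away from the tone
leak_rect = rect_spectrum[FAR_BIN]
leak_hann = hann_spectrum[FAR_BIN]
```

Far from the tone the windowed frame leaks much less energy than the unwindowed one, at the cost of a broader main peak around bins 10 and 11.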
Each such windowed frame is received by the transform means which executes a Fast Fourier Transform upon the data in known fashion and produces a number of spectral component signals (the Fourier coefficients), the number being governed by the number of sample data in each frame.
The spectral components, which will usually comprise frames of digital samples, are then passed to a non-linear processing means 2 which may be provided for example by using a look-up table, and are either (if above the knee region of the characteristic) passed linearly or (if below the knee region of the characteristic) strongly relatively attenuated as described above.
The frames of processed spectral components are then passed to the second conversion means 3, which executes the Inverse Fast Fourier Transform to reconstitute a time-domain signal.
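The path of a single frame through window, transform, non-linear stage and inverse transform might be sketched as below. Several things here are assumptions made for brevity: a direct DFT stands in for the Fast Fourier Transform, the gain law r/(1 + r) with r = (magnitude/knee)^(slope - 1) is one convenient smooth curve rather than the characteristic of Figure 2a, and each coefficient keeps its phase while its magnitude is scaled.

```python
import cmath
import math

def dft(x, inverse=False):
    """Direct (slow) DFT or inverse DFT of a sequence of numbers."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = [sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
               for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def soft_gain(mag, knee, slope):
    """Gain near 1 well above the knee, about -6 dB at it, steep roll-off below."""
    if mag <= 0.0:
        return 0.0
    r = (mag / knee) ** (slope - 1.0)
    return r / (1.0 + r)

def process_frame(frame, knee, slope=3.0):
    """Window one frame, attenuate weak spectral components, return samples."""
    n = len(frame)
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * t / n) for t in range(n)]
    spectrum = dft([s * w for s, w in zip(frame, window)])
    # Scale each coefficient by a gain that depends only on its magnitude.
    shaped = [c * soft_gain(abs(c), knee, slope) for c in spectrum]
    return [v.real for v in dft(shaped, inverse=True)]
```

Because the gain depends only on magnitude, conjugate-symmetric coefficient pairs are scaled equally and the reconstructed frame stays real; the window is still imprinted on the output frame, which is what the overlapping stage then corrects.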
Where a window function has been employed prior to transforming the input data, there will be variations in the level of the input to the transform device with time since the level will fall away towards each end of each frame, and so when the inverse transform is executed by the conversion means 3, the reconstituted time-domain signal is in effect amplitude modulated by the window function at the frame frequency. To reduce these amplitude variations, and hence improve the quality of the output signal, it is desirable to "overlap" data from succeeding output frames (in a manner generally known in the art), which has the effect of restoring the envelope of the signal to a good approximation.
Accordingly, the second conversion means 3 includes an overlapping means 3a, such as a pair of overlapped data buffers 3b, 3c and an adder 3d, which produce frames of output data with some degree of overlap. The degree of overlapping that is necessary and desirable depends on the shape of the window function, and varies from zero in the case of a rectangular window upwards for other windows. In the case of a Hanning function, an overlap of 50% is found particularly effective.
Figure 6 shows the effect of overlapping by 50% of a frame. In Figure 6a, the amplitude of each output frame 1, 2, 3 produced by buffer 3b is multiplied by the window function so that there is an audible modulation at frame frequency. Buffer 3c produces an output of frames 1, 2, 3 but delayed by n samples (in other words 50% of the length of each frame). Adder 3d adds the outputs of buffers 3b and 3c together, in other words adds to each sample i_k produced by buffer 3b the corresponding sample i_{k-n} produced by buffer 3c, to produce overlapped output frames I, II, III.
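Why a 50% overlap suits the Hanning function can be seen by adding half-frame-shifted copies of the window itself. The sketch below assumes the raised-cosine form 0.5 - 0.5 cos(2πt/N) and a frame length of 64; the function names are illustrative only.

```python
import math

def hanning(n):
    """Raised-cosine window coefficients for a frame of n samples."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * t / n) for t in range(n)]

def overlap_add(frames, hop):
    """Sum equal-length frames, each starting `hop` samples after the last."""
    length = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + length)
    for i, frame in enumerate(frames):
        for t, value in enumerate(frame):
            out[i * hop + t] += value
    return out

N = 64
# Four window-shaped frames overlapped by half a frame (hop = N / 2).
envelope = overlap_add([hanning(N)] * 4, hop=N // 2)
```

Wherever two frames overlap, the summed envelope is flat at 1, since the two raised cosines are half a period apart and cancel; only the first and last half frames, which have no partner, retain the taper.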
The means to effect such windowing and overlapping functions may, of course, comprise either analogue or digital means as convenient, and it will be understood that window function means 1a and overlapping means 3a might be included within conversion means 1 and 3 respectively as part of a single chip device.
In many systems, the level of the signal may vary slowly with time (as in the case of a fading radio signal, for example) and, independently, the noise level may also vary. In some cases, the two will vary together (as, for example, when an already noisy signal is subject to fading). For the invention to work effectively, it is desirable that most of the signal should remain above the knee region of the characteristic (and the knee region should remain above the noise level), and so some means of positioning the signal relative to the knee region is necessary (although it will be appreciated that the characteristic could itself be adjusted instead).
Accordingly, level adjusting means 4 and level restoring means 4a are provided (see Figure 4) which ensure that the signal is correctly positioned upon the transfer characteristic of non-linear processing means 2. As shown, the level adjusting means 4 detects slow changes in the total power of the signal, and amplifies or attenuates the signal to keep the noise spectrum below the knee and most of the signal above the knee. At the same time level adjusting means 4 sends a control signal to level restoring means 4a so that the processed signal may be restored to its original level. In the simple case where the levels of signal and noise vary together, without significant change in the signal-to-noise ratio, the level adjusting means 4 may be an automatic gain control, and the level control signal is an indication of the gain which acts to control the gain of the level restoring means 4a (the response being slow enough to smooth out fluctuations in level caused by, for example, pauses between spoken words). The invention is generally most effective with signal-to-noise ratios of above +10 dB, and preferably above +18 dB, so the automatic gain control (which responds to the level of signal+noise) is effectively responding to the signal level. With very low signal to noise ratio applications, however, the level adjusting means could alternately measure one or the other separately, although this separation is technically difficult.
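A slow automatic gain control with a matching restore step could look like the following sketch. The class name, the target level, the exponential smoothing of frame power and the smoothing constant are all assumptions introduced for illustration; the specification only requires that the response be slow and that the gain be passed on to the restoring means.

```python
import math

class LevelAdjuster:
    """Tracks slowly varying signal power and scales frames to a target level."""

    def __init__(self, target_rms=1.0, smoothing=0.9, floor=1e-12):
        self.target_rms = target_rms
        self.smoothing = smoothing      # closer to 1.0 means a slower response
        self.floor = floor              # avoids division by zero on silence
        self.power = None               # running estimate of mean-square level

    def adjust(self, frame):
        """Return (scaled_frame, gain); the gain is the control signal."""
        frame_power = sum(s * s for s in frame) / len(frame)
        if self.power is None:
            self.power = frame_power
        else:
            self.power = (self.smoothing * self.power
                          + (1.0 - self.smoothing) * frame_power)
        gain = self.target_rms / math.sqrt(max(self.power, self.floor))
        return [s * gain for s in frame], gain

def restore(frame, gain):
    """Undo the level adjustment using the gain reported by adjust()."""
    return [s / gain for s in frame]
```

In use, each frame would pass through `adjust`, then through the non-linear processing, then through `restore` with the same gain, so that the knee sits at a fixed level relative to the signal while the output returns to its original level.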
Level adjusting means 4 could equally be placed between the transform means 1 and processing means 2, so as to operate in the frequency domain, and likewise level restoring means 4a could equally be placed between processing means 2 and inverse transform means 3. In this case, an estimation of signal level can be made as before by examining the magnitude of the largest transform coefficients (which should usually represent signal terms).
Using this latter approach, it will also be possible under some circumstances to derive an approximate signal-to-noise ratio by comparing this signal level with a noise level derived from the magnitudes of the smallest transform coefficients, which should represent noise data; this may also be used to position the signal relative to the characteristic.
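A rough version of this frequency-domain estimate is sketched below. Taking the mean of the largest and smallest 10% of coefficient magnitudes is an assumption; the text does not say how many coefficients to use or how to combine them.

```python
import math

def estimate_levels(magnitudes, fraction=0.1):
    """Estimate (signal, noise) levels from one frame of coefficient magnitudes.

    The signal level is the mean of the largest magnitudes and the noise
    level the mean of the smallest, each over `fraction` of the frame.
    """
    ordered = sorted(magnitudes)
    k = max(1, int(len(ordered) * fraction))
    noise = sum(ordered[:k]) / k
    signal = sum(ordered[-k:]) / k
    return signal, noise

def snr_db(signal, noise):
    """Approximate signal-to-noise ratio in dB from two magnitude levels."""
    if noise <= 0.0:
        return math.inf
    return 20.0 * math.log10(signal / noise)
```

The two levels bracket where the knee should sit: above the noise estimate and below the signal estimate.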
It is also possible to omit level restoring means 4a, if a constant level output signal is acceptable.
In a second embodiment of the invention, available knowledge about the spectral position of signal data may be utilized to further enhance the noise reduction capability of the invention. Human speech consists of a o mixture of ~voiced" and l'unvoiced" sounds, depending on the presence or absence of glottal action. In most cases these waveforms are processed by the vocal tract, which, being tubelike, gives rise to spectral enhancement in certain bands of frequencies. T hese enhancements are known as Iformants'.
The spectral position of each formant varie~ between individuals, and further varies while an individual is speaking.
Nonetheless, it will often be possible to statistically predict that signal information is more likely to lie in certain spectral bands t~an in others.
In a se~ond embodiment different non-linear processing is applied to spectral bands where signals are 5 liXely than is applied to bands~~where noise is likely. The non-linearity will be more pronounced in "noise" bands than in ~'signal" bands. A range of elements exhibiting different non-linear characteristics, either having different knee regions or different shapes in their non-linear regions, or both, may be provided so that the transition between spectral bands is smoothed.
In one such method, illustrated in Figure 7a, a speech signal is level adjusted, windowed and transformed as previously described. The spectral component signals are then passed to processing means 2, which assigns different component signals to processing elements 2a, 2b, etc., having different characteristics. As shown, if the spectral component signals form a spatially separated series of signals, then signals are physically connected directly to processing elements 2a, 2b etc. Element 2a, having a very non-linear characteristic, is used to process signals in bands where speech components are statistically rare (noise bands) and element 2b, having a less non-linear characteristic, is employed to process signals in bands where formants are commonly found (speech bands).
If the spectral component signals are provided in time-divided frames, then processing means 2 may include a demultiplexer (not shown) to assign the spectral component signals to discrete elements 2a, 2b etc., or a single processing element may be used and its characteristic controlled by control means (not shown) within the processing means 2, so that it exhibits the required predetermined characteristic for each spectral component signal. The processed signals are then retransformed and overlapped by second conversion means 3, and their level restored by level restoring means 4a, as described previously.
In another such method shown in Figure 7b, means are arranged to detect the time-averaged positions of signal bands and non-signal bands for each call over the initial part of the signal (for example the first few seconds of a phone call), and the output of such means is then used to assign the spectral components to processing elements as before for the duration of the call; this embodiment is therefore capable of adapting to different callers.
Referring to Figure 7b, the incoming signal is windowed and transformed as previously described. The spectral component signals are then passed to processing means 2, which assigns component signals to processing elements 2a, 2b, etc., having different characteristics. The separately processed components are then recombined, retransformed and overlapped as previously described by conversion means 3.
The processing means 2 may include assignment means 20 capable of routing spectral component signals to different processing elements 2a, 2b, etc., in accordance with assignment control signals as shown, or alternatively the processing means 2 may comprise one or a plurality of processing elements with characteristics which may be varied in accordance with assignment control signals. The assignment control signals are here provided by averaging means 5, which derive time-averaged information on the positions of formant bands from the output of transform means 1 over the first part of a call and then transmit assignment control signals to processing means 2 to fix for the rest of the call the processing which each spectral component will undergo. The averaging means 5 could form part of the processing means.
It should be emphasized that in the above two versions of the second embodiment, data representing respectively the population-averaged or time-averaged likely positions of the speech formant bands is used to fix the processing applied to spectral components either for the duration of the call or for a relatively long re-adaptation period.
In a third embodiment of the invention, however, a means is provided for continuously tracking the positions of the formant bands during a call as illustrated in Figure 8. This enables a much closer and more rapid matching of the processing elements with the formant bands and correspondingly more effective noise reduction, since noise outside the formant band can be virtually eliminated. The characteristics of the processing elements may be graduated between formant and non-formant regions, so as to produce a smooth transition. The more data is available on the shape of the formant band, the more effective is the matching of the processing means. One technique which may be employed is the 'Line Spectral Pair' or LSP technique, which can provide an estimate of both formant frequency and formant width information if a filter of suitable order is employed.
The operation of this embodiment is as described above for Figure 7b, except that instead of assigning the signals to processing once, the processing is continually reassigned in accordance with assignment control data from tracking means 6, which here comprises a means for executing an LSP analysis of the signal to determine its formant spectral positions and spectral widths.
It will be appreciated that references to speech signals above apply equally to any type of signal having a similar spectral content, and that the invention is applicable also to voiceband data signalling.
In many implementations, a signal (for example, a speech signal) is decomposed into its spectral components at a transmitter, representations of the spectral components are transmitted to a receiver, and the original signal is there reconstituted. It will readily be appreciated that the invention described above is equally applicable to this class of coding schemes, to remove or reduce any broadband noise which accompanies the input signal (for example, broadband background noise in a speech system). Such implementations merely constitute positioning the transmission link between the non-linear processing stage and one of the transform stages. In a first such embodiment, an input signal is transform coded and the transform coefficients thus produced are processed according to one of the methods described above at the transmitter, the processed coefficients then being transmitted to a receiver of conventional type which effects the inverse transform to reconstitute the signal.
In a second such embodiment, the transform coder at the transmitter is of conventional type, and at the receiver the received transform coefficients are subjected to a non-linear processing stage as described above, prior to the inverse transform operation to reconstitute the original signal.
It will be appreciated that although discrete means for performing each function are illustrated, the invention may be advantageously provided as a single integrated circuit, such as a suitably programmed Digital Signal Processing (DSP) chip package, and in its method aspect, each step may be performed by a suitably programmed digital data processing means.
This invention relates to a method of reducing the level of noise in a signal, and to apparatus for reducing noise using this method; particularly but not exclusively this invention relates to a method of reducing noise in a speech signal, and to apparatus for thus producing a speech signal with enhanced intelligibility.
A signal will often acquire broadband noise so that the time-average noise power is spread across a portion of the noise spectrum. In a speech system, noise may cause a listener severe fatigue or discomfort.
It is obviously desirable to reduce noise, and many methods of doing so are known; in speech systems, some types of noise are more perceptually acceptable than others. Especially desirable are methods which may be used with existing transmission equipment, and preferably are easily added at the receiver end.
It is known to reduce noise in high noise environments (-6 to +6 dB signal-to-noise ratio) by so-called spectral subtraction techniques, in which the signal is processed by transforming it into the frequency domain, then subtracting an estimate of the noise power in each spectral band, then re-transforming into the time domain. This technique suffers from several drawbacks, however. Firstly, it is necessary to measure the noise power in each spectral line; this involves identifying 'non-speech' periods, which can be complicated and unreliable. Secondly, it requires the assumption that the noise spectrum is stationary between the instants at which the noise power is measured; this is not necessarily the case. Thirdly, if an estimate of noise power made in one non-speech period is applied to the next non-speech period correctly, there will be a total absence of background noise during non-speech periods, and this modulation of the background noise sounds unpleasant to a listener.
According to the invention, there is provided a noise reduction apparatus comprising: first conversion means for receiving a time-varying signal and producing therefrom output signals representing the magnitude of spectral components thereof, processing means for receiving the output of the first conversion means, the processing means having a non-linear transfer characteristic such that in use low magnitude inputs thereto are attenuated relative to high magnitude inputs, the transfer characteristic being linear for high magnitude spectral components and non-linear for low magnitude spectral components, wherein the average slope of the non-linear region represented on identical logarithmic axes does not exceed 10 at detectable signal levels, and second conversion means for receiving the output of the processing means and reconstituting therefrom a time-varying signal.
Preferably, the transition between the linear and non-linear regions of the characteristic is gradual and substantially without discontinuities in slope, so as to progressively roll off lower magnitude (noise) spectral components.
Preferably, a level adjusting operation is performed so that the signal is maintained in a predetermined relation to the transfer characteristic, which may be an automatic gain control operation on the signal.
Preferably the first conversion operates on frames of the signal and uses a one dimensional or complex transform to produce a series of transform coefficients, and the second conversion applies the inverse transform to reconstitute the signal. In a preferred embodiment a Fast Fourier transform is utilized. Where such a transform is employed, it will be advantageous to provide shaping of each frame using a window function, so as to reduce frequency 'leakage' when the frame is transformed. Where such a window function is employed, the sampled data frames are preferably overlapped.
In a second embodiment, several different transfer characteristics are employed within the processing so that a more severe attenuation is effected in certain spectral regions. Where the signal is a speech signal, these regions may be assigned on a fixed basis, employing knowledge of the spectral position of speech formant bands for an average speaker, or may be derived by the apparatus for each speaker by initially measuring formant band time-averaged positions.
In a third embodiment, several different transfer characteristics are employed, and the spectral positions of the dominant bands of the signal continuously tracked so that a more severe attenuation may be effected in spectral regions where there are no significant components of the signal. This is advantageously achieved by using a Line Spectral Pair (LSP) technique with a filter of suitable order to track the formants of a speech signal.
A transmission channel may be positioned either before or after the processing means, so that the apparatus may comprise a transform coding transmission system. In these aspects, also provided are a transmitter including such processing means and, separately, a receiver including such processing means (in any such system, only one end needs the processing means).
According to another aspect of the invention there is provided a method of reducing noise in a time-varying signal comprising the steps of: converting the signal into a plurality of signals representing the magnitude of spectral components of the signal, processing each such signal so that low magnitude spectral components are attenuated relative to high magnitude spectral components, leaving the relationship between such high magnitude spectral components undistorted; and converting the signals thus processed so as to produce a reconstituted time-varying signal having an attenuated noise content.
Brief Description of the Drawings:
These embodiments of the invention will now be described by way of example with reference to the drawings, in which:
- Figure 1 shows schematically the method of the invention, and the operation of the apparatus of the invention;
- Figures 2a-b show schematically transfer characteristics in accordance with the invention drawn on logarithmic axes;
- Figure 3a-e shows schematically how a noisy triangular signal is processed by various stages of the invention;
- Figure 4 shows schematically apparatus according to a first embodiment of the invention;
- Figure 5 shows schematically the form of a window function for use in accordance with one embodiment of the invention;
- Figure 6a-b shows the effect of overlapping frames of data in accordance with one embodiment of the invention;
- Figure 7a shows schematically a second embodiment of the invention;
- Figure 7b shows schematically a further modification of this second embodiment; and
- Figure 8 shows schematically a third embodiment of the invention.
Description of Drawings:
Referring to Figure 1, a signal which includes noise is received and resolved into a series of signals representing the magnitude of the various components present; this first conversion operation could for example simply comprise filtering the signal through a plurality of parallel band pass filters, but will preferably comprise performing a one dimensional or complex transform operation such as the Discrete Fourier Transform (DFT) or the Discrete Cosine Transform (DCT) on frames of samples of the signal.
The transform operation may be performed by a suitably programmed general purpose computer, or by separate conversion means such as one of the many dedicated Fast Fourier Transform chip packages currently available.
The output may comprise parallel signals, as indicated, or these may be multiplexed into serial frames of spectral component data. These data are then processed in a manner which attenuates low magnitude spectral components relative to high magnitude spectral components.
If the output data from the first conversion stage comprises a frame of analogue representations of spectral components then the processing may be simply achieved by providing an element with a non-linear transfer characteristic (as hereinafter described); if the output data from the first conversion comprises a number of parallel analogue representations then a bank of such elements may be provided.
If the output from the first conversion stage is in digital form, it may readily be processed by general-purpose or dedicated digital data processing means programmed to provide a non-linear response, as hereinafter described, for example by providing a look-up table of output levels for given inputs or a polynomial approximating to the desired characteristic.
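By way of illustration only, the look-up table option mentioned above might be sketched in Python as follows. The table size, the knee position, the slope and the simple power-law shape below the knee are all assumptions made for this example and are not values taken from this specification; a hard knee is used purely for brevity.

```python
def build_lookup_table(size=256, knee=64, slope=2.2):
    """Return a list mapping each integer input magnitude to an output level.

    Inputs at or above `knee` pass unchanged (linear region). Inputs below it
    follow a power law, which is a straight line of gradient `slope` on
    log/log axes. All parameter values are assumed for illustration.
    """
    table = []
    for magnitude in range(size):
        if magnitude >= knee:
            table.append(float(magnitude))
        else:
            table.append(knee * (magnitude / knee) ** slope)
    return table


def process_magnitude(table, magnitude):
    """Look up the processed level, clamping the index to the table range."""
    index = min(max(int(magnitude), 0), len(table) - 1)
    return table[index]


table = build_lookup_table()
```

A table of this kind trades memory for speed: the non-linear response is evaluated once, and each spectral component then costs a single indexed read.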
Referring to Figure 2a, which shows a typical non-linear characteristic exhibited by the processing stage, it will be evident that a signal representing a spectral component having a magnitude larger than the top of the non-linear portion of the characteristic (in this case, labelled X dB) will be treated linearly by the processing stage, since the slope of the log/log representation of the characteristic is unity (it will be understood that on log/log axes, a non-linear function may be represented by a non-unity slope and references to 'non-linear' herein refer to normal rather than logarithmic axes). The relationship between the magnitudes of all spectral components having a magnitude larger than X dB is therefore undisturbed by the processing stage, since all such components are amplified or attenuated by an equal factor.
Although the non-linear portion of the curve shown in Figure 2a could theoretically follow any smooth curve between a straight line with unity slope and a vertical straight line, it will always be a compromise between these extremes, as the first is ineffective and the second (which corresponds to gating in the frequency domain) will generally introduce unacceptable distortion. The processed signal produced by the invention is thus a compromise between a reduced level of noise and an introduced level of distortion, and the acceptability of the result is strongly dependent upon the shape of the non-linear portion of the transfer characteristic, and on the position of the knee region relative to the signal level.
Below the X dB point is a smooth 'knee' region, where the non-linear portion of the characteristic joins the linear portion without discontinuities in slope. Immediately below the knee region is a non-linear portion, which on the log/log plot in Figure 2a has an average slope of approximately 2.2 for most of its length. The shape of the non-linear portion at very low input levels is not particularly important, provided it continues to have a positive slope: the important features of the characteristic as a whole are that above the knee there is a linear portion so that the harmonic relationship of components above this level is undisturbed, that the non-linear portion should fall away steeply enough to attenuate noise below the knee region, and that the knee region itself should be a smooth curve so that the listener does not perceive any significant difference as a spectral component moves through the knee region with time.
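A characteristic with these three features (unity slope above the knee, a steeper slope below it, and a knee with no discontinuity in slope) may be sketched, purely as an example, in the logarithmic domain. In the Python sketch below the quadratic blend across the knee, the knee width of 6 dB and the function name are assumptions introduced for illustration; only the slope of 2.2 is taken from the description of Figure 2a. The knee is centred on `knee_db`, so that the top of the non-linear portion (the point labelled X dB) corresponds to `knee_db + knee_width_db / 2`.

```python
def output_level_db(input_db, knee_db=0.0, slope=2.2, knee_width_db=6.0):
    """Map an input level in dB to an output level in dB.

    Above the knee the log/log slope is 1 (linear processing); below it the
    slope is `slope`. Across the knee the slope changes linearly from `slope`
    to 1, so the curve has no discontinuity in value or in slope.
    """
    x = input_db - knee_db            # level relative to the knee centre
    half = knee_width_db / 2.0
    if x >= half:                     # linear portion
        y = x
    elif x <= -half:                  # non-linear portion of constant slope
        y = slope * x
    else:                             # smooth knee (quadratic blend)
        y = x - (slope - 1.0) * (half - x) ** 2 / (4.0 * half)
    return knee_db + y
```

With the default values, a component 10 dB above the knee centre is returned unchanged, whereas one 20 dB below it emerges 44 dB below, that is, attenuated by a further 24 dB relative to the strong components.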
If the signal to noise ratio is high, a non-linear portion which deviates only slightly from linearity will be preferred so as to introduce the minimum signal distortion. For low signal to noise ratio conditions on the other hand, a greater deviation from linearity is required. Figure 2b shows an extreme example of a characteristic according to the invention in which on the log/log axes the non-linear portion has a slope of approximately 10 below the knee region down to the limit of audibility (labelled '0 dB'). Although noise is effectively reduced by this characteristic, the quality of a speech signal is distorted to a normally unacceptable (though intelligible) level so that for most speech signal purposes (for example telephone subscriber services) this represents the extreme limit to the severity of the non-linear portion.
Such a characteristic may be derived, for example, by iterative techniques. Equally, the production of an analogue device having such a transfer function is straightforward to one skilled in the art.
Finally, if the signals representing the spectral components are in fact simply those spectral components (as when a bank of band pass filters is used) then the transfer function of the processing means must be non-linear with regard to the peak or average value of each component, rather than to its instantaneous value, or the signal will be distorted. The processing means is thus akin to an audio compander.
After processing, major components of the signal will therefore have been passed by the processing means with linear amplification or attenuation, but noise in regions of the spectrum where there are no major components of the signal will have been relatively attenuated by a greater amount (as of course will weak components of the signal). It will be seen that noise is not altogether removed, but merely relatively attenuated, and this gives a more natural sounding result during non-speech periods.
Referring again to Figure 1, the signals representing the spectral components are then reconverted back to an intelligible time-varying signal by a second conversion stage which simply performs the inverse operation of the first conversion stage. In the case of a system employing a Discrete Fourier Transform as its first stage, for example, the second conversion performs the Inverse Discrete Fourier Transform (IDFT).
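The three stages of Figure 1 may be sketched end to end as follows. This is a minimal Python illustration under stated assumptions, not a definitive implementation: a direct (slow) DFT stands in for the first conversion stage, a hard-knee power-law gain stands in for the processing stage, and the phase of each coefficient is left unchanged, which is an assumption since the text speaks only of magnitudes. The function names, the knee value and the test frame are hypothetical.

```python
import cmath
import math


def dft(x):
    """First conversion stage: direct Discrete Fourier Transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Second conversion stage: inverse DFT, returning the real part."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def attenuate_low(coefficient, knee, slope=2.0):
    """Processing stage: pass magnitudes at or above the knee unchanged and
    scale smaller ones so that their log/log slope is `slope` (phase kept)."""
    magnitude = abs(coefficient)
    if magnitude >= knee or magnitude == 0.0:
        return coefficient
    return coefficient * (magnitude / knee) ** (slope - 1.0)


def reduce_noise(frame, knee, slope=2.0):
    """Transform, process each spectral component, and transform back."""
    return idft([attenuate_low(c, knee, slope) for c in dft(frame)])


# Hypothetical test frame: a strong component in bin 4, a weak one in bin 11.
N = 32
frame = [math.sin(2 * math.pi * 4 * t / N) + 0.05 * math.sin(2 * math.pi * 11 * t / N)
         for t in range(N)]
cleaned = reduce_noise(frame, knee=4.0)
```

In this example the strong component (DFT magnitude 16) lies above the knee and is reproduced unchanged, while the weak component (DFT magnitude 0.8) is reduced by a factor of five; it is attenuated rather than removed, consistent with the behaviour described above.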
Referring now to Figure 3a-e, an input signal illustrated in this case by a triangular wave for simplicity is corrupted by random noise (see Figure 3a).
The input is resolved into its spectral components, so that for the triangular signal the signal power is concentrated in spectral components at odd multiples of the fundamental frequency of the signal.
The magnitude of the noise signal in any frequency interval, on the other hand, is (for white noise) proportional to the width of that frequency interval, so that the noise power is spread over the spectrum.
This is illustrated (diagrammatically) in Figure 3b (where it is apparent that the harmonic at 7 times the fundamental frequency is below the level of the noise in that spectral region).
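The concentration of a triangular signal's power in odd harmonics of rapidly falling magnitude can be checked numerically. The Python sketch below is an illustration only; the 64-point frame holding exactly one period and the function names are assumptions made for the example.

```python
import cmath
import math


def triangular_wave(n):
    """One period of a zero-mean triangular wave sampled at n points (n even)."""
    return [1.0 - 4.0 * abs(t / n - 0.5) for t in range(n)]


def dft_magnitudes(x):
    """Magnitudes of the first half of the DFT bins of a real frame."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2)]


# Bin k corresponds to k times the fundamental frequency.
mags = dft_magnitudes(triangular_wave(64))
```

The even-numbered bins are zero to within rounding error, and the odd-numbered bins fall away roughly as the inverse square of the harmonic number, which is why the higher harmonics are the first to sink below a flat noise floor.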
The processing stage characteristic shown in Figure 3c has a knee region at a point above the level of the noise (note that the transfer characteristic is illustrated for convenience with its axes reversed relative to Figure 2a and 2b, and with linear rather than logarithmic scales). If the slope of the linear portion of the characteristic on identical linear axes is 45 degrees, for example, any signal above the knee region will be passed unattenuated and any signal below will be attenuated. In this case, the first three lines (n=1, 3 and 5) of the spectrum of the triangular signal are passed unattenuated and the noise spectrum (together with higher order lines of the signal spectrum) are strongly attenuated (see Figure 3d).
The second conversion stage will then reconstruct a time-domain signal as indicated in Figure 3e, with the noise level strongly reduced, and some minor distortion of the signal produced by the attenuation of higher harmonics of the signal.
Figure 4 shows a specific embodiment of the invention in which each stage of signal processing is performed by discrete means. The first conversion stage is effected by a conversion means 1, which comprises a Fast Fourier Transform device of known type. Such a device is arranged to receive data input in frames of sampled values. For a speech signal, the length of such a frame should at any rate be shorter than the length of a syllable, and to maintain accuracy should preferably be as short as possible (a further factor is the possibility that unacceptable delays may be introduced by long frames). On the other hand, to obtain a reasonable transform it is desirable to sample a large number of points, which requires fairly long frames. In practice, frames of between 128 and 1024 points have been found practicable.
When using short frames and hence limited numbers of samples, the effects of the shape and size of the frame are evident in the transform as frequency "leakage" of the spectral components of the signal. The sampled frame is in effect the product of multiplying the input signal with a rectangular window function having a value of 1 during the sampling period and 0 before and afterwards.
It will be evident to one skilled in the art that the spectrum produced by the transform is therefore the convolution of the true signal spectrum with the transform of the rectangular window function, which will of course introduce extra unwanted frequency components (as explained for example in "Introduction to Digital Filtering", edited by R E Bogner and A G Constantinides, published by John Wiley & Sons, at p. 134). This problem can be to some extent compensated by the use of a non-rectangular window function to weight the sampled data. A great many functions of this type are known in the art.
Accordingly, conversion means 1 includes a window function means 1a, which multiplies received data points in a frame by windowing coefficients. Preferably, a Hanning function is employed. Figure 5 illustrates the general form of such a function.
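The effect of such a window on leakage may be illustrated with a short Python sketch. The frame length of 64, the test tone lying midway between two transform bins, and the choice of bin 20 as a distant observation point are assumptions made for this example; the window is written in its periodic raised-cosine form.

```python
import cmath
import math


def hann_window(n):
    """Hanning (raised-cosine) windowing coefficients for an n-point frame."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * t / n) for t in range(n)]


def dft_bin_magnitude(x, k):
    """Magnitude of a single DFT bin of the frame x."""
    n = len(x)
    return abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))


n = 64
# A tone at 4.5 cycles per frame does not fall on a bin, so it leaks.
tone = [math.sin(2.0 * math.pi * 4.5 * t / n) for t in range(n)]
window = hann_window(n)
windowed = [w * v for w, v in zip(window, tone)]

rect_leak = dft_bin_magnitude(tone, 20)      # rectangular window (no weighting)
hann_leak = dft_bin_magnitude(windowed, 20)  # Hanning-weighted frame
```

Far from the tone, the Hanning-weighted frame leaks much less energy than the unweighted frame, at the cost of a broader main lobe around the tone itself.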
Each such windowed frame is received by the transform means which executes a Fast Fourier Transform upon the data in known fashion and produces a number of spectral component signals (the Fourier coefficients), the number being governed by the number of sample data in each frame.
The spectral components, which will usually comprise frames of digital samples, are then passed to a non-linear processing means 2 which may be provided for example by using a look-up table, and are either (if above the knee region of the characteristic) passed linearly or (if below the knee region of the characteristic) strongly relatively attenuated as described above.
The frames of processed spectral components are then passed to the second conversion means 3, which executes the Inverse Fast Fourier Transform to reconstitute a time-domain signal.
Where a window function has been employed prior to transforming the input data, there will be variations in the level of the input to the transform device with time since the level will fall away towards each end of each frame, and so when the inverse transform is executed by the conversion means 3, the reconstituted time-domain signal is in effect amplitude modulated by the window function at the frame frequency. To reduce these amplitude variations, and hence improve the quality of the output signal, it is desirable to "overlap" data from succeeding output frames (in a manner generally known in the art), which has the effect of restoring the envelope of the signal to a good approximation.
Accordingly, the second conversion means 3 includes an overlapping means 3a, such as a pair of overlapped data buffers 3b, 3c and an adder 3d, which produce frames of output data with some degree of overlap. The degree of overlapping that is necessary and desirable depends on the shape of the window function, and varies from zero in the case of a rectangular window upwards for other windows. In the case of a Hanning function, an overlap of 50% is found particularly effective.
Figure 6 shows the effect of overlapping by 50% of a frame. In Figure 6a, the amplitude of each output frame 1, 2, 3 produced by buffer 3b is multiplied by the window function so that there is an audible modulation at frame frequency. Buffer 3c produces an output of frames 1, 2, 3 but delayed by n samples (in other words 50% of the length of each frame). Adder 3d adds the outputs of buffers 3b and 3c together, in other words adds to each sample i_k produced by buffer 3b the corresponding sample i_(k-n) produced by buffer 3c, to produce overlapped output frames I, II, III.
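The overlapping operation may be sketched as a simple overlap-and-add, shown here in Python as an illustration only. The frame length of 8, the use of a constant input and the function names are assumptions made for the example; the single accumulation loop plays the role of the two buffers and the adder.

```python
import math


def hann_window(n):
    """Hanning windowing coefficients (periodic raised-cosine form)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * t / n) for t in range(n)]


def overlap_add(frames, hop):
    """Sum successive frames, each delayed by `hop` samples from the last."""
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        for t, value in enumerate(frame):
            out[i * hop + t] += value
    return out


frame_len = 8
hop = frame_len // 2                 # 50% overlap
window = hann_window(frame_len)
# A constant input of 1.0, so each windowed frame is just the window itself.
frames = [list(window) for _ in range(4)]
envelope = overlap_add(frames, hop)
```

Wherever two frames overlap, the rising half of one window and the falling half of the other sum to unity, so the envelope of the constant input is restored; only the first and last half frames, which have no partner, remain tapered.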
The means to effect such windowing and overlapping functions may, of course, comprise either analogue or digital means as convenient, and it will be understood that window function means 1a and overlapping means 3a might be included within conversion means 1 and 3 respectively as part of a single chip device.
In many systems, the level of the signal may vary slowly with time (as in the case of a fading radio signal, for example) and, independently, the noise level may also vary. In some cases, the two will vary together (as, for example, when an already noisy signal is subject to fading). For the invention to work effectively, it is desirable that most of the signal should remain above the knee region of the characteristic (and the knee region should remain above the noise level), and so some means of positioning the signal relative to the knee region is necessary (although it will be appreciated that the characteristic could itself be adjusted instead).
Accordingly, level adjusting means 4 and level restoring means 4a are provided (see Figure 4) which ensure that the signal is correctly positioned upon the transfer characteristic of non-linear processing means 2. As shown, the level adjusting means 4 detects slow changes in the total power of the signal, and amplifies or attenuates the signal to keep the noise spectrum below the knee and most of the signal above the knee. At the same time level adjusting means 4 sends a control signal to level restoring means 4a so that the processed signal may be restored to its original level. In the simple case where the levels of signal and noise vary together, without significant change in the signal-to-noise ratio, the level adjusting means 4 may be an automatic gain control, and the level control signal is an indication of the gain which acts to control the gain of the level restoring means 4a (the response being slow enough to smooth out fluctuations in level caused by, for example, pauses between spoken words). The invention is generally most effective with signal-to-noise ratios of above +10 dB, and preferably above +18 dB, so the automatic gain control (which responds to the level of signal+noise) is effectively responding to the signal level. With very low signal to noise ratio applications, however, the level adjusting means could alternatively measure one or the other separately, although this separation is technically difficult.
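A level adjusting and restoring arrangement of the automatic gain control kind might be sketched as follows. This Python example is an assumption-laden illustration: the class and function names, the target level of 1.0 and the exponential smoothing of the measured power (standing in for the slow response described above) are not taken from this specification.

```python
import math


class LevelAdjuster:
    """Slow-acting gain control that also reports the gain it applied."""

    def __init__(self, target_rms=1.0, smoothing=0.9):
        self.target_rms = target_rms
        self.smoothing = smoothing    # closer to 1.0 gives a slower response
        self.power = None             # smoothed estimate of total power

    def adjust(self, frame):
        """Scale a frame towards the target level; return (frame, gain)."""
        frame_power = sum(v * v for v in frame) / len(frame)
        if self.power is None:
            self.power = frame_power
        else:
            self.power = (self.smoothing * self.power
                          + (1.0 - self.smoothing) * frame_power)
        gain = self.target_rms / math.sqrt(self.power) if self.power > 0.0 else 1.0
        return [gain * v for v in frame], gain


def restore_level(frame, gain):
    """Undo the gain applied by the adjuster (the level control signal)."""
    return [v / gain for v in frame]
```

In use, the gain returned by `adjust` would accompany the frame through the transform and non-linear processing stages and be passed to `restore_level` afterwards, so that the output returns to its original level.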
Level adjusting means 4 could equally be placed between the transform means 1 and processing means 2, so as to operate in the frequency domain, and likewise level restoring means 4a could equally be placed between processing means 2 and inverse transform means 3. In this case, an estimation of signal level can be made as before by examining the magnitude of the largest transform coefficients (which should usually represent signal terms).
Using this latter approach, it will also be possible under some circumstances to derive an approximate signal-to-noise ratio by comparing this signal level with a noise level derived from the magnitudes of the smallest transform coefficients, which should represent noise data; this may also be used to position the signal relative to the characteristic.
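An approximate signal-to-noise estimate of this kind might be sketched as below. The Python function is illustrative only: taking the mean of the largest and smallest tenth of the coefficient magnitudes is an assumed rule, and the function name and `fraction` parameter are hypothetical.

```python
import math


def estimate_snr_db(magnitudes, fraction=0.1):
    """Approximate signal-to-noise ratio in dB from coefficient magnitudes.

    The signal level is taken as the mean of the largest coefficients and the
    noise level as the mean of the smallest; `fraction` sets how many of each
    are used (an assumed value).
    """
    ordered = sorted(magnitudes)
    count = max(1, int(len(ordered) * fraction))
    noise_level = sum(ordered[:count]) / count
    signal_level = sum(ordered[-count:]) / count
    if noise_level <= 0.0:
        return float("inf")
    return 20.0 * math.log10(signal_level / noise_level)
```

For a frame of twenty coefficients in which two have magnitude 10 and the rest 0.1, this returns approximately 40 dB.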
It is also possible to omit level restoring means 4a, if a constant level output signal is acceptable.
In a second embodiment of the invention, available knowledge about the spectral position of signal data may be utilized to further enhance the noise reduction capability of the invention. Human speech consists of a mixture of "voiced" and "unvoiced" sounds, depending on the presence or absence of glottal action. In most cases these waveforms are processed by the vocal tract, which, being tubelike, gives rise to spectral enhancement in certain bands of frequencies. These enhancements are known as 'formants'. The spectral position of each formant varies between individuals, and further varies while an individual is speaking. Nonetheless, it will often be possible to statistically predict that signal information is more likely to lie in certain spectral bands than in others.
In a second embodiment, different non-linear processing is applied to spectral bands where signals are likely than is applied to bands where noise is likely. The non-linearity will be more pronounced in "noise" bands than in "signal" bands. A range of elements exhibiting different non-linear characteristics, either having different knee regions or different shapes in their non-linear regions, or both, may be provided so that the transition between spectral bands is smoothed.
In one such method, illustrated in Figure 7a, a speech signal is level adjusted, windowed and transformed as previously described. The spectral component signals are then passed to processing means 2, which assigns different component signals to processing elements 2a, 2b, etc., having different characteristics. As shown, if the spectral component signals form a spatially separated series of signals, then signals are physically connected directly to processing elements 2a, 2b etc. Element 2a, having a very non-linear characteristic, is used to process signals in bands where speech components are statistically rare (noise bands) and element 2b, having a less non-linear characteristic, is employed to process signals in bands where formants are commonly found (speech bands).
If the spectral component signals are provided in time-divided frames, then processing means 2 may include a demultiplexer (not shown) to assign the spectral component signals to discrete elements 2a, 2b etc, or a single processing element may be used and its characteristic controlled by control means (not shown) within the processing means 2, so that it exhibits the required predetermined characteristic for each spectral component signal. The processed signals are then retransformed and overlapped by second conversion means 3, and their level restored by level restoring means 4a, as described previously.
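The level adjustment and level restoration steps referred to above might, purely as a sketch, look like the following. Per-frame RMS normalisation and the names `level_adjust` and `level_restore` are assumptions made for the example; the claims describe only an automatic gain control responsive to the average level of the signal.

```python
import math


def level_adjust(frame, target_rms=1.0):
    """Scale a frame to a target RMS level and return the gain that was applied."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    gain = target_rms / rms if rms > 0.0 else 1.0
    return [x * gain for x in frame], gain


def level_restore(frame, gain):
    """Undo the gain applied by level_adjust."""
    return [x / gain for x in frame]


frame = [0.02, -0.04, 0.04, -0.02]
adjusted, gain = level_adjust(frame)
# In a full system the non-linear processing would sit between these two calls.
restored = level_restore(adjusted, gain)
```

Adjusting the level first keeps the signal in a known relationship to the knee of the characteristic, and restoring it afterwards returns the output to the original level.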
In another such method shown in Figure 7b, means are arranged to detect the time-averaged positions of signal bands and non-signal bands for each call over the initial part of the signal (for example the first few seconds of a phone call), and the output of such means is then used to assign the spectral components to processing elements as before for the duration of the call; this embodiment is therefore capable of adapting to different callers.
Referring to Figure 7b, the incoming signal is windowed and transformed as previously described. The spectral component signals are then passed to processing means 2, which assigns component signals to processing elements 2a, 2b, etc., having different characteristics. The separately processed components are then recombined, retransformed and overlapped as previously described by conversion means 3.
The processing means 2 may include assignment means 20 capable of routing spectral component signals to different processing elements 2a, 2b, etc., in accordance with assignment control signals as shown, or alternatively the processing means 2 may comprise one or a plurality of processing elements with characteristics which may be varied in accordance with assignment control signals. The assignment control signals are here provided by averaging means 5, which derive time-averaged information on the positions of formant bands from the output of transform means 1 over the first part of a call and then transmit assignment control signals to processing means 2 to fix for the rest of the call the processing which each spectral component will undergo. The averaging means 5 could form part of the processing means.
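One possible reading of the averaging means 5 is sketched below. The rule used to decide which bins are formant bands (average magnitude above a multiple of the overall mean) is a hypothetical choice made for the example, as are the function names and the data values; the specification does not state how the decision is made.

```python
def time_averaged_spectrum(frames):
    """Average the magnitude of each spectral component over the given frames."""
    n_frames = len(frames)
    n_bins = len(frames[0])
    return [sum(frame[k] for frame in frames) / n_frames for k in range(n_bins)]


def assign_bands(average, ratio=1.0):
    """Label a bin 'speech' when its average exceeds `ratio` times the overall mean."""
    overall = sum(average) / len(average)
    return ["speech" if a > ratio * overall else "noise" for a in average]


# Magnitude spectra from the first part of a call (4 bins per frame, made-up values).
initial_frames = [
    [0.2, 3.0, 0.1, 1.5],
    [0.1, 2.0, 0.3, 2.5],
    [0.3, 4.0, 0.2, 2.0],
]

# The labels are computed once and then held fixed for the rest of the call.
labels = assign_bands(time_averaged_spectrum(initial_frames))
```

The resulting labels play the role of the assignment control signals: they select which characteristic each spectral component receives for the remainder of the call.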
It should be emphasized that in the above two versions of the second embodiment, data representing respectively the population-averaged or time-averaged likely positions of the speech formant bands is used to fix the processing applied to spectral components either for the duration of the call or for a relatively long re-adaptation period.
In a third embodiment of the invention, however, a means is provided for continuously tracking the positions of the formant bands during a call as illustrated in Figure 8. This enables a much closer and more rapid matching of the processing elements with the formant bands and correspondingly more effective noise reduction, since noise outside the formant band can be virtually eliminated. The characteristics of the processing elements may be graduated between formant and non-formant regions, so as to produce a smooth transition. The more data that is available on the shape of the formant band, the more effective is the matching of the processing means.
One technique which may be employed is the 'Line Spectral Pair' or LSP technique, which can provide an estimate of both formant frequency and formant width information if a filter of suitable order is employed.
The operation of this embodiment is as described above for Figure 7b, except that instead of assigning the signals to processing once, the processing is continually reassigned in accordance with assignment control data from tracking means 6, which here comprises a means for executing an LSP analysis of the signal to determine its formant spectral positions and spectral widths.
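The graduation of characteristics between formant and non-formant regions might be sketched as follows. The LSP analysis itself is not implemented here: the formant centres and widths are simply taken as given, as if supplied by tracking means 6 for the current frame. The Gaussian weighting, the function name and the slope values are assumptions made for illustration only.

```python
import math


def graduated_slopes(bin_freqs, formants, speech_slope=1.5, noise_slope=4.0):
    """Return one non-linear slope per bin, graded between formant and other regions.

    `formants` is a list of (centre_hz, width_hz) pairs for the current frame,
    for example as estimated by a tracking stage (not implemented here).
    """
    slopes = []
    for f in bin_freqs:
        # Weight is 1 at a formant centre and decays smoothly away from it.
        w = max(
            (math.exp(-0.5 * ((f - c) / width) ** 2) for c, width in formants),
            default=0.0,
        )
        slopes.append(noise_slope + w * (speech_slope - noise_slope))
    return slopes


bin_freqs = [0.0, 500.0, 1000.0, 1500.0, 2000.0]
frame_formants = [(500.0, 100.0), (1500.0, 200.0)]  # hypothetical tracker output
slopes = graduated_slopes(bin_freqs, frame_formants)
```

Calling such a function once per frame, with fresh formant estimates each time, corresponds to the continual reassignment described above; a wider formant estimate spreads the gentler characteristic over more neighbouring bins.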
It will be appreciated that references to speech signals above apply equally to any type of signal having a similar spectral content, and that the invention is applicable also to voiceband data signalling.
In many implementations, a signal (for example, a speech signal) is decomposed into its spectral components at a transmitter, representations of the spectral components are transmitted to a receiver, and the original signal is there reconstituted. It will readily be appreciated that the invention described above is equally applicable to this class of coding schemes, to remove or reduce any broadband noise which accompanies the input signal (for example, broadband background noise in a speech system). Such implementations merely constitute positioning the transmission link between the non-linear processing stage and one of the transform stages. In a first such embodiment, an input signal is transform coded and the transform coefficients thus produced are processed according to one of the methods described above at the transmitter, the processed coefficients then being transmitted to a receiver of conventional type which effects the inverse transform to reconstitute the signal.
In a second such embodiment, the transform coder at the transmitter is of conventional type, and at the receiver the received transform coefficients are subjected to a non-linear processing stage as described above, prior to the inverse transform operation to reconstitute the original signal.
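The two arrangements just described differ only in which side of the transmission link the non-linear stage sits. A minimal sketch, assuming a lossless link, a plain discrete Fourier transform on a single unwindowed frame, and a hypothetical soft-knee gain (none of which are specified in the text), is:

```python
import cmath
import math


def dft(frame):
    """Forward discrete Fourier transform (direct O(N^2) form, for clarity)."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def idft(coeffs):
    """Inverse transform, returning the real part of each sample."""
    n = len(coeffs)
    return [
        (sum(c * cmath.exp(2j * cmath.pi * k * t / n) for k, c in enumerate(coeffs)) / n).real
        for t in range(n)
    ]


def suppress(coeffs, knee=1.0, slope=3.0):
    """Scale each coefficient's magnitude by a soft-knee gain, keeping its phase."""
    out = []
    for c in coeffs:
        r = (abs(c) / knee) ** (slope - 1.0)
        out.append(c * r / (1.0 + r))
    return out


def link(coeffs):
    """Stand-in for the transmission link; assumed lossless here."""
    return list(coeffs)


# A strong tone in bin 2 plus a weak alternating component in bin 8.
frame = [math.sin(2 * math.pi * 2 * t / 16) + 0.01 * (-1) ** t for t in range(16)]

# First arrangement: non-linear processing at the transmitter, conventional receiver.
tx_processed = idft(link(suppress(dft(frame))))

# Second arrangement: conventional transmitter, non-linear processing at the receiver.
rx_processed = idft(suppress(link(dft(frame))))
```

With a lossless link the two arrangements give the same reconstituted frame; in practice quantisation of the transmitted coefficients would make them differ slightly.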
It will be appreciated that although discrete means for performing each function are illustrated, the invention may be advantageously provided as a single integrated circuit, such as a suitably programmed Digital Signal Processing (DSP) chip package, and in its method aspect, each step may be performed by a suitably programmed digital data processing means.
Claims (19)
1. A noise reduction apparatus comprising:
first conversion means for receiving a time-varying signal and producing therefrom output signals representing the magnitude of spectral components thereof;
processing means for receiving the output of the first conversion means, the processing means having a non-linear transfer characteristic such that in use low magnitude inputs thereto are attenuated relative to high magnitude inputs, the transfer characteristic being substantially linear for high magnitude spectral components and non-linear for low magnitude spectral components, wherein the average slope of the non-linear region represented on identical logarithmic axes does not exceed 10 at detectable signal levels; and
second conversion means for receiving the output of the processing means and reconstituting therefrom a time-varying signal.
2. Apparatus as claimed in claim 1, in which the transition from the linear to the non-linear region of the characteristic is gradual and substantially without discontinuities in slope.
3. Apparatus as claimed in claim 1, in which the first conversion means is arranged to receive the time-varying signal in frames, and to apply to the frames a one-dimensional or complex transform, the output signals thereof being transform coefficients, and the second conversion means is arranged in operation to apply the inverse of that transform to the processed transform coefficients.
4. Apparatus as claimed in claim 3, in which the transform is a Fast Fourier transform.
5. Apparatus as claimed in claim 3, in which the first conversion means is arranged to multiply each of the frames by a window function prior to transforming that frame.
6. Apparatus as claimed in claim 4, in which the first conversion means is arranged to multiply each of the frames by a window function prior to transforming that frame.
7. Apparatus as claimed in claim 5 or 6, in which the second conversion means is arranged to overlap consecutive frames.
8. Apparatus as claimed in claim 1, in which the processing means is arranged to employ a plurality of different transfer characteristics, assigned to spectral component signals corresponding to different portions of the frequency spectrum.
9. Apparatus as claimed in claim 8, in which the frequency assignment of the said different transfer characteristics is predetermined.
10. Apparatus as claimed in claim 9, including means arranged to derive a time-averaged spectral distribution of components of the signal, and to periodically determine the frequency assignment of different transfer characteristics in accordance therewith.
11. Apparatus as claimed in claim 8, in which is provided means arranged to detect the spectral position of components of the signal, and to vary the frequency assignment of different transfer characteristics in accordance therewith.
12. Apparatus as claimed in claim 11, in which the tracking means employs a Line Spectral Pair analysis method.
13. Apparatus as claimed in any one of claims 1, 8, 9, 10, 11 or 12 including level adjusting means adapted to adjust the level of the signals received by the processing means so as to maintain the level of at least some spectral components within a predetermined relationship to the or each transfer characteristic of the processing means.
14. Apparatus according to claim 13, in which the level adjusting means is an automatic gain control circuit responsive to the average level of the time varying signal.
15. Apparatus as claimed in any one of claims 1, 8, 9, 10, 11 or 12, including means adapted to scale the, or each, characteristic of the processing means so as to maintain the level of at least some spectral components within a predetermined relationship to the, or each, transfer characteristic.
16. Subscriber telephone apparatus including noise reduction apparatus according to claim 1, 8, 9, 10, 11 or 12.
17. A method of reducing noise in a time-varying signal comprising the steps of:
converting the signal into a plurality of signals representing the magnitude of spectral components of the signal;
processing each such signal so that low magnitude spectral components are attenuated relative to high magnitude spectral components leaving the relationship between such high magnitude spectral components undistorted wherein the average slope of the non-linear region represented on identical logarithmic axes does not exceed 10 at detectable signal levels;
and converting the signals thus processed so as to produce a reconstituted time-varying signal having an attenuated noise content.
18. A method of reducing noise as claimed in claim 17, in which the transition between the linear and non-linear processing is gradual and substantially without discontinuities in slope.
19. A method of reducing noise as claimed in claim 17 or 18, in which signals representing different spectral components are differently processed.
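The slope limit recited in claims 1 and 17 can be checked numerically for any candidate transfer characteristic. The following is an illustrative sketch only: the soft-knee characteristic, its parameters and the choice of measurement range are assumptions made for the example and are not defined in the claims.

```python
import math


def average_loglog_slope(transfer, low, high):
    """Average slope of `transfer` between inputs `low` and `high` on log-log axes."""
    rise = math.log10(transfer(high)) - math.log10(transfer(low))
    run = math.log10(high) - math.log10(low)
    return rise / run


def example_characteristic(magnitude, knee=1.0, slope=3.0):
    """Hypothetical soft-knee characteristic, close to linear well above `knee`."""
    r = (magnitude / knee) ** (slope - 1.0)
    return magnitude * r / (1.0 + r)


# Non-linear region: two decades below the knee.
low_slope = average_loglog_slope(example_characteristic, 1e-3, 1e-1)
# Linear region: two decades above the knee.
high_slope = average_loglog_slope(example_characteristic, 1e1, 1e3)
```

For this example the average slope is close to 3 in the non-linear region, well inside the recited limit of 10, and close to 1 in the linear region.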
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB8801014 | 1988-01-18 | ||
GB888801014A GB8801014D0 (en) | 1988-01-18 | 1988-01-18 | Noise reduction |
Publications (1)
Publication Number | Publication Date |
---|---|
CA1332626C (en) | 1994-10-18 |
Family
ID=10630116
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA000588588A Expired - Fee Related CA1332626C (en) | 1988-01-18 | 1989-01-18 | Noise reduction |
Country Status (8)
Country | Link |
---|---|
US (1) | US5133013A (en) |
EP (1) | EP0367803B1 (en) |
JP (1) | JP3204501B2 (en) |
CA (1) | CA1332626C (en) |
DE (1) | DE68913139T2 (en) |
GB (2) | GB8801014D0 (en) |
HK (1) | HK121496A (en) |
WO (1) | WO1989006877A1 (en) |
Families Citing this family (56)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5479560A (en) * | 1992-10-30 | 1995-12-26 | Technology Research Association Of Medical And Welfare Apparatus | Formant detecting device and speech processing apparatus |
GB2272615A (en) * | 1992-11-17 | 1994-05-18 | Rudolf Bisping | Controlling signal-to-noise ratio in noisy recordings |
US5742927A (en) * | 1993-02-12 | 1998-04-21 | British Telecommunications Public Limited Company | Noise reduction apparatus using spectral subtraction or scaling and signal attenuation between formant regions |
US5432859A (en) * | 1993-02-23 | 1995-07-11 | Novatel Communications Ltd. | Noise-reduction system |
US5533133A (en) * | 1993-03-26 | 1996-07-02 | Hughes Aircraft Company | Noise suppression in digital voice communications systems |
JP3626492B2 (en) * | 1993-07-07 | 2005-03-09 | ポリコム・インコーポレイテッド | Reduce background noise to improve conversation quality |
US5651071A (en) * | 1993-09-17 | 1997-07-22 | Audiologic, Inc. | Noise reduction system for binaural hearing aid |
JPH07193548A (en) * | 1993-12-25 | 1995-07-28 | Sony Corp | Noise reduction processing method |
GB9405211D0 (en) * | 1994-03-17 | 1994-04-27 | Deas Alexander R | Noise cancellation apparatus |
EP0693747A3 (en) * | 1994-07-18 | 1997-12-29 | Gec-Marconi Limited | An apparatus for cancelling vibrations |
GB9414484D0 (en) * | 1994-07-18 | 1994-09-21 | Marconi Gec Ltd | An apparatus for cancelling vibrations |
JPH08102687A (en) * | 1994-09-29 | 1996-04-16 | Yamaha Corp | Aural transmission/reception system |
EP0797824B1 (en) * | 1994-12-15 | 2000-03-08 | BRITISH TELECOMMUNICATIONS public limited company | Speech processing |
US5774846A (en) * | 1994-12-19 | 1998-06-30 | Matsushita Electric Industrial Co., Ltd. | Speech coding apparatus, linear prediction coefficient analyzing apparatus and noise reducing apparatus |
SE505156C2 (en) * | 1995-01-30 | 1997-07-07 | Ericsson Telefon Ab L M | Procedure for noise suppression by spectral subtraction |
US6263307B1 (en) * | 1995-04-19 | 2001-07-17 | Texas Instruments Incorporated | Adaptive weiner filtering using line spectral frequencies |
JP3591068B2 (en) * | 1995-06-30 | 2004-11-17 | ソニー株式会社 | Noise reduction method for audio signal |
FI100840B (en) | 1995-12-12 | 1998-02-27 | Nokia Mobile Phones Ltd | Noise attenuator and method for attenuating background noise from noisy speech and a mobile station |
SE9700772D0 (en) * | 1997-03-03 | 1997-03-03 | Ericsson Telefon Ab L M | A high resolution post processing method for a speech decoder |
AU764316C (en) * | 1997-04-16 | 2004-06-24 | Emma Mixed Signal C.V. | Apparatus for noise reduction, particulary in hearing aids |
CA2286268C (en) * | 1997-04-16 | 2005-01-04 | Dspfactory Ltd. | Method and apparatus for noise reduction, particularly in hearing aids |
AU8102198A (en) * | 1997-07-01 | 1999-01-25 | Partran Aps | A method of noise reduction in speech signals and an apparatus for performing the method |
GB2343822B (en) * | 1997-07-02 | 2000-11-29 | Simoco Int Ltd | Method and apparatus for speech enhancement in a speech communication system |
GB9714001D0 (en) * | 1997-07-02 | 1997-09-10 | Simoco Europ Limited | Method and apparatus for speech enhancement in a speech communication system |
US5913187A (en) * | 1997-08-29 | 1999-06-15 | Nortel Networks Corporation | Nonlinear filter for noise suppression in linear prediction speech processing devices |
US6157908A (en) * | 1998-01-27 | 2000-12-05 | Hm Electronics, Inc. | Order point communication system and method |
US6415253B1 (en) * | 1998-02-20 | 2002-07-02 | Meta-C Corporation | Method and apparatus for enhancing noise-corrupted speech |
CN1258368A (en) * | 1998-03-30 | 2000-06-28 | 三菱电机株式会社 | Noise reduction device and noise reduction method |
US6453289B1 (en) * | 1998-07-24 | 2002-09-17 | Hughes Electronics Corporation | Method of noise reduction for speech codecs |
US6328507B1 (en) | 1998-08-04 | 2001-12-11 | Shoda Iron Works Co., Ltd | Working table apparatus for a cutting machine tool |
US7991448B2 (en) * | 1998-10-15 | 2011-08-02 | Philips Electronics North America Corporation | Method, apparatus, and system for removing motion artifacts from measurements of bodily parameters |
US6519486B1 (en) | 1998-10-15 | 2003-02-11 | Ntc Technology Inc. | Method, apparatus and system for removing motion artifacts from measurements of bodily parameters |
US6993480B1 (en) | 1998-11-03 | 2006-01-31 | Srs Labs, Inc. | Voice intelligibility enhancement system |
US6604071B1 (en) * | 1999-02-09 | 2003-08-05 | At&T Corp. | Speech enhancement with gain limitations based on speech activity |
JP3454190B2 (en) * | 1999-06-09 | 2003-10-06 | 三菱電機株式会社 | Noise suppression apparatus and method |
US6738445B1 (en) | 1999-11-26 | 2004-05-18 | Ivl Technologies Ltd. | Method and apparatus for changing the frequency content of an input signal and for changing perceptibility of a component of an input signal |
US6931292B1 (en) | 2000-06-19 | 2005-08-16 | Jabra Corporation | Noise reduction method and apparatus |
US20030216907A1 (en) * | 2002-05-14 | 2003-11-20 | Acoustic Technologies, Inc. | Enhancing the aural perception of speech |
US7292985B2 (en) * | 2004-12-02 | 2007-11-06 | Janus Development Group | Device and method for reducing stuttering |
US7672842B2 (en) * | 2006-07-26 | 2010-03-02 | Mitsubishi Electric Research Laboratories, Inc. | Method and system for FFT-based companding for automatic speech recognition |
US7459962B2 (en) * | 2006-07-26 | 2008-12-02 | The Boeing Company | Transient signal detection algorithm using order statistic filters applied to the power spectral estimate |
JP4827661B2 (en) * | 2006-08-30 | 2011-11-30 | 富士通株式会社 | Signal processing method and apparatus |
US7925307B2 (en) * | 2006-10-31 | 2011-04-12 | Palm, Inc. | Audio output using multiple speakers |
US8311590B2 (en) * | 2006-12-05 | 2012-11-13 | Hewlett-Packard Development Company, L.P. | System and method for improved loudspeaker functionality |
US8050434B1 (en) | 2006-12-21 | 2011-11-01 | Srs Labs, Inc. | Multi-channel audio enhancement system |
DK2201567T3 (en) * | 2007-07-27 | 2018-01-08 | Stichting Vumc | NOISE DUTY IN SPEECH SIGNALS |
US8355908B2 (en) * | 2008-03-24 | 2013-01-15 | JVC Kenwood Corporation | Audio signal processing device for noise reduction and audio enhancement, and method for the same |
ATE552690T1 (en) * | 2008-09-19 | 2012-04-15 | Dolby Lab Licensing Corp | UPSTREAM SIGNAL PROCESSING FOR CLIENT DEVICES IN A WIRELESS SMALL CELL NETWORK |
KR101547344B1 (en) * | 2008-10-31 | 2015-08-27 | 삼성전자 주식회사 | Restoraton apparatus and method for voice |
CN101986386B (en) * | 2009-07-29 | 2012-09-26 | 比亚迪股份有限公司 | Method and device for eliminating voice background noise |
US9684087B2 (en) * | 2013-09-12 | 2017-06-20 | Saudi Arabian Oil Company | Dynamic threshold methods for filtering noise and restoring attenuated high-frequency components of acoustic signals |
US9800276B2 (en) | 2013-10-08 | 2017-10-24 | Cisco Technology, Inc. | Ingress cancellation tachometer |
WO2017143334A1 (en) * | 2016-02-19 | 2017-08-24 | New York University | Method and system for multi-talker babble noise reduction using q-factor based signal decomposition |
EP3429230A1 (en) * | 2017-07-13 | 2019-01-16 | GN Hearing A/S | Hearing device and method with non-intrusive speech intelligibility prediction |
US11170799B2 (en) * | 2019-02-13 | 2021-11-09 | Harman International Industries, Incorporated | Nonlinear noise reduction system |
CN110931035B (en) * | 2019-12-09 | 2023-10-10 | 广州酷狗计算机科技有限公司 | Audio processing method, device, equipment and storage medium |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US3989897A (en) * | 1974-10-25 | 1976-11-02 | Carver R W | Method and apparatus for reducing noise content in audio signals |
US4221934A (en) * | 1979-05-11 | 1980-09-09 | Rca Corporation | Compandor for group of FDM signals |
DE3029441A1 (en) * | 1980-08-02 | 1982-03-04 | Licentia Patent-Verwaltungs-Gmbh, 6000 Frankfurt | Dynamic compression and expansion circuit for an analogue signal - has detector, quantitation device, store and arithmetic unit |
US4569569A (en) * | 1982-03-31 | 1986-02-11 | Plessey Overseas Limited | Optical coupling devices |
US4544916A (en) * | 1982-08-31 | 1985-10-01 | At&T Bell Laboratories | Digital code translator |
US4630304A (en) * | 1985-07-01 | 1986-12-16 | Motorola, Inc. | Automatic background noise estimator for a noise suppression system |
US4897878A (en) * | 1985-08-26 | 1990-01-30 | Itt Corporation | Noise compensation in speech recognition apparatus |
US4658426A (en) * | 1985-10-10 | 1987-04-14 | Harold Antin | Adaptive noise suppressor |
US4829578A (en) * | 1986-10-02 | 1989-05-09 | Dragon Systems, Inc. | Speech detection and recognition apparatus for use with background noise of varying levels |
US4887299A (en) * | 1987-11-12 | 1989-12-12 | Nicolet Instrument Corporation | Adaptive, programmable signal processing hearing aid |
- 1988
  - 1988-01-18 GB GB888801014A patent/GB8801014D0/en active Pending
- 1989
  - 1989-01-18 US US07/401,455 patent/US5133013A/en not_active Expired - Fee Related
  - 1989-01-18 CA CA000588588A patent/CA1332626C/en not_active Expired - Fee Related
  - 1989-01-18 DE DE68913139T patent/DE68913139T2/en not_active Expired - Fee Related
  - 1989-01-18 WO PCT/GB1989/000049 patent/WO1989006877A1/en active IP Right Grant
  - 1989-01-18 JP JP50154289A patent/JP3204501B2/en not_active Expired - Fee Related
  - 1989-01-18 GB GB8918755A patent/GB2220330B/en not_active Expired
  - 1989-01-18 EP EP89901730A patent/EP0367803B1/en not_active Expired - Lifetime
- 1996
  - 1996-07-11 HK HK121496A patent/HK121496A/en not_active IP Right Cessation
Also Published As
Publication number | Publication date |
---|---|
GB8918755D0 (en) | 1989-10-04 |
GB2220330B (en) | 1992-03-25 |
EP0367803B1 (en) | 1994-02-16 |
GB2220330A (en) | 1990-01-04 |
DE68913139D1 (en) | 1994-03-24 |
US5133013A (en) | 1992-07-21 |
HK121496A (en) | 1996-07-19 |
JP3204501B2 (en) | 2001-09-04 |
WO1989006877A1 (en) | 1989-07-27 |
GB8801014D0 (en) | 1988-02-17 |
EP0367803A1 (en) | 1990-05-16 |
JPH02503256A (en) | 1990-10-04 |
DE68913139T2 (en) | 1994-06-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MKLA | Lapsed |