KR20120022101A - Noise reduction method and device in voice communication of iptv
- Publication number: KR20120022101A
- Application number: KR1020100085216A
- Authority: KR (South Korea)
- Prior art keywords: signal, noise, microphone, equation, voice
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R1/00—Details of transducers, loudspeakers or microphones
- H04R1/08—Mouthpieces; Microphones; Attachments therefor
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L2021/02082—Noise filtering the noise being echo, reverberation of the speech
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03H—IMPEDANCE NETWORKS, e.g. RESONANT CIRCUITS; RESONATORS
- H03H17/00—Networks using digital techniques
- H03H2017/0072—Theoretical filter design
- H03H2017/009—Theoretical filter design of IIR filters
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Circuit For Audible Band Transducer (AREA)
- Telephone Function (AREA)
Abstract
Description
According to the present invention, when the microphone (hereinafter referred to as a microphone) is mounted on a two-way television (TV) or a
A microphone used in a video conferencing system is normally placed within 1 to 2 m of the user's mouth so that the user's target sound keeps a high signal-to-noise ratio relative to the speaker output carrying the other party's voice.
In the prior art, several microphone modules are arranged so that a microphone is never far from the user's mouth. A two-way TV can use multiple microphone modules or a microphone mounted on the remote control, but for ease of management and use, various voice signal processing technologies have been applied so that the microphone module can be attached to the top of the screen. In that position, the noise and the speaker output sound in the input signal are so large compared with the target sound that the signal-to-noise ratio is very poor.
The present invention has been devised to solve the above problems. It uses a microphone array, a sound collecting housing with a special structure, a powerful echo canceller, and a static noise cancelling technique so that two-way voice communication is possible even when the microphone is 3 to 5 m away from the user's mouth. Its purpose is to develop an effective noise cancelling algorithm and to provide a method for implementing a hardware system capable of running that algorithm in real time.
To achieve the above object, the voice signal processing technique according to the present invention combines an echo canceller that reduces the speaker output sound picked up by the microphone, a noise canceller that uses spectral subtraction to remove static environmental noise, a hardware system that performs the preprocessing for the digital signal processing, a microphone array, a sound collecting housing with a special structure, and a microphone preamplifier with a low-pass filter function. By combining this analog hardware with software, only the target sound is emphasized and transmitted to the other party, which enables smooth voice communication.
According to the present invention, only the target sound is extracted and transmitted in a two-way TV, a videoconferencing system, or similar equipment whose microphone must be used far from the user's mouth. This enables smooth two-way communication and makes it possible to activate content in many fields such as communication, education, shopping, and entertainment.
FIG. 1 is a diagram illustrating a microphone sound collecting housing.
FIG. 2 is a software flowchart of the invention.
FIG. 3 is a block diagram of a microphone preamplifier including a summer and a low-pass filter.
FIG. 4 is a diagram illustrating an example of two-way TV voice communication.
FIG. 5 is a block diagram of the signal flow and system of the echo canceller.
FIG. 6 is a block diagram of a single-channel speech enhancement technique.
FIG. 7 is a block diagram of a real-time hardware system.
Hereinafter, the theory, configuration and operation of the present invention will be described in detail.
In the flowchart shown in FIG. 2, the signal input from the microphone array contains the target sound, the speaker output sound, and environmental noise, as in the example of FIG. 4, and is converted to the frequency domain by a Fourier transform. After detecting whether two-way communication is in progress, the echo canceller uses the line-out signal taken just before the speaker output as a reference signal and removes the speaker output sound contained in the microphone input signal.
The single-channel speech enhancer then estimates the static-noise sections and removes the static environmental noise with a Wiener filter. The result is converted back to a speech signal by an inverse Fourier transform and output.
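The flow of FIG. 2 (modules 200 to 240) can be sketched as a frame-by-frame loop. This is only a minimal illustration and not the patented implementation: the function name `process_stream`, the callbacks `echo_fn` and `noise_fn`, the 256-sample frame, the 50% overlap, and the square-root Hann window are all assumptions chosen so that an unmodified spectrum is reconstructed exactly by overlap-add.

```python
import numpy as np

def process_stream(mic, ref, frame_len=256, echo_fn=None, noise_fn=None):
    """Frame-wise FFT -> echo cancel -> noise reduce -> IFFT (hypothetical sketch).

    mic, ref : 1-D real arrays (microphone input and line-out reference).
    echo_fn  : optional callable (Y, X) -> Y, stands in for module 220.
    noise_fn : optional callable (Y) -> Y, stands in for module 230.
    """
    hop = frame_len // 2
    # Periodic Hann window; its square root is applied at analysis and synthesis
    # so that overlapping frames sum back to unity gain (assumed design choice).
    win = np.sqrt(np.hanning(frame_len + 1)[:-1])
    n_frames = (len(mic) - frame_len) // hop + 1
    out = np.zeros(len(mic))
    for l in range(n_frames):
        seg = slice(l * hop, l * hop + frame_len)
        Y = np.fft.rfft(win * mic[seg])          # module 210: time -> frequency
        X = np.fft.rfft(win * ref[seg])          # reference (line-out) spectrum
        if echo_fn is not None:
            Y = echo_fn(Y, X)                    # module 220: echo canceller
        if noise_fn is not None:
            Y = noise_fn(Y)                      # module 230: noise removal
        out[seg] += win * np.fft.irfft(Y, n=frame_len)  # module 240: IFFT + overlap-add
    return out
```

With both callbacks left as `None`, the interior of the output equals the input, which is a convenient sanity check before plugging in an echo canceller or a noise suppressor.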
Microphone housing and preamplifier
If the microphone mounted on top of the TV is 3 to 5 m away from the user's mouth, the microphone's sensitivity and amplification must be increased for the user's voice to be usable as valid data. The speaker output sound and the voices of people to the left and right of the TV, rather than the user in front of it, should be suppressed as much as possible to increase the clarity of the user's target sound. A digital signal processing technique using a microphone array can give the microphone input directivity, but it requires one analog-to-digital conversion (ADC) channel per microphone, which in turn increases the computational load on the microprocessor. To reduce this load and preserve the original sound, the present invention implements this function as a hardware device. Using the microphone sound collecting housing as shown in FIG. 1
In the mid-range
The microphone input signals that pass through the sound collecting housing are summed by an adder built from an element such as an operational amplifier, as shown in FIG. 3 (310). When the signals present in the direction are incident and the signals are summed, attenuation occurs, so that environmental noise can be suppressed. The signal passing through the summer is amplified to a valid data level in the
Echo canceller
Feedback and howling can be prevented by removing the TV's speaker output sound (echo) picked up by the microphone before the signal is transmitted to the other party. In the example two-way communication system of FIG. 4, if the echo component is not removed effectively, communication is practically impossible because of the echo phenomenon, in which a voice is heard back through the speaker of the TV in use, and because a closed loop is formed between the microphone and the speaker. If the speaker volume is high and the microphone's sensitivity and gain are high, howling occurs, which makes the equipment unusable and can seriously damage the amplifier and the speaker.
As shown in FIG. 5, the user's voice and the speaker output sound enter the microphone together. The line-out signal taken just before the speaker output is therefore used as a reference signal, and the coefficients of an adaptive filter are updated in the frequency domain so as to minimize the power of the signal obtained by subtracting the reference signal from the microphone input signal. Double-talk detection, which determines whether the talker is speaking, is performed by measuring the correlation between the microphone signal and the reference signal, and the echo is removed more efficiently by adjusting the gain of the output signal accordingly.
Equations 1 to 3 below relate the other party's signal x(n), the user's speaker output signal y(n), the user's signal s(n), the environmental noise v(n), the error signal e(n), the adaptive filter coefficients, and the line-out (reference) signal before the speaker output after it has passed through the adaptive filter.
[Equation 1]
[Equation 2]
[Equation 3]
The smoothed power of x(n) in Equation 3 is expressed as in Equation 4.
[Equation 4]
The echo canceller uses a complex least mean squares algorithm normalized in the frequency domain.
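As an illustration of a normalized complex least mean squares update in the frequency domain, the following sketch adapts one complex coefficient per frequency bin and normalizes the step by a smoothed reference power. It is a textbook form written under assumptions, not a reconstruction of Equations 1 to 4: the function name `fd_nlms_echo_cancel`, the single-tap-per-bin filter, and the parameters `mu`, `beta`, and `eps` are hypothetical, and double-talk detection is omitted.

```python
import numpy as np

def fd_nlms_echo_cancel(mic_frames, ref_frames, mu=0.5, beta=0.9, eps=1e-8):
    """Per-bin normalized LMS echo canceller (hypothetical sketch).

    mic_frames, ref_frames : complex arrays of shape (n_frames, n_bins)
        holding the spectra of the microphone and line-out reference signals.
    Returns the error spectra, i.e. the microphone spectra with the
    estimated echo subtracted.
    """
    n_frames, n_bins = mic_frames.shape
    w = np.zeros(n_bins, dtype=complex)          # adaptive filter coefficients
    p = np.abs(ref_frames[0]) ** 2               # smoothed reference power, seeded from frame 0
    out = np.empty((n_frames, n_bins), dtype=complex)
    for l in range(n_frames):
        x = ref_frames[l]
        e = mic_frames[l] - w * x                # error = microphone minus estimated echo
        p = beta * p + (1.0 - beta) * np.abs(x) ** 2   # recursive power smoothing
        w = w + mu * np.conj(x) * e / (p + eps)  # normalized coefficient update
        out[l] = e
    return out
```

When the microphone contains only echo (no near-end speech), the error shrinks by roughly a factor of `1 - mu` per frame for a reference of constant magnitude. In practice the update would be frozen or slowed while double talk is detected, as the text describes.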
Voice enhancer
A typical single-channel noise cancellation system operates in the frequency domain and estimates the magnitude of the speech by determining the attenuation, or gain, of each frequency component. It removes ambient noise by exploiting the fact that, in an input signal containing both voice and noise, the noise changes less than the voice.
A block diagram of the proposed single-channel microphone noise cancellation system is shown in FIG. 6. The system estimates the noise power spectrum D(k, l) from the magnitude of the frequency component Y(k, l) of the input signal y(t), in which noise is added to the voice, and uses it to estimate the gain G(k, l). The gain is then multiplied by the magnitude spectrum of the input signal (noise spectral subtraction), and the speech is synthesized with an inverse fast Fourier transform (IFFT).
If a frame is estimated to be a noise section, the gain controller reduces the magnitude of the input signal. If the gain controller is not used, the residual component left after subtracting the noise frequency component is output.
To estimate the noise sections, the variation along the frequency axis and along the time axis is calculated from the statistical characteristics of the microphone input signal. Thresholds are set by experimentally examining these variations in noise sections and in target-signal sections, and each variation is then compared with its threshold to distinguish noise sections from target-signal sections.
Given the power of each frequency component in the frequency domain, its average, and the total power of the frame, the normalized variation in the frequency domain, that is, the frequency flatness, is expressed by Equation 5.
[Equation 5]
Given the power of one frame in the time domain, its average, and the total power of that frame, the normalized variation in the time domain is expressed by Equation 6.
[Equation 6]
When the values of Equations 7 and 8, obtained by taking infinite impulse response (IIR) averages of Equations 5 and 6 respectively, are larger than experimentally obtained thresholds, the frame may be regarded as a target signal.
[Equation 7]
[Equation 8]
Here, the coefficient is the IIR smoothing coefficient, and the two thresholds are obtained experimentally.
As another parameter for estimating the target signal, an IIR average of the input signal power is calculated. Frames whose power is a certain multiple or more of this average are left out of the averaging, so that a large signal that arrives suddenly and continues is regarded as the target sound. The method is described below.
The IIR average power of the current frame is calculated as in Equation 9.
[Equation 9]
Here, the coefficient is an IIR coefficient between 0 and 1, and the remaining terms are the IIR average power of the current frame and the power of the previous frame.
When the IIR average is recalculated using only the power of frames that are below a certain multiple, it is expressed by Equation 10.
[Equation 10]
Here, the coefficient is the IIR smoothing coefficient between 0 and 1, and the remaining terms are the long-term IIR average power up to the current frame and the power of the previous frame.
The value calculated in Equation 10 leaves suddenly large input signals out of the average, and therefore generally represents the frame power level of the noise components. Accordingly, as shown in Equation 11, if the power of the current frame of the microphone input signal is larger than a certain multiple of this value, the frame can be regarded as the target signal and protected.
[Equation 11]
Here, the power term is the power of the current frame, and c is a constant, obtained experimentally, that gives the multiple above which a frame can be regarded as the target signal.
The noise estimator of FIG. 6 takes the noise spectrum from the frames determined to be noise sections by Equations 7, 8, and 11.
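The noise/target decision described above can be sketched as follows. Because the equations themselves are not reproduced here, the formulas are one plausible reading and should be treated as assumptions: the normalized variation is taken as the sum of absolute deviations from the mean divided by the total power, a frame counts as target if any of the three tests fires, and the class name `FrameClassifier`, the thresholds, and the constants `alpha`, `c`, and `m` are hypothetical values rather than the experimentally obtained ones.

```python
import numpy as np

def normalized_variation(power):
    """Sum of absolute deviations from the mean, divided by the total power."""
    total = float(np.sum(power))
    if total <= 0.0:
        return 0.0
    return float(np.sum(np.abs(power - np.mean(power))) / total)

class FrameClassifier:
    """Decides per frame whether the input is target signal or noise (sketch)."""

    def __init__(self, alpha=0.9, thr_freq=1.2, thr_time=1.5, c=4.0, m=2.0):
        self.alpha = alpha          # IIR smoothing coefficient (assumed value)
        self.thr_freq = thr_freq    # threshold on smoothed frequency variation
        self.thr_time = thr_time    # threshold on smoothed time variation
        self.c = c                  # multiple of long-term power that marks target
        self.m = m                  # frames at or above m * average skip the long-term average
        self.var_freq = 0.0
        self.var_time = 0.0
        self.p_avg = None           # IIR average frame power
        self.p_long = None          # long-term IIR average (noise level)

    def is_target(self, frame):
        a = self.alpha
        spec_power = np.abs(np.fft.rfft(frame)) ** 2
        samp_power = frame ** 2
        p = float(np.sum(samp_power))
        # IIR-smoothed variation along the frequency axis and the time axis.
        self.var_freq = a * self.var_freq + (1 - a) * normalized_variation(spec_power)
        self.var_time = a * self.var_time + (1 - a) * normalized_variation(samp_power)
        if self.p_avg is None:
            self.p_avg = self.p_long = p
        # Protect frames far above the long-term (noise) power level.
        loud = p > self.c * self.p_long
        # Only frames below m times the running average update the long-term average.
        if p < self.m * self.p_avg:
            self.p_long = a * self.p_long + (1 - a) * p
        self.p_avg = a * self.p_avg + (1 - a) * p
        return self.var_freq > self.thr_freq or self.var_time > self.thr_time or loud
```

Frames for which `is_target` returns `False` would be the ones used to update the noise spectrum estimate. Excluding loud frames from the long-term average keeps a sustained talker from being absorbed into the noise level.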
In this case, the IIR-averaged signal-to-noise ratio is calculated from the frame power of the noise section estimated using Equations 9 to 11 and the frame power of the input signal, as in Equation 12.
[Equation 12]
Here, the terms are the power of the previous frame and the IIR smoothing coefficient between 0 and 1, and abs is the absolute-value operator.
Applying Equation 12 to the Wiener filter, which is commonly used among single-channel noise cancellation algorithms, the weight multiplied by each frequency component of the input signal is expressed as Equation 13.
[Equation 13]
As a variant of the Wiener filter, the gain multiplied by the frequency component of the input signal in FIG. 6 is expressed by Equation 14 below.
[Equation 14]
Here, the parameter is the attenuation level of the noise component; the larger it is, the more the noise component is reduced.
To reduce sudden changes in gain, the IIR average of the gain is calculated and multiplied by each frequency component of the microphone input signal, as shown in Equation 15.
[Equation 15]
Here, the smoothing coefficient is a constant between 0 and 1 that is determined experimentally.
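The chain from smoothed signal-to-noise ratio to smoothed gain can be sketched as below. The exact forms of Equations 12 to 15 are not recoverable here, so this uses a common Wiener-type rule as a stand-in: the gain is `1 - atten / snr` with a lower floor, where `snr` is the IIR-averaged ratio of input power to estimated noise power. The class name `WienerGain` and the parameters `beta`, `atten`, `gamma`, and `floor` are hypothetical.

```python
import numpy as np

class WienerGain:
    """IIR-smoothed SNR -> Wiener-type gain -> IIR-smoothed gain (sketch)."""

    def __init__(self, n_bins, beta=0.8, atten=2.0, gamma=0.7, floor=0.05):
        self.beta = beta      # smoothing coefficient for the SNR (assumed value)
        self.atten = atten    # attenuation level: larger removes more noise
        self.gamma = gamma    # smoothing coefficient for the gain
        self.floor = floor    # minimum gain, limits musical-noise artifacts
        self.snr = np.ones(n_bins)
        self.gain = np.ones(n_bins)

    def update(self, Y, noise_psd):
        """Y: complex input spectrum; noise_psd: estimated noise power per bin."""
        inst = np.abs(Y) ** 2 / np.maximum(noise_psd, 1e-12)
        # IIR average of the input-to-noise power ratio.
        self.snr = self.beta * self.snr + (1.0 - self.beta) * inst
        # Wiener-type gain with an attenuation level and a lower floor.
        g = np.clip(1.0 - self.atten / np.maximum(self.snr, 1e-12), self.floor, 1.0)
        # IIR average of the gain to avoid abrupt changes between frames.
        self.gain = self.gamma * self.gain + (1.0 - self.gamma) * g
        return self.gain * Y
```

Bins whose power stays near the noise estimate settle at the floor gain, while bins well above it keep a gain close to one. Smoothing the gain trades a slightly slower response for fewer audible fluctuations.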
Hardware system for real time processing
FIG. 7 is a block diagram of an independent hardware system for operating a program implementing the software flowchart shown in FIG. 2 in real time.
A real-time processing board based on an
The input signals from the four microphones are summed into one channel by an
The above description of a structure, hardware system, and signal processing algorithm designed to remove the speaker output sound and environmental noise contained in the microphone input signal of an apparatus such as a two-way TV, in which the microphone is far from the user, has been presented for purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form of the equations or drawings. Many modifications and variations are possible in light of the above teachings, and combinations of the equations and embodiments may be used. It is intended that the scope of the invention be limited not by this detailed description, figures, or equations, but by the claims appended hereto.
200: signal input module of the microphone array
210: Fast Fourier Transform (FFT) module for transforming from the time domain to the frequency domain
220: Echo canceller module for detecting whether bidirectional communication is in progress and removing the echo component
230: Module for estimating noise sections and removing noise by Wiener-filter spectral subtraction
240: Inverse FFT (IFFT) module for converting from the frequency domain to the time domain
Claims (6)
A bidirectional communication method and system characterized by converting the microphone input signal into the frequency domain and then estimating whether the signal of the current frame is noise or a target signal by comparing each of the following with an experimentally obtained threshold: the amount of change in the frequency components according to Equation 7, the amount of change in the time domain, the IIR average power of the current frame, the long-term IIR average power of the current frame, and the power of the current frame.
A noise canceling method that computes the IIR average of the ratio of the input signal to the estimated noise signal, computes the IIR average of the gain based on the Wiener filter of Equation 14, and minimizes the noise component in target-signal sections by multiplying each frequency component of the microphone input signal by that averaged gain.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020100085216A KR20120022101A (en) | 2010-09-01 | 2010-09-01 | Noise reduction method and device in voice communication of iptv |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020100085216A KR20120022101A (en) | 2010-09-01 | 2010-09-01 | Noise reduction method and device in voice communication of iptv |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020120120763A Division KR101402287B1 (en) | 2012-10-29 | 2012-10-29 | Apparatus for eliminating noise |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20120022101A true KR20120022101A (en) | 2012-03-12 |
Family
ID=46130279
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020100085216A KR20120022101A (en) | 2010-09-01 | 2010-09-01 | Noise reduction method and device in voice communication of iptv |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20120022101A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10313789B2 (en) | 2016-06-16 | 2019-06-04 | Samsung Electronics Co., Ltd. | Electronic device, echo signal cancelling method thereof and non-transitory computer readable recording medium |
CN114242106A (en) * | 2020-09-09 | 2022-03-25 | 中车株洲电力机车研究所有限公司 | Voice processing method and device |
CN114242096A (en) * | 2021-08-20 | 2022-03-25 | 北京士昌鼎科技有限公司 | Noise reduction system based on time-frequency domain |
WO2022139899A1 (en) * | 2020-12-23 | 2022-06-30 | Intel Corporation | Acoustic signal processing adaptive to user-to-microphone distances |
-
2010
- 2010-09-01 KR KR1020100085216A patent/KR20120022101A/en not_active Application Discontinuation
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
A107 | Divisional application of patent | ||
E902 | Notification of reason for refusal | ||
E601 | Decision to refuse application |