KR101396873B1

KR101396873B1 - Method and apparatus for noise reduction in a communication device having two microphones

Info

Publication number: KR101396873B1
Application number: KR1020130036153A
Authority: KR
Inventors: 조정권; 김종현; 반재미
Original assignee: 주식회사 크린컴; 주식회사 시그테크
Priority date: 2013-04-03
Filing date: 2013-04-03
Publication date: 2014-05-19

Abstract

Provided are a method and a device for removing noise using a partitioned block frequency domain adaptive filter (PBFDAF) in a communication device having two microphones. The method comprises: a first step of applying PBFDAF filtering to the signal input to a second microphone and outputting the filtered signal; a second step of outputting a difference signal obtained by subtracting the filtered signal from the signal input to a first microphone; and a third step of multiplying the difference signal by a gain calculated by minimum mean-square error log-spectral amplitude (MMSE-LSA) estimation. According to the present invention, the difference signal e(n), obtained by subtracting the filtered auxiliary-microphone signal from the main-microphone signal, is passed to MMSE-LSA, a single-channel noise reduction method, so that the residual noise remaining in the difference signal is removed once more. The noise removal is therefore efficient.

Description

Method and apparatus for noise reduction in a communication device having two microphones

The present invention relates to a noise reduction method and apparatus in a communication device including two microphones and, more particularly, to a method and apparatus for removing noise using a Partitioned Block Frequency Domain Adaptive Filter (PBFDAF) in a communication device including two microphones.

To reduce background noise in communication devices such as mobile phones, methods that remove noise using two microphones have been proposed. For example, Korean Patent Publication No. 10-2004-0101373 proposes a voice activity detector for a communication device that includes one omnidirectional microphone and one unidirectional microphone separated by a predetermined distance, plus one or more skin-surface microphone sensors in contact with the user's skin; the detector processes the voice activity signal from the skin-surface microphone sensor and outputs a control signal. However, this hardware configuration is complicated and the algorithm depends on it, so the approach is difficult to apply generally.

In addition, Patent No. 574666 discloses a communication device having a voice input unit for receiving voice and a noise input unit for receiving noise, in which the signal entering through the noise input unit is processed so that it approaches the noise entering through the voice input unit and is then subtracted from the signal of the voice input unit, thereby removing the noise that entered through the voice input unit. However, that patent does not disclose what specific processing is applied; it states only that the synthesized signal is fed back and processed. Moreover, the same noise processing is applied whether or not voice is entering the voice input unit, so efficient noise reduction is difficult to expect.

The present invention has been made in view of these points, and its object is to provide an efficient noise reduction method and apparatus that can be applied generally to communication devices including two microphones.

One aspect of the present invention applies to a noise reduction method in a communication device including a first microphone and a second microphone located relatively farther from the speaker's mouth than the first microphone. The present invention can also be applied to a communication device including such a first microphone and second microphone.

The method of the present invention comprises a first step of performing PB-FDAF (Partitioned Block Frequency Domain Adaptive Filtering) on the signal input to the second microphone and outputting a filtered signal, a second step of outputting a difference signal obtained by subtracting the filtered signal from the signal input to the first microphone, and a third step of multiplying the difference signal by a gain calculated by performing MMSE-LSA (Minimum Mean-Square Error Log-Spectral Amplitude) estimation.

The first step may include step 1-1 of determining the presence or absence of voice activity based on the signal d(n) input to the first microphone and the signal x(n) input to the second microphone and outputting a signal indicating the result, step 1-2 of updating the coefficients of the PB-FDAF filter when step 1-1 outputs a signal indicating no voice activity, and step 1-3 of performing PB-FDAF filtering on the signal x(n) input to the second microphone to obtain the filtered output y(n).

With M the number of partitions, N the size of each partition, and L = M×N, step 1-2 updates the q-th coefficient w_{p,q}(n) of the p-th partition by the coefficient update equation, where μ is a convergence constant with a value between 0 and 1 and δ is a constant that keeps the denominator from taking a value close to 0, and step 1-3 performs the filtering by the filtering equation.

In one embodiment of the present invention, step 1-1 includes calculating the amount of change in the power of the signal d(n) input to the first microphone (hereinafter, the "first change amount") and the amount of change in the power of the signal x(n) input to the second microphone (hereinafter, the "second change amount"), and, when the difference between the first change amount and the second change amount is greater than a reference value, determining that there is voice activity and outputting a signal indicating voice activity.

The first change amount may be the difference between the average signal power input to the first microphone during a first period preceding the current time and the average signal power input to the first microphone during a second period preceding the current time that is longer than the first period, and the second change amount may be the difference between the average signal power input to the second microphone during the first period and the average signal power input to the second microphone during the second period.

Alternatively, the first change amount may be the difference between the power of the signal input to the first microphone at the current time and the average power input to the first microphone during a predetermined period preceding the current time, and the second change amount may be the difference between the power of the signal input to the second microphone at the current time and the average power input to the second microphone during the predetermined period preceding the current time.

Preferably, even if the difference between the first change amount and the second change amount falls below the reference value after voice activity has been detected, the signal indicating voice activity continues to be output for a predetermined time.

With k denoting the k-th frequency component and q_k a constant that determines how much noise is removed, the gain is obtained from the gain equation. In it, l is the frame number, E_k²(l) is the power of the difference signal e(n) in frame l, the term indexed (l-1) is the power of the final output z(n) in frame l-1, and α is a smoothing constant. In the accompanying noise spectrum equations, β is a smoothing constant, Y(k,l) is the FFT of y(n) in the l-th frame, and E(k,l) is the FFT of e(n) in the l-th frame.

The third step may further include step 3-1 of determining the presence or absence of voice activity based on the difference signal e(n) and the signal x(n) from the second microphone and outputting a signal indicating the result, and a step of calculating the two noise spectrum estimates when step 3-1 outputs a signal indicating no voice activity.

Preferably, the first microphone is located at the lower end of the communication device and the second microphone is located at the upper end of the communication device.

According to the present invention, MMSE-LSA, a single-channel noise reduction method, is applied to the difference signal e(n), which is obtained by subtracting the filtered auxiliary-microphone signal from the main-microphone signal, so the residual noise remaining in the difference signal is removed once more and the noise removal is efficient. In addition, because voice activity is determined from the difference between the amounts of change in signal power at the two microphones, voice activity can be detected more accurately regardless of the level of the signals entering the two microphones, which in turn allows more accurate noise removal. Furthermore, by varying the degree of noise removal according to the noise level, noise is removed to an appropriate level regardless of its magnitude, so the residual noise in the final output signal can be kept constant.

FIG. 1 is a block diagram showing the internal configuration of a communication device to which the voice activity detection method of the present invention is applied.
FIG. 2 shows an example arrangement of the microphones and speaker in a communication device having two microphones.
FIG. 3 is a flowchart showing an example of a voice activity detection method applicable to the present invention.
FIG. 4 is a block diagram showing the internal configuration of a noise removal unit according to an embodiment of the present invention.
FIG. 5 is a block diagram showing the schematic internal configuration of the PB-FDAF unit.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the drawings.

FIG. 1 shows a block diagram of the internal configuration of a communication device to which the voice activity detection method of the present invention is applied.

The voice activity detection method of the present invention is applied to a communication device having at least two microphones (11, 12). The first microphone (11) is located near the speaker's mouth, and the second microphone (12) is farther from the speaker's mouth than the first microphone (11). Preferably, the first microphone (11) is located at the bottom of the communication device and the second microphone (12) at the top.

The analog signal input to each microphone (11, 12) is amplified to an appropriate level, converted into a digital signal by the analog-to-digital converters (13, 14), and then input to the noise removal unit (16), which employs the noise reduction method of the present invention.

The noise removal unit (16) removes ambient noise using the digitized signals from the microphones (11, 12) and outputs the result to the vocoder (31).

The vocoder (31) encodes the noise-reduced signal from the noise removal unit (16) and transmits it to the other party through the network interface (41), and decodes the other party's voice received through the network interface (41). The decoded voice signal is converted into an analog signal by the digital-to-analog converter (22), amplified to an appropriate level, and output through the speaker (21).

FIG. 2 shows an example arrangement of the microphones and speaker in a communication device having two microphones. As shown in FIG. 2, the first microphone (11) is located at the bottom of the communication device, close to the speaker's mouth, and the second microphone (12) is located at the top of the communication device. The speaker (21) is located in the upper part of the communication device, close to the user's ear.

Next, a noise reduction method according to an embodiment of the present invention is described with reference to FIGS. 3 to 5.

FIG. 4 is a block diagram showing the internal configuration of the noise removal unit according to an embodiment of the present invention.

VAD1 (161) is a voice activity detector that detects voice activity using the signal d(n) input from the first microphone (11) and the signal x(n) input from the second microphone (12).

VAD1 (161) may calculate the amount of change in the signal power input to the first microphone (11) (hereinafter, the "first change amount") and the amount of change in the signal power input to the second microphone (12) (hereinafter, the "second change amount"), and determine the presence or absence of voice activity based on the difference between the two. This method is described with reference to FIG. 3.

FIG. 3 is a flowchart showing the operation of one example of a voice activity detection method applicable to the present invention. The operation of FIG. 3 is preferably performed every frame, but is not limited thereto.

First, VAD1 (161) calculates the amounts of change in the power of the signals d(n) and x(n) (step 310). The amount of change in signal power is defined as the absolute value of the difference between the power of the signal input to a microphone at the current time and the average power input to the same microphone during a predetermined period preceding the current time.

This is expressed by Equation 1. Equation 1 gives the change in signal power for the signal d(n), but the change for the signal x(n) can be obtained in the same way.

Here, M is the number of samples in the preceding predetermined period, including the current sample. To reduce computation, the audio data may be subsampled at fixed intervals instead of using every sample. For example, the sampling for the average signal power may be done once per frame, in which case M is the number of frames in the preceding predetermined period, including the current frame.
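The verbal definition of the power change (current power versus average power over the last M samples, current sample included) can be sketched as follows. Equation 1 itself is not reproduced in this text, so this is an illustration of the description only; the function name `power_change` and the use of the squared sample as instantaneous power are assumptions.

```python
def power_change(samples, m):
    """Absolute difference between the current sample's power and the
    mean power of the last m samples (the current sample included).

    Assumption: instantaneous power is taken as the squared sample value.
    """
    if m < 1 or len(samples) < m:
        raise ValueError("need m >= 1 and at least m samples")
    window = samples[-m:]                      # last m samples, newest last
    current_power = window[-1] ** 2
    average_power = sum(s * s for s in window) / m
    return abs(current_power - average_power)
```

The same function would be called once with the samples of d(n) and once with the samples of x(n) to obtain the first and second change amounts.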

Alternatively, the amount of change in signal power may be taken as the difference between the average signal power input to the microphone during a first period preceding the current time and the average signal power input to the microphone during a second period preceding the current time that is longer than the first period.

This is expressed by Equation 2. Equation 2 gives the change in signal power for the signal d(n), but the change for the signal x(n) can be obtained in the same way.

Here, N is the number of samples in the first period preceding the current time, including the current sample, M is the number of samples in the second period preceding the current time, including the current sample, and N < M. For example, if N is the number of samples in 1 frame and M is the number of samples in 10 frames, the change in signal power is the difference between the average power over 1 frame and the average power over 10 frames. N and M depend on the structure of the communication device, the microphone characteristics, and so on, and suitable values can be found by experiment.
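The two-window variant (short-term average over N samples versus long-term average over M samples, N < M) can be sketched in the same way. Equation 2 is not reproduced in this text; taking the absolute value of the difference mirrors the first definition and is an assumption, as are the names `mean_power` and `power_change_two_windows`.

```python
def mean_power(samples, count):
    """Mean squared value of the last `count` samples."""
    window = samples[-count:]
    return sum(s * s for s in window) / count


def power_change_two_windows(samples, n_short, n_long):
    """Difference between short-term and long-term average power.

    Both windows end at the current (last) sample. The absolute value is
    an assumption carried over from the single-window definition.
    """
    if not 0 < n_short < n_long <= len(samples):
        raise ValueError("require 0 < n_short < n_long <= len(samples)")
    return abs(mean_power(samples, n_short) - mean_power(samples, n_long))
```

With one frame as the short window and ten frames as the long window, a sudden rise in level shows up as a large value for a few frames and then decays as the long-term average catches up.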

Next, if the difference between the first change amount and the second change amount is greater than the reference value ('Yes' in step 320), VAD1 (161) determines that there is voice activity and outputs a signal indicating so (step 340). In this example, a logical value of 1 is output when there is voice activity. The difference may be taken as the first change amount minus the second change amount.

Meanwhile, even if the difference between the first change amount and the second change amount falls below the reference value after voice activity has been detected, it is preferable to keep outputting the signal indicating voice activity for a predetermined time afterwards. To this end, VAD1 (161) initializes the HT value, which represents the hold time (HT = 15 in the example of FIG. 3) (step 330), and then decrements the HT value by one (step 350). For example, if the operation of FIG. 3 is performed every frame, the hold time is 15 times the frame time, so with a frame time of 20 msec the hold time in the example of FIG. 3 is 0.3 seconds.

If the determination in step 320 is that the difference between the first change amount and the second change amount is smaller than the reference value ('No' in step 320), step 360 checks whether the hold time has elapsed. If it has not, that is, if the HT value is greater than 0, the process goes to step 340, keeps the voice activity output unchanged (VAD = 1), and decrements the HT value by one (step 350). If the hold time has elapsed, that is, if the HT value is 0, a signal indicating no voice activity is output (step 370). In this example, a logical value of 0 is output when there is no voice activity.
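The decision flow of steps 320 to 370, including the hold time, can be sketched as a small state machine. The class name `HangoverVad`, its parameter names, and the threshold value are illustrative assumptions; the step numbers in the comments refer to the flow described above.

```python
class HangoverVad:
    """Per-frame voice activity decision with a hold (hangover) counter."""

    def __init__(self, threshold, hold_frames=15):
        self.threshold = threshold      # reference value for step 320
        self.hold_frames = hold_frames  # initial HT value (15 in the example)
        self.ht = 0                     # remaining hold time in frames

    def update(self, first_change, second_change):
        """Return 1 (voice activity) or 0 (no voice activity) for one frame."""
        if first_change - second_change > self.threshold:  # step 320: 'Yes'
            self.ht = self.hold_frames                      # step 330
        elif self.ht <= 0:                                  # step 360: expired
            return 0                                        # step 370
        self.ht -= 1                                        # step 350
        return 1                                            # step 340
```

Because the counter is decremented in the same frame in which it is set, the output stays at 1 for the triggering frame plus `hold_frames - 1` further frames in this sketch.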

Various voice activity detection methods have been proposed in the past, and VAD1 (161) is not limited to any particular one. For example, it may be configured to determine voice activity based on the difference between the signal power input to the first microphone (11) and the signal power input to the second microphone (12).

The PB-FDAF (Partitioned Block Frequency Domain Adaptive Filter) unit (162) filters the signal x(n) from the second microphone (12), using x(n) and the output of VAD1 (161), and outputs the filtered signal y(n).

The PB-FDAF scheme has long been used to remove echo or background noise. It is described in Patent No. 716377, U.S. Patent No. 7,171,436, and elsewhere.

In the PB-FDAF scheme, the filter coefficients are updated adaptively. FIG. 5 shows the schematic internal configuration of the PB-FDAF unit (162) of the present invention. The PB-FDAF unit (162) includes a filter (51) and an update unit (52) for updating the coefficients of the filter (51).

The filter (51) may be implemented as a time-domain or frequency-domain adaptive filter. When implemented in the time domain, it performs a convolution between the input signal and the filter coefficients as in Equation 3; when implemented in the frequency domain, the frequency-domain input signal is multiplied by the filter coefficients as in Equation 4. In Equations 3 and 4, M is the number of partitions, N is the size of each partition, and L = M×N.
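To illustrate the partitioning, the sketch below splits an L-tap time-domain filter into M partitions of N taps and checks that the partitioned sum gives the same output as the ordinary convolution. Equation 3 is not reproduced in this text, so the double-sum form y(n) = Σ_p Σ_q w_{p,q} x(n - pN - q) used here is an assumption, as are the function names.

```python
def fir_output(x, w, n):
    """Ordinary L-tap FIR output y(n) = sum_i w[i] * x[n - i].

    Samples before index 0 are treated as zero.
    """
    return sum(wi * x[n - i] for i, wi in enumerate(w) if n - i >= 0)


def partitioned_output(x, partitions, n):
    """Same output computed partition by partition.

    partitions[p][q] is the q-th coefficient of the p-th partition, so the
    input sample it multiplies is x[n - p*N - q], with N the partition size.
    """
    size = len(partitions[0])
    total = 0.0
    for p, block in enumerate(partitions):
        for q, coeff in enumerate(block):
            idx = n - p * size - q
            if idx >= 0:
                total += coeff * x[idx]
    return total
```

In a frequency-domain implementation each partition would be applied as a product of 2N-point transforms instead of an inner sum, which is where the computational saving of the partitioned scheme comes from.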

In Equation 4, the transform size is 2N, and each variable is as follows.

Here, the q-th coefficient w_{p,q}(n) of the p-th partition can be obtained from the following equation.

In the above equation, μ is a convergence constant with a value between 0 and 1, and δ is a constant that keeps the denominator from taking a value close to 0.

In the present invention, the filter coefficients are updated while no voice activity is detected (that is, while VAD1 = 0); while voice activity is detected (that is, while VAD1 = 1), the most recent filter coefficients continue to be used.
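The gating of the coefficient update by the voice activity flag can be illustrated with a much simpler filter than PB-FDAF. The sketch below is a time-domain normalized LMS (NLMS) filter, used here as a stand-in and not as the partitioned frequency-domain filter of the document; `mu` and `delta` play the roles described for μ and δ, and the function name and default values are assumptions.

```python
def gated_nlms(d, x, vad, taps=4, mu=0.5, delta=1e-6):
    """Time-domain NLMS noise canceller whose update is frozen during speech.

    d   : main-microphone samples
    x   : auxiliary-microphone samples
    vad : per-sample flags, 1 = voice activity, 0 = no voice activity
    Returns the difference signal e(n) = d(n) - y(n) and the final weights.
    """
    w = [0.0] * taps
    history = [0.0] * taps                 # newest auxiliary sample first
    errors = []
    for dn, xn, active in zip(d, x, vad):
        history = [xn] + history[:-1]
        y = sum(wi * hi for wi, hi in zip(w, history))   # filtered signal y(n)
        e = dn - y                                       # difference signal e(n)
        if not active:
            # Adapt only while no voice activity is detected; delta keeps
            # the normalizing denominator away from zero.
            norm = delta + sum(h * h for h in history)
            w = [wi + mu * e * hi / norm for wi, hi in zip(w, history)]
        errors.append(e)
    return errors, w
```

Freezing the weights while speech is present keeps the filter from adapting toward the speech component that leaks into the auxiliary microphone, which would otherwise cancel part of the wanted signal.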

The difference signal e(n), obtained by subtracting the signal y(n) output by the PB-FDAF unit (162) from the signal d(n) from the first microphone (11), is input to the MMSE-LSA (Minimum Mean-Square Error Log-Spectral Amplitude) unit (164), which further removes the noise remaining in e(n) and outputs the output signal z(n). The MMSE-LSA unit (164) also receives y(n) and the output of VAD2 (163).

VAD2 (163) receives e(n) and x(n) and determines from them whether there is voice activity. VAD2 (163) may use the same voice activity detection scheme as VAD1 (161) or a different one.

The output z(n) of the MMSE-LSA unit (164) is e(n) multiplied by a gain, and the gain is calculated by Equation 5.

Here, k denotes the k-th frequency component, and q_k is a constant that determines how much noise is removed: the closer it is to 1, the less noise is removed, and the closer it is to 0, the more. ξ denotes the a priori SNR; it, the a posteriori SNR, and v_k are each defined as follows.
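For orientation, the textbook log-spectral amplitude gain is G = ξ/(1+ξ) · exp(½ E1(v)) with v = ξγ/(1+ξ), where γ is the a posteriori SNR and E1 is the exponential integral. The sketch below implements that textbook form only; Equation 5 is not reproduced in this text and additionally involves q_k, which is omitted here. The function names and the pure-Python evaluation of E1 are assumptions made for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp1(x):
    """Exponential integral E1(x) for x > 0."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total, term = 0.0, 1.0
        for k in range(1, 60):
            term *= -x / k              # term is now (-x)^k / k!
            total -= term / k
        return -EULER_GAMMA - math.log(x) + total
    # Continued fraction (modified Lentz), better conditioned for x > 1.
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, 200):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        step = c * d
        h *= step
        if abs(step - 1.0) < 1e-15:
            break
    return h * math.exp(-x)


def lsa_gain(xi, gamma):
    """Textbook LSA gain for a priori SNR xi and a posteriori SNR gamma."""
    v = max(xi / (1.0 + xi) * gamma, 1e-12)   # guard against v = 0
    return xi / (1.0 + xi) * math.exp(0.5 * exp1(v))
```

At high SNR the exponential term tends to 1 and the gain approaches the Wiener-like factor ξ/(1+ξ), so the difference signal passes almost unchanged; at low SNR the gain drops and residual noise is attenuated.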

Here, l is the frame number, E_k²(l) is the power of the difference signal e(n) in frame l, the term indexed (l-1) is the power of the final output z(n) in frame l-1, and α is a smoothing constant. The two noise spectrum estimates at the k-th frequency of the l-th frame can be obtained from the following equations.

In the above equations, Y(k,l) is the FFT (Fast Fourier Transform) of y(n) in the l-th frame, and E(k,l) is the FFT of e(n) in the l-th frame.

In the above equations, α and β are smoothing constants: the closer they are to 1, the more smoothing is applied, and the closer to 0, the less. More smoothing also means a smaller frame-to-frame variation.
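The roles given to α, the previous frame's output power, and the current difference-signal power match the widely used decision-directed estimate of the a priori SNR. A minimal sketch under that assumption follows; the document's own defining equations are not reproduced in this text, so the formula, the function name `snr_estimates`, and the default value of `alpha` are illustrative.

```python
def snr_estimates(e_power, prev_out_power, noise_power, alpha=0.98, floor=1e-12):
    """Decision-directed SNR estimates for one frequency bin.

    e_power        : power of the difference signal e(n) in the current frame
    prev_out_power : power of the final output z(n) in the previous frame
    noise_power    : current noise spectrum estimate for this bin
    Returns (a priori SNR, a posteriori SNR).
    """
    noise = max(noise_power, floor)            # avoid division by zero
    posteriori = e_power / noise
    priori = (alpha * prev_out_power / noise
              + (1.0 - alpha) * max(posteriori - 1.0, 0.0))
    return priori, posteriori
```

With `alpha` close to 1 the estimate leans on the previous frame's output and changes slowly, which is the "more smoothing, smaller variation" behaviour described for the smoothing constants.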

The two noise spectrum estimates may also be calculated only when VAD2 (163) detects noise. That is, they are updated only when VAD2 = 0 (so that only the noise spectrum is reflected), whereas the MMSE-LSA gain is always multiplied by the error signal regardless of the VAD2 result.
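The gated update can be sketched as a first-order recursive average that runs only in frames flagged as noise. The recursion form β·previous + (1-β)·current is an assumption consistent with the statement that β close to 1 gives more smoothing; the function name and argument layout are illustrative.

```python
def update_noise_spectrum(previous, frame_power, vad2, beta=0.9):
    """Update a per-bin noise spectrum estimate for one frame.

    previous    : list of noise power estimates from the previous frame
    frame_power : list of per-bin powers (|FFT|^2) of the current frame
    vad2        : 1 if voice activity was detected, 0 otherwise
    The estimate is left unchanged while voice activity is present.
    """
    if vad2:
        return list(previous)
    return [beta * p + (1.0 - beta) * c for p, c in zip(previous, frame_power)]
```

The gain itself would still be computed and applied in every frame; only the noise statistics are held during speech.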

The present invention has been described above with several examples. Although all components of the embodiments have been described as being combined into one or operating in combination, the present invention is not necessarily limited to these embodiments. That is, within the scope of the object of the present invention, the components may be selectively combined into one or more units and operated. In addition, although each component may be implemented as a separate piece of hardware, some or all of the components may be selectively combined and implemented as a computer program having program modules that perform some or all of the combined functions on one or more pieces of hardware. The codes and code segments constituting such a computer program can be readily inferred by those skilled in the art. Such a computer program can be stored in a computer-readable medium and read and executed by a computer to implement the embodiments of the present invention. It is also possible to modify an operation described as being performed in the frequency domain so that it is performed in the time domain, or to modify an operation described as being performed in the time domain so that it is performed in the frequency domain.

Terms such as "include", "comprise", or "have" used above mean that the corresponding component may be present unless specifically stated otherwise. They should therefore be construed as allowing other components to be further included, not as excluding them.

The foregoing description merely illustrates the technical idea of the present invention, and those of ordinary skill in the art to which the invention pertains may make various modifications and variations without departing from its essential characteristics. Accordingly, the embodiments disclosed herein are intended to explain, not to limit, the technical idea of the invention, and the scope of that technical idea is not limited by these embodiments. The scope of protection of the invention should be construed according to the following claims, and all technical ideas within their equivalent scope should be construed as falling within the scope of the invention.

11 first microphone,
12 second microphone,
15 voice activity detector,
16 noise canceller.

Claims

1. A noise reduction method in a communication device including a first microphone and a second microphone located relatively farther from the speaker's mouth than the first microphone, the method comprising:
a first step of performing PB-FDAF (Partitioned Block Frequency Domain Adaptive Filtering) on a signal input to the second microphone and outputting a filtered signal;
a second step of outputting a difference signal obtained by subtracting the filtered signal from a signal input to the first microphone; and
a third step of multiplying the difference signal by a gain calculated by performing MMSE-LSA (Minimum Mean-Square Error Log-Spectral Amplitude).
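
The data flow of the three steps can be sketched as below. This is only a structural illustration under stated assumptions: `reduce_noise` and its two callable parameters are hypothetical names, and the lambdas in the usage lines are trivial stand-ins, not PB-FDAF or MMSE-LSA.

```python
from typing import Callable, List, Sequence

Signal = List[float]


def reduce_noise(d: Sequence[float],
                 x: Sequence[float],
                 adaptive_filter: Callable[[Sequence[float]], Signal],
                 gain_for: Callable[[Signal], Signal]) -> Signal:
    """Chain the three steps on one block of samples."""
    y = adaptive_filter(x)                    # step 1: filter the second-microphone signal
    e = [di - yi for di, yi in zip(d, y)]     # step 2: difference signal e = d - y
    g = gain_for(e)                           # step 3: gain derived from the difference signal
    return [gi * ei for gi, ei in zip(g, e)]  #         and multiplied onto it


# Stand-ins for illustration only: a fixed 0.5 scaling in place of PB-FDAF
# and a constant gain in place of MMSE-LSA.
z = reduce_noise([1.0, 2.0, 3.0], [2.0, 2.0, 2.0],
                 adaptive_filter=lambda x: [0.5 * v for v in x],
                 gain_for=lambda e: [0.5] * len(e))
```

Passing the filter and the gain rule as callables keeps the ordering of the steps visible without committing to any particular realization of either stage.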

2. The method of claim 1, wherein the first step comprises:
step 1-1 of determining whether voice activity is present based on the signal d(n) input to the first microphone and the signal x(n) input to the second microphone, and outputting a signal indicating whether voice activity is present;
step 1-2 of updating the coefficients of the PB-FDAF filter when step 1-1 outputs a signal indicating that no voice activity is present; and
step 1-3 of performing PB-FDAF filtering on the signal x(n) input to the second microphone to obtain a filtered output y(n).

3. The method of claim 2, wherein,
with M denoting the number of partitions, N the size of each partition, and L = M×N,
in step 1-2, the following equation

is used to update the q-th coefficient w_{p,q}(n) of the p-th partition (μ being a convergence constant with a value between 0 and 1, and δ being a constant that keeps the denominator from taking a value close to 0), and
in step 1-3, the following equation

is used to perform the filtering.
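
The partitioning named in this claim can be illustrated as follows. The claim's own equations are not reproduced in this text, so the sketch shows only the time-domain equivalent of splitting a length-L filter into M partitions of N taps. It assumes that w_{p,q} corresponds to tap index p·N + q, and `partition_taps` and `filter_sample` are hypothetical names. An actual PB-FDAF would evaluate each partition with FFTs rather than with this double loop.

```python
from typing import List, Sequence


def partition_taps(w: Sequence[float], M: int, N: int) -> List[List[float]]:
    """Split a length-L filter (L = M * N) into M partitions of N taps each."""
    if len(w) != M * N:
        raise ValueError("filter length must equal M * N")
    return [list(w[p * N:(p + 1) * N]) for p in range(M)]


def filter_sample(parts: Sequence[Sequence[float]],
                  x: Sequence[float],
                  n: int) -> float:
    """Output sample y(n) as a double sum over partitions p and taps q."""
    N = len(parts[0])
    y = 0.0
    for p, taps in enumerate(parts):
        for q, w_pq in enumerate(taps):
            k = n - p * N - q          # assumed mapping: w_{p,q} weights x(n - pN - q)
            if 0 <= k < len(x):        # samples outside the buffer count as zero
                y += w_pq * x[k]
    return y
```

Feeding a unit impulse through `filter_sample` returns the original taps in order, which confirms that the partitioned sum matches an ordinary length-L FIR filter under the assumed index mapping.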

4. The method of claim 3, wherein step 1-1 comprises:
calculating an amount of change in the power of the signal d(n) input to the first microphone (hereinafter, the "first change amount") and an amount of change in the power of the signal x(n) input to the second microphone (hereinafter, the "second change amount"); and
determining that voice activity is present, and outputting a signal indicating that voice activity is present, when the difference between the first change amount and the second change amount is greater than a reference value.

5. The method of claim 4, wherein
the first change amount is the difference between the average signal power input to the first microphone during a first period preceding the current time and the average signal power input to the first microphone during a second period preceding the current time, the second period being longer than the first period, and
the second change amount is the difference between the average signal power input to the second microphone during the first period preceding the current time and the average signal power input to the second microphone during the second period preceding the current time.
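
The change amounts of this claim and the decision rule of claim 4 can be sketched as follows, assuming sample-by-sample processing with moving-average windows. The class `PowerChange`, the function `detect_voice`, and the window lengths are illustrative assumptions, not values taken from the patent.

```python
from collections import deque
from typing import Deque


class PowerChange:
    """Short-term minus long-term average power of one microphone signal."""

    def __init__(self, short_len: int, long_len: int) -> None:
        if not 0 < short_len < long_len:
            raise ValueError("the second period must be longer than the first")
        self.short: Deque[float] = deque(maxlen=short_len)  # first (short) period
        self.long: Deque[float] = deque(maxlen=long_len)    # second (long) period

    def push(self, sample: float) -> float:
        """Add one sample and return the current change amount."""
        power = sample * sample
        self.short.append(power)
        self.long.append(power)
        return sum(self.short) / len(self.short) - sum(self.long) / len(self.long)


def detect_voice(first_change: float, second_change: float, reference: float) -> bool:
    """Voice activity is declared when the two change amounts differ by more than the reference."""
    return (first_change - second_change) > reference
```

Because the second microphone is farther from the speaker's mouth, speech onset should raise the first change amount more than the second, whereas ambient noise tends to move both by similar amounts and so largely cancels in the difference.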

6. The method of claim 4, wherein
the first change amount is the difference between the power of the signal input to the first microphone at the current time and the average power input to the first microphone during a predetermined period preceding the current time, and
the second change amount is the difference between the power of the signal input to the second microphone at the current time and the average power input to the second microphone during a predetermined period preceding the current time.

7. The method of any one of claims 4 to 6, wherein, after voice activity has been determined to be present, the signal indicating that voice activity is present continues to be output for a predetermined time even if the difference between the first change amount and the second change amount falls below the reference value.
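
The hold behaviour of this claim (often called hangover) can be sketched as below. The class name `HangoverVad` and the choice to count the predetermined time in frames are assumptions made for illustration.

```python
class HangoverVad:
    """Voice activity decision that stays active for a while after the trigger ends."""

    def __init__(self, reference: float, hangover_frames: int) -> None:
        self.reference = reference
        self.hangover_frames = hangover_frames
        self._remaining = 0  # frames left during which the flag is held

    def step(self, first_change: float, second_change: float) -> bool:
        if first_change - second_change > self.reference:
            # Fresh detection: report voice and re-arm the hold timer.
            self._remaining = self.hangover_frames
            return True
        if self._remaining > 0:
            # Below the reference, but still inside the hold time.
            self._remaining -= 1
            return True
        return False
```

Holding the flag keeps low-energy speech tails and short pauses from being treated as noise-only frames, during which the adaptive filter coefficients would otherwise be updated.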

8. The method of any one of claims 2 to 6, wherein, in the third step,
with k denoting the k-th frequency component and q_k a constant that determines how much noise is removed, the gain is obtained by the following equation

in which, with l denoting the frame number, E_k²(l) the power of the difference signal e(n) in frame l,

(l-1) the power of the final output z(n) in frame l-1, and

a smoothing constant, the following holds

and in which, with β denoting a smoothing constant, Y(k,l) the FFT of y(n) in the l-th frame, and E(k,l) the FFT of e(n) in the l-th frame, the following holds

.
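
The quantities named in this claim (the power of e(n) in the current frame, the power of the output z(n) in the previous frame, and a smoothing constant) are the ingredients of the textbook decision-directed a priori SNR estimate used with MMSE-LSA. The sketch below shows that textbook form for a single frequency bin. It is not the claim's equation, which is not reproduced in this text, and the names `a_priori_snr`, `alpha`, and `floor` are assumptions.

```python
def a_priori_snr(e_power: float,
                 z_prev_power: float,
                 noise_power: float,
                 alpha: float = 0.98,
                 floor: float = 1e-12) -> float:
    """Decision-directed a priori SNR estimate for one frequency bin (textbook form)."""
    noise = max(noise_power, floor)   # guard against division by zero
    gamma = e_power / noise           # a posteriori SNR of the current frame
    # Blend the previous frame's clean-speech estimate with the current evidence.
    return alpha * (z_prev_power / noise) + (1.0 - alpha) * max(gamma - 1.0, 0.0)
```

A smoothing constant close to 1 leans on the previous frame's output and suppresses frame-to-frame fluctuation of the gain, at the cost of a slower response to speech onsets.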

9. The method of claim 8, wherein the third step further comprises:
step 3-1 of determining whether voice activity is present based on the difference signal e(n) and the signal x(n) from the second microphone, and outputting a signal indicating whether voice activity is present; and
a step of calculating the two quantities

and

when step 3-1 outputs a signal indicating that no voice activity is present.

10. The method of any one of claims 1 to 6, wherein the first microphone is located at a lower end of the communication device and the second microphone is located at an upper end of the communication device.

11. A noise reduction apparatus comprising:
a first microphone;
a second microphone located relatively farther from the speaker's mouth than the first microphone; and
a noise canceller including a PB-FDAF unit that performs PB-FDAF (Partitioned Block Frequency Domain Adaptive Filtering) on a signal input to the second microphone and outputs a filtered signal, a subtraction unit that outputs a difference signal obtained by subtracting the filtered signal from a signal input to the first microphone, and an MMSE-LSA unit that multiplies the difference signal by a gain calculated by performing MMSE-LSA (Minimum Mean-Square Error Log-Spectral Amplitude).

12. The apparatus of claim 11, wherein
the noise canceller further comprises a first voice activity detection unit that determines whether voice activity is present based on the signal d(n) input to the first microphone and the signal x(n) input to the second microphone, and outputs a signal indicating whether voice activity is present,
the PB-FDAF unit updates the coefficients of the PB-FDAF filter when the first voice activity detection unit outputs a signal indicating that no voice activity is present, and
the PB-FDAF unit performs PB-FDAF filtering on the signal x(n) input to the second microphone to obtain a filtered output y(n).

13. The apparatus of claim 12, wherein,
with M denoting the number of partitions, N the size of each partition, and L = M×N,
the PB-FDAF unit uses the following equation

to update the q-th coefficient w_{p,q}(n) of the p-th partition (μ being a convergence constant with a value between 0 and 1, and δ being a constant that keeps the denominator from taking a value close to 0), and
the PB-FDAF unit uses the following equation

to perform the filtering.

14. The apparatus of claim 13, wherein the first voice activity detection unit
calculates an amount of change in the power of the signal d(n) input to the first microphone (hereinafter, the "first change amount") and an amount of change in the power of the signal x(n) input to the second microphone (hereinafter, the "second change amount"), determines that voice activity is present when the difference between the first change amount and the second change amount is greater than a reference value, and outputs a signal indicating that voice activity is present.

15. The apparatus of claim 14, wherein
the first change amount is the difference between the average signal power input to the first microphone during a first period preceding the current time and the average signal power input to the first microphone during a second period preceding the current time, the second period being longer than the first period, and
the second change amount is the difference between the average signal power input to the second microphone during the first period preceding the current time and the average signal power input to the second microphone during the second period preceding the current time.

16. The apparatus of claim 14, wherein
the first change amount is the difference between the power of the signal input to the first microphone at the current time and the average power input to the first microphone during a predetermined period preceding the current time, and
the second change amount is the difference between the power of the signal input to the second microphone at the current time and the average power input to the second microphone during a predetermined period preceding the current time.

17. The apparatus of any one of claims 14 to 16, wherein, after determining that voice activity is present, the first voice activity detection unit continues to output the signal indicating that voice activity is present for a predetermined time even if the difference between the first change amount and the second change amount falls below the reference value.

18. The apparatus of any one of claims 12 to 16, wherein the MMSE-LSA unit,
with k denoting the k-th frequency component and q_k a constant that determines how much noise is removed, obtains the gain by the following equation

in which, with l denoting the frame number, E_k²(l) the power of the difference signal e(n) in frame l,

(l-1) the power of the final output z(n) in frame l-1, and

a smoothing constant, the following holds

.

19. The apparatus of claim 18, wherein
the noise canceller further comprises a second voice activity detection unit that determines whether voice activity is present based on the difference signal e(n) and the signal x(n) from the second microphone, and outputs a signal indicating whether voice activity is present, and
the MMSE-LSA unit calculates the two quantities

and

when the second voice activity detection unit outputs a signal indicating that no voice activity is present.

20. The apparatus of any one of claims 11 to 16, wherein the first microphone is located at a lower end of a communication device in which the noise reduction apparatus is installed, and the second microphone is located at an upper end of the communication device.