KR0175250B1

KR0175250B1 - Vocoder Tone Detection Circuit and Method

Info

Publication number: KR0175250B1
Application number: KR1019920022654A
Authority: KR
Inventors: 김흥국; 조용덕; 공병구
Original assignee: 정용문; 삼성전자주식회사
Priority date: 1992-11-27
Filing date: 1992-11-27
Publication date: 1999-04-01
Also published as: KR940013031A

Abstract

The present invention relates to a tone detection circuit and method for a vocoder, and more particularly to a tone detection circuit and method that detect a tone signal by extracting model parameters from an input signal.

The parameters produced during speech analysis are analyzed to detect and output a DTMF signal, a call progress tone signal, or speech data. A detected DTMF signal is analyzed to determine which dial key it represents, and a detected call progress tone is analyzed to determine whether it is a ringback tone, a dial tone, or a busy tone, so that speech data is processed quickly.

Description

Vocoder Tone Detection Circuit and Method

FIG. 1 is a flowchart of conventional DTMF detection.

FIG. 2 is a flowchart of conventional call progress tone detection.

FIG. 3 is a block diagram of a vocoder according to the present invention.

FIG. 4 is a detailed block diagram of the tone detection unit (102) of FIG. 3.

* Explanation of reference numerals for the main parts of the drawings

101: speech analysis unit    201: unpacketization unit

102: tone detection unit    202: inverse quantization unit

103: quantization unit    203: speech synthesis unit

104: packetization unit    204: tone generation unit

The present invention relates to a tone detection circuit and method for a vocoder, and more particularly to a tone detection circuit and method that detect a tone signal by extracting model parameters from an input signal.

In general, DTMF signals and call progress tones such as the dial tone, busy tone, and ringback tone serve not only as a means of communication between telephone circuits but also as a means of communication between the user and the system in an automatic response system (ARS) that uses telephone lines. A low-bit-rate speech coder does not transmit the speech signal itself; to reduce the number of bits it extracts features of the speech and transmits the extracted parameters, which makes it difficult to recover a tone signal accurately at decoding. This problem can be solved by detecting tones in advance and, when a tone is found, transmitting a code for that tone instead of coding it as speech.

FIG. 1 is a flowchart of conventional DTMF detection and FIG. 2 is a flowchart of conventional call progress tone detection. The conventional operations for detecting DTMF signals and call progress tone signals are described with reference to FIGS. 1 and 2. The discrete Fourier transform (DFT) X(k) of the input S(n) is

$X(k) = \sum_{n=0}^{N-1} S(n)\, e^{-j 2\pi n k / N}$

Here N is the number of input signal samples. With N = 256, the frequency represented by index k is (8 kHz / N)·k.

For example, |X(10)| is the spectral magnitude at (8 kHz / 256)·10 = 312.5 Hz. The input tone signal is analyzed in this way. Because a DTMF signal is generated from the frequencies listed in Table (1), step (11) uses the DFT to compute the spectral magnitude at each of the eight frequencies needed to determine the DTMF state.

In step (12), the magnitudes at the eight frequencies are denoted R(k), k ∈ {697, 770, 852, 941}, and C(k), k ∈ {1209, 1336, 1477, 1633}. For each of R(k) and C(k), the k with the largest magnitude is selected as in Equation (1), giving Krmax and Kcmax, and step (13) is performed.
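
The conventional steps (11) and (12) can be sketched as follows. This is a minimal illustration, not the method of FIG. 1 itself: it evaluates a direct single-frequency DFT sum at each of the eight DTMF frequencies and picks the strongest row and column. The function names and the synthetic test frame `dtmf_frame` are assumptions introduced for this sketch.

```python
import math

FS = 8000  # sampling rate in Hz (8 kHz, as in the text)
ROW_FREQS = (697, 770, 852, 941)      # frequencies of R(k)
COL_FREQS = (1209, 1336, 1477, 1633)  # frequencies of C(k)


def bin_frequency(k, n=256, fs=FS):
    """Frequency in Hz represented by DFT index k: (fs / N) * k."""
    return fs / n * k


def magnitude_at(samples, freq, fs=FS):
    """Spectral magnitude at one frequency via a direct DFT-style sum."""
    re = sum(s * math.cos(2 * math.pi * freq * i / fs) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq * i / fs) for i, s in enumerate(samples))
    return math.hypot(re, im)


def strongest_pair(samples):
    """Return (Krmax, Kcmax): the row and column frequency with the largest magnitude."""
    kr_max = max(ROW_FREQS, key=lambda f: magnitude_at(samples, f))
    kc_max = max(COL_FREQS, key=lambda f: magnitude_at(samples, f))
    return kr_max, kc_max


def dtmf_frame(f_row, f_col, n=256, fs=FS):
    """Hypothetical test input: sum of two unit-amplitude sines."""
    return [math.sin(2 * math.pi * f_row * i / fs) + math.sin(2 * math.pi * f_col * i / fs)
            for i in range(n)]
```

For instance, `strongest_pair(dtmf_frame(770, 1477))` selects the 770 Hz row and the 1477 Hz column. The later threshold, twist, and second-harmonic checks of steps (13) to (16) are not modeled here.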

In step (13), the values at Krmax and Kcmax are denoted R(Krmax) and C(Kcmax). On the basis of these values the signal may be judged not to be a DTMF signal, and step (14) is then performed.

In step (14), every R(k) and C(k) is compared with separate thresholds (R-min-no-tone, C-min-no-tone). If there are not two or more frequencies at or above these thresholds, the signal is judged to be a DTMF signal; if two or more such frequencies exist, it is judged not to be DTMF. Step (15) then checks the twist, which is the difference in spectral magnitude between the row frequency and the column frequency of a single DTMF digit: the forward twist must be 4 dB or less and the reverse twist 8 dB or more. Step (16) compares the magnitudes of the second harmonics, which are likewise computed with the DFT at the corresponding frequencies. That is, R2(Krmax) and C2(Kcmax) are computed, and the signal is accepted as a DTMF candidate only if |R(Krmax) − R2(Krmax)| and |C(Kcmax) − C2(Kcmax)| satisfy the max-2nd-harm threshold condition.

Like DTMF, a call progress tone is generated as the sum of two frequencies, as shown in Table 2. Unlike DTMF, however, the two frequencies of a call progress tone are close together, so the tone is detected by the method of FIG. 2.

Step (21) determines whether the spectral magnitude at or below 120 Hz, or at or above 1600 Hz, is 40 dB or more. If it is, step (25) decides that the signal is not a call tone; otherwise step (22) is performed.

In step (22), the row and column frequencies with the largest magnitude are found, as for DTMF.

In step (23), the call tone is decoded from these frequencies and the cadence shown in Table 2, and in step (24) the decoded call progress tone is detected.

This conventional method of detecting DTMF and call progress tone signals runs separately from the speech coding algorithm. When distortion in the transmission channel distorts the amplitude of the spectrum, detection errors occur. In addition, accurate DFT analysis needs more samples, which increases the decoding delay and slows speech coding.

Accordingly, an object of the present invention is to provide a detection circuit and method that detect a tone signal using parameters obtained from speech signal processing.

Another object of the present invention is to provide a tone detection circuit and method that improve speech processing speed by detecting the tone signal after the speech signal has been processed.

The present invention is described in detail below with reference to the accompanying drawings.

FIG. 3 is a block diagram of a vocoder according to the present invention. It consists of an encoding unit (100) that receives a 64 kbps PCM speech signal and encodes it into data at 8 kbps or less, and a decoding unit (200) that decodes the 8 kbps speech signal encoded by the encoding unit (100) into a 64 kbps speech signal.

The encoding unit (100) consists of: a speech analysis unit (101) that receives the speech signal through the input terminal (P1) and computes from the PCM signal the LSP coefficients, the pitch, the first reflection coefficient of the speech signal (RC1), the ZCR (zero-crossing rate), and COVMAX (Maximum Error Signal Variance); a tone detection unit (102) that detects the type of tone signal, or a speech signal, from the LSP coefficients, pitch, RC1, ZCR, and COVMAX output by the speech analysis unit (101); a quantization unit (103) that quantizes the tone signal detected by the tone detection unit (102) and the analyzed speech signal; and a packetization unit (104) that packetizes the signal quantized by the quantization unit (103).

The decoding unit (200) consists of: an unpacketization unit (201) that receives the packetized signal and unpacks it by separating the amplitude, pitch, DTMF information, and LSP coefficients; an inverse quantization unit (202) that dequantizes the signal unpacked by the unpacketization unit (201); a speech synthesis unit (203) that receives the dequantized speech signal from the inverse quantization unit (202) and synthesizes speech; and a tone generation unit (204) that analyzes the dequantized tone signal from the inverse quantization unit (202) and regenerates the tone signal.

The LSP vocoder configured as above can also be implemented on a single-chip DSP.

FIG. 4 is a detailed block diagram of the tone detection unit (102) of FIG. 3. It consists of: a signal discrimination unit (11) that receives the parameters detected by the speech analysis unit (101) (LSP1-LSP10, ZCR, PITCH, RC1, COVMAX) and detects and outputs DTMF, tone signals, and speech data; a dial key discrimination unit (12) that receives the DTMF signal output by the signal discrimination unit (11) and determines and outputs the type of dial key; and a tone signal discrimination unit (13) that receives the tone signal output by the signal discrimination unit (11) and determines and outputs the type of tone.

A preferred embodiment of the present invention is described in detail with reference to FIGS. 3 and 4.

As a preliminary stage for DTMF processing, the speech analysis unit (101) performs LSP analysis on the signal received through the input terminal (P1). The analysis applies a 30 ms Hamming window to the input signal at a frame period of 10 ms. First the zero-crossing rate (ZCR), which reflects the temporal variation of the input signal, is computed, and the autocorrelation coefficients are calculated for LPC (linear predictive coding) analysis. The linear prediction coefficients obtained from the LPC analysis are converted into LSPs (line spectrum pairs), which have frequency-domain characteristics, yielding LSP1 to LSP10. The pitch is extracted from the residual signal, and the maximum autocorrelation coefficient of the residual signal (COVMAX) and the first reflection coefficient (RC1) are obtained.
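
Two of the simpler analysis parameters can be sketched as below, assuming 8 kHz sampling so that a 30 ms window is 240 samples. The definitions are assumptions made for illustration: the ZCR is taken as a count of sign changes in the windowed frame, and RC1 is taken as the normalized first autocorrelation r(1)/r(0), which matches the positive RC1 values quoted later in the text. LPC, LSP, pitch, and COVMAX extraction are not shown.

```python
import math


def hamming(n):
    """Standard Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def zero_crossings(frame):
    """Count sign changes between consecutive samples (assumed ZCR definition)."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def autocorr(frame, lag):
    """Autocorrelation of the frame at the given lag."""
    return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))


def rc1(frame):
    """First reflection coefficient, assumed here to be r(1) / r(0)."""
    r0 = autocorr(frame, 0)
    return autocorr(frame, 1) / r0 if r0 > 0 else 0.0


def windowed(samples):
    """Apply a Hamming window to one analysis frame."""
    return [s * w for s, w in zip(samples, hamming(len(samples)))]
```

With these definitions a low-frequency tone gives an RC1 close to cos(2πf/8000), so a 400 Hz sine yields roughly 0.95.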

The signal discrimination unit (11) receives the parameters LSP1 to LSP10, ZCR, pitch, RC1, and COVMAX from the speech analysis unit (101) and determines whether they indicate DTMF, a tone (dial tone, busy tone, ringback tone, etc.), or speech data. The conditions for identifying DTMF and tone signals are as follows.

For the input signal to be judged as DTMF, the parameters detected by the speech analysis unit (101) must satisfy all of the following conditions simultaneously.

1) LSP coefficient condition: When the input signal is white noise, the LSP coefficients are distributed almost uniformly between 0 and 1. For DTMF, the top three LSP coefficients are distributed slightly closer to speech than to white noise, within the following ranges: the 8th LSP coefficient must satisfy 0.53 < LSP_8 < 0.67, the 9th must satisfy 0.63 < LSP_9 < 0.75, and the last must satisfy 0.73 < LSP_10 < 0.90 for the signal to be a DTMF candidate.

2) Condition between LSP coefficients: The high-frequency components of a DTMF signal are very small and, unlike speech, a DTMF signal has no formants. When a formant is present, two adjacent LSP coefficients move close together. For DTMF, therefore, adjacent coefficients must stay apart: the 6th and 7th LSP coefficients must satisfy (LSP_7 − LSP_6) > 0.05, and the 7th and 8th must satisfy (LSP_8 − LSP_7) > 0.05.

3) Pitch condition: The DTMF signal S(t) is given by Equation (2):

$S(t) = \sin(\omega_r t) + \sin(\omega_c t) = 2 \sin\!\left[\frac{(\omega_r + \omega_c)t}{2}\right] \cos\!\left[\frac{(\omega_r - \omega_c)t}{2}\right]$   (2)

The pitch is therefore governed by half the difference of the two frequencies, which is 134 to 468 Hz, or 17 to 60 when expressed as a number of samples. Allowing for noise, the pitch of a DTMF signal must be found in the range 15 < pitch < 60.

4) ZCR condition: The zero-crossing rate is governed by half the sum of the two frequencies in Equation (2), which is 958 to 1287 Hz, giving a ZCR of about 10. As Equation (2) shows, however, DTMF can be viewed as a modulated signal with carrier sin[(ω_r + ω_c)t/2] and baseband signal cos[(ω_r − ω_c)t/2], and the actual speech analysis uses a 30 ms Hamming window, so the ZCR increases. Experimentally, the ZCR of a DTMF signal must lie in the range 10 ≤ ZCR < 28.

5) COVMAX condition: The COVMAX parameter is the maximum of the autocorrelation coefficients of the residual signal. It is 0 for white noise and close to 1 for speech. When DTMF is analyzed as if it were speech, it behaves more like voiced than unvoiced sound, so COVMAX > 0.5.

6) RC1 condition: RC1 is the ratio between the signal energy and the first autocorrelation coefficient. A DTMF signal repeats more periodically than voiced speech. Experimentally, the first reflection coefficient RC1 of DTMF signals was distributed at about 0.8.
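
The conditions above can be collected into a single predicate. The following is a sketch under stated assumptions: the inequality directions are read from the surrounding prose (ranges for LSP_8 to LSP_10, pitch, and ZCR; a minimum spacing for adjacent LSPs; COVMAX above 0.5), the container `FrameParams` is hypothetical, and the RC1 condition is left out because its comparison is not unambiguous in the text. The helper `pitch_lag_range` reproduces the 17 to 60 sample figure from the nominal DTMF frequencies.

```python
from dataclasses import dataclass
from typing import Sequence

ROW_FREQS = (697, 770, 852, 941)
COL_FREQS = (1209, 1336, 1477, 1633)


@dataclass
class FrameParams:
    """Hypothetical container for one frame of analysis parameters."""
    lsp: Sequence[float]  # LSP_1..LSP_10 normalized to 0..1; lsp[0] is LSP_1
    pitch: int            # pitch lag in samples
    zcr: int              # zero-crossing count
    covmax: float         # maximum residual autocorrelation


def is_dtmf_candidate(p: FrameParams) -> bool:
    """True only when all listed conditions hold simultaneously (RC1 omitted)."""
    lsp6, lsp7, lsp8, lsp9, lsp10 = p.lsp[5:10]
    return (
        0.53 < lsp8 < 0.67
        and 0.63 < lsp9 < 0.75
        and 0.73 < lsp10 < 0.90
        and (lsp7 - lsp6) > 0.05      # adjacent LSPs must not cluster
        and (lsp8 - lsp7) > 0.05
        and 15 < p.pitch < 60
        and 10 <= p.zcr < 28
        and p.covmax > 0.5
    )


def pitch_lag_range(fs=8000):
    """Pitch lag range in samples implied by half the row/column frequency difference."""
    lowest = (min(COL_FREQS) - max(ROW_FREQS)) / 2   # 134 Hz
    highest = (max(COL_FREQS) - min(ROW_FREQS)) / 2  # 468 Hz
    return round(fs / highest), round(fs / lowest)
```

Evaluating every condition in one expression mirrors the requirement that they be satisfied simultaneously; a frame that fails any single test is passed on to the tone and speech checks.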

The signal discrimination unit (11) thus determines from the parameters output by the speech analysis unit (101) that the signal is a DTMF signal and outputs the result. For the parameters to indicate a call progress tone, they must satisfy the conditions in Table (3).

When the parameters output by the speech analysis unit (101) satisfy the conditions in Table (3), the signal discrimination unit (11) determines that the signal is a call progress tone and outputs the result.

The dial key discrimination unit (12) receives the DTMF signal identified by the signal discrimination unit (11) and determines and outputs the dial key (0, 1, 2, 3, ..., 9, *, #, etc.).

To determine the dial key, the dial key discrimination unit (12) computes the deviation from the DTMF center frequencies using Equations (4) and (5).

Here Frow[k] and Fcol[k] denote the k-th row and column frequencies of Table (1). Among the indices whose deviation is 0.07 or less, the smallest k for the row and for the column are denoted R* and C*, respectively.

The DTMF digit is then 4 × R* + C*, which is identified as the dial key and output. The tone signal discrimination unit (13) receives the call progress tone signal identified by the signal discrimination unit (11) and determines and outputs whether it is a busy tone, a ringback tone, a dial tone, and so on. The type of tone is decided from the distribution of RC1: if RC1 is 0.9 to 0.93 the tone is judged to be a busy tone, if 0.93 to 0.94 a ringback tone, and if 0.94 to 0.96 a dial tone.
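
The digit index and the RC1-based tone decision can be sketched as follows. Several details are assumptions: R* and C* are treated as 0-based indices, the `KEYPAD` string is the standard 4 × 4 DTMF layout rather than the patent's own code table, and the handling of values exactly on a range boundary (0.93, 0.94) is a choice made here because the text does not specify it.

```python
KEYPAD = "123A456B789C*0#D"  # assumed standard row-major DTMF keypad layout


def dtmf_digit_index(r_star, c_star):
    """Digit index 4 * R* + C* for 0-based row and column indices."""
    if not (0 <= r_star < 4 and 0 <= c_star < 4):
        raise ValueError("row and column indices must be in 0..3")
    return 4 * r_star + c_star


def dtmf_key(r_star, c_star):
    """Map the digit index onto the assumed keypad layout."""
    return KEYPAD[dtmf_digit_index(r_star, c_star)]


def classify_call_progress_tone(rc1):
    """Classify a call progress tone from RC1 using the ranges quoted in the text."""
    if 0.90 <= rc1 < 0.93:
        return "busy"
    if 0.93 <= rc1 < 0.94:
        return "ringback"
    if 0.94 <= rc1 <= 0.96:
        return "dial"
    return None  # outside all quoted ranges
```

For example, row index 1 (770 Hz) and column index 2 (1477 Hz) give index 6, which is the key "6" in the assumed layout.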

The quantization unit (103) receives the DTMF, call progress tone, or speech data detected by the tone detection unit (102) and quantizes it as follows.

Because the input speech at analysis is 8-bit μ-law, the amplitude of the speech signal detected by the tone detection unit (102) is converted to 14-bit linear form according to CCITT Recommendation G.711. For voiced sound the pitch T is found in the range 20 to 146, so 7 bits are sufficient to quantize it: the quantized value is QT = T − 19. The voiced/unvoiced flag V/U is allocated 1 bit.
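
The pitch quantization is a plain offset, which a short sketch makes concrete. The function names and the range check are assumptions added for illustration; the offset of 19 and the 20 to 146 range come from the text.

```python
PITCH_MIN, PITCH_MAX = 20, 146  # voiced pitch range from the text, in samples
PITCH_OFFSET = 19


def quantize_pitch(t):
    """QT = T - 19, which maps 20..146 onto 1..127 and fits in 7 bits."""
    if not PITCH_MIN <= t <= PITCH_MAX:
        raise ValueError("pitch outside the 20..146 range")
    return t - PITCH_OFFSET


def dequantize_pitch(qt):
    """Inverse mapping used at the decoder: T = QT + 19."""
    return qt + PITCH_OFFSET
```

The extremes map to 1 and 127, so the code 0 is never produced by a valid voiced pitch.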

The LSP coefficients lie in the range 0 to 4 kHz, and the LSP synthesis filter uses cos ω_i rather than ω_i. Instead of transmitting ω_i, the vocoder of the present invention therefore quantizes cos ω_i, which avoids computing cos⁻¹(·) or cos(·) in the encoder and decoder and reduces the amount of computation. There are 64 quantization levels, so 6 bits are allocated to each ω_i. Over the whole frequency range, the midpoints of the levels are obtained on a log scale as in Equation (6), and the nearest level is found and assigned its bits.

Taking the cosine of this value gives Δ_k = cos(Δ·π/4000) (k = 0, ..., 127).

The result is denoted C_i. Of the C_i (i = 1, ..., 10), the nine other than C_1 are quantized as differences, as in Equation (7).

For DTMF and normal call progress tones, 6 bits are allocated. Table (4) shows the code for each DTMF digit and tone.

DF5 is assigned as the even parity bit for DF4 to DF0.
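
The parity rule can be illustrated with a small sketch. It assumes DF5 is the most significant bit of the 6-bit field and DF4 to DF0 form a 5-bit code; the function names are hypothetical, and the actual code values of Table (4) are not reproduced.

```python
def add_even_parity(code5):
    """Prepend DF5 so that the six bits DF5..DF0 contain an even number of ones."""
    if not 0 <= code5 < 32:
        raise ValueError("DF4..DF0 must be a 5-bit value")
    parity = bin(code5).count("1") & 1  # 1 when the 5-bit code has odd weight
    return (parity << 5) | code5


def parity_ok(code6):
    """Check even parity over the 6-bit field."""
    return bin(code6 & 0x3F).count("1") % 2 == 0
```

Under this bit ordering the 5-bit code 10000 becomes 110000, which is 30 in hexadecimal.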

The bit allocation for the parameters described above is as follows.

After quantization in the quantization unit (103), the packetization unit (104) packs the quantized parameters into 80 bits for transmission. In packetization, the 68 bits described above are classified into three data types. A 19-bit CRC (cyclic redundancy check) code, a (19,12) code, is formed from the 12 bits of Type I, and a (15,10) code is generated for Type II. Table 5 shows the code classification for each parameter.

The generator polynomial of the (19,12) code for Type I in Table (3) is given by Equation (8).

The generator polynomial of Type II is given by Equation (9).

The generator matrices of the (19,12) CRC code and the (15,10) CRC code are also given.
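
A systematic CRC encoder of the (n, k) kind described here can be sketched generically. The generator polynomials of Equations (8) and (9) are not available in this text, so the two polynomials used below are placeholders chosen only for their degree (7 and 5) and are not the patent's. The function names are likewise assumptions.

```python
# Placeholder generators, NOT the polynomials of Equations (8) and (9).
GEN_TYPE1 = 0b10001001  # x^7 + x^3 + 1, degree 7, for a (19,12) code
GEN_TYPE2 = 0b110101    # x^5 + x^4 + x^2 + 1, degree 5, for a (15,10) code


def _reduce(value, top_shift, gen, gen_degree):
    """Polynomial division over GF(2); returns the remainder."""
    for shift in range(top_shift, -1, -1):
        if value & (1 << (shift + gen_degree)):
            value ^= gen << shift
    return value


def crc_encode(data, data_bits, gen, gen_degree):
    """Systematic codeword: the data bits followed by gen_degree check bits."""
    shifted = data << gen_degree
    remainder = _reduce(shifted, data_bits - 1, gen, gen_degree)
    return shifted | remainder


def crc_check(codeword, code_bits, gen, gen_degree):
    """A codeword is valid when it is divisible by the generator polynomial."""
    return _reduce(codeword, code_bits - gen_degree - 1, gen, gen_degree) == 0
```

The receiver can either call `crc_check` or, as the text describes for the unpacketization unit, regenerate the check bits from the received data and compare them with the transmitted ones. If each protected type is coded once, the 7 + 5 check bits would account for the growth from 68 to 80 bits, although that reading is an inference rather than something the text states.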

The 80-bit packetized data is delivered through the highway interface to the unpacketization unit at the receiving end.

The unpacketization unit (201) separates the data received through the highway interface into amplitude, pitch, voiced/unvoiced flag, DTMF information, and LSP coefficient values. It also regenerates the CRC codes for the three data types and compares them with the transmitted CRCs. For DTMF, the corresponding signal is produced by a sine wave generator without any dequantization. If an error is found in a CRC code, it is handled according to the type of error.

The signal separated by the unpacketization unit (201) is applied to the inverse quantization unit (202).

In the inverse quantization process, the amplitude is recovered by the inverse operation: it is converted from μ-law to linear form according to CCITT Recommendation G.711, producing 13 bits.

The pitch is restored by adding 19 to the 7-bit data T6 to T0, that is, T = QT + 19.

For DTMF and call progress tones, the corresponding 6 bits are extracted from the received packet. If the value is not 30 (hexadecimal), the digit is looked up in Table 4, and the DTMF and call progress tone generation unit described below produces 80 samples.

After inverse quantization, the inverse quantization unit (202) determines whether the signal is a tone or speech. If it is a tone, a tone signal is generated; if it is speech, the LSP parameters as they were before quantization at encoding are restored using Table 6.

If the signal dequantized by the inverse quantization unit (202) is a tone signal, the tone generation unit (204) generates the DTMF or call progress tone.

When the frame has been analyzed and transmitted as DTMF or a call progress tone, the two frequencies are looked up in Table (1) for DTMF and Table (2) for call progress tones. The sine function is then called for each frequency and the results are summed to generate the tone signal.
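
Tone regeneration as a sum of two sines can be sketched as below, assuming 8 kHz sampling and 80 samples per 10 ms frame as stated elsewhere in the text. The `start` argument, the amplitude value, and the function name are assumptions; `start` carries the sample index across frames so that consecutive frames join without a phase jump.

```python
import math

FS = 8000   # sampling rate in Hz
FRAME = 80  # samples per 10 ms frame


def generate_tone(f1, f2, start=0, n=FRAME, amplitude=0.5, fs=FS):
    """Sum of two sines at f1 and f2, beginning at absolute sample index `start`."""
    return [
        amplitude * (math.sin(2 * math.pi * f1 * (start + i) / fs)
                     + math.sin(2 * math.pi * f2 * (start + i) / fs))
        for i in range(n)
    ]
```

Calling `generate_tone(697, 1209)` and then `generate_tone(697, 1209, start=80)` yields two consecutive frames of one continuous tone.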

If the signal dequantized by the inverse quantization unit (202) is a speech signal, the speech synthesis unit (203) synthesizes speech, which is delivered to the subscriber line through the subscriber interface (60) of FIG. 2. The LSP coefficients transmitted for synthesis represent the vocal tract model of speech production, while the amplitude, pitch, and voiced/unvoiced bit describe the excitation source for the vocal tract.

Depending on the voiced/unvoiced flag, the excitation is modeled as a pulse train at the pitch period or as random noise, respectively, and is multiplied by the amplitude. The excitation is passed through the synthesis filter of Equation (10) to produce the synthesized speech.

Here P = 10.

This codec produces 80 samples of speech signal per 10 ms.

In actual synthesis, to maintain continuity between the synthesized speech segments produced every 10 ms and to remove the zero-input response of the synthesis filter, the LSP synthesis parameters and the amplitude must be interpolated with those of the previous 10 ms frame. Linear interpolation is used: the change in each LSP coefficient is computed first, and the difference is then added at each sample point.

Here t denotes the current frame and t−1 the previous frame. For each sample, the new C_i is given by Equation (12).
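
The per-sample linear interpolation described here can be sketched as follows. Equations (11) and (12) are not available in this text, so the exact form is an assumption: the per-sample step is taken as the frame-to-frame change divided by 80, and sample n (0-based) uses the previous value plus (n + 1) steps so that the last sample of the frame reaches the current value.

```python
FRAME = 80  # samples per 10 ms frame


def interpolate_parameters(prev, curr, n_samples=FRAME):
    """Return one parameter vector per sample, moving linearly from prev to curr."""
    steps = [(c - p) / n_samples for p, c in zip(prev, curr)]
    return [
        [p + s * (n + 1) for p, s in zip(prev, steps)]
        for n in range(n_samples)
    ]
```

The same routine can be applied to the amplitude by passing single-element lists.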

The parameters restored in this way are the 10th-order LSP parameters, the pitch, and the amplitude. The speech synthesizer generates a speech signal from the restored parameters, and before output at 64 kbps the linear PCM is converted to μ-law (or A-law) PCM.

As described above, in a speech transmission system the DTMF signal and call progress tones are detected from the parameters obtained by speech analysis, which improves the speed of speech data processing.

Claims

A tone detection circuit of a vocoder, comprising: signal discrimination means for receiving the parameters detected during speech analysis (LSP1-LSP10, ZCR, PITCH, RC1, COVMAX) and detecting and outputting DTMF, tone signals, and speech data; dial key discrimination means for receiving the DTMF signal output from the signal discrimination means and determining and outputting the type of dial key; and tone signal discrimination means for receiving the tone signal output from the signal discrimination means and determining and outputting the type of tone.

A tone signal detection method of a vocoder, comprising: a signal detection step of analyzing the parameters produced and input during speech analysis to detect and output a DTMF signal, a call progress tone signal, or speech data; a dial key discrimination step of analyzing the DTMF signal detected in the signal detection step to determine the type of dial key signal; and a tone signal discrimination step of analyzing the call progress tone detected in the signal detection step to determine whether it is a ringback tone, a dial tone, or a busy tone.