KR19990053837A

KR19990053837A - Method and apparatus for error concealment of audio signal

Info

Publication number: KR19990053837A
Application number: KR1019970073539A
Authority: KR
Inventors: 전병우; 정제창
Original assignee: 윤종용; 삼성전자 주식회사
Priority date: 1997-12-24
Filing date: 1997-12-24
Publication date: 1999-07-15
Also published as: KR100238324B1

Abstract

An error concealment method and apparatus for an audio signal are disclosed. The apparatus includes a trigonometric transform operator that applies a plurality of trigonometric transforms to the audio data of the frames before and after an errored frame and generates transform coefficients for each transform; a selector that chooses one transform on the basis of energy concentration and outputs the transform coefficients of the preceding and following frames under the selected transform; and an error concealment data generator that interpolates those coefficients in the frequency domain to restore the transform coefficients of the errored frame. The invention thereby improves sound quality and allows more voice information to be transmitted over a given channel.

Description

Method and apparatus for error concealment of an audio signal

The present invention relates to an error concealment method and apparatus for an audio signal, and more particularly to an error concealment method and apparatus that output audio of better sound quality.

Combining MPEG (Moving Picture Experts Group) audio with MPEG video enables highly efficient information compression for multimedia. Even compared with compact disc digital audio (CD-DA), which is not compressed by a high-efficiency compression method, MPEG audio shows almost no loss of sound quality. MPEG audio can be used not only in combination with MPEG video but also on its own, for example in DAB (Digital Audio Broadcasting). In Europe, plans are already being drawn up for digital music broadcasting that transmits MPEG audio via satellite, and other digital broadcasts using MPEG audio alone are also planned.

However, because digital audio data is encoded according to a predefined standard, it is very sensitive to errors within a frame. Error correction codes are therefore used to overcome the problems that errors cause. An MPEG-2 audio frame uses a CRC (Cyclic Redundancy Code), but an error correction code is a fixed amount of additional data, appended to the actual audio data for transmission, which increases the total amount of data. Moreover, because the error characteristics of the channel cannot be known in advance, a code with excessive correction capability may be used, or errors may occur during the error correction process itself.

Therefore, to overcome transmission errors without appending additional data, error concealment methods that exploit the characteristics of human hearing are used. That is, when an error makes a frame unusable, error concealment is applied so that the listener does not perceive the error.

Previously proposed error concealment methods include muting, repetition, error concealment in the frequency domain, error concealment using AR (autoregressive) modeling, and error concealment using least significant bit (LSB) dropping.

(i) Muting

The simplest form of error concealment is to mute a frame that cannot be decoded properly because of an error. Muting frames frequently degrades subjective sound quality considerably. A fade-in or fade-out may be added to avoid a click at the boundary between the audio signal and the muted interval. Even so, muting a frame is very jarring to the ear, and the information in the muted frame is lost.

Accordingly, among the various error concealment methods, one should be chosen that gives better sound quality than the simplest approach of muting the errored frame.
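The muting approach with fades can be sketched as follows. This is a minimal illustration, not taken from the publication: the function name, the linear fade shape and the `fade_len` parameter are all assumptions.

```python
def mute_with_fades(prev_frame, next_frame, frame_len, fade_len):
    """Replace a lost frame with silence and fade its neighbours toward it.

    prev_frame / next_frame: sample lists for the frames before and after
    the lost one. fade_len is a hypothetical parameter: the number of
    samples over which the (assumed linear) fade runs.
    """
    # Fade-out: scale the tail of the previous frame from just below 1 down to 0.
    faded_prev = list(prev_frame)
    start = len(faded_prev) - fade_len
    for i in range(fade_len):
        faded_prev[start + i] *= 1.0 - (i + 1) / fade_len

    # The lost frame itself is replaced by zeros.
    silent = [0.0] * frame_len

    # Fade-in: scale the head of the next frame from 0 up toward 1.
    faded_next = list(next_frame)
    for i in range(fade_len):
        faded_next[i] *= i / fade_len

    return faded_prev, silent, faded_next
```

The fades remove the click at the frame boundaries, but the content of the lost frame is still discarded, which is the drawback the text points out.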

(ii) Repetition

Instead of muting frames that cannot be decoded because of an error, this method replaces them with the previous correctly decoded frame. How well substituting a correctly decoded frame for a lost frame works depends strongly on the audio signal itself, but in most cases it yields better sound quality than simply muting.

However, this method requires memory that can hold at least one frame of audio data. When several frames in a row cannot be decoded, concealment can be performed either by repeating one frame several times or by repeating, once each, as many correctly decoded frames as there are undecodable frames. This approach works well for a pure sine wave or white noise but not for most audio signals.
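A minimal sketch of the first variant (repeating the last good frame as often as needed) is given below. The function name and the boolean `lost` flags are illustrative assumptions; falling back to silence when no good frame has been seen yet is also an assumption.

```python
def conceal_by_repetition(frames, lost):
    """Replace each lost frame with the most recent correctly decoded frame.

    frames: list of frames (each a list of samples); the entry for a lost
            frame is only a placeholder of the right length.
    lost:   list of booleans, True where the frame could not be decoded.
    """
    out = []
    last_good = None  # the one-frame memory that the method requires
    for frame, is_lost in zip(frames, lost):
        if not is_lost:
            last_good = frame
            out.append(list(frame))
        elif last_good is not None:
            out.append(list(last_good))      # repeat the previous good frame
        else:
            out.append([0.0] * len(frame))   # nothing to repeat yet: mute
    return out
```

For example, `conceal_by_repetition([[1, 2], [0, 0], [0, 0], [5, 6]], [False, True, True, False])` returns `[[1, 2], [1, 2], [1, 2], [5, 6]]`.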

(iii) Error concealment in the frequency domain

Error concealment in the time domain is well known, but few methods have been proposed for the frequency domain. Transmission of a digital audio signal is modeled as shown in FIG. 1. The transformation from the time domain to the frequency domain uses the MDCT (Modified Discrete Cosine Transform), a polyphase filter bank, or a hybrid of the two. If an error that the error correction code cannot repair occurs during transmission, the error in the frequency domain causes a burst error in the time domain.

As a frequency-domain method, the error concealment method disclosed in reference [1] arranges the output of the analysis filter bank in a three-dimensional space of time, frequency and amplitude and then clips it to at most a predetermined threshold: [1] J. Herre and E. Eberlein, "Error Concealment in the Spectral Domain". However, because this method has no information about the error, it can degrade sound quality for certain kinds of audio signal.

(iv) Error concealment using AR (autoregressive) modeling

A method that models the signal of a lost frame as an AR process and interpolates the lost frame is disclosed in reference [2]: [2] W. Etter, "Restoration of a Discrete-Time Signal Segment by Interpolation Based on the Left-Sided and Right-Sided Autoregressive Parameters," IEEE Trans. Signal Process., vol. 44, no. 5, pp. 1124-1135, May 1996.

Unlike conventional AR modeling, the method of reference [2] does not fit a single AR model to the lost frame and the signal around it; it uses two different AR models, one based on the preceding frame and one on the following frame. The signals obtained from the two models are weighted with raised-cosine weights, chosen with the continuity of the frames in mind, and summed to restore the signal of the lost frame.
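Only the final weighting step is sketched here. The two AR extrapolations are assumed to be already available as `forward` and `backward`; the exact weight function of reference [2] is not given in this text, so the raised-cosine window below is an assumed form, and the function name is hypothetical.

```python
import math


def raised_cosine_blend(forward, backward):
    """Blend two estimates of a lost frame with raised-cosine weights.

    forward:  estimate extrapolated from the preceding frame's AR model
    backward: estimate extrapolated from the following frame's AR model
    Both must have the same length (at least 2 samples).
    """
    n = len(forward)
    out = []
    for i in range(n):
        # Weight falls smoothly from 1 at the start of the frame to 0 at the end,
        # so the result follows `forward` first and `backward` last.
        w = 0.5 * (1.0 + math.cos(math.pi * i / (n - 1)))
        out.append(w * forward[i] + (1.0 - w) * backward[i])
    return out
```

Because the two weights always sum to 1, the blend matches the forward estimate at the first sample and the backward estimate at the last, which is what keeps the restored frame continuous with its neighbours.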

(v) Error concealment using LSB (least significant bit) dropping

In packet communication, the transmitter discards the excess when more packets are generated than its transmission capacity allows, and the receiver loses any packet that does not arrive within the delay time of its buffer memory. Packets are also discarded to control congestion on the transmission line.

Because the LSBs have little influence on the sound quality of the restored signal, the LSB dropping method divides the transmitted data into MSB (most significant bit) packets and LSB packets, so that losing an LSB packet in transit has little effect on the quality of the restored signal.

However, when an LSB packet is lost, the predictor input in the decoder differs from that in the encoder, so the decoder's quantization step size and prediction parameters diverge, and the output of the decoder's predictor no longer matches the output of the encoder's predictor.
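The MSB/LSB split itself can be illustrated with a few lines of bit arithmetic. This is a sketch under simple assumptions (non-negative integer code words, a fixed number of LSBs per word); the function names and the `lsb_bits` parameter are hypothetical, and the predictor mismatch described above is not modeled.

```python
def split_msb_lsb(sample, lsb_bits):
    """Split a non-negative integer code word into its MSB and LSB parts."""
    msb = sample >> lsb_bits
    lsb = sample & ((1 << lsb_bits) - 1)
    return msb, lsb


def rebuild(msb, lsb, lsb_bits):
    """Rebuild the code word; lsb=None models a dropped LSB packet."""
    if lsb is None:
        lsb = 0  # the missing low-order bits are simply treated as zero
    return (msb << lsb_bits) | lsb
```

For the code word 182 with two LSBs, `split_msb_lsb(182, 2)` gives `(45, 2)`; `rebuild(45, 2, 2)` returns 182, while `rebuild(45, None, 2)` returns 180, an error bounded by the weight of the dropped bits.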

Accordingly, an object of the present invention is to provide a method of concealing errors in an audio signal in the frequency domain.

Another object of the present invention is to provide an error concealment method for an audio signal that restores the audio data of an errored frame using the audio data of the frames before and after it.

Yet another object of the present invention is to provide an error concealment apparatus for an audio signal that restores the audio data of an errored frame using the audio data of the frames before and after it.

To achieve the above and other objects, the error concealment method of the present invention, a method for concealing errors in a decoded audio signal, comprises: applying a plurality of trigonometric transforms to the audio data of the frames before and after an errored frame to generate transform coefficients for each transform; selecting one transform according to a predetermined criterion and generating the transform coefficients of the preceding and following frames under the selected transform; and interpolating those coefficients in the frequency domain to restore the transform coefficients of the errored frame.

To achieve the further object above, the error concealment apparatus of the present invention, an apparatus for concealing errors in a decoded audio signal, comprises: a trigonometric transform operator that applies a plurality of trigonometric transforms to the audio data of the frames before and after an errored frame to generate transform coefficients for each transform; a selector that selects one transform according to a predetermined criterion and generates the transform coefficients of the preceding and following frames under the selected transform; and an error concealment data generator that interpolates those coefficients in the frequency domain to restore the transform coefficients of the errored frame.

FIG. 1 is a block diagram of a general digital audio transmission apparatus.

FIG. 2 is a block diagram of an error concealment apparatus for an audio signal according to the present invention.

FIG. 3 is a block diagram of an MPEG audio decoder, which is one example of the audio decoder shown in FIG. 2.

FIG. 4 is a block diagram of an AC-3 decoder, which is another example of the audio decoder shown in FIG. 2.

FIG. 5 shows a lost frame and correctly decoded frames.

FIG. 6 illustrates the transform coefficients of the audio data of the frames before and after a lost frame.

FIG. 7 illustrates the prediction of the transform coefficients of the audio data of a lost frame.

FIGS. 8 to 12 show some of the transform kernels of the DFT, π-ODFT, DCT, DST and DHT used for synthesis in the error concealment data generator shown in FIG. 2.

FIG. 13 shows the audio data of a lost frame after it has been predicted and restored.

Hereinafter, preferred embodiments of the error concealment method and apparatus for an audio signal according to the present invention are described with reference to the accompanying drawings.

In FIG. 2, a block diagram of an embodiment of the error concealment apparatus according to the present invention, the error concealment apparatus may be the post-processor (120) located after the audio decoder (100), or it may be provided separately after the audio decoder (100). The post-processor (120) may contain further blocks that use the audio data decoded by the audio decoder (100) to improve sound quality.

The error concealment apparatus of the present invention consists of a trigonometric transform operator (122), a trigonometric transform selector (124) and an error concealment data generator (126).

FIG. 3 is a block diagram showing an example in which the post-processor (220) is connected to the output of an MPEG audio decoder (200), and FIG. 4 shows an example in which the post-processor (320) is connected to the output of an AC-3 (Audio Coding-3) decoder (300), the decoder proposed by Dolby.

The demultiplexer (202) of the MPEG audio decoder (200) in FIG. 3 separates the MPEG audio bitstream into the audio bitstream and the control information carried in the bitstream. The inverse quantizer (204) dequantizes the audio bitstream and applies it to the subband and anti-alias filter bank (208). The controller (206) extracts bit allocation information from the control information and applies it to the subband and anti-alias filter bank (208).

The subband and anti-alias filter bank (208) inverse-transforms the dequantized bitstream according to the bit allocation information to restore the subband samples and removes the aliasing contained in them. The stereo level controller (210) separates the output of the subband and anti-alias filter bank (208) into left and right signals, controls their levels, and applies the restored original audio data to the trigonometric transform operator (222) of the post-processor (220).

The information identifying the errored frame, which is applied to the trigonometric transform operator (222) of the post-processor (220), is generated by an error correction decoder (not shown) located before the MPEG decoder (200). This error correction decoder can detect errors in the audio bitstream using the CRC (Cyclic Redundancy Code) carried in the bitstream, so it knows which frame contains an error and generates the corresponding information. The error correction decoder may also be included in the MPEG decoder.
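The idea of flagging errored frames from a checksum can be sketched as below. Note the assumptions: Python's `zlib.crc32` is used purely as a stand-in for the CRC defined by the MPEG audio standard, and the `(payload, stored_crc)` frame layout and the function name are hypothetical.

```python
import zlib


def flag_lost_frames(frames_with_crc):
    """Return one boolean per frame: True where the checksum does not match.

    frames_with_crc: list of (payload_bytes, stored_crc) pairs. zlib.crc32 is
    only a stand-in here; a real decoder would apply the CRC specified by the
    audio standard to the protected part of each frame.
    """
    return [zlib.crc32(payload) != stored for payload, stored in frames_with_crc]
```

The resulting flags play the role of the "information identifying the errored frame" that is passed to the trigonometric transform operator.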

Likewise, the demultiplexer (302) of the AC-3 decoder (300) in FIG. 4 separates the input AC-3 bitstream into the audio bitstream, side information and exponent information. The spectral envelope decoder (304) extracts bit allocation information from the separated exponent information and applies it to the bit allocator (306). The bit allocator (306) applies to the inverse quantizer (308) the quantization step size determined from the side information separated by the demultiplexer (302) and the bit allocation information supplied by the spectral envelope decoder (304). Here, the side information includes information indicating that the input bitstream is an AC-3 bitstream.

The inverse quantizer (308) dequantizes the separated audio bitstream according to the quantization step size and applies it to the inverse filter bank (310). The inverse filter bank (310) inverse-transforms the dequantized audio bitstream to restore the original audio data and applies it to the trigonometric transform operator (322) of the post-processor (320). The information identifying the errored frame, which is applied to the trigonometric transform operator (322), is generated by an error correction decoder located before the AC-3 decoder (300).

Next, the main components of the present invention, namely the trigonometric transform operator (122), the selector (124) and the error concealment data generator (126), are described with reference to FIG. 2.

When an error occurs in a frame of the audio data and the frame is discarded without being decoded, the lost frame is concealed using the correctly decoded frames before and after it.

That is, when a frame (FRAME 2) is lost to an error, as shown in FIG. 5(b), from the audio bitstream of consecutive frames shown in FIG. 5(a), the post-processor (120) restores it using the transform coefficients of the frames before and after it (FRAME 1, FRAME 3).

Specifically, in accordance with the information identifying the lost frame (FRAME 2), the trigonometric transform operator (122) of FIG. 2 computes transform coefficients for the audio data of the preceding frame (FRAME 1) and the following frame (FRAME 3), supplied by the audio decoder (100), using each of the known trigonometric transforms such as the DFT (Discrete Fourier Transform), π-ODFT (π-Offset Discrete Fourier Transform), DCT (Discrete Cosine Transform), DST (Discrete Sine Transform) and DHT (Discrete Hartley Transform).
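The five transforms can be sketched in plain Python as follows. The publication does not state which variant or normalization of each transform is used, so the choices here are assumptions: unitary scaling throughout, type-II forms for the DCT and DST, and a half-bin frequency offset for the π-ODFT. The direct O(N²) sums are for illustration only.

```python
import cmath
import math


def dft(x):
    n = len(x)
    s = 1.0 / math.sqrt(n)
    return [s * sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def odft(x):
    # Assumed form of the pi-ODFT: a DFT whose bin frequencies are shifted by half a bin.
    n = len(x)
    s = 1.0 / math.sqrt(n)
    return [s * sum(x[t] * cmath.exp(-2j * math.pi * (k + 0.5) * t / n) for t in range(n))
            for k in range(n)]


def dct(x):
    # Orthonormal DCT-II.
    n = len(x)
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * sum(x[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n)) for t in range(n))
            for k in range(n)]


def dst(x):
    # Orthonormal DST-II.
    n = len(x)
    return [math.sqrt((1.0 if k == n - 1 else 2.0) / n)
            * sum(x[t] * math.sin(math.pi * (2 * t + 1) * (k + 1) / (2 * n)) for t in range(n))
            for k in range(n)]


def dht(x):
    # Discrete Hartley transform with the cas kernel cos + sin.
    n = len(x)
    s = 1.0 / math.sqrt(n)
    return [s * sum(x[t] * (math.cos(2 * math.pi * k * t / n) + math.sin(2 * math.pi * k * t / n))
                    for t in range(n))
            for k in range(n)]


TRANSFORMS = {"DFT": dft, "pi-ODFT": odft, "DCT": dct, "DST": dst, "DHT": dht}


def all_coefficients(frame):
    """Apply every candidate transform to one frame of audio samples."""
    return {name: transform(frame) for name, transform in TRANSFORMS.items()}
```

With unitary scaling every transform preserves the frame's total energy, which makes the coefficient sets directly comparable when one of them has to be chosen.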

Using the energy concentration of the resulting coefficients, the selector (124) chooses, from the five trigonometric transforms applied to the preceding frame (FRAME 1) and the following frame (FRAME 3), the one transform with the highest energy concentration. The selected transform is then used to restore the audio signal of the lost frame.

The error concealment data generator (126) interpolates the transform coefficients of the preceding and following frames (FRAME 1, FRAME 3) obtained with the selected transform to predict the transform coefficients of the lost frame (FRAME 2), and synthesizes the predicted coefficients to restore the audio data of the lost frame (FRAME 2). The error concealment data generator (126) therefore includes an interpolator that interpolates the transform coefficients and a synthesizer that synthesizes the interpolated coefficients.

Here, choosing one of the five trigonometric transforms by energy concentration amounts to finding the transform coefficients that best characterize the audio signal in the preceding frame (FRAME 1) and the following frame (FRAME 3). These coefficients cannot change abruptly within the lost frame (FRAME 2). Therefore the transform with the highest energy concentration is used to obtain the coefficients that best characterize the audio signal of the preceding and following frames.
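The publication does not define how energy concentration is measured. The sketch below assumes one plausible measure: the fraction of a frame's total energy carried by its `m` largest coefficients, summed over the preceding and following frames. The function names and the parameter `m` are hypothetical.

```python
def energy_concentration(coeffs, m):
    """Fraction of total energy held by the m largest-magnitude coefficients."""
    energies = sorted((abs(c) ** 2 for c in coeffs), reverse=True)
    total = sum(energies)
    return sum(energies[:m]) / total if total > 0 else 0.0


def select_transform(candidates, m):
    """Pick the transform whose coefficients are most concentrated.

    candidates: dict mapping a transform name to a pair
                (coefficients of the preceding frame, coefficients of the following frame).
    """
    def score(name):
        prev_coeffs, next_coeffs = candidates[name]
        return energy_concentration(prev_coeffs, m) + energy_concentration(next_coeffs, m)

    return max(candidates, key=score)
```

A transform that packs the signal into a few large coefficients scores close to 1 per frame, whereas one that spreads the energy evenly scores close to `m / N`.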

The transform coefficients obtained in this way are denoted F_k for the preceding frame (FRAME 1), S_k for the lost frame (FRAME 2) and T_k for the following frame (FRAME 3).

Audio data transform coefficients of frame 1: F_0, F_1, ..., F_{N-1}

Audio data transform coefficients of frame 2: S_0, S_1, ..., S_{N-1}

Audio data transform coefficients of frame 3: T_0, T_1, ..., T_{N-1}

FIGS. 6(a) and 6(b) show, respectively, the audio data transform coefficients (F_0, F_1, ..., F_{N-1}) of the frame (FRAME 1) preceding the lost frame (FRAME 2) and the audio data transform coefficients (T_0, T_1, ..., T_{N-1}) of the following frame (FRAME 3).

S_k is predicted from the k-th transform coefficients F_k and T_k. The difference between F_k and T_k is interpolated linearly over N (= 256) intervals, so that the k-th transform coefficient S_k of the audio data of the lost frame (FRAME 2) is predicted from the transform coefficients (F_k, T_k) of the preceding and following frames.

FIG. 7(a) illustrates how the interpolated transform coefficient S_0(n) of the first audio data of the lost frame is predicted by linearly interpolating, over N intervals, the difference between the first transform coefficients (F_0, T_0) of the frames before and after the lost frame (FRAME 1, FRAME 3). FIG. 7(b) illustrates the same process for the second transform coefficients (F_1, T_1), which yields the interpolated coefficient S_1(n).

The interpolated transform coefficient S_k(n) of the k-th audio data of the lost frame (FRAME 2) obtained in this way can be expressed as Equation 1.
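Equation 1 itself does not survive in this text. Based only on the description of linear interpolation over N intervals, one plausible form is S_k(n) = F_k + (n / N)(T_k - F_k) for n = 0, ..., N-1; the exact indexing is an assumption, as is the function name in the sketch below.

```python
def interpolate_coefficients(f, t, n_steps):
    """Linearly interpolate each coefficient from F_k toward T_k.

    f, t:    coefficient lists of the preceding and following frames
    n_steps: number of interpolation intervals N (256 in the description)
    Returns s where s[k][n] is the k-th coefficient at step n of the lost frame,
    using the assumed form S_k(n) = F_k + (n / N) * (T_k - F_k).
    """
    return [[fk + (tk - fk) * n / n_steps for n in range(n_steps)]
            for fk, tk in zip(f, t)]
```

For example, `interpolate_coefficients([0.0, 4.0], [4.0, 0.0], 4)` returns `[[0.0, 1.0, 2.0, 3.0], [4.0, 3.0, 2.0, 1.0]]`.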

Next, the interpolated transform coefficients of the audio data of the lost frame are synthesized to generate audio data in the time domain. Some of the kernels of the transforms used in this synthesis are shown in FIGS. 8 to 12.

In practice all 256 kernels are used, but only the six lowest-frequency kernels are shown here. FIG. 8 shows the transform kernels of the DFT, FIG. 9 those of the π-ODFT, FIG. 10 those of the DCT, FIG. 11 those of the DST, and FIG. 12 those of the DHT. In FIGS. 8 to 12 the horizontal axis represents time and the vertical axis represents amplitude.

For the audio signal of the restored frame (FRAME 2) to be continuous with the preceding frame (FRAME 1) and the following frame (FRAME 3), the kernels of the transform used to synthesize the audio data must be continuous from one frame to the next. Table 1 gives, for each transform, the time-domain synthesis formula that makes the kernels continuous between frames.

Table 1. Synthesis formulas used to restore the audio signal, one for each transform: DFT, π-ODFT, DCT, DST and DHT. [Formulas not reproduced in this text.]

Here, the synthesized result is the audio data of the frame (FRAME 2) restored in the time domain. Audio data synthesized with the formulas of Table 1 is thus continuous with the preceding and following frames.
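The synthesis formulas of Table 1 are not available in this text, so the sketch below does not reproduce them. It only illustrates the general idea of sample-by-sample synthesis from time-varying coefficients, using plain orthonormal DCT kernels as an assumed stand-in; the kernels adjusted for continuity between frames would replace the cosine term. All names are hypothetical.

```python
import math


def dct(x):
    """Orthonormal DCT-II, used here only to produce example coefficients."""
    n = len(x)
    return [math.sqrt((1.0 if k == 0 else 2.0) / n)
            * sum(x[t] * math.cos(math.pi * (2 * t + 1) * k / (2 * n)) for t in range(n))
            for k in range(n)]


def synthesize_dct(s):
    """Synthesize one frame from time-varying DCT coefficients.

    s[k][n] is the interpolated k-th coefficient at sample n; the list must be
    square (N coefficients, each with N steps). Each output sample uses the
    coefficient values that apply at that sample.
    """
    n_len = len(s)
    out = []
    for n in range(n_len):
        acc = 0.0
        for k in range(n_len):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n_len)
            acc += scale * s[k][n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_len))
        out.append(acc)
    return out
```

As a sanity check on the sketch: if every coefficient is held constant across the frame, the synthesis reduces to the ordinary inverse DCT and returns the frame whose coefficients were supplied.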

FIG. 13 shows the audio data of a frame (FRAME 2) lost to an error after it has been restored by the error concealment method using the audio data of the preceding and following frames.

The present invention can improve sound quality in fields such as the transmission of voice and audio signals in the multimedia era, and thereby makes it possible to transmit more voice information over a given channel in the information society.

Claims

A method of concealing errors in a decoded audio signal, comprising:

(a) applying a plurality of trigonometric transforms to the audio data of the frames before and after an errored frame to generate transform coefficients for each transform;

(b) selecting one of the transforms according to a predetermined criterion and generating the transform coefficients of the preceding and following frames under the selected transform; and

(c) interpolating, in the frequency domain, the transform coefficients of the preceding and following frames under the selected transform to restore the transform coefficients of the errored frame.

The method of claim 1, wherein the predetermined criterion is energy concentration.

The method of claim 1, wherein the plurality of trigonometric transforms include at least the DFT (Discrete Fourier Transform), π-ODFT (π-Offset Discrete Fourier Transform), DCT (Discrete Cosine Transform), DST (Discrete Sine Transform) and DHT (Discrete Hartley Transform).

The method of claim 1, wherein step (c) comprises:

(c1) generating the transform coefficients of the frame in which the error occurred by applying a predetermined interpolation in the frequency domain to the transform coefficients of the before and after frames for the selected transform; and

(c2) synthesizing the transform coefficients in the time domain to restore the frame in which the error occurred.

The method of claim 4, wherein the predetermined interpolation is linear interpolation.
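Steps (c1) and (c2) with linear interpolation can be sketched as follows. This is a minimal illustration under stated assumptions: an unnormalized DCT-II stands in for whichever transform was selected, the default `weight` of 0.5 assumes the lost frame lies midway between its neighbours, and the names `dct2`, `idct2`, and `conceal_frame` are hypothetical.

```python
import math


def dct2(x):
    """Unnormalized DCT-II, used here as a stand-in for the selected transform."""
    n_pts = len(x)
    return [sum(x[n] * math.cos(math.pi * k * (2 * n + 1) / (2 * n_pts)) for n in range(n_pts))
            for k in range(n_pts)]


def idct2(c):
    """Inverse of the unnormalized DCT-II above (time-domain synthesis)."""
    n_pts = len(c)
    return [(c[0] + 2 * sum(c[k] * math.cos(math.pi * k * (2 * n + 1) / (2 * n_pts))
                            for k in range(1, n_pts))) / n_pts
            for n in range(n_pts)]


def conceal_frame(prev_frame, next_frame, weight=0.5):
    """Estimate a lost frame from its neighbours.

    (c1) linearly interpolate the transform coefficients of the two frames;
    (c2) synthesize the interpolated coefficients back into the time domain.
    """
    prev_c = dct2(prev_frame)
    next_c = dct2(next_frame)
    lost_c = [(1 - weight) * p + weight * q for p, q in zip(prev_c, next_c)]
    return idct2(lost_c)
```

Calling `conceal_frame(frame_1, frame_3)` returns an estimate of the lost frame with the same length as its neighbours; both inputs are assumed to have equal length.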

An apparatus for concealing an error in a decoded audio signal, the apparatus comprising:

a trigonometric transform operator that performs a plurality of trigonometric transforms on the audio data of the frames before and after a frame in which an error has occurred, to generate transform coefficients for each transform;

a selector that selects, according to a predetermined criterion, one of the sets of transform coefficients generated by the respective transforms, to produce the transform coefficients of the before and after frames for the selected transform; and

an error concealment data generator that interpolates, in the frequency domain, the transform coefficients of the before and after frames for the selected transform, to restore the transform coefficients of the frame in which the error occurred.

The apparatus of claim 6, wherein the predetermined criterion is energy concentration.

The apparatus of claim 6, wherein the plurality of trigonometric transforms include at least a DFT, a π-ODFT, a DCT, a DST, and a DHT.

The apparatus of claim 6, wherein the error concealment data generator comprises:

an interpolator that generates the transform coefficients of the frame in which the error occurred by applying a predetermined interpolation in the frequency domain to the transform coefficients of the before and after frames for the selected transform; and

a synthesizer that synthesizes the transform coefficients in the time domain to restore the frame in which the error occurred.

The apparatus of claim 9, wherein the predetermined interpolation is linear interpolation.

In an audio decoder that performs error-correction decoding on an input audio bitstream, generates information indicating when an error-correction-decoded frame contains an error, and decodes audio data from the error-correction-decoded bitstream, an apparatus for concealing an error in an audio signal, the apparatus comprising:

a post-processor that restores the data values of the frame in which the error occurred by using the audio data, output from the audio decoder, of the frames before and after that frame.
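The role of the post-processor can be sketched as a pass over the decoder output. This is a hedged illustration only: `post_process`, `error_flags`, and `average_frames` are hypothetical names, the per-frame error flags stand in for the error information the decoder is said to generate, and the time-domain average is a placeholder for whatever concealment routine is actually used. Frames without two error-free neighbours are left untouched, which is a simplification assumed here.

```python
def post_process(frames, error_flags, conceal):
    """Replace each flagged frame with an estimate built from its neighbours.

    frames      -- list of decoded frames (each a list of samples)
    error_flags -- list of booleans, True where the decoder reported an error
    conceal     -- function (prev_frame, next_frame) -> estimated frame
    """
    out = [list(frame) for frame in frames]
    last = len(frames) - 1
    for i, bad in enumerate(error_flags):
        if not bad:
            continue
        # Conceal only when both neighbours exist and are themselves error free.
        if 0 < i < last and not error_flags[i - 1] and not error_flags[i + 1]:
            out[i] = conceal(frames[i - 1], frames[i + 1])
    return out


def average_frames(prev_frame, next_frame):
    """Placeholder concealment: sample-wise average of the two neighbours."""
    return [(p + q) / 2 for p, q in zip(prev_frame, next_frame)]
```

Passing a frequency-domain routine as `conceal` instead of `average_frames` keeps the same control flow while changing only how the lost frame is estimated.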

The apparatus of claim 11, wherein the post-processor comprises:

an error concealment data generator that interpolates, in the frequency domain, the transform coefficients of the before and after frames for the selected transform, to restore the transform coefficients of the frame in which the error occurred.

The apparatus of claim 12, wherein the predetermined criterion is energy concentration.

The apparatus of claim 12, wherein the plurality of trigonometric transforms include at least a DFT, a π-ODFT, a DCT, a DST, and a DHT.

The apparatus of claim 12, wherein the error concealment data generator comprises:

The apparatus of claim 15, wherein the predetermined interpolation is linear interpolation.

The apparatus of claim 11, wherein the audio decoder is an MPEG audio decoder.

The apparatus of claim 11, wherein the audio decoder is an AC-3 decoder.