KR20070003546A

KR20070003546A - Clipping restoration by clipping restoration information for multi-channel audio coding

Info

Publication number: KR20070003546A
Application number: KR1020060030672A
Authority: KR
Inventors: 방희석; 오현오; 김동수; 임재현; 정양원
Original assignee: 엘지전자 주식회사
Priority date: 2005-06-30
Filing date: 2006-04-04
Publication date: 2007-01-05
Also published as: KR20070003545A; KR20070003544A; KR20070003543A; KR20070003547A

Abstract

A method and an apparatus for encoding a multichannel audio signal, and a method and an apparatus for decoding it, are provided to effectively prevent the clipping problem that occurs in a multichannel audio signal by using clipping restoration information included in a bitstream. A multichannel audio signal is downmixed to generate a downmix audio signal (802). Spatial information is extracted from the multichannel audio signal (803). An entire bitstream containing the downmix audio signal and the spatial information is generated (806). Information on whether clipping restoration is used is contained in a header of the spatial information bitstream (805).

Description

Clipping Restoration Method Using Clipping Restoration Information in Multichannel Audio Coding {CLIPPING RESTORATION BY CLIPPING RESTORATION INFORMATION FOR MULTI-CHANNEL AUDIO CODING}

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a diagram illustrating how a human perceives spatial information about an audio signal in the present invention.

Fig. 2 is a diagram illustrating how clipping occurs.

Fig. 3 is a diagram of an encoding method for preventing clipping using clipping restoration information according to the present invention.

Fig. 4 is a diagram of a first decoding method for preventing clipping using clipping restoration information according to the present invention.

Fig. 5 is a diagram of a second decoding method for preventing clipping using clipping restoration information according to the present invention.

Fig. 6 is a diagram of a third decoding method for preventing clipping using clipping restoration information according to the present invention.

Figs. 7A and 7B show the syntax for the second decoding method.

Fig. 8 is a flowchart of a first encoding method using clipping restoration information according to the present invention.

Fig. 9 is a flowchart of a second encoding method using clipping restoration information according to the present invention.

Fig. 10 is a flowchart of a first decoding method using clipping restoration information according to the present invention.

Fig. 11 is a flowchart of a second decoding method using clipping restoration information according to the present invention.

Fig. 12 is a flowchart of a third decoding method using clipping restoration information according to the present invention.

* Explanation of reference numerals for the main parts of the drawings *

101. Remote sound source
102. Direct sound wave
104. Reflected sound wave
301. Multichannel audio signal
302. Downmix unit
303. Spatial information generator
402. Bitstream receiver
403. Clipping searcher
404. Clipping restoration information reader
405. Clipping restorer
406. Multichannel generator
408. Spatial decoder
604. Clipping restoration gain estimator

The present invention relates to a method of encoding and decoding spatial information of a multichannel audio signal, and more particularly to a method of encoding and decoding a multichannel audio signal that includes a clipping restoration method using clipping restoration information.

Recently, various coding techniques and methods for digital audio signals have been developed, and related products are being produced. Coding methods for multichannel audio signals that use a psychoacoustic model are also being developed and standardized. The psychoacoustic model exploits the way humans perceive sound, for example the facts that a quiet sound following a loud sound is not heard and that only sounds between 20 Hz and 20,000 Hz are audible, to remove the parts of the audio signal that are unnecessary during coding and thereby effectively reduce the amount of data required.

Audio standards such as MPEG-1 Audio (MPEG-1 Layer III), MPEG-4 AAC (Advanced Audio Coding), and MPEG-4 HE-AAC (High-Efficiency AAC) have been developed and commercialized. Methods of coding multichannel audio signals using spatial information are also being developed. Such a method uses a compressed audio signal (for example, a mono or stereo audio signal) together with a low bit-rate side information channel (for example, spatial information) to improve the transmission efficiency of the multichannel audio signal very effectively.

However, in constructing the bitstream of such a multichannel audio signal, downmixing the multichannel signal to a mono or stereo audio signal has conventionally caused a clipping problem. In particular, because the coded signal must be limited in size, for example to 16 bits, the clipping persists even after core codec encoding. The clipping also affects the output audio signal and degrades sound quality.

Accordingly, the present invention, proposed to solve the above problem, has the object of providing a method and an apparatus that solve the clipping problem occurring in a multichannel audio signal by constructing the bitstream to include clipping restoration information when the multichannel audio signal is coded, and by restoring the clipped portion using that clipping restoration information.

To achieve the above object, the present invention provides a method of encoding a multichannel audio signal, comprising: downmixing the multichannel audio signal to generate a downmix audio signal; extracting spatial information from the multichannel audio signal; and generating an entire bitstream including the downmix audio signal and a spatial information bitstream, wherein first information on whether clipping restoration is used is included in a header of the spatial information bitstream. When the first information indicates the in-use state, the encoding method may further include including the clipping restoration information in the spatial information bitstream for each frame. In addition, when the first information indicates the in-use state, the encoding method may include, for each frame in the spatial information bitstream, second information on whether data for clipping restoration is present. In that case, when the second information indicates that the data is present, the clipping restoration information may be included in the spatial information bitstream for each frame.

The clipping restoration information may be time envelope or frequency envelope information of the downmix audio signal, information on the parameters of a time envelope or frequency envelope model, or information including one or more of position information indicating where clipping occurred and gain information for correcting the clipping.

To achieve the above object, the present invention also provides a method of decoding into a multichannel audio signal, comprising: receiving an entire bitstream including a downmix audio signal and a spatial information bitstream; finding a clipped portion in the downmix audio signal; reading clipping restoration information for the clipped portion from the spatial information bitstream; and restoring the clipped portion of the downmix audio signal using the read clipping restoration information.

To achieve the above object, the present invention also provides a method of decoding into a multichannel audio signal, comprising: receiving an entire bitstream including a downmix audio signal and a spatial information bitstream; reading position information of a clipped portion and clipping restoration information from the spatial information bitstream; and restoring the clipped portion of the downmix audio signal using the position information and the clipping restoration information.

To achieve the above object, the present invention also provides a method of decoding into a multichannel audio signal, comprising: receiving an entire bitstream including a downmix audio signal and a spatial information bitstream; finding a clipped portion in the downmix audio signal; estimating a clipping restoration gain for the clipped portion; and restoring the clipped portion of the downmix audio signal using the estimated clipping restoration gain.

To achieve the above object, the present invention also provides a method of generating an audio signal, in which the audio signal is generated to include a downmix audio signal and a spatial information bitstream, and clipping restoration information is included in the spatial information bitstream.

To achieve the above object, the present invention also provides an apparatus for encoding a multichannel audio signal, comprising: a downmix unit that downmixes the multichannel audio signal to generate a downmix audio signal; a spatial information generator that extracts spatial information from the multichannel audio signal; and a bitstream formatter that generates an entire bitstream including the downmix audio signal and a spatial information bitstream, wherein clipping restoration information (Guided Clipping Restoration Information) is included in the spatial information bitstream.

To achieve the above object, the present invention also provides an apparatus for decoding a multichannel audio signal, comprising: a bitstream receiver that receives an entire bitstream including a downmix audio signal and a spatial information bitstream; a clipping searcher that finds a clipped portion in the downmix audio signal; a clipping restoration information reader that reads clipping restoration information for the clipped portion from the spatial information bitstream; and a clipping restorer that restores the clipped portion of the downmix audio signal using the read clipping restoration information.

To achieve the above object, the present invention also provides an apparatus for decoding into a multichannel audio signal, comprising: a bitstream receiver that receives an entire bitstream including a downmix audio signal and a spatial information bitstream; a clipping restoration information reader that reads position information of a clipped portion and clipping restoration information from the spatial information bitstream; and a clipping restorer that restores the clipped portion of the downmix audio signal using the position information and the clipping restoration information.

To achieve the above object, the present invention also provides an apparatus for decoding a multichannel audio signal, comprising: a bitstream receiver that receives an entire bitstream including a downmix audio signal and a spatial information bitstream; a clipping searcher that finds a clipped portion in the downmix audio signal; a clipping restoration gain estimator that estimates a clipping restoration gain for the clipped portion; and a clipping restorer that restores the clipped portion of the downmix audio signal using the estimated clipping restoration gain.

Preferred embodiments of the present invention that can concretely achieve the above object are described below with reference to the accompanying drawings.

Fig. 1 shows how a human perceives spatial information about an audio signal in the present invention. Coding methods for multichannel audio signals rely on the fact that humans perceive an audio signal in three-dimensional space, so that the audio signal can be represented as three-dimensional spatial information through a number of parameter sets. These parameters, called "spatial parameters" because they express the spatial information of a multichannel audio signal, include ICLD (Inter-Channel Level Difference), ICC (Inter-Channel Coherence), and ICTD (Inter-Channel Time Difference). ICLD is the energy difference between two channels, ICC is the correlation between two channels, and ICTD is the time difference between two channels.
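As a rough illustration of two of the spatial parameters defined above, the following sketch computes a level difference and a coherence value for a pair of channels. The function names, the decibel scale for ICLD, and the zero-lag normalized correlation used for ICC are assumptions made for this example; the patent text does not give formulas.

```python
import math

def icld_db(ch1, ch2, eps=1e-12):
    """Inter-channel level difference: energy ratio of two channels in dB (assumed scale)."""
    e1 = sum(x * x for x in ch1)
    e2 = sum(x * x for x in ch2)
    return 10.0 * math.log10((e1 + eps) / (e2 + eps))

def icc(ch1, ch2, eps=1e-12):
    """Inter-channel coherence: normalized cross-correlation at zero lag (assumed form)."""
    num = sum(a * b for a, b in zip(ch1, ch2))
    den = math.sqrt(sum(a * a for a in ch1) * sum(b * b for b in ch2))
    return num / (den + eps)

# Hypothetical test pair: the right channel is the left channel at half amplitude.
left = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
right = [0.5 * x for x in left]

level_difference = icld_db(left, right)
coherence = icc(left, right)
```

Because `right` is a scaled copy of `left`, the level difference comes out at about 6 dB and the coherence is close to 1. ICTD is left out of the sketch; it would typically be taken from the lag that maximizes the cross-correlation.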

Fig. 1 shows how a human spatially perceives an audio signal and how the concept of the spatial parameters arises. A direct sound wave (103) from the remote sound source (101) reaches the listener's left ear (107), and another direct sound wave (102) is diffracted around the head and reaches the right ear (106). The two sound waves (102 and 103) differ in arrival time and energy level, and these differences give rise to the ICLD and ICTD parameters.

If reflected sound waves (104 and 105) also reach both ears, or if the sound source (101) is diffuse, mutually uncorrelated sound waves reach the two ears, and this gives rise to the ICC parameter. It is known that spatial parameters derived on this principle allow a substantial reduction in the number of bits when a multichannel audio signal is transmitted as a mono or stereo signal and then output again as multichannel. For multichannel audio signals that use such spatial information, the present invention presents a method for preventing the clipping that can occur when the multichannel signal is downmixed and coded.

Fig. 2 shows how clipping occurs. Clipping has two main causes. The first is a high sound level in the original signal. The second is a large number of input channels in the downmix. For example, clipping occurs more often when seven channels are downmixed to one channel than when three channels are downmixed to one channel. Fig. 2 illustrates the case of downmixing five channels to one channel, but the present invention is not limited to this case. Fig. 2(a) shows the sound level of an original signal consisting of five channels. Each channel can use almost the full range of the limited size (for example, 16 bits). Fig. 2(b) shows the downmix audio signal generated by downmixing the five channels. As shown, the downmix audio signal can have many clipping points. Fig. 2(c) shows the audio signal obtained by encoding and decoding the downmix audio signal with a core codec (for example, an AAC codec). Because the audio signal encoded and decoded with the core codec is also represented in a limited size (for example, 16 bits), the clipping can persist. The clipping also affects the output of the playback unit for the multichannel audio signal and can degrade sound quality.
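The effect described for Fig. 2 can be reproduced with a few numbers. The sketch below assumes an unweighted sum as the downmix and hard saturation to the 16-bit range; the sample values and function names are hypothetical, and a real downmix may apply per-channel weights.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def downmix_to_mono(channels):
    """Sum the channels sample by sample without any range limit."""
    return [sum(samples) for samples in zip(*channels)]

def clip16(signal):
    """Saturate each sample to the 16-bit PCM range."""
    return [max(INT16_MIN, min(INT16_MAX, s)) for s in signal]

# Five hypothetical channels, each using a large part of the 16-bit range.
channels = [[20000, -15000, 3000, -30000]] * 5

raw = downmix_to_mono(channels)   # values before the 16-bit limit is applied
clipped = clip16(raw)             # values that fit in 16 bits
clipped_count = sum(1 for r, c in zip(raw, clipped) if r != c)
```

Three of the four summed samples fall outside the 16-bit range and are cut off, even though every input channel stays within it.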

Fig. 3 shows an encoding method for preventing clipping using clipping restoration information according to the present invention. As shown, the multichannel audio signal (301) is downmixed in the downmix unit (302) to generate a downmix audio signal. The spatial information generator (303) extracts spatial information from the multichannel audio signal (301) and uses it to generate a spatial information bitstream. The bitstream formatter (304) then generates an entire bitstream that includes the downmix audio signal and the spatial information bitstream. In the encoding method according to the present invention, clipping restoration information (CRI) is included in the spatial information bitstream for each frame, and clipping restoration can be performed using it. The encoding method can be implemented in several forms, depending on what the clipping restoration information contains.

The first encoding method carries, in the header of the spatial information bitstream, first information indicating whether clipping restoration is used. When the first information indicates the in-use state (that is, that clipping restoration is used), clipping restoration information is included in the spatial information bitstream for each frame. The clipping restoration information may indicate that no clipping restoration is needed, and it may include position information and size information for the clipped portion. The first information may be defined as an independent syntax element, or an existing syntax element may be extended to carry it (for example, a reserved field of bsFixedGains).

The second encoding method also carries, in the header of the spatial information bitstream, first information indicating whether clipping restoration is used. When the first information indicates the in-use state, a syntax element is defined that expresses, for each frame, second information on whether data for clipping restoration is present. Only when the second information indicates that the data is present is clipping restoration information additionally included in the spatial information bitstream. Because frames without clipping are frequent, this reduces the amount of information.
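The two-level signalling of the second encoding method can be sketched as a small bit writer. Everything concrete here is an assumption for illustration: the 1-bit flags, the 3-bit payload values, the four sections per frame, and the function names are not taken from the patent, and a real spatial information bitstream carries many other fields.

```python
def write_bits(bits, value, width):
    """Append `value` as `width` bits, most significant bit first."""
    for shift in range(width - 1, -1, -1):
        bits.append((value >> shift) & 1)

def encode_clipping_fields(frames, use_restoration, amp_bits=3):
    """frames: one list of per-section restoration values per frame.

    A frame whose values are all zero is treated as having no clipping.
    """
    bits = []
    write_bits(bits, 1 if use_restoration else 0, 1)   # first information (header)
    if not use_restoration:
        return bits
    for amps in frames:
        present = any(a != 0 for a in amps)
        write_bits(bits, 1 if present else 0, 1)       # second information (per frame)
        if present:
            for a in amps:
                write_bits(bits, a, amp_bits)          # clipping restoration information
    return bits

# Three hypothetical frames; only the middle one contains clipping.
frames = [[0, 0, 0, 0], [0, 5, 2, 0], [0, 0, 0, 0]]
stream = encode_clipping_fields(frames, use_restoration=True)
```

With these assumed widths the three frames cost 16 bits in total, whereas sending the payload in every frame, as in the first encoding method, would cost 1 + 3 × 12 = 37 bits.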

Examples of the clipping restoration information are as follows. The first is to use quantized values of the time envelope or frequency envelope of the downmix audio signal over time. In general, clipping occurs when a value exceeds the maximum allowed. For example, a 16-bit PCM signal must stay within the range -32768 to 32767, and data values that fall outside this range during downmixing are cut off. Information about the cut-off values therefore has to be sent, but sending it sample by sample would require far too much data. Sending it in the form of an envelope reduces the amount of information considerably. The envelope may be a time envelope or a frequency envelope. The second is to model the time envelope or frequency envelope and use quantized values of the model parameters. For example, the envelope information is modeled by linear prediction, and the resulting linear prediction coefficients are quantized and sent. The third is to use one or more of position information on the time at which clipping occurred and size (that is, gain) information for correcting the clipping. If clipping occurs only rarely, there is no need to send information for every section; it is enough to send the start and end of each section in which clipping occurred. If the restoration step uses the start and end information together with restoration size information, the position information and the gain information must be sent together. If the restoration size is fixed in advance (for example, when every clipped section is always compressed to 1/2), the restoration size information may not be needed.
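The third kind of clipping restoration information (position plus gain) might be derived on the encoder side roughly as follows. This is a sketch under stated assumptions: the encoder is assumed to have access to the downmix before the 16-bit limit is applied, the gain is defined here as peak divided by full scale, a symmetric limit is used for simplicity, and the dictionary layout is hypothetical.

```python
FULL_SCALE = 32767

def over_range_sections(raw, limit=FULL_SCALE):
    """Return (start, end) index pairs, end exclusive, where |sample| > limit."""
    sections, start = [], None
    for i, s in enumerate(raw):
        over = abs(s) > limit
        if over and start is None:
            start = i
        elif not over and start is not None:
            sections.append((start, i))
            start = None
    if start is not None:
        sections.append((start, len(raw)))
    return sections

def make_restoration_info(raw, limit=FULL_SCALE):
    """For each over-range section, record its position and a restoration gain.

    The gain is the factor by which a decoder would scale the section back up
    after the encoder has scaled it down to fit the limit (assumed scheme).
    """
    info = []
    for start, end in over_range_sections(raw, limit):
        peak = max(abs(s) for s in raw[start:end])
        info.append({"start": start, "end": end, "gain": peak / limit})
    return info

# Hypothetical downmix values before limiting.
raw = [1000, 40000, 65534, 39000, 2000, -50000, 100]
info = make_restoration_info(raw)
```

Only two short records are produced for the seven samples, which reflects the point made above: when clipping is rare, start, end, and gain per section are far cheaper than per-sample information.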

Fig. 4 shows a first decoding method for preventing clipping using clipping restoration information according to the present invention. As shown, the bitstream receiver (402) of the spatial decoder (408) receives the entire bitstream (401), which contains the clipping restoration information, and extracts the downmix audio signal and the spatial information bitstream from it. The clipping searcher (403) then finds the clipped portion in the downmix audio signal. The clipping restoration information reader (404) reads the clipping restoration information corresponding to the clipped portion from the spatial information bitstream. The clipping restorer (405) then restores the clipped portion of the downmix audio signal using the clipping restoration information. The multichannel generator (406) can convert the downmix audio signal, with its clipped portion restored, into a multichannel audio signal (407) using the spatial information obtained by decoding the spatial information bitstream.
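The search step performed by the clipping searcher could look like the sketch below, which treats a run of consecutive samples stuck at the 16-bit limits as a clipped portion. The minimum run length and the function name are assumptions; the patent does not specify how the search is done.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def find_clipped_runs(signal, min_len=2):
    """Return (start, end) pairs, end exclusive, of runs stuck at full scale."""
    runs, start = [], None
    for i, s in enumerate(signal + [0]):          # sentinel closes a trailing run
        at_limit = s >= INT16_MAX or s <= INT16_MIN
        if at_limit and start is None:
            start = i
        elif not at_limit and start is not None:
            if i - start >= min_len:              # ignore isolated full-scale samples
                runs.append((start, i))
            start = None
    return runs

# Hypothetical decoded downmix with two saturated plateaus.
decoded = [100, 32767, 32767, 32767, 500, -32768, -32768, 20]
runs = find_clipped_runs(decoded)
```

Each run found this way would then be matched with the clipping restoration information read from the spatial information bitstream for that portion.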

Fig. 5 shows a second decoding method for preventing clipping using clipping restoration information according to the present invention. As shown, the bitstream receiver (502) of the spatial decoder (507) receives the entire bitstream (501), which contains the clipping restoration information, and extracts the downmix audio signal and the spatial information bitstream from it. The clipping restoration information reader (503) then reads the clipping restoration information from the spatial information bitstream. The clipping restoration information may include one or more of position information of the clipped portion and clipping restoration size information. The clipping restorer (504) restores the clipped portion of the downmix audio signal using the read clipping restoration information. For example, the clipped position may be found in the spatial information bitstream and the clipped portion then restored using the clipping restoration information read from the spatial information bitstream, or the clipped portion may be restored using the position information and the clipping restoration size information read from the spatial information bitstream. In this case, the section that needs clipping restoration can be found in the spatial information bitstream before the clipped portion is restored. The multichannel generator (505) can then convert the downmix audio signal, with its clipped portion restored, into a multichannel audio signal (506) using the spatial information obtained by decoding the spatial information bitstream. Either the first encoding method or the second encoding method alone may be used for the entire signal; alternatively, a selection criterion may be set so that one of the two is chosen frame by frame according to that criterion, or the two methods may be used in combination.
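When both position and size information are read from the bitstream, restoration needs no search in the audio signal. A minimal sketch, assuming the encoder scaled each signalled section down by a gain and that the decoder receives records with hypothetical `start`, `end`, and `gain` fields:

```python
def restore_with_position_and_gain(downmix, info):
    """Scale each signalled section [start, end) by its restoration gain."""
    out = [float(s) for s in downmix]          # wider type: results may exceed 16 bits
    for entry in info:
        for i in range(entry["start"], entry["end"]):
            out[i] *= entry["gain"]
    return out

# Downmix in which the encoder is assumed to have halved samples 1 to 3.
downmix = [1000, 20000, 32767, 19500, 2000]
info = [{"start": 1, "end": 4, "gain": 2.0}]
restored = restore_with_position_and_gain(downmix, info)
```

The restored samples no longer fit in 16 bits, so the sketch keeps them as floats before they are passed on to multichannel generation.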

FIG. 6 illustrates a third decoding method for preventing clipping using clipping restoration information according to the present invention. As shown, the bitstream receiver 602 of the spatial decoder 608 receives the entire bitstream 601 and extracts a downmix audio signal and a spatial information bitstream from the entire bitstream 601. The clipping search unit 603 then finds a clipped portion in the downmix audio signal. The clipping restoration gain estimator 604 estimates a clipping restoration gain for the clipped portion. For example, analysis in the time domain, the frequency domain, or another domain may be used to estimate the clipping restoration gain. Specifically, on the time axis, the clipping restoration gain may be estimated through waveform analysis so that the waveform changes smoothly rather than too abruptly. The clipping restorer 605 then restores the clipped portion of the downmix audio signal using the clipping restoration gain. The multichannel generator 606 may then convert the downmix audio signal, in which the clipped portion has been restored, into the multichannel audio signal 607 using the spatial information obtained by decoding the spatial information bitstream. The third decoding method may be used in combination with the first decoding method or the second decoding method. In this case, one of the first, second, and third decoding methods may be used selectively, or the first or second decoding method may be combined with the third decoding method.
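The estimation step of the third decoding method can be illustrated with a minimal time-domain sketch. The text only states that the gain is chosen so that the waveform changes smoothly; the specific rule below (extrapolating the slope at the edges of a clipped run), the 16-bit full-scale value CLIP_LIMIT, the sine-shaped window, and all function names are assumptions made for illustration and are not taken from the patent.

```python
import math

CLIP_LIMIT = 32767  # assumption: 16-bit PCM full scale

def find_clipped_runs(samples, limit=CLIP_LIMIT, min_len=2):
    """Return (start, end) pairs, end exclusive, of runs stuck at full scale."""
    runs, start = [], None
    for i, s in enumerate(samples):
        if abs(s) >= limit:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    if start is not None and len(samples) - start >= min_len:
        runs.append((start, len(samples)))
    return runs

def estimate_restore_gain(samples, start, end, limit=CLIP_LIMIT):
    """Hypothetical rule: extend the edge slope over half of the clipped run."""
    rise = abs(samples[start] - samples[start - 1]) if start > 0 else 0
    fall = abs(samples[end - 1] - samples[end]) if end < len(samples) else 0
    peak = limit + min(rise, fall) * (end - start) / 2.0
    return peak / limit

def restore_run(samples, start, end, gain):
    """Scale the clipped run with a sine window that peaks at the run centre."""
    out = [float(s) for s in samples]
    n = end - start
    for k in range(n):
        w = math.sin(math.pi * (k + 0.5) / n)
        out[start + k] = samples[start + k] * (1.0 + (gain - 1.0) * w)
    return out
```

Because the gain is derived from the downmix signal alone, this path needs no side information in the bitstream, which is why it can be combined with either of the other two decoding methods.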

FIGS. 7A and 7B show the syntax for the second decoding method. As shown in (a) of FIG. 7A, the clipping restoration information is inserted into a frame of the spatial information bitstream (hereinafter referred to as a "spatial frame"). The clipping restoration information may include at least one of position information of the clipped portion and clipping restoration size information. These two kinds of information may be used in various embodiments. In each embodiment, it must first be checked whether clipping has occurred in the frame. This is determined by reading the clipping occurrence information (i.e., "bsClippingPresent"). For example, a value of 1 means that clipping is present, so additional bits are read to restore the clipped portion; a value of 0 means that there is no clipping, so no additional bits need to be read.

(b) of FIG. 7A shows a first embodiment using the clipping restoration information. Here, the frame is divided into predetermined fixed sections, and size information for restoring clipping is received for each section through the clipping restoration size information (i.e., "bsRestorationAmp[i]"). The size information may include time envelope or frequency envelope size information. For example, since each section may or may not contain clipping, a clipping restoration size information value of 0 may be used to indicate that there is no clipping, and values of 1 to 7 may be used as specific clipping restoration size information.
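A minimal sketch of how a decoder might read the first embodiment's syntax is given below. The field names bsClippingPresent and bsRestorationAmp[i] come from the text; the 3-bit width of bsRestorationAmp (inferred from the value range 0 to 7), the section count NUM_SECTIONS, the most-significant-bit-first order, and the BitReader class are assumptions for illustration only.

```python
class BitReader:
    """Reads big-endian bit fields from a bytes object."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current bit position

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value

NUM_SECTIONS = 4  # assumption: the predetermined number of sections per frame
AMP_BITS = 3      # assumption: values 0..7 fit in 3 bits

def parse_fixed_sections(reader: BitReader):
    """Return one bsRestorationAmp value per section, or None if no clipping."""
    if reader.read(1) == 0:  # bsClippingPresent
        return None          # no further bits are read for this frame
    return [reader.read(AMP_BITS) for _ in range(NUM_SECTIONS)]
```

For example, parse_fixed_sections(BitReader(bytes([0x8A, 0x18]))) yields [0, 5, 0, 3]: sections 0 and 2 are unclipped, while sections 1 and 3 carry restoration sizes 5 and 3.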

(c) of FIG. 7A shows a second embodiment using the clipping restoration information. Here, the number of times clipping occurs in a frame (i.e., "FixedNumber") is determined in advance, and the position information of the clipped portion (i.e., "bsRestorationPos[i]") and the clipping restoration size information (i.e., "bsRestorationAmp[i]") are read that many times. The clipped portion can be restored using the position information and the clipping restoration size information; this method may be more efficient than the first embodiment when clipping occurs only a few times.

(d) of FIG. 7B shows a third embodiment using the clipping restoration information. Here, the section number information (i.e., "bsNumRestoration") is read to determine how many sections the frame is divided into. In this case, "FixedNumber" may be a value determined by the section number information. Then, for each section, size information for restoring the clipped portion is received through the clipping restoration size information (i.e., "bsRestorationAmp[i]") and may be used to restore the clipped portion.

(e) of FIG. 7B shows a fourth embodiment using the clipping restoration information. Here, the number of times clipping has occurred is determined by reading "bsNumRestoration". That is, "FixedNumber" here denotes the number of clipping occurrences and may be determined by "bsNumRestoration". The position information of the clipped portion (i.e., "bsRestorationPos[i]") and the clipping restoration size information (i.e., "bsRestorationAmp[i]") are then read that many times. The clipped portion can be restored using the position information and the clipping restoration size information.
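The position-based embodiments can be sketched with a single parser, shown below as an illustration only. The field names are those used in the text; the bit widths (NUM_BITS, POS_BITS, AMP_BITS), the direct use of bsNumRestoration as the entry count, and the helper names are assumptions, since the text does not specify them. Passing fixed_number models the second embodiment, and omitting it models the fourth.

```python
NUM_BITS = 3  # assumption: width of bsNumRestoration
POS_BITS = 5  # assumption: width of bsRestorationPos[i]
AMP_BITS = 3  # assumption: width of bsRestorationAmp[i]

def bits_of(data: bytes):
    """Yield the bits of data, most significant bit first."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1

def take(bits, n: int) -> int:
    """Consume n bits from the iterator and return them as an integer."""
    value = 0
    for _ in range(n):
        value = (value << 1) | next(bits)
    return value

def parse_positions(data: bytes, fixed_number=None):
    """Return a list of (bsRestorationPos, bsRestorationAmp) pairs."""
    bits = bits_of(data)
    if take(bits, 1) == 0:  # bsClippingPresent
        return []
    # Second embodiment: count known in advance; fourth: read bsNumRestoration.
    count = fixed_number if fixed_number is not None else take(bits, NUM_BITS)
    return [(take(bits, POS_BITS), take(bits, AMP_BITS)) for _ in range(count)]
```

With these widths, each clipping event costs 8 bits, so sending positions is cheaper than sending one 3-bit size per section only when the number of events is small relative to the number of sections, which matches the efficiency remark made for the second embodiment.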

FIG. 8 is a flowchart of a first encoding method using clipping restoration information according to the present invention. First, the multichannel audio signal 801 is downmixed (802) to generate a downmix audio signal, and spatial information is extracted (803) from the multichannel audio signal. If clipping restoration is used (804), the spatial information bitstream may be generated (806) so as to include (805) the clipping restoration information. If clipping restoration is not used (804), the spatial information bitstream may be generated (806) without the clipping restoration information. The entire bitstream containing the downmix audio signal and the spatial information is then transmitted (807).

FIG. 9 is a flowchart of a second encoding method using clipping restoration information according to the present invention. First, the multichannel audio signal 901 is downmixed (902) to generate a downmix audio signal, and spatial information is extracted (903) from the multichannel audio signal. If clipping restoration is used (904), it is determined whether data for clipping restoration exists (905). If such data exists, the spatial information bitstream is generated (907) so as to include (906) the clipping restoration information. If clipping restoration is not used, or if no data for clipping restoration exists, the spatial information bitstream may be generated (907) without the clipping restoration information. The entire bitstream containing the downmix audio signal and the spatial information is then transmitted (908).
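The two-level decision of the second encoding method (is clipping restoration used at all, and does this frame have restoration data) can be sketched as follows. This is an illustrative outline only: the function name, the representation of the bitstream as a list of bits, and the 3-bit size fields are assumptions, and the actual spatial parameters that a real spatial information bitstream carries are omitted.

```python
def build_restoration_bits(use_restoration, frames, amp_bits=3):
    """Emit a header flag, then per-frame flags and size fields.

    frames is a list with one entry per frame; each entry is a list of
    per-section restoration sizes (0 meaning no clipping in that section).
    """
    bits = [1 if use_restoration else 0]  # header: is clipping restoration used?
    if not use_restoration:
        return bits                       # nothing else is written
    for amps in frames:
        has_data = any(a != 0 for a in amps)
        bits.append(1 if has_data else 0)  # per-frame: does restoration data exist?
        if has_data:
            for a in amps:
                # write each size most significant bit first
                bits.extend((a >> s) & 1 for s in range(amp_bits - 1, -1, -1))
    return bits
```

For instance, build_restoration_bits(True, [[0, 5], [0, 0]]) produces [1, 1, 0, 0, 0, 1, 0, 1, 0]: the second frame costs a single bit because it contains no clipping.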

FIG. 10 is a flowchart of a first decoding method using clipping restoration information according to the present invention. First, a bitstream including a downmix audio signal and spatial information is received (1001), and the downmix audio signal and the spatial information bitstream are extracted (1002 and 1003) from the bitstream. A clipped portion is then found in the downmix audio signal (1004). Next, the clipping restoration information corresponding to the clipped portion is read (1006) from the spatial information bitstream, and the clipped portion of the downmix audio signal is restored (1005) using the read clipping restoration information. The clipping restoration information may include clipping restoration size information. The downmix audio signal, in which the clipped portion has been restored, is then converted (1007) into a multichannel audio signal using the spatial information obtained by decoding the spatial information bitstream.

FIG. 11 is a flowchart of a second decoding method using clipping restoration information according to the present invention. First, a bitstream including a downmix audio signal and spatial information is received (1101), and the downmix audio signal and the spatial information bitstream are extracted (1102 and 1103) from the bitstream. Next, clipping restoration information for the clipped portion is read (1105) from the spatial information bitstream. The clipping restoration information may include at least one of position information of the clipped portion and clipping restoration size information. The clipped portion of the downmix audio signal is restored (1104) using the read clipping restoration information. The downmix audio signal, in which the clipped portion has been restored, is then converted (1106) into a multichannel audio signal using the spatial information obtained by decoding the spatial information bitstream.
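The restoration step of the second decoding method can be illustrated with a short sketch that applies position and size information to a frame of downmix samples. The text does not define how a size code maps to an actual gain or how a position maps to samples, so the linear mapping (gain_step), the interpretation of the position as a section index, and the function name are all hypothetical.

```python
def restore_frame(samples, entries, num_sections, gain_step=0.25):
    """Scale the sections named by (position, size) pairs.

    samples      -- downmix samples of one frame
    entries      -- (position, size) pairs read from the spatial frame
    num_sections -- number of equal sections the frame is divided into
    gain_step    -- hypothetical gain increment per size code
    """
    section_len = len(samples) // num_sections
    out = [float(s) for s in samples]
    for pos, amp in entries:
        if amp == 0:  # size 0 means no clipping in this section
            continue
        gain = 1.0 + gain_step * amp
        lo = pos * section_len
        hi = min(lo + section_len, len(out))
        for i in range(lo, hi):
            out[i] *= gain
    return out
```

Unlike the first decoding method, no search of the downmix signal is needed here, because the bitstream itself states where the clipped portion lies.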

FIG. 12 is a flowchart of a third decoding method using clipping restoration information according to the present invention. First, a bitstream including a downmix audio signal and spatial information is received (1201), and the downmix audio signal and the spatial information bitstream are extracted (1202 and 1203) from the bitstream. A clipped portion is then found in the downmix audio signal (1204). Next, a clipping restoration gain is estimated (1205) from the downmix audio signal, and the clipped portion of the downmix audio signal is restored (1206) using the estimated clipping restoration gain. The downmix audio signal, in which the clipped portion has been restored, is then converted (1207) into a multichannel audio signal using the spatial information obtained by decoding the spatial information bitstream.

Although the present invention has been described in detail with reference to several embodiments, these embodiments are presented to aid understanding of the invention, and the scope of the invention is not limited to them. Those skilled in the art will understand that various modifications are possible without departing from the technical idea of the present invention, and the scope of the invention should be interpreted according to the appended claims.

As described above, in coding a multichannel audio signal according to the present invention, the bitstream is constructed to include clipping restoration information, and the clipped portion of the downmix audio signal is restored using that information, so that the clipping problem arising when a multichannel audio signal is downmixed can be effectively prevented.

In addition, by estimating a clipping restoration gain from the downmix audio signal and restoring the clipped portion of the downmix audio signal using the estimated gain, the clipping problem arising when a multichannel audio signal is downmixed can be effectively prevented.

Claims

A method of encoding a multichannel audio signal, the method comprising:

(a) downmixing the multichannel audio signal to generate a downmix audio signal;

(b) extracting spatial information from the multichannel audio signal; and

(c) generating an entire bitstream including the downmix audio signal and a spatial information bitstream, wherein first information on whether clipping restoration is used is included in a header of the spatial information bitstream.

The method of claim 1,

wherein step (c) further comprises, when the first information indicates that clipping restoration is in use, including the clipping restoration information for each frame in the spatial information bitstream.

The method of claim 1,

wherein step (c) further comprises, when the first information indicates that clipping restoration is in use, including, for each frame in the spatial information bitstream, second information on whether data for the clipping restoration is present.

The method of claim 3,

wherein, when the second information indicates that the data is present, the clipping restoration information is included for each frame in the spatial information bitstream.

The method of claim 2 or 4,

wherein the clipping restoration information is time envelope or frequency envelope information of the downmix audio signal.

The method of claim 2 or 4,

wherein the clipping restoration information is information on parameters of a time envelope or frequency envelope model.

The method of claim 2 or 4,

wherein the clipping restoration information includes at least one of position information indicating where clipping occurred and gain information for correcting the clipping.

A method of decoding into a multichannel audio signal, the method comprising:

(a) receiving an entire bitstream including a downmix audio signal and a spatial information bitstream;

(b) finding a clipped portion in the downmix audio signal;

(c) reading, for the clipped portion, clipping restoration information from the spatial information bitstream; and

(d) restoring the clipped portion of the downmix audio signal using the read clipping restoration information.

The method of claim 8,

wherein the clipping restoration information is time envelope or frequency envelope information of the downmix audio signal.

The method of claim 8,

wherein the clipping restoration information is information on parameters of a time envelope or frequency envelope model.

The method of claim 8,

wherein the clipping restoration information includes at least one of position information indicating where clipping occurred and gain information for correcting the clipping.

The method of claim 8,

further comprising extracting spatial information from the spatial information bitstream and converting the downmix audio signal, in which the clipped portion has been restored, into a multichannel audio signal using the extracted spatial information.

(b) reading clipping restoration information from the spatial information bitstream; and

(c) restoring the clipped portion of the downmix audio signal using the clipping restoration information; the method of decoding into a multichannel audio signal being characterized by comprising the above steps.

The method of claim 13,

further comprising, before step (b), reading, for each frame, information on whether clipping is present from the spatial information bitstream.

The method of claim 13,

wherein the clipping restoration information includes clipping restoration size information of the clipped portion.

The method of claim 13,

wherein the clipping restoration information includes position information and clipping restoration size information of the clipped portion.

The method of claim 13,

The method of claim 13,

wherein the clipping restoration size information is information on parameters of a time envelope or frequency envelope model.

The method of claim 13,

wherein the decoding method comprises:

(c) estimating a clipping restoration gain for the clipped portion; and

(d) restoring the clipped portion of the downmix audio signal using the estimated clipping restoration gain; the method of decoding into a multichannel audio signal being characterized by comprising the above steps.

The method of claim 19,

wherein the clipping restoration gain is estimated using a time envelope or a frequency envelope.

The method of claim 20,

wherein the decoding method comprises:

(a) receiving an entire bitstream including a downmix audio signal and a spatial information bitstream;

(b) selecting, for each frame, a method of reading clipping restoration information from the spatial information bitstream;

(c) reading clipping restoration information from the spatial information bitstream using the selected reading method; and

The method of claim 23,

wherein the method of reading the clipping restoration information is selected from among: a method of reading clipping restoration information from the spatial information bitstream after finding a clipped portion in the downmix audio signal; a method of reading, from the spatial information bitstream, clipping restoration information including position information and size information of the clipped portion; and a method of estimating a clipping restoration gain after finding a clipped portion in the downmix audio signal.

A method of generating an audio signal,

wherein the audio signal is generated to include a downmix audio signal and a spatial information bitstream, and

wherein clipping restoration information is included in the spatial information bitstream.

An apparatus for encoding a multichannel audio signal, the apparatus comprising:

(a) a downmix unit that downmixes the multichannel audio signal to generate a downmix audio signal;

(b) a spatial information generator that extracts spatial information from the multichannel audio signal; and

(c) a bitstream formatter that generates an entire bitstream including the downmix audio signal and a spatial information bitstream, wherein clipping restoration information (Guided Clipping Restoration Information) is included in the spatial information bitstream.

An apparatus for decoding a multichannel audio signal, the apparatus comprising:

(a) a bitstream receiver that receives an entire bitstream including a downmix audio signal and a spatial information bitstream;

(b) a clipping search unit that finds a clipped portion in the downmix audio signal;

(c) a clipping restoration information reader that reads, for the clipped portion, clipping restoration information from the spatial information bitstream; and

(d) a clipping restorer that restores the clipped portion of the downmix audio signal using the read clipping restoration information.

(b) a clipping restoration information reader that reads position information of a clipped portion and clipping restoration information from the spatial information bitstream; and

(c) a clipping restorer that restores the clipped portion of the downmix audio signal using the position information and the clipping restoration information; the method of decoding into a multichannel audio signal being characterized by comprising the above.

(c) a clipping restoration gain estimator that estimates a clipping restoration gain for the clipped portion; and

(d) a clipping restorer that restores the clipped portion of the downmix audio signal using the estimated clipping restoration gain; the apparatus for decoding a multichannel audio signal being characterized by comprising the above.