KR20070081735A

KR20070081735A - Apparatus for encoding and decoding audio signal and method thereof

Info

Publication number: KR20070081735A
Application number: KR1020060097319A
Authority: KR
Inventors: 정양원; 방희석; 오현오; 김동수; 임재현
Original assignee: 엘지전자 주식회사
Priority date: 2006-02-13
Filing date: 2006-10-02
Publication date: 2007-08-17

Abstract

An apparatus and a method for encoding/decoding an audio signal are provided that reduce the amount of information to be transmitted or stored: spatial information is modified using spatial ambient information before it is integrated with the downmix signal, and the downmix signal is decoded using the modified spatial information. The encoding apparatus includes a downmix part (402) for generating a downmix signal from a multichannel audio signal, and a spatial information extracting part (403) for generating spatial information from the multichannel audio signal. A spatial ambient information generating part (405) generates spatial ambient information for providing one or more kinds of 3D sound effect. A multiplexer part (407) generates a bitstream including the downmix signal, the spatial information and the spatial ambient information.

Description

Method and apparatus for encoding/decoding an audio signal {APPARATUS FOR ENCODING AND DECODING AUDIO SIGNAL AND METHOD THEREOF}

FIG. 1 is a diagram illustrating how a human perceives the spatial information of an audio signal.

FIG. 2 is a diagram illustrating 3D sound effects generated by applying spatial ambient information according to an embodiment of the present invention.

FIG. 3 is a diagram illustrating the principle of producing a 3D sound effect using modified spatial information according to an embodiment of the present invention.

FIG. 4 is a diagram illustrating an encoding apparatus that generates spatial ambient information according to an embodiment of the present invention.

FIG. 5 is a diagram illustrating a decoding apparatus that modifies spatial information using spatial ambient information according to an embodiment of the present invention.

FIG. 6 is a flowchart illustrating a decoding method using spatial ambient information according to an embodiment of the present invention.

* Explanation of reference numerals for the main parts of the drawings *

101. Remote sound source
102. Direct sound wave
104. Reflected sound wave
401. Multichannel audio signal
402. Downmix part
403. Spatial information extracting part
404. Arbitrary downmix signal
405. Spatial ambient information generating part
406. Count information of the spatial ambient information
407. Multiplexer
408. Bitstream
502. Demultiplexer
503. Downmix signal
504. Spatial information signal
505. Spatial ambient information
507. Spatial information decoding part
508. Spatial information modifying part
509. Spatial ambient information selecting part
510. Predetermined spatial ambient information
511. Number of output channels
512. Spatial information integrating part

The present invention relates to a method and/or an apparatus for encoding and decoding an audio signal. Various coding techniques and methods for digital audio signals have recently been developed, and related products are being produced. In addition, coding methods have been developed that convert a mono or stereo audio signal into a multichannel signal using spatial information of the multichannel audio signal.

However, for a content provider to offer two or more 3D sound effects (for example, the sound as heard from the audience seats and the sound as heard on the stage), independent multichannel signals conventionally had to be provided. In that case, the amount of information to be transmitted or stored doubles. There is also a need to switch the acoustic environment at the user's request, regardless of the content provider's intention. Furthermore, when a user zooms the picture of video content such as a DVD, the sound is reproduced in a fixed form regardless of the zoom, which makes it hard to feel that picture and sound belong together. The sound likewise stays fixed when the user moves across the enlarged picture after zooming.

The technical object of the present invention is to provide encoding and decoding methods for coding an audio signal that can give the audio signal a 3D sound effect according to the user's selection by using spatial ambient information, without an excessive increase in the amount of information.

To achieve the above object, the present invention provides a method of generating an audio signal, comprising: generating a per-channel output signal of an audio signal; and generating a modified per-channel output signal by applying, to the per-channel output signal, spatial ambient information that provides at least one 3D sound effect.

To achieve the above object, the present invention also provides a method of decoding an audio signal, comprising: separating a downmix signal and spatial information from a bitstream of an audio signal; modifying the spatial information using spatial ambient information that provides at least one 3D sound effect; and decoding the downmix signal using the modified spatial information.

To achieve the above object, the present invention also provides a method of encoding an audio signal, comprising: generating a downmix signal and spatial information from a multichannel audio signal; and generating a bitstream including the downmix signal and the spatial information, wherein spatial ambient information that provides one or more pieces of 3D sound information is inserted into the bitstream.

To achieve the above object, the present invention also provides a data structure comprising an audio signal and spatial ambient information that provides one or more 3D sound effects, wherein the spatial ambient information is used to modify an output signal of the audio signal.

To achieve the above object, the present invention also provides an apparatus for decoding an audio signal, comprising: a demultiplexer that separates a downmix signal and spatial information from a bitstream of an audio signal; a spatial information modifying part that modifies the spatial information using spatial ambient information that provides one or more 3D sound effects; and a spatial information integrating part that decodes the downmix signal using the modified spatial information.

To achieve the above object, the present invention also provides an apparatus for encoding an audio signal, comprising: a downmix part that generates a downmix signal from a multichannel audio signal; a spatial information extracting part that generates spatial information from the multichannel audio signal; a spatial ambient information generating part that generates spatial ambient information providing one or more 3D sound effects; and a multiplexer that generates a bitstream including the downmix signal, the spatial information, and the spatial ambient information.

Hereinafter, embodiments of the present invention that can concretely realize the above object are described with reference to the accompanying drawings.

FIG. 1 illustrates how a human perceives the spatial information of an audio signal. Coding methods for multichannel audio signals rely on the fact that, because humans perceive an audio signal in three-dimensional space, the audio signal can be represented as three-dimensional spatial information through a plurality of parameter sets. The "spatial parameters" that represent the spatial information of a multichannel audio signal include CLD (Channel Level Difference), ICC (Inter-Channel Coherence), and CTD (Channel Time Difference). CLD is the energy difference between two channels, ICC is the correlation between two channels, and CTD is the time difference between two channels.

FIG. 1 shows how a human perceives an audio signal spatially and how the concept of the spatial parameters arises. A direct sound wave (103) from a remote sound source (101) reaches the listener's left ear (107), and another direct sound wave (102) is diffracted around the head and reaches the right ear (106). The two sound waves (102 and 103) differ in arrival time and energy level, and these differences give rise to the CTD and CLD parameters. In addition, if reflected sound waves (104 and 105) reach both ears, or if the sound source (101) is diffuse, mutually uncorrelated sound waves reach both ears, which gives rise to the ICC parameter. Using spatial parameters generated on this principle, a multichannel audio signal can be transmitted as a mono or stereo signal and then output as a multichannel signal again. The present invention is not limited to multichannel audio signals, but for convenience this specification uses a multichannel audio signal as the example.
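As an illustration of the three parameters, the following is a minimal sketch of how they could be estimated from two short channel signals. It is not the estimator of any particular codec: the full-band, time-domain computation, the sign conventions, and all function names are assumptions made for this example.

```python
import math


def cross_correlation(ch1, ch2, lag):
    """Sum of ch1[n] * ch2[n + lag] over the samples where both exist."""
    n = len(ch1)
    return sum(ch1[i] * ch2[i + lag]
               for i in range(max(0, -lag), min(n, n - lag)))


def channel_level_difference(ch1, ch2):
    """CLD in dB: energy of ch2 relative to ch1 (sign convention assumed)."""
    e1 = sum(x * x for x in ch1)
    e2 = sum(x * x for x in ch2)
    return 10.0 * math.log10(e2 / e1)


def channel_time_difference(ch1, ch2, max_lag):
    """CTD in samples: the lag of ch2 behind ch1 with the largest correlation."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: cross_correlation(ch1, ch2, lag))


def inter_channel_coherence(ch1, ch2, max_lag):
    """ICC: peak magnitude of the normalized cross-correlation (0 to 1)."""
    norm = math.sqrt(sum(x * x for x in ch1) * sum(x * x for x in ch2))
    peak = max(abs(cross_correlation(ch1, ch2, lag))
               for lag in range(-max_lag, max_lag + 1))
    return peak / norm


# The right channel is the left channel delayed by 2 samples at half amplitude.
left = [0.0, 1.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0]
right = [0.0, 0.0, 0.0, 0.5, 0.0, -0.5, 0.0, 0.0]
cld = channel_level_difference(left, right)
ctd = channel_time_difference(left, right, max_lag=4)
icc = inter_channel_coherence(left, right, max_lag=4)
```

Here the two channels are scaled, delayed copies of each other, so the coherence is at its maximum; uncorrelated reflections would pull it toward zero.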

FIG. 2 illustrates 3D sound effects according to an embodiment of the present invention. FIG. 2(a) shows a speaker arrangement that lets a listener experience the sound of a popular music performance as heard from the audience seats. In this arrangement, the left (L) and right (R) channel speakers are each placed at an angle of about 30 degrees from the center (C) channel speaker, and the left surround (Ls) and right surround (Rs) channel speakers are placed at about 80 degrees from the left and right channel speakers. For the listener to experience the sound as heard elsewhere, for example on the stage, the effect can be obtained by reducing the gain of the surround channels, which contain the cheering of the audience, moving the surround channels toward the rear, and placing the left and right channel speakers at the listener's sides.

FIG. 2(b) shows a virtual speaker arrangement that gives the on-stage sound effect described above. To produce such an effect, the decoding apparatus can apply spatial ambient information to the original per-channel output signals, such as center (C), left (L), right (R), left surround (Ls) and right surround (Rs), to generate modified per-channel output signals having one or more 3D sound effects. "Spatial ambient information" is information for providing a 3D sound effect so that the user can feel located in a virtual space or experience a virtual sound effect. It includes 3D sound information such as an HRTF (Head-Related Transfer Function), audio image control information, and the like. In addition, as shown in FIG. 2(c), when the user wants to move to an arbitrary virtual position, per-channel output signals with an appropriate 3D sound effect can be generated by using spatial ambient information to process the signals as if the speakers were at virtual positions.

"Spatial ambient information" can be generated using sound path information. "Sound path information" is information about the acoustic path between a speaker position and the listener position, and includes a function describing that path. The speaker position may be an actual speaker position or an arbitrary virtual speaker position. Specifically, first sound path information, corresponding to the path between a speaker position and the listener position, is applied to the per-channel output signal, and then second sound path information, corresponding to the path between a new virtual speaker position (for example, C', L', R', Ls' and Rs' in FIG. 2(b)) and the listener position, is applied to it, which yields a modified per-channel output signal having a 3D sound effect. Alternatively, only the second sound path information may be applied to the per-channel output signal to produce the modified signal. Spatial ambient information can therefore be generated using one or more of the first and second sound path information. The first and second sound path information need not describe the acoustic path alone; they can also be set to values optimized for applying the on-stage sound effect. Taking the left channel as an example, a modified left channel signal can be generated through Equation 1 below.

L_new = function(H_L, H_L', L)
      = function(H_L_tot, L)        (Equation 1)

Here, L_new is the modified left channel signal, H_L is the first sound path information, H_L' is the second sound path information, L is the original left channel signal, and H_L_tot is the spatial ambient information. Applying the spatial ambient information to a channel signal gives almost the same effect as applying the first and second sound path information to it. Furthermore, when a plurality of second sound path functions exist, a plurality (K) of sound effects can be obtained for one transmitted signal through Equation 2 below.

L_new_i = function(H_L_tot_i, L), i = 1...K        (Equation 2)

The spatial ambient information may be that transmitted from the encoding apparatus, or predetermined spatial ambient information. The method is applicable to any audio codec that reproduces multichannel audio signals.
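The text leaves "function" in Equations 1 and 2 unspecified. The sketch below assumes, purely for illustration, that each piece of sound path information is a short FIR impulse response and that "applying" it means convolution, so that H_L_tot is the cascade of H_L and H_L'. The helper names and the sample values are hypothetical.

```python
def convolve(signal, impulse_response):
    """Plain FIR convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def combine_paths(h_first, h_second):
    """H_L_tot from H_L and H_L' (assumption: the two filters are cascaded)."""
    return convolve(h_first, h_second)


def apply_effects(channel, h_tot_list):
    """Equation 2: one modified signal L_new_i per H_L_tot_i, i = 1...K."""
    return [convolve(channel, h_tot) for h_tot in h_tot_list]


# Toy data: an original left channel and two sound path responses.
L = [1.0, 0.0, -1.0]
H_L = [1.0, 0.5]            # first sound path information (illustrative)
H_L2 = [0.5, 0.0, 0.25]     # second sound path information (illustrative)

H_L_tot = combine_paths(H_L, H_L2)
L_new_two_step = convolve(convolve(L, H_L), H_L2)   # Equation 1, first form
L_new_one_step = convolve(L, H_L_tot)               # Equation 1, second form
effects = apply_effects(L, [H_L_tot, H_L2])         # K = 2 effects
```

Under this assumption the two forms of Equation 1 give the same samples, which is why a single combined H_L_tot can be stored or transmitted per effect.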

FIG. 3 illustrates the principle of producing a 3D sound effect using the spatial information of an audio signal according to an embodiment of the present invention. Spatial ambient information can be applied to the per-channel output signals of the audio signal as described above, and it can also be used to modify the spatial information of the audio signal through Equation 3 below. The modified spatial information can then be applied to the downmix signal to generate an output signal having a 3D sound effect.

S_new = function(H_tot, S)        (Equation 3)

Here, S is the spatial information, S_new is the modified spatial information, and H_tot is the spatial ambient information. For example, a 3D sound effect can be produced using the energy level difference between two channels (CLD), one item of the spatial information. A representative way to express the direction of a sound source with the CLD is the panning law, which can be used to modify the spatial information as follows. Referring to FIG. 3, a sound source at an arbitrary virtual position (the "virtual source" position shown) can be rendered using two speakers (ch1 and ch2). Equations 4 and 5 below are used for this.

Here, tangent may be used instead of sine. The CLD of the audio signal's spatial information can be expressed in terms of g1 and g2 of the above equations through Equation 6 below.

Using these relations, the CLD value for the current speaker positions can be converted into a CLD value for arbitrary positions. By applying the converted CLD to the downmix signal, a downmix signal carrying the current sound effect can be converted into output signals with various 3D sound effects according to the user's selection.
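Equations 4 to 6 are not reproduced in this text. The numerical example that follows uses Equation 5 in the form of the sine panning law and Equation 6 as a gain ratio expressed in decibels, so the relations below are a reconstruction consistent with that usage, not a copy of the original equations. The symbols φ (virtual source angle) and φ0 (speaker angle) are names chosen here; the form of Equation 4 cannot be recovered from the surrounding text.

```latex
% Consistent with the way Equation 5 is used in the example
\frac{\sin\varphi}{\sin\varphi_0} = \frac{g_1 - g_2}{g_1 + g_2}

% Consistent with the way Equation 6 is used in the example
\mathrm{CLD} = 20\log_{10}\!\left(\frac{g_2}{g_1}\right)

% Solving the first relation for the gain ratio, with r = \sin\varphi / \sin\varphi_0
\frac{g_2}{g_1} = \frac{1 - r}{1 + r}
```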

Conversely, if the magnitudes (g1 and g2) of the signals sent to the two speakers are known, the position of the sound source that the transmitted signals represent can be determined. The transmitted signals are intended to produce the multichannel effect on a standard speaker layout. When they are reproduced on the standard layout the original effect is obtained, but when the speaker layout differs from the standard one, an effect different from the original results.

For example, the standard speaker layout places the C and L channels 30 degrees apart, but in the user's environment the angle between them may be 60 degrees. If the CLD value transmitted between the C and L channels is -6.293, Equation 6 gives -6.293 = 20log10(Gi). Hence Gi = 0.4846, that is, g2 = 0.4846 g1. Substituting this into Equation 5 gives (g1 - g2)/(g1 + g2) = (1 - 0.4846)/(1 + 0.4846) = 0.3472 ≈ sine(10)/sine(30), so for the standard 30-degree layout the virtual sound source is at 10 degrees. The listener should therefore perceive the virtual source at 10 degrees even when the angle between the C and L channels is 60 degrees. In that case the speaker angle is 60 degrees and the virtual source is at 10 degrees, so Equation 5 gives (g1_new - g2_new)/(g1_new + g2_new) = sine(10)/sine(60) = 0.2005. Hence g2_new = 0.6660 g1_new, and Equation 6 gives CLD_new = -3.5302. When the speaker positions differ from the standard layout, modifying the spatial information in this way lets the originally intended multichannel effect be obtained on the non-standard layout as well.
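The arithmetic of this example can be checked with a short script. It is a sketch under two assumptions taken from the way the example itself uses the equations: Equation 5 is (g1 - g2)/(g1 + g2) = sine(source angle)/sine(speaker angle), and Equation 6 is CLD = 20log10(g2/g1). The function names are hypothetical.

```python
import math


def cld_to_gain_ratio(cld_db):
    """Assumed Equation 6, inverted: g2 / g1 from a CLD in dB."""
    return 10.0 ** (cld_db / 20.0)


def source_angle(cld_db, speaker_angle_deg):
    """Assumed Equation 5, solved for the virtual source angle in degrees."""
    g = cld_to_gain_ratio(cld_db)
    r = (1.0 - g) / (1.0 + g)                      # (g1 - g2) / (g1 + g2)
    return math.degrees(math.asin(r * math.sin(math.radians(speaker_angle_deg))))


def cld_for_source(source_angle_deg, speaker_angle_deg):
    """CLD in dB that places a source at the given angle for a speaker angle."""
    r = (math.sin(math.radians(source_angle_deg))
         / math.sin(math.radians(speaker_angle_deg)))
    return 20.0 * math.log10((1.0 - r) / (1.0 + r))


def remap_cld(cld_db, old_speaker_angle_deg, new_speaker_angle_deg):
    """Keep the source where it was intended while the speaker angle changes."""
    angle = source_angle(cld_db, old_speaker_angle_deg)
    return cld_for_source(angle, new_speaker_angle_deg)


angle = source_angle(-6.293, 30.0)        # close to 10 degrees
cld_new = remap_cld(-6.293, 30.0, 60.0)   # close to -3.53
```

Small differences from the figures in the text come from the text rounding intermediate values to four digits.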

The present invention includes the following embodiments. With the speaker layout unchanged, a sound source at a virtual position of one angle can be moved to a virtual position of another angle. In this case, new spatial information values are obtained with the method above by keeping the speaker angles and changing the angle of the virtual source. The invention also includes combining the spatial information calculation for a changed speaker position with that for a changed source position. Moreover, even when the number of speakers differs from the number of channels of the original signal, the method can be used to accommodate a speaker configuration with any number of speakers. Although this specification describes the CLD among the spatial information, the invention includes methods that produce the same 3D sound effects using other spatial information.

FIG. 4 shows an encoding apparatus that generates spatial ambient information according to an embodiment of the present invention. The encoding apparatus comprises a downmix part (402), a spatial information extracting part (403), a spatial ambient information generating part (405), and a multiplexer (407). Referring to FIG. 4, the downmix part (402) downmixes the multichannel audio signal (401) to generate a downmix signal. The n shown in the figure is the number of input channels. The downmix signal may be a mono, stereo, or multichannel audio signal. Optionally, the downmix signal may be generated using a downmix signal supplied directly from outside, for example an arbitrary downmix signal (404).

The spatial information extracting part (403) extracts spatial information from the multichannel audio signal (401). "Spatial information" is information about the audio signal channels that is used when a downmix signal, generated by downmixing a multichannel audio signal (for example left, right, center, left surround, right surround), is transmitted and then upmixed back into a multichannel audio signal.

The spatial ambient information generating part (405) generates spatial ambient information, which is applied to the downmix signal to generate an output signal having a 3D sound effect. Specifically, it generates H_x_tot of Equation 1. A plurality (K) of spatial ambient information generating parts (405) may exist, so that the encoding apparatus can provide a plurality of 3D sound effects for one signal. If no newly generated spatial ambient information exists, the encoding apparatus may send identification information (for example, a flag) indicating that the bitstream contains no spatial ambient information.

The multiplexer (407) generates a bitstream (408) including the downmix signal, the spatial information, and the spatial ambient information. The bitstream (408) may also include count information (406) indicating the number of spatial ambient information items.

FIG. 5 shows a decoding apparatus that modifies the output signal of an audio signal using spatial ambient information according to an embodiment of the present invention. The decoding apparatus comprises a demultiplexer (502), a spatial information decoding part (507), a spatial information modifying part (508), a spatial ambient information selecting part (509), and a spatial information integrating part (512).

Referring to FIG. 5, the demultiplexer (502) separates a spatial information signal (504) and a downmix signal (503) from the received bitstream (501). The bitstream (501) may contain spatial ambient information, in which case the demultiplexer (502) can also separate the spatial ambient information (505) from it. The bitstream (501) may likewise contain count information for the spatial ambient information, in which case the demultiplexer (502) can separate that count information (506). The spatial information decoding part (507) decodes the spatial information signal (504) to extract the spatial information.
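The document does not define a bitstream syntax. To make the roles of the multiplexer (407) and demultiplexer (502) concrete, the sketch below uses a hypothetical frame layout invented for this example: a one-byte flag saying whether spatial ambient information is present, an optional one-byte count, and then length-prefixed payloads. None of these field sizes come from the source.

```python
import struct


def _put(chunk):
    """Length-prefix a payload with a 16-bit big-endian size (hypothetical)."""
    return struct.pack(">H", len(chunk)) + chunk


def multiplex(downmix, spatial_info, ambient_items=()):
    """Build one frame: flag, optional count, downmix, spatial info, ambient items."""
    frame = bytearray()
    frame.append(1 if ambient_items else 0)     # flag: ambient info present?
    if ambient_items:
        frame.append(len(ambient_items))        # count information
    frame += _put(downmix) + _put(spatial_info)
    for item in ambient_items:
        frame += _put(item)
    return bytes(frame)


def demultiplex(frame):
    """Split a frame back into (downmix, spatial_info, ambient_items, count)."""
    pos = 0
    has_ambient = frame[pos] == 1
    pos += 1
    count = 0
    if has_ambient:
        count = frame[pos]
        pos += 1

    def take():
        nonlocal pos
        (length,) = struct.unpack_from(">H", frame, pos)
        pos += 2
        chunk = frame[pos:pos + length]
        pos += length
        return chunk

    downmix = take()
    spatial_info = take()
    ambient_items = [take() for _ in range(count)]
    return downmix, spatial_info, ambient_items, count


frame = multiplex(b"dmx", b"cld", [b"stage", b"seat"])
parts = demultiplex(frame)
plain = multiplex(b"dmx", b"cld")   # no ambient info: flag byte is 0
```

Putting the flag and count ahead of the payloads lets a decoder that ignores spatial ambient information skip it without parsing its contents.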

The spatial ambient information selecting part (509) selects one or more spatial ambient information items to apply to the downmix signal of the audio signal. The spatial ambient information may be predetermined spatial ambient information (510) or spatial ambient information (505) extracted from the bitstream. That is, the selecting part (509) can decide, according to the user's selection, which to use among the one or more items (505) extracted from the bitstream (501) and the one or more predetermined items (510). The selection can be made by the user as described, or automatically in response to the video signal and the video interface. It can also be made automatically when the content provider optionally transmits information about the selection.

The spatial ambient information selecting part (509) also receives the count information (506) for the spatial ambient information extracted from the bitstream (501) and the count information (511) for the output channels. Both can be used in modifying the downmix signal to generate an output signal with a 3D sound effect. If the selecting part (509) selects no spatial ambient information, the downmix signal (503) is output as a multichannel audio signal using the spatial information extracted from the bitstream (501). If it selects one or more spatial ambient information items, the downmix signal can be modified using the selected items.
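The selection and fallback behaviour described above can be sketched as follows. The representation of a choice as a `(source, index)` pair, the function names, and the `modify` and `integrate` callables are hypothetical stand-ins for the selecting part (509), the modifying part (508) and the integrating part (512).

```python
def select_ambient_info(bitstream_items, preset_items, choice=None):
    """Return the chosen spatial ambient information, or None for no effect.

    choice is None, ("bitstream", i) or ("preset", i); it could come from the
    user, from the video interface, or from the content provider.
    """
    if choice is None:
        return None
    source, index = choice
    pool = bitstream_items if source == "bitstream" else preset_items
    if 0 <= index < len(pool):
        return pool[index]
    return None  # unknown item: fall back to plain multichannel decoding


def decode(downmix, spatial_info, ambient_info, modify, integrate):
    """Modify the spatial information only when ambient info was selected."""
    if ambient_info is not None:
        spatial_info = modify(ambient_info, spatial_info)
    return integrate(downmix, spatial_info)


# Toy stand-ins: ambient info is a CLD offset in dB, integration just pairs up.
modify = lambda h_tot, s: s + h_tot
integrate = lambda dmx, s: (dmx, s)

chosen = select_ambient_info([2.0, -1.0], [0.5], ("bitstream", 0))
with_effect = decode("dmx", -6.0, chosen, modify, integrate)
without_effect = decode("dmx", -6.0, None, modify, integrate)
```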

Although not shown in FIG. 5, the spatial acoustic information may be applied directly to the downmix signal to generate an output signal having a 3D sound effect. Alternatively, as shown in FIG. 5, the spatial acoustic information may be used to modify the spatial information, and the modified spatial information may then be used to modify the downmix signal. In this case, the spatial information modifying unit 508 modifies the spatial information using the spatial acoustic information selected by the spatial acoustic information selecting unit 509.

The spatial information integrating unit 512 may use the spatial information modified by the spatial information modifying unit 508 to convert the downmix signal 503 into a multichannel audio signal 513 having a 3D sound effect and output it. If the decoding apparatus cannot output the multichannel audio signal 513, the spatial acoustic information may be applied to the downmix signal 503 and the downmix signal may then be output directly. Compared with applying the spatial acoustic information after the spatial information has been integrated into the downmix signal, modifying the spatial information with the spatial acoustic information first and then performing the integration is highly advantageous in terms of computational cost.
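The computational argument can be illustrated with a deliberately simplified model. The sketch below assumes a mono downmix, spatial information reduced to one gain per output channel, and spatial acoustic information reduced to one scalar per channel. None of these representations is specified in the text, and a real system would use richer parameters. Under these assumptions both orderings give the same output, but modifying the parameters first touches each sample only once.

```python
from typing import List, Tuple


def integrate(downmix: List[float], gains: List[float]) -> List[List[float]]:
    """Upmix a mono downmix: one output channel per gain (toy model)."""
    return [[g * x for x in downmix] for g in gains]


def apply_after_integration(downmix, gains, acoustic):
    """Integrate first, then apply the acoustic scalars to every sample."""
    channels = integrate(downmix, gains)
    return [[a * x for x in ch] for a, ch in zip(acoustic, channels)]


def modify_then_integrate(downmix, gains, acoustic):
    """Modify the per-channel parameters first, then integrate once."""
    modified = [g * a for g, a in zip(gains, acoustic)]
    return integrate(downmix, modified)


def multiply_counts(num_channels: int, num_samples: int) -> Tuple[int, int]:
    """Multiplications per frame: (apply after, modify first)."""
    after = 2 * num_channels * num_samples
    before = num_channels + num_channels * num_samples
    return after, before
```

For example, with 5 output channels and 1024 samples per frame, this toy model needs 10240 multiplications when the acoustic information is applied after integration and 5125 when the spatial information is modified first.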

In addition, when the number of transmitted channels differs from the number of reproducible channels, the encoding apparatus may send spatial acoustic information that takes the reproducible channels into account. That is, when N channels are transmitted and M channels can be reproduced, the encoding apparatus may generate and send spatial acoustic information that takes the M channels into account. For example, when transmitting in the 5-1-5 configuration, which is one of the transmission modes, the encoding apparatus may send spatial acoustic information optimized for the case where the user has a stereo playback device. Likewise, when transmitting in the 5-2-5 configuration, the encoding apparatus may send spatial acoustic information that gives the downmix signal a 3D stereo effect. The encoding apparatus may also send spatial acoustic information so that the user can choose between the downmix signal and 3D stereo. In these cases as well, predetermined spatial acoustic information included in the decoding apparatus may be used. There may be one or more pieces of predetermined spatial acoustic information, and the user may freely add, remove, or modify them. The method of reproduction using a plurality of pieces of spatial acoustic information according to the present invention can be widely used as multi-spatial-sound or multi-sound-field playback, analogous to the multi-angle feature that provides multiple video views on a DVD.

FIG. 6 is a flowchart of a decoding method that modifies a downmix signal using spatial acoustic information according to an embodiment of the present invention. Referring to FIG. 6, the decoding method and the operation of the decoding apparatus according to the present invention are as follows. First, the decoding apparatus receives a bitstream including a downmix signal and spatial information (601). Next, the demultiplexer 502 separates the downmix signal and the spatial information from the bitstream (602). Identification information indicating whether spatial acoustic information is included may be inserted in the bitstream. If the identification information indicates that the bitstream includes one or more pieces of spatial acoustic information, the demultiplexer 502 may separate the spatial acoustic information from the bitstream. The bitstream may also include information on the number of pieces of spatial acoustic information, in which case the demultiplexer 502 may separate that information from the bitstream as well.
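The role of the identification information and the count information can be sketched with a toy byte layout. The layout below (length-prefixed fields, a one-byte presence flag, a one-byte count) is purely hypothetical and chosen for readability. The text does not define any bitstream syntax, and `pack_bitstream` and `parse_bitstream` are illustrative names.

```python
import struct
from typing import Dict, List


def pack_bitstream(downmix: bytes, spatial: bytes, acoustic: List[bytes]) -> bytes:
    """Multiplex the fields into one byte string (hypothetical layout)."""
    out = bytearray()
    out += struct.pack(">I", len(downmix)) + downmix
    out += struct.pack(">I", len(spatial)) + spatial
    out += struct.pack(">B", 1 if acoustic else 0)   # identification flag
    if acoustic:
        out += struct.pack(">B", len(acoustic))      # count information
        for entry in acoustic:
            out += struct.pack(">H", len(entry)) + entry
    return bytes(out)


def parse_bitstream(data: bytes) -> Dict[str, object]:
    """Demultiplex the fields written by pack_bitstream."""
    pos = 0

    def take(fmt: str) -> int:
        nonlocal pos
        (value,) = struct.unpack_from(fmt, data, pos)
        pos += struct.calcsize(fmt)
        return value

    def take_bytes(n: int) -> bytes:
        nonlocal pos
        chunk = data[pos:pos + n]
        pos += n
        return chunk

    downmix = take_bytes(take(">I"))
    spatial = take_bytes(take(">I"))
    acoustic: List[bytes] = []
    if take(">B"):                      # flag set: acoustic entries follow
        count = take(">B")
        acoustic = [take_bytes(take(">H")) for _ in range(count)]
    return {"downmix": downmix, "spatial": spatial, "acoustic": acoustic}
```

When the flag is zero the parser reads nothing further, which mirrors the case in which the decoder relies only on its predetermined spatial acoustic information.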

If spatial acoustic information is used (603), the spatial acoustic information selecting unit 509 selects one or more pieces of spatial acoustic information (604), and the selected spatial acoustic information may be applied to the downmix signal to modify it. As described above, the spatial acoustic information selecting unit 509 may select from among the one or more pieces of spatial acoustic information separated from the bitstream and the one or more pieces of predetermined spatial acoustic information.

The present invention also includes modifying the spatial information using the selected spatial acoustic information (605) and decoding the downmix signal using the modified spatial information (606). In this case, the spatial information modifying unit 508 modifies the spatial information using the spatial acoustic information selected by the spatial acoustic information selecting unit 509. Next, the spatial information integrating unit 512 may use the modified spatial information to convert the downmix signal into a multichannel audio signal having a 3D sound effect. If the decoding apparatus cannot output a multichannel audio signal, the spatial acoustic information may be applied to the downmix signal and the downmix signal may then be output directly. If spatial acoustic information is not used (603), the downmix signal may be decoded using the unmodified spatial information (606).
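The branching in steps 603 to 606 can be summarized in a short control-flow sketch. The signal processing itself is passed in as callables, because the text does not specify it. The names `decode`, `modify`, `integrate`, and `apply_direct` are hypothetical, and the behavior when nothing is selected on a device without multichannel output is an assumption of this sketch.

```python
from typing import Callable, Optional, TypeVar

Signal = TypeVar("Signal")
Params = TypeVar("Params")
Acoustic = TypeVar("Acoustic")


def decode(
    downmix: Signal,
    spatial: Params,
    selected: Optional[Acoustic],
    *,
    modify: Callable[[Params, Acoustic], Params],
    integrate: Callable[[Signal, Params], object],
    apply_direct: Callable[[Signal, Acoustic], object],
    multichannel_out: bool = True,
):
    """Follow the flow of FIG. 6 after demultiplexing (steps 603 to 606)."""
    if selected is None:
        # Step 603 "no": decode with the unmodified spatial information (606).
        # Assumption: this path is also taken when multichannel_out is False.
        return integrate(downmix, spatial)
    if not multichannel_out:
        # No multichannel output: apply the acoustic information to the
        # downmix signal and output the downmix directly.
        return apply_direct(downmix, selected)
    modified = modify(spatial, selected)   # step 605
    return integrate(downmix, modified)    # step 606
```

Keeping the modification step separate from the integration step reflects the ordering that the description identifies as computationally favorable.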

Although the present invention has been described in detail with reference to several embodiments, these embodiments are presented to aid understanding of the invention, and the scope of the invention is not limited to them. Those skilled in the art will understand that various modifications are possible without departing from the technical spirit of the invention, and the scope of the invention should be interpreted according to the appended claims.

As described above, in coding a multichannel audio signal according to the present invention, the output signal of the audio signal may be modified using spatial acoustic information to provide one or more 3D sound effects. The spatial information may also be modified using the spatial acoustic information, and the output signal of the audio signal may then be modified using the modified spatial information. The spatial acoustic information may be spatial acoustic information sent from the encoding apparatus or predetermined spatial acoustic information. A method and apparatus that modify the spatial information using the spatial acoustic information before the spatial information is integrated into the downmix signal, and then decode the downmix signal using the modified spatial information, can effectively reduce the amount of information that must be transmitted and stored.

Claims

A method of generating an audio signal, the method comprising: generating an output signal for each channel of the audio signal; and

generating a modified output signal for each channel by applying, to the output signal for each channel, spatial acoustic information that provides at least one 3D sound effect.

The method of claim 1, wherein the spatial acoustic information

is generated using at least one of first acoustic path information corresponding to an acoustic path between an actual speaker position and a listener position, and second acoustic path information corresponding to an acoustic path between a virtual speaker position and the listener position.

The method of claim 1, wherein generating the modified output signal for each channel comprises:

generating modified spatial information by applying the spatial acoustic information to spatial information of the audio signal; and

applying the modified spatial information to the output signal for each channel.

The method of claim 3, wherein generating the modified spatial information

is performed by changing at least one of a speaker arrangement angle and a virtual sound source angle.

A method of decoding an audio signal, the method comprising: separating a downmix signal and spatial information from a bitstream of the audio signal;

modifying the spatial information using spatial acoustic information that provides at least one 3D sound effect; and

decoding the downmix signal using the modified spatial information.

The method of claim 5, further comprising

extracting, from the bitstream, identification information indicating whether the spatial acoustic information is included.

The method of claim 6, further comprising

selecting from among predetermined spatial acoustic information when the spatial acoustic information is not included in the bitstream.

The method of decoding an audio signal, further comprising extracting the spatial acoustic information from the bitstream when the spatial acoustic information is included in the bitstream.

The method of claim 8, further comprising

selecting one of the extracted spatial acoustic information and predetermined spatial acoustic information.

The method of claim 7 or 9, wherein the selecting

is performed automatically according to a change in a video signal corresponding to the audio signal.

The method of decoding an audio signal, characterized by being performed according to a user's selection.

A method of encoding an audio signal, the method comprising: generating a downmix signal and spatial information from a multichannel audio signal; and

generating a bitstream including the downmix signal and the spatial information, wherein spatial acoustic information providing at least one piece of 3D sound information is inserted into the bitstream.

The method of claim 12, wherein inserting the spatial acoustic information comprises

inserting, into the bitstream, spatial acoustic information corresponding to the number of output channels reproducible by a decoding apparatus.

The method of claim 12, further comprising

inserting, into the bitstream, information on the number of pieces of spatial acoustic information inserted into the bitstream.

A data structure comprising an audio signal and spatial acoustic information that provides at least one 3D sound effect, wherein the spatial acoustic information is used to modify an output signal of the audio signal.

The data structure of claim 15,

further comprising spatial information of the audio signal, wherein the spatial information is modified using the spatial acoustic information, and the modified spatial information is used to modify the output signal of the audio signal.

An apparatus for decoding an audio signal, the apparatus comprising: a demultiplexer that separates a downmix signal and spatial information from a bitstream of the audio signal;

a spatial information modifying unit that modifies the spatial information using spatial acoustic information that provides at least one 3D sound effect; and

a spatial information integrating unit that decodes the downmix signal using the modified spatial information.

The apparatus of claim 17, further comprising

a spatial acoustic information selecting unit that selects at least one piece of spatial acoustic information.

An apparatus for encoding an audio signal, the apparatus comprising: a downmix unit that generates a downmix signal from a multichannel audio signal;

a spatial information extracting unit that generates spatial information from the multichannel audio signal;

a spatial acoustic information generating unit that generates spatial acoustic information providing at least one 3D sound effect; and

a multiplexer that generates a bitstream including the downmix signal, the spatial information, and the spatial acoustic information.