KR102471718B1

KR102471718B1 - Broadcasting transmitting and reproducing apparatus and method for providing the object audio

Info

Publication number: KR102471718B1
Application number: KR1020210027569A
Authority: KR
Inventors: 이용주; 이태진; 강경옥; 김진웅; 안치득
Original assignee: 한국전자통신연구원
Priority date: 2019-07-25
Filing date: 2021-03-02
Publication date: 2022-11-28
Also published as: KR20210027330A

Abstract

A broadcast transmission apparatus and method, and a broadcast reproduction apparatus and method, for providing object-based audio using multi-channel audio encoding and decoding are provided. The broadcast transmission apparatus may generate audio identification information for identifying whether a multi-channel audio signal is an object-based audio signal or a surround audio signal. When the audio identification information indicates that the multi-channel audio signal is an object-based audio signal, the broadcast reproduction apparatus may control and output the multi-channel audio signal channel by channel.

Description

Broadcast transmission apparatus and method, and broadcast reproduction apparatus and method, for providing object-based audio {BROADCASTING TRANSMITTING AND REPRODUCING APPARATUS AND METHOD FOR PROVIDING THE OBJECT AUDIO}

The present invention relates to a technology for providing object-based audio using multi-channel audio encoding and decoding.

An object-based audio service is an audio service that lets a user control the individual sound sources contained in the audio while listening. In general, music is a mix of instruments such as guitar, piano, and bass with a vocalist's voice. Unlike conventional audio, an object-based audio service can encode, store, or transmit the individual instruments and the singer's voice independently, without mixing them. Accordingly, when an object-based audio service is used, the playback terminal can control each instrument or the vocal individually.

However, in an object-based audio service, encoding and decoding are performed independently for each audio object. As a result, the service is not compatible with broadcasting systems such as DTV (Digital Television), DMB (Digital Multimedia Broadcasting), and DAB (Digital Audio Broadcasting). In other words, broadcast transmission and reproduction apparatuses for DTV, DMB, DAB, and the like have difficulty controlling audio signals object by object.

Therefore, a technology capable of providing an object-based audio service in broadcasting systems such as DTV, DMB, and DAB is required.

The present invention provides a broadcast transmission apparatus and method, and a broadcast reproduction apparatus and method, capable of providing an object-based audio service in broadcasting systems such as DTV, DMB, and DAB by using audio identification information.

A broadcast transmission apparatus according to an embodiment of the present invention may include an audio encoder that encodes a multi-channel audio signal, and an audio identification information generator that generates audio identification information identifying whether the multi-channel audio signal is an object-based audio signal.

The apparatus may further include a video encoder that encodes a video signal.

The apparatus may further include a packetization and multiplexing unit that packetizes and multiplexes the audio identification information, mixing information, and the encoded multi-channel audio signal.

When the multi-channel audio signal is an object-based audio signal, the audio identification information generator may generate one or more pieces of mixing information, each specifying how the channels are to be mixed.

The audio identification information generator may generate the audio identification information in the form of a descriptor.

A broadcast transmission method according to an embodiment of the present invention may include encoding a multi-channel audio signal, and generating audio identification information identifying whether the multi-channel audio signal is an object-based audio signal.

The method may further include packetizing and multiplexing the audio identification information, mixing information, and the encoded audio signal.

A broadcast reproduction apparatus according to an embodiment of the present invention may include a multi-channel audio signal determination unit that determines, based on audio identification information extracted from a bitstream, whether a multi-channel audio signal is an object-based audio signal, and an audio decoder that decodes the encoded multi-channel audio signal.

When one or more pieces of mixing information are input, the downmix unit may downmix the multi-channel audio signal into a stereo audio signal according to the piece of mixing information that is set as the default.

When one or more pieces of mixing information are input, the downmix unit may downmix the multi-channel audio signal into a stereo audio signal according to the piece of mixing information that is selected through user manipulation.

The downmix unit may also downmix the multi-channel audio signal into a stereo audio signal according to mixing information input through user manipulation.

A broadcast reproduction method according to an embodiment of the present invention may include determining, based on audio identification information, whether a multi-channel audio signal is an object-based audio signal, and decoding the encoded multi-channel audio signal.

According to the present invention, an object-based audio service can be provided in broadcasting systems such as DTV, DMB, and DAB by using audio identification information that identifies whether a multi-channel audio signal is a surround audio signal or an object-based audio signal.

FIG. 1 is a block diagram showing the configuration of a broadcast transmission apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating the operation of a broadcast transmission apparatus according to an embodiment of the present invention.
FIG. 3 is a block diagram showing the configuration of a broadcast reproduction apparatus according to another embodiment of the present invention.
FIG. 4 is a flowchart illustrating the operation of a broadcast reproduction apparatus according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. However, the present invention is not restricted or limited by the embodiments. Like reference numerals in the drawings denote like elements.

FIG. 1 is a block diagram showing the configuration of a broadcast transmission apparatus according to an embodiment of the present invention.

Referring to FIG. 1, the broadcast transmission apparatus 100 may include a video encoder 110, an audio encoder 120, an audio identification information generator 130, and a packetization and multiplexing unit 140.

The video encoder 110 may encode a video signal using any of various compression algorithms such as MPEG.

The audio encoder 120 may encode a multi-channel audio signal.

The audio identification information generator 130 may generate audio identification information identifying whether the multi-channel audio signal is a surround audio signal or an object-based audio signal.

Here, the audio identification information generator 130 may generate the audio identification information in the form of a descriptor. The audio identification information in descriptor form may then be inserted into the PMT (Program Map Table) of an MPEG-2 TS (Transport Stream) and transmitted to the broadcast reproduction apparatus.
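As a rough illustration of the descriptor form, the sketch below serializes audio identification information with the generic MPEG-2 descriptor framing: a one-byte tag, a one-byte length, then the payload. The tag value, the one-byte payload layout, and every name in the code are assumptions made for this example; the description does not define them.

```python
import struct

# Hypothetical values: the source assigns neither a descriptor tag nor a payload layout.
AUDIO_ID_DESCRIPTOR_TAG = 0xA0  # arbitrary placeholder tag
AUDIO_TYPE_SURROUND = 0
AUDIO_TYPE_OBJECT_BASED = 1


def build_audio_id_descriptor(audio_type: int) -> bytes:
    """Serialize as descriptor_tag (1 byte), descriptor_length (1 byte), payload."""
    if audio_type not in (AUDIO_TYPE_SURROUND, AUDIO_TYPE_OBJECT_BASED):
        raise ValueError("unknown audio type")
    payload = struct.pack("B", audio_type)
    return struct.pack("BB", AUDIO_ID_DESCRIPTOR_TAG, len(payload)) + payload


def parse_audio_id_descriptor(data: bytes) -> int:
    """Return the audio type carried by a descriptor built by the function above."""
    if len(data) < 2:
        raise ValueError("descriptor too short")
    tag, length = struct.unpack_from("BB", data, 0)
    if tag != AUDIO_ID_DESCRIPTOR_TAG:
        raise ValueError("not an audio identification descriptor")
    if length < 1 or len(data) < 2 + length:
        raise ValueError("truncated descriptor")
    return data[2]


descriptor = build_audio_id_descriptor(AUDIO_TYPE_OBJECT_BASED)
```

A receiver that does not recognize the tag can skip the descriptor using the length byte, which is one reason a descriptor is a convenient carrier for optional signaling of this kind.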

In addition, the audio identification information generator 130 may generate one or more pieces of mixing information, each specifying how the channels are to be mixed. When the multi-channel audio signal is an object-based audio signal, the audio identification information generator 130 may generate audio identification information into which the mixing information is inserted. Likewise, the audio identification information generator 130 may generate the audio identification information containing the mixing information in the form of a descriptor.

For example, when generating mixing information for a voice (such as a singer's), instrument 1, and instrument 2, the audio identification information generator 130 may generate mixing information 1, which mixes voice:instrument 1:instrument 2 at 1:1:1; mixing information 2, which mixes them at 1:0:1; and mixing information 3, which mixes them at 1:1:0. The audio identification information generator 130 may then insert mixing information 1, mixing information 2, and mixing information 3 into the audio identification information. One of mixing information 1 to mixing information 3 may be preset as the default.
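The three presets in this example can be modeled as a small data structure. The following is a minimal sketch, assuming a per-source gain tuple and an index that marks the default; the class and field names are hypothetical and do not come from the source.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class MixingInfo:
    """One mixing preset: a gain per source, ordered voice, instrument 1, instrument 2."""
    gains: Tuple[float, ...]


@dataclass
class AudioIdentificationInfo:
    """Hypothetical container for the identification flag and its mixing presets."""
    is_object_based: bool
    mixing_infos: List[MixingInfo] = field(default_factory=list)
    default_index: int = 0  # position of the preset marked as default

    def default_mixing(self) -> MixingInfo:
        if not self.mixing_infos:
            raise ValueError("no mixing information present")
        return self.mixing_infos[self.default_index]


info = AudioIdentificationInfo(
    is_object_based=True,
    mixing_infos=[
        MixingInfo((1.0, 1.0, 1.0)),  # mixing information 1
        MixingInfo((1.0, 0.0, 1.0)),  # mixing information 2: instrument 1 muted
        MixingInfo((1.0, 1.0, 0.0)),  # mixing information 3: instrument 2 muted
    ],
    default_index=0,
)
```

Here mixing information 1 is marked as the default, matching the later reproduction example in which the 1:1:1 preset is applied when the user makes no choice.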

The packetization and multiplexing unit 140 may packetize the encoded video signal, the encoded audio signal, and the audio identification information separately. The packetization and multiplexing unit 140 may then multiplex the packetized video signal, the packetized audio signal, and the packetized audio identification information to generate a single bitstream. The broadcast transmission apparatus 100 may then transmit the bitstream to the broadcast reproduction apparatus.
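The packetize-then-multiplex step can be pictured with a toy model. The sketch below is illustrative only: it labels packets with a stream name and interleaves them round-robin, whereas a real MPEG-2 transport stream uses PIDs and fixed 188-byte packets. The function names, the chunk size, and the interleaving order are all assumptions.

```python
from itertools import zip_longest
from typing import Dict, List, Tuple

Packet = Tuple[str, bytes]  # (stream label, payload chunk)


def packetize(stream: str, data: bytes, size: int) -> List[Packet]:
    """Split one stream into fixed-size chunks labeled with the stream name."""
    return [(stream, data[i:i + size]) for i in range(0, len(data), size)]


def multiplex(*streams: List[Packet]) -> List[Packet]:
    """Interleave the packets of all streams round-robin into one sequence."""
    muxed: List[Packet] = []
    for group in zip_longest(*streams):
        muxed.extend(p for p in group if p is not None)
    return muxed


def demultiplex(muxed: List[Packet]) -> Dict[str, bytes]:
    """Reassemble each stream by concatenating its packets in arrival order."""
    out: Dict[str, bytes] = {}
    for stream, chunk in muxed:
        out[stream] = out.get(stream, b"") + chunk
    return out


bitstream = multiplex(
    packetize("video", b"VVVVVVVV", 4),
    packetize("audio", b"AAAAAA", 4),
    packetize("audio_id", b"\x01", 4),
)
```

Calling `demultiplex(bitstream)` recovers the three original byte strings, which mirrors what the reproduction side does before depacketizing and decoding.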

The broadcast transmission apparatus of FIG. 1 described above may also store the encoded multi-channel audio signal and the audio identification information in a storage medium such as a USB device, an external hard disk, a Blu-ray disc, or a DVD. In this case, the broadcast transmission apparatus may store the audio identification information in the storage medium in the form of a descriptor.

In addition, a portable terminal, a home terminal, a vehicle terminal, or the like that provides DTV, DMB, DAB, and so on may be used as the broadcast transmission apparatus described with reference to FIG. 1.

FIG. 2 is a flowchart illustrating the operation of a broadcast transmission apparatus according to an embodiment of the present invention.

First, in step 210, the broadcast transmission apparatus may encode a video signal using a video compression algorithm such as MPEG-2 or HEVC.

Next, in step 220, the broadcast transmission apparatus may encode a multi-channel audio signal using an audio compression algorithm such as AC-3 (Audio Coding-3), AAC (Advanced Audio Coding), or BSAC (Bit-Sliced Arithmetic Coding).

In step 230, the broadcast transmission apparatus may generate audio identification information for identifying whether the multi-channel audio signal is an object-based audio signal or a surround audio signal. For example, the broadcast transmission apparatus may generate the audio identification information in the form of a descriptor.

Here, the broadcast transmission apparatus may generate one or more pieces of mixing information, each specifying how the channels are to be mixed. When the multi-channel audio signal is an object-based audio signal, the broadcast transmission apparatus may generate audio identification information into which the mixing information is inserted. The broadcast transmission apparatus may generate this audio identification information in the form of a descriptor.

Next, in step 240, the broadcast transmission apparatus may packetize the audio identification information, the encoded video signal, and the encoded audio signal separately. The broadcast transmission apparatus may then multiplex the packetized audio identification information, the packetized video signal, and the packetized audio signal to generate a bitstream.

In FIG. 2, the order of steps 210 to 230 may be changed. In other words, the video signal encoding, the multi-channel audio signal encoding, and the audio identification information generation may be performed in any order.

Meanwhile, in FIG. 2, the broadcast transmission apparatus may store the encoded multi-channel audio signal and the audio identification information in a storage medium. Here, the audio identification information may include one or more pieces of mixing information, and the audio identification information stored in the storage medium may be in the form of a descriptor.

FIG. 3 is a block diagram showing the configuration of a broadcast reproduction apparatus according to another embodiment of the present invention.

Referring to FIG. 3, the broadcast reproduction apparatus 300 may include a depacketizer 310, a video decoder 320, an audio decoder 330, a multi-channel audio signal determination unit 340, and a downmix unit 350.

The depacketizer 310 may demultiplex and depacketize a bitstream.

For example, the depacketizer 310 may demultiplex the bitstream received from the broadcast transmission apparatus and extract the encoded multi-channel audio signal, the encoded video signal, and the audio identification information from it. Here, the audio identification information may include one or more pieces of mixing information.

The depacketizer 310 may then depacketize the encoded multi-channel audio signal, the encoded video signal, and the audio identification information separately.

The video decoder 320 may decode the encoded video signal based on video information included in signaling information.

The audio decoder 330 may decode the encoded multi-channel audio signal based on audio information included in the signaling information. Here, the signaling information may be included in the bitstream received from the broadcast transmission apparatus.

The multi-channel audio signal determination unit 340 may determine, based on the audio identification information, whether the multi-channel audio signal is an object-based audio signal or a surround audio signal. Here, the audio identification information may be in the form of a descriptor.

When the multi-channel audio signal is determined to be a surround audio signal, the audio decoder 330 may output the decoded multi-channel audio signal as is.

When the multi-channel audio signal is determined to be an object-based audio signal, the downmix unit 350 may downmix the multi-channel audio signal into a stereo audio signal based on one or more pieces of mixing information. Here, the mixing information may be received from the broadcast transmission apparatus inside the audio identification information, or may be input through user manipulation.

When the one or more pieces of mixing information included in the audio identification information are used, the downmix unit 350 may downmix the multi-channel audio signal into a stereo audio signal according to the piece of mixing information that is set as the default.

For example, suppose the audio identification information includes mixing information 1, which mixes voice:instrument 1:instrument 2 at 1:1:1; mixing information 2, which mixes them at 1:0:1; and mixing information 3, which mixes them at 1:1:0, and mixing information 1 is preset as the default. The downmix unit 350 may then downmix the multi-channel audio signal into a stereo audio signal according to mixing information 1.

As another example, when mixing information 3 is selected through user manipulation from among mixing information 1, mixing information 2, and mixing information 3, the downmix unit 350 may downmix the multi-channel audio signal into a stereo audio signal according to mixing information 3.

In addition, when the audio identification information includes one or more pieces of mixing information and mixing information is also input through user manipulation, the downmix unit 350 may downmix the multi-channel audio signal into a stereo audio signal according to the mixing information input through user manipulation.

For example, when the user does not want to mix according to any of mixing information 1 to mixing information 3 and instead wants a specific ratio, the user may enter mixing information for voice:instrument 1:instrument 2 using an operation unit (not shown) provided on the broadcast reproduction apparatus, a remote control, or the like. When the user enters mixing information that mixes voice:instrument 1:instrument 2 at 1:0.5:0.5, the downmix unit 350 may downmix the multi-channel audio signal at 1:0.5:0.5 and output a stereo audio signal.
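The 1:0.5:0.5 case can be worked through numerically. The sketch below is a minimal illustration, assuming each channel of the object-based signal carries one source and that every source feeds the left and right outputs equally; the description gives mixing ratios but no panning rule, so that choice and the function name are assumptions.

```python
from typing import List, Sequence, Tuple


def downmix_to_stereo(
    channels: Sequence[Sequence[float]],
    gains: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Weight each object channel by its gain and sum into left/right outputs.

    Assumption: every object feeds left and right equally, because the
    source specifies mixing ratios but not panning.
    """
    if len(channels) != len(gains):
        raise ValueError("one gain is required per channel")
    if not channels:
        return [], []
    length = len(channels[0])
    if any(len(ch) != length for ch in channels):
        raise ValueError("all channels must have the same length")
    mixed = [
        sum(g * ch[i] for g, ch in zip(gains, channels))
        for i in range(length)
    ]
    return list(mixed), list(mixed)


# Two samples per source, chosen so the arithmetic is easy to follow.
voice = [1.0, 0.0]
instrument_1 = [0.0, 2.0]
instrument_2 = [4.0, 4.0]
left, right = downmix_to_stereo([voice, instrument_1, instrument_2], [1.0, 0.5, 0.5])
```

For the first sample the result is 1.0 * 1.0 + 0.5 * 0.0 + 0.5 * 4.0 = 3.0, and for the second it is 1.0 * 0.0 + 0.5 * 2.0 + 0.5 * 4.0 = 3.0. Setting a gain to 0, as in mixing information 2 or 3, removes that source from the output entirely.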

FIG. 4 is a flowchart illustrating the operation of a broadcast reproduction apparatus according to an embodiment of the present invention.

First, in step 410, the broadcast reproduction apparatus may demultiplex a bitstream.

For example, through demultiplexing, the broadcast reproduction apparatus may separate at least one of the encoded video signal, the encoded multi-channel audio signal, and the audio identification information from the bitstream. Here, the audio identification information may include one or more pieces of mixing information and may be in the form of a descriptor.

Next, in step 420, the broadcast reproduction apparatus may depacketize the encoded video signal, the encoded multi-channel audio signal, and the audio identification information separately. Here, the audio identification information may include one or more pieces of mixing information and may be in the form of a descriptor.

In step 430, the broadcast reproduction apparatus may decode the encoded video signal based on video information included in signaling information. Here, the signaling information includes video information, audio information, and the like, and may be included in the bitstream received from the broadcast transmission apparatus.

Next, in step 440, the broadcast reproduction apparatus may decode the encoded multi-channel audio signal based on the audio information included in the signaling information.

In step 450, the broadcast reproduction apparatus may analyze the audio identification information.

By analyzing the audio identification information, the broadcast reproduction apparatus may determine whether the multi-channel audio signal is an object-based audio signal or a surround audio signal.

Next, when the analysis of the audio identification information in step 460 determines that the multi-channel audio signal is an object-based audio signal (460: YES), in step 470 the broadcast reproduction apparatus may downmix the multi-channel audio signal into a stereo audio signal based on one or more pieces of mixing information.

Here, the broadcast reproduction apparatus may downmix the multi-channel audio signal into a stereo audio signal according to the one or more pieces of mixing information included in the audio identification information.

For example, the downmix unit 350 may downmix the multi-channel audio signal into a stereo audio signal according to the piece of mixing information that is set as the default.

As another example, the broadcast reproduction apparatus may downmix the multi-channel audio signal into a stereo audio signal according to the piece of mixing information that the user selects from among those included in the audio identification information.

The broadcast reproduction apparatus may also downmix the multi-channel audio signal into a stereo audio signal according to mixing information input through user manipulation. That is, when the user does not want the multi-channel audio signal downmixed according to any of the mixing information included in the audio identification information, the broadcast reproduction apparatus may receive mixing information from the user. The broadcast reproduction apparatus may then downmix the multi-channel audio signal into a stereo audio signal according to the mixing information that the user enters using an operation unit (not shown) such as key buttons or a touch panel, a remote control, or the like.
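The three sources of mixing information in step 470 suggest a simple precedence. The sketch below encodes one plausible reading of the description: gains entered by the user take priority, then a preset the user selected, then the preset default. The function name, parameter names, and the tuple representation are hypothetical.

```python
from typing import Optional, Sequence, Tuple

Gains = Tuple[float, ...]


def choose_mixing_info(
    presets: Sequence[Gains],
    default_index: int = 0,
    selected_index: Optional[int] = None,
    user_gains: Optional[Gains] = None,
) -> Gains:
    """Pick the mixing information to apply during downmixing.

    Order (an interpretation of the description): user-entered gains,
    then a user-selected preset, then the preset default.
    """
    if user_gains is not None:
        return user_gains
    if not presets:
        raise ValueError("no mixing information available")
    if selected_index is not None:
        return presets[selected_index]
    return presets[default_index]


# Mixing information 1 to 3 from the example, as voice:instrument 1:instrument 2.
presets = [(1.0, 1.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 0.0)]
```

With no user action, `choose_mixing_info(presets)` yields the 1:1:1 default; passing `selected_index=2` yields mixing information 3; passing `user_gains=(1.0, 0.5, 0.5)` overrides both.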

Meanwhile, when the multi-channel audio signal is determined to be a surround audio signal (460: NO), in step 480 the broadcast reproduction apparatus may output the decoded multi-channel audio signal as is.
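The branch at step 460 can be summarized in a few lines. This is a sketch under stated assumptions rather than the apparatus's actual logic: the function name is hypothetical, decoded audio is modeled as one list of samples per channel, and the object-based path feeds the same weighted sum to both stereo outputs because the description does not specify panning.

```python
from typing import List, Sequence


def render_output(
    is_object_based: bool,
    decoded: Sequence[Sequence[float]],
    gains: Sequence[float],
) -> List[List[float]]:
    """Return the channels to output for decoded multi-channel audio."""
    if not is_object_based:
        # Step 480 (460: NO): surround audio is output unchanged.
        return [list(ch) for ch in decoded]
    # Step 470 (460: YES): object-based audio is downmixed to two channels.
    if len(decoded) != len(gains):
        raise ValueError("one gain is required per channel")
    mixed = [sum(g * s for g, s in zip(gains, frame)) for frame in zip(*decoded)]
    return [list(mixed), list(mixed)]  # assumed identical left and right


decoded = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
surround_out = render_output(False, decoded, (1.0, 0.0, 1.0))
object_out = render_output(True, decoded, (1.0, 0.0, 1.0))
```

The surround path returns all three channels untouched, while the object-based path returns two channels in which the second source, whose gain is 0, no longer contributes.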

In FIG. 4, the order of steps 430 to 450 may be changed. In other words, the video signal decoding, the multi-channel audio signal decoding, and the audio identification information analysis may be performed in any order.

So far, configurations that restore a multi-channel audio signal, or downmix a multi-channel audio signal into a stereo audio signal, based on a bitstream received from the broadcast transmission apparatus have been described with reference to FIGS. 3 and 4. In addition, the broadcast reproduction apparatus may output a stereo audio signal or a decoded multi-channel audio signal based on an encoded multi-channel audio signal and audio identification information stored in a storage medium.

FIGS. 1 to 4 describe encoding video and audio signals and generating audio identification information, but the components that process the video signal may be omitted. That is, in the case of radio, the video encoder that encodes the video signal may be omitted from the broadcast transmission apparatus of FIGS. 1 and 2. Likewise, the video decoder that decodes the video signal may be omitted from the broadcast reproduction apparatus of FIGS. 3 and 4.

Although the present invention has been described above with reference to limited embodiments and drawings, the present invention is not limited to those embodiments, and a person of ordinary skill in the art to which the present invention pertains can make various modifications and variations from this description.

Therefore, the scope of the present invention should not be limited to the described embodiments, and should be defined by the claims below and their equivalents.

110: video encoder
120: audio encoder
130: audio identification information generator
140: packetization and multiplexing unit

Claims

An audio signal transmission method comprising:
generating audio identification information that identifies whether an audio signal is a channel-based audio signal or an object-based audio signal; and
transmitting the audio identification information and the audio signal,
wherein the audio signal is processed based on the audio identification information,
wherein, when the audio signal is determined to be an object-based audio signal based on the audio identification information, the audio signal is processed, during broadcast reproduction, according to one piece of mixing information selected through user manipulation from among a plurality of pieces of mixing information, or according to mixing information input through user manipulation, and
wherein the plurality of pieces of mixing information:
include a ratio for mixing sound sources;
are inserted into the audio identification information; and
include mixing information for a plurality of channels for the object-based audio signal and mixing information for a plurality of objects included in the object-based audio signal.

The method of claim 1,
wherein the audio signal is processed based on at least one of first information indicating whether a component included in the audio signal is to be included, or second information related to the magnitude of a component included in the audio signal.

The method of claim 2,
wherein there may be a plurality of pieces of the first information or the second information,
which include mixing information, any one piece of which can be selected through user manipulation when the audio signal is reproduced and used to reproduce the audio signal.

An audio signal reproduction method comprising:
receiving audio identification information that identifies whether an audio signal is a channel-based audio signal or an object-based audio signal; and
reproducing the audio signal based on the audio identification information,
wherein the audio signal is processed based on the audio identification information,
wherein, when the audio signal is determined to be an object-based audio signal based on the audio identification information, the audio signal is processed, during broadcast reproduction, according to one piece of mixing information selected through user manipulation from among a plurality of pieces of mixing information, or according to mixing information input through user manipulation, and
wherein the plurality of pieces of mixing information:
include a ratio for mixing sound sources;
are inserted into the audio identification information; and
include mixing information for a plurality of channels for the object-based audio signal and mixing information for a plurality of objects included in the object-based audio signal.

The method of claim 4,
wherein the audio signal is processed based on at least one of first information indicating whether a component included in the audio signal is to be included, or second information related to the magnitude of a component included in the audio signal.

The method of claim 5,
wherein there may be a plurality of pieces of the first information or the second information,
which include mixing information, any one piece of which can be selected through user manipulation when the audio signal is reproduced and used to reproduce the audio signal.