WO2010143907A2 - Encoding method and encoding device, decoding method and decoding device, and transcoding method and transcoder for multi-object audio signals - Google Patents
- Publication number
- WO2010143907A2 (application PCT/KR2010/003752)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- rendering
- object signals
- signal
- saoc
- final
- Prior art date
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
Definitions
- The present invention relates to an encoding method and apparatus, a decoding method and apparatus, and a transcoding method and transcoder for multi-object audio signals.
- a method and apparatus for coding are known in the art.
- A multi-object audio signal is compressed using the Spatial Audio Object Codec (SAOC) technique.
- In SAOC, a sound scene is generated by compressing a plurality of input object signals using only the spatial parameters of the audio object signals for each frequency band.
- As a result, a sound scene in which the volume of each object signal is controlled can be generated even at a very low bit rate.
- Because the multi-object audio signal is compressed and decompressed using a limited number of bits, sound quality degradation of the object signals themselves inevitably occurs during encoding and decoding. The degradation is especially severe when certain object signals, such as vocal signals, are completely removed or reproduced alone. For this reason, the range of object signal control is generally limited when the SAOC technique is used.
- Encoding and decoding are performed on the object signals that are to be controlled to an extreme level among the plurality of input object signals (hereinafter referred to as foreground objects, or FGOs).
- A vocal signal is a representative foreground object signal to be controlled, and a karaoke service is a representative application.
- Accordingly, there is a need for an audio signal encoding technique that can provide sound quality satisfactory to a listener by reducing sound quality deterioration even in an extreme control environment while controlling the volume of a plurality of object signals.
- The present invention provides a multi-object audio encoding/decoding method and apparatus, and a transcoding method and transcoder, that can control, for each object signal, the volume of foreground object signals such as vocal signals and of a background object (BGO) signal composed of the other signals, for a service such as karaoke.
- The present invention provides a multi-object audio encoding/decoding method and apparatus, and a transcoding method and transcoder, capable of increasing the number of object signals to be controlled by encoding and decoding foreground object signals and background object signals together.
- The present invention provides a multi-object audio encoding/decoding method and apparatus, and a transcoding method and transcoder, that reduce sound quality degradation even in an extreme control environment by controlling the volume of foreground object signals and background object signals for each object signal.
- An apparatus for encoding a multi-object audio signal includes a first encoder for downmixing the object signals other than foreground object signals among a plurality of input object signals to generate background object signals and an SAOC parameter, and a second encoder for downmixing the foreground object signals and the background object signals to generate a final downmix signal and an enhanced karaoke-solo (EKS) parameter.
- the apparatus may further include a multiplexer configured to multiplex the SAOC parameter and the EKS parameter to generate a SAOC bitstream.
- the first and second encoders may selectively operate according to an EKS encoding mode for controlling the foreground object signals and a classic encoding mode for controlling the background object signals.
- A method of encoding a multi-object audio signal includes generating background object signals and an SAOC parameter by downmixing the object signals other than foreground object signals among a plurality of input object signals, and downmixing the foreground object signals and the background object signals to generate a final downmix signal and an enhanced karaoke-solo (EKS) parameter.
- the method may further include generating a SAOC bitstream by multiplexing the SAOC parameter and the EKS parameter.
- An apparatus for decoding a multi-object audio signal includes a bitstream analyzer for extracting an SAOC parameter and an EKS parameter from a multiplexed spatial audio object codec (SAOC) bitstream, a first decoder for restoring foreground object signals and background object signals from a final downmix signal using the EKS parameter, a second decoder for generating a first rendering signal from the background object signals using the SAOC parameter and a rendering matrix, and a rendering unit for generating a final rendering signal using the foreground object signals and the first rendering signal.
- The rendering unit may generate the final rendering signal using the first rendering signal and a second rendering signal generated from the foreground object signals based on the rendering matrix.
- The first decoder may further include a downmix preprocessor configured to preprocess the background object signals according to the rendering matrix to generate a modified downmix signal, an SAOC transcoder configured to convert the SAOC parameter into an MPEG Surround (MPS) bitstream according to the rendering matrix, and an MPS decoder configured to generate the first rendering signal by rendering the modified downmix signal based on the MPS bitstream.
- the rendering unit may generate the final rendering signal by using the rendered modified downmix signal and the foreground object signals.
- first and second decoders may selectively operate according to an EKS decoding mode for controlling the foreground object signals and a classic decoding mode for controlling the background object signals.
- the first decoder may render the restored foreground object signals according to the rendering matrix. Then, the rendering unit may generate the final rendering signal by adding the rendered foreground object signals and the rendered background object signals.
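- A method of decoding a multi-object audio signal includes extracting an SAOC parameter and an EKS parameter from a multiplexed spatial audio object codec (SAOC) bitstream, restoring foreground object signals and background object signals from a final downmix signal using the EKS parameter, generating a first rendering signal from the background object signals using the SAOC parameter and a rendering matrix, and generating a final rendering signal using the foreground object signals and the first rendering signal.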
- The generating of the final rendering signal may generate the final rendering signal using the first rendering signal and a second rendering signal generated from the foreground object signals based on the rendering matrix.
- The generating of the first rendering signal may include preprocessing the background object signals according to the rendering matrix to generate a modified downmix signal, converting the SAOC parameter into an MPEG Surround (MPS) bitstream according to the rendering matrix, and rendering the modified downmix signal based on the MPS bitstream to generate the first rendering signal.
- the generating of the final rendering signal may generate the final rendering signal by using the rendered modified downmix signal and the foreground object signals.
- the method may further include rendering the restored foreground object signals according to the rendering matrix. Then, generating the final rendering signal may generate the final rendering signal by adding the rendered foreground object signals and the rendered background object signals.
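- An apparatus for decoding a multi-object audio signal includes a bitstream analyzer for extracting an SAOC parameter and an EKS parameter from a multiplexed spatial audio object codec (SAOC) bitstream, a first decoder for restoring foreground object signals and background object signals from a final downmix signal using the EKS parameter and rendering the restored foreground object signals according to a rendering matrix, a second decoder for rendering the background object signals using the SAOC parameter and the rendering matrix, and a rendering unit for generating a final rendering signal by adding the rendered foreground object signals and the rendered background object signals.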
- A method of decoding a multi-object audio signal includes extracting an SAOC parameter and an EKS parameter from a multiplexed spatial audio object codec (SAOC) bitstream, restoring foreground object signals and background object signals from a final downmix signal using the EKS parameter, rendering the restored foreground object signals according to a rendering matrix, rendering the background object signals using the SAOC parameter and the rendering matrix, and adding the rendered foreground object signals and the rendered background object signals to generate a final rendering signal.
- According to the present invention, for a service such as karaoke, the volume of foreground object signals and background object signals may be controlled for each object signal.
- The number of object signals to be controlled may be increased by encoding and decoding the foreground object signals and the background object signals together.
- According to the present invention, controlling the volume of the foreground object signals and the background object signals for each object signal makes it possible to reduce sound quality degradation even in an extreme control environment.
- FIG. 1 is a diagram illustrating a configuration of an apparatus for encoding a multi-object audio signal according to an embodiment of the present invention.
- FIG. 2 is a diagram provided to explain a process of encoding a multi-object audio signal according to an embodiment of the present invention.
- FIG. 3 is a block diagram of a multi-object audio signal decoding apparatus according to an embodiment of the present invention.
- FIG. 4 is a diagram provided to explain a process of decoding a multi-object audio signal according to an embodiment of the present invention.
- FIG. 5 is a diagram illustrating a configuration of a multi-object audio signal transcoder according to an embodiment of the present invention.
- FIG. 6 is a view provided to explain a process of transcoding a multi-object audio signal according to an embodiment of the present invention.
- FIG. 1 is a diagram illustrating a configuration of an apparatus for encoding a multi-object audio signal according to an embodiment of the present invention.
- FIG. 2 is a view provided to explain a process of encoding a multi-object audio signal according to an embodiment of the present invention.
- the multi-object audio signal encoding apparatus 100 may include a first encoder 110, a second encoder 120, and a multiplexer 130.
- the multi-object audio signals mean a plurality of input object signals.
- the N input object signals may include K foreground object signals (FGOs) and N-K object signals. That is, the N-K object signals are object signals except K foreground object signals among the plurality of input object signals.
- N and K are constants.
- the first encoder 110 may downmix object signals to generate background object signals (BGOs) and a spatial audio object codec (SAOC) parameter. Then, the background object signals may be input to the second encoder 120.
- the N-K object signals other than the K foreground signals among the N object signals may be input to the first encoder 110.
- the SAOC parameter is a spatial parameter of each of the N-K object signals and may include energy information and correlation information of the background object signals.
- the first encoder 110 may be defined as a Classic Mode Encoder for downmixing N-K object signals, and the Classic Mode Encoder is an encoder using only spatial parameters defined in the MPEG SAOC standard.
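The energy and correlation information mentioned above can be illustrated with a minimal sketch. It computes, for a handful of short object signals, each object's energy relative to the loudest object and a normalized correlation between every pair of objects. The function names and the full-band, single-frame computation are assumptions made for illustration; the MPEG SAOC standard defines its own parameters per time/frequency tile, which this sketch does not reproduce.

```python
import math

def object_energy(signal):
    """Sum of squared samples: the energy of one object signal."""
    return sum(x * x for x in signal)

def object_correlation(a, b):
    """Normalized cross-correlation of two signals (0.0 if either is silent)."""
    denom = math.sqrt(object_energy(a) * object_energy(b))
    if denom == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / denom

def spatial_parameters(objects):
    """Return (relative energies, correlation matrix) for a list of object signals."""
    energies = [object_energy(o) for o in objects]
    peak = max(energies) or 1.0  # avoid dividing by zero when every object is silent
    relative = [e / peak for e in energies]
    corr = [[object_correlation(a, b) for b in objects] for a in objects]
    return relative, corr

# Two hypothetical background objects, four samples each.
guitar = [1.0, -1.0, 1.0, -1.0]
bass = [0.5, 0.5, -0.5, -0.5]
relative, corr = spatial_parameters([guitar, bass])
```

Only these few numbers per object, rather than the waveforms themselves, would accompany the downmix, which is why parametric coding of this kind can work at low bit rates.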
- The foreground object signals (FGOs) are those object signals, among the plurality of input object signals, whose sound quality deteriorates sharply when they are played back alone or completely removed, and which a listener specifically wants to control.
- For example, when the final signal is karaoke, the vocal signal that is to be completely removed may be the foreground object signal.
- the second encoder 120 may downmix the foreground object signals and the background object signals to generate a final downmix signal and an enhanced Karaoke-Solo (EKS) parameter.
- The EKS parameter may include spatial cue parameters of each of the foreground object signals and the background object signals, such as energy information and similarity information, and a residual signal calculated from the final downmix signal and the foreground object signals.
- The second encoder 120 may be defined as an EKS mode encoder that downmixes the foreground object signals and the background object signals together, and the EKS mode encoder may improve the sound quality of the foreground object signals by using the residual signal coding defined in the MPEG SAOC standard.
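Why a residual signal helps can be shown with a small sketch. It assumes a deliberately simplified model, not the procedure of the MPEG SAOC standard: one foreground object and one background object are summed into a mono downmix, the encoder predicts the foreground object from the downmix with a single gain, and the prediction error is kept as the residual. The names `eks_encode` and `eks_decode` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length sample lists."""
    return sum(x * y for x, y in zip(a, b))

def eks_encode(fgo, bgo):
    """Downmix one FGO with one BGO; return (downmix, prediction gain, residual)."""
    downmix = [f + b for f, b in zip(fgo, bgo)]
    energy = dot(downmix, downmix)
    gain = dot(fgo, downmix) / energy if energy else 0.0
    # Residual: the part of the FGO that the gain-scaled downmix cannot explain.
    residual = [f - gain * d for f, d in zip(fgo, downmix)]
    return downmix, gain, residual

def eks_decode(downmix, gain, residual=None):
    """Recover (FGO, BGO); without a residual the FGO is only a scaled downmix."""
    if residual is None:
        residual = [0.0] * len(downmix)
    fgo = [gain * d + r for d, r in zip(downmix, residual)]
    bgo = [d - f for d, f in zip(downmix, fgo)]
    return fgo, bgo

vocal = [1.0, 0.0, -1.0, 0.0]
backing = [0.0, 2.0, 0.0, -2.0]
downmix, gain, residual = eks_encode(vocal, backing)
with_residual, _ = eks_decode(downmix, gain, residual)
without_residual, _ = eks_decode(downmix, gain)
```

With the residual, the vocal is recovered up to rounding error; without it, the estimate is only a scaled copy of the downmix and still contains the backing track, which is the kind of leakage that becomes audible when an object is removed or soloed.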
- the multiplexer 130 may generate a SAOC bitstream by multiplexing the SAOC parameter and the EKS parameter.
- For example, the multiplexer 130 may receive the SAOC parameter and the EKS parameter and multiplex them into an SAOC standard bitstream.
- the multiplexer 130 may transmit the generated SAOC bitstream and the final downmix signal to the multi-object audio signal decoding apparatus 300. That is, the multiplexer 130 may transmit the SAOC bitstream and the final downmix signal generated by the second encoder 120 to the multi-object audio signal decoding apparatus 300 together.
- The multi-object audio signal encoding apparatus 100 normally operates both the first encoder 110 and the second encoder 120, but it may also generate the final downmix signal using only one of the foreground object signals and the background object signals. That is, the first encoder 110 and the second encoder 120 may selectively operate according to the classic encoding mode or the EKS encoding mode.
- In the classic encoding mode, the second encoder 120 and the multiplexer 130 may be deactivated. Then, the background object signals generated by the first encoder 110 may be the final downmix signal. Accordingly, the background object signals and the SAOC parameter may be transmitted to the multi-object audio signal decoding apparatus 300.
- In the EKS encoding mode, the first encoder 110 and the multiplexer 130 may be deactivated. Then, the second encoder 120 may downmix M background object signals and K foreground object signals to generate a final downmix signal and an EKS parameter.
- the EKS parameter may include a spatial signal calculated from M background object signals and K foreground object signals, and a residual signal calculated from a downmix signal and a foreground object signal.
- the final downmix signal generated according to the EKS encoding mode and the EKS parameter may be configured as a SAOC bitstream and transmitted to the multi-object audio signal decoding apparatus 300.
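The signal flow of the encoder described above can be sketched as follows. This is an illustrative outline under stated assumptions: signals are mono lists of samples, downmixing is a plain sample-wise sum, and parameter extraction and bitstream multiplexing are left out. The function names and the `mode` values are hypothetical and are not taken from the patent or from the MPEG SAOC standard.

```python
def downmix(signals):
    """Sum equal-length object signals sample by sample into one mono signal."""
    return [sum(samples) for samples in zip(*signals)]

def encode(objects, fgo_indices, mode="both"):
    """Hypothetical two-stage encoder returning the final downmix and active stages.

    mode="classic": only the first stage runs; the BGO is the final downmix
                    (foreground objects are not encoded in this sketch).
    mode="eks":     only the second stage runs; every non-FGO object is handed
                    to it directly as a background object.
    mode="both":    the first stage builds the BGO, the second mixes it with the FGOs.
    """
    fgos = [objects[i] for i in fgo_indices]
    others = [o for i, o in enumerate(objects) if i not in fgo_indices]
    if mode == "classic":
        return {"final_downmix": downmix(others), "stages": ["first"]}
    if mode == "eks":
        return {"final_downmix": downmix(fgos + others), "stages": ["second"]}
    bgo = downmix(others)            # first encoder 110: N-K objects -> BGO
    final = downmix(fgos + [bgo])    # second encoder 120: K FGOs + BGO -> final downmix
    return {"final_downmix": final, "stages": ["first", "second"]}

# Three hypothetical objects; object 0 is the foreground object (e.g. a vocal).
objects = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
both = encode(objects, fgo_indices=[0])
classic = encode(objects, fgo_indices=[0], mode="classic")
eks = encode(objects, fgo_indices=[0], mode="eks")
```

In the cascaded case the second stage sees only K + 1 inputs, however many background objects were folded into the BGO by the first stage.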
- FIG. 3 is a block diagram of a multi-object audio signal decoding apparatus according to an embodiment of the present invention.
- FIG. 4 is a view provided to explain a process of decoding a multi-object audio signal according to an embodiment of the present invention.
- The multi-object audio signal decoding apparatus 300 may include a bitstream analyzer 310, a first decoder 320, a second decoder 330, and a renderer 340.
- the multi-object audio signal decoding apparatus 300 may receive a final downmix signal and a SAOC bitstream from the multi-object audio signal encoding apparatus 100.
- the final downmix signal may be a final downmix signal generated by the second encoder 120.
- the SAOC bitstream may be input to the bitstream analyzer 310 and the final downmix signal may be input to the first decoder 320.
- the bitstream analyzer 310 may extract the SAOC parameter and the EKS parameter from the SAOC bitstream. Then, the extracted EKS parameter may be input to the first decoder 320, and the SAOC parameter may be input to the second decoder 330.
- the bitstream analyzer 310 may parse the input SAOC bitstream to extract SAOC parameters and EKS parameters.
- Here, the SAOC parameter is a spatial parameter of each of the object signals other than the foreground object signals among the plurality of input object signals, and the EKS parameter is a spatial parameter of each of the foreground object signals.
- the first decoder 320 may restore the foreground object signals FGOs and the background object signals BGOs from the final downmix signal using the EKS parameter.
- the first decoder 320 may be defined as an EKS mode decoder.
- the restored background object signals BGOs may be input to the second decoder 330.
- The second decoder 330 may generate a first rendering signal (pre-rendered scene) from the background object signals using the SAOC parameter and the pre-stored rendering matrix.
- For example, the second decoder 330 may generate the first rendering signal by adjusting the gain of the background object signals according to a gain value included in the rendering matrix. Then, the generated first rendering signal may be input to the renderer 340.
- the renderer 340 may render the foreground object signals FGOs restored by the first decoder 320 to generate a second rendering signal.
- For example, the renderer 340 may generate the second rendering signal by adjusting the gains of the restored foreground object signals according to a gain value included in the rendering matrix.
- The renderer 340 may generate a final rendering signal by adding the first rendering signal and the second rendering signal. Then, the generated final rendering signal can be reproduced through sound equipment such as a speaker.
- The multi-object audio signal decoding apparatus 300 normally operates both the first decoder 320 and the second decoder 330, but it may also generate the final rendering signal using only one of the restored foreground object signals and the restored background object signals. That is, the first decoder 320 and the second decoder 330 may selectively operate according to the classic decoding mode or the EKS decoding mode.
- In the classic decoding mode, the first decoder 320 and the renderer 340 may be deactivated. Then, the final downmix signal transmitted from the multi-object audio signal encoding apparatus 100 may be directly input to the second decoder 330.
- the final downmix signal may be background object signals BGOs generated by the first encoder 110.
- the second decoder 330 may generate a final rendered signal from the background object signals BGOs using the SAOC parameter and the rendering matrix. For example, the second decoder 330 may generate the final rendered signal by adjusting the gain of the background object signals according to the gain value included in the rendering matrix based on the SAOC parameter.
- When operating in the EKS decoding mode, the second decoder 330 may be deactivated.
- that the second decoder 330 does not operate means that the SAOC parameter does not exist in the SAOC bitstream, and the SAOC bitstream includes only the EKS parameter.
- the foreground object signals FGOs and the background object signals BGOs restored by the first decoder 320 may be directly input to the renderer 340.
- the rendering matrix may be directly input to the rendering unit 340.
- The renderer 340 may generate the final rendering signal from the restored foreground object signals FGOs and the restored background object signals BGOs using the pre-stored rendering matrix. For example, the renderer 340 may generate the final rendering signal by adjusting the gain of the background object signals according to a gain value included in the rendering matrix.
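The gain-and-sum rendering described for the decoder can be sketched as follows. The sketch assumes a single output channel, so the rendering matrix reduces to one gain per object; `apply_gains` and `render_final` are hypothetical names, and the SAOC parameter processing performed by the second decoder is not modelled.

```python
def apply_gains(signals, gains, length):
    """Scale each signal by its gain and sum the results into one output channel."""
    out = [0.0] * length
    for signal, gain in zip(signals, gains):
        for n, sample in enumerate(signal):
            out[n] += gain * sample
    return out

def render_final(fgos, bgo, fgo_gains, bgo_gain):
    """Add the rendered background (first) and foreground (second) signals."""
    length = len(bgo)
    first = apply_gains([bgo], [bgo_gain], length)   # first rendering signal
    second = apply_gains(fgos, fgo_gains, length)    # second rendering signal
    return [a + b for a, b in zip(first, second)]

vocal = [1.0, -1.0, 1.0]
backing = [0.5, 0.5, 0.5]
# Karaoke: the vocal (foreground object) is removed entirely.
karaoke = render_final([vocal], backing, fgo_gains=[0.0], bgo_gain=1.0)
# Solo: only the vocal is kept.
solo = render_final([vocal], backing, fgo_gains=[1.0], bgo_gain=0.0)
```

The two calls correspond to the extreme control cases discussed in the background: setting the foreground gain to zero gives karaoke playback, and setting the background gain to zero gives a solo vocal.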
- FIG. 5 is a diagram illustrating a configuration of a multi-object audio signal transcoder according to an embodiment of the present invention.
- FIG. 6 is a view provided to explain a process of transcoding a multi-object audio signal according to an embodiment of the present invention.
- the SAOC transcoder 500 may include a bitstream analyzer 510, a first decoder 520, a second decoder 530, and a renderer 540.
- The bitstream analyzer 510, the first decoder 520, and the renderer 540 are the same as the corresponding components of FIG. 3, and steps S610 to S630 of FIG. 6 are the same as steps S410 to S430 of FIG. 4, so duplicate descriptions are omitted. The multi-object audio signal transcoder 500 differs from the multi-object audio signal decoding apparatus 300 of FIG. 3 in the configuration of the second decoder 530.
- the second decoder 530 may include a downmix preprocessor 531, a transcoder 532, and an MPS decoder 533.
- The downmix preprocessor 531 may preprocess the restored background object signals (BGOs) to generate a modified downmix signal.
- For example, the downmix preprocessor 531 may preprocess the restored background object signals according to the pre-stored rendering matrix.
- the same process as the downmix preprocessing process defined in the MPEG SAOC standard may be used as the preprocessing process according to the rendering matrix.
- the transcoder 532 may convert the SAOC parameter into an MPS (MPEG Surround) bitstream.
- transcoder 532 may convert the SAOC parameters into MPS bitstreams according to a pre-stored rendering matrix. In this case, the same conversion process as defined in the MPEG SAOC standard may be used as the conversion process.
- The MPS decoder 533 may generate a first rendering signal (pre-rendered scene) by rendering the modified downmix signal based on the converted MPS bitstream. Then, the generated first rendering signal may be input to the renderer 540. In this case, the MPS decoder 533 may render the modified downmix signal to multiple channels. That is, the MPS decoder 533 may generate a multi-channel first rendering signal.
- the renderer 540 may generate a second rendering signal from the restored foreground object signals based on the pre-stored rendering matrix.
- the rendering unit 540 may generate the second rendering signal by adjusting the gain of the restored foreground object signals according to the gain value included in the rendering matrix.
- the rendering unit 540 may generate a final rendered signal by adding the generated first rendering signal and the second rendering signal.
- Here, the first rendering signal is the rendered modified downmix signal.
- the generated final rendered signal may be reproduced through sound equipment such as a speaker.
- a frequency / time conversion process is required to generate a final rendering signal, and this frequency / time conversion process may be selectively performed by the MPS decoder 533 and the rendering unit 540.
- For example, the MPS decoder 533 may convert the rendered modified downmix signal (pre-rendered scene) from the frequency domain to the time domain.
- the renderer 540 may convert the restored foreground object signals FGOs from the frequency domain to the time domain.
- The multi-object audio signal transcoder 500 normally operates both the first decoder 520 and the second decoder 530, but it may also generate the final rendering signal using only one of the restored foreground object signals and the restored background object signals.
- the first decoder 520 and the second decoder 530 may selectively operate according to the classic decoding mode or the EKS decoding mode.
- the process of generating the final rendering signal according to the classic mode and the EKS mode is the same as in FIGS.
- In the description above, the renderers 340 and 540 render the restored foreground object signals, but the first decoders 320 and 520 may instead generate the second rendering signal by rendering the restored foreground object signals. That is, the rendering process described with reference to FIGS. 3 and 5 may be performed according to the same process as the rendering defined in the SAOC standard.
- For example, the first decoders 320 and 520 may generate the second rendering signal by adjusting the gains of the restored foreground object signals according to the gains included in the rendering matrix. Then, the renderers 340 and 540 may generate the final rendering signal by adding the second rendering signal and the first rendering signal generated by the second decoders 330 and 530. That is, referring to the dotted line, the rendering matrix may not be input to the renderers 340 and 540.
- The first encoder 110 and the second encoder 120 may operate sequentially.
- the maximum number of foreground object signals input to the second encoder 120 may be limited to four or two or less.
- the maximum number is limited to four.
- The maximum number may be limited to two, that is, four channels.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Stereophonic System (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Claims (20)
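- An encoding apparatus comprising: a first encoder configured to downmix the object signals other than foreground object signals among a plurality of input object signals to generate background object signals and an SAOC parameter; and a second encoder configured to downmix the foreground object signals and the background objects to generate a final downmix signal and an EKS (Enhanced Karaoke-Solo) parameter.
- The encoding apparatus of claim 1, further comprising a multiplexer configured to multiplex the SAOC parameter and the EKS parameter to generate an SAOC bitstream.
- The encoding apparatus of claim 1, wherein the first and second encoders selectively operate according to an EKS encoding mode for controlling the foreground object signals and a classic encoding mode for controlling the background object signals.
- An encoding method comprising: downmixing the object signals other than foreground object signals among a plurality of input object signals to generate background object signals and an SAOC parameter; and downmixing the foreground object signals and the background objects to generate a final downmix signal and an EKS (Enhanced Karaoke-Solo) parameter.
- The encoding method of claim 4, further comprising multiplexing the SAOC parameter and the EKS parameter to generate an SAOC bitstream.
- A decoding apparatus comprising: a bitstream analyzer configured to extract an SAOC parameter and an EKS parameter from a multiplexed SAOC (Spatial Audio Object Codec) bitstream; a first decoder configured to restore foreground object signals and background object signals from a final downmix signal using the EKS parameter; a second decoder configured to generate a first rendering signal from the background object signals using the SAOC parameter and a rendering matrix; and a rendering unit configured to generate a final rendering signal using the foreground object signals and the first rendering signal.
- The decoding apparatus of claim 6, wherein the rendering unit generates the final rendering signal using the first rendering signal and a second rendering signal generated from the foreground object signals based on the rendering matrix.
- The decoding apparatus of claim 7, wherein the rendering unit generates the first rendering signal by adjusting the gain of the background object signals according to a gain value included in the rendering matrix, and generates the second rendering signal by adjusting the gain of the foreground object signals according to a gain value included in the rendering matrix.
- The multi-object audio signal decoding apparatus of claim 6, wherein the first decoder comprises: a downmix preprocessor configured to preprocess the background object signals according to the rendering matrix to generate a modified downmix signal; an SAOC transcoder configured to convert the SAOC parameter into an MPS (MPEG Surround) bitstream according to the rendering matrix; and an MPS decoder configured to render the modified downmix signal based on the MPS bitstream to generate the first rendering signal.
- The decoding apparatus of claim 9, wherein the rendering unit generates the final rendering signal using the rendered modified downmix signal and the foreground object signals.
- The decoding apparatus of claim 6, wherein the first and second decoders selectively operate according to an EKS decoding mode for controlling the foreground object signals and a classic decoding mode for controlling the background object signals.
- The decoding apparatus of claim 6, wherein the first decoder renders the restored foreground object signals according to the rendering matrix, and the rendering unit generates the final rendering signal by adding the rendered foreground object signals and the rendered background object signals.
- A decoding method comprising: extracting an SAOC parameter and an EKS parameter from a multiplexed SAOC (Spatial Audio Object Codec) bitstream; restoring foreground object signals and background object signals from a final downmix signal using the EKS parameter; generating a first rendering signal from the background object signals using the SAOC parameter and a rendering matrix; and generating a final rendering signal using the foreground object signals and the first rendering signal.
- The decoding method of claim 13, wherein the generating of the final rendering signal generates the final rendering signal using the first rendering signal and a second rendering signal generated from the foreground object signals based on the rendering matrix.
- The decoding method of claim 14, wherein the generating of the first rendering signal generates the first rendering signal by adjusting the gain of the background object signals according to a gain value included in the rendering matrix, and the generating of the final rendering signal generates the second rendering signal by adjusting the gain of the foreground object signals according to a gain value included in the rendering matrix.
- The multi-object audio signal decoding method of claim 13, wherein the generating of the first rendering signal comprises: preprocessing the background object signals according to the rendering matrix to generate a modified downmix signal; converting the SAOC parameter into an MPS (MPEG Surround) bitstream according to the rendering matrix; and rendering the modified downmix signal based on the MPS bitstream to generate the first rendering signal.
- The decoding method of claim 16, wherein the generating of the final rendering signal generates the final rendering signal using the rendered modified downmix signal and the foreground object signals.
- The decoding method of claim 12, further comprising rendering the restored foreground object signals according to the rendering matrix, wherein the generating of the final rendering signal generates the final rendering signal by adding the rendered foreground object signals and the rendered background object signals.
- A decoding apparatus comprising: a bitstream analyzer configured to extract an SAOC parameter and an EKS parameter from a multiplexed SAOC (Spatial Audio Object Codec) bitstream; a first decoder configured to restore foreground object signals and background object signals from a final downmix signal using the EKS parameter and to render the restored foreground object signals according to a rendering matrix; a second decoder configured to render the background object signals using the SAOC parameter and the rendering matrix; and a rendering unit configured to generate a final rendering signal by adding the rendered foreground object signals and the rendered background object signals.
- A decoding method comprising: extracting an SAOC parameter and an EKS parameter from a multiplexed SAOC (Spatial Audio Object Codec) bitstream; restoring foreground object signals and background object signals from a final downmix signal using the EKS parameter; rendering the restored foreground object signals according to a rendering matrix; rendering the background object signals using the SAOC parameter and the rendering matrix; and generating a final rendering signal by adding the rendered foreground object signals and the rendered background object signals.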
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201080025528.4A CN102460571B (zh) | 2009-06-10 | 2010-06-10 | Encoding and decoding method and apparatus, and transcoding method and transcoder, for multi-audio-object signals |
US13/377,334 US8712784B2 (en) | 2009-06-10 | 2010-06-10 | Encoding method and encoding device, decoding method and decoding device and transcoding method and transcoder for multi-object audio signals |
EP10786390A EP2442303A4 (en) | 2009-06-10 | 2010-06-10 | ENCODING METHOD AND DEVICE, DECODING METHOD AND DEVICE, AND TRANSCODING METHOD AND TRANSCODER FOR MULTI-OBJECT AUDIO SIGNALS |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR10-2009-0051378 | 2009-06-10 | ||
KR20090051378 | 2009-06-10 | ||
KR20090055756 | 2009-06-23 | ||
KR10-2009-0055756 | 2009-06-23 | ||
KR1020100053549A KR101387902B1 (ko) | 2009-06-10 | 2010-06-07 | Encoding method and encoding device, decoding method and decoding device, and transcoding method and transcoder for multi-object audio signals |
KR10-2010-0053549 | 2010-06-07 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2010143907A2 | 2010-12-16 |
WO2010143907A3 | 2011-03-03 |
Family
ID=43508441
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2010/003752 WO2010143907A2 (ko) | Encoding method and encoding device, decoding method and decoding device, and transcoding method and transcoder for multi-object audio signals | 2009-06-10 | 2010-06-10 |
Country Status (5)
Country | Link |
---|---|
US (1) | US8712784B2 (ko) |
EP (1) | EP2442303A4 (ko) |
KR (1) | KR101387902B1 (ko) |
CN (1) | CN102460571B (ko) |
WO (1) | WO2010143907A2 (ko) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109887517A (zh) * | 2013-05-24 | 2019-06-14 | Dolby International AB | Method for decoding an audio scene, decoder, and computer-readable medium |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2014007095A1 (ja) | 2012-07-02 | 2014-01-09 | Sony Corporation | Decoding device and method, encoding device and method, and program |
WO2014007096A1 (ja) | 2012-07-02 | 2014-01-09 | Sony Corporation | Decoding device and method, encoding device and method, and program |
TWI517142B (zh) | 2012-07-02 | 2016-01-11 | Sony Corp | Audio decoding apparatus and method, audio coding apparatus and method, and program |
CN103765508B (zh) | 2012-07-02 | 2017-11-24 | Sony Corporation | Decoding device, decoding method, encoding device, and encoding method |
RU2643644C2 (ru) * | 2012-07-09 | 2018-02-02 | Koninklijke Philips N.V. | Encoding and decoding of audio signals |
EP2690621A1 (en) * | 2012-07-26 | 2014-01-29 | Thomson Licensing | Method and Apparatus for downmixing MPEG SAOC-like encoded audio signals at receiver side in a manner different from the manner of downmixing at encoder side |
JP6230268B2 (ja) * | 2013-05-23 | 2017-11-15 | Canon Inc. | Image processing apparatus, image processing method, and program |
EP2830046A1 (en) | 2013-07-22 | 2015-01-28 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for decoding an encoded audio signal to obtain modified output signals |
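KR102243395B1 (ko) * | 2013-09-05 | 2021-04-22 | Electronics and Telecommunications Research Institute | Audio encoding apparatus and method, audio decoding apparatus and method, and audio reproduction apparatus |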
EP2879131A1 (en) | 2013-11-27 | 2015-06-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder, encoder and method for informed loudness estimation in object-based audio coding systems |
KR101536855B1 (ko) * | 2014-01-23 | 2015-07-14 | Center for Integrated Smart Sensors Foundation | Encoding apparatus and method using residual coding |
KR101567665B1 (ko) | 2014-01-23 | 2015-11-10 | Center for Integrated Smart Sensors Foundation | Personal audio studio system |
WO2015111949A1 (ko) * | 2014-01-23 | 2015-07-30 | Center for Integrated Smart Sensors Foundation | Encoding apparatus and decoding apparatus for vocal harmonic coding, and methods thereof |
EP2928216A1 (en) | 2014-03-26 | 2015-10-07 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for screen related audio object remapping |
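CN107211227B (zh) | 2015-02-06 | 2020-07-07 | Dolby Laboratories Licensing Corporation | Hybrid, priority-based rendering system and method for adaptive audio |
CN106303897A (zh) | 2015-06-01 | 2017-01-04 | Dolby Laboratories Licensing Corporation | Processing object-based audio signals |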
US11430451B2 (en) * | 2019-09-26 | 2022-08-30 | Apple Inc. | Layered coding of audio with discrete objects |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101361118B (zh) * | 2006-01-19 | 2011-07-27 | LG Electronics Inc. | Method and apparatus for processing a media signal |
BRPI0802613A2 (pt) * | 2007-02-14 | 2011-08-30 | Lg Electronics Inc | Methods and apparatuses for encoding and decoding object-based audio signals |
JP5260665B2 (ja) | 2007-10-17 | 2013-08-14 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Audio coding using downmix |
US20100228554A1 (en) * | 2007-10-22 | 2010-09-09 | Electronics And Telecommunications Research Institute | Multi-object audio encoding and decoding method and apparatus thereof |
- 2010
  - 2010-06-07 KR KR1020100053549A patent/KR101387902B1/ko active IP Right Grant
  - 2010-06-10 EP EP10786390A patent/EP2442303A4/en not_active Ceased
  - 2010-06-10 WO PCT/KR2010/003752 patent/WO2010143907A2/ko active Application Filing
  - 2010-06-10 US US13/377,334 patent/US8712784B2/en not_active Expired - Fee Related
  - 2010-06-10 CN CN201080025528.4A patent/CN102460571B/zh not_active Expired - Fee Related
Non-Patent Citations (2)
Title |
---|
None |
See also references of EP2442303A4 |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
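CN109887517A (zh) * | 2013-05-24 | 2019-06-14 | Dolby International AB | Method for decoding an audio scene, decoder, and computer-readable medium |
CN109887517B (zh) * | 2013-05-24 | 2023-05-23 | Dolby International AB | Method for decoding an audio scene, decoder, and computer-readable medium |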
US11682403B2 (en) | 2013-05-24 | 2023-06-20 | Dolby International Ab | Decoding of audio scenes |
Also Published As
Publication number | Publication date |
---|---|
CN102460571A (zh) | 2012-05-16 |
EP2442303A4 (en) | 2012-11-28 |
KR101387902B1 (ko) | 2014-04-22 |
EP2442303A2 (en) | 2012-04-18 |
US20120078642A1 (en) | 2012-03-29 |
CN102460571B (zh) | 2015-05-13 |
KR20100132913A (ko) | 2010-12-20 |
WO2010143907A3 (ko) | 2011-03-03 |
US8712784B2 (en) | 2014-04-29 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
WO2010143907A2 (ko) | Encoding method and encoding device, decoding method and decoding device, and transcoding method and transcoder for multi-object audio signals | |
WO2009123409A2 (ko) | Method and apparatus for generating a side-information bitstream for a multi-object audio signal | |
WO2015105393A1 (ko) | Method and apparatus for reproducing three-dimensional audio | |
WO2014021588A1 (ko) | Method and apparatus for processing an audio signal | |
EP2695162B1 (en) | Audio encoding method and system for generating a unified bitstream decodable by decoders implementing different decoding protocols | |
JP2013127634A (ja) | Encoding device | |
US20050273322A1 (en) | Audio signal encoding and decoding apparatus | |
WO2009134085A2 (ko) | Method and apparatus for transmitting and receiving multichannel audio signals using a superframe | |
WO2014021587A1 (ko) | Apparatus and method for processing an audio signal | |
WO2021118107A1 (en) | Audio output apparatus and method of controlling thereof | |
WO2017014366A1 (ko) | Encoding and transcoding apparatus applying a multi-format ultra-high-definition high-efficiency codec | |
WO2019054559A1 (ko) | Audio encoding method applying BRIR/RIR parameterization, and audio reproduction method and apparatus using parameterized BRIR/RIR information | |
WO2012050382A2 (en) | Method and apparatus for downmixing multi-channel audio signals | |
WO2014021586A1 (ko) | Method and apparatus for processing an audio signal | |
KR102370672B1 (ko) | Method and apparatus for providing audio data, method and apparatus for providing audio metadata, and method and apparatus for reproducing audio data | |
US9312971B2 (en) | Apparatus and method for transmitting audio object | |
WO2012087042A2 (ko) | Broadcast transmitting apparatus and method for providing object-based audio, and broadcast reproducing apparatus and method | |
WO2014058275A1 (ko) | Apparatus and method for generating audio data, and apparatus and method for reproducing audio data | |
WO2013073810A1 (ko) | Encoding apparatus and decoding apparatus supporting scalable multichannel audio signals, and methods performed by the apparatuses | |
US12014709B2 (en) | Transmission device, transmission method, reception device and reception method | |
WO2011122731A1 (ko) | Method and apparatus for downmixing multichannel audio | |
JP3594029B2 (ja) | Data transmission method and apparatus | |
US20040030561A1 (en) | Method and apparatus for digital signal communication between computer-based multi-channel audio controller and surround sound systems | |
KR100329830B1 (ko) | AC-3 audio receiving apparatus for a digital television system and processing method thereof | |
WO2018021605A1 (ko) | Apparatus and method for producing integrated 4K UHD content | |
Legal Events
Code | Title | Description |
---|---|---|
WWE | WIPO information: entry into national phase | Ref document number: 201080025528.4; Country of ref document: CN |
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 10786390; Country of ref document: EP; Kind code of ref document: A2 |
WWE | WIPO information: entry into national phase | Ref document number: 13377334; Country of ref document: US |
NENP | Non-entry into the national phase | Ref country code: DE |
REEP | Request for entry into the European phase | Ref document number: 2010786390; Country of ref document: EP |
WWE | WIPO information: entry into national phase | Ref document number: 2010786390; Country of ref document: EP |