KR20150005431A - Method for converting sound source position information and apparatus thereof - Google Patents

Method for converting sound source position information and apparatus thereof

Info

Publication number
KR20150005431A
KR20150005431A
Authority
KR
South Korea
Prior art keywords
rendering
sound source
object sound
unit
audio signal
Prior art date
Application number
KR1020140060330A
Other languages
Korean (ko)
Inventor
서정일
강경옥
박태진
유재현
이용주
Original Assignee
한국전자통신연구원
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한국전자통신연구원 filed Critical 한국전자통신연구원
Publication of KR20150005431A publication Critical patent/KR20150005431A/en

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic
    • H04S3/008 Systems employing more than two channels, e.g. quadraphonic, in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/03 Application of parametric coding in stereophonic audio systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Stereophonic System (AREA)

Abstract

An apparatus for converting the location information of a sound source is disclosed. The apparatus includes: a playback position processing unit configured to process location information according to obtained location properties of an object sound source; a rendering parameter calculating unit configured to calculate rendering parameters based on the processed location information and an audio output environment; and a rendering unit configured to generate a multi-channel audio signal by applying the rendering parameters to an audio signal of the object sound source. Therefore, fixed object sound sources and moving object sound sources may be played at high speed at a desired location on a PC, a digital broadcasting terminal, a DVD/Blu-ray player, and a mobile terminal.

Description

Field of the Invention [0001]

The following embodiments relate to a method and apparatus for converting location information of a sound source. More particularly, the present invention relates to a method and apparatus for converting position information of a sound source for high-speed processing of a moving sound source.

Recent developments in high-definition broadcasting technology, such as Ultra High Definition Television (UHDTV), have led to the emergence of 10.2-channel and 22.2-channel audio playback formats that extend the existing 5.1-channel speaker layout. Accordingly, attention has focused on the three-dimensional sound-field reconstruction techniques required to produce high-order multichannel audio content.

In addition, there is growing interest in technology that renders an object sound source, delivered together with a conventional channel-based multi-channel audio signal, in the terminal using metadata containing its position information, so that the sound scene intended by the content creator is expressed faithfully and optimally for the reproduction environment of the terminal.

To represent object sound sources at desired positions in a two- or three-dimensional speaker output environment, techniques such as V. Pulkki's Vector Base Amplitude Panning (VBAP), Wave Field Synthesis (WFS), Higher-Order Ambisonics (HOA), the Least Squares Method (LSM), and the Velocity Control Method (VCM) may be used. These techniques use the output speaker positions and the position information of the sound source to calculate the matrix or filter needed to convert the input object sound source into output speaker signals, and apply the calculated matrix or filter during rendering. If the object sound source does not move, the matrix or filter can be computed in advance, so high-speed rendering poses no problem. When the object sound source moves, however, the matrix or filter must be updated, and if this update requires a large amount of computation, real-time implementation can become problematic.
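The patent does not describe the internals of these panning techniques, but the matrix-inversion step they share can be illustrated with a minimal two-dimensional amplitude-panning sketch in the spirit of Pulkki's VBAP. The function name and the constant-power normalization are illustrative choices, not taken from the patent:

```python
import math

def vbap_2d_gains(src_az_deg, spk1_az_deg, spk2_az_deg):
    """Pairwise 2-D VBAP sketch: solve g1*l1 + g2*l2 = p for the
    source direction p between two speaker unit vectors l1 and l2,
    then normalize so that g1^2 + g2^2 = 1 (constant power)."""
    def unit(az_deg):
        a = math.radians(az_deg)
        return (math.cos(a), math.sin(a))

    p = unit(src_az_deg)
    l1, l2 = unit(spk1_az_deg), unit(spk2_az_deg)
    # Invert the 2x2 matrix whose columns are l1 and l2.
    det = l1[0] * l2[1] - l2[0] * l1[1]
    g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
    g2 = (l1[0] * p[1] - l1[1] * p[0]) / det
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm
```

For a source midway between speakers at -30 and +30 degrees, both gains come out equal, which is the expected symmetric panning behavior.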

In addition, at the boundary where the matrix or filter is updated, the characteristics of the previous matrix or filter and those of the updated matrix or filter may differ on the time axis or the frequency axis, which can produce audible discontinuities.

Embodiments of the present invention provide apparatuses and methods that reduce the amount of computation required to render an object sound source by converting the position information of the object sound source into sampled values in the sound reproduction space.

In one aspect, an object sound source rendering apparatus includes: a playback position processing unit that processes position information according to a position attribute of an obtained object sound source; a rendering parameter calculation unit that calculates a rendering parameter based on the processed position information and the audio output environment; and a rendering unit that generates a multi-channel audio signal by applying the rendering parameter to the audio signal of the object sound source.

According to the present invention, it is possible to reproduce a fixed object sound source and a moving object sound source at a desired position at a high speed in a PC, a digital broadcasting terminal, a DVD / Blu-ray player and a mobile terminal.

FIG. 1 is a block diagram illustrating an object sound source rendering apparatus for generating a multi-channel signal using position information.
FIG. 2 is a block diagram illustrating an object sound source rendering apparatus according to an embodiment of the present invention.
FIG. 3 is a block diagram illustrating another embodiment of an object sound source rendering apparatus according to the present invention.
FIG. 4 is a block diagram illustrating an object sound source rendering apparatus according to the present invention using a rendering parameter storage unit.
FIG. 5 is a block diagram illustrating another embodiment of an object sound source rendering apparatus according to the present invention using a rendering parameter storage unit.
FIG. 6 is a diagram showing one embodiment of sampling.
FIG. 7 is a flowchart illustrating an object sound source rendering method according to an embodiment of the present invention.
FIG. 8 is a flowchart illustrating an embodiment of an object sound source rendering method according to the present invention using stored rendering parameters.

Hereinafter, embodiments will be described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram illustrating an object sound source rendering apparatus for generating a multi-channel signal using position information.

Referring to FIG. 1, an object sound source rendering apparatus 100 includes a rendering parameter calculation unit 110 and a rendering unit 120.

The rendering parameter calculation unit 110 uses the input object sound source position information 102 and the audio output environment (the number of channels and the position of each channel) to calculate the rendering parameters needed to apply an object sound source rendering algorithm such as Vector Base Amplitude Panning (VBAP).

The rendering unit 120 generates an output multi-channel (M-channel) audio signal 104 by applying the rendering parameters calculated by the rendering parameter calculation unit 110 to the object sound source audio signal 103. When there are two or more input object sound source audio signals, the final output multi-channel audio signal can be obtained by superimposing, for each channel, the output values of the individual object sound source audio signals.
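The per-channel superposition of multiple rendered object sources can be sketched as follows. This is only an illustration using plain Python lists of channel buffers; the function name is not from the patent:

```python
def mix_objects(rendered_signals):
    """Sum the per-channel outputs of each rendered object source
    into one final M-channel signal.  Each element of
    rendered_signals is a list of M equal-length channel buffers."""
    num_ch = len(rendered_signals[0])
    num_samples = len(rendered_signals[0][0])
    out = [[0.0] * num_samples for _ in range(num_ch)]
    for sig in rendered_signals:
        for ch in range(num_ch):
            for n in range(num_samples):
                out[ch][n] += sig[ch][n]  # superpose per channel
    return out
```

In a real renderer this loop would operate on audio frames (e.g. NumPy arrays), but the channel-wise summation is the same.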

The rendering parameter calculated by the rendering parameter calculation unit 110 does not need to be updated when the position of the object sound source is fixed. When the object sound source moves, the input position information of the sound source must change to express that movement, and the rendering parameter calculation unit 110 updates the rendering parameter every time the position information of the object sound source changes.
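The update-only-on-change behavior described above amounts to caching the last computed parameters, keyed by the last seen position. A minimal sketch follows; the class and attribute names are illustrative, not from the patent:

```python
class RenderingParameterCalculator:
    """Recompute rendering parameters only when the (possibly
    sampled) position actually changes; otherwise reuse the
    previously calculated result."""

    def __init__(self, compute_fn):
        self.compute_fn = compute_fn  # position -> rendering parameters
        self.last_pos = None
        self.last_params = None
        self.recomputations = 0       # for illustration/inspection

    def get(self, pos):
        if pos != self.last_pos:
            self.last_params = self.compute_fn(pos)
            self.last_pos = pos
            self.recomputations += 1
        return self.last_params
```

Combined with position sampling, this is what cuts the computation for a moving source: many raw positions map to the same sampled position, so `compute_fn` runs far less often.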

FIG. 2 is a block diagram illustrating an object sound source rendering apparatus according to an embodiment of the present invention.

Referring to FIG. 2, the object sound source rendering apparatus 200 includes a playback position processing unit 210, a rendering parameter calculation unit 220, a rendering unit 230, and a boundary region processing unit 240.

If the object sound source to be processed is a stationary sound source (one with a fixed position), the playback position processing unit 210 transmits the position information 202 of the object sound source to the rendering parameter calculation unit 220 without sampling. If the object sound source to be processed is a moving sound source, the playback position processing unit 210 samples the input position information 202 and transmits the sampled position information to the rendering parameter calculation unit 220.

Equation 1 shows, as an example of the sampling method, how the horizontal angle of the object sound source position information is sampled in units of two degrees.

[Equation 1]

$$\hat{\theta} = 2 \cdot \mathrm{round}\!\left(\frac{\theta}{2}\right)$$

where $\theta$ is the horizontal angle (in degrees) of the input object sound source position information, $\hat{\theta}$ is the sampled horizontal angle, and $\mathrm{round}(\cdot)$ denotes the rounding operation.
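Equation 1 translates directly into code. Here conventional round-half-up is implemented via `floor(x + 0.5)` to avoid Python's banker's rounding; the function name is illustrative:

```python
import math

def sample_azimuth(theta_deg, step=2.0):
    """Sample a horizontal angle onto a grid of `step` degrees:
    theta_s = step * round(theta / step), with round-half-up."""
    return step * math.floor(theta_deg / step + 0.5)
```

For example, 13.2 degrees maps to 14.0 and 44.9 degrees maps to 44.0 on the 2-degree grid.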

The rendering parameter calculation unit 220 compares the position information transmitted from the reproduction position processing unit 210 with previously received position information. When the position information is changed, the rendering parameter calculation unit 220 recalculates the rendering parameters and transmits the calculated rendering parameters to the rendering unit 230. If the position information is not changed, the rendering parameter calculation unit 220 delivers the previously calculated rendering parameter to the rendering unit 230.

The rendering parameter calculation unit 220 uses the audio output environment (the number of channels and the position of each channel) 203 input through a user interface or the like when calculating rendering parameters.

The rendering unit 230 applies the rendering parameters calculated by the rendering parameter calculation unit 220 to the object sound source audio signal 204 to generate an output multi-channel (M-channel) audio signal 205.

The boundary area processing unit 240 can perform processing to prevent discontinuous characteristics from occurring in the boundary region where the rendering parameters change. As shown in FIG. 2, this processing may be applied to the multi-channel audio signal 205 generated by the rendering unit 230; alternatively, as shown in FIG. 3, it may be applied to the rendering parameters calculated by the rendering parameter calculation unit 320.

When there are two or more input object sound source audio signals, the final output multi-channel audio signal can be obtained by superimposing the output values of the respective object sound source audio signals.
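One common way to suppress discontinuities at a parameter-update boundary is a short linear crossfade between the block rendered with the old parameters and the block rendered with the new ones. The patent does not prescribe a specific boundary-processing method, so this is only an illustrative sketch:

```python
def crossfade(old_block, new_block):
    """Linear crossfade across a parameter-update boundary:
    ramp the old rendering out while ramping the new one in.
    Both blocks must have the same length (>= 2 samples)."""
    n = len(old_block)
    return [((n - 1 - i) * old_block[i] + i * new_block[i]) / (n - 1)
            for i in range(n)]
```

The first output sample equals the old rendering and the last equals the new one, so consecutive blocks join without a step discontinuity.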

FIG. 3 is a block diagram illustrating another embodiment of an object sound source rendering apparatus according to the present invention.

Referring to FIG. 3, the object sound source rendering apparatus 300 includes a playback position processing unit 310, a rendering parameter calculation unit 320, a boundary region processing unit 330, and a rendering unit 340.

Unlike the object sound source rendering apparatus 200 shown in FIG. 2, in the object sound source rendering apparatus 300 shown in FIG. 3 the boundary region processing unit 330 receives its input from the rendering parameter calculation unit 320.

The boundary area processing unit 330 may perform processing for preventing discontinuous characteristics from occurring in the boundary area where the rendering parameters are changed. In particular, the boundary region processing unit 330 may perform processing on the rendering parameters calculated by the rendering parameter calculation unit 320 to prevent discontinuous characteristics from occurring in the boundary region.

The contents described in FIG. 2 may be directly applied to the operations of the reproduction position processing unit 310, the rendering parameter calculation unit 320, and the rendering unit 340.

FIG. 4 is a block diagram illustrating an object sound source rendering apparatus according to the present invention using a rendering parameter storage unit.

Referring to FIG. 4, the object sound source rendering apparatus 400 includes a rendering parameter calculation unit 410, a rendering parameter storage unit 420, a playback position processing unit 430, a rendering parameter determination unit 440, a rendering unit 450, and a boundary area processing unit 460.

The rendering parameter calculation unit 410 calculates, in advance, rendering parameters for each rendering position of the object sound source using the audio output environment (the number of channels and the position of each channel) 401 input through a user interface or the like, and stores them in the rendering parameter storage unit 420. The rendering parameter calculation unit 410 may calculate rendering parameters only for the representative values according to the sampling, and the rendering parameter storage unit 420 may store rendering parameters only for those representative values. The details of the sampling are described later with reference to FIG. 6.
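Precomputing rendering parameters only at the sampled representative positions amounts to building a lookup table over the grid described later with reference to FIG. 6 (2-degree horizontal intervals, 5-degree elevation intervals). The function name and the dictionary representation are illustrative assumptions:

```python
def build_parameter_table(compute_fn, az_step=2, el_step=5):
    """Precompute rendering parameters once per sampled grid point
    (representative value), so that a moving source later needs
    only a table lookup instead of a full recomputation."""
    table = {}
    for az in range(-180, 181, az_step):      # -180 .. +180 degrees
        for el in range(-90, 91, el_step):    # -90 .. +90 degrees
            table[(az, el)] = compute_fn(az, el)
    return table
```

With the default steps this yields 181 x 37 = 6697 entries, each computed exactly once, regardless of how fast the source moves afterward.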

The representative value information may be transmitted to the playback position processing unit 430, which may sample the object sound source position information 402 in the same manner as the rendering parameter calculation unit 410. The playback position processing unit 430 may sample the position information 402 of the object sound source according to the sampling resolution transmitted from the rendering parameter calculation unit 410 and transmit the sampled information to the rendering parameter determination unit 440.

The rendering parameter determination unit 440 reads the rendering parameter corresponding to the sampled object sound source position received from the playback position processing unit 430 from the rendering parameter storage unit 420 and transmits the read rendering parameter to the rendering unit 450.

The rendering unit 450 may generate the output multi-channel (M-channel) audio signal 404 by applying the received rendering parameters to the audio signal 403 of the input object sound source.

The boundary area processing unit 460 can perform processing to prevent discontinuous characteristics from occurring in the boundary region where the rendering parameters change. As shown in FIG. 4, this processing may be applied to the multi-channel audio signal generated by the rendering unit 450; alternatively, as shown in FIG. 5, it may be applied to the rendering parameters read by the rendering parameter determination unit 540.

When the input object sound source audio signal is two or more, the superposition value superimposed on the output value of each object sound source audio signal for each channel can be the final output multi-channel audio signal.

FIG. 5 is a block diagram illustrating another embodiment of an object sound source rendering apparatus according to the present invention using a rendering parameter storage unit.

Referring to FIG. 5, the object sound source rendering apparatus 500 includes a rendering parameter calculation unit 510, a rendering parameter storage unit 520, a playback position processing unit 530, a rendering parameter determination unit 540, a boundary region processing unit 550, and a rendering unit 560.

Unlike the object sound source rendering apparatus 400 shown in FIG. 4, in the object sound source rendering apparatus 500 shown in FIG. 5 the boundary region processing unit 550 is located between the rendering parameter determination unit 540 and the rendering unit 560.

The boundary region processing unit 550 may perform processing on the rendering parameters read by the rendering parameter determination unit 540 to prevent discontinuous characteristics from occurring in the boundary region where the rendering parameters change.

For the operations of the rendering parameter calculation unit 510, the rendering parameter storage unit 520, the playback position processing unit 530, the rendering parameter determination unit 540, and the rendering unit 560 of the object sound source rendering apparatus 500, the description given with reference to FIG. 4 applies as it is.

FIG. 6 is a diagram showing one embodiment of sampling.

Object sound source rendering techniques such as Higher-Order Ambisonics (HOA), LSM, and VCM require considerable computation to calculate rendering parameters. When the object sound source moves quickly, a large amount of computation is required to update and apply the rendering parameters, which can cause problems both in representing the moving source properly and in real-time processing.

The position information of the object sound source can be expressed by the distance (r), horizontal angle, and elevation angle relative to the listening origin; optionally, size information may also be included in the position information. For convenience of calculation, the distance (r) from the listening origin to the object sound source can be fixed to the radius of the output speaker system, and it can be assumed that only the horizontal angle and elevation angle change.

Referring to FIG. 6, a horizontal angle 601 is shown on the horizontal axis and an elevation angle 602 is shown on the vertical axis. Referring to region 610 of FIG. 6, dotted line 603, solid line 604, and solid line intersection 605 are shown.

The horizontal angle 601 can be sampled at 2-degree intervals from -180 degrees to +180 degrees, and the elevation angle 602 can be sampled at 5-degree intervals from -90 degrees to +90 degrees. In this case, the horizontal angle 601 is measured as a positive angle from the front of the listener, and the elevation angle 602 is positive in the upward direction and negative in the downward direction.
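Snapping a continuous position to its representative grid point on this sampling grid might look like the following sketch. The round-half-up quantization, the azimuth wrapping, and the elevation clamping are illustrative assumptions, since the patent only specifies the grid intervals:

```python
import math

def to_representative(az_deg, el_deg, az_step=2.0, el_step=5.0):
    """Snap a (horizontal angle, elevation angle) pair to the
    nearest sampled grid point: azimuth on a 2-degree grid in
    [-180, 180], elevation on a 5-degree grid in [-90, 90]."""
    snap = lambda x, s: s * math.floor(x / s + 0.5)  # round-half-up
    az = snap(az_deg, az_step)
    el = max(-90.0, min(90.0, snap(el_deg, el_step)))
    if az > 180.0:          # wrap azimuth back into (-180, 180]
        az -= 360.0
    return az, el
```

Every position inside one dotted-line rectangle of FIG. 6 then maps to the same solid-line intersection, which is exactly the representative-value behavior described above.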

In the area 610, any horizontal angle 601 and elevation angle 602 within the rectangle formed by the dotted lines 603 can be represented by the value at which the solid lines within that rectangle meet. The information on the solid-line intersection 605 serves as the representative value information.

When sampling is performed, the rendering parameter calculation unit can calculate the rendering parameters only for the intersection 605 of the solid line, and the rendering parameter storage unit can store the rendering parameters only for the intersection 605 of the solid line.

The representative value information may also be conveyed to other units of the object sound source rendering device. The representative value information may be transmitted to the reproduction position processing unit, and the reproduction position processing unit may sample the position information of the object sound source in the same manner as the rendering parameter calculation unit. The reproduction position processing unit may sample the input position information of the object sound source according to the sampling resolution transmitted from the object sound source rendering parameter calculation unit and transmit the sampled information to the rendering parameter determination unit. The rendering parameter determination unit may transmit a rendering parameter corresponding to the position of the sampled object sound source from the rendering parameter storage unit to the rendering unit.

FIG. 7 is a flowchart illustrating an object sound source rendering method according to an embodiment of the present invention.

Referring to FIG. 7, in step 710, the object sound source rendering apparatus samples the position information of the object sound source. The apparatus may perform step 710 by sampling the position information to at least one of a horizontal angle at a predetermined interval and an elevation angle at a predetermined interval.

In step 720, the object sound source rendering device computes the rendering parameters. When the sampled position information is changed, the object sound source rendering apparatus can calculate a rendering parameter for the sampled position information based on the audio output environment.

In step 730, the object sound source rendering device generates a multi-channel audio signal. The object sound source rendering apparatus can generate the multi-channel audio signal by applying the rendering parameters to the audio signal of the object sound source.

The object sound source rendering apparatus can perform processing on the multi-channel audio signal or on the rendering parameters to prevent discontinuous characteristics due to a change in the rendering parameters.

FIG. 8 is a flowchart illustrating an embodiment of an object sound source rendering method according to the present invention using stored rendering parameters.

Referring to FIG. 8, in step 810, the object sound source rendering apparatus may calculate rendering parameters. The apparatus may perform step 810 by calculating the rendering parameters of the object sound source, using the audio output environment, for at least one of a horizontal angle at a predetermined interval and an elevation angle at a predetermined interval.

In step 820, the object sound source rendering apparatus may store the rendering parameters. Storing the rendering parameters for the representative values in advance reduces the load that would otherwise result from an excessive amount of computation.

In step 830, the object sound source rendering apparatus may sample the position information of the object sound source. Based on that position information, the apparatus may sample it to at least one of a horizontal angle at a predetermined interval and an elevation angle at a predetermined interval.

In step 840, the object sound source rendering device may read the rendering parameters. The object sound source rendering apparatus can read the rendering parameters corresponding to the sampled position information from the stored rendering parameters.

In step 850, the object sound source rendering device may generate a multi-channel audio signal. The object sound source rendering apparatus can generate the multi-channel audio signal by applying the read rendering parameters to the object sound source audio signal.

The object sound source rendering apparatus can perform processing on the multi-channel audio signal or on the read rendering parameters to prevent discontinuous characteristics due to a change in the rendering parameters.

The method according to an embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be those specially designed and configured for the embodiments, or may be those known and available to those skilled in the art of computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine language code such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, the invention is not limited to the disclosed embodiments. For example, appropriate results may still be achieved if the described techniques are performed in an order different from that of the described methods, and/or if components of the described systems, structures, devices, and circuits are combined in a different form or replaced by other components or their equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims are also within the scope of the following claims.

Claims (1)

An object sound source rendering apparatus comprising:
a playback position processing unit for processing position information according to a position attribute of an obtained object sound source;
a rendering parameter calculation unit for calculating a rendering parameter based on the processed position information and an audio output environment; and
a rendering unit for generating a multi-channel audio signal by applying the rendering parameter to an audio signal of the object sound source.
KR1020140060330A 2013-07-05 2014-05-20 Method for converting sound source position information and apparatus thereof KR20150005431A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR1020130079167 2013-07-05
KR20130079167 2013-07-05

Publications (1)

Publication Number Publication Date
KR20150005431A true KR20150005431A (en) 2015-01-14

Family

ID=52477281

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020140060330A KR20150005431A (en) 2013-07-05 2014-05-20 Method for converting sound source position information and apparatus thereof

Country Status (1)

Country Link
KR (1) KR20150005431A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109983786A (en) * 2016-11-25 2019-07-05 索尼公司 Transcriber, reproducting method, information processing unit, information processing method and program
US11259135B2 (en) 2016-11-25 2022-02-22 Sony Corporation Reproduction apparatus, reproduction method, information processing apparatus, and information processing method
US11785410B2 (en) 2016-11-25 2023-10-10 Sony Group Corporation Reproduction apparatus and reproduction method
WO2019132516A1 (en) * 2017-12-28 2019-07-04 박승민 Method for producing stereophonic sound content and apparatus therefor

Similar Documents

Publication Publication Date Title
US10674262B2 (en) Merging audio signals with spatial metadata
KR102622714B1 (en) Ambisonic depth extraction
KR102653560B1 (en) Processing appratus mulit-channel and method for audio signals
CN111276153B (en) Apparatus and method for screen-related audio object remapping
KR101759005B1 (en) Loudspeaker position compensation with 3d-audio hierarchical coding
KR101507901B1 (en) Apparatus for changing an audio scene and an apparatus for generating a directional function
KR101751228B1 (en) Efficient coding of audio scenes comprising audio objects
US11749252B2 (en) Signal processing device, signal processing method, and program
CN110537220B (en) Signal processing apparatus and method, and program
US11750999B2 (en) Method and system for handling global transitions between listening positions in a virtual reality environment
US20200413214A1 (en) System for and method of generating an audio image
CN104969576A (en) Audio providing apparatus and audio providing method
US11924627B2 (en) Ambience audio representation and associated rendering
KR20160013861A (en) Audio signal output device and method, encoding device and method, decoding device and method, and program
WO2016172254A1 (en) Spatial audio signal manipulation
WO2014209902A1 (en) Improved rendering of audio objects using discontinuous rendering-matrix updates
KR20150005431A (en) Method for converting sound source position information and apparatus thereof
CN114175685A (en) Rendering independent mastering of audio content
KR20150005438A (en) Method and apparatus for processing audio signal
JP6306958B2 (en) Acoustic signal conversion device, acoustic signal conversion method, and acoustic signal conversion program
KR20210071972A (en) Signal processing apparatus and method, and program
KR20200017969A (en) Audio apparatus and method of controlling the same
KR20170095105A (en) Apparatus and method for generating metadata of hybrid audio signal

Legal Events

Date Code Title Description
WITN Withdrawal due to no request for examination