UA123055C2

UA123055C2 - Layered coding for compressed sound or sound field representations

Info

Publication number: UA123055C2
Application number: UAA201804929A
Authority: UA
Inventors: Sven Kordon; Alexander Krüger
Original assignee: Dolby International AB
Priority date: 2015-10-08
Filing date: 2016-10-07
Publication date: 2021-02-10
Also published as: ME03762B; JP2021036342A; CO2018004867A2; JP7110304B2

Abstract

The present document relates to a method of layered encoding of a compressed sound representation of a sound or sound field. The compressed sound representation comprises a basic compressed sound representation comprising a plurality of components, basic side information for decoding the basic compressed sound representation to a basic reconstructed sound representation of the sound or sound field, and enhancement side information including parameters for improving the basic reconstructed sound representation. The method comprises sub-dividing the plurality of components into a plurality of groups of components and assigning each of the plurality of groups to a respective one of a plurality of hierarchical layers, the number of groups corresponding to the number of layers, and the plurality of layers including a base layer and one or more hierarchical enhancement layers, adding the basic side information to the base layer, and determining a plurality of portions of enhancement side information from the enhancement side information and assigning each of the plurality of portions of enhancement side information to a respective one of the plurality of layers, wherein each portion of enhancement side information includes parameters for improving a reconstructed sound representation obtainable from data included in the respective layer and any layers lower than the respective layer. The document further relates to a method of decoding a compressed sound representation of a sound or sound field, wherein the compressed sound representation is encoded in a plurality of hierarchical layers that include a base layer and one or more hierarchical enhancement layers, as well as to an encoder and a decoder for layered coding of a compressed sound representation.

Description


[Fig. 2]

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to European Patent Application No. 15306590.9, filed October 8, 2015, and to U.S. Patent Application No. 62/361,809, the contents of which are incorporated herein by reference in their entirety.

TECHNICAL FIELD

The present document relates to methods and apparatus for layered audio coding.

In particular, the present document relates to methods and apparatus for layered audio coding of compressed sound (or sound field) representations, for example Higher-Order Ambisonics (HOA) sound (or sound field) representations.

BACKGROUND

For streaming a sound (or sound field) representation over a transmission channel with time-varying conditions, layered coding is a means of adapting the quality of the received sound representation to the transmission conditions and, in particular, of avoiding undesired signal dropouts.

For layered coding, the sound (or sound field) representation is usually subdivided into a high-priority base layer of relatively small size and additional enhancement layers with decreasing priorities and arbitrary sizes. Each enhancement layer is typically assumed to contain incremental information that complements all lower layers in order to improve the quality of the sound (or sound field) representation. The amount of error protection for the transmission of the individual layers is controlled based on their priority. In particular, the base layer is given high error protection, which is reasonable and affordable owing to its small size.

However, there is a need for layered coding schemes for (extended versions of) special types of compressed representations of sound or sound fields, such as compressed HOA representations of a sound or sound field.

The present document addresses the above issues. In particular, methods and encoders/decoders for layered coding of compressed sound or sound field representations are described.

SUMMARY

According to an aspect, a method of layered encoding of a compressed sound representation of a sound or sound field is described. The compressed sound representation may include a basic compressed sound representation that includes a plurality of components. The plurality of components may be complementary components. The compressed sound representation may further include basic side information for decoding the basic compressed sound representation to a basic reconstructed sound representation of the sound or sound field. The compressed sound representation may yet further include enhancement side information including parameters for improving (e.g., enhancing) the basic reconstructed sound representation.

The method may include subdividing (e.g., grouping) the plurality of components into a plurality of groups of components. The method may further include assigning (e.g., adding) each of the plurality of groups to a respective one of a plurality of hierarchical layers. The assignment may indicate a correspondence between respective groups and layers. Components assigned to a respective layer may be said to be included in that layer. The number of groups may correspond to (e.g., be equal to) the number of layers. The plurality of layers may include a base layer and one or more hierarchical enhancement layers. The plurality of hierarchical layers may be ordered from the base layer, through the first enhancement layer, the second enhancement layer, and so forth, up to the overall highest enhancement layer (overall highest layer).

The method may further include adding the basic side information to the base layer (e.g., including the basic side information in the base layer, or allocating the basic side information to the base layer, for example for purposes of transmission or storage). The method may further include determining a plurality of portions of enhancement side information from the enhancement side information. The method may yet further include assigning (e.g., adding) each of the plurality of portions of enhancement side information to a respective one of the plurality of layers. Each portion of enhancement side information may include parameters for improving a reconstructed sound representation obtainable from data included in (e.g., assigned or added to) the respective layer and any layers lower than the respective layer. The layered encoding may be performed for purposes of transmission over a transmission channel or for purposes of storage on a suitable storage medium, such as a CD, DVD, or Blu-ray Disc™.
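The encoding steps described above can be illustrated with a minimal Python sketch. It is not the implementation of the described method: the `Layer` class, the `encode_layers` function, the `make_enhancement_portion` callback, and the choice of assigning consecutive components to each group are all assumptions introduced here for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """One hierarchical layer (hypothetical container; index 0 is the base layer)."""
    index: int
    components: list = field(default_factory=list)
    basic_side_info: dict = field(default_factory=dict)
    enhancement_side_info: dict = field(default_factory=dict)


def encode_layers(components, group_sizes, basic_side_info, make_enhancement_portion):
    """Split components into one group per layer and attach side information.

    Assumption: groups are formed from consecutive components, with
    group_sizes[i] components going to layer i.
    """
    if sum(group_sizes) != len(components):
        raise ValueError("group sizes must cover all components exactly")

    layers = []
    start = 0
    for index, size in enumerate(group_sizes):
        group = components[start:start + size]
        start += size
        layer = Layer(index=index, components=list(group))
        if index == 0:
            # Basic side information is added to the base layer only.
            layer.basic_side_info = dict(basic_side_info)
        # Each portion of enhancement side information is determined for the
        # components of this layer and of all lower layers.
        layer.enhancement_side_info = make_enhancement_portion(components[:start])
        layers.append(layer)
    return layers


if __name__ == "__main__":
    demo = encode_layers(
        ["c1", "c2", "c3", "c4", "c5"],
        [2, 2, 1],
        {"gain": 1},
        lambda available: {"tuned_for": list(available)},
    )
    for layer in demo:
        print(layer.index, layer.components, layer.enhancement_side_info)
```

In this sketch the number of groups equals the number of layers by construction, and every layer carries its own portion of enhancement side information, which is the property the decoder relies on.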

Configured as above, the proposed method makes it possible to efficiently apply layered coding to compressed sound representations that include a plurality of components as well as basic and enhancement side information (e.g., independent basic side information and enhancement side information) having the properties set out above. In particular, the proposed method ensures that each layer includes suitable side information for obtaining a reconstructed sound representation from the components included in any layers up to the layer in question. Here, the layers up to the layer in question are understood to include, for example, the base layer, the first enhancement layer, the second enhancement layer, and so forth, up to the layer in question.

Thus, regardless of the actual highest usable layer (e.g., the layer immediately below the lowest layer that has not been correctly received, so that the highest usable layer itself and all layers below it have been correctly received), the decoder is able to improve or enhance the reconstructed sound representation, even though the reconstructed sound representation may differ from the complete sound representation. In particular, regardless of the actual highest usable layer, it is sufficient for the decoder to decode the payload of enhancement side information for a single layer only (namely, the highest usable layer) in order to improve or enhance the reconstructed sound representation that is obtainable from all components included in the layers up to the actual highest usable layer. That is, for each time interval (e.g., frame), only a single payload of enhancement side information has to be decoded. At the same time, the proposed method allows the reduction in required bandwidth that layered coding can achieve to be fully exploited.
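The decoder-side consequence described above can be sketched as follows. This is a hedged illustration, not the decoder of the described method: the function names, the dictionary keys, and the representation of reception status as a list of booleans are assumptions made for this example.

```python
def highest_usable_layer(received_ok):
    """Return the index of the highest usable layer, or None if even the base
    layer is missing.

    received_ok[i] is True if layer i was received correctly. The highest
    usable layer is the one just below the lowest incorrectly received layer.
    """
    highest = None
    for index, ok in enumerate(received_ok):
        if not ok:
            break
        highest = index
    return highest


def select_decoder_inputs(layers, received_ok):
    """Collect what a decoder would need for one frame (hypothetical layout).

    Only ONE enhancement side information payload is selected: the one that
    belongs to the highest usable layer.
    """
    top = highest_usable_layer(received_ok)
    if top is None:
        return None
    components = [c for layer in layers[:top + 1] for c in layer["components"]]
    return {
        "components": components,
        "basic_side_info": layers[0]["basic_side_info"],
        "enhancement_side_info": layers[top]["enhancement_side_info"],
    }


if __name__ == "__main__":
    frame_layers = [
        {"components": ["c1", "c2"], "basic_side_info": {"g": 1},
         "enhancement_side_info": {"for_components": 2}},
        {"components": ["c3"], "enhancement_side_info": {"for_components": 3}},
        {"components": ["c4"], "enhancement_side_info": {"for_components": 4}},
    ]
    # Layer 2 was lost, so layer 1 is the highest usable layer.
    print(select_decoder_inputs(frame_layers, [True, True, False]))
```

Note that a correctly received layer above a lost one is not used in this sketch, because its enhancement side information assumes that all lower layers are available.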

In embodiments, the components of the basic compressed sound representation may correspond to monaural signals (e.g., transport signals or monaural transport signals). The monaural signals may represent either predominant sound signals or coefficient sequences of an HOA representation. The monaural signals may be quantized.

In embodiments, the basic side information may include information that specifies decoding (e.g., reconstruction) of one or more of the plurality of components individually, independently of other components. For example, the basic side information may represent side information related to individual monaural signals, independently of other monaural signals. Thus, the basic side information may be referred to as independent basic side information.

In embodiments, the enhancement side information may include prediction parameters for the basic compressed sound representation, for improving (e.g., enhancing) the basic reconstructed sound representation that is obtainable from the basic compressed sound representation and the basic side information.

In embodiments, the method may further include generating a transport stream for transmission of the data of the plurality of layers (e.g., data assigned or added to the respective layers, or otherwise included in the respective layers). The base layer may have the highest priority of transmission, and the hierarchical enhancement layers may have decreasing priorities of transmission. That is, the priority of transmission may decrease from the base layer to the first enhancement layer, from the first enhancement layer to the second enhancement layer, and so forth. The amount of error protection for transmission of the data of the plurality of layers may be controlled in accordance with the respective priorities of transmission. It can thereby be ensured that at least a number of lower layers are reliably transmitted, while the overall required bandwidth is reduced by not applying excessive error protection to higher layers.
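The bandwidth argument above can be made concrete with a small arithmetic sketch. The layer sizes and the error-protection overhead percentages below are arbitrary illustrative numbers, and `protected_size` is a hypothetical helper; none of them come from the source.

```python
def protected_size(layer_sizes, overhead_percent):
    """Total transmitted size when layer i gets overhead_percent[i] percent of
    extra error-protection data (integer arithmetic, sizes in bytes)."""
    if len(layer_sizes) != len(overhead_percent):
        raise ValueError("one overhead value is required per layer")
    return sum(size * (100 + pct) // 100
               for size, pct in zip(layer_sizes, overhead_percent))


# Base layer first; it is small and gets the strongest protection.
layer_sizes = [100, 400, 800]

# Protection graded by transmission priority versus uniform strong protection.
graded = protected_size(layer_sizes, [50, 20, 5])
uniform = protected_size(layer_sizes, [50, 50, 50])

if __name__ == "__main__":
    print("graded protection: ", graded)
    print("uniform protection:", uniform)
```

With these assumed numbers, strong protection is spent only on the small base layer, so the total is lower than when the same strong protection is applied to every layer.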

In embodiments, the method may further include, for each of the plurality of layers, generating a transport layer packet that includes the data of the respective layer. For example, for each time interval (e.g., frame), a respective transport layer packet may be generated for each of the plurality of layers.
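A minimal sketch of per-frame packet generation follows. The `TransportLayerPacket` fields and the `packetize_frame` function are hypothetical and chosen only for illustration; the source does not specify a packet format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportLayerPacket:
    """Hypothetical packet carrying the data of one layer for one frame."""
    frame_index: int
    layer_index: int  # 0 = base layer, which has the highest transmission priority
    payload: bytes


def packetize_frame(frame_index, layer_payloads):
    """Generate one transport layer packet per layer for a single frame.

    layer_payloads[i] holds the already serialized data of layer i.
    """
    return [
        TransportLayerPacket(frame_index, layer_index, payload)
        for layer_index, payload in enumerate(layer_payloads)
    ]


if __name__ == "__main__":
    for packet in packetize_frame(7, [b"base", b"enh1", b"enh2"]):
        print(packet)
```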

In embodiments, the compressed sound representation may further include additional basic side information for decoding the basic compressed sound representation to the basic reconstructed sound representation. The additional basic side information may include information that specifies decoding of one or more of the plurality of components in dependence on respective other components. The method may further include decomposing the additional basic side information into a plurality of portions of additional basic side information. The method may yet further include adding the portions of additional basic side information to the base layer (e.g., including the portions of additional basic side information in the base layer, or allocating them to the base layer, for example for purposes of transmission or storage). Each portion of additional basic side information may correspond to a respective layer and may include information that specifies decoding of one or more components assigned to the respective layer in dependence (only) on respective other components assigned to the respective layer and any layers lower than the respective layer. That is, each portion of additional basic side information specifies components in the layer to which that portion corresponds without reference to any other components assigned to layers higher than the respective layer.

Configured in this way, the proposed method avoids fragmentation of the additional basic side information by adding all portions to the base layer. In other words, all portions of additional basic side information are included in the base layer.

The decomposition of the additional basic side information ensures that, for each layer, a portion of additional basic side information is available that does not require knowledge of components in higher layers. Thus, regardless of the actual highest usable layer, it is sufficient for the decoder to decode the additional basic side information included in layers up to the highest usable layer.
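The layer-wise constraint on dependent side information can be sketched as below. This passage does not say how the portions are actually computed, so the approach used here, which simply filters out references to components in higher layers, is an assumption made for illustration; the function names and data layout are likewise hypothetical.

```python
def decompose_dependent_side_info(layer_of, depends_on):
    """Split dependent side information into one portion per layer.

    layer_of:   maps each component name to its layer index (0 = base layer).
    depends_on: maps a component to the other components its decoding refers to.

    In each portion, a component of layer k refers only to components in
    layers 0..k. Assumption: references to higher layers are simply dropped.
    """
    num_layers = max(layer_of.values()) + 1
    portions = [dict() for _ in range(num_layers)]
    for component, layer in layer_of.items():
        allowed = [ref for ref in depends_on.get(component, ())
                   if layer_of[ref] <= layer]
        portions[layer][component] = sorted(allowed)
    return portions


def base_layer_payload(portions):
    """All portions are carried in the base layer, which avoids fragmentation."""
    return {"dependent_side_info_portions": portions}


if __name__ == "__main__":
    layer_of = {"a": 0, "b": 0, "c": 1}
    depends_on = {"a": ["b", "c"], "c": ["a"]}
    print(base_layer_payload(decompose_dependent_side_info(layer_of, depends_on)))
```

In the demo, component `a` in the base layer originally refers to `c` in layer 1; after decomposition its portion refers only to `b`, so it can be decoded even if layer 1 is lost.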

In embodiments, the additional basic side information may include information that specifies decoding (e.g., reconstruction) of one or more of the plurality of components in dependence on other components. For example, the additional basic side information may represent side information related to individual monaural signals in dependence on other monaural signals. Thus, the additional basic side information may be referred to as dependent basic side information.

In embodiments, the compressed sound representation may be processed for successive time intervals, for example time intervals of equal size. The successive time intervals may be frames. Thus, the method may operate on a frame basis, i.e., the compressed sound representation may be encoded frame by frame. The compressed sound representation may be available for each successive time interval (e.g., for each frame). That is, the compression operation by which the compressed sound representation has been obtained may operate on a frame basis.

In embodiments, the method may further include generating configuration information that indicates, for each layer, the components of the basic compressed sound representation that are assigned to that layer. Thus, the decoder can readily access the information needed for decoding without unnecessary parsing of the received data payloads.
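One way such configuration information could be organized is sketched below. The field names, and the convention that components are numbered consecutively across layers, are assumptions for this example and are not taken from the source.

```python
def build_layer_config(group_sizes):
    """Hypothetical configuration record: how many components each layer holds."""
    return {
        "num_layers": len(group_sizes),
        "components_per_layer": list(group_sizes),
    }


def component_indices_for_layer(config, layer_index):
    """Derive which component indices belong to a layer from the configuration
    alone, without inspecting any layer payload.

    Assumption: components are numbered consecutively, starting with layer 0.
    """
    if not 0 <= layer_index < config["num_layers"]:
        raise IndexError("layer index out of range")
    sizes = config["components_per_layer"]
    start = sum(sizes[:layer_index])
    return list(range(start, start + sizes[layer_index]))


if __name__ == "__main__":
    config = build_layer_config([2, 2, 1])
    for layer_index in range(config["num_layers"]):
        print(layer_index, component_indices_for_layer(config, layer_index))
```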

According to another aspect, a method of layered encoding of a compressed sound representation of a sound or sound field is described. The compressed sound representation may include a basic compressed sound representation that includes a plurality of components. The plurality of components may be complementary components. The compressed sound representation may further include basic side information (e.g., independent basic side information) and third information (e.g., dependent basic side information, referred to below as additional basic side information) for decoding the basic compressed sound representation to a basic reconstructed sound representation of the sound or sound field. The basic side information may include information that specifies decoding of one or more of the plurality of components individually, independently of other components. The additional basic side information may include information that specifies decoding of one or more of the plurality of components in dependence on respective other components. The method may include sub-dividing (e.g., grouping) the plurality of components into a plurality of groups of components. The method may further include assigning (e.g., adding) each of the plurality of groups to a respective one of a plurality of hierarchical layers. The assignment may indicate a correspondence between respective groups and layers.

Components assigned to a respective layer may be said to be included in that layer. The number of groups may correspond to (e.g., be equal to) the number of layers. The plurality of layers may include a base layer and one or more hierarchical enhancement layers. The method may further include adding the basic side information to the base layer (e.g., including the basic side information in the base layer, or allocating the basic side information to the base layer, for example for purposes of transmission or storage). The method may further include decomposing the additional basic side information into a plurality of portions of additional basic side information and adding the portions of additional basic side information to the base layer (e.g., including the portions of additional basic side information in the base layer, or allocating the portions of additional basic side information to the base layer, for example for purposes of transmission or storage). Each portion of additional basic side information may correspond to a respective layer and may include information that specifies decoding of one or more components assigned to the respective layer in dependence on respective other components assigned to the respective layer and any layers lower than the respective layer.
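
By way of illustration only, the sub-division of components into groups and the assignment of groups and side information to layers might be sketched in Python as follows. The sketch is not taken from the disclosure: the `Layer` class, the `assign_layers` function, the choice of consecutive groups, and the dictionary placeholders standing in for side information are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Layer:
    index: int                                   # 0 denotes the base layer (assumed convention)
    components: List[str] = field(default_factory=list)
    basic_side_info: Optional[dict] = None       # carried by the base layer only
    dependent_portions: Dict[int, dict] = field(default_factory=dict)


def assign_layers(components: List[str], group_sizes: List[int],
                  basic_side_info: dict) -> List[Layer]:
    """Split components into consecutive groups, one group per layer."""
    if sum(group_sizes) != len(components):
        raise ValueError("group sizes must cover all components exactly once")

    layers: List[Layer] = []
    start = 0
    for index, size in enumerate(group_sizes):
        layers.append(Layer(index=index, components=components[start:start + size]))
        start += size

    # The basic (independent) side information is added to the base layer.
    layers[0].basic_side_info = basic_side_info

    # One portion of dependent side information per layer, all stored in the
    # base layer. Each portion refers only to components of its own layer and
    # of lower layers (placeholder content, hypothetical structure).
    for layer in layers:
        visible = [c for lower in layers[:layer.index + 1] for c in lower.components]
        layers[0].dependent_portions[layer.index] = {"refers_to": visible}
    return layers
```

For example, `assign_layers(["c0", "c1", "c2", "c3", "c4"], [2, 2, 1], {"gain": 1})` yields three layers, with the base layer holding `c0` and `c1`, the basic side information, and three portions of dependent side information.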

Configured in this way, the proposed method ensures that, for each layer, suitable additional basic side information is available for decoding the components included in any layer up to the respective layer, without requiring correct reception or decoding (or, in general, knowledge) of any higher layers. In the case of a compressed HOA representation, the proposed method ensures that, in the vector coding mode, a suitable V-vector is available for every component belonging to layers up to the highest usable layer. In particular, the proposed method excludes the case in which elements of a V-vector corresponding to components in higher layers are not explicitly signaled.

Accordingly, the information included in the layers up to the highest usable layer is sufficient for decoding (e.g., reconstructing) any components belonging to layers up to the highest usable layer. This ensures proper reconstruction of the respective reconstructed HOA representations for lower layers even if higher layers were not correctly received by the decoder. At the same time, the proposed method allows full use of the reduction in required bandwidth that layered coding can achieve.

Embodiments of this aspect may relate to embodiments of the foregoing aspect.

According to another aspect, a method of decoding a compressed sound representation of a sound or sound field is described. The compressed sound representation may be encoded in a plurality of hierarchical layers. The plurality of hierarchical layers may include a base layer and one or more hierarchical enhancement layers. The plurality of layers may have assigned thereto components of a basic compressed sound representation of the sound or sound field. In other words, the plurality of layers may include the components of the basic compressed side information. The components may be assigned to respective layers in respective groups of components. The plurality of components may be complementary components. The base layer may include basic side information for decoding the basic compressed sound representation. Each layer may include a portion of enhancement side information including parameters for improving a basic reconstructed sound representation obtainable from data included in the respective layer and any layers lower than the respective layer. The method may include receiving data payloads respectively corresponding to the plurality of hierarchical layers. The method may further include determining a first layer index indicating the highest usable layer among the plurality of layers to be used for decoding the basic compressed sound representation to the basic reconstructed sound representation of the sound or sound field. The method may further include obtaining the basic reconstructed sound representation from the components assigned to the highest usable layer and any layers lower than the highest usable layer, using the basic side information. The method may further include determining a second layer index that indicates which portion of enhancement side information should be used for improving (e.g., enhancing) the basic reconstructed sound representation. The method may yet further include obtaining a reconstructed sound representation of the sound or sound field from the basic reconstructed sound representation, referring to the second layer index.

Configured in this way, the proposed method ensures that the reconstructed sound representation has optimum quality, making the best possible use of the available (e.g., correctly received) information.

In embodiments, the components of the basic compressed sound representation may correspond to monaural signals (e.g., monaural transport signals).

The monaural signals may represent either predominant sound signals or coefficient sequences of an HOA representation. The monaural signals may be quantized.

In embodiments, the basic side information may include information that specifies decoding (e.g., reconstruction) of one or more of the plurality of components individually, independently of other components. For example, the basic side information may represent side information relating to individual monaural signals, independently of other monaural signals. Accordingly, the basic side information may be referred to as independent basic side information.

In embodiments, the method may further include determining, for each layer, whether the respective layer has been correctly received. The method may further include determining the first layer index as the index of the layer immediately below the lowest layer that has not been correctly received.
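
The rule above can be sketched in a few lines of Python. This is a minimal illustration under assumed conventions rather than the claimed implementation: the function name `first_layer_index`, the zero-based indexing with the base layer at index 0, and the return value of -1 when even the base layer is missing are assumptions made for the example.

```python
from typing import List


def first_layer_index(received_ok: List[bool]) -> int:
    """Return the index of the highest usable layer.

    received_ok[i] is True if layer i (0 = base layer, assumed convention)
    was received correctly. The result is the index of the layer immediately
    below the lowest layer that was not received correctly, or the top layer
    if every layer arrived. A result of -1 (assumed convention) means that
    not even the base layer is usable.
    """
    for index, ok in enumerate(received_ok):
        if not ok:
            return index - 1
    return len(received_ok) - 1
```

Note that a correctly received layer above a lost one does not raise the index: with `[True, True, False, True]` only layers 0 and 1 are usable.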

In embodiments, determining the second layer index may include either determining the second layer index to be equal to the first layer index, or determining, as the second layer index, an index value indicating that no enhancement side information is to be used when obtaining the reconstructed sound representation. In the latter case, the reconstructed sound representation may be equal to the basic reconstructed sound representation.

In embodiments, the data payloads may be received and processed for successive time intervals, for example time intervals of equal size. The successive time intervals may be frames. The method may thus operate on a frame basis.

The method may further include determining the second layer index to be equal to the first layer index if the compressed sound representations for the successive time intervals can be decoded independently of each other.

The method may further include, for a given time interval among the successive time intervals, determining, for each layer, whether the respective layer has been correctly received, if the compressed sound representations for the successive time intervals cannot be decoded independently of each other. The method may further include determining the first layer index for the given time interval as the smaller of the first layer index of the time interval preceding the given time interval and the index of the layer immediately below the lowest layer that has not been correctly received.

In embodiments, the method may further include, for the given time interval, determining whether the first layer index for the given time interval is equal to the first layer index for the preceding time interval, if the compressed sound representations for the successive time intervals cannot be decoded independently of each other. The method may further include determining the second layer index for the given time interval to be equal to the first layer index for the given time interval if the first layer index for the given time interval is equal to the first layer index for the preceding time interval. The method may further include determining, as the second layer index, an index value indicating that no enhancement side information is to be used when obtaining the reconstructed sound representation, if the first layer index for the given time interval is not equal to the first layer index for the preceding time interval.
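
The frame-wise selection of the two layer indices described in the preceding paragraphs can be summarized in a short Python sketch. It is offered as an illustration under stated assumptions only: the names `candidate_index` and `layer_indices`, the zero-based layer numbering, and the use of `None` as the index value meaning "use no enhancement side information" are hypothetical choices, not details given in the disclosure.

```python
from typing import List, Optional, Tuple


def candidate_index(received_ok: List[bool]) -> int:
    """Index of the layer immediately below the lowest layer not received correctly."""
    for index, ok in enumerate(received_ok):
        if not ok:
            return index - 1
    return len(received_ok) - 1


def layer_indices(received_ok: List[bool], prev_first_index: int,
                  frames_independent: bool) -> Tuple[int, Optional[int]]:
    """Return (first layer index, second layer index) for the current frame.

    A second index of None (assumed sentinel) means that no enhancement side
    information is applied, so the output equals the basic reconstruction.
    """
    candidate = candidate_index(received_ok)
    if frames_independent:
        # Independently decodable frames: both indices follow the current frame.
        return candidate, candidate
    # Dependent frames: never exceed the previous frame's highest usable layer.
    first = min(prev_first_index, candidate)
    # Enhancement is applied only if the highest usable layer did not change.
    second = first if first == prev_first_index else None
    return first, second
```

With dependent frames, losing layer 1 after a frame that used layers 0 to 2 gives `layer_indices([True, False, True], 2, False) == (0, None)`; if the following frame is received completely, the first index stays at 0 and enhancement resumes at layer 0.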

In embodiments, the base layer may include at least one portion of additional basic side information that corresponds to a respective layer and includes information specifying decoding of one or more of the components assigned to the respective layer in dependence on other components assigned to the respective layer and any layers lower than the respective layer. The method may further include, for each portion of additional basic side information, decoding the portion of additional basic side information by referring to the components assigned to its respective layer and any layers lower than the respective layer. The method may further include correcting the portion of additional basic side information by referring to the components assigned to the highest usable layer and any layers between the highest usable layer and the respective layer. The basic reconstructed sound representation may be obtained from the components assigned to the highest usable layer and any layers lower than the highest usable layer, using the basic side information and the corrected portions of additional basic side information obtained from the portions of additional basic side information that correspond to layers up to the highest usable layer.
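
One possible reading of the correction step is sketched below in Python. The text does not spell out the correction rule, so the sketch rests on an assumption: a portion is represented as a vector with one entry per component, and correction zeroes the entries that refer to components assigned to layers above the portion's own layer, up to and including the highest usable layer. The function name `correct_portion` and the list-based representation are likewise hypothetical.

```python
from typing import List


def correct_portion(vector: List[float], layer_of_component: List[int],
                    portion_layer: int, highest_usable_layer: int) -> List[float]:
    """Zero vector entries for components in layers (portion_layer, highest_usable_layer].

    vector[i] is the entry that refers to component i, and
    layer_of_component[i] is the layer to which component i is assigned.
    Entries for all other components are returned unchanged.
    """
    if len(vector) != len(layer_of_component):
        raise ValueError("vector and layer map must have the same length")
    return [
        0.0 if portion_layer < layer <= highest_usable_layer else value
        for value, layer in zip(vector, layer_of_component)
    ]
```

Under this assumption, when the highest usable layer equals the portion's own layer, the portion is returned unchanged.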

According to another aspect, a method of decoding a compressed sound representation of a sound or sound field is described. The compressed sound representation may be encoded in a plurality of hierarchical layers. The plurality of hierarchical layers may include a base layer and one or more hierarchical enhancement layers. The plurality of layers may have assigned thereto components of a basic compressed sound representation of the sound or sound field. In other words, the plurality of layers may include the components of the basic compressed side information. The components may be assigned to respective layers in respective groups of components. The plurality of components may be complementary components. The base layer may include basic side information for decoding the basic compressed sound representation. The base layer may further include at least one portion of additional basic side information that corresponds to a respective layer and includes information specifying decoding of one or more of the components assigned to the respective layer in dependence on other components assigned to the respective layer and any layers lower than the respective layer. The method may include receiving data payloads respectively corresponding to the plurality of hierarchical layers. The method may further include determining a first layer index indicating the highest usable layer among the plurality of layers to be used for decoding the basic compressed sound representation to the basic reconstructed sound representation of the sound or sound field. The method may further include, for each portion of additional basic side information, decoding the portion of additional basic side information by referring to the components assigned to its respective layer and any layers lower than the respective layer. The method may further include, for each portion of additional basic side information, correcting the portion of additional basic side information by referring to the components assigned to the highest usable layer and any layers between the highest usable layer and the respective layer. The basic reconstructed sound representation may be obtained from the components assigned to the highest usable layer and any layers lower than the highest usable layer, using the basic side information and the corrected portions of additional basic side information obtained from the portions of additional basic side information that correspond to layers up to the highest usable layer. The method may further include determining a second layer index that is either equal to the first layer index or indicates that the enhancement side information is to be omitted during decoding.

Configured in this way, the proposed method ensures that the additional basic side information that is eventually used for decoding the basic compressed sound representation does not include redundant elements, which makes the actual decoding of the basic compressed sound representation more efficient.

According to another aspect, an encoder for layered encoding of a compressed sound representation of a sound or sound field is described. The compressed sound representation may include a basic compressed sound representation that includes a plurality of components. The plurality of components may be complementary components. The compressed sound representation may further include basic side information for decoding the basic compressed sound representation to a basic reconstructed sound representation of the sound or sound field.

The compressed sound representation may further include enhancement side information including parameters for improving (e.g., enhancing) the basic reconstructed sound representation. The encoder may include a processor configured to perform some or all of the method steps according to the first above-mentioned aspect and the second above-mentioned aspect.

According to another aspect, a decoder for decoding a compressed sound representation of a sound or sound field is described. The compressed sound representation may be encoded in a plurality of hierarchical layers. The plurality of hierarchical layers may include a base layer and one or more hierarchical enhancement layers. The plurality of layers may have assigned thereto components of a basic compressed sound representation of the sound or sound field. In other words, the plurality of layers may include the components of the basic compressed side information. The components may be assigned to respective layers in respective groups of components. The plurality of components may be complementary components. The base layer may include basic side information for decoding the basic compressed sound representation.

Each layer may include a portion of enhancement side information including parameters for improving (e.g., enhancing) a basic reconstructed sound representation obtainable from data included in the respective layer and any layers lower than the respective layer. The decoder may include a processor configured to perform some or all of the method steps according to the third above-mentioned aspect and the fourth above-mentioned aspect.

According to further aspects, methods, apparatus, and systems are directed to decoding a compressed Higher-Order Ambisonics (HOA) representation of a sound or sound field. The apparatus may have a receiver configured to receive, or the method may receive, a bit stream containing the compressed HOA representation corresponding to a plurality of hierarchical layers that include a base layer and one or more hierarchical enhancement layers. The plurality of layers have assigned thereto components of a basic compressed sound representation of the sound or sound field, the components being assigned to respective layers in respective groups of components. The apparatus may have a decoder configured to decode, or the method may decode, the compressed HOA representation based on basic side information associated with the base layer and based on enhancement side information associated with the one or more hierarchical enhancement layers. The basic side information may include basic independent side information relating to first individual monaural signals that will be decoded independently of other monaural signals. Each of the one or more hierarchical enhancement layers may include a portion of the enhancement side information including parameters for improving a basic reconstructed sound representation obtainable from data included in the respective layer and any layers lower than the respective layer.

The basic independent side information may indicate that the first individual monaural signals represent a directional signal with a direction of incidence. The basic side information may further include basic dependent side information related to second individual monaural signals that will be decoded dependently of other monaural signals. The basic dependent side information may include vector-based signals that are directionally distributed within the sound field, the directional distribution being specified by means of a vector. Particular components of the vector are set to zero and are not part of the compressed vector representation.

The components of the basic compressed sound representation may correspond to monaural signals that represent either predominant sound signals or coefficient sequences of an HOA representation. The bitstream includes data payloads respectively associated with the plurality of hierarchical layers. The enhancement side information may include parameters related to at least one of the following: spatial prediction, sub-band directional signals synthesis, and parametric ambience replication. The enhancement side information may include information that allows missing portions of the sound or sound field to be predicted from directional signals. It may further be determined, for each layer, whether the respective layer has been validly received, together with the layer index of the layer immediately below the lowest layer that has not been validly received.
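The validity check described above can be sketched as follows. This is a minimal illustration, not the decoder's actual procedure: the function name `highest_usable_layer`, the list of per-layer flags, and the 1-based layer numbering (base layer = 1) are assumptions made for this example. The behaviour when every layer is valid (return the top layer) is likewise an assumption, since the text only defines the index relative to the lowest invalid layer.

```python
from typing import Sequence


def highest_usable_layer(valid_flags: Sequence[bool]) -> int:
    """Return the 1-based index of the highest layer usable for decoding.

    valid_flags[0] belongs to the base layer (layer 1). A layer is usable
    only if it and every layer below it were validly received, so the result
    is the index of the layer immediately below the lowest invalid layer.
    Returns 0 when even the base layer is invalid (nothing can be decoded).
    """
    for position, is_valid in enumerate(valid_flags):
        if not is_valid:
            # position is 0-based, so it equals (1-based invalid index) - 1.
            return position
    # Assumption: with no invalid layer, the top layer is the usable one.
    return len(valid_flags)
```

For example, with four layers of which the third was lost, the function returns 2, so a decoder would use the payloads of layers 1 and 2 together with the enhancement side information portion assigned to layer 2.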

According to another aspect, a software program is described. The software program may be adapted for execution on a processor and for performing some or all of the method steps outlined in the present document when carried out on a computing device.

According to yet another aspect, a storage medium is described. The storage medium may comprise a software program adapted for execution on a processor and for performing some or all of the method steps outlined in the present document when carried out on a computing device.

Statements made with regard to any of the above aspects or their embodiments also apply to the respective other aspects or their embodiments, as the skilled person will appreciate. Repeating these statements for each and every aspect or embodiment has been omitted for reasons of conciseness.

The methods and apparatus, including their preferred embodiments as outlined in the present document, may be used stand-alone or in combination with the other methods and systems disclosed in this document. Furthermore, all aspects of the methods and apparatus outlined in the present document may be arbitrarily combined. In particular, the features of the claims may be combined with one another in an arbitrary manner.

Method steps and apparatus features may be interchanged in many ways. In particular, the details of the disclosed method can be implemented as an apparatus adapted to execute some or all of the steps of the method, and vice versa, as the skilled person will appreciate.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is explained below by way of example with reference to the accompanying drawings.

Fig. 1 is a flow chart illustrating an example of a method of layered encoding according to embodiments of the disclosure.

Fig. 2 is a block diagram schematically illustrating an example of an encoder stage according to embodiments of the disclosure.

Fig. 3 is a flow chart illustrating an example of a method of decoding a compressed sound representation of a sound or sound field that has been encoded to a plurality of hierarchical layers, according to embodiments of the disclosure.

Fig. 4A and Fig. 4B are block diagrams schematically illustrating examples of a decoder stage according to embodiments of the disclosure.

Fig. 5 is a block diagram schematically illustrating an example of a hardware implementation of an encoder according to embodiments of the disclosure.

Fig. 6 is a block diagram schematically illustrating an example of a hardware implementation of a decoder according to embodiments of the disclosure.

EMBODIMENTS OF THE INVENTION

First, a compressed sound (or sound field) representation (henceforth referred to as the compressed sound representation for brevity) to which the methods and encoders/decoders according to the present disclosure are applicable will be described. In general, the complete compressed sound (or sound field) representation (henceforth referred to as the complete compressed sound representation for brevity) may comprise (e.g., consist of) the following three components: a basic compressed sound (or sound field) representation (henceforth referred to as the basic compressed sound representation for brevity), basic side information, and enhancement side information.

The basic compressed sound representation itself comprises (e.g., consists of) a number of components (e.g., complementary components). The basic compressed sound representation may account for the distinctly largest percentage of the complete compressed sound representation. The basic compressed sound representation may consist of monaural transport signals representing either predominant sound signals or coefficient sequences of the original HOA representation.

The basic side information is needed to decode the basic compressed sound representation and may be assumed to be much smaller in size than the basic compressed sound representation. It may be made up, for the most part, of disjoint portions, each of which specifies the decompression of only one particular component of the basic compressed sound representation. The basic side information may comprise a first part, which may be known as independent basic side information, and a second part, which may be known as additional basic side information.

Both the first and the second part, i.e., the independent basic side information and the additional basic side information, may specify the decompression of particular components of the basic compressed sound representation. The second part is optional and may be omitted. In this case, the compressed sound representation may be said to comprise the first part (e.g., basic side information).

The first part (e.g., basic side information) may contain side information that describes individual (complementary) components of the basic compressed sound representation independently of other (complementary) components. In particular, the first part (e.g., basic side information) may specify the decoding of one or more of the plurality of components individually, independently of the other components. Thus, the first part may be referred to as independent basic side information.

The second (optional) part may contain side information, also known as additional basic side information, that describes individual (complementary) components of the basic compressed sound representation dependently of other (complementary) components. This second part may also be referred to as dependent basic side information. In particular, the dependence may have the following properties:
- The dependent basic side information for each individual (complementary) component of the basic compressed sound representation may attain its greatest extent when certain other (complementary) components are not contained in the basic compressed sound representation.
- If additional certain (complementary) components are added to the basic compressed sound representation, the dependent basic side information for the individual (complementary) component under consideration may become a subset of the original dependent basic side information, thereby reducing its size.

The enhancement side information is also optional. It may be used to improve or enhance (e.g., parametrically improve or enhance) the basic compressed sound representation. Its size may also be assumed to be much smaller than that of the basic compressed sound representation.

Thus, in embodiments, the compressed sound representation may comprise a basic compressed sound representation comprising a plurality of components, basic side information for decoding (e.g., decompressing) the basic compressed sound representation to a basic reconstructed sound representation of the sound or sound field, and enhancement side information including parameters for improving or enhancing (e.g., parametrically improving or enhancing) the basic reconstructed sound representation.

The compressed sound representation may further comprise additional basic side information for decoding (e.g., decompressing) the basic compressed sound representation to the basic reconstructed sound representation, which may include information that specifies the decoding of one or more of the plurality of components dependently of respective other components.

One example of this type of complete compressed sound representation is given by the compressed Higher-Order Ambisonics (HOA) sound field representation as specified by the preliminary version of the MPEG-H 3D audio standard (reference 1), chapter 12 and annex C.5. That is, the compressed sound representation may correspond to a compressed HOA representation of the sound (or sound field).

For this example, the basic compressed sound field representation (basic compressed sound representation) may comprise (e.g., may be identified with) a number of components. The components may be (e.g., correspond to) monaural signals. The monaural signals may be quantized monaural signals. The monaural signals may represent either predominant sound signals or coefficient sequences of an ambient HOA sound field component.

The basic side information may describe, among other things, for each of these monaural signals how it spatially contributes to the sound field. For example, the basic side information may specify a predominant sound signal as a purely directional signal, meaning a general plane wave with a certain direction of incidence. Alternatively, the basic side information may specify a monaural signal as a coefficient sequence of the original HOA representation having a certain index. The basic side information may also be divided into a first part and a second part, as indicated above.

The first part is side information (e.g., independent basic side information) related to specific individual monaural signals. This independent basic side information is independent of the existence of other monaural signals. Such side information may, for example, specify a monaural signal to represent a directional signal (e.g., meaning a general plane wave) with a certain direction of incidence. Alternatively, a monaural signal may be specified as a coefficient sequence of the original HOA representation having a certain index. The first part may be referred to as independent basic side information. In general, the first part (e.g., basic side information) may specify the decoding of one or more of the plurality of monaural signals individually, independently of the other monaural signals.

The second part is side information (e.g., additional basic side information) related to specific individual monaural signals. This side information depends on the existence of other monaural signals. Such side information may be used, for example, if monaural signals are specified as vector-based signals (see, e.g., reference 1, section 12.4.2.4.4). These signals are directionally distributed within the sound field, where the directional distribution may be specified by means of a vector. In a certain mode (see, e.g., CodedVVecLength = 1), particular components of this vector are implicitly set to zero and are not part of the compressed vector representation. These components are those whose indices equal the indices of the coefficient sequences of the original HOA representation that are part of the basic compressed sound representation. This means that, if individual components of the vector are coded, their total number may depend on the basic compressed sound representation. In particular, the total number may depend on which coefficient sequences of the original HOA representation it contains.

If no coefficient sequences of the original HOA representation are contained in the basic compressed sound representation, the dependent basic side information for each vector-based signal consists of all vector components and has its greatest size. If coefficient sequences of the original HOA representation with certain indices are added to the basic compressed sound representation, the vector components with those indices are removed from the side information for each vector-based signal, thereby reducing the size of the dependent basic side information for the vector-based signals.
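The dependency just described can be illustrated with a short sketch. It relies on the fact that an HOA representation of order N has (N + 1)^2 coefficient sequences; the function name `coded_vector_indices`, its parameters, and the 1-based indexing are assumptions made for this example and are not taken from the standard.

```python
from typing import Iterable, List


def coded_vector_indices(hoa_order: int,
                         transmitted_coefficient_indices: Iterable[int]) -> List[int]:
    """List the vector components that still have to be coded.

    An HOA representation of order N has (N + 1)**2 coefficient sequences,
    so each vector has that many components. Components whose index matches
    a coefficient sequence already carried in the basic compressed sound
    representation are implicitly zero and are therefore left out.
    Indices are 1-based here (an assumption of this sketch).
    """
    num_components = (hoa_order + 1) ** 2
    excluded = set(transmitted_coefficient_indices)
    return [i for i in range(1, num_components + 1) if i not in excluded]
```

With order 2 and no transmitted coefficient sequences, all nine components are coded, which is the largest size of the dependent side information; each transmitted coefficient sequence then removes one component from every vector.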

The enhancement side information may comprise parameters related to (broadband) spatial prediction (see reference 1, section 12.4.2.4.3) and/or parameters related to sub-band directional signals synthesis and parametric ambience replication.

The parameters related to (broadband) spatial prediction may be used to (linearly) predict missing portions of the sound field from the directional signals.
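As a rough illustration of what a linear prediction from directional signals means, the sketch below forms a missing signal as a weighted sum of the directional signals of one frame. It is a generic weighted sum under stated assumptions, not the spatial prediction procedure of the standard: the function name `predict_missing_signal` and the representation of the prediction parameters as one real-valued weight per directional signal are hypothetical.

```python
from typing import List, Sequence


def predict_missing_signal(directional_signals: Sequence[Sequence[float]],
                           weights: Sequence[float]) -> List[float]:
    """Predict one missing signal as a weighted sum of directional signals.

    directional_signals: one sample sequence per directional signal (one frame).
    weights: one hypothetical prediction weight per directional signal.
    """
    if len(directional_signals) != len(weights):
        raise ValueError("need exactly one weight per directional signal")
    if not directional_signals:
        return []
    frame_length = len(directional_signals[0])
    predicted = [0.0] * frame_length
    for signal, weight in zip(directional_signals, weights):
        if len(signal) != frame_length:
            raise ValueError("all signals must share the same frame length")
        # Accumulate this signal's weighted contribution sample by sample.
        for n, sample in enumerate(signal):
            predicted[n] += weight * sample
    return predicted
```

The point of the sketch is the size argument made in the text: only the weights travel as side information, which is far less data than an additional quantized signal would require.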

Sub-band directional signals synthesis and parametric ambience replication are compression tools that were recently introduced into the MPEG-H 3D audio standard by an amendment (see reference 2, section 11). These two tools allow a frequency-dependent parametric prediction of additional monaural signals that are spatially distributed in order to complement a spatially incomplete or deficient compressed HOA representation. The prediction may be based on the coefficient sequences of the basic compressed sound representation.

It is important to note that the aforementioned complementary contribution to the sound field is represented within the compressed HOA representation not by means of additional quantized signals, but by means of additional side information of comparatively much smaller size. Hence, the two coding tools mentioned are especially suited to the compression of HOA representations at low data rates.

A second example of a compressed representation of one or more monaural signals with the above-mentioned structure may comprise coded spectral information for disjoint frequency bands up to a certain upper frequency, which can be regarded as the basic compressed representation; basic side information specifying the coded spectral information (e.g., by the number and width of the coded frequency bands); and enhancement side information comprising (e.g., consisting of) Spectral Band Replication (SBR) parameters that describe how to parametrically reconstruct, from the basic compressed representation, the spectral information for higher frequency bands that are not considered in the basic compressed representation.

The present disclosure proposes a method of layered coding of a complete compressed sound (or sound field) representation having the above-mentioned structure.

The compression may be frame-based in the sense that it provides compressed representations (in the form of data packets or, equivalently, frame payloads) for successive time intervals. The time intervals may be of equal or different size. These data packets may be assumed to contain a validity flag, a value indicating their size, and the actual compressed representation data. In the following, without intended limitation, the compression is assumed to be frame-based. Furthermore, unless indicated otherwise and without intended limitation, the focus is on the processing of a single frame, and the frame index is therefore omitted.
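A data packet of the kind described above (validity flag, size value, compressed data) could be modelled as in the following sketch. The byte layout is purely hypothetical (a 1-byte flag followed by a 4-byte big-endian size), as are the names `FramePacket`, `pack` and `unpack`; the text does not specify any serialization.

```python
import struct
from dataclasses import dataclass
from typing import Tuple

# Hypothetical header layout: 1-byte validity flag, 4-byte big-endian payload size.
HEADER = struct.Struct(">BI")


@dataclass
class FramePacket:
    valid: bool   # validity flag of the packet
    data: bytes   # actual compressed representation data

    def pack(self) -> bytes:
        """Serialize as header (flag, size) followed by the payload."""
        return HEADER.pack(int(self.valid), len(self.data)) + self.data

    @classmethod
    def unpack(cls, buffer: bytes, offset: int = 0) -> Tuple["FramePacket", int]:
        """Parse one packet at `offset`; return it and the offset of the next one."""
        flag, size = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        end = start + size
        if end > len(buffer):
            raise ValueError("truncated packet")
        return cls(bool(flag), buffer[start:end]), end
```

Carrying the size in the header lets a reader skip over a packet it cannot or need not decode, which is what makes per-packet validity flags useful in a layered stream.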

Each frame payload of the complete compressed sound (or sound field) representation under consideration is assumed to contain $J$ data packets (or frame payloads), each for one component of the basic compressed sound representation, denoted by $\mathrm{BSRC}_j$, $j = 1, \ldots, J$. Further, it is assumed to contain a packet with independent basic side information (basic side information), denoted by $\mathrm{BSI}_I$, which specifies particular components $\mathrm{BSRC}_j$ of the basic compressed sound representation independently of the other components. Optionally, it may additionally be assumed to contain a packet with dependent basic side information (additional basic side information), denoted by $\mathrm{BSI}_D$, which specifies particular components $\mathrm{BSRC}_j$ of the basic compressed sound representation in dependence on other components.

The information contained in the two data packets $\mathrm{BSI}_I$ and $\mathrm{BSI}_D$ can optionally be grouped into a single data packet $\mathrm{BSI}$ of basic side information. The single data packet $\mathrm{BSI}$ may be said to contain, among other things, $J$ portions, each of which specifies one particular component $\mathrm{BSRC}_j$ of the basic compressed sound representation. Each of these portions may in turn be said to contain a portion of independent side information and, optionally, a portion of dependent side information.

Finally, the frame payload may include an enhancement side information payload (enhancement side information), denoted by $\mathrm{ESI}$, describing how to improve or enhance the sound (or sound field) reconstructed from the complete basic compressed sound representation.

The proposed solution for layered coding addresses the steps required to enable both the compression part, which includes packing the data packets for transmission, and the receiver and decompression part. Each part is described in detail below.

First, compression and packing (e.g., for transmission) will be described. In particular, the components and elements of the complete compressed sound (or sound field) representation in the case of layered coding will be described.

FIG. 1 schematically illustrates a flowchart of an example of a method for compression and packing (e.g., a method of encoding, or a method of layered encoding of a compressed sound representation of a sound or sound field). The assignment (e.g., allocation) of the individual payloads to the base layer and the (M-1) enhancement layers may be accomplished by a transport layer packer. FIG. 2 schematically illustrates a block diagram of an example of the assignment/allocation of the individual payloads.

As indicated above, the complete compressed sound representation 2100 may relate, for example, to a compressed HOA representation comprising a basic compressed sound representation.

The complete compressed sound representation 2100 may comprise a plurality of components (e.g., monaural signals) 2110-1, ..., 2110-J, independent basic side information (basic side information) 2120, optional enhancement side information (enhancement side information) 2140, and optional dependent basic side information (additional basic side information) 2130. The basic side information 2120 may be information for decoding the basic compressed sound representation to a basic reconstructed sound representation of the sound or sound field. The basic side information 2120 may include information that specifies the decoding of one or more components (e.g., monaural signals) individually, independently of other components. The enhancement side information 2140 may include parameters for improving (e.g., enhancing) the basic reconstructed sound representation. The additional basic side information 2130 may be (additional) information for decoding the basic compressed sound representation to the basic reconstructed sound representation, and may include information that specifies the decoding of one or more of the plurality of components in dependence on respective other components.

FIG. 2 illustrates the underlying assumption that there is a plurality of hierarchical layers, including one base layer (basic layer) and one or more (hierarchical) enhancement layers.

For example, there may be M layers in total, i.e., one base layer and M-1 enhancement layers.

The plurality of hierarchical layers has a successively increasing layer index. The lowest value of the layer index (e.g., layer index 1) corresponds to the base layer. It is further understood that the layers are ordered from the base layer, through the enhancement layers, up to the overall highest enhancement layer (i.e., the overall highest layer).

The proposed method may be performed on a frame basis (i.e., frame by frame). In particular, the compressed sound representation 2100 may be compressed for successive time intervals, for example time intervals of equal size. Each time interval may correspond to a frame. The steps described below may be performed for each successive time interval (e.g., frame).

At S1010 in FIG. 1, the plurality of components 2110 is sub-divided into a plurality of groups of components. Each of the plurality of groups is then assigned (e.g., added or allocated) to a respective one of the plurality of hierarchical layers. The number of groups corresponds to the number of layers. For example, the number of groups may be equal to the number of layers, so that there is one group of components for each layer. As indicated above, the plurality of layers may include a base layer and one or more (e.g., M-1) hierarchical enhancement layers.

In other words, the basic compressed sound representation is subdivided into parts to be assigned to the individual layers. Without loss of generality, the grouping can be described by $M+1$ numbers $j_m$, $m = 0, \ldots, M$, with $j_0 = 1$ and $j_M = J + 1$, such that the component $\mathrm{BSRC}_j$ is assigned to the $m$-th layer for $j_{m-1} \le j < j_m$.
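The grouping by boundary indices can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `split_into_layers` and the string labels for the components are hypothetical, and the boundaries are taken to be 1-based numbers j_0, ..., j_M with j_0 = 1 and j_M = J + 1, so that layer m receives the components j with j_(m-1) <= j < j_m.

```python
from typing import List, Sequence


def split_into_layers(components: Sequence[str],
                      boundaries: Sequence[int]) -> List[List[str]]:
    """Split J components into M groups using M+1 boundary indices.

    boundaries[m] plays the role of j_m (1-based), with j_0 = 1 and
    j_M = J + 1. Layer m (1-based) receives components j_{m-1} <= j < j_m.
    """
    num_components = len(components)
    if boundaries[0] != 1 or boundaries[-1] != num_components + 1:
        raise ValueError("expected j_0 = 1 and j_M = J + 1")
    if any(lo > hi for lo, hi in zip(boundaries, boundaries[1:])):
        raise ValueError("boundaries must be non-decreasing")
    # Convert the 1-based half-open ranges to 0-based Python slices.
    return [list(components[lo - 1:hi - 1])
            for lo, hi in zip(boundaries, boundaries[1:])]


# Hypothetical example: J = 7 components split over M = 3 layers.
components = [f"BSRC_{j}" for j in range(1, 8)]
layers = split_into_layers(components, [1, 3, 5, 8])
```

Here the base layer holds the first two components, the first enhancement layer the next two, and the highest layer the remaining three.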

At S1020, the groups of components are assigned to their respective layers. At S1030, the basic side information 2120 is added (e.g., allocated) to the base layer (i.e., to the lowest one of the plurality of hierarchical layers).

Thus, due to its small size, it is proposed to include the complete basic side information (the basic side information and the optional additional basic side information) in the base layer, to avoid its unnecessary fragmentation.

If the compressed sound representation under consideration comprises dependent basic side information (additional basic side information), the method may further comprise (not shown in FIG. 1) decomposing the additional basic side information into a plurality of portions 2130-1, ..., 2130-M of additional basic side information. The portions of additional basic side information may then be added (e.g., allocated) to the base layer.

In other words, the portions of additional basic side information may be included in the base layer. Each portion of additional basic side information may correspond to a respective layer and may include information that specifies the decoding of one or more components assigned to the respective layer in dependence on other components assigned to the respective layer and any layers lower than the respective layer.

Thus, while the independent basic side information $\mathrm{BSI}_I$ (basic side information) 2120 is left unchanged for the assignment, the dependent basic side information has to be handled specially for layered coding, to allow correct decoding at the receiver side on the one hand, and to reduce the size of the dependent basic side information to be transmitted on the other hand. It is proposed to decompose the dependent basic side information into $M$ parts denoted by $\mathrm{BSI}_{D,m}$, $m = 1, \ldots, M$, where the $m$-th part contains the dependent basic side information for each of the components $\mathrm{BSRC}_j$, $j_{m-1} \le j < j_m$, of the basic compressed sound representation assigned to the $m$-th layer ($j_{m-1}$ and $j_m$ being the group boundaries of that layer), assuming that the optional dependent basic side information exists for the compressed sound representation under consideration. If the respective dependent side information does not exist for the compressed sound representation, the part $\mathrm{BSI}_{D,m}$ may be assumed to be empty. Each part $\mathrm{BSI}_{D,m}$ of dependent basic side information may depend on all components $\mathrm{BSRC}_j$, $1 \le j < j_m$, contained in all layers up to the $m$-th one (i.e., contained in all layers $m' = 1, \ldots, m$).
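The decomposition of the dependent basic side information can be sketched as below. The model is an assumption made for illustration only: the dependent side information is represented as a dictionary mapping a component index j to the set of component indices its decoding refers to, and `decompose_dependent_info` is a hypothetical helper. The check mirrors the constraint that the m-th part may only depend on components with indices below j_m.

```python
from typing import Dict, List, Sequence, Set


def decompose_dependent_info(depends_on: Dict[int, Set[int]],
                             boundaries: Sequence[int]) -> List[Dict[int, Set[int]]]:
    """Return M parts; part m holds entries for components j_{m-1} <= j < j_m."""
    parts = []
    for lo, hi in zip(boundaries, boundaries[1:]):
        part = {j: refs for j, refs in depends_on.items() if lo <= j < hi}
        for j, refs in part.items():
            # An entry may only reference components available up to this layer.
            if any(ref >= hi for ref in refs):
                raise ValueError(f"component {j} depends on a higher layer")
        parts.append(part)
    return parts


# Hypothetical example: J = 5 components, M = 2 layers (j_0 = 1, j_1 = 3, j_2 = 6).
depends_on = {2: {1}, 4: {1, 3}, 5: {4}}
parts = decompose_dependent_info(depends_on, [1, 3, 6])
```

A layer whose components carry no dependent side information simply yields an empty part, matching the case in which the part is assumed to be empty.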

If the independent basic side information packet $\mathrm{BSI}_I$ is of negligibly small size, it is reasonable to keep it as a whole and add (assign) it to the base layer.

Optionally, a decomposition similar to that of the dependent basic side information can also be carried out for the independent basic side information, providing the packets $\mathrm{BSI}_{I,m}$, $m = 1, \ldots, M$. This is useful for reducing the size of the base layer by adding (assigning) the parts of the independent basic side information to the layers that hold the corresponding components of the basic compressed sound representation.

At S1040, a plurality of portions 2140-1, ..., 2140-M of enhancement side information may be determined. Each portion of enhancement side information may include parameters for improving (e.g., enhancing) a reconstructed sound representation obtainable from the data included in the respective layer and any layers lower than the respective layer.

The reason for performing this step is that, in the case of layered coding, the enhancement side information has to be computed separately for each additional layer, since it is intended to enhance the preliminary decompressed sound (or sound field), which in turn depends on the layers available for decompression. In particular, the preliminary decompressed sound (or sound field) for a given highest decodable layer (highest usable layer) depends on the components included in the highest decodable layer and any layers lower than it. Hence, the compression has to provide $M$ individual enhancement side information data packets (portions of enhancement side information), denoted by $\mathrm{ESI}_m$, $m = 1, \ldots, M$, where the enhancement side information in the $m$-th data packet $\mathrm{ESI}_m$ is computed so as to enhance the sound (or sound field) representation obtained from all data contained in the base layer and the enhancement layers up to the $m$-th layer (i.e., all data contained in the $m$-th layer and any layers lower than the $m$-th layer).
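The per-layer computation of enhancement side information can be sketched as follows. The estimator passed in as `estimate` is a placeholder assumption (the actual parameter estimation is not specified here), and `per_layer_enhancement_info` is a hypothetical name. The point illustrated is only that portion m is derived from the cumulative set of components a decoder stopping at layer m would have available.

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def per_layer_enhancement_info(layers: Sequence[Sequence[str]],
                               estimate: Callable[[List[str]], T]) -> List[T]:
    """Compute one enhancement side information portion per layer.

    Portion m is derived from the components of layers 1..m, i.e. from
    what a decoder that stops at layer m would actually have available.
    """
    portions = []
    available: List[str] = []
    for group in layers:
        available.extend(group)
        portions.append(estimate(list(available)))
    return portions


# Placeholder estimator: records how many components the portion was tuned for.
layers = [["BSRC_1", "BSRC_2"], ["BSRC_3"], ["BSRC_4", "BSRC_5"]]
portions = per_layer_enhancement_info(
    layers, lambda comps: {"tuned_for": len(comps)})
```

With this toy estimator the three portions are tuned for 2, 3, and 5 components respectively, which shows why a single shared portion would not fit every truncation point.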

At S1050, the plurality of portions 2140-1, ..., 2140-M of enhancement side information is assigned (e.g., added or allocated) to the plurality of layers. Each of the plurality of portions of enhancement side information is assigned to a respective one of the plurality of layers.

For example, each of the plurality of layers includes a respective portion of enhancement side information.

The assignment of basic and/or enhancement side information to the respective layers may be indicated in configuration information that is generated by the encoding method. In other words, the correspondence between the basic and/or enhancement side information and the respective layers may be indicated in the configuration information. Further, the configuration information may indicate, for each layer, the components of the basic compressed sound representation that are assigned to (e.g., included in) that layer. The portions of additional basic side information included in the base layer may nevertheless correspond to layers different from the base layer.
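One way to picture such configuration information is a per-layer record. The field names and the function `build_layer_config` below are purely hypothetical and do not reflect any actual bitstream syntax. The sketch assumes the layer boundaries are given as 1-based indices and that all dependent basic side information portions are carried in the base layer while still referring to their own layer.

```python
from typing import Dict, List, Sequence


def build_layer_config(boundaries: Sequence[int]) -> List[Dict[str, object]]:
    """Describe, per layer, its components and side information portions."""
    config = []
    for m, (lo, hi) in enumerate(zip(boundaries, boundaries[1:]), start=1):
        config.append({
            "layer": m,
            "components": list(range(lo, hi)),   # component indices in layer m
            "enhancement_portion": m,            # portion m travels in layer m
            "dependent_portion_carried_in": 1,   # stored in the base layer ...
            "dependent_portion_belongs_to": m,   # ... but refers to layer m
        })
    return config


# Hypothetical example: 5 components over 2 layers.
config = build_layer_config([1, 3, 6])
```

A decoder reading this record could tell which components to expect in each layer and which base-layer side information portion applies to them.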

To summarize, at the compression stage a frame data packet, denoted by FRAME, is provided having the following composition:

$$\mathrm{FRAME} = [\,\mathrm{BSRC}_1\ \ \mathrm{BSRC}_2\ \ldots\ \mathrm{BSRC}_J\ \ \mathrm{BSI}_I\ \ \mathrm{BSI}_{D,1}\ \ \mathrm{BSI}_{D,2}\ \ldots\ \mathrm{BSI}_{D,M}\ \ \mathrm{ESI}_1\ \ \mathrm{ESI}_2\ \ldots\ \mathrm{ESI}_M\,] \qquad (1)$$

Further, the packets $\mathrm{BSI}_I$ and $\mathrm{BSI}_{D,m}$, $m = 1, \ldots, M$, could be grouped into a single packet $\mathrm{BSI}$, in which case the frame data packet, denoted by FRAME, would have the following composition:

$$\mathrm{FRAME} = [\,\mathrm{BSRC}_1\ \ \mathrm{BSRC}_2\ \ldots\ \mathrm{BSRC}_J\ \ \mathrm{BSI}\ \ \mathrm{ESI}_1\ \ \mathrm{ESI}_2\ \ldots\ \mathrm{ESI}_M\,] \qquad (2)$$

The order of the individual payloads within the frame data packet may in general be arbitrary.

The individual data packets may then be grouped into payloads, which are defined as special data packets that contain a validity flag, a value indicating their size, and the actual compressed representation data. The use of payloads allows simple demultiplexing at the receiver side, with the advantage that irrelevant payloads can be discarded without having to parse them. One possible grouping is given by

- assigning (e.g., allocating) each packet $\mathrm{BSRC}_j$, $j = 1, \ldots, J$, to an individual basic payload denoted by $\mathrm{BP}_j$;
- assigning (e.g., allocating) the $m$-th enhancement side information data packet $\mathrm{ESI}_m$ and the $m$-th dependent side information data packet $\mathrm{BSI}_{D,m}$ to one enhancement payload denoted by $\mathrm{EP}_m$, $m = 1, \ldots, M$;
- assigning the independent basic side information packet $\mathrm{BSI}_I$ to a separate side information payload denoted by $\mathrm{BSP}$.

Optionally, if the independent basic side information is large, each $m$-th part thereof, $\mathrm{BSI}_{I,m}$, $m = 1, \ldots, M$, may be assigned (e.g., allocated) to the enhancement payload $\mathrm{EP}_m$. In this case the side information payload $\mathrm{BSP}$ is empty and can be ignored. Another option is to assign all dependent basic side information data packets $\mathrm{BSI}_{D,m}$ to the side information payload $\mathrm{BSP}$, which is reasonable if the dependent basic side information is small.
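The benefit of wrapping data in payloads that carry a validity flag and a size can be illustrated with a small sketch. The byte layout used here (a 1-byte flag followed by a 4-byte big-endian size) is an assumption chosen for the example and is not taken from the source; `pack_payload` and `iter_payloads` are hypothetical helpers.

```python
import struct
from typing import Iterator, Tuple

# Assumed layout: 1-byte validity flag, 4-byte big-endian size, then the data.
HEADER = struct.Struct(">BI")


def pack_payload(data: bytes, valid: bool = True) -> bytes:
    """Prefix the data with its validity flag and size."""
    return HEADER.pack(1 if valid else 0, len(data)) + data


def iter_payloads(stream: bytes) -> Iterator[Tuple[bool, int, int]]:
    """Yield (valid, start, end) per payload without parsing its data."""
    offset = 0
    while offset < len(stream):
        flag, size = HEADER.unpack_from(stream, offset)
        start = offset + HEADER.size
        yield bool(flag), start, start + size
        offset = start + size  # skip the body using only the size field


stream = (pack_payload(b"BP1-data")
          + pack_payload(b"", valid=False)
          + pack_payload(b"EP1"))
spans = list(iter_payloads(stream))
wanted = stream[spans[2][1]:spans[2][2]]  # jump straight to the third payload
```

Because each header states the body length, a receiver can step over payloads it does not need by reading only the headers.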

Finally, a frame data packet, denoted by FRAME, may be provided having the following composition:

$$\mathrm{FRAME} = [\,\mathrm{BP}_1\ \ \mathrm{BP}_2\ \ldots\ \mathrm{BP}_J\ \ \mathrm{BSP}\ \ \mathrm{EP}_1\ \ \mathrm{EP}_2\ \ldots\ \mathrm{EP}_M\,] \qquad (3)$$
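The grouping of data packets into basic payloads, one side information payload, and enhancement payloads can be sketched as below. Packets are represented by string labels, and the tuple representation and the function name `group_into_payloads` are hypothetical. The sketch follows the first grouping option described above, in which each enhancement payload carries the enhancement portion and the dependent side information portion of its layer.

```python
from typing import List, Sequence, Tuple

Payload = Tuple[str, List[str]]


def group_into_payloads(bsrc: Sequence[str], bsi_i: str,
                        bsi_d: Sequence[str], esi: Sequence[str]) -> List[Payload]:
    """Arrange packets as [BP_1 ... BP_J, BSP, EP_1 ... EP_M]."""
    if len(bsi_d) != len(esi):
        raise ValueError("need one dependent and one enhancement portion per layer")
    basic_payloads = [("BP", [packet]) for packet in bsrc]
    side_info_payload = ("BSP", [bsi_i])
    enhancement_payloads = [("EP", [e, d]) for e, d in zip(esi, bsi_d)]
    return basic_payloads + [side_info_payload] + enhancement_payloads


# Hypothetical example: J = 3 components, M = 2 layers.
frame = group_into_payloads(
    ["BSRC_1", "BSRC_2", "BSRC_3"], "BSI_I",
    ["BSI_D,1", "BSI_D,2"], ["ESI_1", "ESI_2"])
```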

The method may further comprise (not shown in FIG. 1) generating, for each of the plurality of layers, a transport layer packet (e.g., a base layer packet 2200 and M-1 enhancement layer packets 2300-1, ..., 2300-(M-1)) that includes the data of the respective layer (e.g., components, basic side information, and enhancement side information for the base layer, or components and enhancement side information for the one or more enhancement layers).

The transport layer packets for different layers may have different transmission priorities. Accordingly, the method may further comprise (not shown in FIG. 1) generating a transport stream for transmitting the data of the plurality of layers, in which the base layer has the highest transmission priority and the hierarchical enhancement layers have decreasing transmission priorities. A higher transmission priority may correspond to a greater degree of error protection, and vice versa.
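The relation between layer index, transmission priority, and error protection can be sketched as follows. The numeric priority scale, the `parity_bytes` field standing in for the degree of error protection, and the linear scaling rule are all assumptions made for illustration; the source only states that priority decreases from the base layer upward and that higher priority may mean more protection.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TransportPacket:
    layer: int         # 1 = base layer
    priority: int      # larger value = higher transmission priority (assumed scale)
    parity_bytes: int  # hypothetical amount of error-protection overhead
    data: bytes


def build_transport_stream(layer_data: List[bytes],
                           max_parity: int = 32) -> List[TransportPacket]:
    """Assign decreasing priority and protection from the base layer upward."""
    num_layers = len(layer_data)
    packets = []
    for index, data in enumerate(layer_data):
        priority = num_layers - index  # base layer gets the highest value
        parity = max_parity * priority // num_layers
        packets.append(TransportPacket(index + 1, priority, parity, data))
    return packets


stream = build_transport_stream([b"base", b"enh-1", b"enh-2", b"enh-3"])
```

In this toy example the base layer receives priority 4 and the full protection budget, while the highest enhancement layer receives priority 1 and a quarter of it.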

Unless steps require certain other steps as prerequisites, the aforementioned steps may be performed in any order, and the exemplary order shown in FIG. 1 is not intended to be limiting.

FIG. 3 illustrates a method of decoding a compressed sound representation of a sound or sound field, for decoding or decompression. Examples of the corresponding receiver and decompression stage are schematically illustrated in the block diagrams of FIG. 4A and FIG. 4B.

As follows from the above, the compressed sound representation may be encoded in a plurality of hierarchical layers. The plurality of layers may have assigned thereto (e.g., may include) the components of the basic compressed sound representation, the components being assigned to respective layers in respective groups of components. The base layer may include the basic side information for decoding the basic compressed sound representation.

Each layer may include one of the aforementioned portions of enhancement side information, which includes parameters for improving a basic reconstructed sound representation obtainable from the data included in the respective layer and any layers lower than the respective layer.

The proposed method may be performed on a frame basis (i.e., frame by frame). In particular, the reconstructed representation of the sound or sound field may be generated for successive time intervals, for example time intervals of equal size. The time intervals may be frames, for example. The steps described below may be performed for each successive time interval (e.g., frame).

At S3010, data payloads (e.g., transport layer packets) corresponding to the plurality of layers are received. The data payloads may be received as part of a bitstream containing the compressed HOA representation of the sound or sound field, the representation corresponding to the plurality of hierarchical layers. The hierarchical layers include a base layer and one or more hierarchical enhancement layers. The plurality of layers has assigned thereto the components of the basic compressed sound representation of the sound or sound field. The components are assigned to respective layers in respective groups of components.

The individual layer packets may be multiplexed to provide the received frame packet of the complete compressed sound representation. The received frame packet may be given by

[Equation (4), illegible in the source: the received frame packet, listing basic side information (BSI), enhancement side information (ESI) and component (BSRC) payloads]

In the alternative case in which the packets $\mathrm{BSI}_I$ and $\mathrm{BSI}_{D,m}$ for $m = 1, \ldots, M$ are combined into a single packet BSI, the individual layer packets may be multiplexed to provide the received frame packet

[Equation (5), illegible in the source]

[One line of text illegible in the source] the frame packet may be given by

[Equation (6), illegible in the source]

The received frame packet may then be passed to the decompressor or decoder 4100. If the transmission of an individual layer was error-free, the validity flag of at least the contained enhancement side information payload (e.g., the corresponding portion of enhancement side information) is set to "true". In case of an error in the transmission of an individual layer, the validity flag of at least the enhancement side information payload in that layer is set to "false". Hence, the validity of a layer packet can be determined from the validity of the contained enhancement side information payload (e.g., from its validity flag).

In the decompressor 4100, the received frame packet may be demultiplexed. For this purpose, the size information of each payload may be used to avoid unnecessary parsing of the data of the individual payloads.
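As a minimal sketch of how size information allows demultiplexing without parsing payload contents, the following Python example assumes a purely hypothetical packet layout in which every payload is preceded by a 2-byte big-endian size field. The layout and the function name `split_payloads` are illustrative assumptions and are not taken from the bitstream syntax of this document.

```python
import struct


def split_payloads(frame_packet: bytes) -> list:
    """Split a frame packet into payloads without inspecting payload data.

    Assumed (hypothetical) layout: [size:2 bytes, big-endian][payload] repeated.
    """
    payloads = []
    pos = 0
    while pos < len(frame_packet):
        if pos + 2 > len(frame_packet):
            raise ValueError("truncated size field")
        (size,) = struct.unpack_from(">H", frame_packet, pos)
        pos += 2
        if pos + size > len(frame_packet):
            raise ValueError("payload exceeds packet length")
        # Only the size field is read; the payload bytes are copied untouched.
        payloads.append(frame_packet[pos:pos + size])
        pos += size
    return payloads


# Three payloads of sizes 2, 0 and 3 bytes.
parts = split_payloads(b"\x00\x02ab\x00\x00\x00\x03xyz")
```

Because only the size fields are read, payloads that will not be used later can be skipped at essentially no cost.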

At S3020, a first layer index is determined that indicates the highest layer (e.g., the highest usable layer or the highest decodable layer) among the plurality of layers to be used for decoding the basic compressed sound representation into the basic reconstructed sound representation of the sound or sound field.

Further, at S3020, a value (e.g., layer index) $M_B$ of the highest layer (highest usable layer) that will be used for decompression of the basic sound representation may be selected. The highest enhancement layer that will actually be used for decompression of the basic sound representation is given by $M_B - 1$. Since each layer contains exactly one enhancement side information payload (portion of enhancement side information), it can be determined from the enhancement side information payload whether the containing layer is valid (e.g., has been validly received). Hence, the selection can be accomplished using all enhancement side information payloads $\mathrm{ESI}_m$, $m = 1, \ldots, M$.

At S3030, the basic reconstructed sound representation is obtained. The basic reconstructed sound representation may be obtained from the components assigned to the highest usable layer indicated by the first layer index and to any layers lower than this highest usable layer, using the basic side information (or, more generally, with reference to the basic side information).

The payloads of the components of the basic compressed sound representation may be provided, together with the basic side information payloads (e.g., BSI, or $\mathrm{BSI}_I$ and $\mathrm{BSI}_{D,m}$, $m = 1, \ldots, M$) and the value $M_B$, to the basic representation decompression processor 4200. The basic representation decompression processor 4200 (illustrated in FIGS. 4A and 4B) reconstructs the basic sound (or sound field) representation using only those components of the basic compressed sound representation that are contained in the lowest $M_B$ layers, which are the base layer and $M_B - 1$ enhancement layers (i.e., the layers up to the layer indicated by the first layer index). Alternatively, only the payloads of the components of the basic compressed sound representation contained in the lowest $M_B$ layers, together with the respective basic side information payloads, may be provided to the basic representation decompression processor 4200.
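The restriction to the lowest $M_B$ layers can be illustrated with a small Python sketch. It assumes, hypothetically, that the configuration information is available as a list `layer_groups` whose entry `m - 1` lists the identifiers of the components assigned to layer `m`. The function name, the data layout and the component labels are illustrative only.

```python
def components_for_basic_reconstruction(layer_groups, m_b):
    """Return the components of the lowest m_b layers, base layer first."""
    if not 0 <= m_b <= len(layer_groups):
        raise ValueError("m_b must lie between 0 and the number of layers")
    # Components of layers above m_b are simply ignored.
    return [component for group in layer_groups[:m_b] for component in group]


# Hypothetical grouping: base layer plus two enhancement layers.
layer_groups = [["BSRC_1", "BSRC_2"], ["BSRC_3"], ["BSRC_4", "BSRC_5"]]
selected = components_for_basic_reconstruction(layer_groups, 2)
```

With `m_b = 2`, only the base layer and the first enhancement layer contribute components; the third layer is left out even if it was received.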

The required information about which components of the basic compressed sound (or sound field) representation are contained in the individual layers is assumed to be known to the decompressor 4100 from a data packet with configuration information, which is assumed to be sent and received before the frame data packets.

To provide the dependent basic side information data packets $\mathrm{BSI}_{D,m}$ and the enhancement side information data packet $\mathrm{ESI}_{M_E}$, all enhancement payloads may be fed to the partial parser 4400 (see FIG. 4B) of the decompressor 4100 together with the value $M_E$ and the value $M_B$. The parser may discard all payloads and data packets that will not be used for the actual decompression. If the value $M_E$ is zero, all enhancement side information data packets may be assumed to be empty.

If the base layer includes at least one dependent basic side information payload (portion of additional basic side information) corresponding to a respective layer, decoding each dependent basic side information payload (e.g., $\mathrm{BSI}_{D,m}$) may include (i) decoding the portion of additional basic side information by referring to the components assigned to its respective layer and any layers lower than the respective layer (preliminary decoding), and (ii) correcting the portion of additional basic side information by referring to the components assigned to the highest usable layer and any layers between the highest usable layer and the respective layer (correction). Here, the additional basic side information corresponding to a respective layer includes information that specifies decoding of one or more of the components assigned to the respective layer in dependence on other components assigned to the respective layer and any layers lower than the respective layer.

The basic reconstructed sound representation may then be obtained (e.g., generated) from the components assigned to the highest usable layer and any layers lower than the highest usable layer, using the basic side information and the corrected portions of additional basic side information obtained from the portions of additional basic side information corresponding to the layers up to the highest usable layer.

In particular, the preliminary decoding of each payload $\mathrm{BSI}_{D,m}$ may exploit its dependence on the components of the basic compressed sound representation that are contained in the first $m$ layers, as was assumed at the encoding stage.

The subsequent correction of each payload $\mathrm{BSI}_{D,m}$ may take into account that the basic sound representation is finally reconstructed from the components of the basic compressed sound representation contained in the first $M_B$ layers, which are more components than were assumed for the preliminary decoding. Hence, the correction may be accomplished by discarding obsolete information. This is possible due to the initially assumed property of the dependent basic side information that, if certain complementary components are added to the basic compressed sound representation, the dependent basic side information for each individual (complementary) component becomes a subset of the original one.

At S3040, a second layer index may be determined. The second layer index may indicate the portion(s) of enhancement side information that should be used for improving (e.g., enhancing) the basic reconstructed sound representation.

In addition to the first layer index, an index $M_E$ (second layer index) of the enhancement side information payload (portion of enhancement side information) to be used for decompression may be determined. The second layer index $M_E$ may always be either equal to the first layer index $M_B$ or equal to zero. Accordingly, the enhancement is either always performed in accordance with the basic sound representation obtained from the highest usable layer, or not performed at all.

At S3050, the reconstructed sound representation of the sound or sound field is obtained (e.g., generated) from the basic reconstructed sound representation, with reference to the second layer index.

That is, the reconstructed sound representation is obtained by (parametrically) improving or enhancing the basic reconstructed sound representation, for example by using the enhancement side information (portion of enhancement side information) indicated by the second layer index. As described further below, the second layer index may indicate that no enhancement side information at all is to be used at this stage. The reconstructed sound representation then corresponds to the basic reconstructed sound representation.

To this end, the reconstructed basic sound representation, together with all enhancement side information payloads, the basic side information payloads (e.g., BSI, or $\mathrm{BSI}_I$ and $\mathrm{BSI}_{D,m}$, $m = 1, \ldots, M$) and the value $M_E$, is provided to the enhanced representation decompression processor 4300 (illustrated in FIGS. 4A and 4B), which computes the final enhanced representation 2100' of the sound (or sound field) using only the enhancement side information payload $\mathrm{ESI}_{M_E}$ and discarding all other enhancement side information payloads. Alternatively, only the enhancement side information payload $\mathrm{ESI}_{M_E}$, instead of all enhancement side information payloads, may be provided to the enhanced representation decompression processor 4300. If the value $M_E$ is zero, all enhancement side information payloads are discarded (or, alternatively, no enhancement side information payload is provided), and the reconstructed final enhanced representation 2100' of the sound is equal to the reconstructed basic sound representation. The enhancement side information payload $\mathrm{ESI}_{M_E}$ may be obtained by the partial parser 4400.
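The payload selection described above can be sketched in a few lines of Python. The sketch assumes that the enhancement side information payloads are held in a list indexed by layer and that the actual enhancement is supplied as a callable; `reconstruct_enhanced`, `toy_enhance` and the gain-based payloads are hypothetical stand-ins, not the processing performed by processor 4300.

```python
def reconstruct_enhanced(basic, esi_payloads, m_e, enhance):
    """Use only payload number m_e (1-based); m_e == 0 disables enhancement."""
    if not 0 <= m_e <= len(esi_payloads):
        raise ValueError("m_e must lie between 0 and the number of payloads")
    if m_e == 0:
        # No enhancement: the result equals the basic representation.
        return basic
    # All payloads other than number m_e are ignored.
    return enhance(basic, esi_payloads[m_e - 1])


def toy_enhance(basic, payload):
    """Placeholder enhancement: scale every sample by a gain (assumption)."""
    return [sample * payload["gain"] for sample in basic]


basic = [1.0, 2.0]
payloads = [{"gain": 2.0}, {"gain": 3.0}]
enhanced = reconstruct_enhanced(basic, payloads, 2, toy_enhance)
unchanged = reconstruct_enhanced(basic, payloads, 0, toy_enhance)
```

Passing the full payload list and the index keeps the caller simple; handing over only the selected payload, as in the alternative described above, would work equally well.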

FIG. 3 also generally illustrates decoding of the compressed HOA representation based on basic side information associated with the base layer and on enhancement side information associated with one or more hierarchical enhancement layers.

Unless steps require certain other steps as prerequisites, the aforementioned steps may be performed in any order, and the exemplary order shown in FIG. 3 is understood to be non-limiting.

Next, details of the selection of layers for decompression (selection of the first and second layer indices) at S3020 and S3040 will be described.

Determining the first layer index may involve determining, for each layer, whether the respective layer has been validly received. Determining the first layer index may further involve determining the first layer index as the index of the layer immediately below the lowest layer that has not been validly received. Whether a layer has been validly received may be determined by evaluating whether the enhancement side information payload of that layer has been validly received. This in turn may be done by evaluating the validity flags within the enhancement side information payloads.
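A minimal Python sketch of this rule follows. It assumes that the validity flags have already been extracted into a list `valid`, where `valid[m - 1]` holds the flag of layer `m` and layer 1 is the base layer; the function name and the flag representation are assumptions made for illustration.

```python
def first_layer_index(valid):
    """Index of the layer immediately below the lowest invalid layer.

    valid[m - 1] is the validity flag of layer m (layer 1 = base layer).
    Returns 0 if even the base layer was not validly received.
    """
    for m, is_valid in enumerate(valid, start=1):
        if not is_valid:
            return m - 1
    # Every layer was validly received.
    return len(valid)


index_with_gap = first_layer_index([True, True, False, True])
index_all_valid = first_layer_index([True, True, True])
index_base_lost = first_layer_index([False, True])
```

In the first call, layer 4 is valid but cannot be used, because the layers are hierarchical and layer 3 is missing.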

Determining the second layer index may generally involve either determining the second layer index to be equal to the first layer index, or determining, as the second layer index, an index value (e.g., the index value 0) indicating that no enhancement side information is to be used when obtaining the reconstructed sound representation.

In case all frame data packets can be decompressed independently of each other, both the number $M_B$ of the highest layer (highest usable layer) to be actually used for decompression of the basic sound representation and the index $M_E$ of the enhancement side information payload to be used for decompression may be set equal to the highest number $L$ of a valid enhancement side information payload, which itself may be determined by evaluating the validity flags within the enhancement side information payloads. By exploiting knowledge of the size of each enhancement side information payload, complicated parsing of the actual payload data for determining their validity can be avoided.

That is, the second layer index may be determined to be equal to the first layer index if the compressed sound representations for successive time intervals can be decoded independently. In this case, the reconstructed basic sound representation may be enhanced using the enhancement side information payload of the highest usable layer.

In case differential decompression with inter-frame dependencies is employed, the decision from the previous frame has to be considered in addition. Note that with differential decompression, independent frame data packets are usually transmitted at regular time intervals to allow decompression to start from these time instants, at which the determination of the values $M_B$ and $M_E$ becomes frame-independent and is carried out as described above.

To explain the proposed frame-dependent decision in detail, the highest number (e.g., layer index) of a valid enhancement side information payload for the $k$-th frame is denoted by $L(k)$, the number (e.g., layer index) of the highest layer to be selected and used for decompression of the basic sound representation is denoted by $M_B(k)$, and the number (e.g., layer index) of the enhancement side information payload to be used for decompression is denoted by $M_E(k)$.

Using this notation, the number $M_B(k)$ of the highest layer to be used for decompression of the basic sound representation may be computed according to

$$M_B(k) = \min\bigl(M_B(k-1),\, L(k)\bigr) \qquad (7)$$

By choosing $M_B(k)$ to be no greater than $M_B(k-1)$ and $L(k)$, it is ensured that all information required for differential decompression of the basic sound representation is available.

That is, if the compressed sound representations for successive time intervals (e.g., frames) cannot be decoded independently of each other, determining the first layer index may involve determining, for each layer, whether the respective layer has been validly received, and determining the first layer index for the given time interval as the smaller of the first layer index of the time interval preceding the given time interval and the index of the layer immediately below the lowest layer that has not been validly received.

The number $M_E(k)$ of the enhancement side information payload to be used for decompression may be determined according to

$$M_E(k) = \begin{cases} M_B(k) & \text{if } M_B(k) = M_B(k-1) \\ 0 & \text{else} \end{cases} \qquad (8)$$

Here, the choice of 0 for $M_E(k)$ indicates that the reconstructed basic sound representation is not to be improved or enhanced using enhancement side information.

This means in particular that, as long as the number $M_B(k)$ of the highest layer to be used for decompression of the basic sound representation does not change, the same corresponding enhancement layer number is selected. In case of a change of $M_B(k)$, however, the enhancement is disabled by setting $M_E(k)$ to zero. Due to the assumed differential decompression of the enhancement side information, changing it in accordance with $M_B(k)$ is not possible, since this would require decompression of the corresponding enhancement side information layer in the previous frame, which is assumed not to have been carried out.

That is, if the compressed sound representations for successive time intervals (e.g., frames) cannot be decoded independently of each other, determining the second layer index may involve determining whether the first layer index for the given time interval is equal to the first layer index for the preceding time interval. If the first layer index for the given time interval is equal to the first layer index for the preceding time interval, the second layer index for the given time interval may be determined (e.g., selected) to be equal to the first layer index for the given time interval. On the other hand, if the first layer index for the given time interval is not equal to the first layer index for the preceding time interval, an index value may be determined (e.g., selected) as the second layer index that indicates that no enhancement side information is to be used when obtaining the reconstructed sound representation.
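The two frame-dependent rules can be combined into one per-frame step, sketched below in Python. The first layer index is taken as the smaller of the previous frame's first layer index and the index of the highest validly received layer; the second layer index equals the first one if it did not change, and is zero otherwise. The function and parameter names are illustrative assumptions.

```python
def select_layer_indices(prev_m_b, highest_valid):
    """Return (m_b, m_e) for a frame that depends on the previous frame.

    prev_m_b:      first layer index chosen for the previous frame
    highest_valid: index of the highest validly received layer in this frame
    """
    # Never go above what was decoded in the previous frame.
    m_b = min(prev_m_b, highest_valid)
    # Enhancement is only kept while the first layer index is unchanged.
    m_e = m_b if m_b == prev_m_b else 0
    return m_b, m_e


steady = select_layer_indices(prev_m_b=3, highest_valid=3)
dropped = select_layer_indices(prev_m_b=3, highest_valid=2)
capped = select_layer_indices(prev_m_b=2, highest_valid=3)
```

In the last call the third layer is received again, but it stays unused because its differential state from the previous frame is missing.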

Alternatively, if during decompression all enhancement side information payloads with numbers up to $M_E(k)$ are decompressed in parallel, the selection rule in equation (4) can be replaced by

$$M_E(k) = M_B(k) \qquad (9)$$

Finally, it should be noted that, with differential decompression, the number $M_B$ of the highest layer used can only be increased at independent frame data packets, whereas it can be decreased at any frame.
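The constraint on changing the highest layer used can be illustrated with a small sketch. The function name, the notion of a "requested" layer number, and the boolean flag marking an independent frame are assumptions introduced here for illustration; they are not taken from the description.

```python
def update_highest_layer(previous: int, requested: int, independent_frame: bool) -> int:
    """Return the highest layer number to use for the current frame.

    Hypothetical helper: `previous` is the number used for the preceding frame,
    `requested` is the number the decoder would like to use now.
    """
    if independent_frame:
        # At an independent frame the number may go up or down.
        return requested
    # At a dependent frame the number may only stay the same or decrease.
    return min(requested, previous)
```

With this rule, a request to move from layer 2 to layer 4 at a dependent frame is held at 2 until the next independent frame, while a request to drop from 4 to 2 takes effect immediately.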

It should be understood that the proposed method of layered encoding of a compressed sound representation may be implemented by an encoder for layered encoding of a compressed sound representation. Such an encoder may comprise respective units adapted to carry out the respective steps described above. An example of such an encoder 5000 is schematically illustrated in Fig. 5. For example, such an encoder 5000 may comprise a component subdividing unit 5010 adapted to perform step S1010 described above, a component assignment unit 5020 adapted to perform step S1020, a basic side information assignment unit 5030 adapted to perform step S1030, an enhancement side information partitioning unit 5040 adapted to perform step S1040, and an enhancement side information assignment unit 5050 adapted to perform step S1050. It should further be understood that the respective units of such an encoder may be embodied by a processor 5100 of a computing device that is adapted to perform the processing carried out by each of these units, i.e., that is adapted to carry out some or all of the steps described above as well as any further steps of the proposed encoding method. The encoder or computing device may further comprise a memory 5200 that is accessible by the processor 5100.
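The division of work among the encoder units can be sketched in software roughly as follows. This is a minimal sketch, not the encoder of Fig. 5: the `Layer` class, the `encode_layers` function, and the `group_sizes` parameter are hypothetical names, the grouping is simply by consecutive runs of components, and the portions of enhancement side information (step S1040) are assumed to be computed elsewhere and passed in, one per layer.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Sequence


@dataclass
class Layer:
    """Hypothetical container for the data carried by one hierarchical layer."""
    components: List[Any] = field(default_factory=list)
    basic_side_info: Optional[Any] = None        # set on the base layer only
    enhancement_side_info: Optional[Any] = None  # portion valid for this layer and all lower ones


def encode_layers(components: Sequence[Any],
                  group_sizes: Sequence[int],
                  basic_side_info: Any,
                  enhancement_portions: Sequence[Any]) -> List[Layer]:
    """Distribute components and side information over layers (index 0 is the base layer)."""
    if not group_sizes:
        raise ValueError("at least one layer is required")
    if sum(group_sizes) != len(components):
        raise ValueError("group sizes must cover all components exactly")
    if len(enhancement_portions) != len(group_sizes):
        raise ValueError("one enhancement portion per layer is expected")

    layers: List[Layer] = []
    start = 0
    for size in group_sizes:
        # S1010 / S1020: subdivide the components into groups, one group per layer.
        layers.append(Layer(components=list(components[start:start + size])))
        start += size

    # S1030: the basic side information goes to the base layer.
    layers[0].basic_side_info = basic_side_info

    # S1050: assign each precomputed portion (S1040) to its layer.
    for layer, portion in zip(layers, enhancement_portions):
        layer.enhancement_side_info = portion
    return layers
```

Keeping each step in its own clearly marked part of the function mirrors the one-unit-per-step structure of the encoder; in a real implementation each part could equally be a separate class or hardware block.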

It should further be understood that the proposed method of decoding a compressed sound representation that is encoded in a plurality of hierarchical layers may be implemented by a decoder for decoding a compressed sound representation that is encoded in a plurality of hierarchical layers.

Such a decoder may comprise respective units adapted to carry out the respective steps described above. An example of such a decoder 6000 is schematically illustrated in Fig. 6.

For example, such a decoder 6000 may comprise a reception unit 6010 adapted to perform step S3010 described above, a first layer index determination unit 6020 adapted to perform step S3020, a basic reconstruction unit 6030 adapted to perform step S3030, a second layer index determination unit 6040 adapted to perform step S3040, and a unit 6050 adapted to perform step S3050. It should further be understood that the respective units of such a decoder may be embodied by a processor 6100 of a computing device that is adapted to perform the processing carried out by each of these units, i.e., that is adapted to carry out some or all of the steps described above as well as any further steps of the proposed decoding method.

The decoder or computing device may further comprise a memory 6200 that is accessible by the processor 6100.
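As an illustration of what the first layer index determination unit might compute, the sketch below follows the option stated in the claims, where the index is that of the layer immediately below the lowest layer that was not validly received. The function names and the representation of reception status as a list of booleans are assumptions; layers are numbered from 1 (base layer), so a result of 0 means that not even the base layer is usable.

```python
from typing import Any, List, Sequence


def first_layer_index(valid: Sequence[bool]) -> int:
    """Return the number of the highest usable layer.

    `valid[i]` is True if layer i + 1 was validly received (hypothetical input).
    """
    for position, ok in enumerate(valid):
        if not ok:
            # Layers 1..position are valid; layer position + 1 is the lowest invalid one.
            return position
    return len(valid)  # every layer was validly received


def usable_layers(layers: Sequence[Any], index: int) -> List[Any]:
    """Select the layers 1..index that feed the basic reconstruction."""
    return list(layers[:index])
```

For instance, if layers 1 and 2 arrive intact but layer 3 is lost, the index is 2 even when layer 4 was received, because layer 4 cannot be used without layer 3.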

It should be noted that the description and drawings merely illustrate the principles of the proposed methods and apparatus. It will thus be appreciated that those skilled in the art will be able to devise various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples recited herein are principally intended expressly to be for pedagogical purposes only, to aid the reader in understanding the principles of the proposed methods and apparatus and the concepts contributed by the inventors to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass equivalents thereof.

The methods and apparatus described in this document may be implemented as software, firmware and/or hardware. Certain components may, for example, be implemented as software running on a digital signal processor or microprocessor. Other components may, for example, be implemented as hardware and/or as application-specific integrated circuits. The signals encountered in the described methods and apparatus may be stored on media such as random access memory or optical storage media. They may be transferred via networks, such as radio networks, satellite networks, wireless networks or wireline networks, e.g., the Internet.

References:
1: ISO/IEC JTC1/SC29/WG11 23008-3:2015(E). Information technology - High efficiency coding and media delivery in heterogeneous environments - Part 3: 3D audio, February 2015.
2: ISO/IEC JTC1/SC29/WG11 23008-3:2015/PDAM3. Information technology - High efficiency coding and media delivery in heterogeneous environments - Part 3: 3D audio, AMENDMENT 3: MPEG-H 3D Audio Phase 2, July 2015.

Claims


1. A method of decoding a compressed Higher Order Ambisonics (HOA) representation of a sound or sound field, the method comprising: receiving a bitstream containing the compressed HOA representation (2100) corresponding to a plurality of hierarchical layers that include a base layer and two or more hierarchical enhancement layers, and containing basic side information (2120) associated with the base layer and enhancement side information (2140) associated with the two or more hierarchical enhancement layers, wherein the plurality of layers have assigned thereto components of a basic compressed sound representation of the sound or sound field, the components being assigned to respective layers in respective groups of components, and wherein the two or more hierarchical enhancement layers include a highest usable hierarchical enhancement layer, characterized in that each of the two or more hierarchical enhancement layers includes a portion of the enhancement side information (2140) including parameters for improving a basic reconstructed sound representation obtainable from data included in the respective layer and any layers lower than the respective layer, and the method further comprises decoding the compressed HOA representation (2100) based on the basic side information (2120) associated with the base layer, based on the portion of the enhancement side information (2140) associated with the highest usable hierarchical enhancement layer, and not based on the portion of the enhancement side information (2140) associated with any other layer of the two or more hierarchical enhancement layers.

2. The method according to claim 1, characterized in that the components of the basic compressed sound representation correspond to monaural signals (2110), and the monaural signals (2110) represent either predominant sound signals or coefficient sequences of an HOA representation.

3. The method according to any one of claims 1-2, characterized in that the bitstream includes data payloads respectively associated with one or more hierarchical layers.

4. The method according to any one of claims 1-3, characterized in that the enhancement side information (2140) includes parameters relating to at least one of: spatial prediction, sub-band directional signal synthesis, and parametric ambience replication, and/or wherein the enhancement side information (2140) includes information that enables prediction of missing portions of the sound or sound field from directional signals.

5. The method according to any one of claims 1-4, characterized in that it is determined, for each layer, whether the respective layer has been validly received, and a layer index of the layer immediately below the lowest layer that has not been validly received is determined.

6. The method according to claim 5, characterized in that it further comprises determining a further layer index that is either equal to the layer index or indicates that the enhancement side information (2140) is to be omitted during decoding.

7. The method according to any one of claims 1-6, characterized in that the base layer includes at least one portion of additional basic side information (2130) that is associated with a respective layer and includes information that specifies decoding of one or more of the components assigned to the respective layer in dependence on other components assigned to the respective layer and any layers lower than the respective layer, the method comprising, for each portion of additional basic side information (2130): decoding the portion of additional basic side information (2130) by referring to the components assigned to its respective layer and any layers lower than the respective layer, and correcting the portion of additional basic side information by referring to the components assigned to the highest usable hierarchical enhancement layer and any layers between the highest usable hierarchical enhancement layer and the respective layer, wherein the basic reconstructed sound representation is obtained from the components assigned to the highest usable hierarchical enhancement layer and any layers lower than the highest usable hierarchical enhancement layer, using the basic side information (2120) and corrected portions of additional basic side information (2130) obtained from the portions of additional basic side information (2130) corresponding to layers up to the highest usable hierarchical enhancement layer.

8. An apparatus (6000) for decoding a compressed Higher Order Ambisonics (HOA) representation of a sound or sound field, the apparatus (6000) comprising: a receiver (6010) for receiving a bitstream containing the compressed HOA representation (2100) corresponding to a plurality of hierarchical layers that include a base layer and two or more hierarchical enhancement layers, and containing basic side information (2120) associated with the base layer and enhancement side information (2140) associated with the two or more hierarchical enhancement layers, wherein the plurality of layers have assigned thereto components of a basic compressed sound representation of the sound or sound field, the components being assigned to respective layers in respective groups of components, and wherein the two or more hierarchical enhancement layers include a highest usable hierarchical enhancement layer, characterized in that each of the two or more hierarchical enhancement layers includes a portion of the enhancement side information (2140) including parameters for improving a basic reconstructed sound representation obtainable from data included in the respective layer and any layers lower than the respective layer, and the apparatus (6000) further comprises a decoder (6020, 6030, 6040, 6050) for decoding the compressed HOA representation (2100) based on the basic side information (2120) associated with the base layer, based on the portion of the enhancement side information (2140) associated with the highest usable hierarchical enhancement layer, and not based on the portion of the enhancement side information (2140) associated with any other layer of the two or more hierarchical enhancement layers.

9. The apparatus (6000) according to claim 8, characterized in that the components of the basic compressed sound representation correspond to monaural signals (2110), and the monaural signals (2110) represent either predominant sound signals or coefficient sequences of an HOA representation.

10. The apparatus (6000) according to any one of claims 8-9, characterized in that the bitstream includes data payloads respectively associated with one or more hierarchical layers.

11. The apparatus (6000) according to any one of claims 8-10, characterized in that the enhancement side information (2140) includes parameters relating to at least one of: spatial prediction, sub-band directional signal synthesis, and parametric ambience replication, and/or wherein the enhancement side information (2140) includes information that enables prediction of missing portions of the sound or sound field from directional signals.

12. The apparatus (6000) according to any one of claims 8-11, characterized in that it is adapted to: determine, for each layer, whether the respective layer has been validly received, and determine a layer index of the layer immediately below the lowest layer that has not been validly received.

13. The apparatus according to claim 12, characterized in that it is further adapted to determine a further layer index that is either equal to the layer index or indicates that the enhancement side information (2140) is to be omitted during decoding.

14. The apparatus according to any one of claims 8-13, characterized in that the base layer includes at least one portion of additional basic side information (2130) that is associated with a respective layer and includes information that specifies decoding of one or more of the components assigned to the respective layer in dependence on other components assigned to the respective layer and any layers lower than the respective layer, and wherein, for each portion of additional basic side information (2130), the apparatus (6000) is adapted to: decode the portion of additional basic side information (2130) by referring to the components assigned to its respective layer and any layers lower than the respective layer, and correct the portion of additional basic side information (2130) by referring to the components assigned to the highest usable hierarchical enhancement layer and any layers between the highest usable hierarchical enhancement layer and the respective layer, wherein the basic reconstructed sound representation is obtained from the components assigned to the highest usable hierarchical enhancement layer and any layers lower than the highest usable hierarchical enhancement layer, using the basic side information (2120) and corrected portions of additional basic side information (2130) obtained from the portions of additional basic side information (2130) corresponding to layers up to the highest usable hierarchical enhancement layer.

15. A non-transitory machine-readable medium containing computer-interpretable instructions that, when executed by one or more processors of a computing device, cause the computing device to perform the method according to any one of claims 1-7.

Fig. 1. Flowchart of the layered encoding method: S1010, subdivide the plurality of components into groups; S1020, assign the groups of components to respective layers; S1030, assign the basic side information to the base layer; S1040, determine the plurality of portions of enhancement side information; S1050, assign the portions of enhancement side information to the respective layers.

ТТ, ї Пакет транспортних рівнів Е Шини 21303... 2130М ШІ ши п ан и НУ ин МИ ит Со веяня ЕО Паро у : | : Вт щ базового рівня т вер Ж ! ї 3 поь: і, НТТ, и Package of transport levels E Tires 21303... 2130М ШИ ши пан и НУ ин МИ it So veyanya EO Paro y : | : Wh basic level tver Х ! i 3 poi: i, N

Б. он МОУ пиши Пакетпевного 0000000 Віа т ДО, 0 ЗОНІ стисненого ШЕ Не ЧИ Е І придставленне звуки 1 щ. дин АЙ іабозауковопиполя:! жо ок ней Явна ш- ВВО0-1 дляодногохадру 1001 щі : Й Щ КЕ соя : Я і подіпшуючога: інн пит жан ПЕНЯ НН звнаєі 1 і ї нич Зикккккккнккнккнннн 5 Н ВЕБ ЖЕ Тай й Ше . Б сивий и СЕ: ння в ! І г 53 ТІ : - ща й ! оон т ВІДО С 100 2100 ши ж Та що М .: : ' і ШЕ «дв Яахет ' т ЗОВ ТОР Метою: зі рівня ЯМІ) й ши | ЩЕ пннпютнанктюнкннюкиннсо нн пиши 1 | ! БО дння МАМО ню АЮB. he MOU write Packetspecific 0000000 Via t TO, 0 ZONES of compressed SHE Not ЧЕ E and the sound of 1 sh. din AY and the scientific field:! zho ok ney Yavna sh- VVO0-1 dlyaodnogokhadr 1001 shchi : Y SH KE soya : I and podipshuyuchoga: inn pit zan PENYA NN zvnayei 1 and i nich Zykkkkkkkknknkkknnnn 5 N WEB ZHE Tai y She . B gray and SE: ny in ! And g 53 TI: - yes! oon t VIDO S 100 2100 shi z That M .: : ' i SHE «dv Yaakhet ' t ZOV TOR Purpose: from the level of YAMI) y shi | MORE pnnpyutnanktyunknyukinnso nn write 1 | ! BECAUSE MOM IS AYU

Fig. 2

[Figure: flowchart of the decoding method. Steps, in order: receive data payloads corresponding to the plurality of layers; determine a first layer index indicating the highest layer to be used for decoding; obtain a basic reconstructed sound representation; determine a second layer index indicating which portion of the enhancement side information is to be used for improving the basic reconstructed sound representation; obtain the reconstructed sound representation on the basis of the basic reconstructed sound representation, with reference to the second layer index. Step reference numerals are not legible.]

Fig. 3

[Figure: block diagram of a decompressor. Legible labels: received packet of the complete compressed representation of the sound (or sound field) for one frame; base layer packet; enhancement layer packets; layer selection; reconstruction of the basic representation; reconstructed basic sound representation; reconstruction of the enhanced representation; enhanced reconstructed sound representation. Reference numerals are not legible.]

Fig. 4A

[Figure: block diagram of a decompressor. Legible labels: received packet of the complete compressed representation of the sound (or sound field) for one frame; base layer packet; enhancement layer packet; layer selection; reconstruction of the basic representation; reconstructed basic sound representation; reconstruction of the enhanced representation; enhanced reconstructed sound representation. Reference numerals and remaining labels are not legible.]

Fig. 4B
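The decoding flow of Fig. 3, which the decompressors of Figs. 4A and 4B carry out, can be sketched as follows. This is an illustration under stated assumptions rather than the claimed method: the function name `decode_layers`, the dictionary keys, and the `reconstruct_basic` and `enhance` callbacks are hypothetical. The sketch also assumes that the first layer index is the highest layer for which all lower layers were received, and that the second layer index equals the first; other selection rules are possible.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple


def decode_layers(
    payloads: List[Optional[Dict[str, Any]]],
    reconstruct_basic: Callable[[List[Any], Any], Any],
    enhance: Callable[[Any, Any], Any],
) -> Tuple[Any, int]:
    """Assumed sketch: payloads[i] is the payload of layer i, or None if missing."""
    if not payloads or payloads[0] is None:
        raise ValueError("the base layer is required for decoding")

    # First layer index: highest layer usable for decoding (assumption: the
    # highest layer such that every layer below it was also received).
    first_index = 0
    while first_index + 1 < len(payloads) and payloads[first_index + 1] is not None:
        first_index += 1

    # Basic reconstruction from the components of layers 0..first_index,
    # using the basic side information carried by the base layer.
    components: List[Any] = []
    for payload in payloads[: first_index + 1]:
        components.extend(payload["components"])
    basic = reconstruct_basic(components, payloads[0]["basic_side_info"])

    # Second layer index: selects the portion of enhancement side information
    # (assumption: the portion belonging to the highest usable layer).
    second_index = first_index
    enhanced = enhance(basic, payloads[second_index]["enhancement_side_info"])
    return enhanced, first_index
```

With this rule, a lost intermediate layer lowers the first layer index, so the decoder falls back to the enhancement parameters that were derived for exactly the set of layers it can still use.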

Fig. 5

Fig. 6