TWI313857B - Apparatus for generating a parameter representation of a multi-channel signal and method for representing multi-channel audio signals - Google Patents

Apparatus for generating a parameter representation of a multi-channel signal and method for representing multi-channel audio signals

Info

Publication number
TWI313857B
Authority
TW
Taiwan
Prior art keywords
channel
parameter
channels
balance
pair
Prior art date
Application number
TW094126934A
Other languages
Chinese (zh)
Other versions
TW200636676A (en)
Inventor
Heiko Purnhagen
Lars Villemoes
Jonas Engdegard
Jonas Roeden
Kristofer Kjoerling
Original Assignee
Coding Tech Ab
Priority claimed from PCT/EP2005/003849 (WO2005101371A1)
Application filed by Coding Tech Ab
Publication of TW200636676A
Application granted
Publication of TWI313857B

Landscapes

  • Stereophonic System (AREA)

Description

IX. Description of the Invention

TECHNICAL FIELD OF THE INVENTION

The present invention relates to the coding of multi-channel representations of audio signals using spatial parameters.
The present invention teaches new methods for estimating and defining suitable parameters for reconstructing a multi-channel signal from a number of channels that is smaller than the number of output channels. In particular, it aims at minimizing the bit rate of the multi-channel representation and at providing a coded representation of the multi-channel signal that allows the data to be encoded and decoded easily for all possible channel configurations.

PRIOR ART

PCT/SE02/01372, entitled "Efficient and scalable parametric stereo coding for low bit rate audio coding applications", has shown that a stereo image closely resembling the original stereo image can be reconstructed from a mono signal, given a very compact representation of the stereo image. The basic principle is to divide the input signal into frequency bands and time segments and, for these frequency bands and time segments, to estimate the inter-channel intensity difference (IID) and the inter-channel coherence (ICC). The first parameter is a measure of the power distribution between the two channels in the specific frequency band, and the second parameter is an estimate of the correlation between the two channels in that frequency band. On the decoder side, the stereo image is reconstructed from the mono signal by distributing the mono signal between the two output channels in accordance with the IID data and by adding a decorrelated signal in order to retain the channel correlation of the original stereo channels.

For the multi-channel case (multi-channel here meaning more than two output channels), several additional issues must be addressed. Several multi-channel configurations exist. The best known is the 5.1 configuration (center channel, front left/right channels, surround left/right channels, and the LFE channel). However, many other configurations exist.
From the point of view of a complete encoder/decoder system, it is desirable to have a system that can use the same parameter set (for example, IID and ICC), or subsets of it, for all channel configurations. ITU-R BS.775 defines several down-mix schemes for obtaining a channel configuration with fewer channels from a given channel configuration. Instead of always having to decode all channels and rely on a down-mix, it can be desirable to have a multi-channel representation that enables a receiver to extract the parameters relevant to the channel configuration at hand before decoding the channels. Furthermore, an inherently scalable parameter set is desirable from a scalable or embedded coding point of view, where it is possible, for example, to store the data corresponding to the surround channels in an enhancement layer of the bit stream.

Contrary to the above, it can also be desirable to use different parameter definitions depending on the characteristics of the signal being processed, so as to switch to the parameterization that results in the lowest bit rate overhead for the signal segment currently being processed.

Another representation of multi-channel signals that uses a sum signal or down-mix signal together with additional parametric side information is known in the art as binaural cue coding (BCC). This technique is described in F. Baumgarte and C. Faller, "Binaural Cue Coding - Part I: Psychoacoustic Fundamentals and Design Principles", IEEE Transactions on Speech and Audio Processing, vol. 11, no. 6, November 2003, and in C. Faller and F. Baumgarte, "Binaural Cue Coding - Part II: Schemes and Applications", IEEE Transactions on Speech and Audio Processing, vol. 11, no. 6, November 2003.

In general, binaural cue coding is a method for multi-channel spatial rendering based on one down-mixed audio channel and side information.
The parameters that are calculated by a BCC encoder and used by a BCC decoder for audio reconstruction or audio rendering include inter-channel level differences, inter-channel time differences, and inter-channel coherence parameters. These inter-channel cues are the determining factors for the perception of a spatial image. The parameters are given for blocks of time samples of the original multi-channel signal and are also given frequency-selectively, so that each block of multi-channel signal samples has several cues for several frequency bands. In the general case of C playback channels, the inter-channel level differences and the inter-channel time differences are considered in each subband between pairs of channels, that is, for each channel relative to a reference channel. One channel is defined as the reference channel for every inter-channel level difference. With the inter-channel level differences and the inter-channel time differences, a source can be rendered in any direction between one of the loudspeaker pairs of the playback setup in use. To determine the width or diffuseness of a rendered source, it is sufficient to consider one parameter per subband for all audio channels. This parameter is the inter-channel coherence parameter. The width of the rendered source is controlled by modifying the subband signals such that all possible channel pairs have the same inter-channel coherence parameter.

In BCC coding, all inter-channel level differences are determined between the reference channel 1 and each of the other channels.
When, for example, the center channel is chosen as the reference channel, the following are calculated: a first inter-channel level difference between the left channel and the center channel, a second inter-channel level difference between the right channel and the center channel, a third inter-channel level difference between the left surround channel and the center channel, and a fourth inter-channel level difference between the right surround channel and the center channel. This scenario describes a five-channel scheme. When the five-channel scheme additionally includes a low frequency enhancement channel (also known as the "sub-woofer" channel), a fifth inter-channel level difference is calculated between the low frequency enhancement channel and the center channel, which is the single reference channel.

When the original multi-channel signal is reconstructed from the single down-mix channel (also called the "mono" channel) and the transmitted cues, such as ICLD (inter-channel level difference), ICTD (inter-channel time difference), and ICC (inter-channel coherence), the spectral coefficients of the mono signal are modified using these cues. The level modification is performed using a positive real number that determines the level modification for each spectral coefficient. The inter-channel time difference is generated using a complex number of magnitude one that determines a phase modification for each spectral coefficient. A further function determines the influence of the coherence. The level modification factors of the channels are computed by first calculating the factor for the reference channel. The factor for the reference channel is computed such that, for each frequency partition, the sum of the powers of all channels equals the power of the sum signal. Then, based on the level modification factor of the reference channel, the level modification factors of the other channels are calculated using the respective ICLD parameters.
Thus, in order to perform BCC synthesis, the level modification factor of the reference channel has to be calculated. This calculation requires all ICLD parameters of a frequency band. Then, based on the level modification of this single channel, the level modification factors of the other channels (that is, the channels that are not the reference channel) can be calculated.

The disadvantage of this approach is that a complete reconstruction requires each and every inter-channel level difference. This requirement is even more problematic when the transmission channel is error-prone. Because every inter-channel level difference is needed to calculate every multi-channel output signal, each error in a transmitted inter-channel level difference results in an error in the reconstructed multi-channel signal. Moreover, no reconstruction is possible when an inter-channel level difference is lost during transmission, even if that level difference was needed only for, say, the left surround channel or the right surround channel. These channels are not very important for multi-channel reconstruction, since most of the information is contained in the front left channel (called the left channel below), the front right channel (called the right channel below), and the center channel. The situation becomes even worse when the inter-channel level difference of the low frequency enhancement channel is lost during transmission. In that case, no multi-channel reconstruction, or only an erroneous one, is possible, although the low frequency enhancement channel is not decisive for the listener's listening comfort. Errors in a single inter-channel level difference thus propagate into errors in each of the reconstructed output channels.
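The dependency on every ICLD described above can be made concrete with a minimal sketch. It assumes, as one plausible reading of the text, that each ICLD is a power ratio in dB relative to the reference channel and that the gains are normalized so that the channel powers sum to the power of the sum signal. The function name `bcc_level_gains` and the dB convention are illustrative assumptions and are not taken from the cited BCC papers.

```python
import math

def bcc_level_gains(icld_db):
    """Return [g_ref, g_1, ..., g_n] for one frequency partition.

    icld_db: level differences (in dB, an assumed convention) of the
    non-reference channels relative to the single reference channel.
    The gains are normalized so that the squared gains sum to 1, i.e. the
    total power of all channels equals the power of the sum signal.
    """
    # Power ratios p_i / p_ref of the non-reference channels.
    ratios = [10.0 ** (d / 10.0) for d in icld_db]
    # The reference gain depends on ALL level differences of the band.
    g_ref = 1.0 / math.sqrt(1.0 + sum(ratios))
    # Every other gain is derived from the reference gain.
    return [g_ref] + [g_ref * math.sqrt(r) for r in ratios]

# Changing (or corrupting) a single ICLD changes the reference gain and
# therefore every reconstructed channel gain.
clean = bcc_level_gains([0.0, 0.0, 0.0, 0.0])
corrupted = bcc_level_gains([6.0, 0.0, 0.0, 0.0])
```

Because `g_ref` is a function of all level differences, a single wrong or missing value in `icld_db` alters every entry of the result, which is the error propagation discussed in the text.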
In addition, because of its single reference channel, the existing BCC scheme described in C. Faller and F. Baumgarte, "Binaural Cue Coding Applied to Stereo and Multi-Channel Audio Compression", AES Convention Paper, Munich, Germany, May 10-13, 2002, is not well suited when an intuitive listening scenario is considered. Relating everything to a single reference channel is unnatural for a human listener, who is, after all, the ultimate target of all audio processing. A human being instead has two ears located on different sides of the head. A human being's natural listening impression is therefore whether a signal is balanced more to the left or more to the right, or how it is balanced between front and back. By contrast, it is unnatural for a human being to judge whether a sound source in the auditory field is in a certain balance between each loudspeaker and a single reference loudspeaker. This divergence between the natural listening impression and the mathematical/physical model of BCC may have negative consequences for the coding scheme when bit rate, scalability, flexibility, reconstruction artifacts, or error robustness are considered.

SUMMARY OF THE INVENTION

It is an object of the present invention to provide an improved concept for representing multi-channel audio signals.

This object is achieved by an apparatus for generating a parameter representation of a multi-channel input signal according to claim 1, an apparatus for generating a reconstructed multi-channel representation according to claim 21, a method according to claim 31 or 32, a computer program according to claim 33, or a parameter representation according to claim 34.
The present invention is based on the finding that a multi-channel representation has to rely on balance parameters between channel pairs. It has further been found that a parameter representation of a multi-channel signal is possible by providing at least two different balance parameters, which indicate the balance between two different channel pairs. In particular, flexibility, scalability, error robustness, and even bit rate efficiency result from the fact that the first channel pair, on which the first balance parameter is based, differs from the second channel pair, on which the second balance parameter is based, with the four channels forming these channel pairs all being different from one another.

The inventive concept thus departs from the single reference channel concept and uses a multi-balance or super-balance concept, which is more intuitive and more natural for a human being's sound impression. In particular, the channel pairs underlying the first and second balance parameters can include original channels, down-mix channels, or, preferably, certain combinations of input channels.

It has been found that a balance parameter derived from the center channel as the first channel and the sum of the left original channel and the right original channel as the second channel of the channel pair is especially useful for providing an exact energy distribution between the center channel and the left and right channels. Note that these three channels normally carry most of the information of the audio scene, and that the left-right stereo localization in particular is influenced not only by the balance between left and right but also by the balance between the center and the sum of left and right.
This observation is reflected by the use of this balance parameter in accordance with a preferred embodiment of the present invention.

Preferably, when a single mono down-mix signal is transmitted, it has been found that, in addition to the center/left-plus-right balance parameter, a left/right balance parameter, a rear-left/rear-right balance parameter, and a front/back balance parameter form an optimal solution for a bit-rate-efficient parameter representation that is flexible, error-robust, and largely free of artifacts.

On the receiver side, in contrast to BCC synthesis, in which each channel is calculated from the transmitted information alone, the inventive multi-balance representation additionally uses information on the down-mix scheme that was used to generate the down-mix channel or channels. Thus, in accordance with the present invention, information on the down-mix scheme, which is not used in prior-art systems, is used for the up-mix in addition to the balance parameters. The up-mix operation is therefore performed such that the balance between the channels of the reconstructed multi-channel signal that form the channel pair of a balance parameter is determined by that balance parameter.

This concept, in which different balance parameters have different channel pairs, makes it possible to generate some channels without knowledge of each and every transmitted balance parameter. In particular, the left, right, and center channels can be reconstructed in accordance with the present invention without any knowledge of a rear-left/rear-right balance or of a front/back balance. This allows very fine-grained scalability, since extracting one additional parameter from a bit stream, or transmitting one additional balance parameter to a receiver, allows one or more additional channels to be reconstructed.
This is in contrast to the prior-art single-reference system, in which each and every inter-channel level difference is needed to reconstruct all of the output channels or even only a subgroup of them.

The inventive concept is also flexible in that the choice of balance parameters can be adapted to a particular reconstruction environment. Consider, for example, a five-channel setup as the original multi-channel setup and a four-channel setup as the reconstruction setup, where the latter has only a single surround loudspeaker, positioned for example behind the listener. In this case, a front/back balance parameter allows the combined surround channel to be calculated without any knowledge of the left surround channel and the right surround channel. This is in contrast to a single-reference-channel system, in which the inter-channel level difference of the left surround channel and the inter-channel level difference of the right surround channel first have to be extracted from the data stream. Then the left surround channel and the right surround channel have to be calculated. Finally, both channels have to be added to obtain the single surround loudspeaker channel for the four-channel reproduction setup. None of these steps is needed with the balance parameter representation: because this more intuitive and more user-oriented representation is not tied to a single reference channel and also allows a combination of original channels to serve as one channel of a balance parameter channel pair, it delivers the combined surround channel automatically.

The present invention relates to the problem of a parameterized multi-channel representation of audio signals.
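The scalability and flexibility arguments made above (some channels can be recovered from only a subset of the balance parameters) can be illustrated with a deliberately simplified sketch. It assumes unit down-mix weights and energies that simply add, and it treats each balance parameter as a plain energy ratio; the function and parameter names are hypothetical and not part of the patent.

```python
def split_energy(total, ratio):
    """Split `total` into (first, second) such that first / second == ratio."""
    second = total / (1.0 + ratio)
    return total - second, second

def reconstruct_energies(total, front_back, rear_left_right=None):
    """Recover channel-group energies from a mono energy and balance ratios.

    front_back:       assumed energy ratio front / back.
    rear_left_right:  optional energy ratio left surround / right surround.
    """
    front, surround = split_energy(total, front_back)
    result = {"front": front, "surround": surround}
    # The combined surround energy is already available here. The rear
    # balance is only needed if two separate surround channels are wanted.
    if rear_left_right is not None:
        ls, rs = split_energy(surround, rear_left_right)
        result["left_surround"] = ls
        result["right_surround"] = rs
    return result

four_channel = reconstruct_energies(10.0, front_back=4.0)
five_channel = reconstruct_energies(10.0, front_back=4.0, rear_left_right=1.0)
```

A receiver with a single surround loudspeaker would stop after the first split, whereas a receiver with two surround loudspeakers reads one more parameter. This mirrors the layered extraction described in the text, under the stated simplifications.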
The present invention provides an efficient way to define suitable parameters for the multi-channel representation, together with the ability to extract the parameters that represent the desired channel configuration without having to decode all channels. The invention further solves the problem of choosing the optimal parameter configuration for a given signal segment so as to minimize the bit rate required to code the spatial parameters for that segment. The present invention also outlines how decorrelation methods that were previously applicable only to the two-channel case can be applied in a general multi-channel environment.

In preferred embodiments, the present invention comprises the following features:

- down-mixing the multi-channel signal to a one-channel or two-channel representation on the encoder side;
- given the multi-channel signal, defining the parameters that represent the multi-channel signal, either flexibly on a per-frame basis in order to minimize the bit rate, or in order to enable the decoder to extract the channel configuration at the bit stream level;
- on the decoder side, extracting the relevant parameter set, given the channel configuration currently supported by the decoder;
- generating the required number of mutually decorrelated signals, given the current channel configuration;
- reconstructing the output signals, given the parameter set decoded from the bit stream data and the decorrelated signals;
- defining a parameterization of the multi-channel audio signal such that the same parameters, or a subset of them, can be used irrespective of the channel configuration;
- defining a parameterization of the multi-channel audio signal such that the parameters can be used in a scalable coding scheme, in which subsets of the parameter set are transmitted in different layers of the scalable stream;
- defining a parameterization of the multi-channel audio signal such that the energy reconstruction of the output signals of the decoder is not impaired by the underlying audio codec used to code the down-mix signal;
- switching between different parameterizations of the multi-channel audio signal so as to minimize the bit rate overhead for coding the parameterization;
- defining a parameterization of the multi-channel audio signal that includes a parameter representing the energy correction factor of the down-mix signal;
- using several mutually decorrelated decorrelators to reconstruct the multi-channel signal;
- reconstructing the multi-channel signal from an up-mix matrix that is calculated from the transmitted parameter set.

The present invention will now be described by way of illustrative examples, which do not limit the scope or spirit of the invention, with reference to the accompanying drawings.

DETAILED DESCRIPTION OF THE EMBODIMENTS

The embodiments described below merely illustrate the principles of the present invention for the multi-channel representation of audio signals. Modifications and variations of the arrangements and details described herein will be apparent to those skilled in the art. The intent is therefore to be limited only by the scope of the appended patent claims and not by the specific details presented by way of description and explanation of the embodiments herein.

In the following description of the present invention, which outlines how to parameterize IID and ICC parameters and how to apply them in order to reconstruct a multi-channel representation of audio signals, it is assumed that all signals referred to are subband signals in a filter bank or some other frequency-selective representation of a part of the whole frequency range of the corresponding channel.
It is therefore understood that the present invention is not limited to a specific filter bank, that the invention is outlined below for one frequency band of the subband representation of the signal, and that the same operations apply to all subband signals.

Although a balance parameter is also called an "inter-channel intensity difference (IID)" parameter, it should be emphasized that a balance parameter between a channel pair does not necessarily have to be the ratio between the energy or intensity of the first channel of the pair and the energy or intensity of the second channel of the pair. In general, the balance parameter indicates the localization of a sound source between the two channels of the channel pair. Although this localization is usually given by energy, level, or intensity differences, other characteristics of the signals can be used, such as a power measure for both channels or the time or frequency envelopes of the channels.

FIG. 1 shows the channels of a 5.1 channel configuration, where a(t) 101 denotes the left surround channel, b(t) 102 the left front channel, c(t) 103 the center channel, d(t) 104 the right front channel, e(t) 105 the right surround channel, and f(t) 106 the LFE (low frequency effects) channel.

Assume that the expectation operator is defined as

$$E[f(x)] = \frac{1}{T}\int_0^T f(x(t))\,dt .$$

The energies of the channels outlined above can then be defined as follows (shown here for the left surround channel):

$$A = E[a^2(t)] .$$

On the encoder side, the five channels are down-mixed to a two-channel or a one-channel representation.
This can be done in several ways. A commonly used one is the ITU down-mix, defined as follows.

The 5.1 to two-channel down-mix:

$$l_d(t) = \alpha\, b(t) + \beta\, a(t) + \gamma\, c(t) + \delta\, f(t)$$

$$r_d(t) = \alpha\, d(t) + \beta\, e(t) + \gamma\, c(t) + \delta\, f(t)$$

The 5.1 to one-channel down-mix:

$$m_d(t) = \sqrt{\tfrac{1}{2}}\,\bigl(l_d(t) + r_d(t)\bigr)$$

The IID parameters are defined as energy ratios of two arbitrarily chosen channels or weighted groups of channels. Given the energies of the channels of the 5.1 channel configuration outlined above, several sets of IID parameters can be defined.

FIG. 7 shows a general down-mixer 700, which uses the above equations to calculate a single mono channel m or two, preferably stereo, channels $l_d$ and $r_d$. In general, the down-mixer uses certain down-mix information. In the preferred embodiment of a linear down-mix, this down-mix information includes the weighting factors α, β, γ, and δ. It is known in the art that more or fewer weighting factors, constant or non-constant, can be used.

In the ITU-recommended down-mix, α is set to 1, β and γ are both set to the square root of 0.5, and δ is set to 0. In general, the factor α can vary between 0.5 and 1.5. The factors β and γ may also differ from each other and vary between 0 and 1. The same holds for the low frequency enhancement channel f(t): its factor δ can vary between 0 and 1. Moreover, the factors for the left down-mix and the right down-mix do not have to be equal. This becomes clear when a non-automatic down-mix is considered, performed for example by a sound engineer. A sound engineer tends to perform a creative down-mix rather than one governed by mathematical laws and is guided by his or her own creative judgment. When this "creative" down-mix is recorded by a certain parameter set, it is used in accordance with the present invention by an inventive up-mixer as shown in FIG. 8, which is guided not only by the parameters but also by the additional information on the down-mix scheme.
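The linear down-mix and the energy-ratio definition of an IID parameter described above can be sketched as follows. This is a minimal illustration on sample lists, not the patent's implementation: the function names are hypothetical, the energy is a discrete time average standing in for the expectation operator, the default weights follow the ITU-recommended values stated in the text, and the scaling of the mono down-mix by the square root of 1/2 is an assumption.

```python
import math

def energy(x):
    """Discrete time-averaged energy, a stand-in for E[x^2(t)]."""
    return sum(v * v for v in x) / len(x)

def linear_downmix(a, b, c, d, e, f,
                   alpha=1.0, beta=math.sqrt(0.5),
                   gamma=math.sqrt(0.5), delta=0.0):
    """Down-mix 5.1 channels to (l_d, r_d, m).

    a: left surround, b: left front, c: center,
    d: right front, e: right surround, f: LFE.
    """
    ld = [alpha * bv + beta * av + gamma * cv + delta * fv
          for av, bv, cv, fv in zip(a, b, c, f)]
    rd = [alpha * dv + beta * ev + gamma * cv + delta * fv
          for dv, ev, cv, fv in zip(d, e, c, f)]
    # Mono down-mix; the 1/sqrt(2) scaling is an assumption of this sketch.
    m = [(lv + rv) / math.sqrt(2.0) for lv, rv in zip(ld, rd)]
    return ld, rd, m

def iid(energy_first, energy_second):
    """IID as the energy ratio of two channels or weighted channel groups."""
    return energy_first / energy_second

silence = [0.0, 0.0]
ld, rd, m = linear_downmix(a=silence, b=[1.0, -1.0], c=silence,
                           d=[2.0, -2.0], e=silence, f=silence)
left_right_ratio = iid(energy(ld), energy(rd))
```

Non-default weights can be passed to model a down-mix other than the ITU one. In that case the same weights would also have to be known to the up-mixer, which is the role of the down-mix information discussed in the text.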
When a linear down-mix has been performed as in Fig. 7, the weighting parameters are the preferred information on the down-mixing scheme to be used by the up-mixer. When, however, other information used in the down-mixing scheme is available, an up-mixer can use this other information as the information on the down-mixing scheme. Such other information can be, for example, certain matrix elements, or certain factors or functions within the matrix elements, of an up-mix matrix as shown in the figures.

Consider the 5.1 channel configuration outlined in Fig. 1 and observe how other channel configurations relate to it. For a three-channel case where no surround channels are available, B, C, and D are available according to the notation above. For a four-channel configuration, B, C, and D are available together with a combination of A and E representing the single surround channel, more commonly called the back channel in this context.

The present invention defines IID parameters that apply to all of these channel configurations, i.e. the four-channel subset of the 5.1 channel configuration has a corresponding subset within the IID parameter set describing the 5.1 channels. The following IID parameter set solves this problem:

$$r_1 = \frac{L}{R} = \frac{\alpha^2 B + \beta^2 A + \gamma^2 C + \delta^2 F}{\alpha^2 D + \beta^2 E + \gamma^2 C + \delta^2 F}$$

$$r_2 = \frac{2\gamma^2 C}{\alpha^2 (B + D)}$$

$$r_3 = \frac{\beta^2 (A + E)}{\alpha^2 (B + D) + 2\gamma^2 C}$$

$$r_4 = \frac{\beta^2 A}{\beta^2 E} = \frac{A}{E}$$

$$r_5 = \frac{2\delta^2 F}{\alpha^2 (B + D) + \beta^2 (A + E) + 2\gamma^2 C}$$

The $r_1$ parameter corresponds to the energy ratio between the left down-mix channel and the right down-mix channel. The $r_2$ parameter corresponds to the energy ratio between the center channel and the left and right front channels. The $r_3$ parameter corresponds to the energy ratio between the two surround channels and the three front channels. The $r_4$ parameter corresponds to the energy ratio between the two surround channels. The $r_5$ parameter corresponds to the energy ratio between the LFE channel and all other channels.

Fig. 4 illustrates the energy ratios explained above. The output channels are denoted 101 to 105 as in Fig. 1 and are therefore not described again here. The speaker set-up is divided into a left half and a right half, with the center channel 103 being part of both halves. The energy ratio between the left half plane and the right half plane is exactly the parameter $r_1$ of the present invention. This is indicated by the solid line below $r_1$ in Fig. 4. Furthermore, the energy distribution between the center channel 103 and the left front 102 and right front 104 channels is indicated by $r_2$. Finally, the energy distribution between the entire front channel set-up (102, 103, and 104) and the back channels (101 and 105) is illustrated by the arrow labelled with the $r_3$ parameter in Fig. 4.
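The five balance parameters defined above can be computed from the channel energies as in the following sketch. The function name `balance_parameters` is hypothetical, the energies A to F are assumed to have been estimated already for one time block and one frequency band, and all denominators are assumed to be non-zero.

```python
import math


def balance_parameters(A, B, C, D, E, F,
                       alpha=1.0, beta=math.sqrt(0.5),
                       gamma=math.sqrt(0.5), delta=0.0):
    """Return (r1, r2, r3, r4, r5) for one block and one frequency band.

    A..F are the energies of left surround, left front, center,
    right front, right surround and LFE. Denominators are assumed
    to be non-zero; a real encoder would have to guard against that.
    """
    a2, b2, g2, d2 = alpha ** 2, beta ** 2, gamma ** 2, delta ** 2
    left = a2 * B + b2 * A + g2 * C + d2 * F    # energy of left down-mix
    right = a2 * D + b2 * E + g2 * C + d2 * F   # energy of right down-mix
    front_lr = a2 * (B + D)                     # weighted left + right front
    front = front_lr + 2 * g2 * C               # all three front channels
    surround = b2 * (A + E)                     # both surround channels
    r1 = left / right                           # left / right balance
    r2 = 2 * g2 * C / front_lr                  # center / (left + right front)
    r3 = surround / front                       # surround / front
    r4 = A / E                                  # left / right surround
    r5 = 2 * d2 * F / (front + surround)        # LFE / all other channels
    return r1, r2, r3, r4, r5
```

With δ = 0, as in the ITU down-mix, the LFE channel does not enter the down-mix and $r_5$ is zero for any input.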
Given the above parameterisation and the energy of the transmitted single down-mix channel,

$$M = \tfrac{1}{2}\bigl(\alpha^2 (B + D) + \beta^2 (A + E) + 2\gamma^2 C + 2\delta^2 F\bigr),$$

the energies of the reconstructed channels can be expressed as:

$$F = \frac{1}{2\delta^2}\,\frac{r_5}{1 + r_5}\,2M$$

$$A = \frac{1}{\beta^2}\,\frac{r_4}{1 + r_4}\,\frac{r_3}{1 + r_3}\,\frac{1}{1 + r_5}\,2M$$

$$E = \frac{1}{\beta^2}\,\frac{1}{1 + r_4}\,\frac{r_3}{1 + r_3}\,\frac{1}{1 + r_5}\,2M$$

$$C = \frac{1}{2\gamma^2}\,\frac{r_2}{1 + r_2}\,\frac{1}{1 + r_3}\,\frac{1}{1 + r_5}\,2M$$

$$B = \frac{1}{\alpha^2}\left(2\,\frac{r_1}{1 + r_1}\,M - \beta^2 A - \gamma^2 C - \delta^2 F\right)$$

$$D = \frac{1}{\alpha^2}\left(2\,\frac{1}{1 + r_1}\,M - \beta^2 E - \gamma^2 C - \delta^2 F\right)$$

Thus the energy of the M signal can be distributed to the reconstructed channels, so that the reconstructed channels have the same energies as the original channels.

This preferred up-mixing scheme is illustrated in Fig. 8. The equations for F, A, E, C, B, and D make clear that the information on the down-mixing scheme used by the up-mixer consists of the weighting factors α, β, γ, and δ. These factors weight the original channels before the weighted or unweighted channels are added together or subtracted from each other to obtain a number of down-mix channels that is smaller than the number of original channels. Fig. 8 therefore shows that, in accordance with the invention, the energies of the reconstructed channels are determined not only by the balance parameters transmitted from the encoder side to the decoder side, but also by the down-mixing factors α, β, γ, and δ.

Fig. 8 also shows that the already calculated channel energies F, A, E, and C are used in the equations for the left and right energies B and D. This does not necessarily imply a sequential up-mixing scheme. To obtain a fully parallel up-mixing scheme, implemented for example with an up-mix matrix having certain matrix elements, the equations for A, C, E, and F are inserted into the equations for B and D. The reconstructed channel energies are thus determined only by the balance parameters, the down-mix channel or channels, and the information on the down-mixing scheme, such as the down-mixing factors.

Given the above IID parameters, the problem of defining a set of IID parameters that can be used for several channel configurations has been solved, as will become evident below. Take the three-channel configuration as an example, i.e. reconstructing three front channels from one available channel. The $r_3$, $r_4$, and $r_5$ parameters are irrelevant because the A, E, and F channels do not exist. The parameters $r_1$ and $r_2$ are sufficient to reconstruct the three channels from a down-mixed single channel, because $r_1$ describes the energy ratio between the left and right front channels and $r_2$ describes the energy ratio between the center channel and the left and right front channels.

In the more general case, the IID parameters ($r_1 \ldots r_5$) defined above apply to all subsets of reconstructing n channels from m channels, where $m < n \le 6$.
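The reconstruction equations for F, A, E, C, B, and D given earlier can be checked numerically with a small round-trip sketch. The names `analyse` and `reconstruct` are hypothetical, the weights are passed explicitly, and the case δ = 0 (LFE absent from the down-mix) is handled by returning F = 0, which is an assumption of this sketch rather than something the text prescribes.

```python
import math


def analyse(A, B, C, D, E, F, alpha, beta, gamma, delta):
    """Encoder side: balance parameters (r1..r5) and down-mix energy M."""
    a2, b2, g2, d2 = alpha ** 2, beta ** 2, gamma ** 2, delta ** 2
    left = a2 * B + b2 * A + g2 * C + d2 * F
    right = a2 * D + b2 * E + g2 * C + d2 * F
    front = a2 * (B + D) + 2 * g2 * C
    surround = b2 * (A + E)
    r = (left / right,
         2 * g2 * C / (a2 * (B + D)),
         surround / front,
         A / E,
         2 * d2 * F / (front + surround))
    return r, 0.5 * (left + right)


def reconstruct(r, M, alpha, beta, gamma, delta):
    """Decoder side: channel energies (A, B, C, D, E, F) from r1..r5 and M."""
    r1, r2, r3, r4, r5 = r
    a2, b2, g2, d2 = alpha ** 2, beta ** 2, gamma ** 2, delta ** 2
    total = 2.0 * M
    non_lfe = total / (1 + r5)              # everything except the LFE share
    surround = non_lfe * r3 / (1 + r3)      # beta^2 (A + E)
    front = non_lfe / (1 + r3)              # alpha^2 (B + D) + 2 gamma^2 C
    # Assumption of this sketch: with delta == 0 the LFE cannot be recovered.
    F = total * r5 / (1 + r5) / (2 * d2) if d2 > 0 else 0.0
    A = surround * r4 / (1 + r4) / b2
    E = surround / (1 + r4) / b2
    C = front * r2 / (1 + r2) / (2 * g2)
    B = (total * r1 / (1 + r1) - b2 * A - g2 * C - d2 * F) / a2
    D = (total / (1 + r1) - b2 * E - g2 * C - d2 * F) / a2
    return A, B, C, D, E, F
```

For the three-front-channel case discussed above, calling `reconstruct` with $r_3 = r_5 = 0$ and an arbitrary $r_4$ yields A = E = F = 0 and recovers B, C, and D from $r_1$ and $r_2$ alone.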
Observing Fig. 4, the following can be said:

- For a system reconstructing 2 channels from 1 channel, the $r_1$ parameter provides sufficient information to retain the correct energy ratio between the channels.
- For a system reconstructing 3 channels from 1 channel, the $r_1$ and $r_2$ parameters provide sufficient information to retain the correct energy ratios between the channels.
- For a system reconstructing 4 channels from 1 channel, the $r_1$, $r_2$, and $r_3$ parameters provide sufficient information to retain the correct energy ratios between the channels.
- For a system reconstructing 5 channels from 1 channel, the $r_1$, $r_2$, $r_3$, and $r_4$ parameters provide sufficient information to retain the correct energy ratios between the channels.
- For a system reconstructing 5.1 channels from 1 channel, the $r_1$, $r_2$, $r_3$, $r_4$, and $r_5$ parameters provide sufficient information to retain the correct energy ratios between the channels.
- For a system reconstructing 5.1 channels from 2 channels, the $r_2$, $r_3$, $r_4$, and $r_5$ parameters provide sufficient information to retain the correct energy ratios between the channels.

This scalability feature is illustrated by the table in Fig. 10b. The scalable bit stream illustrated in Fig. 10a and explained later can also be adapted to the table in Fig. 10b in order to obtain a much finer scalability than shown in Fig. 10a.

The invention is particularly advantageous in that the left and right channels can easily be reconstructed from the single balance parameter $r_1$ without knowing or extracting any other balance parameter. To this end, the channels A, C, F, and E are simply set to zero in the equations for B and D in Fig. 8.

Alternatively, when only the balance parameter $r_2$ is considered, the reconstructed channels are, on the one hand, the sum of the center channel and the low frequency channel (when this channel is not set to zero) and, on the other hand, the sum of the left and right channels.
Thus, the center channel on the one hand and the mono signal on the other hand can be reconstructed using only a single parameter. This feature is already useful for a simple 3-channel representation in which the left and right signals are derived from the sum of left and right, for example by halving, and in which the energy between the center and the sum of left and right is determined exactly by the balance parameter $r_2$.

In this context, the balance parameter $r_1$ or $r_2$ is located in a lower scaling layer.

The second entry in the table of Fig. 10b indicates how the three channels B, D, and the sum of C and F can be generated using only two balance parameters instead of all five. Here, one of the parameters $r_1$ and $r_2$ can already be in a higher scaling layer than the parameter $r_1$ or $r_2$ located in the lower scaling layer.

The equations in Fig. 8 show that, for calculating C, the non-extracted parameter $r_5$ and the other non-extracted parameter $r_3$ are set to zero. In addition, the unused channels A, E, and F are set to zero, so that the three channels B, D, and the combination of the center channel C and the low frequency enhancement channel F can be calculated.

When a 4-channel representation is to be up-mixed, it is sufficient to extract only the parameters $r_1$, $r_2$, and $r_3$ from the parameter data stream. In this context, $r_3$ can be in the next higher scaling layer relative to the parameter $r_1$ or $r_2$. The 4-channel configuration is particularly suitable in connection with the super-balance parameter representation of the present invention because, as described later in connection with Fig. 6, the third balance parameter $r_3$ is already derived from a combination of the front channels on the one hand and the back channels on the other hand.
This is because the parameter $r_3$ is a front/back balance parameter derived from a channel pair that has, as its first channel, a combination of the back channels A and E and, as its second channel, a combination of the left channel B, the right channel D, and the center channel C as the front channels.

The combined channel energy of the two surround channels is thus obtained automatically, without any further separate calculation and subsequent combination, as would be necessary in a set-up with a single reference channel.

When five channels have to be reconstructed from a single channel, the further balance parameter $r_4$ is required. This parameter $r_4$ can again be located in a next higher scaling layer.

When a 5.1 reconstruction has to be performed, every balance parameter is required. A next higher scaling layer, including the next balance parameter $r_5$, therefore has to be transmitted to a receiver and evaluated by the receiver.

Using the same approach of extending the IID parameters in accordance with the extended number of channels, the above IID parameters can be extended to cover channel configurations with a larger number of channels than the 5.1 configuration. The present invention is therefore not limited to the examples outlined above.

Now consider the case where the channel configuration is a 5.1 channel configuration, which is one of the most commonly used cases, and assume that the 5.1 channels are reconstructed from two channels. For this case a different set of parameters can be defined by replacing the parameters $r_3$ and $r_4$ with:

$$q_3 = \frac{\beta^2 A}{\alpha^2 B}, \qquad q_4 = \frac{\beta^2 E}{\alpha^2 D}$$

The parameters $q_3$ and $q_4$ represent the energy ratio between the front and back left channels and the energy ratio between the front and back right channels. Several other parameterisations can be envisioned.

Fig. 5 illustrates the modified parameterisation.
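The one-sided front/back parameters $q_3$ and $q_4$ introduced above can be sketched as follows. The sketch assumes the orientation $q_3 = \beta^2 A / (\alpha^2 B)$ and $q_4 = \beta^2 E / (\alpha^2 D)$, i.e. surround energy over front energy on each side; the function names are hypothetical.

```python
import math


def one_sided_front_back(A, B, D, E, alpha=1.0, beta=math.sqrt(0.5)):
    """Return (q3, q4): weighted surround/front energy ratio per side.

    Assumed orientation: q3 = beta^2 A / (alpha^2 B) for the left side,
    q4 = beta^2 E / (alpha^2 D) for the right side.
    """
    q3 = beta ** 2 * A / (alpha ** 2 * B)
    q4 = beta ** 2 * E / (alpha ** 2 * D)
    return q3, q4


def surround_from_front(B, D, q3, q4, alpha=1.0, beta=math.sqrt(0.5)):
    """Invert the ratios: recover surround energies (A, E) from front ones."""
    A = q3 * alpha ** 2 * B / beta ** 2
    E = q4 * alpha ** 2 * D / beta ** 2
    return A, E
```

Because each ratio only involves channels on one side, the left and right sides can be processed independently, which fits the two-channel down-mix case.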
Instead of one parameter describing the energy distribution between the front and back channels ($r_3$ in Fig. 4) and one parameter describing the energy distribution between the left surround channel and the right surround channel ($r_4$ in Fig. 4), the parameters $q_3$ and $q_4$ are used to describe the energy ratio between the left front 102 and left surround 101 channels and the energy ratio between the right front channel 104 and the right surround channel 105.

The present invention teaches that several parameter sets can be used to represent the multi-channel signals. An additional feature of the invention is that different parameterisations can be chosen depending on the type of quantisation used for the parameters.

As an example, a system that uses coarse quantisation of the parameters because of tight bit-rate constraints should use a parameterisation that does not amplify errors during the up-mixing process.

Consider two of the expressions above for the reconstructed energies in a system reconstructing 5.1 channels from one channel:

$$B = \frac{1}{\alpha^2}\left(2\,\frac{r_1}{1 + r_1}\,M - \beta^2 A - \gamma^2 C - \delta^2 F\right)$$

$$D = \frac{1}{\alpha^2}\left(2\,\frac{1}{1 + r_1}\,M - \beta^2 E - \gamma^2 C - \delta^2 F\right)$$

The subtractions can produce large variations in the B and D energies from quite small quantisation effects in the M, A, C, and F parameters.

According to the present invention, a different parameterisation that is less sensitive to quantisation of the parameters should be used. Hence, if coarse quantisation is used, the $r_1$ parameter as defined above,

$$r_1 = \frac{L}{R} = \frac{\alpha^2 B + \beta^2 A + \gamma^2 C + \delta^2 F}{\alpha^2 D + \beta^2 E + \gamma^2 C + \delta^2 F},$$

can be replaced by the alternative definition

$$r_1 = \frac{B}{D}.$$

This yields the following equations for the reconstructed energies:

$$B = \frac{1}{\alpha^2}\,\frac{r_1}{1 + r_1}\,\frac{1}{1 + r_2}\,\frac{1}{1 + r_3}\,\frac{1}{1 + r_5}\,2M$$

$$D = \frac{1}{\alpha^2}\,\frac{1}{1 + r_1}\,\frac{1}{1 + r_2}\,\frac{1}{1 + r_3}\,\frac{1}{1 + r_5}\,2M$$

The equations for the reconstructed energies of A, E, C, and F remain the same as above. This parameterisation represents a better conditioned system from a quantisation point of view.

Fig. 6 illustrates the energy ratios explained above. The output channels are denoted 101 to 105 as in Fig. 1 and are therefore not described again here. The speaker set-up is divided into a front part and a back part. The energy distribution between the entire front channel set-up (102, 103, and 104) and the back channels (101 and 105) is illustrated by the arrow labelled with the $r_3$ parameter in Fig. 6.

Another important feature of the present invention is that the parameterisation

$$r_2 = \frac{2\gamma^2 C}{\alpha^2 (B + D)}, \qquad r_1 = \frac{B}{D}$$

is not only a better conditioned system from a quantisation point of view. It also has the advantage that the parameters used to reconstruct the three front channels are derived without any influence from the surround channels. One could envision a parameter $r_2$ that describes the relation between the center channel and all other channels. However, this would have the drawback that the surround channels would be included in the estimation of the parameters describing the front channels.

Bearing in mind that the parameterisation described in the present invention can also be applied to measurements of correlation or coherence between channels, including the back channels in the calculation of $r_2$ can have a significant negative influence on how accurately the front channels are reconstructed.

As an example, imagine a situation with the same signal in all front channels and completely uncorrelated signals in the back channels. This is not uncommon, given that the back channels are frequently used to reproduce ambience information of the original sound.

If the center channel is described in relation to all other channels, the correlation measure between the center and the sum of all other channels will be rather low, because the back channels are completely uncorrelated.
The same holds for a parameter estimating the correlation between the front left/right channels and the back left/right channels.

We thus arrive at a parameterisation that can reconstruct the energies correctly but does not include the information that all front channels were identical, i.e. strongly correlated. It does include the information that the left and right front channels are decorrelated from the back channels and that the center channel is also decorrelated from the back channels. However, the fact that all front channels are the same cannot be derived from such a parameterisation.

This can be overcome by using the following parameterisation taught by the present invention, because the back channels are not included in the estimation of the parameters used on the decoder side to reconstruct the front channels:

$$r_2 = \frac{2\gamma^2 C}{\alpha^2 (B + D)}, \qquad r_1 = \frac{B}{D}$$

According to the present invention, the energy distribution between the center channel 103 and the left front 102 and right front 104 channels is indicated by $r_2$. The energy distribution between the left surround channel 101 and the right surround channel 105 is described by $r_4$. Finally, the energy distribution between the left front channel 102 and the right front channel 104 is given by $r_1$. All parameters are the same as outlined in Fig. 4, except for $r_1$, which here corresponds to the energy distribution between the left front speaker and the right front speaker rather than between the entire left side and the entire right side. For completeness, the parameter $r_5$ is also given, describing the energy distribution between the center channel 103 and the LFE channel 106.

Fig. 6 gives an overview of the preferred parameterisation embodiment of the present invention. The first balance parameter $r_1$ (indicated by the solid line) is a front-left/front-right balance parameter. The second balance parameter $r_2$ is a center/left-right balance parameter. The third balance parameter $r_3$ is a front/back balance parameter.
The fourth balance parameter $r_4$ is a rear-left/rear-right balance parameter. Finally, the fifth balance parameter $r_5$ is a center/LFE balance parameter.

Fig. 4 shows a related situation. The first balance parameter $r_1$, which is drawn as a solid line in Fig. 4 for the case of a down-mix left/right balance, can be replaced by an original front-left/front-right balance parameter defined between the channels B and D as the underlying channel pair. This is drawn as the dashed line $r_1$ in Fig. 4 and corresponds to the solid line $r_1$ in Fig. 5 and Fig. 6.

In a situation with two base channels, the parameters $r_3$ and $r_4$, i.e. the front/back balance parameter and the rear-left/right balance parameter, are replaced by two single-sided front/rear parameters. The first single-sided front/rear parameter $q_3$ can also be regarded as the first balance parameter, derived from the channel pair consisting of the left surround channel A and the left channel B. The second single-sided front/rear balance parameter is the parameter $q_4$, which can be regarded as the second parameter and is based on the second channel pair consisting of the right channel D and the right surround channel E. Again, the two channel pairs are independent of each other. The same is true for the center/left-right balance parameter $r_2$, which has the center channel C as its first channel and the sum of the left and right channels B and D as its second channel.

According to the present invention, another parameterisation that lends itself well to coarse quantisation for a system reconstructing 5.1 channels from one or two channels is defined as follows.

For one channel to 5.1 channels:

$$q_1 = \frac{\beta^2 A}{M}, \quad q_2 = \frac{\alpha^2 B}{M}, \quad q_3 = \frac{\gamma^2 C}{M}, \quad q_4 = \frac{\alpha^2 D}{M}, \quad q_5 = \frac{\beta^2 E}{M}, \quad q_6 = \frac{\delta^2 F}{M}$$

and for two channels to 5.1 channels:

$$q_1 = \frac{\beta^2 A}{L}, \quad q_2 = \frac{\alpha^2 B}{L}, \quad q_3 = \frac{\gamma^2 C}{M}, \quad q_4 = \frac{\alpha^2 D}{R}, \quad q_5 = \frac{\beta^2 E}{R}, \quad q_6 = \frac{\delta^2 F}{M}$$

These parameterisations include more parameters than are strictly required, from a theoretical point of view, to redistribute the energy of the transmitted signals correctly to the reconstructed signals. However, they are very insensitive to quantisation errors.

The parameter set given above for a set-up with two base channels uses several reference channels. In contrast to the parameter configuration in Fig. 6, however, the parameter set in Fig. 7 relies only on down-mix channels, rather than original channels, as reference channels. The balance parameters $q_1$, $q_3$, and $q_4$ are derived from completely different channel pairs.

Several embodiments of the invention have been described in which the channel pairs used to derive balance parameters include only original channels (Fig. 4, Fig. 5, and Fig. 6), include original channels as well as down-mix channels (Fig. 4 and Fig. 5), or rely solely on the down-mix channels as reference channels, as indicated at the bottom of Fig. 7. Nevertheless, it is preferred that the parameter generator included in the surround data encoder 206 of Fig. 2 uses only original channels or combinations of original channels, rather than a base channel or a combination of base channels, as the channels in the channel pairs on which a balance parameter is based. The reason is that one cannot completely guarantee that no energy change occurs in the single base channel or the two stereo base channels during transmission from a surround encoder to a surround decoder. Such energy variations in the down-mix channels or the single down-mix channel can be caused by an audio encoder 205 (Fig. 2) or an audio decoder 302 (Fig. 3) operating under a low bit-rate condition. Such situations can result in a manipulation of the energy of the mono down-mix channel or the stereo down-mix channels, and this manipulation can differ between the left and right stereo down-mix channels or can even be frequency-selective and time-selective.

To be completely safe against such energy variations, an additional level parameter is transmitted, in accordance with the present invention, for each block and frequency band of every down-mix channel. When the balance parameters are based on the original signal rather than the down-mix signal, a single correction factor per band is sufficient, because any energy correction will not influence the balance between the original channels. Even when no additional level parameter is transmitted, a down-mix channel energy variation will not distort the localization of sound sources in the audio image. It will only cause a general loudness variation, which is not as annoying as a migration of a sound source caused by varying balance conditions.

It is important to note that care needs to be taken so that the energy M (of the down-mix channels) is the sum of the energies B, D, A, E, C, and F as outlined above. This is not always the case, because of phase dependencies between the different channels that are down-mixed into one channel. The energy correction factor can be transmitted as an additional parameter $r_M$, and the energy of the down-mix signal received on the decoder side is then defined as:

$$r_M M = \tfrac{1}{2}\bigl(\alpha^2 (B + D) + \beta^2 (A + E) + 2\gamma^2 C + 2\delta^2 F\bigr)$$

Fig. 9 outlines the application of the additional parameter $r_M$. The down-mix signal is modified by the additional parameter $r_M$ in 901 before it is sent to the up-mix modules 701-705. These are the same as in Fig. 7 and are not described further here.
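The level correction with $r_M$ described above can be sketched as follows. The function names are hypothetical, and two points are assumptions of the sketch rather than statements from the text: the down-mix energy is measured as the mean squared sample of one block, and the correction is applied in the signal domain as a gain of $\sqrt{r_M}$ so that the corrected energy equals $r_M M$.

```python
import math


def level_parameter(target_energy, downmix_energy):
    """Encoder side: r_M such that r_M * M equals the target energy.

    target_energy is the weighted sum
    0.5 * (alpha^2 (B+D) + beta^2 (A+E) + 2 gamma^2 C + 2 delta^2 F);
    downmix_energy is the measured energy M of the transmitted down-mix.
    """
    return target_energy / downmix_energy


def apply_level_correction(downmix, r_M):
    """Decoder side: scale the down-mix so its energy becomes r_M * M.

    A gain of sqrt(r_M) on the samples scales the energy by r_M
    (assumption: the correction is applied to the signal of one block
    in one frequency band).
    """
    gain = math.sqrt(r_M)
    return [gain * s for s in downmix]


def energy(x):
    """Mean squared sample, used here as the energy measure."""
    return sum(s * s for s in x) / len(x)
```

In a complete system one such parameter would be computed per block and frequency band, and per down-mix channel when more than one down-mix channel is transmitted.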
It will be apparent to those skilled in the art that the parameter $r_M$ of the single-channel down-mix example above can be extended to one parameter per down-mix channel and is therefore not limited to a single down-mix channel.

Fig. 9a illustrates an inventive level parameter calculator 900, while Fig. 9b shows an inventive level corrector 902. Fig. 9a shows the situation on the encoder side, and Fig. 9b the corresponding situation on the decoder side. The level parameter, or "additional" parameter, $r_M$ is a correction factor giving a certain energy ratio. The following exemplary scenario is assumed for explanation. For a certain original multi-channel signal there exists a "master down-mix" on the one hand and a "parameter down-mix" on the other hand. The master down-mix has been generated by a sound engineer in a sound studio, based for example on subjective quality impressions. In addition, a certain audio storage medium also includes the parameter down-mix, which has been produced by, for example, the surround encoder 203 of Fig. 2. The parameter down-mix includes one base channel or two base channels, which form the basis for the multi-channel reconstruction using the set of balance parameters or any other parametric representation of the original multi-channel signal.

It may be the case, for example, that a broadcaster wishes to transmit not the parameter down-mix but the master down-mix from a transmitter to a receiver. In addition, in order to upgrade the master down-mix to a multi-channel representation, the broadcaster also transmits a parametric representation of the original multi-channel signal. Because the energy (in one band and in one block) can, and typically will, differ between the master down-mix and the parameter down-mix, a relative level parameter $r_M$ is generated in block 900 and transmitted to the receiver as an additional parameter.
The level parameter is derived from the master downmix and the parameter downmix and is preferably the ratio between the energies of the master downmix and of the parameter downmix within one block and one band. In general, the level parameter is calculated as the ratio of the sum of the energies of the original channels ($E_{orig}$) to the energy of the downmix channel(s), where the downmix channel(s) can be the parameter downmix ($E_{PD}$), the master downmix ($E_{MD}$) or any other downmix signal. Typically, the energy of the specific downmix signal that is transmitted from the encoder to the decoder is used.

Fig. 9b illustrates a decoder-side implementation of the level parameter usage. The level parameter and the downmix signal are input into the level corrector block 902. The level corrector corrects the single base channel or the several base channels depending on the level parameter. Since the additional parameter $r_M$ is a relative value, it is multiplied by the energy of the corresponding base channel.

Although Figs. 9a and 9b show a situation in which the level correction is applied to the downmix channel or downmix channels, the level parameter can also be integrated into the upmix matrix. To this end, each occurrence of $M$ in the equations of Fig. 8 is replaced by the term $r_M M$.

For the case of reconstructing 5.1 channels from 2 channels, the following observation is made. If the present invention is used with an underlying audio codec as outlined by 205 and 302 in Figs. 2 and 3, some further consideration is needed. Consider the IID parameters defined earlier, where $r_1$ was defined as:

$$r_1 = \frac{L}{R} = \frac{\alpha^2 B + \beta^2 A + \gamma^2 C + \delta^2 F}{\alpha^2 D + \beta^2 E + \gamma^2 C + \delta^2 F}$$

Because the system reconstructs 5.1 channels from 2 channels, this parameter is implicitly available on the decoder side, provided that the two transmitted channels are the stereo downmix of the surround channels.

However, an audio codec operating under a bit rate constraint may modify the spectral distribution, so that the $L$ and $R$ energies measured on the decoder side differ from their values on the encoder side. According to the present invention, this influence on the energy distribution of the reconstructed channels vanishes if the following parameter is transmitted also for the case of reconstructing 5.1 channels from two channels:

$$r_1 = \frac{B}{D}$$

If signaling means are provided, the encoder can code the current signal segment using different parameter sets and select the set of IID parameters that gives the lowest overhead for the particular signal segment being processed. It is possible that the energy levels of the right front and back channels are similar, and that the energy levels of the front and back left channels are similar, but that they differ significantly from the levels in the right front and back channels. Given delta coding of the parameters and subsequent entropy coding, it can then be more efficient to use the parameters $q_3$ and $q_4$ instead of $r_3$ and $r_4$. For another signal segment with different characteristics, a different parameter set may give a lower bit rate overhead. The present invention allows free switching between different parameter representations in order to minimize the bit rate overhead for the currently encoded signal segment, given the characteristics of that segment. The ability to switch between different parameterizations of the IID parameters in order to obtain the lowest possible bit rate overhead, and to provide signaling means that indicate which parameterization is currently used, is an essential feature of the present invention.
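The selection between parameter sets can be sketched as follows. This is a minimal illustration under stated assumptions: the downmix weights are set to 1, the center and LFE channels are ignored in $r_3$, the quantizer step of 1.5 dB is arbitrary, and the sum of absolute delta values is only a stand-in for the true entropy-coded bit count. None of these choices come from the specification.

```python
import math

def quantize_db(ratio, step_db=1.5):
    """Quantize an energy ratio on a dB grid (step size is an assumption)."""
    return round(10.0 * math.log10(ratio) / step_db)

def delta_cost(indices):
    """Proxy for entropy-coded size: sum of |delta| along the frequency direction."""
    deltas = [indices[0]] + [b - a for a, b in zip(indices, indices[1:])]
    return sum(abs(d) for d in deltas)

def parameter_sets(A, B, D, E):
    """Two candidate parameterizations per band from channel energies.

    A, E: rear-left / rear-right energies; B, D: front-left / front-right.
    """
    r3 = [(a + e) / (b + d) for a, b, d, e in zip(A, B, D, E)]  # front/back
    r4 = [a / e for a, e in zip(A, E)]                          # rear left/right
    q3 = [a / b for a, b in zip(A, B)]                          # left-side front/back
    q4 = [e / d for e, d in zip(E, D)]                          # right-side front/back
    return {"r3_r4": (r3, r4), "q3_q4": (q3, q4)}

def choose_set(sets):
    """Pick the set with the lowest cost; its name would be signaled to the decoder."""
    costs = {
        name: sum(delta_cost([quantize_db(v) for v in track]) for track in tracks)
        for name, tracks in sets.items()
    }
    return min(costs, key=costs.get), costs

# Left front/back levels are equal, right front/back levels are equal,
# but left and right differ from band to band.
left = [1.0, 4.0, 1.0, 8.0]
right = [1.0, 1.0, 2.0, 1.0]
best, costs = choose_set(parameter_sets(A=left, B=left, D=right, E=right))
```

In this constructed segment the single-sided parameters are constant across bands, so their delta-coded cost is lower and that set would be chosen and signaled.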
Furthermore, the delta coding of the parameters can be done in the frequency direction or in the time direction, and delta coding between different parameters is also possible. According to the present invention, a parameter can be delta coded with respect to any other parameter, provided that signaling means indicate the particular delta coding used.

An interesting feature of any coding scheme is the ability to do scalable coding. This means that the coded bitstream can be divided into several layers. The core layer can be decoded by itself, and the higher layers can be decoded to enhance the decoded core layer signal. The number of available layers may vary with the circumstances, but as long as the core layer is available, the decoder can produce output samples. The parameterization for multi-channel coding outlined above, using the parameters $r_1$ to $r_5$, lends itself very well to scalable coding. It is therefore possible to store the data for, for example, the two surround channels (A and E) in an enhancement layer, that is, the parameters $r_3$ and $r_4$, and the parameters corresponding to the front channels in a core layer, represented by the parameters $r_1$ and $r_2$.

Fig. 10 outlines a scalable bitstream implementation according to the present invention. The bitstream layers are illustrated by 1001 and 1002, where 1001 is the core layer, which holds the waveform-coded downmix signal and the parameters $r_1$ and $r_2$ required to reconstruct the front channels (102, 103 and 104). The enhancement layer illustrated by 1002 holds the parameters for reconstructing the back channels (101 and 105).
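The layering described for Fig. 10 can be sketched as a simple data layout. The dictionary keys, function names and channel labels below are hypothetical and chosen only for illustration; the actual bitstream syntax is not specified in this section.

```python
FRONT_CHANNELS = ("front_left", "center", "front_right")    # 102, 103, 104
BACK_CHANNELS = ("surround_left", "surround_right")         # 101, 105

def split_layers(downmix, params):
    """Place the downmix and r1, r2 in the core layer, r3 and r4 in the enhancement layer."""
    core = {"downmix": downmix, "r1": params["r1"], "r2": params["r2"]}
    enhancement = {"r3": params["r3"], "r4": params["r4"]}
    return core, enhancement

def decodable_channels(core, enhancement=None):
    """Channels a decoder can reconstruct from the layers it actually received."""
    if core is None:
        raise ValueError("the core layer is required to produce any output")
    channels = list(FRONT_CHANNELS)
    if enhancement is not None:
        channels += list(BACK_CHANNELS)
    return channels

core, enhancement = split_layers(
    downmix=[0.1, 0.2],
    params={"r1": 1.0, "r2": 0.5, "r3": 0.25, "r4": 1.0},
)
front_only = decodable_channels(core)
full = decodable_channels(core, enhancement)
```

A decoder that receives only the core layer still produces output, and the back channels are added when the enhancement layer is available.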
Another important aspect of the present invention is the use of decorrelators in a multi-channel configuration. The concept of using a decorrelator was elaborated for the one-to-two channel case in the PCT/SE02/01372 document. However, when this theory is extended to more than two channels, several problems arise, which are solved by the present invention.

Elementary mathematics shows that, in order to obtain M mutually decorrelated signals from N signals, M-N decorrelators are required, where all the different decorrelators are functions that create mutually orthogonal output signals from a common input signal. A decorrelator is typically an all-pass or nearly all-pass filter that, given an input $x(t)$, produces an output $y(t)$ with $E[|y|^2] = E[|x|^2]$ and an almost vanishing cross-correlation $E[y x^*]$. Further perceptual criteria enter into the design of a good decorrelator. Examples of design goals are to minimize the comb-filter character when the original signal is added to the decorrelated signal, and to minimize the effect of an impulse response that is sometimes too long for transient signals. Some prior art decorrelators use an artificial reverberator to decorrelate. The prior art also includes fractional delays, obtained for example by modifying the phase of the complex subband samples, to achieve a higher echo density and hence more time diffusion.

The present invention proposes methods of modifying a reverberation-based decorrelator in order to obtain multiple decorrelators that create mutually decorrelated output signals from a common input signal. Two decorrelators are mutually decorrelated if their outputs $y_1(t)$ and $y_2(t)$ have a vanishing or almost vanishing cross-correlation given the same input.
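The criterion of mutual decorrelation can be checked numerically on a toy example. The sketch below assumes a single all-pass link of the form $H(z) = (q z^{-m} - a)/(1 - a q z^{-m})$ with $|q| = 1$, and derives two variants by flipping the sign of the coefficient $a$ and by adding a phase offset to $q$. The coefficient values are arbitrary, and a practical decorrelator would chain several such links.

```python
import cmath

def allpass_impulse_response(a, q, m, length):
    """Impulse response of H(z) = (q z^-m - a) / (1 - a q z^-m)."""
    x = [1.0] + [0.0] * (length - 1)
    y = [0j] * length
    for n in range(length):
        x_delayed = x[n - m] if n >= m else 0.0
        y_delayed = y[n - m] if n >= m else 0j
        # Difference equation: y[n] = -a x[n] + q x[n-m] + a q y[n-m]
        y[n] = -a * x[n] + q * x_delayed + a * q * y_delayed
    return y

def inner(h1, h2):
    """Inner product of two impulse responses (cross-correlation at lag zero)."""
    return sum(u * v.conjugate() for u, v in zip(h1, h2))

a, m, length = 0.6, 7, 7 * 80
q = cmath.exp(1j * 0.3)

h_ref = allpass_impulse_response(a, q, m, length)
h_neg = allpass_impulse_response(-a, q, m, length)                        # a' = -a
h_rot = allpass_impulse_response(a, q * cmath.exp(1j * cmath.pi), m, length)  # q' = q e^{jC}, C = pi

energy_ref = inner(h_ref, h_ref).real   # close to 1 for an all-pass link
corr_neg = abs(inner(h_ref, h_neg))     # small compared with 1
corr_rot = abs(inner(h_ref, h_rot))     # small compared with 1
```

For white-noise input, the inner product of the impulse responses equals the normalized cross-correlation of the two outputs, so small values indicate that the modified links are nearly decorrelated from the reference link.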
Assuming the input is stationary white noise, it follows that the impulse responses $h_1$ and $h_2$ must be orthogonal, in the sense that their inner product vanishes or almost vanishes. Sets of pairwise mutually decorrelated decorrelators can be constructed in several ways. An efficient way of making such modifications is to alter the phase rotation factor $q$ that is part of the fractional delay.

The present invention stipulates that the phase rotation factors can be part of the delay lines in the all-pass filters or can be just an overall fractional delay. In the latter case the method is not limited to all-pass or reverberation-like filters, but can also be applied to, for example, simple delays that include a fractional delay part. An all-pass filter link in the decorrelator can be described in the Z-domain as:

$$H(z) = \frac{q z^{-m} - a}{1 - a q z^{-m}}$$

where $q$ is the complex-valued phase rotation factor ($|q| = 1$), $m$ is the delay line length in samples, and $a$ is the filter coefficient. For stability reasons, the magnitude of the filter coefficient has to be limited to $|a| < 1$. However, by using the alternative filter coefficient $a' = -a$, a new reverberator is defined that has the same reverberation decay properties but whose output is significantly uncorrelated with the output of the unmodified reverberator. Furthermore, the phase rotation factor $q$ can be modified by, for example, adding a constant phase offset, $q' = q e^{jC}$. The constant $C$ can be used as a constant phase offset, or it can be scaled so that it corresponds to a constant time offset for all frequency bands to which it is applied. The phase offset constant $C$ can also be a random value that is different for each frequency band.

According to the present invention, the generation of n channels from m channels is performed by applying an upmix matrix $H$ of size $n \times (m+p)$ to a column vector of signals of size $(m+p) \times 1$:

$$y = \begin{bmatrix} m \\ s \end{bmatrix}$$

where $m$ contains the m downmixed and coded signals, and the p signals in $s$ are both mutually decorrelated and decorrelated from all signals in $m$. These decorrelated signals are produced from the signals in $m$ by decorrelators. The n reconstructed signals $a'$, $b'$, ... are then contained in the column vector:

$$x' = H y$$

This is illustrated by Fig. 11, where the decorrelated signals are created by the decorrelators 1102, 1103 and 1104. The upmix matrix $H$ is given by 1101 and operates on the vector $y$ to give the output signal $x'$.

Let $R = E[x x^*]$ be the correlation matrix of the original signal vector, and let $R' = E[x' x'^*]$ be the correlation matrix of the reconstructed signal. Here and in the following, for a matrix or vector $X$ with complex entries, $X^*$ denotes the adjoint matrix, that is, the complex conjugate transpose of $X$.

The diagonal of $R$ contains the energy values $A$, $B$, $C$, ... and can be decoded up to a total energy level from the energy quotas defined above. Since $R^* = R$, there are only $n(n-1)/2$ different off-diagonal cross-correlation values, and these contain the information that is to be reconstructed fully or partly by adjusting the upmix matrix $H$. A reconstruction of the full correlation structure corresponds to the case $R' = R$. A reconstruction of the correct energy levels only corresponds to the case where $R'$ and $R$ are equal on their diagonals.

In the case of n channels from $m = 1$ channel, a reconstruction of the full correlation structure is achieved by using $p = n - 1$ mutually decorrelated decorrelators and an upmix matrix $H$ that satisfies the condition:

$$H H^* = \frac{1}{M} R$$

where $M$ is the energy of the single transmitted signal. Since $R$ is positive semidefinite, it is well known that such a solution exists. Moreover, $n(n-1)/2$ degrees of freedom are left over for the design of $H$, and these are used in the present invention to obtain further desirable properties of the upmix matrix. A central design criterion is that the dependence of $H$ on the transmitted correlation data shall be smooth.
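To illustrate that a matrix satisfying $H H^* = R/M$ exists, the sketch below builds one particular real-valued solution with a Cholesky factorization. This is an assumption made for the example and is not the parameterization used by the invention; the correlation matrix values are hypothetical. Column 0 of the result weights the transmitted signal and the remaining $p = n - 1$ columns weight the decorrelator outputs.

```python
import math

def cholesky(matrix):
    """Lower-triangular L with L L^T = matrix (matrix must be positive definite)."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            acc = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(matrix[i][i] - acc)
            else:
                low[i][j] = (matrix[i][j] - acc) / low[j][j]
    return low

def gram(h):
    """Compute H H^T for a real matrix H given as a list of rows."""
    return [[sum(x * y for x, y in zip(row_i, row_j)) for row_j in h] for row_i in h]

# Hypothetical 3-channel correlation matrix R and downmix energy M.
R = [[4.0, 2.0, 0.4],
     [2.0, 9.0, 1.5],
     [0.4, 1.5, 1.0]]
M = 2.0

target = [[value / M for value in row] for row in R]
H = cholesky(target)
HHt = gram(H)
error = max(abs(HHt[i][j] - target[i][j]) for i in range(3) for j in range(3))
```

With this triangular choice the first output channel receives no decorrelated signal at all, which shows how the remaining degrees of freedom change the character of the upmix without changing the reconstructed correlation structure.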

One convenient way of parameterizing the upmix matrix is $H = UDV$, where $U$ and $V$ are orthogonal matrices and $D$ is a diagonal matrix. The squares of the absolute values of $D$ can be chosen equal to the eigenvalues of $R/M$. Omitting $V$ and sorting the eigenvalues so that the largest value is applied to the first coordinate minimizes the overall energy of decorrelated signals in the output. In the real case, the orthogonal matrix $U$ is parameterized by $n(n-1)/2$ rotation angles. Transmitting the correlation data in the form of those angles and the n diagonal values of $D$ would immediately give the desired smooth dependence of $H$. However, since energy data has to be transformed into eigenvalues, scalability is sacrificed by this approach.

A second method taught by the present invention separates the energy part from the correlation part in $R$ by defining a normalized correlation matrix $R_0$ through $R = G R_0 G$, where $G$ is a diagonal matrix whose diagonal values are equal to the square roots of the diagonal entries of $R$, that is, $\sqrt{A}$, $\sqrt{B}$, ..., and $R_0$ has ones on the diagonal. Let $H_0$ be an orthogonal upmix matrix that defines the preferred normalized upmix in the case of totally uncorrelated signals of equal energy. Examples of such preferred upmix matrices are:

$$\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}, \quad \frac{1}{2}\begin{bmatrix} 1 & 1 & \sqrt{2} \\ 1 & 1 & -\sqrt{2} \\ \sqrt{2} & -\sqrt{2} & 0 \end{bmatrix}, \quad \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$

The upmix is then defined by $H = G S H_0 / \sqrt{M}$, where the matrix $S$ solves $S S^* = R_0$. The dependence of this solution on the normalized cross-correlation values in $R_0$ is chosen to be continuous and such that $S$ is equal to the identity matrix in the case $R_0 = I$.
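The second method can be sketched for the smallest case, two output channels from one transmitted channel. The example assumes a real correlation coefficient `rho`, uses the symmetric square root of $R_0$ as $S$ (one continuous choice that reduces to the identity for `rho = 0`), and picks a 2×2 rotation as $H_0$. These choices and all numeric values are assumptions for illustration.

```python
import math

def normalized_sqrt_2x2(rho):
    """Symmetric S with S S^T = [[1, rho], [rho, 1]]; S is the identity for rho = 0."""
    plus, minus = math.sqrt(1.0 + rho), math.sqrt(1.0 - rho)
    c, s = (plus + minus) / 2.0, (plus - minus) / 2.0
    return [[c, s], [s, c]]

def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def upmix_matrix(energy_a, energy_b, rho, downmix_energy):
    """H = G S H0 / sqrt(M) for the 1-to-2 channel case."""
    g = [[math.sqrt(energy_a), 0.0], [0.0, math.sqrt(energy_b)]]
    s = normalized_sqrt_2x2(rho)
    inv_sqrt2 = 1.0 / math.sqrt(2.0)
    h0 = [[inv_sqrt2, -inv_sqrt2], [inv_sqrt2, inv_sqrt2]]  # orthogonal
    scale = 1.0 / math.sqrt(downmix_energy)
    return [[scale * v for v in row] for row in matmul(matmul(g, s), h0)]

A, B, rho, M = 4.0, 1.0, 0.5, 5.0
H = upmix_matrix(A, B, rho, M)

# Check H H^T against R / M, with R = G R0 G.
cross = rho * math.sqrt(A * B)
target = [[A / M, cross / M], [cross / M, B / M]]
Ht = [[H[j][i] for j in range(2)] for i in range(2)]
HHt = matmul(H, Ht)
error = max(abs(HHt[i][j] - target[i][j]) for i in range(2) for j in range(2))
identity_case = normalized_sqrt_2x2(0.0)
```

Because the energies enter only through $G$ and the correlation only through $S$, energy data and correlation data stay separate, which is the property that keeps this parameterization compatible with scalable coding.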
Dividing the n channels into groups of fewer channels is a convenient way to reconstruct a partial cross-correlation structure. According to the present invention, a particularly advantageous grouping for the case of 5.1 channels from 1 channel is {a, e}, {c}, {b, d}, {f}, where no decorrelation is applied to the groups {c} and {f}, and the groups {a, e} and {b, d} are produced by an upmix of the same downmixed/decorrelated pair. For these two subsystems, the preferred normalized upmixes in the totally uncorrelated case are chosen as, respectively:

$$\frac{1}{\sqrt{2}}\begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}, \quad \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

Thus only two of the total of 15 cross-correlations are transmitted and reconstructed, namely those within the channel pairs {a, e} and {b, d}. In the terminology used above, this is an example of a design for the case $n = 6$, $m = 1$ and $p = 1$. The upmix matrix $H$ is of size $6 \times 2$, and the two entries in the second column at rows 3 and 6, corresponding to the outputs $c'$ and $f'$, are zero.

A third approach taught by the present invention for incorporating decorrelated signals is the simpler point of view that each output channel has a different decorrelator, giving rise to decorrelated signals $s_a$, $s_b$, .... The reconstructed signals are then formed as:

$$a' = \sqrt{A/M}\,(m \cos\varphi_a + s_a \sin\varphi_a)$$

$$b' = \sqrt{B/M}\,(m \cos\varphi_b + s_b \sin\varphi_b)$$

and so on. The parameters $\varphi_a$, $\varphi_b$, ... control the amount of decorrelated signal present in the output channels $a'$, $b'$, .... The correlation data is transmitted in the form of these angles. It is easy to compute that the resulting normalized cross-correlation between, for example, the channels $a'$ and $b'$ is equal to the product $\cos\varphi_a \cos\varphi_b$. Since the number of pairwise cross-correlations is $n(n-1)/2$ and there are n decorrelators, it is in general not possible with this approach to match a given correlation structure if $n > 3$. The advantages, however, are a very simple and stable decoding method and direct control of the amount of decorrelated signal produced in each output channel. This allows the mixing of decorrelated signals to be based on perceptual criteria that incorporate, for example, the energy level differences of channel pairs.
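The relation between the angles and the resulting cross-correlation in the third approach can be verified numerically. In the sketch below, three mutually orthogonal sinusoids of equal energy stand in for the transmitted signal and for the two decorrelator outputs; this substitution, the energies and the angles are assumptions chosen so that the check is deterministic.

```python
import math

N = 64  # number of samples in one analysis block

def tone(k):
    """Unit-energy sinusoid; tones with different k are orthogonal over N samples."""
    return [math.sqrt(2.0) * math.cos(2.0 * math.pi * k * n / N) for n in range(N)]

def mean_product(x, y):
    """Sample estimate of E[x y]."""
    return sum(u * v for u, v in zip(x, y)) / len(x)

def upmix_channel(energy, m, s, phi, downmix_energy):
    """x' = sqrt(energy / M) * (m cos(phi) + s sin(phi))."""
    gain = math.sqrt(energy / downmix_energy)
    return [gain * (mv * math.cos(phi) + sv * math.sin(phi)) for mv, sv in zip(m, s)]

m, s_a, s_b = tone(1), tone(2), tone(3)   # stand-ins for m, s_a, s_b
M = mean_product(m, m)                    # energy of the transmitted signal

A, B = 4.0, 0.25          # target channel energies
phi_a, phi_b = 0.4, 1.1   # transmitted angles

a_out = upmix_channel(A, m, s_a, phi_a, M)
b_out = upmix_channel(B, m, s_b, phi_b, M)

energy_a = mean_product(a_out, a_out)
correlation = mean_product(a_out, b_out) / math.sqrt(
    energy_a * mean_product(b_out, b_out)
)
expected = math.cos(phi_a) * math.cos(phi_b)
```

The output energy equals the target energy regardless of the angle, and only the component shared through $m$ contributes to the cross-correlation, which gives the product of the two cosines.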
For the case of n channels from $m > 1$ channels, the correlation matrix $R_y = E[y y^*]$ can no longer be assumed to be diagonal, and this has to be taken into account when matching $R' = H R_y H^*$ to the target $R$. A simplification occurs because $R_y$ has the block matrix structure:

$$R_y = \begin{bmatrix} R_m & 0 \\ 0 & R_s \end{bmatrix}$$

where $R_m = E[m m^*]$ and $R_s = E[s s^*]$. Furthermore, assuming mutually decorrelated decorrelators, the matrix $R_s$ is diagonal. Note that this also affects the upmix design with respect to the reconstruction of correct energies. The solution is to compute in the decoder, or to transmit from the encoder, information about the correlation structure $R_m$ of the downmixed signals.

For the case of 5.1 channels from 2 channels, a preferred method for the upmix is:

$$\begin{bmatrix} a' \\ b' \\ c' \\ d' \\ e' \\ f' \end{bmatrix} = \begin{bmatrix} h_{11} & 0 & h_{13} & 0 \\ h_{21} & 0 & h_{23} & 0 \\ h_{31} & h_{32} & 0 & 0 \\ 0 & h_{42} & 0 & h_{44} \\ 0 & h_{52} & 0 & h_{54} \\ h_{61} & h_{62} & 0 & 0 \end{bmatrix} \begin{bmatrix} m_1 \\ m_2 \\ s_1 \\ s_2 \end{bmatrix}$$

where $s_1$ is obtained from decorrelation of $m_1 = l_d$ and $s_2$ is obtained from decorrelation of $m_2 = r_d$.

Here the groups {a, b} and {d, e} are treated as separate 1-to-2 channel systems that take the pairwise cross-correlations into account. For the channels c and f, the weights are adjusted so that:

$$E\left[|h_{31} m_1 + h_{32} m_2|^2\right] = C, \qquad E\left[|h_{61} m_1 + h_{62} m_2|^2\right] = F$$

The present invention can be implemented in both hardware chips and DSPs, for various kinds of systems, for storage or transmission of signals, analogue or digital, using arbitrary codecs. Figs. 2 and 3 show a possible implementation of the present invention. In this example, a system operating on six input signals (a 5.1 channel configuration) is shown. In Fig. 2, which shows the encoder side, the analogue input signals of the separate channels are converted to digital signals 201 and analyzed using a filterbank for every channel 202. The output of the filterbanks is fed to the surround encoder 203, which includes a parameter generator and performs a downmix that creates the one or two channels encoded by the audio encoder 205. Furthermore, the surround parameters, such as the IID and ICC parameters, are extracted according to the present invention, as is the control data 204 that outlines the time-frequency grid of the data and indicates which parameterization is used. The extracted parameters are coded 206 as taught by the present invention, either switching between different parameterizations or arranging the parameters in a scalable fashion. The surround parameters 207, the control signals and the encoded downmix signals 208 are multiplexed 209 into a serial bitstream.

Fig. 3 shows a typical decoder implementation, that is, an apparatus for generating a multi-channel reconstruction. Here it is assumed that the audio decoder outputs a signal in a frequency domain representation, for example the output of the MPEG-4 High Efficiency AAC decoder prior to the QMF synthesis filterbank. The serial bitstream is demultiplexed 301, the encoded surround data is fed to the surround data decoder 303, and the encoded downmix channels are fed to the audio decoder 302, which in this example is an MPEG-4 High Efficiency AAC decoder. The surround data decoder decodes the surround data and feeds it to the surround decoder 305, which includes an upmixer that reconstructs six channels based on the decoded downmix channels, the surround data and the control signals. The frequency domain output of the surround decoder is synthesized 306 into time domain signals, which are subsequently converted to analogue signals by the DAC 307.

Although the present invention has mainly been described with reference to the generation and usage of balance parameters, it is emphasized here that the same grouping of channel pairs used for deriving the balance parameters is preferably also used for calculating inter-channel coherence parameters or "width" parameters between the two channels of each pair. In addition, inter-channel time differences or a kind of "phase cues" can also be derived using the same channel pairs as used for the balance parameter calculation. On the receiver side, these parameters can be used in addition to, or as an alternative to, the balance parameters to generate a multi-channel reconstruction. Alternatively, the inter-channel coherence parameters or even the inter-channel time differences can be used in addition to other inter-channel level differences determined with respect to other reference channels. In view of the scalability feature of the present invention discussed in connection with Figs. 10a and 10b, it is, however, preferred to use the same channel pairs for all parameters, so that in a scalable bitstream each scaling layer includes all parameters for reconstructing the subgroup of output channels that can be generated by the respective scaling layer, as outlined in the penultimate column of the table in Fig. 10b. The present invention is also useful when only the coherence parameters or the time difference parameters between the respective channel pairs are calculated and transmitted to a decoder. In this case, the level parameters already exist at the decoder for use when a multi-channel reconstruction is performed.

Depending on certain implementation requirements of the inventive methods, the inventive methods can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, in particular a disk or a CD having electronically readable control signals stored thereon, which cooperate with a programmable computer system such that the inventive methods are performed. Generally, the present invention is therefore a computer program product with a program code stored on a machine-readable carrier, the program code being operative for performing the inventive methods when the computer program product runs on a computer. In other words, the inventive methods are therefore a computer program having a program code for performing at least one of the inventive methods when the computer program runs on a computer.

[Brief description of the drawings]
Fig. 1 illustrates the nomenclature used in the present invention for a 5.1 channel configuration;
Fig. 2 illustrates a possible encoder implementation of the present invention;
Fig. 3 illustrates a possible decoder implementation of the present invention;
Fig. 4 illustrates a preferred parameterization of the multi-channel signal according to the present invention;
Fig. 5 illustrates a preferred parameterization of the multi-channel signal according to the present invention;
Fig. 6 illustrates a preferred parameterization of the multi-channel signal according to the present invention;
Fig. 7 illustrates a schematic set-up for a downmix scheme that generates a single base channel or two base channels;
Fig. 8 illustrates a schematic representation of an upmix scheme that is based on the inventive balance parameters and on information about the downmix scheme;
Fig. 9a illustrates the determination of a level parameter on the encoder side;
Fig. 9b illustrates the usage of a level parameter on the decoder side;
Fig. 10a illustrates a scalable bitstream that has different parts of the multi-channel parameterization in different layers of the bitstream;
Fig. 10b illustrates a scalability table indicating which channels can be constructed using which balance parameters, and which balance parameters and channels are not used or calculated; and
Fig. 11 illustrates the application of the upmix matrix according to the present invention.

[Description of main reference numerals]
101 Left surround channel
102 Front left channel
103 Center channel
104 Front right channel
105 Right surround channel
106 LFE channel
201 ADC
202 Analysis filterbank
203 Surround encoder
204 Control signal
205 Audio encoder
206 Surround data encoder
207 Surround parameters
208 Encoded downmix signal
209 Multiplexer
301 Demultiplexer
302 Audio decoder
303 Surround data decoder
304 Control signal
305 Surround decoder
306 Synthesis filterbank
307 DAC
700 Downmixer
900 Level parameter calculator
902 Level corrector
1001 Bitstream layer
1002 Bitstream layer
1101 Decorrelator
1103 Decorrelator
1104 Decorrelator
l_d, r_d Stereo downmix channels
m Mono channel
r_1 Parameter
r_2 Parameter
r_3 Parameter
r_4 Parameter
r_5 Parameter
r_M Additional parameter
α Weighting factor
β Weighting factor
γ Weighting factor
δ Weighting factor
A Channel
B Channel
C Channel
D Channel
E Channel
E_MD Master downmix
E_orig Energy of the original channels
E_PD Parameter downmix
F Energy of the reconstructed channel

Claims (1)

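Patent application No. 94126934, "Apparatus for generating a parametric representation of a multi-channel input signal and method for representing multi-channel audio signals" (amended July 2008)

X. Claims:

1. An apparatus for generating a parametric representation of a multi-channel input signal having at least three original channels, the apparatus comprising:
a parameter generator (203) for generating a first balance parameter, a first coherence parameter or a first time difference parameter between a first channel pair, and for generating a second balance parameter, a second coherence parameter or a second time difference parameter between a second channel pair, the balance parameters, coherence parameters or time parameters forming the parametric representation;
wherein the first channel pair has two channels that are different from the two channels of the second channel pair,
wherein each channel of the two channel pairs is one of the original channels, a weighted or unweighted combination of the original channels, a downmix channel, or a weighted or unweighted combination of at least two downmix channels, and
wherein the first channel pair and the second channel pair include information on the three original channels.

2. The apparatus of claim 1, wherein the original channels include a left channel (B), a right channel (D) and a center channel (C), and wherein the second balance parameter ($r_2$) is a center balance parameter, and the second channel pair includes the center channel as a first channel and a channel combination including the left channel and the right channel as a second channel.

3. The apparatus of claim 2, wherein the parameter generator is operative to calculate the center balance parameter according to the following equation:

$$r_2 = \frac{\gamma^2\, 2C}{\alpha^2 (B + D)}$$

where $r_2$ is the center balance parameter, $C$ denotes the center channel, $B$ denotes a left channel, $D$ denotes a right channel, and $\gamma$ and $\alpha$ denote downmix factors.

4. The apparatus of any one of claims 1 to 3, wherein the first balance parameter ($r_1$) is a left/right balance parameter, and wherein the first channel pair includes a left channel or a left downmix channel as a first channel and a right channel or a right downmix channel as a second channel.

5. The apparatus of claim 4, wherein the parameter generator is operative to calculate the first balance parameter according to the following equations:

$$r_1 = \frac{L}{R} = \frac{\alpha^2 B + \beta^2 A + \gamma^2 C + \delta^2 F}{\alpha^2 D + \beta^2 E + \gamma^2 C + \delta^2 F} \quad \text{or} \quad r_1 = \frac{B}{D}$$

where $r_1$ is the first balance parameter, $L$ is a first downmix channel, $R$ is a second downmix channel, $B$ denotes a left channel, $D$ denotes a right channel, $A$ denotes a rear-left channel, $E$ denotes a rear-right channel, $C$ denotes a center channel, $F$ denotes a low frequency enhancement channel, and $\alpha$, $\beta$, $\gamma$ and $\delta$ are downmix factors.

6. The apparatus of claim 1, wherein the original channels include a rear-left channel (A) and a rear-right channel (E), and wherein the parameter generator is operative to generate a front/back parameter between a front/back channel pair as a third balance parameter ($r_3$) or as one of the first and second balance parameters, the front/back channel pair having a channel combination including the rear-left channel and the rear-right channel as a first channel and another channel combination including a left channel and a right channel as a second channel.

7. The apparatus of claim 6, wherein the parameter generator is operative to calculate the front/back parameter ($r_3$) according to the following equation:

$$r_3 = \frac{\beta^2 (A + E)}{\alpha^2 (B + D) + \gamma^2\, 2C}$$

where $r_3$ is the front/back balance parameter, $A$ is a rear-left channel, $E$ is a rear-right channel, $B$ denotes a left channel, $D$ denotes a right channel, $C$ denotes a center channel, and $\alpha$, $\beta$ and $\gamma$ denote downmix parameters.

8. The apparatus of claim 1, wherein the original multi-channel signal includes a rear-left channel and a rear-right channel, and wherein the parameter generator is operative to generate a rear left/right balance parameter ($r_4$) between a rear left/right channel pair as an additional balance parameter or as the first or second balance parameter, the rear left/right channel pair having the rear-left channel as a first channel and the rear-right channel as a second channel.

9. The apparatus of claim 1, wherein the original multi-channel signal includes a low frequency enhancement channel and a center channel, and wherein the parameter generator is operative to generate a low frequency enhancement balance parameter between a low frequency enhancement channel pair as an additional balance parameter or as the first or second balance parameter, the low frequency enhancement channel pair having the low frequency enhancement channel as a first channel and the center channel or a channel combination as a second channel, the channel combination including the center channel and the left and right channels of the original channels.

10. The apparatus of claim 9, wherein the parameter generator is operative to calculate the low frequency enhancement balance parameter according to the following equation:

$$r_5 = \frac{\delta^2\, 2F}{\alpha^2 (B + D) + \beta^2 (A + E) + \gamma^2\, 2C}$$

where $A$ corresponds to a rear-left channel, $E$ corresponds to a rear-right channel, $B$ corresponds to a left channel, $D$ corresponds to a right channel, $C$ corresponds to a center channel, $F$ corresponds to the low frequency enhancement channel, $\alpha$, $\beta$, $\gamma$ and $\delta$ are downmix factors, and $r_5$ is the low frequency enhancement balance parameter.

11. The apparatus of claim 1, further comprising a data stream generator for generating a scalable data stream (1001 and 1002), the data stream generator being operative to input the first or second balance parameter into a lower scaling layer and to input any other parameter into a higher scaling layer.

12. The apparatus of claim 11, wherein the parameter generator is operative to generate one or more balance parameters in addition to the first or second balance parameter, and wherein the data stream generator is operative to input the one or more additional balance parameters into a single higher scaling layer or into a plurality of higher scaling layers.

13. The apparatus of claim 12, wherein the data stream generator is operative to introduce each additional parameter into a dedicated scaling layer.

14. The apparatus of claim 1, wherein the parameter generator is operative to generate a left/right balance parameter as the first balance parameter, a center balance parameter as the second balance parameter, a front/back balance parameter as a third balance parameter, a rear left/right balance parameter as a fourth balance parameter, and a low frequency enhancement balance parameter as a fifth balance parameter, and wherein the data stream generator is operative to input the first and second balance parameters into a lower scaling layer and to input the third to fourth balance parameters, or the corresponding coherence parameters or the corresponding time differences, into one or more higher scaling layers.

15. The apparatus of claim 1, wherein the parameter generator is operative to generate at least one single-sided front/back balance parameter ($q_3$, $q_4$) between a single-sided front/back channel pair as one of the first and second balance parameters or as an additional balance parameter, the single-sided front/back channel pair having a rear-left channel as a first channel and a left channel as a second channel, or a rear-right channel as a first channel and a right channel as a second channel.

16. The apparatus of claim 1, wherein one of the first and second balance parameters is a left balance parameter, and the channel pair includes a left downmix channel as a first channel and a left original channel or a rear-left original channel as a second channel, or
wherein one of the first and second balance parameters is a right balance parameter, and the channel pair includes a right downmix channel as a first channel and a right original channel or a rear-right original channel as a second channel, or
wherein one of the first or second balance parameters or an additional balance parameter is a center balance parameter, and the channel pair includes the sum of the left and right downmix channels as a first channel and an original center channel as a second channel.

17. The apparatus of claim 16, wherein the parameter generator is operative to generate a left balance parameter as a first balance parameter, a right balance parameter as a second balance parameter, and a center balance parameter as a third balance parameter.

18. The apparatus of claim 17, wherein the parameter generator is operative to generate a left/left-surround balance parameter as a fourth balance parameter and a right/right-surround balance parameter as a fifth balance parameter.

19. The apparatus of claim 1, further comprising a parameter encoder for generating encoded versions of the balance parameters, the coherence parameters or the inter-channel time differences, the parameter encoder including a quantizer.

20. The apparatus of claim 1, wherein the parameter generator is operative to use only original channels or combinations of original channels, and not a base channel or a combination of base channels, as the channels in the channel pairs.

21. The apparatus of claim 1, wherein the parameter generator is operative to generate different parameter sets, each set including at least two parameters, wherein the channel pairs used for calculating the parameters in the different sets are different from each other,
wherein the parameter generator is further operative to select, for output, the one of the different sets that results in a lower bit rate given a certain parameter encoding scheme,
the apparatus further comprising a parameter encoder for encoding the selected set using the certain parameter encoding scheme; and
a parameter control information generator for generating control information that indicates a characteristic of the selected parameter scheme.

22. An apparatus for generating a reconstructed multi-channel representation of an original multi-channel signal having at least three original channels, using one or more base channels and using, between a first channel pair, a first balance parameter, a first coherence parameter or a first time difference parameter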
13138571313857 ?f年Ί月)丨日修 (more) is replacing page 94126934 "Device for generating parameter representation of multi-channel input signal and method for representing multi-channel audio signal" Patent case (revised in July 2008) X. Patent application scope: 1. A device for generating a parameter representation of a multi-channel input signal having at least three original channels, the device comprising: a parameter generator (203), Generating a first balance parameter, a first coherence parameter or a first time difference parameter between a first pair of channels, and generating a second balance parameter between the second pair of channels, a second coherent parameter or a second time parameter, the equalization parameter, the coherent parameter or the time parameter forming the parameter representation; wherein the first channel pair has two channels, the two channel systems being different from the second sound Two channels of the pair, and each of the two channel pairs is one of the original channels, one of the original channels is weighted or unweighted, and the next mixing channel , or at least two downstream mixes One of the weighted or unweighted combinations of the track, and the information of the first channel pair and the second channel pair included on the three original channels. 2. The device of claim 1, wherein the original channels comprise a left channel (Β), a right channel (D), and a center channel (C), and wherein the second balance parameter (r2) is a central balance parameter, and the second channel pair includes the center channel as a first channel, and a channel combination including the left channel and the right channel as a The second channel. 
1313857 77年^^日修 (more) is being replaced by the apparatus of claim 2, wherein the parameter generator is operated to calculate the central balance according to the following equation: r22C l(B + D) R2 is the central balance parameter, where C represents the center channel, where B represents a left-channel, where D represents a right channel, and wherein γ and a represent the downmix factor. 4. The device of any one of claims 1 to 3, wherein the first balance parameter (Π) is a left/right balance parameter, and wherein the first channel pair comprises a left-channel or a left down mixing channel as a first channel, and a channel one right channel or a right down mixing channel as a device of claim 4, wherein the parameter is generated The system operates to calculate the first equilibrium parameter according to the following equation: _L _ a^B + P1A + y1C + 82F a2D + P2E + y2C + S2F ^ η = B/D where r! is the first equilibrium parameter, where L A first downlink mixing channel, wherein R is a second downlink mixing channel, wherein B represents a left-channel 'where D is not a right-channel, where A represents a back-left sound The channel, where E is not a post-right channel 'where C denotes a center channel, preferably f is a low frequency enhanced channel' and wherein the alpha, beta, gamma and g are the downmixing factors. 6. The device of claim 1, wherein the original channels comprise a back-left channel (A) and a back-right channel (E), wherein the parameter generator is operated in a front /After the channel pair, a pre/post parameter is generated as a third balance parameter (r3) or as one of the first and second 2-113857 two balance parameters. Further, the front/rear channel pair has a combination of the channel including the rear-left channel and the rear-right channel as the first channel, and the other includes a left channel and A channel of the right channel is combined as a second channel. 
7_A device as claimed in claim 6 wherein the parameter generator operates to calculate the pre/post parameter (r3) according to the following equation: β\Α + Ε) ^ = a\B + D) + y22C U is the front/back balance parameter, wherein A is a back-left channel, where E is a back-right channel, where B represents a left channel, where D represents - right channel 'where c A center channel, and wherein α, 卩, and ^ represent the downmix parameters. 8. The device of claim 1, wherein the original multi-channel signal comprises a back-left channel and a back-right channel, wherein the parameter generator operates in a rear left/right channel A rear left/right balance parameter (^) is generated between the pair as an additional balance parameter or as the first or second balance parameter, and the rear left/right channel pair has the back-left channel to do A first channel and a rear_right channel are used as a second channel. 9. The device of claim 1, wherein the original multi-channel signal comprises a low frequency enhanced channel and a center channel, wherein the parameter generator is operative between a low frequency enhanced channel pair Generating a low frequency enhanced balance parameter as an additional balance parameter or sending a first or second balance parameter, the low frequency enhanced channel pair having the low frequency enhanced channel as a first channel, and the center The channel or the one-channel group port serves as a second channel. The channel combination includes the center channel and the left and right channels of the original channels. 
-3- 1313857 「 啪 啪 — - 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如 如B + D) + fi\A + E) + r22C where A corresponds to a back-left channel 'where e corresponds to a back-right channel, where B corresponds to a left channel, where d corresponds to a right a channel, where C corresponds to a center channel, where f corresponds to the low frequency enhanced channel 'where ct, β, γ, and δ are the downmixing factors, and wherein Γ5 is the low φ frequency enhanced balance parameter. 1 1. The apparatus of claim 1, further comprising a data stream generator for generating an adjustable data stream (1001 and 1002), the data stream generator operating to input the first or the first The second balance parameter is added to a lower adjustment layer and any other parameter is input to a higher adjustment layer. 1 2 _ The device of claim π, wherein the parameter generator operates to generate the first or One or more balances in addition to the second balance parameter' and Wherein the data stream generator is operative to input the one or more additional balance parameters into a single or a plurality of higher adjustment layers. 1 3 · The apparatus of claim 12, wherein the data stream is generated The apparatus operates to introduce each additional parameter into a dedicated adjustment layer. 1 4. The apparatus of claim 1, wherein the parameter generator is operative to generate a left/right balance parameter as the first balance The parameter, a central balance parameter is used as the second balance parameter, a front/rear balance parameter is used as a third balance parameter, a rear-left/right balance parameter is used as a fourth balance parameter, and a low frequency enhanced balance is used. 
The parameter is used as a fifth balance parameter, and wherein the data stream generator is operated to input the first and second balances -4- 1313857 to repair (more) replace the page j parameter into a lower adjustment layer and input The third to fourth balance parameter or the corresponding coherence parameter or the corresponding time difference is to the one or more higher adjustment layers. 1 5 . The device of claim 1, wherein the parameter generator is operated in a single At least one one-sided front/rear balance parameter is generated between the front/rear channel pairs (q3, q〇 as one of the first and second balance parameters or as an additional balance parameter, the one-sided front/rear The channel pair has a back-left channel as a first channel, and a left channel as a second channel, or a back-right channel as a first channel, and a right channel The device is the second channel. The device of claim 1, wherein one of the first and second balance parameters is a first left or right balance parameter, and the channel pair comprises a The left downmix channel is used as a first channel, and a left original channel or a back-left original channel is used as a second channel, or one of the first and second balance parameters a right balance parameter, and the channel pair includes a right downmix channel as a first channel' and a right original channel or a back-right original channel as a second channel, or One of the first or second balancing parameters or an additional balancing parameter is a central balancing parameter 'and the pair of channels includes the left and right downstream mixes The sum of the channels is used as a first channel, and an original center channel is used as a second channel. The device of claim 16 wherein the parameter generator operates to generate a left balance parameter as a first balance parameter, a right balance parameter as a second balance parameter, and a central balance parameter As a third balance parameter. 18. 
The apparatus of claim 17, wherein the parameter generator is operated by -5 1313857 i (revision) replacement page to generate a left/left surround balance parameter as a fourth balance parameter ;^, and ^ The right/right surround balance parameter is used as a fifth balance parameter. J. The device of claim 1, further comprising: a parameter encoder for generating the equalized parameter, the equivalent tone, or an encoded version of the time difference between the channels, the parameter encoder package? Tongue stomach {匕器. 2. The apparatus of claim 1, wherein the parameter generator is operative to use only a combination of original channels or original channels instead of a combination of a base channel and a base channel; The channel of the channel centering. 2 1. The apparatus of claim 1, wherein the parameter generator is operative to generate different sets of parameters, each set comprising at least two parameters, wherein the channels for calculating parameters in the different sets are The pairs are different from each other, wherein the parameter generator is further operative to select one of the different groups for output, which results in a lower bit rate given a certain parameter encoding architecture, the device further comprising a parameter An encoder for encoding the selection group using a parameter encoding architecture; and a parameter control information generator for generating control information, the control information table not selecting characteristics of the parameter architecture. 22. 
22. An apparatus for generating a reconstructed multi-channel representation of an original multi-channel signal having at least three original channels, using a plurality of base channels and using a first balance parameter, a first coherence parameter or a first time difference parameter between a first channel pair, the base channels being generated by converting the original multi-channel signal using a downmix scheme, and a second balance parameter, a second coherence parameter or a second time parameter generated between a second channel pair, the balance parameters, coherence parameters or time parameters forming the parameter representation, wherein the first channel pair has two channels that are different from the two channels of the second channel pair, wherein each channel of the two channel pairs is one of the original channels, a weighted or unweighted combination of the original channels, a downmix channel, or a weighted or unweighted combination of at least two downmix channels, and wherein the first channel pair and the second channel pair include information on the three original channels, the apparatus comprising:
an upmixer (305) for generating a number of upmix channels, the number of upmix channels being greater than the number of base channels and less than or equal to the number of original channels;
wherein the upmixer is operative to generate reconstructed channels based on information on the downmix scheme and using the balance parameters, the coherence parameters or the inter-channel time differences, so that a balance, a coherence or an inter-channel time difference between a first reconstructed channel pair is determined by the first balance parameter, the first inter-channel coherence parameter or the first inter-channel time difference, and so that a balance, an inter-channel coherence or an inter-channel level difference between a second channel pair is determined by the second balance parameter, the second inter-channel coherence parameter or the second inter-channel time difference parameter.

23. The apparatus of claim 22, wherein the original channels include a left channel (B), a right channel (D) and a center channel (C), and wherein the second balance parameter (r2) is a center balance parameter and the second channel pair includes the center channel as a first channel, and a channel combination including the left channel and the right channel as a second channel,
wherein the upmixer is operative to generate a reconstructed center channel based on the second balance parameter (r2).

24. The apparatus of claim 22, wherein the first balance parameter (r1) is a left/right balance parameter, and wherein the first channel pair includes a left channel or a left downmix channel as a first channel and a right channel or a right downmix channel as a second channel, and
wherein the upmixer is operative to generate a reconstructed left channel and a reconstructed right channel based on the first balance parameter (r1).

25. The apparatus of claim 22, wherein the original channels include a rear left channel (A) and a rear right channel (E), wherein the parameter representation includes a front/back parameter between a front/back channel pair as a third balance parameter (r3) or as one of the first and second balance parameters, the front/back channel pair having a channel combination including the rear left channel and the rear right channel as a first channel, and another channel combination including a left channel and a right channel as a second channel, and
wherein the upmixer is operative to generate a reconstructed combined rear channel using the front/back balance parameter (r3).

26. The apparatus of claim 22, wherein the original multi-channel signal includes a rear left channel and a rear right channel, the parameter representation including a rear left/right balance parameter (r4) between a rear left/right channel pair as an additional balance parameter or as the first or second balance parameter, the rear left/right channel pair having the rear left channel as a first channel and the rear right channel as a second channel, and
wherein the upmixer is operative to generate a reconstructed rear left channel and a reconstructed rear right channel based on the rear left/right balance parameter.

27. The apparatus of claim 22, wherein parameter information provided to the apparatus includes a left/right balance parameter as the first balance parameter, a center balance parameter as the second balance parameter, a front/back balance parameter as a third balance parameter, a rear left/right balance parameter as a fourth balance parameter, and a low frequency enhancement balance parameter as a fifth balance parameter, and wherein a data stream includes the first and second balance parameters in a lower scaling layer, and the third and fourth balance parameters, or corresponding coherence parameters, or corresponding time differences, in one or more higher scaling layers, and
wherein the upmixer is operative to use the first balance parameter and the second balance parameter to generate a left output channel, a right output channel, and an output channel including the center channel, or
wherein the upmixer is operative to additionally use the front/back balance parameter to additionally reconstruct the sum of the rear left channel and the rear right channel; or
wherein the upmixer is operative to additionally use the rear left/right balance parameter to reconstruct a rear left channel and a rear right channel.

28. The apparatus of claim 27, wherein the upmixer is operative to generate the reconstructed multi-channel signal so that the following equations are fulfilled:

$$F = \frac{1}{2\delta^2}\,\frac{r_5}{1 + r_5}\,2M$$

$$A = \frac{1}{\beta^2}\,\frac{r_4}{1 + r_4}\,\frac{r_3}{1 + r_3}\,\frac{1}{1 + r_5}\,2M$$

$$E = \frac{1}{\beta^2}\,\frac{1}{1 + r_4}\,\frac{r_3}{1 + r_3}\,\frac{1}{1 + r_5}\,2M$$

$$C = \frac{1}{2\gamma^2}\,\frac{r_2}{1 + r_2}\,\frac{1}{1 + r_3}\,\frac{1}{1 + r_5}\,2M$$

$$B = \frac{1}{\alpha^2}\left(2\,\frac{r_1}{1 + r_1}\,M - \beta^2 A - \gamma^2 C - \delta^2 F\right)$$

$$D = \frac{1}{\alpha^2}\left(2\,\frac{1}{1 + r_1}\,M - \beta^2 E - \gamma^2 C - \delta^2 F\right)$$

where F corresponds to a low frequency enhancement channel, A corresponds to a left surround channel, E corresponds to a right surround channel, C corresponds to a center channel, B corresponds to a left channel, D corresponds to a right channel, r1 is a left/right balance parameter, r2 is a center/left-right balance parameter, r3 is a front/back balance parameter, r4 is a rear left/right balance parameter, r5 is a center/low frequency enhancement balance parameter, and α, β, γ and δ are downmix factors.
29. The apparatus of claim 22,
wherein the number of base channels is greater than or equal to 2, and wherein the parameter representation includes at least one single-sided front/back balance parameter (q3, q4) between a single-sided front/back channel pair as one of the first and second balance parameters or as an additional balance parameter, the single-sided front/back channel pair having a rear left channel as a first channel and a left channel as a second channel, or a rear right channel as a first channel and a right channel as a second channel, and
wherein the upmixer is operative to generate a reconstructed rear left channel or a reconstructed rear right channel based on a left channel or a right channel and the corresponding single-sided front/back balance parameter.

30. The apparatus of claim 22, wherein one of the first and second balance parameters is a first left or right balance parameter, and the channel pair includes a left downmix channel as a first channel and a left original channel or a rear left original channel as a second channel, or
wherein one of the first and second balance parameters is a right balance parameter, and the channel pair includes a right downmix channel as a first channel and a right original channel or a rear right original channel as a second channel, or
wherein one of the first or second balance parameters, or an additional balance parameter, is a center balance parameter, and the channel pair includes the sum of the left and right downmix channels as a first channel and an original center channel as a second channel,
and wherein the upmixer is operative to generate the reconstructed channels using the parameters and the first base channel, the second base channel, or a combination of the first and second base channels.

31. The apparatus of claim 22, wherein the balance parameters are part of a scalable bit stream, the scalable bit stream having the first and second balance parameters in the lower scaling layer and at least one additional balance parameter in at least one higher scaling layer; and
the apparatus further comprising a data stream extractor for extracting the lower scaling layer and a number of higher scaling layers, the number of higher scaling layers being between 0 and a number smaller than the total number of scaling layers,
wherein the data stream extractor is operative to extract the number of higher scaling layers depending on an output channel configuration associated with the apparatus, the channel configuration having fewer channels than the channel configuration of the original multi-channel signal.

32. The apparatus of claim 22, further comprising:
a parameter scheme selector for controlling the upmixer so that the upmixer applies a parameter scheme indicated by parameter scheme control information.

33. A method of generating a parameter representation of a multi-channel input signal, the multi-channel input signal having at least three original channels, the method comprising:
generating (203) a first balance parameter, a first coherence parameter or a first time difference parameter between a first channel pair; and
generating a second balance parameter, a second coherence parameter or a second time parameter between a second channel pair, the balance parameters, coherence parameters or time parameters forming the parameter representation;
wherein the first channel pair has two channels that are different from the two channels of the second channel pair, and
wherein each channel of the two channel pairs is one of the original channels, a weighted or unweighted combination of the original channels, a downmix channel, or a weighted or unweighted combination of at least two downmix channels, and
wherein the first channel pair and the second channel pair include information on the three original channels.

34. A method of generating a reconstructed multi-channel representation of an original multi-channel signal, using a plurality of base channels and using a first balance parameter, a first coherence parameter or a first time difference parameter between a first channel pair, wherein the original multi-channel signal has at least three original channels, the base channels being generated by converting the original multi-channel signal using a downmix scheme, and a second balance parameter, a second coherence parameter or a second time parameter generated between a second channel pair, the balance parameters, coherence parameters or time parameters forming the parameter representation, wherein the first channel pair has two channels that are different from the two channels of the second channel pair, wherein each channel of the two channel pairs is one of the original channels, a weighted or unweighted combination of the original channels, a downmix channel, or a weighted or unweighted combination of at least two downmix channels, and wherein the first channel pair and the second channel pair include information on the three original channels, the method comprising:
generating (305) a number of upmix channels, the number of upmix channels being greater than the number of base channels and less than or equal to the number of original channels;
wherein the step of generating includes generating reconstructed channels based on information on the downmix scheme and using the balance parameters, the coherence parameters or the inter-channel time differences, so that a balance, a coherence or an inter-channel time difference between a first reconstructed channel pair is determined by the first balance parameter, the first inter-channel coherence parameter or the first inter-channel time difference, and so that a balance, an inter-channel coherence or an inter-channel level difference between a second channel pair is determined by the second balance parameter, the second inter-channel coherence parameter or the second inter-channel time difference parameter.

35. A computer-readable recording medium on which a computer program is recorded for carrying out, when executed on a computer, the method of claim 33 or 34.

36. A method of representing parameters of a multi-channel input signal, the multi-channel input signal having at least three original channels, the method comprising: defining a first balance parameter, a first coherence parameter or a first time difference parameter between a first channel pair, and defining a second balance parameter, a second coherence parameter or a second time parameter between a second channel pair, the balance parameters, coherence parameters or time parameters forming a parameter representation,
wherein the first channel pair has two channels that are different from the two channels of the second channel pair, and
wherein each channel of the two channel pairs is one of the original channels, a weighted or unweighted combination of the original channels, a downmix channel, or a weighted or unweighted combination of at least two downmix channels, and
wherein the first channel pair and the second channel pair include information on the three original channels.

37. The method of claim 36, for controlling a multi-channel reconstruction when input into the apparatus of claim 22.

XI. Drawings:

Fig. 1: [figure not reproduced]

Fig. 2: [block diagram not reproduced. Legible labels: ADC, analysis filter bank, surround encoder, audio encoder, surround data, surround data encoder, control signal, multiplexer, serial bit stream; reference numerals 201 to 209]
Surround data surround data 编码 Encoder ' / Dry J 207 Control signal ----- ► 204 Multi-serial serial bit stream 2 08 205 Figure 2 203 1313857 2/9 303 301〜 串列位元流 解多工器301~ tandem bit stream solution multiplexer 合 濾波器組 合成 濾波器組 合成 濾波器組 合成 爐波器組 合成 濾波器組 306 307 DAG DAC DAC DAC DAC k 合成 <濾波器組 DAC 第3圖 左 右 103-Combined Filter Bank Synthetic Filter Bank Synthetic Filter Bank Synthesis Furnace Filter Group Synthesis Filter Bank 306 307 DAG DAC DAC DAC DAC k Synthesis <Filter Bank DAC Figure 3 Left and Right 103- 104 102-104 102- 刖 D 上- A 後 101- 第4圖 105 『3 1313857 3/9 ,车月/日修(更)正替換頁刖 D on - A after 101 - Figure 4 105 『3 1313857 3/9 , Che Yue / Ri repair (more) is replacing page 第5圖Figure 5 1313857 斤年办/日修(更)正替換气 4/91313857 kg year/day repair (more) is replacing gas 4/9 平衡參數= n_ L _ 〇:2B+^2A+y2C+(32F 〇r r B 1 R a2D+^2E+y2C+(52p 1 D y22C 「2= a2(B+D)Balance parameter = n_ L _ 〇: 2B+^2A+y2C+(32F 〇r r B 1 R a2D+^2E+y2C+(52p 1 D y22C "2= a2(B+D) 「3=. 鄉+E) r4= a2(B+D)+y22C 炉八—A ~W = T =_d^2F_ 「5=a2(B+D)+iS2(A+E)+y22C 第6圖 1313857 P年《月/日修(更)正替換頁 5/9"3=.乡+E) r4= a2(B+D)+y22C 八八-A ~W = T =_d^2F_ "5=a2(B+D)+iS2(A+E)+y22C 6 Figure 1313857 P year "month / day repair (more) is being replaced page 5/9 5.1—►兩個基本聲道: ld(t)=ab(t)+與⑴+yc(t)+<5f(i) rd(t)=ad(〇 +卢e(i)+yc(i)十 3i(t) 5.1-個基本聲道: 叫⑴^·^ (丨d⑴+「d⑴ 所傳送單聲道之能量: (每一頻帶及每一區塊) M=y(a2(B+D)+j32(A+£)+2y2C+2<52F), 兩個基本聲道之平衡參數組: Μ 第7圖 700 3b cd 6 t 原始聲道 單一基本聲道m 下行混音器 (下行混音資訊) ^兩個基本聲道 1313857 #年《月/日修(更)正替換頁 6/9 m 基本聲道 - 上行混音器 + (使用下行混音架 構之資訊) - 參數資料流5.1—► Two basic channels: ld(t)=ab(t)+ and (1)+yc(t)+<5f(i) rd(t)=ad(〇+卢e(i)+yc( i) Ten 3i(t) 5.1-one basic channel: Called (1)^·^ (丨d(1)+"d(1) The energy of the mono channel transmitted: (per frequency band and each block) M=y(a2(B +D)+j32(A+£)+2y2C+2<52F), the balance parameter set of two basic channels: Μ Figure 7 700 3b cd 6 t Original channel single basic channel m Downstream mixer (downstream Mixing information) ^Two basic channels 1313857 #年"月/日修 (more) is replacing page 6/9 m Basic channel - Upstream 
mixer + (using the information of the downlink mixing architecture) - Parameter data stream 重建聲道Reconstruction channel 2M J__[&_ 2y2 1 +『5 Α=^τΓ-4---Γι---·1-··2Μ 〆 1+「41 十「31+「5 r 1 1 r3 1 ns, t— ϋ ~ 1 ~~; 2lvl p1 1+r4 1+Γ3 1+Γ5 r__1[2__1_____L_9m 2^1 + γ21+γ31+Γ52Μ B=i(2T^_A作叫 D=4^/2_1_ Μ-/52Ε-τ2〇-(52ρ a2\ l+r-j OR for「i=吾:2M J__[&_ 2y2 1 +『5 Α=^τΓ-4---Γι---·1-··2Μ 〆1+"41 十"31+"5 r 1 1 r3 1 ns, t- ϋ ~ 1 ~~; 2lvl p1 1+r4 1+Γ3 1+Γ5 r__1[2__1_____L_9m 2^1 + γ21+γ31+Γ52Μ B=i(2T^_A is called D=4^/2_1_ Μ-/52Ε-τ2 〇-(52ρ a2\ l+rj OR for "i=我: B= 1 -Ji-—!——1——1_2M 1 十「1 1 +「2 1 十「3 1 十「5 ——-——^——ί-2Μ αΖ 1 十 Γ| 1 十 Γ2 1 十 Γ3 "I 十 f5 第8圖 1313857 |^年#月/曰修(更)正替換頁 7/9 主下行混音(MD) 900 / 原始聲道 電平參數計算器 參數下行混音 傳送主下行混音及參數 ^ rm=·^ or "SB= 1 -Ji-—!——1——1_2M 1 Ten “1 1 +” 2 1 10 “3 1 10 “5 ——————^——ί-2Μ αΖ 1 十Γ | 1 十Γ2 1十Γ3 "I 十f5 8th figure 1313857 |^年#月/曰修(more) 正换页7/9 Main Downmix (MD) 900 / Original Channel Level Parameter Calculator Parameter Downmix Mixing Main downmix and parameters ^ rm=·^ or "S 第9a圖 ,rM 902 至上行混音器之已校正 基本聲道 1 二 ―浐 η -> 電平校正器Figure 9a, rM 902 to the upstream mixer corrected basic channel 1 2 浐 η -> level corrector 第9b圖 1313857Figure 9b 1313857 斤你4月/日修(更)正替換頁| 8/9 增強層 核心層'Jin you April / Japanese repair (more) is replacing page | 8/9 reinforcement layer core layer ' 第10a圖 所要擷取之參數 重建聲道胃 未使用/計算 1-^2 h"2 B' D/(B+D) r2-r5 A, C, E, F 1->3 IV2 B, D, (C+F) r3'r5 A, E, F 1 -^4 IW3 B, D, (C+F), (A+E) •V5 A, E,F 1->5 「1,「2丨 B, D, (C+F), A,E r5 F 1->5.1 M2,『3,「4,『5 B, D, C, A, E,F / 2^5.1 r2> r3> r4> Γ5 B, D, C, A, E,F 『1The parameters to be retrieved in Fig. 10a are reconstructed. 
Channel stomach is not used/calculated 1-^2 h"2 B' D/(B+D) r2-r5 A, C, E, F 1->3 IV2 B, D, (C+F) r3'r5 A, E, F 1 -^4 IW3 B, D, (C+F), (A+E) • V5 A, E, F 1->5 “1, "2丨B, D, (C+F), A, E r5 F 1-> 5.1 M2, "3, "4, "5 B, D, C, A, E, F / 2^5.1 r2>R3>r4> Γ5 B, D, C, A, E, F 『1 第10b圖 1313857 if年令月,日修(更)正替換頁 9/9Figure 10b 1313857 if the year of the month, the day of repair (more) is replacing page 9/9 x'=Hyx'=Hy
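The relations in Figs. 6 to 8 can be illustrated with a short sketch. It is not taken from the patent: the function names, the use of Python, and the numeric values are assumptions made for illustration only. It assumes the r1 = L/R variant of the first balance parameter, strictly positive channel energies A to F, and M = ½(L + R) as the transmitted energy.

```python
def balance_parameters(A, B, C, D, E, F, alpha, beta, gamma, delta):
    """Return ((r1..r5), M) from channel energies A..F (Figs. 6 and 7).

    Hypothetical helper: energies are per frequency band and per block and
    are assumed to be strictly positive so that no division by zero occurs.
    """
    a2, b2, g2, d2 = alpha**2, beta**2, gamma**2, delta**2
    L = a2 * B + b2 * A + g2 * C + d2 * F   # energy of left base channel
    R = a2 * D + b2 * E + g2 * C + d2 * F   # energy of right base channel
    r1 = L / R
    r2 = 2 * g2 * C / (a2 * (B + D))
    r3 = b2 * (A + E) / (a2 * (B + D) + 2 * g2 * C)
    r4 = A / E
    r5 = 2 * d2 * F / (a2 * (B + D) + b2 * (A + E) + 2 * g2 * C)
    M = 0.5 * (L + R)                       # transmitted energy
    return (r1, r2, r3, r4, r5), M


def reconstruct_energies(r, M, alpha, beta, gamma, delta):
    """Invert balance_parameters: recover A..F from r1..r5 and M (Fig. 8)."""
    r1, r2, r3, r4, r5 = r
    a2, b2, g2, d2 = alpha**2, beta**2, gamma**2, delta**2
    total = 2 * M                           # equals L + R
    F = r5 / (1 + r5) * total / (2 * d2)
    rest = total / (1 + r5)                 # a2(B+D) + b2(A+E) + 2 g2 C
    surround = r3 / (1 + r3) * rest         # b2(A+E)
    A = r4 / (1 + r4) * surround / b2
    E = 1 / (1 + r4) * surround / b2
    front_center = rest / (1 + r3)          # a2(B+D) + 2 g2 C
    C = r2 / (1 + r2) * front_center / (2 * g2)
    B = (r1 / (1 + r1) * total - b2 * A - g2 * C - d2 * F) / a2
    D = (1 / (1 + r1) * total - b2 * E - g2 * C - d2 * F) / a2
    return A, B, C, D, E, F


if __name__ == "__main__":
    # Illustrative energies and downmix weights, not values from the patent.
    energies = (1.0, 4.0, 2.0, 3.0, 0.5, 0.25)
    weights = (1.0, 0.7071, 0.7071, 0.5)
    r, M = balance_parameters(*energies, *weights)
    print(reconstruct_energies(r, M, *weights))
```

Because each parameter is a ratio between two groups of channels, a decoder that needs fewer output channels can stop after the parameters it requires, which is the scalability that the claims describe.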
TW094126934A 2005-04-12 2005-08-09 Apparatus for generating a parameter representation of a multi-channel signal and method for representing multi-channel audio signals TWI313857B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2005/003849 WO2005101371A1 (en) 2004-04-16 2005-04-12 Method for representing multi-channel audio signals

Publications (2)

Publication Number Publication Date
TW200636676A TW200636676A (en) 2006-10-16
TWI313857B TWI313857B (en) 2009-08-21

Family

ID=45092056

Family Applications (1)

Application Number Title Priority Date Filing Date
TW094126934A TWI313857B (en) 2005-04-12 2005-08-09 Apparatus for generating a parameter representation of a multi-channel signal and method for representing multi-channel audio signals

Country Status (1)

Country Link
TW (1) TWI313857B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI468031B (en) * 2011-05-13 2015-01-01 Fraunhofer Ges Forschung Apparatus and method and computer program for generating a stereo output signal for providing additional output channels
US9014378B2 (en) 2008-09-03 2015-04-21 Dolby Laboratories Licensing Corporation Enhancing the reproduction of multiple audio channels
TWI688280B (en) * 2018-09-06 2020-03-11 宏碁股份有限公司 Sound effect controlling method and sound outputting device with orthogonal base correction
US11838743B2 (en) 2018-12-07 2023-12-05 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus, method and computer program for encoding, decoding, scene processing and other procedures related to DirAC based spatial audio coding using diffuse compensation

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
ES2452348T3 (en) 2007-04-26 2014-04-01 Dolby International Ab Apparatus and procedure for synthesizing an output signal
US8515106B2 (en) 2007-11-28 2013-08-20 Qualcomm Incorporated Methods and apparatus for providing an interface to a processing engine that utilizes intelligent audio mixing techniques
US8660280B2 (en) 2007-11-28 2014-02-25 Qualcomm Incorporated Methods and apparatus for providing a distinct perceptual location for an audio source within an audio mixture
TWI404050B (en) * 2009-06-08 2013-08-01 Mstar Semiconductor Inc Multi-channel audio signal decoding method and device
ITTO20120067A1 (en) * 2012-01-26 2013-07-27 Inst Rundfunktechnik Gmbh METHOD AND APPARATUS FOR CONVERSION OF A MULTI-CHANNEL AUDIO SIGNAL INTO TWO-CHANNEL AUDIO SIGNAL.
TWI587286B (en) 2014-10-31 2017-06-11 杜比國際公司 Method and system for decoding and encoding of audio signals, computer program product, and computer-readable medium
CN110881157B (en) * 2018-09-06 2021-08-10 宏碁股份有限公司 Sound effect control method and sound effect output device for orthogonal base correction
CN112614505A (en) * 2020-11-27 2021-04-06 江苏爱谛科技研究院有限公司 Parallel ultra-fast EMD signal processing system

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9014378B2 (en) 2008-09-03 2015-04-21 Dolby Laboratories Licensing Corporation Enhancing the reproduction of multiple audio channels
US9706308B2 (en) 2008-09-03 2017-07-11 Dolby Laboratories Licensing Corporation Enhancing the reproduction of multiple audio channels
US10356528B2 (en) 2008-09-03 2019-07-16 Dolby Laboratories Licensing Corporation Enhancing the reproduction of multiple audio channels
TWI468031B (en) * 2011-05-13 2015-01-01 Fraunhofer Ges Forschung Apparatus and method and computer program for generating a stereo output signal for providing additional output channels
US9913036B2 (en) 2011-05-13 2018-03-06 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Apparatus and method and computer program for generating a stereo output signal for providing additional output channels
TWI688280B (en) * 2018-09-06 2020-03-11 宏碁股份有限公司 Sound effect controlling method and sound outputting device with orthogonal base correction
US10735883B2 (en) 2018-09-06 2020-08-04 Acer Incorporated Sound effect controlling method and sound outputting device with orthogonal base correction
US11838743B2 (en) 2018-12-07 2023-12-05 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus, method and computer program for encoding, decoding, scene processing and other procedures related to DirAC based spatial audio coding using diffuse compensation
US11856389B2 (en) 2018-12-07 2023-12-26 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Apparatus, method and computer program for encoding, decoding, scene processing and other procedures related to DirAC based spatial audio coding using direct component compensation
US11937075B2 (en) 2018-12-07 2024-03-19 Fraunhofer-Gesellschaft Zur Förderung Der Angewand Forschung E.V Apparatus, method and computer program for encoding, decoding, scene processing and other procedures related to DirAC based spatial audio coding using low-order, mid-order and high-order components generators

Also Published As

Publication number Publication date
TW200636676A (en) 2006-10-16

Similar Documents

Publication Publication Date Title
TWI313857B (en) Apparatus for generating a parameter representation of a multi-channel signal and method for representing multi-channel audio signals
TWI334736B (en) Apparatus and method for generating a level parameter, apparatus and method for generating a multi-channel representation and a storage media stored parameter representation
TWI458365B (en) Apparatus and method for generating a level parameter, apparatus and method for generating a multi-channel representation and a storage media stored parameter representation