TWI470622B - Reduced complexity transform for a low-frequency-effects channel

Publication number
TWI470622B
Authority
TW
Taiwan
Prior art keywords
conversion
real
calculation
audio signal
coefficients
Prior art date
Application number
TW101135308A
Other languages
Chinese (zh)
Other versions
TW201340095A (en)
Inventor
Matthew C Fellers
Original Assignee
Dolby Lab Licensing Corp
Priority date
Filing date
Publication date
Priority claimed from PCT/US2012/029603 (WO2012134851A1)
Application filed by Dolby Lab Licensing Corp
Publication of TW201340095A
Application granted
Publication of TWI470622B

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Complex Calculations (AREA)
  • Stereophonic System (AREA)

Description

Reduced complexity transform for a low-frequency-effects channel

The present invention pertains generally to digital signal processing and is directed more particularly toward methods and apparatuses that may be used to apply filterbanks to limited-bandwidth audio channels, such as so-called low-frequency-effects (LFE) channels, using fewer computational resources.

Several international, regional and national standards have been developed to define methods and systems that may be used to implement multichannel audio coding systems. Three examples of such standards are ISO/IEC 13818-7, Advanced Audio Coding (AAC), also known as "MPEG-2 AAC," and ISO/IEC 14496-3, subpart 4, also known as "MPEG-4 audio," both published by the International Standards Organization (ISO), and a standard published by the United States Advanced Television Systems Committee (ATSC), Inc. in Document A/52B entitled "Digital Audio Compression Standard (AC-3, E-AC-3)," Revision B, published June 14, 2005, also known as "Dolby Digital" or "AC-3."

Audio systems that conform to standards like those mentioned above generally include a transmitter and a receiver. The transmitter applies an analysis filterbank to each of several channels of input audio signals, processes the output of the analysis filterbanks into encoded signals, and transmits or records the encoded signals. The receiver receives the encoded signals, decodes them, and applies synthesis filterbanks to the decoded signals to generate channels of output audio signals that are a replica of the original input audio signals. Many of the standards specify implementing the analysis and synthesis filterbanks by a Modified Discrete Cosine Transform (MDCT) and an Inverse Modified Discrete Cosine Transform (IMDCT), which are described in Princen, Johnson and Bradley, "Subband/Transform Coding Using Filter Bank Designs Based on Time Domain Aliasing Cancellation," ICASSP 1987 Conf. Proc., May 1987, pp. 2161-64.

Filterbanks implemented by these particular transforms have many attractive properties, but significant processing or computational resources are required to perform the necessary calculations. Techniques are known that can be used to perform the transforms more efficiently, thereby reducing the amount of computational resources that are needed. One characteristic common to these techniques is that their computational complexity varies with the so-called length of the transform. Techniques are also known that can realize further reductions in computational complexity by using shorter transform lengths to process audio channels with narrower bandwidths.

Standards like those mentioned above define sequences of digital data, or digital bit streams, that carry data representing encoded representations of one or more audio channels. One configuration of channels, sometimes referred to as "5.1 channels," includes five full-bandwidth channels denoted left (L), right (R), center (C), left-surround (LS) and right-surround (RS), and one limited-bandwidth channel or low-frequency-effects (LFE) channel. The full-bandwidth channels typically have a bandwidth of about 20 kHz, and the limited-bandwidth LFE channel typically has a bandwidth of about 100 to 200 Hz. Because the bandwidth of the LFE channel is narrower, known techniques can be used to perform a filterbank transform for the LFE channel more efficiently than can be done for one of the full-bandwidth channels.

Nevertheless, there is a need to develop techniques that further improve the efficiency of the transform filterbanks that are applied to limited-bandwidth channels such as the LFE channel.

It is an object of the present invention to provide ways to perform transforms that implement filterbanks for limited-bandwidth channel signals more efficiently than is possible using known techniques.

According to one aspect of the present invention, a limited-bandwidth signal is processed by receiving a block of K real-valued transform coefficients, of which only a number L of the real-valued transform coefficients represent spectral components of a limited-bandwidth audio signal, where ½L < M < K and M is a power of two; applying a first transform of length R to a block of complex-valued coefficients derived from M complex-valued transform coefficients that include the L real-valued transform coefficients representing spectral components of the limited-bandwidth audio signal, where R = M/P and P is a power of two; applying a bank of Q second transforms of length P to the outputs of the first transform; and deriving a sequence of N real-valued signal samples from the outputs of the bank of second transforms, where N = 2·K and the real-valued signal samples represent temporal components of the limited-bandwidth audio signal.

The various features of the present invention and its preferred embodiments may be better understood by referring to the following discussion and the accompanying drawings, in which like reference numerals refer to like elements in the several figures. The contents of the following discussion and the drawings are set forth as examples only and should not be understood to represent limitations upon the scope of the present invention.

A. Introduction

Fig. 1 is a schematic illustration of a two-channel audio coding system that includes a transmitter 100 and a receiver 200. The transmitter 100 receives two channels of input audio signals from the paths 11, 12. Analysis filterbanks 111, 112 are applied to the input audio channels to obtain a first set of frequency-subband signals that represent the spectral content of the input audio signals. These analysis filterbanks are implemented by time-domain to frequency-domain transforms. The encoder 120 applies an encoding process to the first set of frequency-subband signals to generate encoded information, which is passed along the path 20. The receiver 200 receives the encoded information from the path 20. The decoder 220 applies a decoding process to the encoded information to obtain a second set of frequency-subband signals. Synthesis filterbanks 231, 232 are applied to the second set of frequency-subband signals to generate two or more channels of output audio signals, which are passed along the paths 31, 32. These synthesis filterbanks are implemented by frequency-domain to time-domain transforms. The path 20 may be a broadcast medium, a point-to-point communication medium, a recording medium or any other medium that can convey or record the encoded information.

The encoder 120 and the decoder 220 are not essential to practice the present invention. If they are used, they may perform either lossless or lossy coding processes. The present invention is not limited to any particular encoding and decoding processes.

Only two channels of input and output audio signals are shown in the drawings for illustrative clarity. In many implementations there are more than two channels of input audio signals and more than two channels of output audio signals. At least one of the output audio signals has a bandwidth that is much narrower than the bandwidth of one or more of the other output audio signals.

The present invention is directed toward reducing the computational resources needed to perform the transform that implements the synthesis filterbank 231 or 232 in the receiver 200 used to generate narrower-bandwidth output audio signals. The present invention can implement a more efficient synthesis filterbank in a receiver 200 that maintains compatibility with the analysis filterbank in existing transmitters 100.

The present invention may also be used to reduce the computational resources needed to perform the transform that implements the analysis filterbank 111 or 112 applied to narrower-bandwidth input audio signals in the transmitter 100. Such an implementation can maintain compatibility with the synthesis filterbank in existing receivers 200.

B. Implementation Techniques

Synthesis filterbanks may be implemented by a wide variety of frequency-domain to time-domain transforms, including many variations of the Inverse Discrete Cosine Transform (IDCT) and the Inverse Modified Discrete Cosine Transform (IMDCT) mentioned above. The algorithms that define these transforms in a direct manner are referred to herein as "direct transforms."

A technique referred to herein as a "folding technique" may be used to perform these direct transforms more efficiently. The folding technique comprises three stages, as illustrated in Fig. 2. The second stage 402 performs a transform that has a shorter length than the direct transform that the folding technique implements. The transform performed in the second stage 402 is referred to as a "folded transform" so that the following description can more easily distinguish it from the direct transform.

The pre-processor stage 401 combines the transform coefficients in a block of K real-valued frequency-domain transform coefficients into a block of ½·K complex-valued transform coefficients. The transform stage 402 applies a frequency-domain to time-domain folded transform of length ½·K to the block of complex-valued transform coefficients to generate ½·K complex-valued time-domain signal samples. The post-processor stage 403 derives a sequence of K real-valued time-domain signal samples from the ½·K complex-valued time-domain signal samples. Except for any errors that may arise from finite-precision arithmetic operations, the K time-domain signal samples obtained by this technique are identical to the K time-domain signal samples that could be obtained by applying a direct transform of length K to the block of K real-valued frequency-domain transform coefficients. This technique improves efficiency because the additional computational resources required to perform the direct transform, as opposed to the folded transform in the stage 402, are greater than the computational resources required to implement the processes performed in the pre-processor stage 401 and the post-processor stage 403.

If a block of transform coefficients represents a narrow-bandwidth signal in which a significant number of the transform coefficients are zero, an additional transform-decomposition technique may be used to increase the processing efficiency of the folded transform that is performed in the stage 402.

These techniques are discussed in the following sections.

1. Direct Transform

The direct IMDCT is shown in expression 2. Its complementary Modified Discrete Cosine Transform (MDCT) is shown in expression 1.

In expressions 1 and 2, X(k) = real-valued frequency-domain transform coefficient k; K = total number of real-valued frequency-domain transform coefficients; x(n) = real-valued time-domain signal sample n; and N = length of the time-domain window of samples, where N = 2K.
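For orientation only, the transform pair is often written in the following textbook form, using the symbols defined above. This is a sketch under assumed conventions: the phase term N/4, the scale factor 2/K, and the symbol \hat{x}(n) for the IMDCT output before windowing and overlap-add are assumptions, and individual standards differ in sign and normalization. It is not offered as a reproduction of expressions 1 and 2.

```latex
\begin{align*}
X(k) &= \sum_{n=0}^{N-1} x(n)\,
  \cos\!\left[\frac{2\pi}{N}\left(n+\tfrac{1}{2}+\tfrac{N}{4}\right)\left(k+\tfrac{1}{2}\right)\right],
  & 0 \le k < K, \\
\hat{x}(n) &= \frac{2}{K}\sum_{k=0}^{K-1} X(k)\,
  \cos\!\left[\frac{2\pi}{N}\left(n+\tfrac{1}{2}+\tfrac{N}{4}\right)\left(k+\tfrac{1}{2}\right)\right],
  & 0 \le n < N.
\end{align*}
```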

Proper operation of these direct transforms requires the use of analysis window functions and synthesis window functions whose lengths and shapes meet certain requirements that are well known in the art. The analysis window function is applied to segments of N input audio signal samples prior to application of the MDCT. The synthesis window function is applied to segments of N samples obtained from application of the IMDCT to blocks of K transform coefficients, and these windowed segments of samples are overlapped with and added to windowed segments of samples obtained from other blocks of transform coefficients. Additional details may be obtained from the Princen et al. paper cited above. The following paragraphs omit further discussion of the analysis window function.

2. Folding Technique

The processing performed in the pre-processor stage 401 may be expressed as: […] where X'(k) = complex-valued frequency-domain transform coefficient k; and j = imaginary operator equal to √−1.

The folded transform performed in the transform stage 402 may be expressed as: […] where x'(n) = complex-valued time-domain signal sample n.

The processing performed in the post-processor stage 403 may be expressed as: […] where y(n) = intermediate sample value used in subsequent window calculations; Re[x'(n)] = real part of the complex value x'(n); and Im[x'(n)] = imaginary part of the complex value x'(n).

3. Synthesis Window Function for the IMDCT

Proper operation of the IMDCT includes applying a properly designed synthesis window function to the time-domain samples generated by the transform. The time-domain signal samples obtained from this windowing operation may be expressed as: […] where h(n) = point n in the synthesis window function; and y'(n) = windowed intermediate sample n.

The windowed intermediate samples y' obtained from expression 6 are the intermediate time-domain samples that could have been obtained by a direct application of the IMDCT to a block of frequency-domain transform coefficients X followed by application of the synthesis window function h. As explained in the Princen paper cited above, the output time-domain signal samples are obtained by overlapping and adding the windowed intermediate samples derived from the "current" block of transform coefficients with a set of "previous" windowed intermediate samples derived from the previous block of transform coefficients. This overlap-add process may be expressed as:

x(n) = y'(n) + y'prev(n) (7)

where y'prev(n) = previous windowed intermediate samples.
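The windowing and overlap-add steps can be sketched with direct (unfolded) transforms, which makes the time-domain aliasing cancellation easy to check numerically. This is a minimal sketch under stated assumptions and not the method of the patent: the cosine kernel is the common textbook MDCT form, the window is a sine window chosen because it satisfies the usual overlap condition, the windowing step is assumed to be a sample-by-sample product of h(n) and y(n), and all function names are hypothetical.

```python
import math


def sine_window(N):
    """Sine window (assumed); satisfies w(n)^2 + w(n + N/2)^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / N) for n in range(N)]


def _kernel(n, k, N):
    # Textbook MDCT basis function (an assumed convention).
    return math.cos(2.0 * math.pi / N * (n + 0.5 + N / 4.0) * (k + 0.5))


def mdct(segment):
    """Direct MDCT: N windowed samples -> K = N/2 coefficients."""
    N = len(segment)
    return [sum(segment[n] * _kernel(n, k, N) for n in range(N))
            for k in range(N // 2)]


def imdct(coeffs):
    """Direct IMDCT: K coefficients -> N = 2K intermediate samples y(n)."""
    K = len(coeffs)
    N = 2 * K
    return [(2.0 / K) * sum(coeffs[k] * _kernel(n, k, N) for k in range(K))
            for n in range(N)]


def analyze(signal, K, h):
    """Window overlapping segments of N = 2K samples (hop K) and transform."""
    N = 2 * K
    blocks = []
    for start in range(0, len(signal) - N + 1, K):
        blocks.append(mdct([h[n] * signal[start + n] for n in range(N)]))
    return blocks


def overlap_add(blocks, h):
    """Window each IMDCT output and add the tail of the previous block."""
    K = len(blocks[0])
    prev_tail = [0.0] * K              # y'prev(n); zero before the first block
    out = []
    for X in blocks:
        y = imdct(X)
        yw = [h[n] * y[n] for n in range(2 * K)]             # y'(n) = h(n) * y(n)
        out.extend(yw[n] + prev_tail[n] for n in range(K))   # expression 7
        prev_tail = yw[K:]
    return out


def demo(K=8):
    """Return the largest reconstruction error over the fully overlapped span."""
    h = sine_window(2 * K)
    signal = [math.sin(0.3 * i) + 0.05 * i for i in range(4 * K)]
    out = overlap_add(analyze(signal, K, h), h)
    # The first K output samples lack a previous block and are not reconstructed.
    return max(abs(out[i] - signal[i]) for i in range(K, 3 * K))


if __name__ == "__main__":
    print("max reconstruction error:", demo())
```

Only the samples covered by two overlapping blocks are recovered, which is why the check skips the first K output samples.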

4. Transform Decomposition Technique

A transform-decomposition technique may be used to derive a more efficient method for performing the folded transform for limited-bandwidth signals, in which some of the transform coefficients in a block of frequency-domain transform coefficients are known to be equal to zero. This decomposition technique consists of expressing the folded transform as an equivalent two-dimensional transform and decomposing this two-dimensional transform into a single-dimensional vertical transform followed by a bank of single-dimensional horizontal complex Inverse Discrete Fourier Transforms (IDFT). The vertical transform has a length equal to Q, and the bank of horizontal complex IDFT comprises Q transforms each having a length equal to P, where P and Q are integers and the product of P and Q equals the length of the folded transform.

Referring to the preceding discussion of the folding technique, it may be seen that the length of the folded transform is J = ¼·N = ½·K; therefore, P·Q = J. The values for P, Q and J are constrained to be powers of two.

The horizontal IDFT and the vertical transform are shown in expressions 8 and 9, respectively: […]

The transform kernel W_{N/4} in the vertical transform may be calculated using Euler's formula: […]

Because the direct transform coefficients X(k) represent an audio signal in an LFE channel that has limited bandwidth, only L of these coefficients can have values other than zero, where L is much smaller than K. As a result, no more than roughly ½·L of the complex-valued frequency-domain transform coefficients X'(k) obtained from the pre-processor stage 401 can have values other than zero, and the length of the vertical transform can be reduced. A value M is chosen such that it is the smallest power of two equal to or greater than this number, and the folding process is modified to derive M complex-valued frequency-domain transform coefficients X'(k), which include the L real-valued direct transform coefficients that can have non-zero values. These M complex-valued frequency-domain transform coefficients X'(k) are to be processed by the transform stage 402. The size R of the vertical transform is chosen such that R = M/P. The transform coefficients X'(P·r+p) are zero for P·r+p ≥ 2R or, alternatively, r ≥ R. By taking these considerations into account, expression 9 can be written as: […]
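The decomposition and the shortened vertical transform can be sketched numerically. The example below is a minimal sketch, assuming an unnormalized inverse DFT with a positive-exponent kernel exp(j·2π/J); the kernel sign, the scaling and all function names are assumptions rather than the notation of expressions 8, 9 and 11. It splits the input index as k = P·r + p and the output index as n + Q·m, limits the vertical sum to r < R, and compares the result with a direct inverse DFT for an input whose coefficients are zero from index M = P·R onward.

```python
import cmath


def idft_direct(X):
    """Unnormalized length-J inverse DFT, computed term by term."""
    J = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / J) for k in range(J))
            for n in range(J)]


def idft_decomposed(X, P, R=None):
    """Vertical transform over r followed by Q horizontal length-P IDFTs.

    R limits the vertical sum; R = Q reproduces the full transform, and
    R = M / P is sufficient when X[k] is zero for every k >= M.
    """
    J = len(X)
    Q = J // P
    if R is None:
        R = Q
    x = [0j] * J
    for n in range(Q):
        # Vertical transform: U(n, p) = sum over r of X(P*r + p) * W^((P*r + p)*n)
        U = []
        for p in range(P):
            acc = 0j
            for r in range(R):
                k = P * r + p
                acc += X[k] * cmath.exp(2j * cmath.pi * k * n / J)
            U.append(acc)
        # Horizontal length-P IDFT produces output samples n + Q*m
        for m in range(P):
            acc = 0j
            for p in range(P):
                acc += U[p] * cmath.exp(2j * cmath.pi * p * m / P)
            x[n + Q * m] = acc
    return x


def max_abs_diff(a, b):
    return max(abs(u - v) for u, v in zip(a, b))


def demo(J=16, M=4, P=2):
    """Compare both methods for an input that is zero from index M onward."""
    X = [complex(k + 1, -k) if k < M else 0j for k in range(J)]
    return max_abs_diff(idft_direct(X), idft_decomposed(X, P, R=M // P))


if __name__ == "__main__":
    print("max difference:", demo())
```

The saving comes from the inner loop over r, which runs R times instead of Q times for each pair (n, p).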

5. Integration of the Pre-Processor and the Vertical Transform

The efficiency of the folding technique combined with the transform-decomposition technique described above can be improved further by integrating the pre-processor stage 401 and the vertical transform shown in expression 9 into one process. This is illustrated schematically in Fig. 3.

The length R of the vertical transform may be chosen equal to the value M or to M divided by a power of two. In embodiments that conform to the AC-3 standard mentioned above, the number of real-valued frequency-domain transform coefficients ½·N is equal to 256, and the spectral content of the audio signal in the LFE channel can be represented by seven real-valued transform coefficients X(k), where 0 ≤ k < 7. The pre-processor stage 401 folds these seven real-valued transform coefficients into four complex-valued transform coefficients, which are subsequently processed by a folded transform of length J = ¼·N = 128. As a result, given the four complex-valued transform coefficients in this embodiment, M is equal to four, and R can be set equal to 4, 2 or 1 by setting P equal to 1, 2 or 4, respectively. Because P·Q = J, the horizontal transform length Q is equal to 128, 64 and 32 when P is equal to 1, 2 and 4, respectively. Little or no gain in efficiency is achieved when P = 1.
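The arithmetic in this embodiment can be worked through in a few lines. This is a sketch only: the function name is hypothetical, and taking M as the smallest power of two not less than the rounded-up value of L/2 is an assumption that is consistent with the seven-into-four folding described above.

```python
def lfe_parameters(N, L):
    """Return K, J, M and the (P, R, Q) choices for window length N.

    Assumes M is the smallest power of two >= ceil(L / 2), which matches
    the example of 7 real coefficients folded into 4 complex ones.
    """
    K = N // 2                  # number of real-valued coefficients
    J = N // 4                  # length of the folded transform
    needed = (L + 1) // 2       # ceil(L / 2)
    M = 1
    while M < needed:
        M *= 2
    choices = []
    P = 1
    while P <= M:
        choices.append((P, M // P, J // P))   # (P, R = M/P, Q = J/P)
        P *= 2
    return K, J, M, choices


if __name__ == "__main__":
    K, J, M, choices = lfe_parameters(512, 7)
    print("K =", K, "J =", J, "M =", M)
    for P, R, Q in choices:
        print("P =", P, "R =", R, "Q =", Q)
```

For N = 512 and L = 7 this yields K = 256, J = 128, M = 4 and the three choices (1, 4, 128), (2, 2, 64) and (4, 1, 32), matching the values in the text.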

When P is set equal to two, the values obtained from the output of the vertical transform need not be reordered by bit reversal of their indices in order to compute the known small number of coefficients in each horizontal transform. The need for bit reversal of transform indices in the Cooley-Tukey Fast Fourier Transform algorithm is well known. When P is set equal to two, however, bit reversal is not needed because bit reversal for a complex DFT of length two yields the same coefficient indexing that is achieved by performing no bit reversal. This computational advantage is offset by having a larger number of horizontal transforms to perform. The values for P and Q may be chosen in response to various design considerations, such as processing limitations in the hardware chosen to implement the process.
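The remark about bit reversal can be checked directly. The helper below is a hypothetical illustration, not taken from the source: it lists the bit-reversed index order for a power-of-two transform length, and for a length of two the order is unchanged.

```python
def bit_reversed_indices(length):
    """Bit-reversed index order for a power-of-two transform length."""
    bits = length.bit_length() - 1       # log2(length) for a power of two
    order = []
    for i in range(length):
        rev = 0
        for b in range(bits):
            if i & (1 << b):
                rev |= 1 << (bits - 1 - b)
        order.append(rev)
    return order


if __name__ == "__main__":
    for length in (2, 4, 8):
        print(length, bit_reversed_indices(length))
```

A length of two gives [0, 1], which is the identity ordering, while a length of four gives [0, 2, 1, 3] and therefore requires a reordering step.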

The integration of the process shown in expression 3 with the vertical transform shown in expression 9 can be derived by making substitutions for X'(k) and (W_{N/4})^{(P·r+p)·n} in expression 9 according to expressions 3 and 10, respectively. These substitutions yield the following kernel function for the vertical transform: […]

The cross-product of the sine and cosine terms in expression 12 can be rewritten as: […] where s = P·q + p.

It may be seen that […], which we denote as I(s,n) to simplify the following expressions. Using this notation, expression 11 can be rewritten as: […] where v = P·r + p.

Performing the complex multiplication, we obtain: […]

The computational complexity of the function U(n,p) can be reduced further by taking advantage of the fact that the frequency-domain coefficients X(v) can be non-zero only for 0 ≤ v < 2R. This reduction is reflected in the following expression, which also splits the function into real and imaginary component functions U_R(n,p) and U_I(n,p), where U(n,p) = U_R(n,p) + j·U_I(n,p): […]

The integration of the pre-processor stage 401 and the vertical transform is illustrated schematically in Fig. 4.

The computational resources needed to implement the function U(n,p) or its component functions U_R(n,p) and U_I(n,p) can be reduced by pre-computing the functions sin(I(v,n)), cos(I(v,n)), sin(I(u,n)) and cos(I(u,n)) for all values of v, u and n. Storing the computed results in a lookup table requires 4·P·R·Q table entries, where the factor of four accounts for all combinations of sine, cosine, v and u in expression 17.

The table size can be reduced further by 12.5% by recognizing that […] for all n. As a result, the number of table entries needed for all factors of X in expression 17 is on the order of 3.5·P·R·Q.
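As a worked example of these table sizes, the short sketch below evaluates 4·P·R·Q and the 12.5% reduction for one of the AC-3 parameter choices given earlier (P = 2, R = 2, Q = 64). The function name is hypothetical and the choice of parameters is only illustrative.

```python
def table_entries(P, R, Q):
    """Return (full, reduced) lookup-table sizes in entries."""
    full = 4 * P * R * Q           # sine and cosine entries for both v and u
    reduced = full - full // 8     # 12.5% fewer entries, i.e. 3.5 * P * R * Q
    return full, reduced


if __name__ == "__main__":
    full, reduced = table_entries(2, 2, 64)
    print("full table:", full, "entries; reduced table:", reduced, "entries")
```

With these values the full table holds 1024 entries and the reduced table holds 896 entries.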

If the size of these tables is larger than desired, their size can be reduced by taking advantage of the fact that many of the table entries for I(v,n) have duplicate values because of the periodicity of the sine and cosine basis functions. This reduction in size is achieved in exchange for the additional processing resources needed to look up entries, because a more elaborate indexing scheme is needed to access the data in the table.

Other techniques may also be used to reduce the table size requirements. For example, if sine and cosine tables already exist in a particular implementation, then only I(v,n) and I(u,n) are needed, which reduces the number of table entries by a factor of two.

C. Implementation

Devices that incorporate various aspects of the present invention may be implemented in a variety of ways, including software for execution by a computer or some other device that includes more specialized components such as digital signal processor (DSP) circuitry coupled to components similar to those found in a general-purpose computer. Fig. 5 is a schematic block diagram of a device 70 that may be used to implement aspects of the present invention. The processor 72 provides computing resources. RAM 73 is system random access memory (RAM) used by the processor 72 for processing. ROM 74 represents some form of persistent storage, such as read only memory (ROM), for storing programs needed to operate the device 70 and possibly for carrying out various aspects of the present invention. I/O control 75 represents interface circuitry to receive and transmit signals by way of the communication channels 76, 77. In the embodiment shown, all major system components connect to the bus 71, which may represent more than one physical or logical bus; however, a bus architecture is not required to implement the present invention.

In embodiments implemented by a general-purpose computer system, additional components may be included for interfacing to devices such as a keyboard or mouse and a display, and for controlling a storage device 78 having a storage medium such as magnetic tape or disk, or an optical medium. The storage medium may be used to record programs of instructions for operating systems, utilities and applications, and may include programs that implement various aspects of the present invention.

The functions required to practice various aspects of the present invention can be performed by components that are implemented in a wide variety of ways, including discrete logic components, integrated circuits, one or more application-specific integrated circuits (ASICs) and/or program-controlled processors. The manner in which these components are implemented is not important to the present invention.

Software implementations of the present invention may be conveyed by a variety of machine-readable media, such as baseband or modulated communication paths throughout the spectrum including from supersonic to ultraviolet frequencies, or storage media that convey information using essentially any recording technology, including magnetic tape, cards or disk, optical cards or disc, and detectable markings on media including paper.

11‧‧‧Path
12‧‧‧Path
20‧‧‧Path
31‧‧‧Path
32‧‧‧Path
70‧‧‧Device
71‧‧‧Bus
72‧‧‧Processor
73‧‧‧Random access memory
74‧‧‧Read only memory
75‧‧‧I/O control
76‧‧‧Communication channel
77‧‧‧Communication channel
78‧‧‧Storage device
100‧‧‧Transmitter
111‧‧‧Analysis filterbank
112‧‧‧Analysis filterbank
120‧‧‧Encoder
200‧‧‧Receiver
220‧‧‧Decoder
231‧‧‧Synthesis filterbank
232‧‧‧Synthesis filterbank
401‧‧‧Stage
402‧‧‧Stage

Fig. 1 is a schematic block diagram of an audio coding system in which various aspects of the present invention may be carried out.

Fig. 2 is a schematic block diagram of a process that may be used to perform the synthesis transform in the coding system of Fig. 1.

圖3和圖4是示意性方塊圖,其繪示出了一些特徵,可用以執行示於圖2的處理的一部分。Figures 3 and 4 are schematic block diagrams depicting features that may be used to perform a portion of the process illustrated in Figure 2.

圖5是裝置的示意性方塊圖,其可用以實施本發明不同的態樣。Figure 5 is a schematic block diagram of an apparatus that can be used to implement different aspects of the present invention.

11, 12, 20, 31, 32‧‧‧Path

100‧‧‧Transmitter

111‧‧‧Analysis filterbank

112‧‧‧Analysis filterbank

120‧‧‧Encoder

200‧‧‧Receiver

220‧‧‧Decoder

231, 232‧‧‧Synthesis filterbank

Claims (7)

1. A method for processing a digital audio signal, wherein the method comprises: receiving a block of real-valued transform coefficients, wherein the block has a quantity K of real-valued transform coefficients of which only a number L represent spectral components of a limited-bandwidth audio signal, ½L < M < K, and M is a power of two; applying a first transform of length R to a block of complex-valued coefficients derived from M complex-valued transform coefficients that include the L real-valued transform coefficients representing spectral components of the limited-bandwidth audio signal, where [expression not reproduced in this text], and P is a power of two; applying a bank of Q second transforms of length P to outputs of the first transform; and deriving a sequence of N real-valued signal samples from outputs of the bank of second transforms, wherein N = 2·K and the real-valued signal samples represent temporal components of the limited-bandwidth audio signal.
2. The method of claim 1, wherein: each of the second transforms is equivalent to performing a calculation expressed as [equation not reproduced in this text] for 0 ≤ n < Q and 0 ≤ m < P; the sequence of real-valued signal samples is derived from the outputs of the bank of second transforms by performing a calculation equivalent to [equation not reproduced in this text]; where x' represents the outputs of the second transforms; U(n, p) = a kernel function of the first transform; y(n) represents intermediate signal samples; Re[x'(n)] = the real part of x'(n); Im[x'(n)] = the imaginary part of x'(n); j = the imaginary operator equal to √−1; and m, n and p are indices used in the calculations.
3. The method of claim 2, wherein the first transform is equivalent to performing a calculation expressed as [equation not reproduced in this text] for 0 ≤ n < Q and 0 ≤ p < P; where X represents the real-valued transform coefficients; [definition not reproduced in this text]; and r is an index used in the calculation.
4. The method of claim 2, wherein the first transform is equivalent to performing a calculation expressed as [equation not reproduced in this text] for 0 ≤ n < Q and 0 ≤ p < P; where X represents the real-valued transform coefficients; v = P·r + p; and r is an index used in the calculation.
5. The method of claim 2, wherein the first transform is equivalent to performing a calculation expressed as [equation not reproduced in this text] for 0 ≤ n < Q and 0 ≤ p < P; where X represents the real-valued transform coefficients; v = P·r + p; and r is an index used in the calculation.
6. An apparatus for processing a digital audio signal, wherein the apparatus comprises means for performing all steps of the method of any one of claims 1 to 5.
7. A storage medium recording a program of instructions executable by a device to perform a method for processing a digital audio signal, wherein the method comprises all steps of the method of any one of claims 1 to 5.
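By way of illustration only, the two-stage structure recited in the claims (a shortened first transform followed by a bank of Q second transforms of length P) can be sketched with a generic pruned inverse DFT. This is not the claimed calculation, whose equations are not reproduced in this text. The function names, the use of an unnormalised inverse DFT of length P·Q, the input index mapping P·r + p, and the relation R = M / P are all assumptions made for the sketch. It shows only why, when just the first M inputs are non-zero, the inner sum of the first stage needs R terms rather than Q.

```python
import cmath


def pruned_two_stage_idft(X, P, Q, M):
    """Unnormalised inverse DFT of length P*Q, assuming X[k] == 0 for k >= M.

    Hypothetical helper for illustration; assumes M is a multiple of P
    and sets R = M // P (an assumption, not taken from the claims).
    """
    T = P * Q
    R = M // P
    assert len(X) == T and M % P == 0 and R <= Q

    # Stage 1: for each output index n and residue p, sum only the R
    # inputs X[P*r + p] that can be non-zero, then apply the twiddle
    # factor exp(2*pi*j*n*p/T).
    U = [[0j] * P for _ in range(Q)]
    for n in range(Q):
        for p in range(P):
            acc = 0j
            for r in range(R):
                acc += X[P * r + p] * cmath.exp(2j * cmath.pi * n * r / Q)
            U[n][p] = acc * cmath.exp(2j * cmath.pi * n * p / T)

    # Stage 2: a bank of Q inverse DFTs, each of length P.
    x = [0j] * T
    for n in range(Q):
        for m in range(P):
            x[n + Q * m] = sum(
                U[n][p] * cmath.exp(2j * cmath.pi * m * p / P) for p in range(P)
            )
    return x


def direct_idft(X):
    """Reference: direct unnormalised inverse DFT, used only for comparison."""
    T = len(X)
    return [
        sum(X[k] * cmath.exp(2j * cmath.pi * n * k / T) for k in range(T))
        for n in range(T)
    ]
```

With P = 4, Q = 8 and M = 8, for example, the inner sum of the first stage runs over R = 2 terms instead of 8, and the result agrees with the direct inverse DFT to within rounding error. Folding real-valued coefficients into complex-valued ones, and deriving the N real-valued output samples from the second-stage outputs, are outside the scope of this sketch.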
TW101135308A 2012-03-19 2012-09-26 Reduced complexity transform for a low-frequency-effects channel TWI470622B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2012/029603 WO2012134851A1 (en) 2011-03-28 2012-03-19 Reduced complexity transform for a low-frequency-effects channel

Publications (2)

Publication Number Publication Date
TW201340095A TW201340095A (en) 2013-10-01
TWI470622B true TWI470622B (en) 2015-01-21

Family

ID=49775728

Family Applications (1)

Application Number Title Priority Date Filing Date
TW101135308A TWI470622B (en) 2012-03-19 2012-09-26 Reduced complexity transform for a low-frequency-effects channel

Country Status (2)

Country Link
AR (1) AR088059A1 (en)
TW (1) TWI470622B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201129970A (en) * 2009-10-20 2011-09-01 Fraunhofer Ges Forschung Audio signal encoder, audio signal decoder, method for encoding or decoding and audio signal using an aliasing-cancellation

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Patrick de Smet et al.: "Optimized MPEG audio decoding using recursive subband synthesis windowing", 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings. (ICASSP). Orlando, FL, May 13 2002, ISBN: 978-0-7803-7402-7.
Supriya Dhabal, S. M. Lalan Chowdhury, and P. Venkateswaran: "A Novel Low Complexity Multichannel Cosine Modulated Filter Bank Using IFIR Technique for Nearly Perfect Reconstruction", 1st Int'l Conf. on Recent Advances in Information Technology (RAIT-2012), 15-17 March 2012. *

Also Published As

Publication number Publication date
AR088059A1 (en) 2014-05-07
TW201340095A (en) 2013-10-01

Similar Documents

Publication Publication Date Title
US11580995B2 (en) Reconstruction of audio scenes from a downmix
CN101882441B (en) Efficient filtering with a complex modulated filterbank
RU2645271C2 (en) Stereophonic code and decoder of audio signals
JP2022174061A (en) Decoder for decoding encoded audio signal and encoder for encoding audio signal
KR101286329B1 (en) Low complexity spectral band replication (sbr) filterbanks
JP2007526691A (en) Adaptive mixed transform for signal analysis and synthesis
CN102915739A (en) Method and apparatus for encoding and decoding high frequency signal
Britanak et al. Cosine-/Sine-Modulated Filter Banks
US8433584B2 (en) Multi-channel audio decoding method and apparatus therefor
TWI470622B (en) Reduced complexity transform for a low-frequency-effects channel
Khaldi et al. HHT-based audio coding
US10410644B2 (en) Reduced complexity transform for a low-frequency-effects channel
AU2012238001A1 (en) Reduced complexity transform for a low-frequency-effects channel
BR112013022988B1 (en) Method for processing a digital audio signal, apparatus for processing a digital audio signal and storage medium
Zhu et al. Fast convolution for binaural rendering based on HRTF spectrum