WO2008074251A1 - A hierarchical coding decoding method and device - Google Patents

A hierarchical coding decoding method and device

Info

Publication number
WO2008074251A1
Authority
WO
WIPO (PCT)
Prior art keywords
subband
enhancement layer
sub
module
band
Prior art date
Application number
PCT/CN2007/071154
Other languages
French (fr)
Chinese (zh)
Inventor
Hualin Wan
Jun Zhang
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2008074251A1

Links

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/0212 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/24 Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding

Definitions

  • the present invention relates to codec technology, and in particular to a layered audio codec method and apparatus.
  • Background of the invention
  • audio codecs are more and more widely used in digital audio broadcasting, high-quality audio transmission on the Internet, digital movies, and the like.
  • this requires that an audio codec system be able to adapt to different application environments. Layered audio coding technology was developed to meet this requirement.
  • the layered feature means that the audio signal is organized in layers, dividing the signal into a low-quality part and a high-quality part. The low-quality part is the core layer of the audio signal, the high-quality part is the enhancement layer of the audio signal, and the low-quality part can be decoded without any information from the high-quality part.
  • the layering feature is particularly useful when the transmission channel does not guarantee full bandwidth to transmit a complete signal.
  • for example, users accessing audio through high-speed links can play 384 kbit/s encoded surround sound in real time, while users with only 56 kbit/s modems cannot enjoy it.
  • with layering, users connected at a rate of 56 kbit/s can download the core layer of the signal and enjoy lower-quality audio.
  • FIG. 1a is a schematic structural diagram of a layered audio coding device in the prior art, which includes a Quadrature Mirror Filterbank (QMF) 101, a QMF 102, a Code Excited Linear Prediction (CELP) encoding module 103, a CELP decoding module 104, an adder 105, a Modified Discrete Cosine Transform (MDCT) module 106, an MDCT module 107, a Time Domain Alias Cancellation (TDAC) encoding module 108, a Time Domain Bandwidth Extension (TDBWE) module 109, and a bit stream multiplexing and packing module 110.
  • the QMF 101 filters the input pulse code modulation (PCM) signal and outputs the core layer signal.
  • the input to the QMF 101 is a PCM signal at a sampling frequency of 16,000 Hz.
  • the QMF 102 filters the input PCM signal and outputs the enhancement layer signal.
  • the PCM signal is thus filtered by the QMF 101 and the QMF 102 and divided into a core layer signal and an enhancement layer signal.
  • the CELP encoding module 103 performs CELP encoding on the core layer signal input by the QMF 101, and transmits the encoded data to the CELP decoding module 104 and the bit stream multiplexing and packing module 110.
  • the CELP decoding module 104 performs CELP decoding on the encoded data input by the CELP encoding module 103, and then transmits the decoded data to the adder 105.
  • the adder 105 computes the difference between the core layer signal input from the QMF 101 and the signal input from the CELP decoding module 104, and transmits the difference signal to the MDCT module 106.
  • the MDCT module 106 converts the signal input by the adder 105 from the time domain to the frequency domain to obtain MDCT coefficients, which are transmitted to the TDAC encoding module 108.
  • the MDCT module 107 converts the enhancement layer signal input by the QMF 102 from the time domain to the frequency domain to obtain the MDCT coefficients of the enhancement layer, and transmits the coefficients to the TDAC encoding module 108.
  • the TDAC encoding module 108 performs TDAC encoding on the MDCT coefficients input by the MDCT module 106 and the enhancement layer MDCT coefficients input by the MDCT module 107, and transmits the encoded data to the bit stream multiplexing and packing module 110.
  • the MDCT coefficients from 0 to 7,000 Hz are divided into 18 sub-bands, and the envelope value of each of the 18 sub-bands is calculated. A number of coding bits is allocated to each sub-band according to the size of its envelope value, and each sub-band is quantized and encoded according to its number of coding bits.
  • the TDBWE module 109 extracts high frequency parameters from the enhancement layer signals input by the QMF 102 and transmits them to the bit stream multiplexing and packing module 110.
  • the bit stream multiplexing and packing module 110 multiplexes and packs the encoded data input by the CELP encoding module 103, the encoded data input by the TDAC encoding module 108, and the data input by the TDBWE module 109.
  • FIG. 1b is a schematic structural diagram of a layered audio decoding device corresponding to FIG. 1a, which includes a bit stream demultiplexing module 120, a CELP decoding module 121, a TDAC decoding module 122, and a TDBWE decoding module 123.
  • the bit stream demultiplexing module 120 demultiplexes the received encoded data, transmits the demultiplexed core layer encoded data to the CELP decoding module 121, and transmits the data of the other layers to the TDAC decoding module 122 and the TDBWE decoding module 123.
  • the CELP decoding module 121 decodes the received core layer encoded data and transmits it to the adder 124.
  • the TDAC decoding module 122 decodes the received encoded data and transmits it to the inverse MDCT module 125 and the inverse MDCT module 126.
  • the TDBWE decoding module 123 decodes the received encoded data and transmits it to
  • the inverse MDCT module 125 converts the received frequency domain signal into a time domain signal and transmits it to the adder 124.
  • the inverse MDCT module 126 converts the received frequency domain signal into a time domain signal and transmits the signal to the time domain signal.
  • the adder 124 adds the core layer decoded data input by the CELP decoding module 121 and the data input by the inverse MDCT module 125, and transmits the result of the summation to the QMF 127.
  • the QMF 127 upsamples the received signal to obtain the core layer signal.
  • the QMF 128 upsamples the received signal to obtain the enhancement layer signal.
  • the adder 129 adds the core layer signal input by the QMF 127 and the enhancement layer signal input by the QMF 128 to obtain a decompressed PCM code stream.
  • the human auditory system can sense sounds in the frequency range of 20 Hz to 20,000 Hz.
  • the upper limit of the frequency depends on the condition of each person's auditory system and the intensity of the sound.
  • the average human auditory system is sensitive to sound in the frequency range of 2,000 Hz to 8,000 Hz.
  • the prior art processes an input signal with a sampling frequency of 16,000 Hz, allocates the number of coded bits according to the envelope value of each sub-band, and places the coded data of sub-bands with large envelope values in the lower layers, which is feasible.
  • however, for an input signal at a sampling frequency of 32,000 Hz, 44,100 Hz or 48,000 Hz, there are four major drawbacks to this approach.
  • a sub-band near 16,000 Hz may have a large envelope value yet not reach the threshold that the human ear can perceive, that is, the human ear is not sensitive to it. If more bits are allocated to this sub-band, the sub-bands that are really important do not have enough bits for encoding, which affects the quality of the encoding. This method may also cause a sub-band to which the human ear is sensitive to be placed late in the code stream because its envelope value is small, so that it is preferentially discarded when network conditions are poor, which affects the user's listening experience. That is to say, the prior art layered audio codec method cannot effectively handle input signals with a high sampling frequency.
  • the QMF used in the prior art increases the complexity of the codec algorithm and increases the delay of the codec algorithm.
  • the CELP coding used for the core layer signal is designed for the characteristics of speech signals. It is not suitable for other types of low-frequency signals, which affects the codec performance.
  • Summary of the invention
  • Another object of embodiments of the present invention is to provide a layered audio decoding apparatus that effectively improves decoding quality.
  • a layered audio coding device, comprising: a layering module based on an auditory perception model, an auditory perception model, a subband envelope calculation and coding module, a core layer coding module, an enhancement layer coding module, and a bit stream multiplexing and packing module;
  • the layering module based on the auditory perception model performs a Modulated Lapped Transform (MLT) on the input signal, and divides the resulting MLT coefficients into a core layer signal and an enhancement layer signal;
  • the auditory perception model provides the basis for layering to the layering module based on the auditory perception model, and provides the basis for subband importance weighting to the enhancement layer coding module;
  • the subband envelope calculation and coding module calculates the envelope value of each subband of the core layer signal and the enhancement layer signal input by the layering module based on the auditory perception model; transmits the core layer signal and the envelope values of its subbands to the core layer coding module, and the enhancement layer signal and the envelope values of its subbands to the enhancement layer coding module; and encodes the subband envelope values and transmits the encoded data to the bit stream multiplexing and packing module;
  • the core layer coding module encodes the input core layer signal according to the envelope value of each subband of the input core layer signal, and then transmits the signal to the bit stream multiplexing and packing module;
  • the enhancement layer coding module encodes the input enhancement layer signal according to the auditory perception model and the envelope value of each subband of the input enhancement layer signal, and then transmits the signal to the bit stream multiplexing and packing module;
  • the bit stream multiplexing and packing module multiplexes and packs the coded data of each subband of the core layer input by the core layer coding module, the coded data of each subband of the enhancement layer input by the enhancement layer coding module, and the subband envelope value encoded data input by the subband envelope calculation and coding module.
  • a layered audio decoding device, comprising: a bit stream demultiplexing module, a subband envelope decoding module, a core layer decoding module, an enhancement layer decoding module, an auditory perception model, and an MLT coefficient reconstruction and inverse transform module;
  • the bit stream demultiplexing module demultiplexes the received encoded data into subband envelope value encoded data, core layer encoded data, and enhancement layer encoded data, and transmits them to the subband envelope decoding module;
  • the subband envelope decoding module decodes the subband envelope value encoded data to obtain the envelope value of each subband, transmits the core layer encoded data and the envelope value of each subband of the core layer to the core layer decoding module, and transmits the enhancement layer encoded data and the envelope value of each subband of the enhancement layer to the enhancement layer decoding module;
  • the core layer decoding module decodes the input core layer encoded data according to the input envelope value of each subband of the core layer, obtains the MLT coefficients of each subband of the decompressed core layer, and transmits them to the MLT coefficient reconstruction and inverse transform module;
  • the enhancement layer decoding module decodes the input enhancement layer encoded data according to the auditory perception model and the input envelope value of each subband of the enhancement layer, obtains the MLT coefficients of each subband of the decompressed enhancement layer, and transmits the MLT coefficients of each subband of the enhancement layer and the envelope value of each subband of the enhancement layer to the MLT coefficient reconstruction and inverse transform module;
  • the auditory perception model provides the basis for the subband importance weighting of the enhancement layer decoding module;
  • the MLT coefficient reconstruction and inverse transform module performs an inverse transform on the MLT coefficients of each subband of the core layer and the MLT coefficients of each subband of the enhancement layer to obtain the decompressed output signal.
  • a layered audio codec method, comprising:
  • after the input signal is passed through the MLT, dividing it into a core layer signal and an enhancement layer signal according to the auditory perception model, and obtaining the encoded data of the envelope value of each subband from the core layer signal and the enhancement layer signal;
  • obtaining the encoded data of each subband of the core layer according to the core layer signal and the envelope value of each subband of the core layer signal, and obtaining the encoded data of each subband of the enhancement layer according to the enhancement layer signal, the auditory perception model, and the envelope value of each subband of the enhancement layer signal;
  • multiplexing and packing the encoded data of the subband envelope values, the encoded data of each subband of the core layer, and the encoded data of each subband of the enhancement layer, and then transmitting them to the decoding end.
  • the layered audio codec solution of the embodiment of the present invention performs MLT on the input signal, layers and encodes the signal according to the auditory perception model, and transmits the multiplexed and packed data to the decoding end, thereby improving the quality of the codec.
  • the invention solves the problem that the high sampling rate input signal cannot be effectively processed in the prior art.
  • FIG. 1a is a schematic structural diagram of a layered audio coding device in the prior art;
  • FIG. 1b is a schematic structural diagram of the layered audio decoding device corresponding to FIG. 1a in the prior art;
  • FIG. 2 is a schematic structural diagram of a layered coding apparatus according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram showing the structure of the audio code stream after multiplexing and packing in FIG. 2;
  • FIG. 4 is a schematic structural diagram of a layered decoding apparatus according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of a layered coding method according to an embodiment of the present invention.
  • FIG. 6 is a flowchart of a layered decoding method according to an embodiment of the present invention.
  • Mode for carrying out the invention
  • FIG. 2 is a schematic structural diagram of a layered coding apparatus according to an embodiment of the present invention, including a layering module 210 based on an auditory perception model, an auditory perception model 220, a subband envelope calculation and coding module 230, a core layer encoding module 240, an enhancement layer encoding module 250, and a bit stream multiplexing and packing module 260.
  • the layering module 210 based on the auditory perception model converts the PCM signal into MLT coefficients and then divides them into the core layer signal and the enhancement layer signal according to the auditory perception model 220. It includes an MLT module 211, a subband division module 212, and a band importance layering module 213.
  • the MLT module 211 performs MLT on the input PCM signal and converts it into MLT coefficients.
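As an illustration of the transform performed by the MLT module 211, the following is a minimal sketch. It assumes the common textbook definition of the MLT (an MDCT with a sine window and orthonormal scaling); the source does not give the formula, so the function names `mlt` and `imlt`, the frame length, and the scaling are all assumptions. It uses a direct O(M²) evaluation for clarity rather than a fast algorithm.

```python
import math


def mlt(frame):
    """Forward MLT: a frame of 2*M samples yields M coefficients (assumed definition)."""
    m = len(frame) // 2
    scale = math.sqrt(2.0 / m)
    coeffs = []
    for k in range(m):
        acc = 0.0
        for n in range(2 * m):
            window = math.sin((n + 0.5) * math.pi / (2 * m))  # sine window
            phase = (n + (m + 1) / 2) * (k + 0.5) * math.pi / m
            acc += window * frame[n] * math.cos(phase)
        coeffs.append(scale * acc)
    return coeffs


def imlt(coeffs):
    """Inverse MLT: M coefficients yield 2*M windowed samples for overlap-add."""
    m = len(coeffs)
    scale = math.sqrt(2.0 / m)
    out = []
    for n in range(2 * m):
        window = math.sin((n + 0.5) * math.pi / (2 * m))
        acc = 0.0
        for k in range(m):
            phase = (n + (m + 1) / 2) * (k + 0.5) * math.pi / m
            acc += coeffs[k] * math.cos(phase)
        out.append(scale * window * acc)
    return out


# Two frames overlapping by 50%; overlap-adding their inverse transforms
# recovers the block of M samples that both frames share.
M = 8
x = [math.sin(0.3 * n) for n in range(3 * M)]
y0 = imlt(mlt(x[0:2 * M]))
y1 = imlt(mlt(x[M:3 * M]))
middle = [y0[M + n] + y1[n] for n in range(M)]
```

Under these assumptions each new block of M input samples produces M coefficients, so the transform is critically sampled, and the time-domain aliasing of adjacent frames cancels in the overlap-add.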
  • the subband division module 212 divides the MLT coefficients of each frame into a plurality of equally spaced subbands, or divides the MLT coefficients of each frame into a plurality of non-equally spaced subbands according to the auditory perception model 220.
  • in the non-equally spaced case, the bandwidth of a subband depends on its frequency position: narrower subbands are used for low-frequency MLT coefficients and wider subbands for high-frequency MLT coefficients.
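The two division modes can be sketched as follows. The helper `split_subbands`, the 32-coefficient frame, and the specific width lists are hypothetical; the source states only that low-frequency subbands are narrower and high-frequency subbands wider, not the actual widths.

```python
def split_subbands(num_coeffs, widths):
    """Return (start, end) coefficient index pairs, one per subband (end exclusive)."""
    if sum(widths) != num_coeffs:
        raise ValueError("subband widths must sum to the number of coefficients")
    bands = []
    start = 0
    for width in widths:
        bands.append((start, start + width))
        start += width
    return bands


# Hypothetical layouts for a frame of 32 MLT coefficients.
uniform = split_subbands(32, [4] * 8)                        # equally spaced
non_uniform = split_subbands(32, [2, 2, 2, 2, 4, 4, 8, 8])   # narrow low bands, wide high bands
```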
  • the band importance layering module 213 divides the MLT coefficients, already divided into a plurality of subbands, into a core layer signal containing the sensitive signals and an enhancement layer signal containing the less sensitive signals according to the auditory perception model 220.
  • the MLT coefficients in the frequency range to which the human ear is sensitive are assigned to the core layer signal, and the MLT coefficients in the frequency range to which the human ear is less sensitive are assigned to the enhancement layer signal.
  • for example, since the human ear is sensitive to signals in the frequency range of 2,000 Hz to 8,000 Hz, the MLT coefficients in the frequency range of 0 Hz to 8,000 Hz can be assigned to the core layer signal, and the MLT coefficients in the frequency range above 8,000 Hz to the enhancement layer signal.
  • the core layer signal and the enhancement layer signal respectively include a plurality of sub-bands.
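A minimal sketch of this layering step, using the 8,000 Hz boundary from the example above. The function name `layer_subbands`, the frame size, and the assumption that coefficient index k of M corresponds to the frequency k * (sample_rate / 2) / M are illustrative and not taken from the source.

```python
def layer_subbands(bands, num_coeffs, sample_rate_hz, cutoff_hz=8000.0):
    """Split subbands into core and enhancement layers by their upper frequency edge."""
    hz_per_coeff = (sample_rate_hz / 2.0) / num_coeffs  # assumed coefficient spacing
    core, enhancement = [], []
    for start, end in bands:
        upper_edge_hz = end * hz_per_coeff
        if upper_edge_hz <= cutoff_hz:
            core.append((start, end))
        else:
            enhancement.append((start, end))
    return core, enhancement


# Hypothetical frame: 16 coefficients at a 32,000 Hz sampling rate (1,000 Hz per coefficient).
bands = [(0, 4), (4, 8), (8, 12), (12, 16)]
core, enhancement = layer_subbands(bands, num_coeffs=16, sample_rate_hz=32000)
```

With these example values the two subbands below 8,000 Hz fall into the core layer and the two above it into the enhancement layer.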
  • the auditory perception model 220 provides the basis for the non-equally spaced division of the MLT coefficients by the subband division module 212, the basis for the subband layering by the band importance layering module 213, and the basis for the subband importance weighting by the subband importance weighting module 251.
  • the subband envelope calculation and coding module 230 calculates the envelope value of each subband of the core layer signal and the enhancement layer signal from the input core layer signal and enhancement layer signal.
  • the core layer signal and the envelope values of its subbands are transmitted to the core layer encoding module 240.
  • the enhancement layer signal and the envelope values of its subbands are transmitted to the enhancement layer encoding module 250; the subband envelope values are encoded, and the encoded data is transmitted to the bit stream multiplexing and packing module 260.
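The source does not define how the envelope value is computed. As a sketch only, the example below assumes the envelope of a subband is the root-mean-square amplitude of its MLT coefficients, which is a common choice in transform codecs; the function name `subband_envelopes` and the sample values are hypothetical.

```python
import math


def subband_envelopes(coeffs, bands):
    """Envelope per subband, assumed here to be the RMS of its coefficients."""
    envelopes = []
    for start, end in bands:
        energy = sum(c * c for c in coeffs[start:end])
        envelopes.append(math.sqrt(energy / (end - start)))
    return envelopes


# Two subbands of two coefficients each.
envelopes = subband_envelopes([3.0, 4.0, 0.0, 0.0], [(0, 2), (2, 4)])
```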
  • the core layer encoding module 240 encodes the input core layer signal according to the envelope value of each subband of the core layer signal input by the subband envelope calculation and encoding module 230, and then transmits the signal to the bit stream multiplexing and packing module 260. It includes a subband bit allocation module 241 and a quantization and coding module 242.
  • the subband bit allocation module 241 receives the core layer signal and the envelope value of each subband of the core layer signal input by the subband envelope calculation and encoding module 230, allocates a number of bits to each subband according to the envelope value of each subband of the core layer signal, and transmits the bit number information of each subband and the core layer signal to the quantization and coding module 242.
  • the core layer signal includes a plurality of sub-bands, that is, MLT coefficients of a plurality of sub-bands that are divided into layers.
  • the quantization and coding module 242 quantizes and encodes each subband signal of the input core layer signal according to the number of bits of each subband of the core layer, and transmits the encoded data of each subband of the core layer to the bit stream multiplexing and packing module 260.
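The source says bits are allocated according to the envelope values but gives no rule. One possible rule, shown here purely as an assumption, is to split a bit budget in proportion to each subband's envelope and hand out rounding leftovers by largest fractional share; `allocate_bits` and the budget of 16 bits are hypothetical.

```python
def allocate_bits(envelopes, total_bits):
    """Split total_bits across subbands in proportion to their envelope values."""
    total = sum(envelopes)
    if total <= 0:
        return [0] * len(envelopes)
    shares = [total_bits * env / total for env in envelopes]
    bits = [int(share) for share in shares]  # round down first
    leftover = total_bits - sum(bits)
    # Give the remaining bits to the subbands with the largest fractional parts.
    order = sorted(range(len(shares)), key=lambda i: shares[i] - bits[i], reverse=True)
    for i in order[:leftover]:
        bits[i] += 1
    return bits


bits = allocate_bits([4.0, 2.0, 2.0], total_bits=16)
```

Because any such rule depends only on the envelope values, a decoder that has decoded the same envelope values can repeat the computation and arrive at the same allocation without it being transmitted.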
  • the enhancement layer coding module 250 encodes the input enhancement layer signal according to the auditory perception model 220 and the envelope value of each subband of the enhancement layer signal input by the subband envelope calculation and coding module 230, and then transmits the encoded data to the bit stream multiplexing and packing module 260. It includes a subband importance weighting module 251, a subband bit allocation module 252, and a quantization and encoding module 253.
  • the subband importance weighting module 251 receives the enhancement layer signal and the envelope value of each subband of the enhancement layer input by the subband envelope calculation and encoding module 230, performs a weighting calculation on the importance of each subband of the enhancement layer signal according to the input envelope values and the auditory perception model 220, and transmits the calculated importance weighting result of each subband of the enhancement layer and the enhancement layer signal to the subband bit allocation module 252.
  • the embodiment of the present invention weights the enhancement layer signal according to the auditory perception model 220: for a subband to which the human ear is sensitive, the importance weighting result is the product of the envelope value of the subband and a larger weight value; for a subband to which the human ear is not sensitive, the importance weighting result is the product of the envelope value of the subband and a smaller weight value.
  • in the prior art, the importance of each subband of the enhancement layer is determined only by the envelope value, whereas in the embodiment of the present invention it is determined jointly by the envelope value and the sensitivity of the human ear.
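The weighting described above reduces to a per-subband product. In this sketch the function name `weighted_importance`, the boolean `sensitive` flags standing in for the auditory perception model, and the weight values 1.0 and 0.25 are all hypothetical; the source states only that sensitive subbands get a larger weight than insensitive ones.

```python
def weighted_importance(envelopes, sensitive, high_weight=1.0, low_weight=0.25):
    """Importance of each subband = envelope value * perceptual weight."""
    return [
        env * (high_weight if is_sensitive else low_weight)
        for env, is_sensitive in zip(envelopes, sensitive)
    ]


# The first subband has the larger envelope but lies where the ear is insensitive.
importance = weighted_importance([10.0, 8.0], [False, True])
```

In this example the second subband ends up more important than the first despite its smaller envelope, which is the behaviour that distinguishes this scheme from ranking by envelope value alone.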
  • the subband bit allocation module 252 receives the importance weighting result of each subband of the enhancement layer and the enhancement layer signal input by the subband importance weighting module 251, allocates a number of bits to each subband signal of the enhancement layer signal according to the importance weighting result of each subband, and transmits the bit number information of each subband signal and the enhancement layer signal to the quantization and coding module 253.
  • the quantization and coding module 253 receives the bit number information of each subband of the enhancement layer and the enhancement layer signal input by the subband bit allocation module 252, quantizes and encodes each subband signal of the enhancement layer signal according to the number of bits of each subband signal of the enhancement layer, and transmits the encoded data of each subband of the enhancement layer to the bit stream multiplexing and packing module 260.
  • the bit stream multiplexing and packing module 260 multiplexes and packs the coded data of each subband of the core layer input by the quantization and coding module 242, the coded data of each subband of the enhancement layer input by the quantization and coding module 253, and the subband envelope value encoded data input by the subband envelope calculation and coding module 230.
  • the subband envelope value encoded data input by the subband envelope calculation and coding module 230 includes the subband envelope value corresponding to each subband of the core layer signal and the subband envelope value corresponding to each subband of the enhancement layer signal.
  • FIG. 3 is a schematic diagram of the structure of the audio code stream after multiplexing and packing in FIG. 2, including a core part and an enhancement part.
  • the core part includes a frame header, the coded data of the envelope value of each subband, and the core layer coded data. The core layer coded data is the layer 0 coded data in the figure, in which the coded data of each subband of the core layer is arranged in order of frequency from low to high.
  • the enhancement part is composed of the enhancement layer coded data, which is divided into layer 1 coded data through layer N coded data as shown in the figure.
  • the coded data of each subband of the enhancement layer is placed into the code stream in descending order of importance. Before the coded data of a given subband of the enhancement layer is placed into the code stream, the sum of the number of bits already used by the code stream of the frame and the number of bits of that subband's coded data is calculated and compared with the total number of bits available in the frame. If the sum is less than or equal to the total number of bits, the subband's coded data is placed into the code stream, the number of used bits is updated to this sum, and the next subband's coded data is processed; otherwise, no further subband coded data is placed, and the remaining available bits are filled with a preset value, such as "1" or "0". That is, the coded data of that subband and of all less important subbands is discarded.
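The placement procedure above can be sketched as follows. Representing coded data as strings of "0" and "1", and the names `pack_enhancement`, `used_bits`, `total_bits` and `fill`, are choices made for this illustration only.

```python
def pack_enhancement(subbands, used_bits, total_bits, fill="0"):
    """Place subband data in descending importance until the frame budget is hit.

    subbands: list of (importance, bitstring) pairs for the enhancement layer.
    used_bits: bits of the frame already consumed before the enhancement part.
    """
    stream = []
    for _, data in sorted(subbands, key=lambda item: item[0], reverse=True):
        if used_bits + len(data) > total_bits:
            break  # this subband and all less important ones are discarded
        stream.append(data)
        used_bits += len(data)
    stream.append(fill * (total_bits - used_bits))  # pad the remaining bits
    return "".join(stream)


packed = pack_enhancement(
    [(0.5, "1111"), (0.9, "101"), (0.1, "11")], used_bits=0, total_bits=8
)
```

Here the two most important subbands use 7 of the 8 available bits, the least important one no longer fits, and the last bit is filled with the preset value.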
  • FIG. 4 is a schematic structural diagram of a layered decoding apparatus according to an embodiment of the present invention, including a bit stream demultiplexing module 410, a subband envelope decoding module 420, a core layer decoding module 430, an enhancement layer decoding module 440, an auditory perception model 450, and an MLT coefficient reconstruction and inverse transform module 460.
  • the bit stream demultiplexing module 410 demultiplexes the received encoded data into subband envelope value encoded data, core layer encoded data, and enhancement layer encoded data, and transmits the encoded data to the subband envelope decoding module 420.
  • the core layer coded data is a whole composed of a plurality of core layer sub-band coded data
  • the enhancement layer coded data is a whole composed of a plurality of enhancement layer sub-band coded data.
  • the subband envelope decoding module 420 receives the core layer coded data, the subband envelope value coded data, and the enhancement layer coded data input by the bit stream demultiplexing module 410, decodes the subband envelope value coded data to obtain the envelope value of each subband, and then transmits the core layer coded data and the envelope value of each subband of the core layer to the core layer decoding module 430, and the enhancement layer coded data and the envelope value of each subband of the enhancement layer to the enhancement layer decoding module 440.
  • the sub-band envelope values obtained by decoding the sub-band envelope value encoded data include an envelope value of each sub-band of the core layer and an envelope value of each sub-band of the enhancement layer.
  • the core layer decoding module 430 receives the core layer coded data and the envelope value of each subband of the core layer input by the subband envelope decoding module 420, decodes the core layer coded data according to the envelope value of each subband of the core layer to obtain the MLT coefficients of each subband of the decompressed core layer, and transmits them to the MLT coefficient reconstruction and inverse transform module 460. It includes a subband bit allocation module 431, a subband data extraction module 432, and an inverse quantization and decoding module 433.
  • the subband bit allocation module 431 receives the core layer coded data and the envelope value of each subband of the core layer input by the subband envelope decoding module 420, allocates a number of bits to each subband according to the envelope value of each subband of the core layer, and transmits the number of bits of each subband of the core layer and the core layer coded data to the subband data extraction module 432.
  • the subband data extraction module 432 receives the bit number information of each subband of the core layer and the core layer encoded data input by the subband bit allocation module 431, extracts the encoded data of each subband from the core layer encoded data according to the number of bits occupied by each subband of the core layer, and transmits the encoded data of each subband of the core layer to the inverse quantization and decoding module 433.
  • the core layer coded data input from the subband bit allocation module 431 is an entirety including a plurality of core layer subband coded data, and is output as the coded data of each subband of the core layer by the subband data extraction module 432.
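The extraction step amounts to slicing the concatenated layer data by the per-subband bit counts. A minimal sketch, assuming the data is held as a string of "0" and "1" characters in subband order; the name `extract_subbands` and the sample values are hypothetical.

```python
def extract_subbands(layer_data, bits_per_band):
    """Cut the concatenated layer data into one chunk per subband."""
    if sum(bits_per_band) > len(layer_data):
        raise ValueError("layer data is shorter than the allocated bit counts")
    chunks = []
    pos = 0
    for nbits in bits_per_band:
        chunks.append(layer_data[pos:pos + nbits])
        pos += nbits
    return chunks


# 7 bits of core layer data, with 3 bits allocated to the first subband and 4 to the second.
chunks = extract_subbands("1011100", [3, 4])
```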
  • the inverse quantization and decoding module 433 receives the encoded data of each subband of the core layer input by the subband data extraction module 432, performs inverse quantization and decoding on the encoded data of each subband of the core layer to obtain the MLT coefficients of each subband of the decompressed core layer, and transmits them to the MLT coefficient reconstruction and inverse transform module 460.
  • the enhancement layer decoding module 440 receives the enhancement layer coded data input by the subband envelope decoding module 420 and the envelope value of each subband of the enhancement layer, and the enhancement layer according to the envelope value and the auditory perception model 450 of each subband of the enhancement layer.
  • the encoded data is decoded to obtain MLT coefficients of each subband of the decomposed enhancement layer, and the MLT coefficients of each subband of the enhancement layer and the envelope values of the subbands of the enhancement layer are transmitted to the MLT coefficient reconstruction and inverse transform module 460, which includes the sub The importance weighting module 441, the subband bit allocation module 442, the subband data extraction module 443, and the inverse quantization and decoding module 444.
  • The subband importance weighting module 441 receives the enhancement layer encoded data and the envelope value of each enhancement layer subband from the subband envelope decoding module 420. It weights the importance of each subband of the enhancement layer encoded data according to the envelope values and the auditory perception model 450, and passes the importance weighting result of each subband, the enhancement layer encoded data, and the envelope values of the enhancement layer subbands to the subband bit allocation module 442.
  • According to the auditory perception model 450, the present invention computes the importance weighting of the enhancement layer signal as follows: for a subband to which the human ear is sensitive, the importance weighting result is the product of the subband's envelope value and a larger weight value; for a subband to which the human ear is insensitive, it is the product of the subband's envelope value and a smaller weight value. The larger the computed result, the greater the importance; the smaller the result, the lower the importance.
  • In the prior art, the importance of an enhancement layer subband is determined by the envelope value alone; in the present invention, it is determined by both the envelope value and the sensitivity of the human ear.
  • The subband bit allocation module 442 receives, from the subband importance weighting module 441, the importance weighting result of each subband of the enhancement layer encoded data, the enhancement layer encoded data, and the envelope value of each enhancement layer subband. According to the importance weighting results, it allocates a number of bits to the encoded data of each enhancement layer subband, and passes the importance weighting results, the bit-count information of each subband's encoded data, the enhancement layer encoded data, and the envelope values of the enhancement layer subbands to the subband data extraction module 443.
  • The subband data extraction module 443 receives, from the subband bit allocation module 442, the importance weighting result of each subband of the enhancement layer encoded data, the bit-count information of each subband's encoded data, the enhancement layer encoded data, and the envelope value of each enhancement layer subband. Proceeding in order of subband importance, it extracts the encoded data of each enhancement layer subband according to the number of bits that subband occupies, and passes the per-subband encoded data and the envelope values of the enhancement layer subbands to the inverse quantization and decoding module 444.
  • The enhancement layer encoded data arriving from the subband bit allocation module 442 is a single block containing the encoded data of several enhancement layer subbands; the subband data extraction module 443 outputs it as the separate encoded data of each enhancement layer subband.
  • The encoded data of each subband is extracted according to the number of bits that subband occupies, in descending order of subband importance.
  • When extracting data, first compute the sum of the number of bits already extracted from the frame's code stream and the number of bits of the enhancement layer subband about to be extracted, then compare that sum with the total number of bits in the code stream of the current frame. If the sum is greater than the total, extraction stops; otherwise the encoded data of that subband is extracted, the extracted bit count is updated to the previous count plus the bits occupied by that subband, and extraction continues with the next enhancement layer subband.
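The extraction loop described above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function name `extract_subbands`, the dictionary of per-subband bit counts, and every number are hypothetical, and `order` is assumed to list the subbands with the most important one first.

```python
def extract_subbands(order, bits_per_subband, frame_bits, bits_already_used):
    """Greedily extract enhancement subbands until the frame's bit total is reached.

    order             -- subband indices, assumed sorted most important first
    bits_per_subband  -- mapping: subband index -> bits allocated to it
    frame_bits        -- total number of bits in the current frame's code stream
    bits_already_used -- bits extracted so far (e.g. envelopes and core layer)
    """
    extracted = []
    used = bits_already_used
    for r in order:
        # Would extracting this subband exceed the frame total? Then stop.
        if used + bits_per_subband[r] > frame_bits:
            break
        extracted.append(r)
        used += bits_per_subband[r]  # update the running bit count
    return extracted, used


# Hypothetical numbers: a 960-bit frame of which 700 bits are already consumed.
order = [18, 16, 20, 17]
bits = {16: 80, 17: 60, 18: 100, 20: 90}
picked, used = extract_subbands(order, bits, frame_bits=960, bits_already_used=700)
# picked == [18, 16], used == 880; subband 20 would bring the count to 970 > 960.
```

Note that the loop stops at the first subband that does not fit, as the text describes, even though a later, smaller subband (17 in this example) would still fit.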
  • The inverse quantization and decoding module 444 receives the encoded data and the envelope value of each enhancement layer subband from the subband data extraction module 443. It performs inverse quantization and decoding on the encoded data to obtain the decompressed MLT coefficients of each enhancement layer subband, and passes those MLT coefficients and the envelope values to the MLT coefficient reconstruction and inverse transform module 460.
  • The auditory perception model 450 provides the basis for the subband importance weighting performed by the subband importance weighting module 441. If the data of some less important enhancement layer subbands was discarded during encoding or transmission to adapt to network conditions, the model also gives the MLT coefficient reconstruction module 461 a basis for reconstructing the lost enhancement layer MLT coefficients.
  • The MLT coefficient reconstruction and inverse transform module 460 receives the MLT coefficients of each core layer subband from the inverse quantization and decoding module 433, and the MLT coefficients and envelope values of each enhancement layer subband from the inverse quantization and decoding module 444. It applies the inverse transform to the core layer and enhancement layer MLT coefficients to obtain the decompressed PCM signal. Module 460 includes an MLT coefficient reconstruction module 461 and an MLT inverse transform module 462.
  • The MLT coefficient reconstruction module 461 receives the MLT coefficients of each core layer subband from the inverse quantization and decoding module 433, and the MLT coefficients and envelope values of each enhancement layer subband from the inverse quantization and decoding module 444. Using the envelope values of the enhancement layer subbands, it rearranges the MLT coefficients of the core layer and enhancement layer subbands in frequency band order and passes them to the MLT inverse transform module 462.
  • The rearranged MLT coefficients form a single block containing both the core layer MLT coefficients and the enhancement layer MLT coefficients.
  • The MLT coefficients of each core layer and enhancement layer subband are arranged in ascending order of frequency.
  • Some enhancement layer subbands may have no MLT coefficients, because the data of less important enhancement layer subbands may have been discarded during encoding or transmission to adapt to network conditions; for example, the encoded data of some less important enhancement layer subbands may be dropped during multiplexing and packing in the bitstream multiplexing and packing module 260.
  • The missing enhancement layer MLT coefficients can be compensated according to the envelope values of the enhancement layer subbands.
  • The compensation method is as follows: the sign of each MLT coefficient is chosen at random and may be positive or negative, and its amplitude is the envelope value of the subband multiplied by a proportional constant.
  • The proportional constant is determined by the auditory perception model 450: a subband signal to which the human ear is highly sensitive has a large proportional constant, and a signal to which the human ear is less sensitive has a small proportional constant.
  • The MLT inverse transform module 462 receives the MLT coefficients from the MLT coefficient reconstruction module 461 and applies the inverse MLT to them to obtain the decompressed PCM signal.
  • In this embodiment, the input PCM signal has a sampling frequency of 48 kHz, a frame length of 20 ms, and a delay of 40 ms. The code rate ranges from 32 to 64 kbit/s, with a core layer code rate of 32 kbit/s and a layering step size of 0.8 kbit/s.
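The per-frame quantities implied by these parameters can be worked out directly. The short Python sketch below only does that arithmetic; the constant names are illustrative and do not come from the patent.

```python
SAMPLE_RATE_HZ = 48_000
FRAME_MS = 20
DELAY_MS = 40
CORE_RATE_BPS = 32_000   # core layer code rate
MAX_RATE_BPS = 64_000    # upper end of the code rate range
STEP_BPS = 800           # layering step size, 0.8 kbit/s

samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000    # 960 new samples per frame
window_samples = SAMPLE_RATE_HZ * DELAY_MS // 1000       # 1920 samples span the 40 ms delay
core_bits_per_frame = CORE_RATE_BPS * FRAME_MS // 1000   # 640 bits
step_bits_per_frame = STEP_BPS * FRAME_MS // 1000        # 16 bits per layering step
num_steps = (MAX_RATE_BPS - CORE_RATE_BPS) // STEP_BPS   # 40 steps above the core layer
max_bits_per_frame = MAX_RATE_BPS * FRAME_MS // 1000     # 1280 bits
```

The 960-sample frame and the 1920-sample span match the 960 MLT coefficients and the 1920 input samples used in step 501, and 640 + 40 × 16 = 1280 bits accounts for the full 64 kbit/s frame.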
  • Step 501 Apply the MLT to the PCM signal to convert it into MLT coefficients.
  • The input to each MLT is the most recent 1920 samples x(n), where x(0) is the oldest sample and 0 ≤ n < 1920.
  • The MLT outputs 960 MLT coefficients mlt(m), where 0 ≤ m < 960.
  • The MLT is given by:
  • The MLT can be decomposed into windowing, overlap, and addition, followed by a type IV discrete cosine transform (DCT). The windowing, overlap, and addition are done as follows:
  • Step 502 Divide the MLT coefficients of each frame into several equally spaced or unequally spaced subbands.
  • In this embodiment, the MLT coefficients in the 0 to 20 kHz band are divided equally into 40 subbands, each with a bandwidth of 500 Hz and 20 MLT coefficients.
  • Step 503 According to the auditory perception model, divide the MLT coefficients into a core layer signal, which contains the signals to which the ear is sensitive, and an enhancement layer signal, which contains the signals to which it is less sensitive.
  • The human ear is sensitive to signals in the range of 2 kHz to 8 kHz. Therefore the range 0 to 8 kHz, that is, subbands 0 to 15, is assigned to the core layer signal and allocated a code rate of 32 kbit/s. Subbands 16 to 39 are assigned to the enhancement layer signal, whose code rate is the remaining 32 kbit/s.
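The subband layout of steps 502 and 503 can be sketched as follows. This is an illustrative Python sketch for the equal-width case of this embodiment only; the function names are hypothetical, and the 25 Hz spacing per coefficient is derived here from 960 coefficients covering 0 to 24 kHz (half the 48 kHz sampling rate), which agrees with the stated 500 Hz per 20 coefficients.

```python
SAMPLE_RATE_HZ = 48_000
COEFFS_PER_FRAME = 960
COEFFS_PER_SUBBAND = 20
NUM_SUBBANDS = 40                  # covers 0 to 20 kHz
CORE_SUBBANDS = range(0, 16)       # 0 to 8 kHz
ENH_SUBBANDS = range(16, 40)       # 8 to 20 kHz

HZ_PER_COEFF = (SAMPLE_RATE_HZ / 2) / COEFFS_PER_FRAME   # 25.0 Hz


def subband_range_hz(r):
    """Return the (low, high) frequency edges of subband r in Hz."""
    low = r * COEFFS_PER_SUBBAND * HZ_PER_COEFF
    return low, low + COEFFS_PER_SUBBAND * HZ_PER_COEFF


def split_layers(mlt):
    """Split one frame of MLT coefficients into core and enhancement subbands."""
    subbands = [mlt[r * COEFFS_PER_SUBBAND:(r + 1) * COEFFS_PER_SUBBAND]
                for r in range(NUM_SUBBANDS)]
    core = [subbands[r] for r in CORE_SUBBANDS]
    enhancement = [subbands[r] for r in ENH_SUBBANDS]
    return core, enhancement


core, enhancement = split_layers(list(range(960)))   # dummy coefficients 0..959
```

With these definitions, subband 15 spans 7500 to 8000 Hz and subband 39 ends at 20,000 Hz; the 160 coefficients above 20 kHz are not assigned to any subband.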
  • Step 504 Calculate, according to the core layer signal and the enhancement layer signal, an envelope value of each subband of the core layer signal and the enhancement layer signal, and encode each subband envelope value to obtain coded data of each subband envelope value. Then, steps 505 and 507 are performed.
  • The subband envelope value is defined as the root mean square (RMS) of the MLT coefficients in the subband.
  • The subband envelope values are encoded with variable length coding (VLC) or another coding method to obtain the encoded data of the envelope value of each subband.
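As a small illustration of the envelope definition, the Python sketch below computes the RMS of each 20-coefficient subband. The function names are hypothetical, and the defaults assume the 40 equal subbands of this embodiment.

```python
import math


def subband_envelope(coeffs):
    """Root mean square of the MLT coefficients in one subband."""
    return math.sqrt(sum(c * c for c in coeffs) / len(coeffs))


def envelopes(mlt, coeffs_per_subband=20, num_subbands=40):
    """Envelope value of every subband in one frame of MLT coefficients."""
    return [
        subband_envelope(mlt[r * coeffs_per_subband:(r + 1) * coeffs_per_subband])
        for r in range(num_subbands)
    ]


example = subband_envelope([3.0, -3.0, 3.0, -3.0])   # 3.0
```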
  • Step 505 Allocate a number of bits to each subband of the core layer signal according to the envelope value of each subband of the core layer signal.
  • The bit allocation algorithm of G.722.1 or G.729EV can be used to allocate the number of bits to each core layer subband.
  • Step 506 Quantize and encode each sub-band signal of the core layer signal according to the number of bits of each sub-band of the core layer signal to obtain encoded data of each sub-band of the core layer, and then perform step 510.
  • Step 507 Perform weighting calculation on the importance of each subband of the enhancement layer signal according to the auditory perception model and the envelope value of each subband of the enhancement layer signal.
  • According to the auditory perception model, the present invention computes the importance weighting of the enhancement layer signal as follows: for a subband to which the human ear is sensitive, the importance weighting result is the product of rms(r) and a larger weight value; for a subband to which the human ear is insensitive, it is the product of the subband's rms(r) and a smaller weight value. That is, the importance of each subband signal of the enhancement layer signal is determined by both the envelope value and the sensitivity of the human ear.
  • The subband importance weighting calculation can be expressed as:
  • ip(r) represents the importance of each subband signal of the enhancement layer signal.
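The weighting described above can be sketched in Python as a product of envelope and weight. The weight values here are made up for illustration; the text states only that subbands to which the ear is more sensitive receive larger weights.

```python
def importance(rms, weight):
    """ip(r) = weight(r) * rms(r) for each enhancement layer subband r."""
    return [w * e for w, e in zip(weight, rms)]


rms = [4.0, 10.0, 6.0]       # envelope values of three subbands (made up)
weight = [1.0, 0.25, 0.5]    # hypothetical weights; larger means more audible

ip = importance(rms, weight)  # [4.0, 2.5, 3.0]

# Rank subbands from most to least important.
by_importance = sorted(range(len(ip)), key=lambda r: ip[r], reverse=True)   # [0, 2, 1]
by_envelope = sorted(range(len(rms)), key=lambda r: rms[r], reverse=True)   # [1, 2, 0]
```

In this example the subband with the largest envelope (index 1) ranks last once the weights are applied, which is the difference from ranking by envelope value alone.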
  • Step 508 Allocate a number of bits to each subband signal of the enhancement layer signal according to the computed importance weighting result of each subband.
  • A subband signal of high importance is allocated more bits, and a subband signal of low importance is allocated fewer bits.
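The text does not specify the allocation rule beyond "more important, more bits". One simple possibility, shown here purely as an assumption and not as the patent's algorithm, is to split the bit budget in proportion to ip(r) and hand out the rounding leftovers by largest fractional part. The function name `allocate_bits` and all numbers are hypothetical.

```python
def allocate_bits(ip, total_bits):
    """Split total_bits across subbands in proportion to their importance.

    Assumed scheme (largest-remainder rounding); importance values must be >= 0.
    """
    total_ip = sum(ip)
    if total_ip <= 0:
        return [0] * len(ip)
    exact = [total_bits * v / total_ip for v in ip]
    bits = [int(x) for x in exact]                 # floor of each exact share
    leftover = total_bits - sum(bits)
    # Give the remaining bits to the subbands with the largest fractional parts.
    by_fraction = sorted(range(len(ip)), key=lambda r: exact[r] - bits[r], reverse=True)
    for r in by_fraction[:leftover]:
        bits[r] += 1
    return bits


bits = allocate_bits([4.0, 2.5, 3.0], total_bits=640)   # [270, 168, 202]
```

Whatever rule is used, the decoder must run the same allocation from the same envelope values and weights, since steps 607 and 608 rely on recomputing the per-subband bit counts.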
  • Step 509 Quantize and encode each subband signal of the enhancement layer signal according to the number of bits of each subband signal of the enhancement layer, to obtain coded data of each subband of the enhancement layer.
  • Step 510 Multiplex and pack the encoded data of the envelope value of each subband, the encoded data of each core layer subband, and the encoded data of each enhancement layer subband, and then transmit the result to the decoding end.
  • See FIG. 3 for a schematic diagram of the audio stream structure after multiplexing and packing.
  • the multiplexing and packing methods are the same as described in the bitstream multiplexing and packing module 260.
  • FIG. 6 is a flowchart of a layered decoding method according to an embodiment of the present invention. This embodiment is a process for decoding a code stream obtained by encoding in FIG. 5, and includes the following steps:
  • Step 601 Demultiplex the encoded data transmitted by the encoding end into core layer encoded data, subband envelope value encoded data, and enhancement layer encoded data.
  • The core layer encoded data is a single block made up of the encoded data of several core layer subbands, and the enhancement layer encoded data is a single block made up of the encoded data of several enhancement layer subbands.
  • Step 602 Decode each sub-band envelope value encoded data to obtain an envelope value of each sub-band, and then perform step 603 and step 606.
  • the sub-band envelope values obtained by decoding the sub-band envelope value encoded data include an envelope value of each sub-band of the core layer and an envelope value of each sub-band of the enhancement layer.
  • Step 603 Allocate a number of bits to each subband of the core layer encoded data according to the envelope value of each subband of the core layer encoded data.
  • Step 604 Extract each sub-band encoded data of the core layer coded data according to the number of bits occupied by each subband of the core layer coded data.
  • The core layer encoded data is a single block made up of the encoded data of several core layer subbands; extraction decomposes it into the encoded data of each core layer subband.
  • Step 605 Perform inverse quantization and decoding on the extracted encoded data of each core layer subband to obtain the decompressed MLT coefficients of each core layer subband, and then perform step 610.
  • Step 606 Perform weighting calculation on the importance of each subband of the enhancement layer encoded data according to the auditory perception model and the envelope value of each subband of the enhancement layer.
  • Step 607 Allocate bit numbers for the encoded data of each subband of the enhancement layer according to the importance of each subband of the enhancement layer encoded data.
  • Step 608 In order of the importance of each subband of the enhancement layer encoded data, extract the encoded data of each enhancement layer subband according to the number of bits that subband occupies.
  • Step 609 Perform inverse quantization and decoding on the encoded data of each subband of the extracted enhancement layer to obtain a decompressed enhancement layer MLT coefficient.
  • the enhancement layer encoded data is inverse quantized and decoded by the inverse of the quantization and encoding in the encoding process, and 20 MLT coefficients of each subband are obtained.
  • Step 610 Rearrange the MLT coefficients of the core layer and enhancement layer subbands in order of frequency.
  • the MLT coefficients of each sub-band of the core layer and the enhancement layer are arranged in order of frequency from small to large.
  • Some enhancement layer subbands may have no MLT coefficients, because the data of less important enhancement layer subbands may have been discarded during encoding or transmission to adapt to network conditions; for example, the encoded data of some less important enhancement layer subbands may be dropped during multiplexing and packing in the encoding process.
  • The missing enhancement layer MLT coefficients can be reconstructed according to the envelope values of the enhancement layer subbands.
  • The reconstruction method is as follows: the sign of each MLT coefficient is chosen at random and may be positive or negative, and the envelope value of the subband multiplied by a proportional constant is used as the amplitude of the MLT coefficient.
  • The proportional constant is determined according to the auditory perception model: a subband signal to which the human ear is highly sensitive has a large proportional constant, and a signal to which the human ear is less sensitive has a small proportional constant.
  • Table 1 lists the proportional constant corresponding to each subband in this embodiment; its columns are the subband index and the proportional constant for MLT coefficient reconstruction.
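The reconstruction rule can be sketched in Python as follows. The function name and the numeric values are hypothetical; the actual proportional constants are those of Table 1, which are not reproduced in this text.

```python
import random


def reconstruct_subband(envelope, constant, count=20, rng=random):
    """Rebuild the MLT coefficients of a lost enhancement layer subband.

    Each coefficient gets amplitude envelope * constant and a random sign.
    """
    amplitude = envelope * constant
    return [amplitude if rng.random() < 0.5 else -amplitude for _ in range(count)]


# Hypothetical values: envelope 2.0 and proportional constant 0.5.
rng = random.Random(0)   # seeded only so the sketch is repeatable
coeffs = reconstruct_subband(envelope=2.0, constant=0.5, rng=rng)
```

Every reconstructed coefficient has magnitude 1.0 here, so the subband's RMS equals the envelope value scaled by the proportional constant.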
  • Step 611 Perform inverse MLT on the MLT coefficients of each sub-band of the core layer and the enhancement layer to obtain a decompressed PCM signal.
  • Each inverse MLT operation processes 960 MLT coefficients, producing 960 time domain audio samples.
  • Inverse MLT can be decomposed into type IV DCT, window, overlap, and addition.
  • The type IV DCT produces the sequence u(n), for 0 ≤ n < 960.
  • y(n) denotes the PCM signal.
  • In the coding scheme of the embodiment of the present invention, the input signal is converted into MLT coefficients and divided into a core layer signal and an enhancement layer signal according to the auditory perception model; encoded data is then obtained from the core layer signal, the enhancement layer signal, and the auditory perception model, and is multiplexed and packed. When decoding, the importance of each enhancement layer subband is weighted according to the auditory perception model, and the resulting core layer MLT coefficients and enhancement layer MLT coefficients undergo the inverse MLT to output the decompressed code stream.
  • The embodiment of the present invention applies the MLT to the input signal and weights the importance of each enhancement layer subband according to the auditory perception model, which improves codec quality and solves the prior-art problem that input signals with a high sampling rate cannot be processed effectively.
  • The present invention uses neither QMF nor CELP coding, which reduces codec complexity and improves codec performance.

Abstract

A hierarchical encoding and decoding method includes the following steps: an input signal is converted into modulated lapped transform (MLT) coefficients and divided into a core layer signal and an enhancement layer signal according to an auditory perception model, and encoded data is obtained after multiplexing and packing; when the encoded data is decoded, the importance of each subband of the enhancement layer is weighted according to the auditory perception model, and the resulting core layer MLT coefficients and enhancement layer MLT coefficients undergo the inverse MLT to output a decoded stream.

Description

Layered Audio Encoding and Decoding Method and Apparatus
Technical Field
The present invention relates to codec technology, and in particular to a layered audio encoding and decoding method and apparatus.
Background of the Invention
With the rapid development of multimedia technology, audio coding is increasingly used in digital audio broadcasting, high-quality audio transmission over the Internet, digital cinema, and similar applications.
An important feature of an audio codec system is the ability to adapt to different application environments. Layered audio coding was developed to meet this need. Layering means that the audio signal is organized in layers: the signal is divided into a low-quality part, which is the core layer of the audio signal, and a high-quality part, which is the enhancement layer, and the low-quality part can be decoded without any information from the high-quality part. Layering is especially useful when the transmission channel cannot guarantee the full bandwidth needed to carry the complete signal. For example, when several users access the same audio over different communication links, a user on a high-speed link can play 384 kbit/s encoded surround sound in real time, while a user with only a 56 kbit/s modem cannot receive that audio at all. Once the audio signal is layered, users with high bandwidth enjoy high-quality audio, while a user connected at 56 kbit/s can download the core layer of the signal and listen to lower-quality audio.
FIG. 1a is a schematic structural diagram of a prior-art layered audio encoding apparatus. The apparatus includes quadrature mirror filterbanks (QMF) QMF101 and QMF102, a code excited linear prediction (CELP) encoding module 103, a CELP decoding module 104, an adder 105, modified discrete cosine transform (MDCT) modules 106 and 107, a time domain alias cancellation (TDAC) encoding module 108, a time domain bandwidth extension (TDBWE) module 109, and a bitstream multiplexing and packing module 110.
QMF101 filters the input pulse code modulation (PCM) signal and outputs the core layer signal.
The input to QMF101 is a PCM signal sampled at 16,000 Hz.
QMF102 filters the input PCM signal and outputs the enhancement layer signal.
After filtering by QMF101 and QMF102, the PCM signal is split into a core layer signal and an enhancement layer signal.
The CELP encoding module 103 performs CELP encoding on the core layer signal from QMF101 and passes the encoded data to the CELP decoding module 104 and the bitstream multiplexing and packing module 110.
The CELP decoding module 104 performs CELP decoding on the encoded data from the CELP encoding module 103 and passes the result to the adder 105.
The adder 105 subtracts the signal from the CELP decoding module 104 from the core layer signal supplied by QMF101 and passes the difference to the MDCT module 106.
The MDCT module 106 transforms the signal from the adder 105 from the time domain to the frequency domain and passes the resulting MDCT coefficients to the TDAC encoding module 108.
The MDCT module 107 transforms the enhancement layer signal from QMF102 from the time domain to the frequency domain and passes the resulting enhancement layer MDCT coefficients to the TDAC encoding module 108.
The TDAC encoding module 108 performs TDAC encoding on the MDCT coefficients from the MDCT module 106 and the enhancement layer MDCT coefficients from the MDCT module 107, and passes the encoded data to the bitstream multiplexing and packing module 110.
In TDAC encoding, the MDCT coefficients from 0 to 7000 Hz are divided into 18 subbands and the envelope value of each of the 18 subbands is calculated. Coding bits are allocated to each subband according to the size of its envelope value, and each subband is quantized and encoded with its allocated number of bits.
The TDBWE module 109 extracts high-frequency parameters from the enhancement layer signal supplied by QMF102 and passes them to the bitstream multiplexing and packing module 110.
The bitstream multiplexing and packing module 110 multiplexes and packs the encoded data from the CELP encoding module 103, the encoded data from the TDAC encoding module 108, and the data from the TDBWE module 109.
During packing, the encoded data is arranged in descending order of subband envelope value.
FIG. 1b is a schematic structural diagram of the prior-art layered audio decoding apparatus corresponding to FIG. 1a. The apparatus includes a bitstream demultiplexing module 120, a CELP decoding module 121, a TDAC decoding module 122, a TDBWE decoding module 123, an adder 124, inverse MDCT modules 125 and 126, QMF127, QMF128, and an adder 129.
The bitstream demultiplexing module 120 demultiplexes the received encoded data, passes the resulting core layer encoded data to the CELP decoding module 121, and passes the data of the other layers to the TDAC decoding module 122 and the TDBWE decoding module 123.
The CELP decoding module 121 decodes the received core layer encoded data and passes the result to the adder 124.
The TDAC decoding module 122 decodes the received encoded data and passes the result to the inverse MDCT module 125 and the inverse MDCT module 126.
The TDBWE decoding module 123 decodes the received encoded data and passes the result to [Figure imgf000005_0001].
The inverse MDCT module 125 converts the received frequency domain signal into a time domain signal and passes it to the adder 124.
The inverse MDCT module 126 converts the received frequency domain signal into a time domain signal and passes it to [Figure imgf000005_0002].
The adder 124 adds the core layer decoded data from the CELP decoding module 121 and the data from the inverse MDCT module 125, and passes the sum to QMF127.
QMF127 upsamples the received signal to obtain the core layer signal.
QMF128 upsamples the received signal to obtain the enhancement layer signal.
The adder 129 adds the core layer signal from QMF127 and the enhancement layer signal from QMF128 to obtain the decompressed PCM code stream.
The existing layered codec method has the following drawbacks:
1) In general, the human auditory system can perceive sound in the frequency range of 20 Hz to 20,000 Hz. The upper limit depends on the condition of each person's hearing and on the intensity of the sound, and the hearing of an ordinary person is most sensitive to sound between 2,000 Hz and 8,000 Hz. The prior art processes input signals sampled at 16,000 Hz, allocating coding bits according to the envelope value of each subband and placing the encoded data of subbands with large envelope values first as lower-layer information, and for such signals this is workable. For input signals sampled at 32,000 Hz, 44,100 Hz, or 48,000 Hz, however, this approach has a major drawback. For example, a subband near 16,000 Hz may have a large envelope value and yet not reach the threshold of human perception, so the ear is insensitive to it; allocating many bits to this subband leaves the truly important subbands without enough bits, which degrades coding quality. The approach can also place important subbands to which the ear is sensitive late in the code stream because their envelope values are small, so that they are discarded first when network conditions are poor, which degrades the listening experience. In short, the prior-art layered audio codec method cannot handle input signals with a high sampling frequency effectively.
2) The QMF used in the prior art increases both the complexity and the delay of the codec algorithm. The CELP coding applied to the core layer signal is designed around the characteristics of speech signals and is unsuitable for other types of low-frequency signals, which degrades codec performance.
Summary of the Invention
In view of this, one object of the embodiments of the present invention is to provide a layered audio encoding apparatus that effectively improves encoding quality.
Another object of the embodiments of the present invention is to provide a layered audio decoding apparatus that effectively improves decoding quality.
A further object of the embodiments of the present invention is to provide a layered audio coding and decoding method that effectively improves codec quality.
To achieve the above objects, the technical solution of the present invention is implemented as follows:
A layered audio encoding apparatus, comprising: a layering module based on an auditory perception model, an auditory perception model, a subband envelope calculation and coding module, a core layer coding module, an enhancement layer coding module, and a bitstream multiplexing and packing module;
the layering module based on the auditory perception model transforms the input signal into MLT coefficients by a modulated lapped transform (MLT) and then divides the coefficients into a core layer signal and an enhancement layer signal according to the auditory perception model;
the auditory perception model provides the layering basis for the layering module based on the auditory perception model, and provides the basis for the subband importance weighting of the enhancement layer coding module;
the subband envelope calculation and coding module calculates the envelope value of each subband of the core layer signal and the enhancement layer signal input by the layering module based on the auditory perception model; sends the core layer signal and the envelope values of its subbands to the core layer coding module; sends the enhancement layer signal and the envelope values of its subbands to the enhancement layer coding module; and encodes the subband envelope values and sends the resulting coded data to the bitstream multiplexing and packing module;
the core layer coding module encodes the input core layer signal according to the envelope values of its subbands and sends the result to the bitstream multiplexing and packing module;
the enhancement layer coding module encodes the input enhancement layer signal according to the auditory perception model and the envelope values of the subbands of the enhancement layer signal, and sends the result to the bitstream multiplexing and packing module;
the bitstream multiplexing and packing module multiplexes and packs the coded data of the core layer subbands input by the core layer coding module, the coded data of the enhancement layer subbands input by the enhancement layer coding module, and the coded subband envelope values input by the subband envelope calculation and coding module.
A layered audio decoding apparatus, comprising: a bitstream demultiplexing module, a subband envelope decoding module, a core layer decoding module, an enhancement layer decoding module, an auditory perception model, and an MLT coefficient reconstruction and inverse transform module;
the bitstream demultiplexing module separates the received coded data into coded subband envelope values, core layer coded data and enhancement layer coded data, and sends them to the subband envelope decoding module;
the subband envelope decoding module decodes the coded subband envelope values to obtain the envelope value of each subband, sends the core layer coded data and the envelope values of the core layer subbands to the core layer decoding module, and sends the enhancement layer coded data and the envelope values of the enhancement layer subbands to the enhancement layer decoding module;
the core layer decoding module decodes the input core layer coded data according to the envelope values of the core layer subbands, obtains the decompressed MLT coefficients of the core layer subbands, and sends them to the MLT coefficient reconstruction and inverse transform module;
the enhancement layer decoding module decodes the input enhancement layer coded data according to the auditory perception model and the envelope values of the enhancement layer subbands, obtains the decompressed MLT coefficients of the enhancement layer subbands, and sends the MLT coefficients and the envelope values of the enhancement layer subbands to the MLT coefficient reconstruction and inverse transform module;
the auditory perception model provides the basis for the subband importance weighting of the enhancement layer decoding module;
the MLT coefficient reconstruction and inverse transform module applies the inverse transform to the MLT coefficients of the core layer subbands and of the enhancement layer subbands to obtain the decompressed output signal.
A layered audio coding and decoding method, comprising:
applying an MLT to the input signal, dividing the result into a core layer signal and an enhancement layer signal according to an auditory perception model, and obtaining the coded data of the subband envelope values from the core layer signal and the enhancement layer signal;
obtaining the coded data of the core layer subbands from the core layer signal and the envelope values of its subbands; obtaining the coded data of the enhancement layer subbands from the enhancement layer signal, the auditory perception model and the envelope values of the subbands of the enhancement layer signal; and multiplexing and packing together the coded data of the subband envelope values, the coded data of the core layer subbands and the coded data of the enhancement layer subbands before sending them to the decoder.
As can be seen from the above solution, the layered audio coding and decoding scheme of the embodiments of the present invention applies an MLT to the input signal, produces the multiplexed and packed data according to the auditory perception model, and sends it to the decoder. This improves codec quality and solves the prior-art problem that input signals with high sampling rates cannot be handled effectively.
Brief Description of the Drawings
Fig. 1a is a schematic structural diagram of a layered audio encoding apparatus in the prior art;
Fig. 1b is a schematic structural diagram of the prior-art layered audio decoding apparatus corresponding to Fig. 1a;
Fig. 2 is a schematic structural diagram of a layered encoding apparatus according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the structure of the audio bitstream after the multiplexing and packing in Fig. 2;
Fig. 4 is a schematic structural diagram of a layered decoding apparatus according to an embodiment of the present invention;
Fig. 5 is a flowchart of a layered encoding method according to an embodiment of the present invention;
Fig. 6 is a flowchart of a layered decoding method according to an embodiment of the present invention.
Mode for Carrying Out the Invention
To make the objects, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the embodiments and the accompanying drawings.
Fig. 2 is a schematic structural diagram of a layered encoding apparatus according to an embodiment of the present invention. The apparatus includes a layering module 210 based on an auditory perception model, an auditory perception model 220, a subband envelope calculation and coding module 230, a core layer coding module 240, an enhancement layer coding module 250, and a bitstream multiplexing and packing module 260.
The layering module 210 based on the auditory perception model transforms the PCM signal into MLT coefficients by MLT and then, according to the auditory perception model 220, divides the coefficients into a core layer signal and an enhancement layer signal. It includes an MLT module 211, a subband division module 212 and a band importance layering module 213.
The MLT module 211 applies an MLT to the input PCM signal to obtain MLT coefficients.
The subband division module 212 divides the MLT coefficients of each frame into multiple equally spaced subbands, or divides the MLT coefficients of each frame into multiple non-equally spaced subbands according to the auditory perception model 220.
The division into non-equally spaced subbands works as follows: according to the auditory perception model 220, the MLT coefficients are divided into multiple subbands of unequal width, where the bandwidth of a subband depends on its position in the spectrum. Low-frequency MLT coefficients are grouped into narrower subbands, and high-frequency MLT coefficients into wider subbands.
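By way of illustration only, the non-equal-interval division can be sketched as follows. The doubling rule and the parameter names `first_width` and `bands_per_width` are assumptions made for this sketch; the embodiment only requires narrower subbands at low frequencies and wider subbands at high frequencies.

```python
def split_nonuniform(coeffs, first_width=4, bands_per_width=2):
    """Split one frame of MLT coefficients into subbands that widen with frequency.

    Hypothetical rule: the subband width doubles after every
    `bands_per_width` subbands, starting from `first_width` coefficients.
    """
    subbands = []
    start, width, count = 0, first_width, 0
    while start < len(coeffs):
        subbands.append(coeffs[start:start + width])
        start += width
        count += 1
        if count % bands_per_width == 0:
            width *= 2  # higher frequencies get wider subbands
    return subbands


frame = list(range(40))            # stand-in for 40 MLT coefficients
bands = split_nonuniform(frame)
widths = [len(b) for b in bands]   # [4, 4, 8, 8, 16]
```

An actual implementation would more likely derive the band edges from a critical-band scale of the auditory perception model than from a fixed doubling rule.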
The band importance layering module 213, according to the auditory perception model 220, separates the MLT coefficients that have been divided into subbands into a core layer signal containing the sensitive signal components and an enhancement layer signal containing the less sensitive components.
Here, according to the auditory perception model 220, the MLT coefficients in the frequency range to which the human ear is sensitive are assigned to the core layer signal, and the MLT coefficients in the range to which the ear is less sensitive are assigned to the enhancement layer signal. For example, since the auditory perception model indicates that the ear is more sensitive to signals in the range of 2,000 Hz to 8,000 Hz, the MLT coefficients in the range of 0 Hz to 8,000 Hz can be assigned to the core layer signal and the MLT coefficients above 8,000 Hz to the enhancement layer signal. Both the core layer signal and the enhancement layer signal comprise multiple subbands.
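The 8,000 Hz example can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `split_layers`, the frame size of 640 coefficients and the rule that a subband belongs to the core layer when its upper edge does not exceed the cutoff are all hypothetical; the embodiment only gives the 0 Hz to 8,000 Hz split as an example.

```python
def split_layers(subbands, sample_rate, cutoff_hz=8000.0):
    """Assign subbands to the core or enhancement layer by frequency.

    The MLT coefficients of one frame are assumed to cover 0..sample_rate/2
    uniformly, so each coefficient spans (sample_rate / 2) / n_coeffs Hz.
    """
    n_coeffs = sum(len(b) for b in subbands)
    hz_per_coeff = (sample_rate / 2) / n_coeffs
    core, enhancement = [], []
    start = 0
    for band in subbands:
        upper_edge_hz = (start + len(band)) * hz_per_coeff
        (core if upper_edge_hz <= cutoff_hz else enhancement).append(band)
        start += len(band)
    return core, enhancement


# 32 kHz sampling, 640 coefficients per frame, 20 coefficients (500 Hz) per subband
subbands = [[0.0] * 20 for _ in range(32)]
core, enh = split_layers(subbands, sample_rate=32000)
```

With these numbers the 16 subbands below 8,000 Hz form the core layer and the remaining 16 form the enhancement layer.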
The auditory perception model 220 provides the basis for the non-equal-interval division of MLT coefficients in the subband division module 212, for the subband layering in the band importance layering module 213, and for the subband importance weighting in the subband importance weighting module 251.
The subband envelope calculation and coding module 230 calculates the envelope value of each subband of the core layer signal and the enhancement layer signal input by the band importance layering module 213. It sends the core layer signal and the envelope values of its subbands to the core layer coding module 240, and sends the enhancement layer signal and the envelope values of its subbands to the enhancement layer coding module 250. It also encodes the subband envelope values and sends the coded data to the bitstream multiplexing and packing module 260.
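The text does not define how the envelope value is computed. The sketch below assumes the root-mean-square amplitude of the MLT coefficients in each subband, which is a common choice in transform codecs; the function name is hypothetical.

```python
import math


def subband_envelopes(subbands):
    """Return one envelope value per subband (assumed here to be the RMS amplitude)."""
    return [math.sqrt(sum(c * c for c in band) / len(band)) for band in subbands]


envs = subband_envelopes([[3.0, 4.0], [0.0, 0.0], [1.0, -1.0]])
```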
The core layer coding module 240 encodes the input core layer signal according to the envelope values of the core layer subbands input by the subband envelope calculation and coding module 230, and sends the result to the bitstream multiplexing and packing module 260. It includes a subband bit allocation module 241 and a quantization and coding module 242.
The subband bit allocation module 241 receives the core layer signal and the envelope values of its subbands from the subband envelope calculation and coding module 230, allocates a number of bits to each subband according to those envelope values, and sends the bit count of each subband together with the core layer signal to the quantization and coding module 242.
The core layer signal comprises multiple subbands, that is, MLT coefficients that have been divided into multiple subbands.
The quantization and coding module 242 quantizes and encodes each subband of the input core layer signal according to the number of bits allocated to it, and sends the coded data of the core layer subbands to the bitstream multiplexing and packing module 260.
The enhancement layer coding module 250 encodes the input enhancement layer signal according to the auditory perception model 220 and the envelope values of the enhancement layer subbands input by the subband envelope calculation and coding module 230, and sends the result to the bitstream multiplexing and packing module 260. It includes a subband importance weighting module 251, a subband bit allocation module 252 and a quantization and coding module 253.
The subband importance weighting module 251 receives the enhancement layer signal and the envelope values of its subbands from the subband envelope calculation and coding module 230, computes a weighted importance for each subband of the enhancement layer signal from those envelope values and the auditory perception model 220, and sends the weighted importance results together with the enhancement layer signal to the subband bit allocation module 252.
Because the enhancement layer signal lies at higher frequencies and spans a wider band, the importance of a signal component depends not only on its envelope value but also on how sensitive the human ear is to it. The embodiment therefore weights the enhancement layer signal according to the auditory perception model 220: for a subband to which the ear is sensitive, the weighted importance is the product of the subband's envelope value and a larger weight; for a subband to which the ear is less sensitive, it is the product of the envelope value and a smaller weight. In other words, in the prior art the importance of each enhancement layer subband is determined by the envelope value alone, whereas in the embodiments of the present invention it is determined jointly by the envelope value and the sensitivity of the human ear.
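A minimal sketch of this weighting is given below. The two weight values and the per-subband boolean flag `ear_sensitive` are assumptions chosen for illustration; the text only states that sensitive subbands receive a larger weight than less sensitive ones.

```python
def importance(envelopes, ear_sensitive, high_weight=2.0, low_weight=0.5):
    """Weighted importance = envelope value * perceptual weight (weights are hypothetical)."""
    return [
        env * (high_weight if sensitive else low_weight)
        for env, sensitive in zip(envelopes, ear_sensitive)
    ]


# The second subband has the largest envelope but is less audible,
# so it no longer ranks first after weighting.
scores = importance([10.0, 12.0, 3.0], [True, False, True])  # [20.0, 6.0, 6.0]
```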
The subband bit allocation module 252 receives the weighted importance results of the enhancement layer subbands and the enhancement layer signal from the subband importance weighting module 251, allocates a number of bits to each subband of the enhancement layer signal according to its weighted importance, and sends the bit count of each subband together with the enhancement layer signal to the quantization and coding module 253.
Based on the weighted importance of the enhancement layer subbands, subbands of high importance are allocated more bits and subbands of low importance are allocated fewer bits.
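The allocation rule itself is not specified. One possible sketch, assuming a simple proportional split of a fixed bit budget with the rounding remainder handed to the most important subbands, is shown below; the function name and the budget of 100 bits are hypothetical.

```python
def allocate_bits(importance, total_bits):
    """Give each subband a share of total_bits proportional to its importance."""
    total = sum(importance)
    if total <= 0:
        return [0] * len(importance)
    bits = [int(total_bits * w / total) for w in importance]  # rounded down
    leftover = total_bits - sum(bits)
    # Hand the bits lost to rounding to the most important subbands first.
    order = sorted(range(len(importance)), key=lambda i: importance[i], reverse=True)
    for i in order[:leftover]:
        bits[i] += 1
    return bits


bits = allocate_bits([20.0, 8.0, 4.0], total_bits=100)  # [63, 25, 12]
```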
The quantization and coding module 253 receives the bit counts of the enhancement layer subbands and the enhancement layer signal from the subband bit allocation module 252, quantizes and encodes each subband of the enhancement layer signal according to its bit count, and sends the coded data of the enhancement layer subbands to the bitstream multiplexing and packing module 260.
The bitstream multiplexing and packing module 260 multiplexes and packs the coded data of the core layer subbands input by the quantization and coding module 242, the coded data of the enhancement layer subbands input by the quantization and coding module 253, and the coded subband envelope values input by the subband envelope calculation and coding module 230.
Here, the coded subband envelope values input by the subband envelope calculation and coding module 230 include the envelope values corresponding to the subbands of the core layer signal and the envelope values corresponding to the subbands of the enhancement layer signal.
Fig. 3 is a schematic diagram of the structure of the audio bitstream after the multiplexing and packing in Fig. 2. The bitstream consists of a core part and an enhancement part. The core part contains the frame header, the coded data of the subband envelope values, and the core layer coded data. The core layer coded data, shown as the layer 0 coded data in the figure, is formed by arranging the coded data of the core layer subbands in order of frequency from low to high. The enhancement part consists of the enhancement layer coded data, divided into the layer 1 to layer N coded data shown in the figure.
The coded data of the enhancement layer subbands is placed into the bitstream in descending order of importance. Before the coded data of a given enhancement layer subband is placed into the bitstream, the sum of the number of bits already used in the current frame and the number of bits of that subband is computed and compared with the total number of bits available in the frame. If the sum is less than or equal to the total, the coded data of that subband is placed into the bitstream, the used bit count is updated to the previous count plus the number of bits of that subband, and the process continues with the next subband. Otherwise, no further subband coded data is placed, and the remaining available bits are filled with a preset value such as "1" or "0"; that is, the coded data of that subband and of all less important subbands is discarded.
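The placement procedure can be sketched as follows, under the simplifying assumption that each subband's coded data is represented as a string of "0" and "1" characters. The function and parameter names are hypothetical, and the sketch returns only the enhancement part of the frame.

```python
def pack_enhancement(subband_bits, importance, frame_bits, used_bits, fill="0"):
    """Place enhancement subbands into the frame in descending order of importance.

    subband_bits: coded data of each subband as a bit string.
    used_bits:    bits already consumed by the core part of the frame.
    Stops at the first subband that does not fit and pads the rest with `fill`.
    """
    stream = []
    order = sorted(range(len(subband_bits)), key=lambda i: importance[i], reverse=True)
    for i in order:
        if used_bits + len(subband_bits[i]) > frame_bits:
            break  # this subband and all less important ones are discarded
        stream.append(subband_bits[i])
        used_bits += len(subband_bits[i])
    stream.append(fill * (frame_bits - used_bits))
    return "".join(stream)


payload = pack_enhancement(
    ["1111", "101", "11011"], [6.0, 20.0, 9.0], frame_bits=20, used_bits=10
)
```

In this example the subbands with importance 20.0 and 9.0 fit into the 10 remaining bits, the subband with importance 6.0 is dropped, and the last two bits are padding.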
Fig. 4 is a schematic structural diagram of a layered decoding apparatus according to an embodiment of the present invention. The apparatus includes a bitstream demultiplexing module 410, a subband envelope decoding module 420, a core layer decoding module 430, an enhancement layer decoding module 440, an auditory perception model 450, and an MLT coefficient reconstruction and inverse transform module 460.
The bitstream demultiplexing module 410 demultiplexes the received coded data into coded subband envelope values, core layer coded data and enhancement layer coded data, and sends them to the subband envelope decoding module 420.
The core layer coded data is a single block made up of the coded data of multiple core layer subbands, and the enhancement layer coded data is a single block made up of the coded data of multiple enhancement layer subbands.
The subband envelope decoding module 420 receives the core layer coded data, the coded subband envelope values and the enhancement layer coded data from the bitstream demultiplexing module 410 and decodes the coded subband envelope values to obtain the envelope value of each subband. It then sends the core layer coded data and the envelope values of the core layer subbands to the core layer decoding module 430, and sends the enhancement layer coded data and the envelope values of the enhancement layer subbands to the enhancement layer decoding module 440.
The subband envelope values obtained by decoding the coded subband envelope values include the envelope values of the core layer subbands and the envelope values of the enhancement layer subbands.
The core layer decoding module 430 receives the core layer coded data and the envelope values of the core layer subbands from the subband envelope decoding module 420, decodes the core layer coded data according to those envelope values, obtains the decompressed MLT coefficients of the core layer subbands, and sends them to the MLT coefficient reconstruction and inverse transform module 460. It includes a subband bit allocation module 431, a subband data extraction module 432, and an inverse quantization and decoding module 433.
The subband bit allocation module 431 receives the core layer coded data and the envelope values of the core layer subbands from the subband envelope decoding module 420, allocates a number of bits to each subband according to those envelope values, and sends the bit counts of the core layer subbands together with the core layer coded data to the subband data extraction module 432.
The subband data extraction module 432 receives the bit counts of the core layer subbands and the core layer coded data from the subband bit allocation module 431, extracts the coded data of each subband from the core layer coded data according to the number of bits each subband occupies, and sends the coded data of the core layer subbands to the inverse quantization and decoding module 433.
The core layer coded data input from the subband bit allocation module 431 is a single block containing the coded data of multiple core layer subbands; the subband data extraction module 432 outputs it as the separate coded data of each core layer subband.
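Since the core layer subbands are stored in order of frequency, the extraction amounts to cutting the block at the boundaries given by the per-subband bit counts. The sketch below assumes the coded data is held as a bit string; the function name is hypothetical.

```python
def extract_subbands(layer_bits, bits_per_subband):
    """Cut one block of coded data into per-subband pieces using the allocated bit counts."""
    pieces, pos = [], 0
    for n in bits_per_subband:
        pieces.append(layer_bits[pos:pos + n])
        pos += n
    return pieces


pieces = extract_subbands("1011101100", [3, 5, 2])  # ["101", "11011", "00"]
```

This works only because the decoder derives the same bit counts from the decoded envelope values as the encoder did, so no explicit boundaries need to be transmitted.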
The inverse quantization and decoding module 433 receives the coded data of the core layer subbands from the subband data extraction module 432, performs inverse quantization and decoding on it to obtain the decompressed MLT coefficients of the core layer subbands, and sends them to the MLT coefficient reconstruction and inverse transform module 460.
The enhancement layer decoding module 440 receives the enhancement layer coded data and the envelope values of the enhancement layer subbands from the subband envelope decoding module 420, decodes the enhancement layer coded data according to those envelope values and the auditory perception model 450, obtains the decompressed MLT coefficients of the enhancement layer subbands, and sends the MLT coefficients and the envelope values of the enhancement layer subbands to the MLT coefficient reconstruction and inverse transform module 460. It includes a subband importance weighting module 441, a subband bit allocation module 442, a subband data extraction module 443, and an inverse quantization and decoding module 444.
The subband importance weighting module 441 receives the enhancement layer coded data and the envelope values of the enhancement layer subbands from the subband envelope decoding module 420, computes a weighted importance for each subband of the enhancement layer coded data from those envelope values and the auditory perception model 450, and sends the weighted importance results, the enhancement layer coded data and the envelope values of the enhancement layer subbands to the subband bit allocation module 442.
Because the enhancement layer signal lies at higher frequencies and spans a wider band, the importance of a signal component depends not only on its envelope value but also on how sensitive the human ear is to it. The present invention therefore weights the enhancement layer signal according to the auditory perception model 450: for a subband to which the ear is sensitive, the weighted importance is the product of the subband's envelope value and a larger weight; for a subband to which the ear is less sensitive, it is the product of the envelope value and a smaller weight. The larger the resulting value, the greater the importance; the smaller the value, the lower the importance.
In other words, in the prior art the importance of an enhancement layer subband is determined by the envelope value alone, whereas in the present invention it is determined jointly by the envelope value and the sensitivity of the human ear.
The subband bit allocation module 442 receives the weighted importance results of the subbands of the enhancement layer coded data, the enhancement layer coded data and the envelope values of the enhancement layer subbands from the subband importance weighting module 441. It allocates a number of bits to the coded data of each enhancement layer subband according to its weighted importance, and sends the weighted importance results, the bit counts of the subbands, the enhancement layer coded data and the envelope values of the enhancement layer subbands to the subband data extraction module 443.
The subband data extraction module 443 receives the weighted importance results of the subbands of the enhancement layer coded data, the bit counts of the enhancement layer subbands, the enhancement layer coded data and the envelope values of the enhancement layer subbands from the subband bit allocation module 442. In descending order of subband importance, and according to the number of bits each enhancement layer subband occupies, it extracts the coded data of each subband from the enhancement layer coded data, and sends the coded data and the envelope values of the enhancement layer subbands to the inverse quantization and decoding module 444.
从子带比特分配模块 442输入的增强层编码数据为包括多个增强层 子带编码数据的一个整体, 经子带数据提取模块 443后输出为增强层各 个子带的编码数据。  The enhancement layer coded data input from the subband bit allocation module 442 is an entirety including a plurality of enhancement layer subband coded data, and is outputted by the subband data extraction module 443 as coded data of each subband of the enhancement layer.
按照增强编码数据各子带数据的重要性从大到小的顺序, 根据相应 各子带所占的比特位数, 提取出增强层编码数据的各子带编码数据。 提 取数据时, 首先计算出已提取的所在帧的码流的比特位数和即将提取的 增强层编码数据的某一子带编码数据所占比特位数的和, 然后与所在帧 的码流的总比特位数相比较, 如果大于总比特位数, 则停止提取数据; 否则提取所述某一子带的编码, 将已提取比特位数更新为之前已提取比 特位数与所述某一子带编码所占比特位的和, 继续提取增强层编码数据 的下一子带编码数据。  The sub-band encoded data of the enhancement layer coded data is extracted according to the number of bits occupied by the respective sub-bands in order of increasing importance of the sub-band data of the enhanced coded data. When extracting data, first calculate the sum of the bit number of the code stream of the extracted frame and the bit number of a certain sub-band coded data of the enhancement layer coded data to be extracted, and then with the code stream of the frame in which the frame is located. The total number of bits is compared, if it is greater than the total number of bits, the data is stopped; otherwise, the encoding of the certain subband is extracted, and the number of extracted bits is updated to the previously extracted bit number and the certain sub The sum of the bits occupied by the code continues to extract the next sub-band coded data of the enhancement layer coded data.
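The extraction loop described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name, the dictionary-based inputs, and the parameter names `bits_already_extracted` and `frame_total_bits` are all hypothetical, and only sub-band indices (not the actual bit fields) are returned.

```python
from typing import Dict, List


def extract_subbands(importance: Dict[int, float],
                     bits_per_subband: Dict[int, int],
                     bits_already_extracted: int,
                     frame_total_bits: int) -> List[int]:
    """Return the sub-band indices that fit in the frame, most important first.

    All names here are illustrative assumptions; the source only describes
    the comparison against the frame's total bit count.
    """
    extracted: List[int] = []
    used = bits_already_extracted
    # Visit sub-bands in descending order of weighted importance.
    for band in sorted(importance, key=importance.get, reverse=True):
        needed = bits_per_subband[band]
        if used + needed > frame_total_bits:
            break  # the frame holds no further sub-band data: stop extracting
        extracted.append(band)
        used += needed  # update the extracted-bit count
    return extracted
```

Note that the loop stops at the first sub-band that does not fit rather than skipping it, which mirrors the "stop extracting" wording of the description.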
The inverse quantization and decoding module 444 receives, from the sub-band data extraction module 443, the encoded data and envelope values of the enhancement layer sub-bands. It inverse-quantizes and decodes the encoded data of each enhancement layer sub-band to obtain the decompressed MLT coefficients of each enhancement layer sub-band, and passes these MLT coefficients and the envelope values of the enhancement layer sub-bands to the MLT coefficient reconstruction and inverse transform module 460.
The auditory perception model 450 provides the basis for the sub-band importance weighting performed by the sub-band importance weighting module 441. If the data of some less important enhancement layer sub-bands was dropped during encoding or transmission to adapt to network conditions, the model also provides the MLT coefficient reconstruction module 461 with the basis for reconstructing the lost enhancement layer MLT coefficients.
The MLT coefficient reconstruction and inverse transform module 460 receives the MLT coefficients of the core layer sub-bands from the inverse quantization and decoding module 433, and the MLT coefficients and envelope values of the enhancement layer sub-bands from the inverse quantization and decoding module 444. It applies the inverse transform to the MLT coefficients of the core layer and enhancement layer sub-bands to obtain the decompressed PCM signal. It comprises an MLT coefficient reconstruction module 461 and an MLT inverse transform module 462.
The MLT coefficient reconstruction module 461 receives the MLT coefficients of the core layer sub-bands from the inverse quantization and decoding module 433, and the MLT coefficients and envelope values of the enhancement layer sub-bands from the inverse quantization and decoding module 444. Using the envelope values of the enhancement layer sub-bands, it rearranges the MLT coefficients of the core layer and enhancement layer sub-bands in frequency-band order and passes them to the MLT inverse transform module 462.
The rearranged MLT coefficients form a single block containing both the core layer MLT coefficients and the enhancement layer MLT coefficients.
The MLT coefficients of the core layer and enhancement layer sub-bands are arranged in ascending order of frequency. Some less important enhancement layer sub-bands may have been dropped during encoding or transmission to adapt to network conditions; for example, the bitstream multiplexing and packing module 260 may drop the encoded data of some less important enhancement layer sub-bands while multiplexing and packing. In that case, once the rearranged MLT coefficients have been obtained, the lost enhancement layer MLT coefficients can be compensated from the envelope values of the enhancement layer sub-bands. The compensation method is as follows: the sign of each MLT coefficient is chosen at random and may be positive or negative; the envelope value of the corresponding sub-band is multiplied by a proportional constant to give the magnitude of the MLT coefficient. The proportional constant is determined by the auditory perception model 450: it is larger for sub-band signals to which the human ear is more sensitive and smaller for signals to which the ear is less sensitive.
The MLT inverse transform module 462 receives the MLT coefficients from the MLT coefficient reconstruction module 461 and applies the inverse MLT to them to obtain the decompressed PCM signal.
FIG. 5 is a flowchart of the layered encoding method according to an embodiment of the present invention. In this embodiment, the input is a PCM signal sampled at 48 kHz, the frame length is 20 ms, the delay is 40 ms, and the bit rate ranges from 32 to 64 kbit/s, of which the core layer bit rate is 32 kbit/s and the layering step is 0.8 kbit/s. The method includes the following steps:
Step 501: apply the MLT to the PCM signal to transform it into MLT coefficients.
At a 48 kHz sampling rate, each 20 ms frame contains 960 samples, so the input to each MLT is the most recent 1920 samples x(n), where 0 ≤ n < 1920 and x(0) is the oldest sample.
The MLT outputs 960 MLT coefficients mlt(m), where 0 ≤ m < 960.
The MLT is given by:
Figure imgf000018_0001
The MLT can be decomposed into a window, overlap and add operation followed by a type IV discrete cosine transform (DCT, Discrete Cosine Transform). The window, overlap and add operation is performed as follows:
\[ v(n) = w(479-n)\,x(479-n) + w(480+n)\,x(480+n), \quad 0 \le n \le 479 \]
\[ v(n+480) = w(959-n)\,x(960+n) - w(n)\,x(1919-n), \quad 0 \le n \le 479 \]
where:
\[ w(n) = \sin\!\left(\frac{\pi}{1920}\,(n+0.5)\right), \quad 0 \le n < 960 \]
Combining v(n) with the type IV DCT gives the following expression for the MLT coefficients:
Figure imgf000019_0001
for 0 ≤ m < 960.
Step 502: divide the MLT coefficients of each frame into multiple equally spaced sub-bands or multiple unequally spaced sub-bands.
Here, the MLT coefficients in the 0 to 20 kHz band are divided at equal intervals into 40 sub-bands; each sub-band is 500 Hz wide and contains 20 MLT coefficients.
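The sub-band figures above follow from the frame parameters, and a short sketch makes the arithmetic explicit. It assumes that the 960 coefficients span 0 to 24 kHz uniformly (half the 48 kHz sampling rate), so that coefficients 800 to 959 (20 to 24 kHz) fall outside the 40 sub-bands; that last point is an inference from the numbers, not something the text states. The names used are hypothetical.

```python
SAMPLE_RATE_HZ = 48_000
COEFFS_PER_FRAME = 960
SUBBAND_WIDTH_HZ = 500
NUM_SUBBANDS = 40

# Assumption: coefficients are spaced uniformly over 0 .. SAMPLE_RATE_HZ / 2.
hz_per_coeff = (SAMPLE_RATE_HZ / 2) / COEFFS_PER_FRAME      # 25.0 Hz
coeffs_per_subband = int(SUBBAND_WIDTH_HZ / hz_per_coeff)   # 20


def split_into_subbands(mlt):
    """Split one frame of MLT coefficients into 40 equal-width sub-bands."""
    if len(mlt) != COEFFS_PER_FRAME:
        raise ValueError("expected one frame of 960 MLT coefficients")
    return [
        list(mlt[r * coeffs_per_subband:(r + 1) * coeffs_per_subband])
        for r in range(NUM_SUBBANDS)
    ]
```

Sub-band r then covers coefficients 20r to 20r + 19, that is, roughly 500r Hz to 500(r + 1) Hz.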
Step 503: according to the auditory perception model, divide the MLT coefficients into a core layer signal containing the sensitive signal and an enhancement layer signal containing the less sensitive signal.
According to the auditory perception model, the human ear is more sensitive to signals in the 2 to 8 kHz range. The 0 to 8 kHz range, that is, sub-bands 0 to 15, is therefore assigned to the core layer signal and allocated a bit rate of 32 kbit/s, and sub-bands 16 to 39 are assigned to the enhancement layer signal, which uses the remaining 32 kbit/s.
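The rate figures of this embodiment translate into per-frame bit budgets as sketched below. The sketch assumes 1 kbit/s = 1000 bit/s and does not model how the envelope side information is accounted for, since the text does not say; `frame_budget` and the constant names are hypothetical.

```python
FRAME_MS = 20
CORE_RATE_BPS = 32_000
MAX_RATE_BPS = 64_000
STEP_BPS = 800


def bits_per_frame(rate_bps: int) -> int:
    """Bits available in one 20 ms frame at the given bit rate."""
    return rate_bps * FRAME_MS // 1000


core_bits = bits_per_frame(CORE_RATE_BPS)                 # 640 bits per frame
max_bits = bits_per_frame(MAX_RATE_BPS)                   # 1280 bits per frame
step_bits = bits_per_frame(STEP_BPS)                      # 16 bits per frame
num_steps = (MAX_RATE_BPS - CORE_RATE_BPS) // STEP_BPS    # 40 layering steps


def frame_budget(steps_kept: int) -> int:
    """Frame size in bits when `steps_kept` enhancement steps are retained."""
    if not 0 <= steps_kept <= num_steps:
        raise ValueError("steps_kept must be between 0 and num_steps")
    return core_bits + steps_kept * step_bits
```

Under these assumptions, each 0.8 kbit/s step adds or removes 16 bits per frame on top of the 640-bit core layer.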
Step 504: from the core layer signal and the enhancement layer signal, calculate the envelope value of each sub-band of both signals, and encode the sub-band envelope values to obtain the encoded envelope data; then perform steps 505 and 507.
The sub-band envelope value is defined as the root mean square (RMS, Root Mean Square) of the MLT coefficients in that sub-band. The expression for rms(r) is:
Figure imgf000019_0002
After the envelope value of each sub-band has been calculated, the envelope values are encoded with variable length coding (VLC, Variable Length Code) or another coding method to obtain the encoded envelope data of each sub-band.
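The RMS definition of the envelope can be illustrated with a small sketch. Because the source gives the formula only as an image, the indexing used here (sub-band r covering coefficients 20r to 20r + 19) is an assumption based on the equal-width division of step 502; the function name is hypothetical.

```python
import math


def subband_rms(mlt, r, width=20):
    """Root mean square of the MLT coefficients in sub-band r.

    Assumes sub-band r occupies coefficients r*width .. (r+1)*width - 1.
    """
    band = mlt[r * width:(r + 1) * width]
    if len(band) != width:
        raise ValueError("sub-band lies outside the coefficient array")
    return math.sqrt(sum(c * c for c in band) / width)
```

For the 40 sub-bands of this embodiment, `[subband_rms(mlt, r) for r in range(40)]` would give the envelope values that step 504 goes on to encode.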
Step 505: allocate a number of bits to each sub-band of the core layer signal according to the envelope value of each core layer sub-band.
The bit allocation algorithm of G.722.1 or G.929EV may be used to allocate bits to the core layer sub-band signals.
Step 506: quantize and encode each sub-band signal of the core layer signal according to the number of bits of each core layer sub-band, to obtain the encoded data of each core layer sub-band; then perform step 510.
Step 507: according to the auditory perception model and the envelope value of each enhancement layer sub-band, perform a weighted calculation of the importance of each sub-band of the enhancement layer signal.
Because the enhancement layer signal is higher in frequency and wider in bandwidth, the importance of the signal depends not only on the envelope value but also on the sensitivity of the human ear to sound. The present invention therefore weights the enhancement layer signal according to the auditory perception model: for a sub-band to which the ear is sensitive, the importance weighting result is the product of that sub-band's rms(r) and a larger weight; for a sub-band to which the ear is less sensitive, the result is the product of that sub-band's rms(r) and a smaller weight. In other words, the importance of each sub-band signal of the enhancement layer is determined by its envelope value together with the ear's sensitivity.
The sub-band importance weighting can be expressed simply as:
\[ ip(r) = \begin{cases} rms(16+r) \times 1.67, & 0 \le r < 4 \\ rms(16+r) \times 1.33, & 4 \le r < 12 \\ rms(16+r), & 12 \le r < 24 \end{cases} \]
The magnitude of ip(r) indicates the importance of each sub-band signal of the enhancement layer signal.
Step 508: allocate a number of bits to each sub-band signal according to the calculated importance weighting result of each enhancement layer sub-band.
Bits are allocated to each sub-band signal of the enhancement layer signal according to the weighted importance calculated in step 507: more important sub-band signals are allocated more bits, and less important sub-band signals are allocated fewer bits.
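The importance weighting of step 507 can be sketched directly from the piecewise definition of ip(r). The weights 1.67 and 1.33 and the index offset 16 come from the text; the function names and the ranking helper are hypothetical, and the actual bit allocation rule of step 508 is not specified in the source, so only the ordering is shown.

```python
def importance(rms, r):
    """Weighted importance ip(r) of enhancement sub-band r (0 <= r < 24).

    `rms` holds the 40 sub-band envelope values; enhancement sub-band r
    corresponds to overall sub-band 16 + r.
    """
    if not 0 <= r < 24:
        raise ValueError("r must be in the range 0..23")
    value = rms[16 + r]
    if r < 4:
        return value * 1.67
    if r < 12:
        return value * 1.33
    return value


def rank_enhancement_subbands(rms):
    """Enhancement sub-band indices r, most important first."""
    return sorted(range(24), key=lambda r: importance(rms, r), reverse=True)
```

A sub-band with a smaller envelope can therefore outrank one with a larger envelope if it lies in a range to which the ear is more sensitive.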
Step 509: quantize and encode each sub-band signal of the enhancement layer signal according to the number of bits of each enhancement layer sub-band signal, to obtain the encoded data of each enhancement layer sub-band.
Step 510: multiplex and pack the encoded envelope data of each sub-band, the encoded data of each core layer sub-band and the encoded data of each enhancement layer sub-band, and transmit the result to the decoder.
FIG. 3 is a schematic diagram of the structure of the audio bitstream after multiplexing and packing. The multiplexing and packing method is the same as described for the bitstream multiplexing and packing module 260.
FIG. 6 is a flowchart of the layered decoding method according to an embodiment of the present invention. This embodiment is the process of decoding the bitstream produced by the encoding of FIG. 5, and includes the following steps:
Step 601: demultiplex the encoded data transmitted by the encoder into core layer encoded data, encoded sub-band envelope data and enhancement layer encoded data.
The core layer encoded data is a single block composed of the encoded data of multiple core layer sub-bands, and the enhancement layer encoded data is a single block composed of the encoded data of multiple enhancement layer sub-bands.
Step 602: decode the encoded sub-band envelope data to obtain the envelope value of each sub-band; then perform steps 603 and 606.
The sub-band envelope values obtained by decoding include the envelope values of the core layer sub-bands and the envelope values of the enhancement layer sub-bands.
Step 603: allocate a number of bits to each sub-band of the core layer encoded data according to the envelope value of each core layer sub-band.
Step 604: extract the encoded data of each sub-band of the core layer encoded data according to the number of bits occupied by each core layer sub-band.
The core layer encoded data is a single block composed of the encoded data of multiple core layer sub-bands; after extraction it is separated into the encoded data of each core layer sub-band.
Step 605: inverse-quantize and decode the extracted encoded data of each core layer sub-band to obtain the decompressed MLT coefficients of each core layer sub-band; then perform step 610.
Step 606: according to the auditory perception model and the envelope value of each enhancement layer sub-band, perform a weighted calculation of the importance of each sub-band of the enhancement layer encoded data.
Step 607: allocate a number of bits to the encoded data of each enhancement layer sub-band according to the importance of each sub-band of the enhancement layer encoded data.
Step 608: in descending order of the importance of the sub-band data of the enhancement layer encoded data, and using the number of bits occupied by each enhancement layer sub-band, extract the encoded data of each sub-band of the enhancement layer encoded data.
The method of this step is the same as described for the sub-band data extraction module 443 and is not repeated here.
Step 609: inverse-quantize and decode the extracted encoded data of each enhancement layer sub-band to obtain the decompressed enhancement layer MLT coefficients.
The enhancement layer encoded data is inverse-quantized and decoded by the reverse of the quantization and encoding used in the encoding process, yielding the 20 MLT coefficients of each sub-band.
Step 610: rearrange the MLT coefficients of the core layer and enhancement layer sub-bands in frequency order.
The MLT coefficients of the core layer and enhancement layer sub-bands are arranged in ascending order of frequency. Some less important enhancement layer sub-bands may have been dropped during encoding or transmission to adapt to network conditions; for example, the encoded data of some less important enhancement layer sub-bands may be dropped during multiplexing and packing in the encoding process. The lost enhancement layer MLT coefficients can be reconstructed from the envelope values of the enhancement layer sub-bands. The reconstruction method is as follows: the sign of each MLT coefficient is chosen at random and may be positive or negative; the envelope value of the sub-band is multiplied by a proportional constant to give the magnitude of the MLT coefficient. The proportional constant is determined by the auditory perception model: it is larger for sub-band signals to which the human ear is more sensitive and smaller for signals to which the ear is less sensitive. Table 1 lists the proportional constant for each sub-band in this embodiment.
Sub-band index    Proportional constant for MLT coefficient reconstruction
16-19             0.85
20-27             0.75
28-39             0.70
Table 1: Proportional constants for MLT coefficient reconstruction
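The reconstruction of a lost enhancement layer sub-band can be sketched with the constants of Table 1. The function names are hypothetical, and passing in a `random.Random` instance is only an illustrative way of choosing the random signs; the source does not say how the random signs are generated.

```python
import random


def reconstruction_constant(subband):
    """Proportional constant from Table 1 for an enhancement sub-band."""
    if 16 <= subband <= 19:
        return 0.85
    if 20 <= subband <= 27:
        return 0.75
    if 28 <= subband <= 39:
        return 0.70
    raise ValueError("Table 1 only covers sub-bands 16..39")


def reconstruct_lost_subband(subband, envelope, rng, width=20):
    """Fill a lost sub-band with random-sign coefficients of fixed magnitude.

    The magnitude is the decoded envelope value times the Table 1 constant;
    each sign is drawn independently from {+1, -1}.
    """
    magnitude = envelope * reconstruction_constant(subband)
    return [rng.choice((-1.0, 1.0)) * magnitude for _ in range(width)]
```

For example, `reconstruct_lost_subband(20, 2.0, random.Random())` yields 20 coefficients, each equal to plus or minus 1.5.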
Step 611: apply the inverse MLT to the MLT coefficients of the core layer and enhancement layer sub-bands to obtain the decompressed PCM signal.
Each inverse MLT operation processes 960 MLT coefficients and produces 960 time-domain audio samples. The inverse MLT can be decomposed into a type IV DCT followed by a window, overlap and add operation. The type IV DCT output u(n), for 0 ≤ n < 960, is given by:
Figure imgf000023_0001
The window, overlap and add operation uses half of the DCT output samples of the current frame and half of the DCT output samples of the previous frame:
\[ y(n) = w(n)\,u(479-n) + w(959-n)\,u\_old(n), \quad 0 \le n \le 479 \]
\[ y(n+480) = w(480+n)\,u(n) - w(479-n)\,u\_old(479-n), \quad 0 \le n \le 479 \]
where:
\[ w(n) = \sin\!\left(\frac{\pi}{1920}\,(n+0.5)\right), \quad 0 \le n \le 959 \]
The unused half of u(n) is stored as u_old for use in the next frame:
\[ u\_old(n) = u(n+480), \quad 0 \le n \le 479 \]
y(n) is the resulting PCM signal.
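The forward and inverse window, overlap and add equations can be checked with a small round-trip sketch. Two things here are assumptions rather than content of the source: the type IV DCT kernel, which the source gives only as an image and which is taken here to be the orthonormal form scaled by sqrt(2/N), and the generalization from the fixed frame length 960 to an arbitrary even length N so that the check runs quickly. All function names are hypothetical.

```python
import math


def window(n, N):
    """w(n) = sin(pi / (2N) * (n + 0.5)); for N = 960 this is pi / 1920."""
    return math.sin(math.pi / (2 * N) * (n + 0.5))


def dct_iv(v):
    """Orthonormal type IV DCT (assumed kernel; it is its own inverse)."""
    N = len(v)
    scale = math.sqrt(2.0 / N)
    return [scale * sum(v[n] * math.cos(math.pi / N * (n + 0.5) * (m + 0.5))
                        for n in range(N))
            for m in range(N)]


def mlt_forward(x):
    """x holds the newest 2N samples, x[0] oldest. Returns N coefficients."""
    N = len(x) // 2
    h = N // 2
    v = [0.0] * N
    for n in range(h):
        v[n] = window(h - 1 - n, N) * x[h - 1 - n] + window(h + n, N) * x[h + n]
        v[n + h] = (window(N - 1 - n, N) * x[N + n]
                    - window(n, N) * x[2 * N - 1 - n])
    return dct_iv(v)


def mlt_inverse(mlt, u_old):
    """Returns (N output samples, new u_old for the next frame)."""
    N = len(mlt)
    h = N // 2
    u = dct_iv(mlt)
    y = [0.0] * N
    for n in range(h):
        y[n] = window(n, N) * u[h - 1 - n] + window(N - 1 - n, N) * u_old[n]
        y[n + h] = (window(h + n, N) * u[n]
                    - window(h - 1 - n, N) * u_old[h - 1 - n])
    return y, u[h:]  # u_old(n) = u(n + N/2)


def round_trip(frames, frame_len):
    """Encode and decode consecutive frames, starting from silence."""
    prev = [0.0] * frame_len
    u_old = [0.0] * (frame_len // 2)
    outputs = []
    for frame in frames:
        coeffs = mlt_forward(prev + list(frame))
        y, u_old = mlt_inverse(coeffs, u_old)
        outputs.append(y)
        prev = list(frame)
    return outputs
```

Under these assumptions the output of each inverse transform reproduces the previous input frame, so the transform pair itself contributes one frame of delay.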
As the above embodiments show, the encoding scheme of the embodiments of the present invention transforms the input signal into MLT coefficients, divides them into a core layer signal and an enhancement layer signal according to the auditory perception model, and then obtains the multiplexed and packed encoded data from the core layer signal, the enhancement layer signal and the auditory perception model. During decoding, the importance of each enhancement layer sub-band is weighted according to the auditory perception model, and the inverse MLT is applied to the resulting core layer and enhancement layer MLT coefficients to output the decompressed stream. Compared with existing layered codec techniques, the embodiments of the present invention apply the MLT to the input signal and weight the importance of each enhancement layer sub-band according to the auditory perception model, which improves codec quality and solves the prior-art problem that high-sampling-rate input signals cannot be processed effectively. In addition, the present invention does not use QMF or CELP coding, which reduces codec complexity and improves the codec result.
The specific embodiments described above further explain the objectives, technical solutions and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A layered audio encoding apparatus, characterized in that the apparatus comprises: a layering module based on an auditory perception model, an auditory perception model, a sub-band envelope calculation and encoding module, a core layer encoding module, an enhancement layer encoding module, and a bitstream multiplexing and packing module;
the layering module based on the auditory perception model applies a modulated lapped transform (MLT) to the input signal to obtain MLT coefficients and then, according to the auditory perception model, divides them into a core layer signal and an enhancement layer signal;
the auditory perception model provides the layering basis for the layering module based on the auditory perception model, and provides the basis for the sub-band importance weighting of the enhancement layer encoding module;
the sub-band envelope calculation and encoding module calculates, from the core layer signal and the enhancement layer signal input by the layering module based on the auditory perception model, the envelope value of each sub-band of the core layer signal and the enhancement layer signal; sends the core layer signal and the envelope values of the core layer sub-bands to the core layer encoding module; sends the enhancement layer signal and the envelope values of the enhancement layer sub-bands to the enhancement layer encoding module; and encodes the sub-band envelope values and sends the encoded data to the bitstream multiplexing and packing module;
the core layer encoding module encodes the input core layer signal according to the input envelope values of the core layer sub-bands and sends the result to the bitstream multiplexing and packing module;
the enhancement layer encoding module encodes the input enhancement layer signal according to the auditory perception model and the input envelope values of the enhancement layer sub-bands and sends the result to the bitstream multiplexing and packing module;
the bitstream multiplexing and packing module multiplexes and packs the encoded data of the core layer sub-bands input by the core layer encoding module, the encoded data of the enhancement layer sub-bands input by the enhancement layer encoding module, and the encoded sub-band envelope data input by the sub-band envelope calculation and encoding module.
2. The apparatus according to claim 1, characterized in that the layering module based on the auditory perception model comprises: an MLT module, a sub-band division module and a band importance layering module; the MLT module applies the MLT to the input signal to obtain MLT coefficients; the sub-band division module divides the MLT coefficients of each frame into multiple equally spaced sub-bands; the band importance layering module, according to the auditory perception model, divides the MLT coefficients that have been divided into multiple sub-bands into a core layer signal and an enhancement layer signal.
3. The apparatus according to claim 1, characterized in that the auditory perception model provides the basis for the unequally spaced division of the MLT coefficients by the sub-band division module;
the layering module based on the auditory perception model comprises: an MLT module, a sub-band division module and a band importance layering module;
the MLT module applies the MLT to the input signal to obtain MLT coefficients; the sub-band division module divides the MLT coefficients of each frame into multiple unequally spaced sub-bands according to the auditory perception model;
the band importance layering module, according to the auditory perception model, divides the MLT coefficients that have been divided into multiple sub-bands into a core layer signal and an enhancement layer signal.
4. The apparatus according to claim 1, characterized in that the core layer encoding module comprises a sub-band bit allocation module and a quantization and encoding module;
the sub-band bit allocation module receives the core layer signal and the envelope values of the core layer sub-bands input by the sub-band envelope calculation and encoding module, allocates a number of bits to each sub-band according to the envelope values of the core layer sub-bands, and sends the bit-count information of each sub-band signal and the core layer signal to the quantization and encoding module;
the quantization and encoding module quantizes and encodes each sub-band signal of the input core layer signal according to the number of bits of each core layer sub-band, and sends the encoded data of the core layer sub-bands to the bitstream multiplexing and packing module.
5. The apparatus according to any one of claims 1 to 4, characterized in that the enhancement layer encoding module comprises a sub-band importance weighting module, a sub-band bit allocation module and a quantization and encoding module; the sub-band importance weighting module receives the enhancement layer signal and the envelope values of the enhancement layer sub-bands input by the sub-band envelope calculation and encoding module, performs a weighted calculation of the importance of each sub-band of the enhancement layer signal according to the input envelope values and the auditory perception model, and sends the calculated importance weighting results and the enhancement layer signal to the sub-band bit allocation module;
the sub-band bit allocation module allocates a number of bits to each sub-band signal according to the importance weighting results of the enhancement layer sub-bands, and sends the bit-count information of each sub-band signal and the enhancement layer signal to the quantization and encoding module;
the quantization and encoding module quantizes and encodes each sub-band signal of the enhancement layer signal according to the number of bits of each enhancement layer sub-band signal, and sends the encoded data of the enhancement layer sub-bands to the bitstream multiplexing and packing module.
6. A layered audio decoding device, characterized in that the device comprises: a bitstream demultiplexing module, a subband envelope decoding module, a core layer decoding module, an enhancement layer decoding module, an auditory perception model, and a modulated lapped transform (MLT) coefficient reconstruction and inverse transform module;
the bitstream demultiplexing module decomposes the received encoded data into subband envelope value encoded data, core layer encoded data and enhancement layer encoded data, and transmits them to the subband envelope decoding module;
the subband envelope decoding module decodes the subband envelope value encoded data to obtain the envelope value of each subband, then transmits the core layer encoded data and the envelope values of each subband of the core layer to the core layer decoding module, and transmits the enhancement layer encoded data and the envelope values of each subband of the enhancement layer to the enhancement layer decoding module;
the core layer decoding module decodes the input core layer encoded data according to the input envelope values of each subband of the core layer, obtains the decompressed MLT coefficients of each subband of the core layer, and transmits them to the MLT coefficient reconstruction and inverse transform module;
the enhancement layer decoding module decodes the input enhancement layer encoded data according to the auditory perception model and the input envelope values of each subband of the enhancement layer, obtains the decompressed MLT coefficients of each subband of the enhancement layer, and transmits the MLT coefficients and the envelope values of each subband of the enhancement layer to the MLT coefficient reconstruction and inverse transform module;
the auditory perception model provides the basis for the subband importance weighting performed by the enhancement layer decoding module;
the MLT coefficient reconstruction and inverse transform module performs an inverse transform on the MLT coefficients of each subband of the core layer and of the enhancement layer to obtain the decompressed output signal.
7. The device according to claim 6, characterized in that the core layer decoding module comprises a subband bit allocation module, a subband data extraction module, and an inverse quantization and decoding module;
the subband bit allocation module receives the core layer encoded data and the envelope values of each subband of the core layer from the subband envelope decoding module, allocates a number of bits to each subband according to the envelope values of each subband of the core layer, and transmits the bit allocation information of each subband of the core layer and the core layer encoded data to the subband data extraction module;
the subband data extraction module extracts the encoded data of each subband from the core layer encoded data according to the number of bits occupied by each subband of the core layer, and transmits the encoded data of each subband of the core layer to the inverse quantization and decoding module;
the inverse quantization and decoding module inversely quantizes and decodes the encoded data of each subband of the core layer, obtains the decompressed MLT coefficients of each subband of the core layer, and transmits them to the MLT coefficient reconstruction and inverse transform module.
8. The device according to claim 6 or 7, characterized in that the enhancement layer decoding module comprises a subband importance weighting module, a subband bit allocation module, a subband data extraction module, and an inverse quantization and decoding module;
the subband importance weighting module receives the enhancement layer encoded data and the envelope values of each subband of the enhancement layer from the subband envelope decoding module, performs a weighted calculation of the importance of each subband of the enhancement layer encoded data according to the envelope values of each subband of the enhancement layer and the auditory perception model, and transmits the resulting importance weighting result of each subband, the enhancement layer encoded data and the envelope values of each subband of the enhancement layer to the subband bit allocation module;
the subband bit allocation module allocates a number of bits to the encoded data of each subband of the enhancement layer encoded data according to the importance weighting result of each subband, and transmits the importance weighting result of each subband, the bit allocation information of the encoded data of each subband, the enhancement layer encoded data and the envelope values of each subband of the enhancement layer to the subband data extraction module;
the subband data extraction module extracts the encoded data of each subband from the enhancement layer encoded data in descending order of subband importance, according to the number of bits occupied by the corresponding subband, and transmits the encoded data and the envelope values of each subband of the enhancement layer to the inverse quantization and decoding module;
the inverse quantization and decoding module inversely quantizes and decodes the input encoded data of each subband of the enhancement layer, obtains the decompressed MLT coefficients of each subband of the enhancement layer, and transmits the MLT coefficients and the input envelope values of each subband of the enhancement layer to the MLT coefficient reconstruction and inverse transform module.
9. The device according to claim 8, characterized in that the MLT coefficient reconstruction and inverse transform module comprises an MLT coefficient reconstruction module and an inverse MLT module;
the MLT coefficient reconstruction module, according to the input envelope values of each subband of the enhancement layer, rearranges the MLT coefficients of each subband of the core layer and the enhancement layer in frequency band order and transmits them to the inverse MLT module;
the inverse MLT module performs an inverse MLT on the MLT coefficients of each subband of the core layer and the enhancement layer to obtain the decompressed output signal.
10. The device according to claim 9, characterized in that the auditory perception model provides the basis on which the MLT coefficient reconstruction module compensates for discarded enhancement layer MLT coefficients;
the MLT coefficient reconstruction module compensates for the discarded enhancement layer MLT coefficients according to the auditory perception model.
11. A layered audio encoding and decoding method, characterized in that the method comprises:
after applying a modulated lapped transform (MLT) to the input signal, dividing the result into a core layer signal and an enhancement layer signal according to an auditory perception model, and obtaining encoded data of the envelope value of each subband from the core layer signal and the enhancement layer signal;
obtaining encoded data of each subband of the core layer from the core layer signal and the envelope values of each subband of the core layer signal; obtaining encoded data of each subband of the enhancement layer from the enhancement layer signal, the auditory perception model and the envelope values of each subband of the enhancement layer signal; and multiplexing and packing together the encoded data of the subband envelope values, the encoded data of each subband of the core layer and the encoded data of each subband of the enhancement layer, and transmitting the result to the decoding end.
12. The method according to claim 11, characterized in that, after the MLT is applied to the input signal, the method further comprises: dividing each frame of MLT coefficients obtained from the MLT into a plurality of equally spaced subbands, or dividing each frame of MLT coefficients obtained from the MLT into a plurality of unequally spaced subbands according to the auditory perception model.
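The two subband divisions named in claim 12 can be illustrated with a minimal Python sketch. The function names, the band width, and the list of band edges are hypothetical; the claims do not specify frame length, band widths, or how the auditory perception model chooses unequal edges.

```python
def split_equal_subbands(mlt_coeffs, band_width):
    """Split one frame of MLT coefficients into equally spaced subbands.

    band_width is an assumed parameter (coefficients per subband).
    """
    if band_width <= 0 or len(mlt_coeffs) % band_width != 0:
        raise ValueError("frame length must be a positive multiple of band_width")
    return [mlt_coeffs[i:i + band_width]
            for i in range(0, len(mlt_coeffs), band_width)]


def split_unequal_subbands(mlt_coeffs, edges):
    """Split one frame using band edges (hypothetically chosen from a
    perceptual model); edges[k] .. edges[k+1] delimits subband k."""
    return [mlt_coeffs[lo:hi] for lo, hi in zip(edges[:-1], edges[1:])]
```

For example, `split_equal_subbands(frame, 4)` on a 12-coefficient frame yields three subbands of four coefficients each, while `split_unequal_subbands(frame, [0, 2, 6, 12])` yields subbands of widths 2, 4 and 6.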
13. The method according to claim 12, characterized in that the encoded data of the envelope value of each subband is obtained by:
calculating the envelope value of each subband of the core layer signal and the enhancement layer signal, and encoding the subband envelope values to obtain the encoded data of the envelope value of each subband;
the encoded data of each subband of the core layer is obtained by:
allocating a number of bits to each subband of the core layer signal according to the envelope value of each subband of the core layer signal;
quantizing and encoding each subband signal of the core layer signal according to the number of bits allocated to each subband of the core layer signal, to obtain the encoded data of each subband of the core layer;
the encoded data of each subband of the enhancement layer is obtained by:
performing a weighted calculation of the importance of each subband of the enhancement layer signal according to the auditory perception model and the envelope value of each subband of the enhancement layer signal;
allocating a number of bits to each subband signal according to the calculated importance weighting result of each subband of the enhancement layer signal;
quantizing and encoding each subband signal of the enhancement layer signal according to the number of bits allocated to each subband signal of the enhancement layer, to obtain the encoded data of each subband of the enhancement layer.
14. The method according to any one of claims 11 to 13, characterized in that the multiplexing and packing is performed by:
placing the encoded data of the subband envelope values after the frame header of the bitstream, placing the encoded data of each subband of the core layer after the encoded data of the subband envelope values, and placing the encoded data of each subband of the enhancement layer after the encoded data of each subband of the core layer.
15. The method according to claim 14, characterized in that the enhancement layer encoded data is placed into the bitstream by:
placing the encoded data of the enhancement layer subbands into the bitstream one by one in descending order of subband importance; before the encoded data of a given enhancement layer subband is placed into the bitstream, first calculating the sum of the number of bits already used in the bitstream of the current frame and the number of bits of the given subband, and comparing this sum with the total number of bits available for the current frame; if the sum is less than or equal to the total number of bits, placing the encoded data of the current subband into the bitstream, updating the number of used bits to the sum of the previously used bits and the number of bits of the encoded data of the given subband, and continuing with the encoded data of the next subband; otherwise, stopping the placement of subband encoded data.
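The budget check described in claim 15 can be sketched as a short Python loop. This is an illustration under stated assumptions, not the applicant's implementation: the function and parameter names are hypothetical, and `used_bits` stands for whatever the frame header, envelope data and core layer data have already consumed.

```python
def pack_enhancement_subbands(importance, band_bits, used_bits, total_bits):
    """Choose which enhancement subbands go into the current frame.

    importance[i] : importance weighting result of subband i
    band_bits[i]  : number of bits of the encoded data of subband i
    used_bits     : bits already used in this frame (assumed input)
    total_bits    : total bits available for this frame
    Returns (indices placed, in placement order; updated used_bits).
    """
    # Visit subbands from most to least important.
    order = sorted(range(len(importance)),
                   key=lambda i: importance[i], reverse=True)
    placed = []
    for i in order:
        if used_bits + band_bits[i] <= total_bits:
            placed.append(i)
            used_bits += band_bits[i]
        else:
            # As claimed, placement stops at the first subband that does
            # not fit; later, smaller subbands are not tried.
            break
    return placed, used_bits
```

Note the design consequence of stopping at the first subband that does not fit: a less important subband that would still fit is not placed, which keeps the decoder's extraction loop (which applies the same test in the same order) in step with the encoder.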
16. The method according to claim 11, characterized in that, after the transmission to the decoding end, the method further comprises:
demultiplexing the packed data transmitted by the encoding end, calculating the importance of each subband of the demultiplexed enhancement layer encoded data according to the auditory perception model, and obtaining the MLT coefficients of each subband of the core layer and the enhancement layer;
rearranging the MLT coefficients of each subband of the core layer and the enhancement layer in frequency band order, performing an inverse MLT on the MLT coefficients, and outputting the decompressed stream.
17. The method according to claim 16, characterized in that, after the packed data transmitted by the encoding end is demultiplexed, the method further comprises:
decoding the encoded data of the subband envelope values obtained by demultiplexing, to obtain the envelope value of each subband;
the MLT coefficients of each subband of the core layer are obtained by:
allocating a number of bits to each subband of the demultiplexed core layer encoded data according to the envelope value of each subband of the core layer encoded data;
extracting the encoded data of each subband from the core layer encoded data according to the number of bits occupied by each subband of the core layer encoded data;
inversely quantizing and decoding the extracted encoded data of each subband of the core layer, to obtain the decompressed MLT coefficients of each subband of the core layer;
the MLT coefficients of each subband of the enhancement layer are obtained by:
performing a weighted calculation of the importance of each subband of the enhancement layer encoded data according to the auditory perception model and the envelope value of each subband of the enhancement layer;
allocating a number of bits to the encoded data of each subband of the enhancement layer according to the importance of each subband of the enhancement layer encoded data;
extracting the encoded data of each subband from the enhancement layer encoded data in descending order of subband importance, according to the number of bits occupied by each subband of the enhancement layer;
inversely quantizing and decoding the extracted encoded data of each subband of the enhancement layer, to obtain the decompressed enhancement layer MLT coefficients.
18. The method according to claim 13 or 17, characterized in that the weighted calculation of the importance of each subband is performed by: multiplying the envelope value of each subband of the enhancement layer by a weighting value to obtain the importance weighting result of each subband of the enhancement layer, the weighting value being determined according to the auditory perception model.
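The weighting in claim 18 is a per-subband multiplication, which the following Python sketch illustrates. The weight values are placeholders supplied by the caller; the claims state only that they are determined from the auditory perception model and do not say how. The function names are hypothetical.

```python
def weight_importance(envelopes, weights):
    """Importance weighting result: envelope value times perceptual weight."""
    if len(envelopes) != len(weights):
        raise ValueError("one weight is required per subband")
    return [e * w for e, w in zip(envelopes, weights)]


def rank_by_importance(importance):
    """Subband indices from most to least important (ties keep band order)."""
    return sorted(range(len(importance)),
                  key=lambda i: importance[i], reverse=True)
```

With envelopes `[2.0, 4.0, 1.0]` and assumed weights `[1.5, 0.5, 1.0]`, the weighted importance is `[3.0, 2.0, 1.0]`, so subband 0 is ranked ahead of subband 1 even though subband 1 has the larger envelope.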
19. The method according to claim 17, characterized in that the encoded data of each subband is extracted from the enhancement layer encoded data by:
first calculating the sum of the number of bits already extracted from the bitstream of the current frame and the number of bits occupied by the encoded data of the enhancement layer subband about to be extracted, and comparing this sum with the total number of bits of the bitstream of the current frame; if the sum is greater than the total number of bits, stopping extraction; otherwise, extracting the encoded data of that subband, updating the number of extracted bits to the sum of the previously extracted bits and the bits occupied by the encoded data of that subband, and continuing to extract the encoded data of the next subband of the enhancement layer encoded data.
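The decoder-side extraction of claim 19 mirrors the encoder's budget check. The Python sketch below assumes, purely for illustration, that the enhancement layer payload is a string of `'0'`/`'1'` characters and that the importance order and per-subband bit counts have already been recomputed at the decoder; all names are hypothetical.

```python
def extract_enhancement_subbands(payload, order, band_bits,
                                 extracted_bits, total_bits):
    """Extract enhancement subband data from one frame.

    payload        : enhancement layer bits as a '0'/'1' string (assumption)
    order          : subband indices, most important first
    band_bits[i]   : bits occupied by the encoded data of subband i
    extracted_bits : bits of this frame already extracted before this call
    total_bits     : total bits of this frame's bitstream
    Returns ({subband index: bit string}, updated extracted_bits).
    """
    pos = 0
    out = {}
    for i in order:
        if extracted_bits + band_bits[i] > total_bits:
            break  # stop extracting, as in the claim
        out[i] = payload[pos:pos + band_bits[i]]
        pos += band_bits[i]
        extracted_bits += band_bits[i]
    return out, extracted_bits
```

Subbands that are never reached by the loop have no decoded data for the frame; claim 20 describes how such missing enhancement layer coefficients may be compensated.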
20. The method according to claim 16, characterized in that, when enhancement layer subband data of lesser importance is lost during encoding or transmission, the method further comprises, after the MLT coefficients of each subband of the core layer and the enhancement layer are rearranged in frequency band order, compensating for the lost enhancement layer MLT coefficients by:
selecting the sign of each MLT coefficient at random, and using the envelope value multiplied by a proportionality constant as the magnitude of the MLT coefficient, the proportionality constant being determined according to the auditory perception model.
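The compensation step of claim 20 amounts to noise filling at the level given by the subband envelope. A minimal Python sketch follows; the function name, the subband width and the value passed as `scale` are assumptions, since the claim says only that the proportionality constant is derived from the auditory perception model.

```python
import random


def fill_lost_subband(envelope, width, scale, rng=None):
    """Synthesize MLT coefficients for a lost enhancement layer subband.

    envelope : decoded envelope value of the subband
    width    : number of MLT coefficients in the subband (assumed known)
    scale    : proportionality constant (placeholder value from the caller)
    rng      : optional random.Random instance, for reproducible tests
    """
    rng = rng or random.Random()
    magnitude = envelope * scale
    # Each coefficient gets the same magnitude and an independent random sign.
    return [rng.choice((-1.0, 1.0)) * magnitude for _ in range(width)]
```

Because the envelope values are transmitted ahead of the core and enhancement layer data, they remain available at the decoder even when the subband's coefficient data is not, which is what makes this fill possible.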
PCT/CN2007/071154 2006-12-20 2007-11-29 A hierarchical coding decoding method and device WO2008074251A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200610167891.5 2006-12-20
CNA2006101678915A CN101206860A (en) 2006-12-20 2006-12-20 Method and apparatus for encoding and decoding layered audio

Publications (1)

Publication Number Publication Date
WO2008074251A1 true WO2008074251A1 (en) 2008-06-26

Family

ID=39536002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2007/071154 WO2008074251A1 (en) 2006-12-20 2007-11-29 A hierarchical coding decoding method and device

Country Status (2)

Country Link
CN (1) CN101206860A (en)
WO (1) WO2008074251A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8134484B2 (en) 2009-03-27 2012-03-13 Huawei Technologies, Co., Ltd. Encoding and decoding method and device
CN111402907A (en) * 2020-03-13 2020-07-10 大连理工大学 G.722.1-based multi-description speech coding method

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101771417B (en) * 2008-12-30 2012-04-18 华为技术有限公司 Methods, devices and systems for coding and decoding signals
JP5444863B2 (en) * 2009-06-11 2014-03-19 ソニー株式会社 Communication device
CN102081927B (en) * 2009-11-27 2012-07-18 中兴通讯股份有限公司 Layering audio coding and decoding method and system
CN102222505B (en) * 2010-04-13 2012-12-19 中兴通讯股份有限公司 Hierarchical audio coding and decoding methods and systems and transient signal hierarchical coding and decoding methods
CN102957651B (en) * 2011-08-17 2017-03-15 北京泰美世纪科技有限公司 A kind of digital audio broadcasting signal Frequency Synchronization and method of reseptance and its device
EP2665208A1 (en) 2012-05-14 2013-11-20 Thomson Licensing Method and apparatus for compressing and decompressing a Higher Order Ambisonics signal representation
CN104170007B (en) * 2012-06-19 2017-09-26 深圳广晟信源技术有限公司 To monophonic or the stereo method encoded
CN103650036B (en) * 2012-07-06 2016-05-11 深圳广晟信源技术有限公司 Method for coding multi-channel digital audio
CN103489450A (en) * 2013-04-07 2014-01-01 杭州微纳科技有限公司 Wireless audio compression and decompression method based on time domain aliasing elimination and equipment thereof
MX357353B (en) 2013-12-02 2018-07-05 Huawei Tech Co Ltd Encoding method and apparatus.
FR3024581A1 (en) * 2014-07-29 2016-02-05 Orange DETERMINING A CODING BUDGET OF A TRANSITION FRAME LPD / FD
CN105957533B (en) * 2016-04-22 2020-11-10 杭州微纳科技股份有限公司 Voice compression method, voice decompression method, audio encoder and audio decoder
CN110797004B (en) * 2018-08-01 2021-01-26 百度在线网络技术(北京)有限公司 Data transmission method and device
CN109036457B (en) * 2018-09-10 2021-10-08 广州酷狗计算机科技有限公司 Method and apparatus for restoring audio signal

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1312976A (en) * 1998-05-27 2001-09-12 微软公司 System and method of masking quantization noise of audio signals
CN1503572A (en) * 2002-11-21 2004-06-09 Progressive to lossless embedded audio coder (PLEAC) with multiple factorization reversible transform
WO2005036528A1 (en) * 2003-10-10 2005-04-21 Agency For Science, Technology And Research Method for encoding a digital signal into a scalable bitstream; method for decoding a scalable bitstream.
CN1623185A (en) * 2002-03-12 2005-06-01 诺基亚有限公司 Efficient improvement in scalable audio coding
CN1795495A (en) * 2003-04-30 2006-06-28 松下电器产业株式会社 Audio encoding device, audio decoding device, audio encodingmethod, and audio decoding method
WO2006098274A1 (en) * 2005-03-14 2006-09-21 Matsushita Electric Industrial Co., Ltd. Scalable decoder and scalable decoding method

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1312976A (en) * 1998-05-27 2001-09-12 微软公司 System and method of masking quantization noise of audio signals
CN1623185A (en) * 2002-03-12 2005-06-01 诺基亚有限公司 Efficient improvement in scalable audio coding
CN1503572A (en) * 2002-11-21 2004-06-09 Progressive to lossless embedded audio coder (PLEAC) with multiple factorization reversible transform
CN1795495A (en) * 2003-04-30 2006-06-28 松下电器产业株式会社 Audio encoding device, audio decoding device, audio encodingmethod, and audio decoding method
WO2005036528A1 (en) * 2003-10-10 2005-04-21 Agency For Science, Technology And Research Method for encoding a digital signal into a scalable bitstream; method for decoding a scalable bitstream.
WO2006098274A1 (en) * 2005-03-14 2006-09-21 Matsushita Electric Industrial Co., Ltd. Scalable decoder and scalable decoding method

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8134484B2 (en) 2009-03-27 2012-03-13 Huawei Technologies, Co., Ltd. Encoding and decoding method and device
US8436754B2 (en) 2009-03-27 2013-05-07 Huawei Technologies Co., Ltd. Encoding and decoding method and device
CN111402907A (en) * 2020-03-13 2020-07-10 大连理工大学 G.722.1-based multi-description speech coding method
CN111402907B (en) * 2020-03-13 2023-04-18 大连理工大学 G.722.1-based multi-description speech coding method

Also Published As

Publication number Publication date
CN101206860A (en) 2008-06-25

Similar Documents

Publication Publication Date Title
WO2008074251A1 (en) A hierarchical coding decoding method and device
EP2752849B1 (en) Encoder and encoding method
KR101135726B1 (en) Encoder, decoder, encoding method, decoding method, and recording medium
EP2201566B1 (en) Joint multi-channel audio encoding/decoding
US8032359B2 (en) Embedded silence and background noise compression
JP2022050609A (en) Audio-acoustic coding device, audio-acoustic decoding device, audio-acoustic coding method, and audio-acoustic decoding method
RU2185024C2 (en) Method and device for scaled coding and decoding of sound
US8428959B2 (en) Audio packet loss concealment by transform interpolation
US8386266B2 (en) Full-band scalable audio codec
EP0884850A2 (en) Scalable audio coding/decoding method and apparatus
JP5695074B2 (en) Speech coding apparatus and speech decoding apparatus
EP1806737A1 (en) Sound encoder and sound encoding method
WO2006041055A1 (en) Scalable encoder, scalable decoder, and scalable encoding method
MX2011001253A (en) Spectral smoothing device, encoding device, decoding device, communication terminal device, base station device, and spectral smoothing method.
JP2006513457A (en) Method for encoding and decoding speech at variable rates
WO2013143221A1 (en) Signal encoding and decoding method and device
WO2008098512A1 (en) A coding/decoding method, system and apparatus
KR100513729B1 (en) Speech compression and decompression apparatus having scalable bandwidth and method thereof
JP2002041099A (en) Method for expressing masked threshold level, reconstituting method and its system
WO2009096898A1 (en) Method and device of bitrate distribution/truncation for scalable audio coding

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 07817344; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 07817344; Country of ref document: EP; Kind code of ref document: A1)