CN101651457A - Audio coder and decoder, and coding and decoding methods - Google Patents

Publication number
CN101651457A
CN101651457A CN200810063488A CN200810063488A CN101651457A CN 101651457 A CN101651457 A CN 101651457A CN 200810063488 A CN200810063488 A CN 200810063488A CN 200810063488 A CN200810063488 A CN 200810063488A CN 101651457 A CN101651457 A CN 101651457A
Authority
CN
China
Prior art keywords
frequency
coefficient
esp
sampling point
decoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN200810063488A
Other languages
Chinese (zh)
Inventor
李炎钊
邢文峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Silan Microelectronics Co Ltd
Original Assignee
Hangzhou Silan Microelectronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Silan Microelectronics Co Ltd filed Critical Hangzhou Silan Microelectronics Co Ltd
Priority to CN200810063488A priority Critical patent/CN101651457A/en
Publication of CN101651457A publication Critical patent/CN101651457A/en
Pending legal-status Critical Current

Landscapes

  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention provides an ESP coder and a coding method, characterized in that audio samples are input into a time-frequency analysis module, which applies the MDCT to convert them into frequency coefficients; the frequency coefficients are compressed into a bit stream by a quantization/entropy coding module; and the bit stream is output to a DRAM buffer. The invention also provides an ESP decoder and a decoding method, characterized in that a bit stream of one bit-block length is read from the DRAM buffer and sent to the input buffer of an entropy decoding/inverse quantization module; entropy decoding and non-linear inverse quantization yield reconstructed frequency coefficients, which are sent to an inverse time-frequency analysis module; and the IMDCT is applied to obtain the reconstructed time-domain signal, which is output and played. The ESP coder and decoder and the coding and decoding methods can be applied to an electronic shock protection system.

Description

Audio coder and decoder, and coding and decoding methods
Technical field
The present invention relates to an audio encoding and decoding technique of low computational complexity, and in particular to one that uses the MDCT (Modified Discrete Cosine Transform) and its inverse. The technique can be applied to electronic shock protection systems.
Background art
Portable and vehicle-mounted disc players are easily affected by mechanical shock, which can interrupt continuous playback of the audio data. To overcome this problem, such players are generally provided with an electronic shock protection (ESP, Electronic Shock Protection) system, whose basic framework is shown in Fig. 1. When the player operates in shockproof mode, the optical disc servo reads audio data at a rotation speed higher than in normal mode. The audio samples are compressed by the ESP encoder, and the compressed data are stored in a dynamic random access memory (DRAM, Dynamic Random Access Memory) buffer. Once the buffer is full, the optical disc servo is slowed to normal speed; meanwhile the ESP decoder reads data from the DRAM buffer at normal speed, decodes them, and sends them to playback. When a shock causes the servo to fail to read data from the disc, the ESP decoder can still read and decode data from the DRAM buffer, which provides shock protection for a certain period of time.
The shockproof time of an ESP system is determined by the DRAM capacity and the compression ratio of the audio coding algorithm. An ESP system places three requirements on the audio codec algorithm: 1. Sound quality must be essentially lossless. 2. The higher the compression ratio, the longer the shockproof time supported by a DRAM of the same capacity, or the smaller and cheaper the DRAM required for the same shockproof time; a compression ratio of 4:1 or more is generally required. 3. During peak processing, the ESP system must perform two encoding operations and one decoding operation in parallel, so if it is implemented in software the codec algorithm must have low computational complexity.
Traditional ESP systems generally use adaptive differential pulse code modulation (ADPCM) for audio coding and decoding. Taking a CD player as an example, audio samples on a CD are generally 16 bits; ADPCM coding compresses each sample to 4 bits, a compression ratio of 4:1, with essentially no loss of sound quality. The bit rate is 4 (bits) × 44100 × 2 (channels) = 352800 bps, so a 16 Mbit DRAM can store 16000000 / 352800 = 45.3 s of audio. However, the ADPCM codec has two significant drawbacks: 1. The ADPCM algorithm computes sample by sample and is computationally expensive, so it can only be implemented with an application-specific integrated circuit (ASIC) and is unsuited to real-time software implementation; adopting this scheme requires adding a dedicated ESP chip to the system, which increases cost. 2. The compression efficiency of ADPCM is low, and a compression ratio above 4:1 causes sound quality to deteriorate noticeably.
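The buffer-time arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function name `buffer_seconds` is hypothetical, and 16 Mbit is taken as 16,000,000 bits, as in the calculation in the text.

```python
def buffer_seconds(dram_bits, sample_rate_hz, bits_per_sample, channels):
    """Seconds of audio that fit in the DRAM buffer at the given coded bit rate."""
    bit_rate = bits_per_sample * sample_rate_hz * channels
    return dram_bits / bit_rate

# ADPCM case from the text: 4 bits per sample, 44.1 kHz, stereo, 16 Mbit DRAM.
adpcm_rate = 4 * 44100 * 2
adpcm_time = buffer_seconds(16_000_000, 44100, 4, 2)

# Uncompressed 16-bit CD audio in the same buffer, for comparison.
pcm_time = buffer_seconds(16_000_000, 44100, 16, 2)

print(adpcm_rate, round(adpcm_time, 2), round(pcm_time, 2))
```

The same function shows why the compression ratio matters: the buffer time grows in direct proportion to it.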
Patent application CN200310104551.4 proposes using the MPEG1 audio codec algorithm as the ESP codec, which raises the compression ratio with essentially no loss of sound quality. Because that method uses a psychoacoustic model and a hybrid filter bank for time-frequency analysis and inverse time-frequency analysis, its computational complexity is considerable. During peak processing it must perform two MPEG1 Layer3 encoding passes and one MPEG1 Layer3 decoding pass in parallel. Taking the mainstream embedded processor ARM9 as an example, more than 100 MIPS is still needed after optimization, and this figure is only an average obtained in software simulation. Once the overhead of loading program and data from dynamic random access memory (DRAM) into the cache is considered, and given that the cost of entropy coding is data-dependent, the peak load for some songs exceeds this figure, and the peak load of MPEG1 Layer3 can reach more than 140 MIPS. Such complexity is difficult to handle in real time on existing mainstream embedded processors such as the ARM9; doing so would require raising the processor clock above 140 MHz, which would in turn significantly increase system power consumption.
Summary of the invention
The present invention aims to remedy the deficiencies of the prior art. Weighing algorithm complexity against compression ratio, it provides an ESP encoder, an ESP decoder, and an electronic shock protection (ESP) system that compress audio data effectively without increasing system hardware cost. The invention also provides a low-complexity audio codec algorithm that is suited to software implementation and can effectively extend the shockproof time.
An ESP encoder, characterized in that audio samples are input into a time-frequency analysis module and converted into frequency coefficients, the frequency coefficients are compressed into a bit stream by a quantization/entropy coding module, and the bit stream is output to a DRAM buffer.
The time-frequency analysis module of the ESP encoder uses the MDCT (Modified Discrete Cosine Transform), defined as follows:
$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left( \frac{\pi \, (2n + 1 + N/2)(2k + 1)}{2N} \right), \qquad k = 0, 1, \ldots, \frac{N}{2} - 1$$
where x(n) denotes an audio sample and n is the sample index, with range n = 0, 1, ..., N-1;
X(k) denotes a frequency coefficient and k is the coefficient index, with range k = 0, 1, ..., N/2 - 1;
N denotes the number of audio samples in each frame.
The MDCT is a purely real transform. It operates frame by frame, and adjacent frames overlap by half of their samples, which eliminates discontinuities at frame boundaries. N can be chosen from 128, 256, 512, 1024, and 2048.
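As a minimal sketch, the MDCT definition above can be evaluated directly in Python. This is a plain O(N^2) evaluation of the formula for illustration, not one of the fast algorithms a real implementation would use; the function name `mdct` and the test signal are assumptions.

```python
import math

def mdct(frame):
    """Direct evaluation of the MDCT definition: N samples in, N/2 coefficients out."""
    n_len = len(frame)
    coeffs = []
    for k in range(n_len // 2):
        acc = 0.0
        for n, x in enumerate(frame):
            angle = math.pi * (2 * n + 1 + n_len / 2) * (2 * k + 1) / (2 * n_len)
            acc += x * math.cos(angle)
        coeffs.append(acc)
    return coeffs

# A short sine frame (N = 16) yields N/2 = 8 real coefficients.
frame = [math.sin(2 * math.pi * 3 * n / 16) for n in range(16)]
X = mdct(frame)
print(len(X))
```

Each frame of N samples produces only N/2 coefficients; the half-frame overlap between adjacent frames is what makes up for this in the decoder.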
The quantization/entropy coding module compresses the data with non-linear quantization and Huffman coding according to the required compression ratio; with sound quality preserved, it achieves a compression ratio of at least 5:1. The quantization/entropy coding steps are as follows:
(1) multiply the frequency coefficients output by the time-frequency analysis module by a scaling factor (Scaling Factor);
(2) apply non-linear quantization to the scaled frequency coefficients;
(3) combine the quantized frequency coefficients two by two into frequency coefficient pairs (N/4 pairs per frame);
(4) select a Huffman code table and Huffman-code the frequency coefficient pairs with it to generate the bit stream;
(5) check whether the number of bits produced by the Huffman coding exceeds what the specified compression ratio allows; if it does, reduce the scaling factor and return to step (1), repeating until the compression ratio requirement is met.
The quantization formula can be $y = \operatorname{int}[x^{3/4}]$, where int[·] denotes taking the integer part.
The code table can be a single fixed code table.
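The iteration in steps (1) to (5) can be sketched as a simple rate-control loop. Several details here are assumptions rather than part of the source: the quantizer is applied to the magnitude and the sign restored afterwards, `pair_cost_bits` is a hypothetical stand-in for a real Huffman table lookup, and the scaling factor shrinks by a fixed factor of 0.9 per pass.

```python
def quantize(coeffs, scale):
    """Scale each coefficient, then apply y = int[|x|^(3/4)] and restore the sign."""
    out = []
    for c in coeffs:
        q = int(abs(c * scale) ** 0.75)
        out.append(-q if c < 0 else q)
    return out

def pair_up(values):
    """Group quantized coefficients two by two (expects an even-length list)."""
    return [(values[i], values[i + 1]) for i in range(0, len(values), 2)]

def pair_cost_bits(pair):
    """Hypothetical bit cost of one pair; larger magnitudes cost more bits."""
    a, b = pair
    return 1 + 2 * (abs(a).bit_length() + abs(b).bit_length())

def encode_frame(coeffs, bit_budget, scale=1.0, shrink=0.9):
    """Reduce the scaling factor until the frame fits the bit budget."""
    if bit_budget < len(coeffs) // 2:
        raise ValueError("budget is below the cost of an all-zero frame")
    while True:
        pairs = pair_up(quantize(coeffs, scale))
        bits = sum(pair_cost_bits(p) for p in pairs)
        if bits <= bit_budget:
            return pairs, scale, bits
        scale *= shrink

pairs, used_scale, used_bits = encode_frame([100.0, -50.0, 25.0, 3.0], bit_budget=20)
print(pairs, round(used_scale, 3), used_bits)
```

Because a smaller scaling factor pushes more coefficients toward zero, the loop trades precision for bits until the target is met.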
The ESP codec can further comprise a packaging module. The packaging module of the ESP encoder packs M frames of bit stream output by the quantization/entropy coding module, in order, into one fixed-length bit block, which is then output to the DRAM buffer. If the total length of the M frames of bit stream is less than the fixed length of the bit block, the remainder of the bit block can be padded with zeros. Considering storage space, codec delay, and bit-rate smoothing, M can be chosen from 2 to 16. On the one hand, audio signals are inherently non-stationary and can be regarded as stationary only over short intervals (on the order of 10 ms), so the number of bits output by quantization and coding varies greatly from frame to frame; packing the coded bit streams of M frames into one bit block reduces the impact of peaks in the bit stream. On the other hand, because the bit block length is fixed, the ESP decoder can easily determine the start of each bit block without a time-consuming block synchronization search, which reduces the computational overhead of the system.
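A minimal sketch of the packaging step follows, with bit streams modelled as strings of "0" and "1" for readability. The function names are hypothetical, and raising an error when the frames overflow the block is an assumption; the source only describes the case where the frames fit.

```python
def pack_block(frame_bitstreams, block_bits):
    """Concatenate M frame bit streams in order and zero-pad to the fixed block length."""
    payload = "".join(frame_bitstreams)
    if len(payload) > block_bits:
        raise ValueError("frames exceed the fixed block length")
    return payload + "0" * (block_bits - len(payload))

def block_offset(block_index, block_bits):
    """Start position of a block in the buffer; no synchronization search is needed."""
    return block_index * block_bits

block = pack_block(["101", "11"], block_bits=8)
print(block, block_offset(3, 8))
```

The fixed length is what lets the decoder locate block i with a single multiplication.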
An ESP coding method, characterized by comprising the following steps:
(1) audio samples are converted into frequency coefficients by time-frequency analysis;
(2) the frequency coefficients are compressed into a bit stream by quantization/entropy coding.
The time-frequency analysis of the ESP encoder uses the MDCT (Modified Discrete Cosine Transform), defined as follows:
$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left( \frac{\pi \, (2n + 1 + N/2)(2k + 1)}{2N} \right), \qquad k = 0, 1, \ldots, \frac{N}{2} - 1$$
where x(n) denotes an audio sample and n is the sample index, with range n = 0, 1, ..., N-1;
X(k) denotes a frequency coefficient and k is the coefficient index, with range k = 0, 1, ..., N/2 - 1;
N denotes the number of audio samples in each frame.
N can be chosen from 128, 256, 512, 1024, and 2048.
The quantization/entropy coding steps are as follows:
(1) multiply the frequency coefficients output by the time-frequency analysis by a scaling factor (Scaling Factor);
(2) apply non-linear quantization to the scaled frequency coefficients;
(3) combine the quantized frequency coefficients two by two into frequency coefficient pairs (N/4 pairs per frame);
(4) select a Huffman code table and Huffman-code the frequency coefficient pairs with it to generate the bit stream;
(5) check whether the number of bits produced by the Huffman coding exceeds what the specified compression ratio allows; if it does, reduce the scaling factor and return to step (1), repeating until the compression ratio requirement is met.
The quantization formula can be $y = \operatorname{int}[x^{3/4}]$, where int[·] denotes taking the integer part.
In the ESP coding method, the M frames of bit stream output by quantization/entropy coding can further be packed, in order, into one fixed-length bit block; if the total length of the M frames of bit stream is less than the fixed length of the bit block, the remainder of the bit block can be padded with zeros.
M can be chosen from 2 to 16.
An ESP decoder, characterized in that a bit stream of one bit-block length is read from the DRAM buffer and sent to the input buffer of an entropy decoding/inverse quantization module; entropy decoding and inverse quantization yield reconstructed frequency coefficients, which are sent to an inverse time-frequency analysis module; and the IMDCT (Inverse Modified Discrete Cosine Transform) is applied to obtain the reconstructed time-domain signal, which is output and played.
The entropy decoding/inverse quantization module operates as follows:
(1) Huffman-decode the bit stream of one bit block in the input buffer using the same code table as the ESP encoder; decoding yields M*N/4 frequency coefficient pairs, each containing two frequency coefficients, for M*N/2 frequency coefficients in total, together with M inverse scaling factors ISF[m] (m = 1, ..., M), one per frame;
(2) apply inverse quantization to the M*N/2 decoded frequency coefficients to obtain linear frequency coefficients;
(3) inverse-scale the linear frequency coefficients from step (2) with the inverse scaling factors ISF[m] (m = 1, ..., M) decoded in step (1) (the two are multiplied) to obtain the inverse-scaled frequency coefficients;
(4) divide the M*N/2 inverse-scaled frequency coefficients, in order, into M frames of N/2 frequency coefficients each and output them to the inverse time-frequency analysis module.
The inverse quantization formula can be $y = \operatorname{int}[x^{3/4}]$, where int[·] denotes taking the integer part.
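Steps (2) to (4) can be sketched as follows, taking the Huffman-decoded values as given. One point is an assumption and not taken from the source: the text prints the same expression int[x^(3/4)] for quantization and inverse quantization, whereas this sketch assumes the decoder raises magnitudes to the power 4/3, the mathematical inverse of 3/4, and exposes the exponent as a parameter. The function name and the sign handling are also hypothetical.

```python
import math

def dequantize_block(quantized, inverse_scale_factors, coeffs_per_frame, power=4.0 / 3.0):
    """Inverse-quantize M*N/2 values, apply ISF[m] per frame, and split into M frames."""
    m_frames = len(inverse_scale_factors)
    if len(quantized) != m_frames * coeffs_per_frame:
        raise ValueError("expected M * N/2 quantized coefficients")
    frames = []
    for m in range(m_frames):
        chunk = quantized[m * coeffs_per_frame:(m + 1) * coeffs_per_frame]
        isf = inverse_scale_factors[m]
        # Restore the magnitude (assumed exponent), keep the sign, then multiply by ISF[m].
        frames.append([math.copysign(abs(q) ** power, q) * isf for q in chunk])
    return frames

# M = 2 frames with N/2 = 2 coefficients each.
frames = dequantize_block([8, -8, 0, 1], [1.0, 2.0], coeffs_per_frame=2)
print(frames)
```

Each of the M output lists holds the N/2 coefficients that the inverse time-frequency analysis module expects for one frame.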
The inverse time-frequency analysis module of the ESP decoder converts the frequency coefficients output by the entropy decoding/inverse quantization module into a PCM (Pulse Code Modulation) signal. This module uses the IMDCT (Inverse Modified Discrete Cosine Transform), a purely real transform for which several efficient fast algorithms exist; they are not described further here. With a fast algorithm, the average computational complexity C per sample relates to N as $C \propto \log_2 N$. The IMDCT is defined as follows:
[Equation image G2008100634887D00041: IMDCT definition]
where X(k) denotes a frequency coefficient and k is the coefficient index, with range k = 0, 1, ..., N/2 - 1;
the symbol shown in image G2008100634887D00052 denotes an audio sample after the IMDCT, and n is its index, with range n = 0, 1, ..., N-1;
N denotes the number of audio samples in each frame.
An ESP decoding method, characterized by comprising the following steps:
(1) a bit stream of one bit-block length is subjected to entropy decoding/inverse quantization to obtain reconstructed frequency coefficients;
(2) IMDCT (Inverse Modified Discrete Cosine Transform) inverse time-frequency analysis is applied to the frequency coefficients to obtain the reconstructed time-domain signal, which is output and played.
The entropy decoding/inverse quantization steps are as follows:
(1) Huffman-decode the bit stream of one bit block in the input buffer using the same code table as the ESP coding method; decoding yields M*N/4 frequency coefficient pairs, each containing two frequency coefficients, for M*N/2 frequency coefficients in total, together with M inverse scaling factors ISF[m] (m = 1, ..., M), one per frame;
(2) apply inverse quantization to the M*N/2 decoded frequency coefficients to obtain linear frequency coefficients;
(3) inverse-scale the linear frequency coefficients from step (2) with the inverse scaling factors ISF[m] (m = 1, ..., M) decoded in step (1) (the two are multiplied) to obtain the inverse-scaled frequency coefficients;
(4) divide the M*N/2 inverse-scaled frequency coefficients, in order, into M frames of N/2 frequency coefficients each and output them to the inverse time-frequency analysis module.
The inverse quantization formula can be $y = \operatorname{int}[x^{3/4}]$, where int[·] denotes taking the integer part.
The inverse time-frequency analysis in the ESP decoding method uses the IMDCT (Inverse Modified Discrete Cosine Transform), defined as follows:
[Equation image G2008100634887D00053: IMDCT definition]
where X(k) denotes a frequency coefficient and k is the coefficient index, with range k = 0, 1, ..., N/2 - 1;
the symbol shown in image G2008100634887D00055 denotes an audio sample after the IMDCT, and n is its index, with range n = 0, 1, ..., N-1;
N denotes the number of audio samples in each frame.
An ESP system, characterized by comprising an ESP encoder, a DRAM buffer, and an ESP decoder. When the ESP system is switched on, audio samples are encoded in sequence by the ESP encoder, the encoded stream is stored in the DRAM buffer, and the stream in the DRAM buffer is decoded and output by the ESP decoder; the system is characterized by using the ESP encoder and ESP decoder described above.
The beneficial effects of the present invention are as follows. With the sound quality of the CD player essentially unimpaired, the invention dispenses with the psychoacoustic model and achieves a higher compression ratio with less computation, which improves the utilization of the dynamic random access memory and extends the shockproof time. The invention also uses a block-based computation structure that is suited to software implementation; the low computational complexity of the algorithm helps lower the system clock frequency, which reduces system power consumption and improves system stability.
Description of drawings
Fig. 1: ESP system diagram
Fig. 2: ESP encoder structure, diagram 1
Fig. 3: ESP encoder structure, diagram 2
Fig. 4: ESP decoder structure, diagram 1
Fig. 5: flow chart of quantization/entropy coding in ESP encoding
Fig. 6: flow chart of entropy decoding/inverse quantization in ESP decoding
Specific embodiment
An ESP encoder, as shown in Fig. 2, characterized in that audio samples are input into a time-frequency analysis module (20) and converted into frequency coefficients, the frequency coefficients are compressed into a bit stream by a quantization/entropy coding module (21), and the bit stream is output to a DRAM buffer.
The time-frequency analysis module (20) of the ESP encoder uses the MDCT (Modified Discrete Cosine Transform), defined as follows:
$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left( \frac{\pi \, (2n + 1 + N/2)(2k + 1)}{2N} \right), \qquad k = 0, 1, \ldots, \frac{N}{2} - 1$$
where x(n) denotes an audio sample and n is the sample index, with range n = 0, 1, ..., N-1;
X(k) denotes a frequency coefficient and k is the coefficient index, with range k = 0, 1, ..., N/2 - 1;
N denotes the number of audio samples in each frame.
The MDCT is a purely real transform for which several efficient fast algorithms exist. The algorithms disclosed in documents such as [1] Mu-Huo Cheng et al., "Fast IMDCT and MDCT Algorithms - A Matrix Approach," IEEE Transactions on Signal Processing, vol. 51, no. 1, Jan. 2003, and [2] Byeong Gi Lee, "A new algorithm to compute the discrete cosine transform," IEEE Transactions on Acoustics, Speech, and Signal Processing, 1984, 32(6): 1243-1245, can all be applied to the present invention and are not described further here. The MDCT operates frame by frame, and adjacent frames overlap by half of their samples, which eliminates discontinuities at frame boundaries. The choice of N must take the following factors into account: 1. choosing N as an integer power of 2 facilitates fast algorithms; 2. the larger N is, the higher the spectral resolution, which helps raise the compression ratio; 3. the larger N is, the higher the computational complexity, the average complexity C per sample relating to N as $C \propto \log_2 N$; 4. audio signals are stationary only over short intervals, so an excessively large N brings no benefit. Based on these considerations, N can be chosen from 128, 256, 512, 1024, and 2048.
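The trade-off in the choice of N can be made concrete with a short sketch. It splits a signal into half-overlapped frames and lists, for each candidate N, the frame duration and the relative per-sample cost log2 N. The function names are hypothetical, and the 44.1 kHz sample rate is taken from the CD example in the background section.

```python
import math

def overlapped_frames(samples, n):
    """Split samples into frames of n samples that advance by n/2 (half overlap)."""
    hop = n // 2
    return [samples[start:start + n] for start in range(0, len(samples) - n + 1, hop)]

def frame_duration_ms(n, sample_rate_hz=44100):
    """Duration of one frame of n samples in milliseconds."""
    return 1000.0 * n / sample_rate_hz

# Frame duration and relative per-sample cost for each candidate frame length.
for n in (128, 256, 512, 1024, 2048):
    print(n, round(frame_duration_ms(n), 2), math.log2(n))

frames = overlapped_frames(list(range(8)), 4)
print(frames)
```

Going from N = 128 to N = 2048 lengthens the frame sixteenfold while the per-sample cost factor only rises from 7 to 11, which is why the stationarity of the signal, rather than computation alone, limits how large N should be.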
The quantization/entropy coding module (21) compresses the data with non-linear quantization and Huffman coding according to the required compression ratio; with sound quality preserved, it achieves a compression ratio of at least 5:1. The quantization/entropy coding steps are as follows (as shown in Fig. 5):
(1) multiply the frequency coefficients output by the time-frequency analysis module (20) by a scaling factor (Scaling Factor);
(2) apply non-linear quantization to the scaled frequency coefficients;
(3) combine the quantized frequency coefficients two by two into frequency coefficient pairs (N/4 pairs per frame);
(4) select a Huffman code table and Huffman-code the frequency coefficient pairs with it to generate the bit stream;
(5) check whether the number of bits produced by the Huffman coding exceeds what the specified compression ratio allows; if it does, reduce the scaling factor and return to step (1), repeating until the compression ratio requirement is met.
The quantization formula can be $y = \operatorname{int}[x^{3/4}]$, where int[·] denotes taking the integer part.
The code table can be a single fixed code table.
The ESP codec can further comprise a packaging module (22), as shown in Fig. 3. The packaging module (22) of the ESP encoder packs M frames of bit stream output by the quantization/entropy coding module, in order, into one fixed-length bit block, which is then output to the DRAM buffer. If the total length of the M frames of bit stream is less than the fixed length of the bit block, the remainder of the bit block can be padded with zeros. Considering storage space, codec delay, and bit-rate smoothing, M can be chosen from 2 to 16. On the one hand, audio signals are inherently non-stationary and can be regarded as stationary only over short intervals (on the order of 10 ms), so the number of bits output by quantization and coding varies greatly from frame to frame; packing the coded bit streams of M frames into one bit block reduces the impact of peaks in the bit stream. On the other hand, because the bit block length is fixed, the ESP decoder can easily determine the start of each bit block without a time-consuming block synchronization search, which reduces the computational overhead of the system.
An ESP coding method, characterized by comprising the following steps:
(1) audio samples are converted into frequency coefficients by time-frequency analysis;
(2) the frequency coefficients are compressed into a bit stream by quantization/entropy coding.
The time-frequency analysis of the ESP encoder uses the MDCT (Modified Discrete Cosine Transform), defined as follows:
$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left( \frac{\pi \, (2n + 1 + N/2)(2k + 1)}{2N} \right), \qquad k = 0, 1, \ldots, \frac{N}{2} - 1$$
where x(n) denotes an audio sample and n is the sample index, with range n = 0, 1, ..., N-1;
X(k) denotes a frequency coefficient and k is the coefficient index, with range k = 0, 1, ..., N/2 - 1;
N denotes the number of audio samples in each frame.
N can be chosen from 128, 256, 512, 1024, and 2048.
The quantization/entropy coding steps are as follows (as shown in Fig. 5):
(1) multiply the frequency coefficients output by the time-frequency analysis by a scaling factor (Scaling Factor);
(2) apply non-linear quantization to the scaled frequency coefficients;
(3) combine the quantized frequency coefficients two by two into frequency coefficient pairs (N/4 pairs per frame);
(4) select a Huffman code table and Huffman-code the frequency coefficient pairs with it to generate the bit stream;
(5) check whether the number of bits produced by the Huffman coding exceeds what the specified compression ratio allows; if it does, reduce the scaling factor and return to step (1), repeating until the compression ratio requirement is met.
The quantization formula can be $y = \operatorname{int}[x^{3/4}]$, where int[·] denotes taking the integer part.
In the ESP coding method, the M frames of bit stream output by quantization/entropy coding can further be packed, in order, into one fixed-length bit block; if the total length of the M frames of bit stream is less than the fixed length of the bit block, the remainder of the bit block can be padded with zeros.
M can be chosen from 2 to 16.
An ESP decoder, as shown in Fig. 4: a bit stream of one bit-block length is read from the DRAM buffer and sent to the input buffer of an entropy decoding/inverse quantization module (41); entropy decoding and inverse quantization yield reconstructed frequency coefficients, which are sent to an inverse time-frequency analysis module (42); and the IMDCT (Inverse Modified Discrete Cosine Transform) is applied to obtain the reconstructed time-domain signal, which is output and played.
The entropy decoding/inverse quantization module (41) operates as follows (as shown in Fig. 6):
(1) Huffman-decode the input bit stream of one bit block using the same code table as the ESP encoder; decoding yields M*N/4 frequency coefficient pairs, each containing two frequency coefficients, for M*N/2 frequency coefficients in total, together with M inverse scaling factors ISF[m] (m = 1, ..., M), one per frame;
(2) apply inverse quantization to the M*N/2 decoded frequency coefficients to obtain linear frequency coefficients;
(3) inverse-scale the linear frequency coefficients from step (2) with the inverse scaling factors ISF[m] (m = 1, ..., M) decoded in step (1) (the two are multiplied) to obtain the inverse-scaled frequency coefficients;
(4) divide the M*N/2 inverse-scaled frequency coefficients, in order, into M frames of N/2 frequency coefficients each and output them to the inverse time-frequency analysis module (42).
The inverse quantization formula can be $y = \operatorname{int}[x^{3/4}]$, where int[·] denotes taking the integer part.
The inverse time-frequency analysis module (42) of the ESP decoder converts the frequency coefficients output by the entropy decoding/inverse quantization module into a PCM (Pulse Code Modulation) signal. This module uses the IMDCT (Inverse Modified Discrete Cosine Transform), a purely real transform for which several efficient fast algorithms exist; the algorithms disclosed in documents such as [1] and [2] cited above can all be applied to the present invention and are not described further here. With a fast algorithm, the average computational complexity C per sample relates to N as $C \propto \log_2 N$. The IMDCT is defined as follows:
where X(k) denotes a frequency coefficient and k is the coefficient index, with range k = 0, 1, ..., N/2 - 1;
the symbol shown in image G2008100634887D00093 denotes an audio sample after the IMDCT, and n is its index, with range n = 0, 1, ..., N-1;
N denotes the number of audio samples in each frame.
An ESP decoding method, characterized in that it comprises the following steps:
(1) a bit stream of one bit-block length is read from the DRAM buffer and subjected to entropy decoding/inverse quantization to obtain the reconstructed frequency coefficients;
(2) inverse time-frequency analysis using the IMDCT (Inverse Modified Discrete Cosine Transform) is applied to the frequency coefficients to obtain the reconstructed time-domain signal, which is output for playback.
The entropy decoding/inverse quantization steps are as follows (as shown in Figure 6):
(1) the bit stream of one input bit block is Huffman-decoded using the same code table as the ESP coding method, yielding M*N/4 frequency coefficient pairs, each containing two frequency coefficients, for a total of M*N/2 frequency coefficients, together with M inverse scaling factors ISF[m] (m = 1, ..., M), one inverse scaling factor per frame;
(2) the M*N/2 decoded frequency coefficients are inverse-quantized to obtain linear frequency coefficients;
(3) the inverse scaling factors ISF[m] (m = 1, ..., M) obtained in step (1) are used to inverse-scale the linear frequency coefficients obtained in step (2) (the two are multiplied), giving the inverse-scaled frequency coefficients;
(4) the M*N/2 inverse-scaled frequency coefficients are divided in order into M frames of N/2 frequency coefficients each and output to the inverse time-frequency analysis module.
The inverse quantization formula may be $y = \mathrm{int}[x^{3/4}]$, where the function int[·] takes the integer part.
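Steps (2) to (4) above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, the Huffman decoding of step (1) is omitted because the code table is not given, and the 4/3 exponent is an assumption (the mathematical inverse of a 3/4 power-law quantizer, as in MP3-style codecs), whereas the text prints the same formula for both directions. Sign handling is also assumed. Frames are indexed from 0 here, while the text uses m = 1, ..., M.

```python
def dequantize(q):
    """Undo a power-law quantizer, preserving the sign.

    Assumption: the inverse of a 3/4 power law is the 4/3 power.
    """
    magnitude = abs(q) ** (4.0 / 3.0)
    return magnitude if q >= 0 else -magnitude


def inverse_quantize_block(coeffs, isf, M, N):
    """Steps (2)-(4): dequantize, inverse-scale per frame, split into M frames.

    coeffs: M*N/2 Huffman-decoded quantized coefficients (step (1) output)
    isf:    M inverse scaling factors, one per frame
    """
    per_frame = N // 2
    if len(coeffs) != M * per_frame or len(isf) != M:
        raise ValueError("expected M*N/2 coefficients and M inverse scaling factors")
    frames = []
    for m in range(M):
        chunk = coeffs[m * per_frame:(m + 1) * per_frame]
        # Step (2): inverse quantization; step (3): multiply by ISF[m].
        frames.append([isf[m] * dequantize(q) for q in chunk])
    # Step (4): each inner list holds the N/2 coefficients of one frame.
    return frames


frames = inverse_quantize_block([8, -8, 0, 1, 1, 0, -1, 27], isf=[0.5, 2.0], M=2, N=8)
```

Each inner list of `frames` would then be passed to the inverse time-frequency analysis module as one frame of N/2 coefficients.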
The inverse time-frequency analysis in the ESP decoding method uses the IMDCT (Inverse Modified Discrete Cosine Transform), defined as follows:
[Formula image G2008100634887D00101: IMDCT definition]
where X(k) denotes a frequency coefficient and k is its index, with range $k = 0, 1, \ldots, \frac{N}{2} - 1$;
the symbol shown in [image G2008100634887D00103] denotes the audio sample after the IMDCT and n is its index, with range n = 0, 1, ..., N-1;
N denotes the number of audio samples in each frame.
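The relationship between the forward and inverse transforms can be checked numerically with a direct (slow) implementation. The sketch below uses the MDCT exactly as written in claim 1; the IMDCT normalization of 2/N, the absence of a window, and the overlap-add with a hop of N/2 samples are assumptions made for illustration, since the IMDCT formula is not reproduced in this text. The function names are hypothetical, and a real decoder would use one of the fast algorithms mentioned above rather than this O(N) per-sample form.

```python
import math


def mdct(x):
    """Direct MDCT of one N-sample frame, returning N/2 coefficients."""
    N = len(x)
    return [
        sum(x[n] * math.cos(math.pi * (2 * n + 1 + N / 2) * (2 * k + 1) / (2 * N))
            for n in range(N))
        for k in range(N // 2)
    ]


def imdct(X):
    """Direct IMDCT of N/2 coefficients, returning N aliased samples.

    Assumption: normalization 2/N, chosen so that unwindowed overlap-add
    of consecutive frames cancels the time-domain aliasing.
    """
    N = 2 * len(X)
    scale = 2.0 / N
    return [
        scale * sum(X[k] * math.cos(math.pi * (2 * n + 1 + N / 2) * (2 * k + 1) / (2 * N))
                    for k in range(N // 2))
        for n in range(N)
    ]


def overlap_add_roundtrip(signal, N):
    """Encode and decode frames of length N with hop N/2, then overlap-add."""
    hop = N // 2
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - N + 1, hop):
        frame = imdct(mdct(signal[start:start + N]))
        for n in range(N):
            out[start + n] += frame[n]
    return out


signal = [math.sin(0.3 * i) + 0.1 * i for i in range(32)]
rebuilt = overlap_add_roundtrip(signal, 8)
# Samples covered by two frames (indices 4..27 here) match the input;
# the first and last N/2 samples are covered by one frame and stay aliased.
```

A single frame on its own does not reconstruct its input, because N samples are represented by only N/2 coefficients; reconstruction relies on the overlap with the neighbouring frames.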
An ESP system, characterized in that it comprises an ESP encoder, a DRAM buffer and an ESP decoder. When the ESP system is enabled, the audio samples are encoded in turn by the ESP encoder, the encoded bit stream is stored in the DRAM buffer, and the bit stream in the DRAM buffer is decoded by the ESP decoder and output. The system uses the ESP encoder and the ESP decoder described above.
It should be understood that the above embodiments merely illustrate the present invention and do not limit it; any inventive modification within the spirit and scope of the present invention, including equivalent replacements and other insubstantial modifications, falls within the scope of protection of the present invention.

Claims (23)

1. An ESP encoder, characterized in that audio samples are input to a time-frequency analysis module and converted into frequency coefficients, the frequency coefficients are compressed into a bit stream by a quantization/entropy coding module, and the bit stream is output to a DRAM buffer, wherein the time-frequency analysis module of the ESP encoder uses the MDCT (Modified Discrete Cosine Transform), defined as follows:
$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left(\frac{\pi (2n + 1 + N/2)(2k + 1)}{2N}\right), \quad k = 0, 1, \ldots, \frac{N}{2} - 1$$
where x(n) denotes an audio sample and n is its index, with range n = 0, 1, ..., N-1;
X(k) denotes a frequency coefficient and k is its index, with range $k = 0, 1, \ldots, \frac{N}{2} - 1$;
N denotes the number of audio samples in each frame.
2. The ESP encoder as claimed in claim 1, characterized in that the value of N may be chosen from 128, 256, 512, 1024 and 2048.
3. The ESP encoder as claimed in claim 1, characterized in that the quantization/entropy coding steps are as follows:
(1) the frequency coefficients output by the time-frequency analysis module are multiplied by a scaling factor;
(2) the scaled frequency coefficients are nonlinearly quantized;
(3) the quantized frequency coefficients are combined in twos into frequency coefficient pairs (N/4 pairs per frame in total);
(4) a Huffman code table is selected and the frequency coefficient pairs are Huffman-coded with it to generate the bit stream;
(5) the number of bits output by the Huffman coding is checked against the specified compression-rate requirement; if it exceeds the requirement, the scaling factor is reduced and the procedure returns to step (1); the iteration exits once the compression-rate requirement is met.
4. The ESP encoder as claimed in claim 3, characterized in that the quantization formula used may be $y = \mathrm{int}[x^{3/4}]$, where the function int[·] takes the integer part.
5. The ESP encoder as claimed in claim 3, characterized in that the code table may be a fixed code table.
6. The ESP encoder as claimed in claim 3, characterized in that the ESP encoder may further comprise a packing module, which packs M frames of bit stream output by the quantization/entropy coding module in order into one bit block and then outputs it to the DRAM buffer.
7. An ESP coding method, characterized in that it comprises the following steps:
(1) the audio samples are converted into frequency coefficients by time-frequency analysis;
(2) the frequency coefficients are compressed into a bit stream by quantization/entropy coding;
wherein the time-frequency analysis uses the MDCT (Modified Discrete Cosine Transform), defined as follows:
$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left(\frac{\pi (2n + 1 + N/2)(2k + 1)}{2N}\right), \quad k = 0, 1, \ldots, \frac{N}{2} - 1$$
where x(n) denotes an audio sample and n is its index, with range n = 0, 1, ..., N-1;
X(k) denotes a frequency coefficient and k is its index, with range $k = 0, 1, \ldots, \frac{N}{2} - 1$;
N denotes the number of audio samples in each frame.
8. The ESP coding method as claimed in claim 7, characterized in that the value of N may be chosen from 128, 256, 512, 1024 and 2048.
9. The ESP coding method as claimed in claim 7, characterized in that the quantization/entropy coding steps are as follows:
(1) the frequency coefficients output by the time-frequency analysis are multiplied by a scaling factor;
(2) the scaled frequency coefficients are nonlinearly quantized;
(3) the quantized frequency coefficients are combined in twos into frequency coefficient pairs (N/4 pairs per frame in total);
(4) a Huffman code table is selected and the frequency coefficient pairs are Huffman-coded with it to generate the bit stream;
(5) the number of bits output by the Huffman coding is checked against the specified compression-rate requirement; if it exceeds the requirement, the scaling factor is reduced and the procedure returns to step (1); the iteration exits once the compression-rate requirement is met.
10. The ESP coding method as claimed in claim 9, characterized in that the quantization formula may be $y = \mathrm{int}[x^{3/4}]$, where the function int[·] takes the integer part.
11. The ESP coding method as claimed in claim 7, characterized in that M frames of bit stream output by the quantization/entropy coding may further be packed in order into one bit block and then output to the DRAM buffer.
12. The ESP coding method as claimed in claim 11, characterized in that M may be chosen from 2 to 16.
13. An ESP decoder, characterized in that a bit stream of one bit-block length is read from the DRAM buffer and sent to the input buffer of the entropy decoding/inverse quantization module; the resulting reconstructed frequency coefficients are sent to the inverse time-frequency analysis module, where the IMDCT (Inverse Modified Discrete Cosine Transform) is applied to obtain the reconstructed time-domain signal, which is output for playback.
14. The ESP decoder as claimed in claim 13, characterized in that the entropy decoding/inverse quantization steps are as follows:
(1) the bit stream of one input bit block is Huffman-decoded using the same code table as the ESP encoder, yielding M*N/4 frequency coefficient pairs, each containing two frequency coefficients, for a total of M*N/2 frequency coefficients, together with M inverse scaling factors ISF[m] (m = 1, ..., M), one inverse scaling factor per frame;
(2) the M*N/2 decoded frequency coefficients are inverse-quantized to obtain linear frequency coefficients;
(3) the inverse scaling factors ISF[m] (m = 1, ..., M) obtained in the first step are used to inverse-scale the linear frequency coefficients obtained in the second step (the two are multiplied), giving the inverse-scaled frequency coefficients;
(4) the M*N/2 inverse-scaled frequency coefficients are divided into M frames of N/2 frequency coefficients each and output to the inverse time-frequency analysis module.
15. The ESP decoder as claimed in claim 14, characterized in that the inverse quantization formula may be $y = \mathrm{int}[x^{3/4}]$, where the function int[·] takes the integer part.
16. The ESP decoder as claimed in claim 13, characterized in that the inverse time-frequency analysis module in the ESP decoder uses the IMDCT (Inverse Modified Discrete Cosine Transform), defined as follows:
[Formula image A2008100634880004C1: IMDCT definition]
where X(k) denotes a frequency coefficient and k is its index, with range $k = 0, 1, \ldots, \frac{N}{2} - 1$;
the symbol shown in [image A2008100634880004C3] denotes the audio sample after the IMDCT and n is its index, with range n = 0, 1, ..., N-1;
N denotes the number of audio samples in each frame.
17. An ESP decoding method, characterized in that it comprises the following steps:
(1) a bit stream of one bit-block length is subjected to entropy decoding/inverse quantization to obtain the reconstructed frequency coefficients;
(2) inverse time-frequency analysis is performed to obtain the reconstructed time-domain signal, which is output for playback.
18. The ESP decoding method as claimed in claim 17, characterized in that the entropy decoding/inverse quantization steps are as follows:
(1) the bit stream of one input bit block is Huffman-decoded using the same code table as the ESP coding method, yielding M*N/4 frequency coefficient pairs, each containing two frequency coefficients, for a total of M*N/2 frequency coefficients, together with M inverse scaling factors ISF[m] (m = 1, ..., M), one inverse scaling factor per frame;
(2) the M*N/2 decoded frequency coefficients are inverse-quantized to obtain linear frequency coefficients;
(3) the inverse scaling factors ISF[m] (m = 1, ..., M) obtained in the first step are used to inverse-scale the linear frequency coefficients obtained in the second step (the two are multiplied), giving the inverse-scaled frequency coefficients;
(4) the M*N/2 inverse-scaled frequency coefficients are divided into M frames of N/2 frequency coefficients each and output to the inverse time-frequency analysis module.
19. The ESP decoding method as claimed in claim 18, characterized in that the inverse quantization formula may be $y = \mathrm{int}[x^{3/4}]$, where the function int[·] takes the integer part.
20. The ESP decoding method as claimed in claim 17, characterized in that the inverse time-frequency analysis uses the IMDCT, defined as follows:
[Formula image A2008100634880005C1: IMDCT definition]
where X(k) denotes a frequency coefficient and k is its index, with range $k = 0, 1, \ldots, \frac{N}{2} - 1$;
the symbol shown in [image A2008100634880005C3] denotes the audio sample after the IMDCT and n is its index, with range n = 0, 1, ..., N-1;
N denotes the number of audio samples in each frame.
21. An ESP system, characterized in that it comprises an ESP encoder, a DRAM buffer and an ESP decoder; when the ESP system is enabled, the audio samples are encoded in turn by the ESP encoder, the encoded bit stream is stored in the DRAM buffer, and the bit stream in the DRAM buffer is decoded by the ESP decoder and output; the encoder uses the MDCT, nonlinear quantization and entropy coding, and the decoder uses entropy decoding, nonlinear inverse quantization and the IMDCT.
22. The ESP system as claimed in claim 21, characterized in that the MDCT is defined as follows:
$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left(\frac{\pi (2n + 1 + N/2)(2k + 1)}{2N}\right), \quad k = 0, 1, \ldots, \frac{N}{2} - 1$$
where x(n) denotes an audio sample and n is its index, with range n = 0, 1, ..., N-1;
X(k) denotes a frequency coefficient and k is its index, with range $k = 0, 1, \ldots, \frac{N}{2} - 1$;
N denotes the number of audio samples in each frame.
23. The ESP system as claimed in claim 21, characterized in that the IMDCT is defined as follows:
[Formula image A2008100634880005C6: IMDCT definition]
where X(k) denotes a frequency coefficient and k is its index, with range $k = 0, 1, \ldots, \frac{N}{2} - 1$;
the symbol shown in [image A2008100634880005C8] denotes the audio sample after the IMDCT and n is its index, with range n = 0, 1, ..., N-1;
N denotes the number of audio samples in each frame.
CN200810063488A 2008-08-14 2008-08-14 Audio coder and decoder, and coding and decoding methods Pending CN101651457A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810063488A CN101651457A (en) 2008-08-14 2008-08-14 Audio coder and decoder, and coding and decoding methods

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200810063488A CN101651457A (en) 2008-08-14 2008-08-14 Audio coder and decoder, and coding and decoding methods

Publications (1)

Publication Number Publication Date
CN101651457A true CN101651457A (en) 2010-02-17

Family

ID=41673598

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200810063488A Pending CN101651457A (en) 2008-08-14 2008-08-14 Audio coder and decoder, and coding and decoding methods

Country Status (1)

Country Link
CN (1) CN101651457A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102419978A (en) * 2011-08-23 2012-04-18 展讯通信(上海)有限公司 Audio decoder and frequency spectrum reconstructing method and device for audio decoding
CN107731237A (en) * 2012-09-24 2018-02-23 三星电子株式会社 Time domain frame error concealing device

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102419978A (en) * 2011-08-23 2012-04-18 展讯通信(上海)有限公司 Audio decoder and frequency spectrum reconstructing method and device for audio decoding
CN102419978B (en) * 2011-08-23 2013-03-27 展讯通信(上海)有限公司 Audio decoder and frequency spectrum reconstructing method and device for audio decoding
CN107731237A (en) * 2012-09-24 2018-02-23 三星电子株式会社 Time domain frame error concealing device
CN107731237B (en) * 2012-09-24 2021-07-20 三星电子株式会社 Time domain frame error concealment apparatus

Similar Documents

Publication Publication Date Title
CN103258541B (en) Adaptive time/frequency-based audio encoding and decoding apparatuses and methods
CN102150202B (en) Method and apparatus audio/speech signal encoded and decode
CN101836251B (en) Scalable speech and audio encoding using combinatorial encoding of MDCT spectrum
KR100571824B1 (en) Method for encoding/decoding of embedding the ancillary data in MPEG-4 BSAC audio bitstream and apparatus using thereof
CN101527138B (en) Coding method and decoding method for ultra wide band expansion, coder and decoder as well as system for ultra wide band expansion
CN103380455B (en) Efficient encoding/decoding of audio signals
CN102306494B (en) Method and apparatus for encoding/decoding audio signal
KR20100089772A (en) Method of coding/decoding audio signal and apparatus for enabling the method
CN1262990C (en) Audio coding method and apparatus using harmonic extraction
CN101651457A (en) Audio coder and decoder, and coding and decoding methods
CN1874163B (en) Method for compression and expansion of digital audio data
KR20230091045A (en) An audio processing method using complex data and devices for performing the same
KR100911994B1 (en) Method and apparatus for encoding/decoding signal having strong non-stationary properties using hilbert-huang transform
JP3348759B2 (en) Transform coding method and transform decoding method
JPH09230898A (en) Acoustic signal transformation and encoding and decoding method
Servetti et al. Fast implementation of the MPEG-4 AAC main and low complexity decoder
Malvar Lossless and near-lossless audio compression using integer-reversible modulated lapped transforms
CN103035249B (en) Audio arithmetic coding method based on time-frequency plane context
CN103489450A (en) Wireless audio compression and decompression method based on time domain aliasing elimination and equipment thereof
Zhang et al. AVS-M audio: algorithm and implementation
Lee et al. A VLSI implementation of MPEG-2 AAC decoder system
KR20080092823A (en) Apparatus and method for encoding and decoding signal
KR20240066586A (en) Method and apparatus for encoding and decoding audio signal using complex polar quantizer
CN1764073B (en) Re-quantization method in audio decode
KR20080034819A (en) Apparatus and method for encoding and decoding signal

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20100217