CN101651457A

CN101651457A - Audio coder and decoder, and coding and decoding methods

Info

Publication number: CN101651457A
Application number: CN200810063488A
Authority: CN
Inventors: 李炎钊; 邢文峰
Original assignee: Hangzhou Silan Microelectronics Co Ltd
Current assignee: Hangzhou Silan Microelectronics Co Ltd
Priority date: 2008-08-14
Filing date: 2008-08-14
Publication date: 2010-02-17

Abstract

The invention provides an ESP coder and a coding method, characterized in that: audio samples are input into a time-frequency analysis module, which applies the MDCT to convert them into frequency coefficients; the frequency coefficients are compressed into a bit stream by a quantization/entropy coding module; and the bit stream is output to a DRAM buffer. The invention also provides an ESP decoder and a decoding method, characterized in that: a bit stream of one bit-block length is read from the DRAM buffer and sent to the input buffer of an entropy decoding/inverse quantization module; entropy decoding and non-linear inverse quantization yield reconstructed frequency coefficients, which are sent to an inverse time-frequency analysis module; and the IMDCT is applied to obtain the reconstructed time-domain signal, which is output and played. The ESP coder and decoder and the coding and decoding methods can be applied to an electronic shock protection system.

Description

Audio coder and decoder, and coding and decoding methods

Technical field

The present invention relates to an audio encoding and decoding technique of low computational complexity, and in particular to an encoding and decoding technique that uses the MDCT (Modified Discrete Cosine Transform) and its inverse. The technique can be applied to electronic shock protection systems.

Background technology

Portable and vehicle-mounted disc players are easily affected by mechanical shock, which prevents the audio data from being played back continuously. To overcome this problem, such players are generally equipped with an electronic shock protection (ESP) system, whose basic architecture is shown in Fig. 1. When the player operates in shock-protection mode, the optical disc servo reads audio data at a rotation speed higher than in normal mode; the audio samples are compressed by the ESP encoder, and the compressed data are stored in a dynamic random access memory (DRAM) buffer. Once the buffer is full, the disc servo is slowed to normal speed, while the ESP decoder reads data from the DRAM buffer at normal speed, decodes it, and sends it to playback. When a shock causes the servo to fail to read data from the disc, the ESP decoder can still read and decode data from the DRAM buffer, which provides shock protection for a certain period of time.

The shock-protection time of an ESP system is determined by the DRAM capacity and the compression ratio of the audio coding algorithm. The ESP system places three requirements on the audio codec algorithm: 1. Sound quality must be essentially lossless. 2. The higher the compression ratio, the longer the shock-protection time that a DRAM of a given capacity supports, or the smaller and cheaper the DRAM needed for a given shock-protection time; a compression ratio of 4:1 or more is generally required. 3. Because at the peak processing stage the ESP system must perform two encoding operations and one decoding operation in parallel, a software implementation requires a codec algorithm of low computational complexity.

Conventional ESP systems generally use adaptive differential pulse code modulation (ADPCM) for audio coding and decoding. Taking a CD player as an example, audio samples on a CD are generally 16 bits; ADPCM coding compresses each sample to 4 bits, a compression ratio of 4:1, with essentially no loss of sound quality. The bit rate is 4 (bits) × 44100 × 2 (channels) = 352800 bps, so a 16 Mbit DRAM, for example, can store 16000000/352800 ≈ 45.3 s of audio. The ADPCM codec nevertheless has two significant drawbacks: 1. The ADPCM algorithm computes sample by sample and has a heavy computational load, so it can only be implemented with an application-specific integrated circuit (ASIC) and is unsuitable for real-time software implementation; adopting it requires adding a dedicated ESP chip to the system, which increases cost. 2. The compression efficiency of ADPCM is low, and compression ratios above 4:1 cause obvious degradation of sound quality.
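The storage arithmetic above can be checked with a short sketch. The function name `shock_time_seconds` and its parameters are illustrative and not from the source; CD audio is assumed to be 16-bit, 44.1 kHz, two-channel, as in the example.

```python
def shock_time_seconds(dram_bits, compression_ratio,
                       sample_bits=16, sample_rate=44100, channels=2):
    """Seconds of audio that a DRAM buffer holds at a given compression ratio."""
    raw_bps = sample_bits * sample_rate * channels   # uncompressed CD bit rate
    coded_bps = raw_bps / compression_ratio          # bit rate after compression
    return dram_bits / coded_bps


adpcm_time = shock_time_seconds(16_000_000, 4)   # 4:1, the ADPCM case in the text
higher_time = shock_time_seconds(16_000_000, 5)  # 5:1, shown only for comparison
```

For a fixed DRAM size the buffered time grows in proportion to the compression ratio, so moving from 4:1 to 5:1 lengthens it by a factor of 1.25.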

Patent application CN200310104551.4 proposes using the MPEG1 audio codec algorithm as the ESP codec, which can raise the compression ratio with essentially no loss of sound quality. However, because that method uses a psychoacoustic model and a hybrid filter bank for time-frequency analysis and inverse time-frequency analysis, its computational complexity is considerable. At its peak processing stage it must run two MPEG1 Layer 3 encoding processes and one MPEG1 Layer 3 decoding process in parallel. Taking the mainstream embedded processor ARM9 as an example, more than 100 MIPS is still required after optimization, and this figure is only an average obtained in software simulation. If one accounts for the overhead of loading program and data from DRAM into the cache, and for the fact that the entropy coding workload depends on the data, the peak load for some songs exceeds this figure, and the peak load of MPEG1 Layer 3 can reach 140 MIPS or more. Such complexity is difficult to handle in real time on existing mainstream embedded processors such as the ARM9; doing so would require raising the processor clock frequency above 140 MHz, which in turn significantly increases system power consumption.

Summary of the invention

The present invention aims to overcome the deficiencies of the prior art. Taking both algorithm complexity and compression ratio into account, it provides an ESP encoder, an ESP decoder, and an electronic shock protection (ESP) system that compress audio data effectively without increasing system hardware cost. The invention also provides a low-complexity audio coding and decoding algorithm that is suited to software implementation and effectively extends the shock-protection time.

An ESP encoder, characterized in that audio samples are input into a time-frequency analysis module and converted into frequency coefficients, the frequency coefficients are compressed into a bit stream by a quantization/entropy coding module, and the bit stream is output to the DRAM buffer.

The time-frequency analysis module of the ESP encoder uses the MDCT (Modified Discrete Cosine Transform), defined as follows:

$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left(\frac{\pi (2n + 1 + N/2)(2k + 1)}{2N}\right), \quad k = 0, 1, \ldots, \frac{N}{2} - 1$$

where x(n) denotes an audio sample and n is the sample index, with range n = 0, 1, ..., N-1;

X(k) denotes a frequency coefficient and k is the coefficient index, with range $k = 0, 1, \ldots, \frac{N}{2} - 1$;

N denotes the number of audio samples in each frame.

The MDCT is a purely real transform that operates frame by frame. Adjacent frames overlap by half their samples, which eliminates discontinuities at frame boundaries. N can be chosen from 128, 256, 512, 1024, and 2048.
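As an illustration of the definition above, the following minimal sketch evaluates the MDCT directly from the formula. It is an O(N²) reference computation, not one of the fast algorithms, and the function name `mdct` is an illustrative choice rather than anything named in the source.

```python
import math


def mdct(frame):
    """Direct evaluation of X(k) for one frame of N samples; returns N/2 coefficients."""
    N = len(frame)
    coeffs = []
    for k in range(N // 2):
        acc = 0.0
        for n in range(N):
            angle = math.pi * (2 * n + 1 + N / 2) * (2 * k + 1) / (2 * N)
            acc += frame[n] * math.cos(angle)
        coeffs.append(acc)
    return coeffs
```

Each frame of N input samples yields N/2 coefficients; because consecutive frames share half their samples, the overall coefficient rate equals the sample rate.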

According to the required compression ratio, the quantization/entropy coding module compresses the data using nonlinear quantization and Huffman coding; with sound quality maintained, a compression ratio of at least 5:1 can be reached. The quantization/entropy coding steps are as follows:

(1) The frequency coefficients output by the time-frequency analysis module are multiplied by a scaling factor;

(2) The scaled frequency coefficients are nonlinearly quantized;

(3) The quantized frequency coefficients are grouped two by two into coefficient pairs (N/4 pairs per frame in total);

(4) A Huffman code table is selected, and the coefficient pairs are Huffman-coded with this table to generate the bit stream;

(5) It is checked whether the number of bits produced by Huffman coding exceeds the limit set by the specified compression ratio; if so, the scaling factor is reduced and the process returns to step (1), iterating until the compression-ratio requirement is met.

The quantization formula can be $y = \mathrm{int}[x^{3/4}]$, where int[·] denotes taking the integer part.

The code table can be a single fixed code table.
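Steps (1) to (5) form a rate-control loop, which can be sketched as below. Several parts are assumptions made only for illustration: `estimate_bits` is a crude stand-in for the Huffman coder (it is not a Huffman coder), the quantizer is applied to the magnitude with the sign kept, the scaling factor shrinks by a fixed ratio `shrink`, and the loop gives up after `max_iter` passes.

```python
def quantize(coeffs, scale):
    """Steps (1)-(2): scale, then apply y = int[|x|^(3/4)], keeping the sign (assumed)."""
    out = []
    for c in coeffs:
        q = int((abs(c) * scale) ** 0.75)
        out.append(-q if c < 0 else q)
    return out


def estimate_bits(pairs):
    """Hypothetical bit-cost model standing in for Huffman coding of the pairs."""
    return sum(1 + 2 * (abs(a).bit_length() + abs(b).bit_length()) for a, b in pairs)


def rate_loop(coeffs, bit_budget, scale=1.0, shrink=0.9, max_iter=200):
    """Steps (1)-(5): reduce the scaling factor until the frame fits the bit budget."""
    for _ in range(max_iter):
        q = quantize(coeffs, scale)
        pairs = list(zip(q[0::2], q[1::2]))   # step (3): group coefficients in pairs
        bits = estimate_bits(pairs)           # step (4): stand-in for Huffman coding
        if bits <= bit_budget:                # step (5): budget met, leave the loop
            return scale, pairs, bits
        scale *= shrink                       # otherwise shrink and start again
    raise ValueError("bit budget cannot be met")


scale, pairs, bits = rate_loop([100.0, -50.0, 25.0, 0.0], bit_budget=12)
```

A smaller scaling factor pushes more coefficients toward zero, which is what lets the loop trade precision for bit count.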

The ESP codec can further comprise a packing module. The packing module of the ESP encoder packs the M frames of bit stream output by the quantization/entropy coding module, in order, into one bit block of fixed length, which is then output to the DRAM buffer. If the total length of the M frames of bit stream is less than the fixed bit-block length, the remainder of the bit block can be zero-padded. Considering storage space, codec delay, and bit-rate smoothing, M can be chosen in the range 2 to 16. On the one hand, audio signals are inherently non-stationary and can be treated as stationary only over a short period (on the order of 10 ms), so the number of bits produced by quantization and coding varies greatly from frame to frame; packing the coded bit streams of M frames into one bit block reduces the impact of peaks in the bit stream. On the other hand, because the bit-block length is fixed, the ESP decoder can easily determine the start position of each bit block without a time-consuming block synchronization search, which reduces the system's computational overhead.
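The packing step can be sketched as follows, with each frame's bit stream represented as a string of '0' and '1' characters for readability. The name `pack_block` is illustrative, and raising an error when the frames do not fit is an assumption, since the text does not describe that case.

```python
def pack_block(frame_bits, block_len):
    """Concatenate the M per-frame bit strings in order and zero-pad to block_len bits."""
    payload = "".join(frame_bits)
    if len(payload) > block_len:
        # Overflow handling is not specified in the source; raising is an assumption.
        raise ValueError("frames do not fit in one bit block")
    return payload + "0" * (block_len - len(payload))


block = pack_block(["101", "11"], block_len=8)
```

Because every block has the same length, a long frame can borrow space left over by shorter frames in the same block, which is the smoothing effect described above.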

An ESP coding method, characterized by comprising the following steps:

(1) Audio samples are converted into frequency coefficients by time-frequency analysis;

(2) The frequency coefficients are compressed into a bit stream by quantization/entropy coding;

The time-frequency analysis of the ESP coding method uses the MDCT (Modified Discrete Cosine Transform), defined as follows:

$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left(\frac{\pi (2n + 1 + N/2)(2k + 1)}{2N}\right), \quad k = 0, 1, \ldots, \frac{N}{2} - 1$$

N can be chosen from 128, 256, 512, 1024, and 2048.

The quantization/entropy coding steps are as follows:

(1) The frequency coefficients output by time-frequency analysis are multiplied by a scaling factor;

(4) A Huffman code table is selected, and the coefficient pairs are Huffman-coded with this table to generate the bit stream;

(5) It is checked whether the number of bits produced by Huffman coding exceeds the limit set by the specified compression ratio; if so, the scaling factor is reduced and the process returns to step (1), iterating until the compression-ratio requirement is met.

The quantization formula can be $y = \mathrm{int}[x^{3/4}]$, where int[·] denotes taking the integer part.

An ESP coding method, further characterized in that the M frames of bit stream output by quantization/entropy coding can be packed, in order, into one bit block of fixed length; if the total length of the M frames of bit stream is less than the fixed bit-block length, the remainder of the bit block can be zero-padded.

M can be chosen in the range 2 to 16.

An ESP decoder, characterized in that a bit stream of one bit-block length is read from the DRAM buffer and sent to the input buffer of an entropy decoding/inverse quantization module; the reconstructed frequency coefficients obtained after entropy decoding and inverse quantization are sent to an inverse time-frequency analysis module, which applies the IMDCT (Inverse Modified Discrete Cosine Transform); and the reconstructed time-domain signal is output for playback.

The entropy decoding/inverse quantization module operates as follows:

(1) The bit stream of one bit block in the input buffer is Huffman-decoded using the same code table as the ESP encoder. Decoding yields M*N/4 coefficient pairs, each containing two frequency coefficients, for a total of M*N/2 frequency coefficients, together with M inverse scaling factors ISF[m] (m = 1, ..., M), one per frame;

(2) The M*N/2 decoded frequency coefficients are inverse-quantized to obtain linear frequency coefficients;

(3) The inverse scaling factors ISF[m] (m = 1, ..., M) decoded in step (1) are applied to the linear frequency coefficients obtained in step (2) (the two are multiplied), giving the inverse-scaled frequency coefficients;

(4) The M*N/2 inverse-scaled frequency coefficients are divided, in order, into M frames of N/2 coefficients each and output to the inverse time-frequency analysis module.

The inverse quantization formula can be $y = \mathrm{int}[x^{3/4}]$, where int[·] denotes taking the integer part.
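Steps (2) to (4) can be sketched as below. Note that the text gives the same 3/4-power expression for inverse quantization as for quantization; this sketch instead assumes the mathematical inverse, a 4/3 power applied to the magnitude with the sign kept. That choice, the function name `dequantize_block`, and the flat-list input layout are all assumptions made for illustration.

```python
import math


def dequantize_block(q, isf, M, N):
    """Inverse-quantize M*N/2 values, apply ISF[m] per frame, and split into M frames."""
    half = N // 2
    if len(q) != M * half or len(isf) != M:
        raise ValueError("unexpected block size")
    frames = []
    for m in range(M):
        frame = []
        for v in q[m * half:(m + 1) * half]:
            linear = math.copysign(abs(v) ** (4.0 / 3.0), v)  # step (2), assumed 4/3 law
            frame.append(linear * isf[m])                     # step (3), inverse scaling
        frames.append(frame)                                  # step (4), N/2 values per frame
    return frames


frames = dequantize_block([8, -8, 0, 1], isf=[1.0, 2.0], M=2, N=4)
```

With one inverse scaling factor per frame, the decoder needs no iteration: each frame is restored in a single pass.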

The inverse time-frequency analysis module in the ESP decoder converts the frequency coefficients output by the entropy decoding/inverse quantization module into a PCM (Pulse Code Modulation) signal. This module uses the IMDCT (Inverse Modified Discrete Cosine Transform), a purely real transform for which several efficient fast algorithms exist; they are not repeated here. With a fast algorithm, the average computational complexity C per sample is related to N by C ∝ log₂N. The IMDCT is defined as follows:

where X(k) denotes a frequency coefficient and k is the coefficient index, with range

$k = 0, 1, \ldots, \frac{N}{2} - 1$;

and n is the index of the audio sample obtained after the IMDCT, with range n = 0, 1, ..., N-1;
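For reference, a commonly used textbook form of the IMDCT that pairs with the MDCT definition given earlier is shown below. The symbol $\hat{x}(n)$ and the $2/N$ scale factor are assumptions rather than the patent's own notation; with this scaling, overlap-adding the second half of one frame's output with the first half of the next frame's output recovers the original samples.

```latex
\hat{x}(n) = \frac{2}{N} \sum_{k=0}^{N/2-1} X(k)
  \cos\left(\frac{\pi (2n + 1 + N/2)(2k + 1)}{2N}\right),
  \quad n = 0, 1, \ldots, N-1
```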

An ESP decoding method, characterized by comprising the following steps:

(1) A bit stream of one bit-block length undergoes entropy decoding/inverse quantization to obtain the reconstructed frequency coefficients;

(2) The frequency coefficients undergo inverse time-frequency analysis by the IMDCT (Inverse Modified Discrete Cosine Transform), and the reconstructed time-domain signal is obtained and output for playback.

The entropy decoding/inverse quantization steps are as follows:

(1) The bit stream of one bit block in the input buffer is Huffman-decoded using the same code table as the ESP coding method. Decoding yields M*N/4 coefficient pairs, each containing two frequency coefficients, for a total of M*N/2 frequency coefficients, together with M inverse scaling factors ISF[m] (m = 1, ..., M), one per frame;

The inverse quantization formula can be $y = \mathrm{int}[x^{3/4}]$, where int[·] denotes taking the integer part.

The inverse time-frequency analysis in the ESP decoding method uses the IMDCT (Inverse Modified Discrete Cosine Transform), defined as follows:

where k is the index of the frequency coefficient, with range $k = 0, 1, \ldots, \frac{N}{2} - 1$;

and n is the index of the audio sample obtained after the IMDCT, with range n = 0, 1, ..., N-1;
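The way half-overlapped frames cancel each other's aliasing can be checked numerically with the sketch below. It uses direct O(N²) evaluation rather than a fast algorithm, applies no window, and assumes a 2/N scale factor in the inverse transform; the function names are illustrative and not from the source.

```python
import math


def _basis(n, k, N):
    """Cosine kernel shared by the forward and inverse transforms."""
    return math.cos(math.pi * (2 * n + 1 + N / 2) * (2 * k + 1) / (2 * N))


def mdct(frame):
    N = len(frame)
    return [sum(frame[n] * _basis(n, k, N) for n in range(N)) for k in range(N // 2)]


def imdct(coeffs):
    """Inverse transform with an assumed 2/N scale: N/2 coefficients in, N samples out."""
    half = len(coeffs)
    N = 2 * half
    return [(2.0 / N) * sum(coeffs[k] * _basis(n, k, N) for k in range(half))
            for n in range(N)]


def overlap_add(signal, N):
    """Transform each half-overlapped frame there and back, summing the overlaps."""
    hop = N // 2
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - N + 1, hop):
        y = imdct(mdct(signal[start:start + N]))
        for n in range(N):
            out[start + n] += y[n]
    return out


signal = [float(i * i % 7) for i in range(16)]
rebuilt = overlap_add(signal, 8)
```

Only samples covered by two frames are reconstructed exactly; the first and last N/2 samples of the signal are covered once and still contain aliasing.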

An ESP system, characterized by comprising an ESP encoder, a DRAM buffer, and an ESP decoder. When the ESP system is turned on, audio samples are encoded in sequence by the ESP encoder, the coded stream is stored in the DRAM buffer, and the coded stream in the DRAM buffer is decoded by the ESP decoder and output; the system is characterized by using the ESP encoder and ESP decoder described above.

The beneficial effects of the present invention are as follows. With essentially no loss of CD player sound quality, the psychoacoustic model is dropped and a higher compression ratio is achieved with a smaller computational load, which improves the utilization of the DRAM and extends the shock-protection time. The invention also uses a block-based computation structure that is suited to software implementation; the low computational complexity of the algorithm helps lower the system clock frequency, thereby reducing system power consumption and improving system stability.

Description of drawings

Fig. 1: ESP system diagram

Fig. 2: ESP encoder structure, diagram 1

Fig. 3: ESP encoder structure, diagram 2

Fig. 4: ESP decoder structure, diagram 1

Fig. 5: Flow chart of quantization/entropy coding in ESP encoding

Fig. 6: Flow chart of entropy decoding/inverse quantization in ESP decoding

Specific embodiment

An ESP encoder, as shown in Fig. 2, characterized in that audio samples are input into a time-frequency analysis module (20) and converted into frequency coefficients, the frequency coefficients are compressed into a bit stream by a quantization/entropy coding module (21), and the bit stream is output to the DRAM buffer.

The time-frequency analysis module (20) of the ESP encoder uses the MDCT (Modified Discrete Cosine Transform), defined as follows:

$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left(\frac{\pi (2n + 1 + N/2)(2k + 1)}{2N}\right), \quad k = 0, 1, \ldots, \frac{N}{2} - 1$$

The MDCT is a purely real transform for which several efficient fast algorithms exist. The algorithms disclosed in the literature, for example [1] Mu-Huo Cheng et al., "Fast IMDCT and MDCT Algorithms: A Matrix Approach," IEEE Transactions on Signal Processing, vol. 51, no. 1, Jan. 2003, and [2] Byeong Gi Lee, "A new algorithm to compute the discrete cosine transform," IEEE Transactions on Acoustics, Speech, and Signal Processing, 1984, 32(6): 1243-1245, can all be applied to the present invention and are not repeated here. The MDCT operates frame by frame; adjacent frames overlap by half their samples, which eliminates discontinuities at frame boundaries. The choice of N must take the following factors into account: 1. Choosing N as an integer power of 2 facilitates fast algorithms. 2. The larger N is, the higher the spectral resolution, which helps raise the compression ratio. 3. The larger N is, the higher the computational complexity; the average complexity C per sample is related to N by C ∝ log₂N. 4. Audio signals are stationary only over short periods, so an excessively large N brings no benefit. Based on these considerations, N can be chosen from 128, 256, 512, 1024, and 2048.
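The trade-off between frame length and complexity can be tabulated with a short sketch. The 44.1 kHz sampling rate is taken from the CD example in the background section; `frame_stats` is an illustrative name, and `log2_N` only tracks the trend C ∝ log₂N, not an actual operation count.

```python
import math

SAMPLE_RATE = 44100  # CD sampling rate, as in the background example


def frame_stats(N, sample_rate=SAMPLE_RATE):
    """Frame duration in milliseconds and log2(N) for one candidate frame length."""
    return {
        "N": N,
        "duration_ms": 1000.0 * N / sample_rate,
        "log2_N": math.log2(N),
    }


table = [frame_stats(N) for N in (128, 256, 512, 1024, 2048)]
```

Across the candidate values the frame duration runs from about 2.9 ms to about 46 ms, while the per-sample complexity trend grows only from 7 to 11.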

According to the required compression ratio, the quantization/entropy coding module (21) compresses the data using nonlinear quantization and Huffman coding; with sound quality maintained, a compression ratio of at least 5:1 can be reached. The quantization/entropy coding steps are as follows (as shown in Fig. 5):

(1) The frequency coefficients output by the time-frequency analysis module (20) are multiplied by a scaling factor;

The code table can be a single fixed code table.

The ESP codec can further comprise a packing module (22), as shown in Fig. 3. The packing module (22) of the ESP encoder packs the M frames of bit stream output by the quantization/entropy coding module, in order, into one bit block of fixed length, which is then output to the DRAM buffer. If the total length of the M frames of bit stream is less than the fixed bit-block length, the remainder of the bit block can be zero-padded. Considering storage space, codec delay, and bit-rate smoothing, M can be chosen in the range 2 to 16. On the one hand, audio signals are inherently non-stationary and can be treated as stationary only over a short period (on the order of 10 ms), so the number of bits produced by quantization and coding varies greatly from frame to frame; packing the coded bit streams of M frames into one bit block reduces the impact of peaks in the bit stream. On the other hand, because the bit-block length is fixed, the ESP decoder can easily determine the start position of each bit block without a time-consuming block synchronization search, which reduces the system's computational overhead.
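The decoder-side benefit of fixed-length blocks can be sketched as follows: the start of any block is a simple multiplication, so no synchronization search is needed. The sketch assumes, purely for illustration, that the DRAM buffer is a flat `bytearray` and that the block length is a whole number of bytes; `read_block` is a hypothetical helper.

```python
def read_block(dram, index, block_bytes):
    """Return block number `index` from a buffer of fixed-length blocks."""
    start = index * block_bytes          # start position follows from the fixed length
    end = start + block_bytes
    if end > len(dram):
        raise IndexError("block lies beyond the end of the buffer")
    return bytes(dram[start:end])


dram = bytearray(range(12))              # three 4-byte blocks of dummy data
second = read_block(dram, 1, 4)
```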

The time-frequency analysis of the ESP coding method uses the MDCT (Modified Discrete Cosine Transform), defined as follows:

$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left(\frac{\pi (2n + 1 + N/2)(2k + 1)}{2N}\right), \quad k = 0, 1, \ldots, \frac{N}{2} - 1$$

N can be chosen from 128, 256, 512, 1024, and 2048.

The quantization/entropy coding steps are as follows (as shown in Fig. 5):

M can be chosen in the range 2 to 16.

An ESP decoder, as shown in Fig. 4, reads a bit stream of one bit-block length from the DRAM buffer and sends it to the input buffer of an entropy decoding/inverse quantization module (41). After entropy decoding and inverse quantization, the reconstructed frequency coefficients are sent to an inverse time-frequency analysis module (42), which applies the IMDCT (Inverse Modified Discrete Cosine Transform); the reconstructed time-domain signal is then output for playback.

The entropy decoding/inverse quantization module (41) operates as follows (as shown in Fig. 6):

(1) The bit stream of one input bit block is Huffman-decoded using the same code table as the ESP encoder. Decoding yields M*N/4 coefficient pairs, each containing two frequency coefficients, for a total of M*N/2 frequency coefficients, together with M inverse scaling factors ISF[m] (m = 1, ..., M), one per frame;

(4) The M*N/2 inverse-scaled frequency coefficients are divided, in order, into M frames of N/2 coefficients each and output to the inverse time-frequency analysis module (42).

The inverse time-frequency analysis module (42) in the ESP decoder converts the frequency coefficients output by the entropy decoding/inverse quantization module into a PCM (Pulse Code Modulation) signal. This module uses the IMDCT (Inverse Modified Discrete Cosine Transform), a purely real transform for which several efficient fast algorithms exist; the algorithms disclosed in the literature, such as documents [1] and [2] cited above, can all be applied to the present invention and are not repeated here. With a fast algorithm, the average computational complexity C per sample is related to N by C ∝ log₂N. The IMDCT is defined as follows:

where $k = 0, 1, \ldots, \frac{N}{2} - 1$;

(1) A bit stream of one bit-block length is read from the DRAM buffer and undergoes entropy decoding/inverse quantization to obtain the reconstructed frequency coefficients;

The entropy decoding/inverse quantization steps are as follows (as shown in Fig. 6):

(1) The bit stream of one input bit block is Huffman-decoded using the same code table as the ESP coding method. Decoding yields M*N/4 coefficient pairs, each containing two frequency coefficients, for a total of M*N/2 frequency coefficients, together with M inverse scaling factors ISF[m] (m = 1, ..., M), one per frame;

The inverse quantization formula can be $y = \mathrm{int}[x^{3/4}]$, where int[·] denotes taking the integer part.

The inverse time-frequency analysis in the ESP decoding method uses the IMDCT (Inverse Modified Discrete Cosine Transform), defined as follows:

where $k = 0, 1, \ldots, \frac{N}{2} - 1$;

It should be understood that the above embodiments merely illustrate the present invention and do not limit it. Any invention-creation that does not go beyond the spirit and scope of the present invention, including modifications, equivalent replacements, and other insubstantial changes, falls within the scope of protection of the present invention.

Claims

1. An ESP encoder, characterized in that audio samples are input into a time-frequency analysis module and converted into frequency coefficients, the frequency coefficients are compressed into a bit stream by a quantization/entropy coding module, and the bit stream is output to the DRAM buffer, wherein the time-frequency analysis module of the ESP encoder uses the MDCT (Modified Discrete Cosine Transform), defined as follows:

$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left(\frac{\pi (2n + 1 + N/2)(2k + 1)}{2N}\right), \quad k = 0, 1, \ldots, \frac{N}{2} - 1$$

2. The ESP encoder as claimed in claim 1, characterized in that N can be chosen from 128, 256, 512, 1024, and 2048.

3. The ESP encoder as claimed in claim 1, characterized in that the quantization/entropy coding steps are as follows:

(1) The frequency coefficients output by the time-frequency analysis module are multiplied by a scaling factor;

(5) It is checked whether the number of bits output by Huffman coding exceeds the limit set by the specified compression ratio; if so, the scaling factor is reduced and the process returns to step (1), iterating until the compression-ratio requirement is met.

4. The ESP encoder as claimed in claim 3, characterized in that the quantization formula used can be $y = \mathrm{int}[x^{3/4}]$, where int[·] denotes taking the integer part.

5. The ESP encoder as claimed in claim 3, characterized in that the code table can be a single fixed code table.

6. The ESP encoder as claimed in claim 3, characterized in that the ESP codec can further comprise a packing module, which packs the M frames of bit stream output by the quantization/entropy coding module, in order, into one bit block and then outputs it to the DRAM buffer.

7. An ESP coding method, characterized by comprising the following steps:

(1) Time-frequency analysis is performed on the audio samples to convert them into frequency coefficients;

wherein the time-frequency analysis uses the MDCT (Modified Discrete Cosine Transform), defined as follows:

$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left(\frac{\pi (2n + 1 + N/2)(2k + 1)}{2N}\right), \quad k = 0, 1, \ldots, \frac{N}{2} - 1$$

8. The ESP coding method as claimed in claim 7, characterized in that N can be chosen from 128, 256, 512, 1024, and 2048.

9. The ESP coding method as claimed in claim 7, characterized in that the quantization/entropy coding steps are as follows:

(1) The frequency coefficients output by time-frequency analysis are multiplied by a scaling factor;

10. The ESP coding method as claimed in claim 9, characterized in that the quantization formula can be $y = \mathrm{int}[x^{3/4}]$, where int[·] denotes taking the integer part.

11. The ESP coding method as claimed in claim 7, characterized in that the M frames of bit stream output by the quantization/entropy coding module can further be packed, in order, into one bit block and then output to the DRAM buffer.

12. The ESP coding method as claimed in claim 11, characterized in that M can be chosen in the range 2 to 16.

13. An ESP decoder, characterized in that a bit stream of one bit-block length is read from the DRAM buffer and sent to the input buffer of an entropy decoding/inverse quantization module; the reconstructed frequency coefficients are sent to an inverse time-frequency analysis module, which applies the IMDCT (Inverse Modified Discrete Cosine Transform); and the reconstructed time-domain signal is obtained and output for playback.

14. The ESP decoder as claimed in claim 13, characterized in that the entropy decoding/inverse quantization steps are as follows:

(3) The inverse scaling factors ISF[m] (m = 1, ..., M) decoded in the first step are applied to the linear frequency coefficients obtained in the second step (the two are multiplied), giving the inverse-scaled frequency coefficients;

(4) The M*N/2 inverse-scaled frequency coefficients are divided into M frames of N/2 coefficients each and output to the inverse time-frequency analysis module.

15. The ESP decoder as claimed in claim 14, characterized in that the inverse quantization formula can be $y = \mathrm{int}[x^{3/4}]$, where int[·] denotes taking the integer part.

16. The ESP decoder as claimed in claim 13, characterized in that the inverse time-frequency analysis module in the ESP decoder uses the IMDCT (Inverse Modified Discrete Cosine Transform), defined as follows:

where $k = 0, 1, \ldots, \frac{N}{2} - 1$;

17. An ESP decoding method, characterized by comprising the following steps:

(1) A bit stream of one bit-block length undergoes entropy decoding/inverse quantization to obtain the reconstructed frequency coefficients;

(2) Inverse time-frequency analysis is performed, and the reconstructed time-domain signal is obtained and output for playback.

18. The ESP decoding method as claimed in claim 17, characterized in that the entropy decoding/inverse quantization steps are as follows:

(4) The M*N/2 inverse-scaled frequency coefficients are divided into M frames of N/2 coefficients each and output to the inverse time-frequency analysis module.

19. The ESP decoding method as claimed in claim 18, characterized in that the inverse quantization formula can be $y = \mathrm{int}[x^{3/4}]$, where int[·] denotes taking the integer part.

20. The ESP decoding method as claimed in claim 17, characterized in that the inverse time-frequency analysis uses the IMDCT, defined as follows:

where $k = 0, 1, \ldots, \frac{N}{2} - 1$;

21. An ESP system, characterized by comprising an ESP encoder, a DRAM buffer, and an ESP decoder, wherein, when the ESP system is turned on, audio samples are encoded in sequence by the ESP encoder, the coded stream is stored in the DRAM buffer, and the coded stream in the DRAM buffer is decoded by the ESP decoder and output; and characterized in that the encoder uses the MDCT, nonlinear quantization, and entropy coding, and the decoder uses entropy decoding, nonlinear inverse quantization, and the IMDCT.

22. The ESP system as claimed in claim 21, characterized in that the MDCT is defined as follows:

$$X(k) = \sum_{n=0}^{N-1} x(n) \cos\left(\frac{\pi (2n + 1 + N/2)(2k + 1)}{2N}\right), \quad k = 0, 1, \ldots, \frac{N}{2} - 1$$

23. The ESP system as claimed in claim 21, characterized in that the IMDCT is defined as follows:

where $k = 0, 1, \ldots, \frac{N}{2} - 1$;