WO2006096099A1 - Low-complexity code excited linear prediction encoding - Google Patents
- Publication number
- WO2006096099A1 (application PCT/SE2005/000349)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- signal
- excitation
- pulse locations
- candidate
- signals
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/12—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/09—Long term prediction, i.e. removing periodical redundancies, e.g. by using adaptive codebook or pitch predictor
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/10—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/08—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
- G10L19/10—Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a multipulse excitation
- G10L19/107—Sparse pulse excitation, e.g. by using algebraic codebook
Definitions
- the present invention relates in general to audio coding, and in particular to code excited linear prediction coding.
- ICP: inter-channel prediction
- AMR-NB and AMR-WB: Adaptive Multi-Rate Narrow Band and Adaptive Multi-Rate Wide Band
- an excitation signal at an input of a short-term LP synthesis filter is constructed by adding two excitation vectors from adaptive and fixed (innovative) codebooks, respectively.
- the speech is synthesized by feeding the two properly chosen vectors from these codebooks through the short-term synthesis filter.
- the optimum excitation sequence in a codebook is chosen using an analysis-by-synthesis search procedure in which the error between the original and synthesized speech is minimized according to a perceptually weighted distortion measure.
- a first type of codebook is the so-called stochastic codebook.
- a stochastic codebook often involves substantial physical storage. Given the index in the codebook, the excitation vector is obtained by conventional table lookup. The size of the codebook is therefore limited by both the bit rate and the complexity.
- a second type of codebook is an algebraic codebook.
- algebraic codebooks are not random and require virtually no storage.
- An algebraic codebook is a set of indexed code vectors in which the amplitudes and positions of the pulses constituting the k-th code vector are derived directly from the corresponding index k. This requires virtually no memory, so the size of an algebraic codebook is not limited by memory requirements. Additionally, algebraic codebooks are well suited to efficient search procedures.
- the share of bits allocated to the fixed codebook procedures ranges from 36% up to 76%.
- a general object of the present invention is thus to provide improved methods and devices for speech coding.
- a subsidiary object of the present invention is to provide CELP methods and devices having reduced requirement in terms of bit rates and encoder complexity.
- excitation signals of a first signal encoded by CELP are used to derive a limited set of candidate excitation signals for a second signal.
- the second signal is correlated with the first signal.
- the limited set of candidate excitation signals is derived by a rule, which is selected from a predetermined set of rules based on the encoded first signal and/or the second signal.
- pulse locations of the excitation signals of the first encoded signal are used for determining the set of candidate excitation signals. More preferably, the pulse locations of the set of candidate excitation signals are positioned in the vicinity of the pulse locations of the excitation signals of the first encoded signal.
- the first and second signals may be multi-channel signals of a common speech or audio signal. However, the first and second signals may also be identical, whereby the coding of the second signal can be utilized for re-encoding at a lower bit rate.
- One advantage with the present invention is that the coding complexity is reduced. Furthermore, in the case of multi-channel signals, the required bit rate for transmitting coded signals is reduced. Also, the present invention may be efficiently applied to re-encoding the same signal at a lower rate.
- Another advantage of the invention is the compatibility with mono signals and the possibility to be implemented as an extension to existing speech codecs with very few modifications.
- FIG. 1A is a schematic illustration of a code excited linear prediction model
- FIG. 1B is a schematic illustration of a process of deriving an excitation signal
- FIG. 1C is a schematic illustration of an embodiment of an excitation signal for use in a code excited linear prediction model
- FIG. 2 is a block scheme of an embodiment of an encoder and decoder according to the code excited linear prediction model;
- FIG. 3A is a diagram illustrating one embodiment of a principle of selecting candidate excitation signals according to the present invention
- FIG. 3B is a diagram illustrating another embodiment of a principle of selecting candidate excitation signals according to the present invention.
- FIG. 4 illustrates a possibility to reduce required data entities according to an embodiment of the present invention
- FIG. 5A is a block scheme of an embodiment of encoders and decoders for two signals according to the present invention.
- FIG. 5B is a block scheme of another embodiment of encoders and decoders for two signals according to the present invention
- FIG. 6 is a block scheme of an embodiment of encoders and decoders for re-encoding of a signal according to the present invention
- FIG. 7 is a block scheme of an embodiment of encoders and decoders for parallel encoding of a signal for different bit rates according to the present invention
- FIG. 8 is a diagram illustrating the perceptual quality achieved by embodiments of the present invention.
- FIG. 9 is a flow diagram of the main steps of an embodiment of an encoding method according to the present invention.
- FIG. 10 is a flow diagram of the main steps of another embodiment of an encoding method according to the present invention.
- FIG. 11 is a flow diagram of the main steps of an embodiment of a decoding method according to the present invention.
- a general CELP speech synthesis model is depicted in Fig. 1A.
- a fixed codebook 10 comprises a number of candidate excitation signals 30, characterized by a respective index k. In the case of an algebraic codebook, the index k alone characterizes the corresponding candidate excitation signal 30 completely.
- Each candidate excitation signal 30 comprises a number of pulses 32 having a certain position and amplitude.
- An index k determines a candidate excitation signal 30 that is amplified in an amplifier 11, giving rise to an output excitation signal c_k(n) 12.
- the excitation signal c_k(n) and the adaptive signal v(n) are summed in an adder 17, giving a composite excitation signal u(n).
- the composite excitation signal u(n) influences the adaptive codebook for subsequent signals, as indicated by the dashed line 13.
- the composite excitation signal u(n) is used as input signal to a transform 1/A(z) in a linear prediction synthesis section 20, resulting in a "predicted" signal ŝ(n) 21, which, typically after post-processing 22, is provided as the output from the CELP synthesis procedure.
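The synthesis path described above (adder followed by the 1/A(z) filter) can be sketched as follows. This is a minimal illustration, not the codec's implementation: the function names, the explicit gains `g_c` and `g_p`, and the sign convention A(z) = 1 + a_1 z^-1 + ... + a_m z^-m are assumptions made for the example.

```python
def composite_excitation(c_k, v, g_c=1.0, g_p=1.0):
    """Sum the (amplified) fixed codebook vector and the adaptive vector.

    g_c and g_p are hypothetical gain names introduced for this sketch.
    """
    return [g_c * c + g_p * x for c, x in zip(c_k, v)]


def lp_synthesis(u, a):
    """Filter the excitation u(n) through 1/A(z).

    Assumed convention: A(z) = 1 + a[0] z^-1 + ... + a[m-1] z^-m,
    so s(n) = u(n) - sum_i a[i-1] * s(n-i), with zero initial state.
    """
    s = []
    for n, u_n in enumerate(u):
        acc = u_n
        for i, a_i in enumerate(a, start=1):
            if n - i >= 0:
                acc -= a_i * s[n - i]
        s.append(acc)
    return s
```

For instance, feeding a unit pulse through a first-order filter with `a = [-0.5]` yields the decaying impulse response of that filter.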
- the CELP speech synthesis model is used for analysis-by-synthesis coding of the speech signal of interest.
- a target signal s(n), i.e. the signal that is to be resembled, is provided.
- the remaining difference is the target for the fixed codebook excitation signal, whereby a codebook index k corresponding to an entry c_k should minimize the difference, typically according to an objective function, e.g. a mean square measure.
- the algebraic codebook is searched by minimizing the mean square error between the weighted input speech and the weighted synthesis speech.
- the fixed codebook search aims to find the algebraic codebook entry c_k corresponding to index k, such that
- the matrix H is a filtering matrix whose elements are derived from the impulse response of a weighting filter.
- y_2 is a vector of components which are dependent on the signal to be encoded.
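For reference, the criterion used in standard algebraic CELP searches is commonly written in the form below. It is given here as an assumption about the intended expression, using the symbols H, y_2 and c_k defined above; the shorthand d (backward filtered target) and Φ (correlation matrix of the weighting filter impulse response) is introduced for this sketch.

$$
k^{\ast} = \arg\max_{k} \frac{\left(y_2^{T} H c_k\right)^{2}}{c_k^{T} H^{T} H c_k}
         = \arg\max_{k} \frac{\left(d^{T} c_k\right)^{2}}{c_k^{T} \Phi\, c_k},
\qquad d = H^{T} y_2, \quad \Phi = H^{T} H .
$$

Maximizing this ratio is equivalent to minimizing the mean square error between the weighted target and the filtered code vector when the gain is chosen optimally.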
- This fixed codebook procedure can be illustrated as in Fig. 1B, where an index k selects an entry c_k from the fixed codebook 10 as excitation signal 12.
- in a stochastic fixed codebook, the index k typically serves as an input to a table look-up, while in an algebraic fixed codebook, the excitation signal 12 is derived directly from the index k.
- the multi-pulse excitation can be written as:
- Fig. 1C illustrates an example of a candidate excitation signal 30 of the fixed codebook 10.
- the candidate excitation signal 30 is characterized by a number of pulses 32, in this example 8 pulses.
- the pulses 32 are characterized by their positions P(1)-P(8) and their amplitudes, which in a typical algebraic fixed codebook are either +1 or -1.
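A candidate excitation signal of this kind can be built directly from its pulse positions and signs. The sketch below assumes the usual multi-pulse form, a sum of signed unit pulses, and uses a hypothetical function name and an arbitrary frame length.

```python
def build_excitation(positions, signs, frame_length):
    """Return a sparse excitation vector with +1/-1 pulses at the given positions."""
    if len(positions) != len(signs):
        raise ValueError("one sign is needed per pulse position")
    c = [0] * frame_length
    for p, b in zip(positions, signs):
        if b not in (1, -1):
            raise ValueError("algebraic codebook amplitudes are +1 or -1")
        # Pulses that share a position add up, as in a plain sum of unit pulses.
        c[p] += b
    return c


excitation = build_excitation([0, 5], [1, -1], 8)
```

Because the vector is fully determined by positions and signs, nothing beyond those values needs to be stored, which is why an algebraic codebook needs virtually no memory.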
- the CELP model is typically implemented as illustrated in Fig. 2.
- the different parts corresponding to the different functions of the CELP synthesis model of Fig. 1A are given the same reference numbers, since the parts are characterized mainly by their function and typically less so by their actual implementation. For instance, error weighting filters, usually present in an actual implementation of linear prediction analysis-by-synthesis, are not represented.
- a signal to be encoded s(n) 33 is provided to an encoder unit 40.
- the encoder unit comprises a CELP synthesis block 25 according to the above discussed principles. (Post-processing is omitted in order to facilitate the reading of the figure.)
- the output from the CELP synthesis block 25 is compared with the signal s(n) in a comparator block 31.
- a difference 37, which may be weighted by a weighting filter, is provided to a codebook optimization block 35, which is arranged according to any prior-art principles to find an optimum or at least reasonably good excitation signal c_k(n) 12.
- the codebook optimization block 35 provides the fixed codebook 10 with the corresponding index k.
- the index k and the delay δ of the adaptive codebook 12 are encoded in an index encoder 38 to provide an output signal 45 representing the index k and the delay δ.
- the representation of the index k and the delay δ is provided to a decoder unit 50.
- the decoder unit comprises a CELP synthesis block 25 according to the above discussed principles. (Post-processing is also omitted here in order to facilitate the reading of the figure.)
- the representation of index k and delay δ is decoded in an index decoder 53, and index k and delay δ are provided as input parameters to the fixed codebook and the adaptive codebook, respectively, resulting in a synthesized signal ŝ(n) 21, which is supposed to resemble the original signal s(n).
- the representation of the index k and the delay δ can be stored for a shorter or longer time anywhere between the encoder and decoder, enabling e.g. audio recordings that require relatively little storage capacity.
- the present invention is related to speech and in general audio coding.
- it deals with cases where a main signal s_M(n) has been encoded according to the CELP technique and the desire is to encode another signal s_S(n).
- This invention is thus directly applicable to stereo and in general multichannel coding for speech in teleconferencing applications.
- the application of this invention can also include audio coding as part of an open-loop or closed-loop content dependent encoding.
- the main signal s_M(n) is often chosen as the sum signal and s_S(n) as the difference signal of the left and right channels.
- the presumption of the present invention is that the main signal s_M(n) is available in a CELP encoded representation.
- One basic idea of the present invention is to limit the search in the fixed codebook during the encoding of the other signal s_S(n) to a subset of candidate excitation signals. This subset is selected dependent on the CELP encoding of the main signal.
- the pulses of the candidate excitation signals of the subset are restricted to a set of pulse positions that are dependent on the pulse positions of the main signal. This is equivalent to defining constrained candidate pulse locations.
- the set of available pulse positions can typically be set to the pulse positions of the main signal plus neighboring pulse positions.
- a main channel and a side channel can be constructed by
- it is assumed that the main channel is the first encoded channel and that the pulse locations of the fixed codebook excitation for that encoding are available.
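A common way to form a main (sum) and side (difference) channel from left and right channels is sketched below. The factor 1/2 and the function names are assumptions for illustration; the text above only states that the main signal is a sum signal and the side signal a difference signal.

```python
def to_main_side(left, right):
    """Form main = (L + R) / 2 and side = (L - R) / 2 (scaling is an assumption)."""
    main = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return main, side


def to_left_right(main, side):
    """Invert the transform above: L = main + side, R = main - side."""
    left = [m + s for m, s in zip(main, side)]
    right = [m - s for m, s in zip(main, side)]
    return left, right
```

With this scaling the transform is exactly invertible, so the decoder can recover both channels from the reproduced main and side signals.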
- g_P v(n) is the adaptive codebook excitation and s_c(n) is the target signal for the adaptive codebook search.
- the potential pulse positions of the candidate excitation signals are defined relative to the main signal pulse positions. Since they are only a fraction of all possible positions, the number of bits required for encoding the side signal with an excitation signal within this limited set of candidate excitation signals is greatly reduced compared with the case where all pulse positions may occur.
- the selection of the candidate pulse positions relative to the main pulse positions is fundamental in determining the complexity as well as the required bit rate.
- in one case, the pulse positions for the side signal are set equal to the pulse positions of the main signal. Then no encoding of the pulse positions is needed, only encoding of the pulse amplitudes. In the case of algebraic codebooks with pulses having +1/-1 amplitudes, only the signs (N bits for N pulses) need to be encoded.
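The sign-only case can be illustrated as follows: the decoder already knows the main signal pulse positions, so one bit per pulse is enough to rebuild the side excitation. The function names and the bit convention (1 means a negative pulse) are assumptions made for this sketch.

```python
def encode_signs(signs):
    """Map +1/-1 pulse signs to one bit each (assumed convention: 1 = negative)."""
    return [1 if s < 0 else 0 for s in signs]


def decode_side_excitation(main_positions, sign_bits, frame_length):
    """Rebuild the side excitation from the main pulse positions and the sign bits."""
    c = [0] * frame_length
    for p, bit in zip(main_positions, sign_bits):
        c[p] += -1 if bit else 1
    return c


bits = encode_signs([1, -1, 1])
side_excitation = decode_side_excitation([2, 7, 9], bits, 12)
```

Here three pulses cost three bits in total, regardless of the frame length.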
- the pulse positions of candidate excitation signals for the side signal are selected based on the main signal pulse positions and possible additional parameters.
- the additional parameters may consist of time delay between the two channels and/or difference of adaptive codebook index.
- J(i,k) denotes some delay index.
- each mono pulse position generates a set of pulse positions used for constructing the candidate excitation signals for the side signal pulse search procedure.
- P_M denotes the pulse positions of the excitation signal for the main signal
- P_S* denotes possible pulse positions of the candidate excitation signals for the side signal analysis.
- the delay index may be made dependent on the effective delay between the two channels and/or the adaptive codebook index.
- $k_{\max} = 3$ and $J(i,k) = j(k) \in \{-1, 0, +1\}$.
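As a sketch of how such a rule produces candidates, the code below adds each delay index in {-1, 0, +1} to each main pulse position. The additive form, the function name and the default frame length of 64 samples are assumptions for illustration.

```python
def candidate_positions(main_positions, offsets=(-1, 0, 1), frame_length=64):
    """Return the sorted set of side-signal candidate positions.

    Each main pulse position p contributes p + j for every offset j;
    positions falling outside the frame are discarded.
    """
    candidates = set()
    for p in main_positions:
        for j in offsets:
            q = p + j
            if 0 <= q < frame_length:
                candidates.add(q)
    return sorted(candidates)


side_candidates = candidate_positions([0, 10, 11])
```

Neighbouring main pulses share candidates, so the candidate set is at most three times the number of main pulses and usually smaller.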
- the rules for selecting the pulse positions can be constructed in many different ways.
- the actual rule to use may be adapted to the actual implementation.
- the important characteristic is, however, that the candidate pulse positions are selected depending on the pulse positions resulting from the main signal analysis, following a certain rule.
- This rule may be unique and fixed or may be selected from a set of predetermined rules dependent on e.g. the degree of correlation between the two channels and/or the delay between the two channels.
- the set of pulse candidates of the side signal is constructed.
- the set of the side signal pulse candidates is in general very small compared to the entire frame length. This allows reformulating the objective maximization problem based on a decimated frame.
- the pulses are searched by using, for example, the depth-first algorithm described in [5] or by using an exhaustive search if the number of candidate pulses is really small. However, even with a small number of candidates it is recommended to use a fast search procedure.
- a backward filtered signal is in general pre-computed using
- P*(i) are the candidate pulse positions and p is their number. It should be noted that p is always less than, and typically much less than, the frame length L.
- Φ_2 is symmetric and positive definite.
- The summary of these decimation operations is illustrated in Fig. 4.
- a reduction of an algebraic codebook 10 of ordinary size to a reduced size codebook 10' is illustrated.
- a reduction of a weighting filter covariance matrix 60 of ordinary size to a reduced weighting filter covariance matrix 60' is illustrated.
- a reduction of a backward filtered target 62 of ordinary size to a reduced size backward filtered target 62' is illustrated.
- Maximizing the objective function on the decimated signals has several advantages.
- One of them is the reduction of memory requirements; for instance, the matrix Φ_2 needs less memory.
- Another advantage is that, because the main signal pulse locations are in all cases transmitted to the receiver, the indices of the decimated signals are always available to the decoder. This in turn allows the pulse positions of the other (side) signal to be encoded relative to the main signal pulse positions, which consumes far fewer bits.
- Another advantage is the reduction in computational complexity since the maximization is performed on decimated signals.
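The decimation and the search on the decimated signals can be sketched as below. This is an illustrative exhaustive search, not the fast search recommended above; it assumes a precomputed backward filtered target `d` and correlation matrix `phi`, and maximizes the commonly used ratio (d^T c)^2 / (c^T Φ c). All function names are hypothetical.

```python
from itertools import combinations, product


def decimate(d, phi, candidates):
    """Keep only the entries of d and phi at the candidate pulse positions."""
    d2 = [d[p] for p in candidates]
    phi2 = [[phi[p][q] for q in candidates] for p in candidates]
    return d2, phi2


def search_decimated(d2, phi2, n_pulses):
    """Exhaustively maximize (d2^T c)^2 / (c^T phi2 c) over +/-1 pulses.

    Returns (score, indices into the decimated vectors, signs).
    A vector and its negation score equally; the first one found is kept.
    """
    best = (-1.0, None, None)
    for idx in combinations(range(len(d2)), n_pulses):
        for signs in product((1, -1), repeat=n_pulses):
            corr = sum(s * d2[i] for i, s in zip(idx, signs))
            energy = sum(si * sj * phi2[i][j]
                         for i, si in zip(idx, signs)
                         for j, sj in zip(idx, signs))
            if energy > 0 and corr * corr / energy > best[0]:
                best = (corr * corr / energy, idx, signs)
    return best
```

The loops run over the p candidate positions only, not over the full frame length L, which is where the savings in memory and arithmetic come from.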
- FIG. 5A an embodiment of a system of encoders 40A, 40B and decoders
- a main signal 33A s_m(n) is provided to a first encoder 40A.
- the first encoder 40A operates according to any prior art CELP encoding model, producing an index k_m for the fixed codebook and a delay measure δ_m for the adaptive codebook. The details of this encoding are not of any importance for the present invention and are omitted in order to facilitate the understanding of Fig. 5A.
- the parameters k_m and δ_m are encoded in a first index encoder 38A, giving representations k*_m and δ*_m of the parameters that are sent to a first decoder
- the representations k*_m and δ*_m are decoded into parameters k_m and δ_m in a first index decoder 53A. From these parameters, the original signal is reproduced according to any CELP decoding model according to prior art. The details of this decoding are not of any importance for the present invention and are omitted in order to facilitate the understanding of Fig. 5A.
- a reproduced first output signal 21A ŝ_m(n) is provided.
- a side signal 33B s_s(n) is provided as an input signal to a second encoder
- the second encoder 40B is for the most part similar to the encoder of Fig. 2.
- the signals are now given an index "s" to distinguish them from any signals used for encoding the main signal.
- the second encoder 4OB comprises a CELP synthesis block 25.
- the index k_m or a representation thereof is provided from the first encoder 40A to an input 45 of the fixed codebook 10 of the second encoder 40B.
- the index k_m is used by a candidate deriving means 47 to extract a reduced fixed codebook 10' according to the above presented principles.
- the synthesis of the CELP synthesis block 25' of the second encoder 40B is thus based on indices k'_s representing excitation signals c'_{k'_s}(n) from the reduced fixed codebook 10'.
- An index k'_s is thus found to represent a best choice of the CELP synthesis.
- the parameters k'_s and δ_s are encoded in a second index encoder 38B, giving representations k'*_s and δ*_s of the parameters that are sent to a second decoder 50B.
- the representations k'*_s and δ*_s are decoded into parameters k'_s and δ_s in a second index decoder 53B.
- the index parameter k_m is available from the first decoder 50A and is provided to the input 55 of the fixed codebook 10 of the second decoder 50B, in order to enable an extraction by a candidate deriving means 57 of a reduced fixed codebook 10' equal to what was used in the second encoder 40B.
- the original side signal is reproduced according to ordinary CELP decoding models 25". This decoding is performed essentially in analogy with Fig. 2, but using the reduced fixed codebook 10' instead.
- a reproduced side output signal 21B ŝ_s(n) is thus provided.
- Selection of the rule to construct the set of candidate pulses can advantageously be made adaptive and dependent on additional inter-channel characteristics, such as delay parameters, degree of correlation, etc.
- the encoder preferably has to transmit to the decoder which rule has been selected for deriving the set of candidate pulses for encoding the other signal.
- the rule selection could for instance be performed by a closed-loop procedure, where a number of rules are tested and the one giving the best result is finally selected.
- Fig. 5B illustrates an embodiment, using the rule selection approach.
- the mono signal s_m(n) and preferably also the side signal s_s(n) are here additionally provided to a rule selecting unit 39.
- alternatively, the parameter k_m representing the mono signal can be used.
- in the rule selecting unit 39, the signals are analysed, e.g. with respect to delay parameters or degree of correlation.
- a rule, e.g. represented by an index r, is selected from a set of predefined rules.
- the index of the selected rule is provided to the candidate deriving means 47 for determining how the candidate sets should be derived.
- the rule index r is also provided to the second index encoder 38B, giving a representation r* of the index, which is subsequently sent to the second decoder 50B.
- the second index decoder 53B decodes the rule index r, which then is used to govern the operation of the candidate deriving means 57.
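A closed-loop rule selection of the kind mentioned above can be sketched as follows. Each rule is modelled here as a tuple of position offsets, and `evaluate` is a hypothetical callback that encodes the side signal with a given candidate set and returns a quality score; both are assumptions for illustration, as is the toy scoring function in the example.

```python
def select_rule(rules, main_positions, evaluate, frame_length=64):
    """Try every rule and return (rule index r, score) of the best one."""
    best_r, best_score = None, float("-inf")
    for r, offsets in enumerate(rules):
        candidates = sorted({p + j for p in main_positions for j in offsets
                             if 0 <= p + j < frame_length})
        score = evaluate(candidates)
        if score > best_score:
            best_r, best_score = r, score
    return best_r, best_score


# Toy example: score a candidate set by the largest target magnitude it reaches,
# standing in for a real encode-and-measure step.
d = [0.1, 0.9, 0.2, 0.0, 0.3, 0.4, 0.0, 0.1]
rules = [(0,), (-1, 0, 1)]
r, score = select_rule(rules, [2, 5], lambda c: max(abs(d[p]) for p in c),
                       frame_length=8)
```

Only the index r needs to be transmitted, since the decoder can rebuild the same candidate set from r and the main pulse positions. A practical selection would also weigh the extra bits that a larger candidate set costs.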
- the specific rule used as well as the resulting number of candidate side signal pulses are the main parameters governing the bit rate and the complexity of the algorithm.
- FIG. 6 illustrates an embodiment where different parts of a transmission path allow for different bit rates. It is thus applicable as part of a rate transcoding solution.
- a signal s(n) is provided as an input signal 33A to a first encoder 40A, which produces representations k* and δ* of parameters that are transmitted according to a first bit rate. At a certain place, the available bit rate is reduced, and a re-encoding for lower bit rates has to be performed.
- a first decoder 50A uses the representations k* and δ* of parameters for producing a reproduced signal 21A ŝ(n).
- This reproduced signal 21A ŝ(n) is provided to a second encoder 40B as an input signal 33B. The index k from the first decoder 50A is also provided to the second encoder 40B. The index k is, in analogy with Fig. 6, used for extracting a reduced fixed codebook 10'.
- the second encoder 40B encodes the signal ŝ(n) for a lower bit rate, giving an index k' representing the selected excitation signal c'_{k'}(n).
- this index k' is of little use in a distant decoder, since the decoder does not have the information necessary to construct a corresponding reduced fixed codebook.
- the index k' thus has to be associated with an index k, referring to the original codebook 10.
- a first encoding is made with a bit rate n and the second encoding is made with a bit rate m, where n>m.
- Fig. 7 illustrates a system where a signal s(n) is provided to both a first encoder 40A and a second encoder 40B.
- the second encoder provides a reduced fixed codebook 10' based on an index k_a representing the first encoding.
- the second encoding is here denoted by the index "b".
- the second encoder 40B thus becomes independent of the first decoder 50B.
- Most other parts are in analogy with Fig. 6, however, with adapted indexing.
- the present invention offers a substantial reduction in complexity thus allowing the implementation of these applications with low cost hardware.
- An embodiment of the above-described algorithm has been implemented in association with an AMR-WB speech codec.
- the same adaptive codebook index is used as is used for encoding the mono excitation.
- the LTP gain as well as the innovation vector gain were not quantized.
- the algorithm for the algebraic codebook was based on the mono pulse positions. As described in e.g. [6], the codebook may be structured in tracks.
- the number of tracks is equal to 4.
- the candidate pulse positions are as follows:
Track | Pulses | Positions |
---|---|---|
1 | i0, i4, i8 | 0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60 |
2 | i1, i5, i9 | 1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61 |
- the implemented algorithm retains all the mono pulses as the pulse positions of the side signal, i.e. the pulse positions are not encoded. Only the signs of the pulses are encoded.
- each pulse will consume only 1 bit for encoding the sign, which leads to a total bit rate equal to the number of mono pulses.
- there are 12 pulses per sub-frame, which leads to a total bit rate of 12 bits × 4 × 50 = 2.4 kbps for encoding the innovation vector. This is the same number of bits as required for the very lowest AMR-WB mode (2 pulses for the 6.6 kbps mode), but in this case the pulse density is higher.
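The arithmetic behind the 2.4 kbps figure can be checked directly. The constant names are introduced for this sketch; the factors 4 and 50 are read as four sub-frames per frame and fifty frames per second (20 ms frames).

```python
PULSES_PER_SUBFRAME = 12   # mono pulses reused for the side signal
SIGN_BITS_PER_PULSE = 1    # only the sign of each pulse is encoded
SUBFRAMES_PER_FRAME = 4
FRAMES_PER_SECOND = 50     # 20 ms frames

bits_per_second = (PULSES_PER_SUBFRAME * SIGN_BITS_PER_PULSE
                   * SUBFRAMES_PER_FRAME * FRAMES_PER_SECOND)
kbps = bits_per_second / 1000
```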
- Fig. 8 shows the results obtained with PEAQ [4] for evaluating the perceptual quality.
- PEAQ has been chosen since, to the best of our knowledge, it is the only tool that provides objective quality measures for stereo signals. From the results, it is clearly seen that the stereo 100 does in fact provide a quality lift with respect to the mono signal 102.
- the sound items used were quite varied: sound 1, S1, is an extract from a movie with background noise; sound 2, S2, is a 1 min radio recording; sound 3, S3, is a cart racing sport event; and sound 4, S4, is a real two-microphone recording.
- Fig. 9 illustrates an embodiment of an encoding method according to the present invention.
- the procedure starts in step 200.
- a representation of a CELP excitation signal for a first audio signal is provided. Note that it is not absolutely necessary to provide the entire first audio signal, just the representation of the CELP excitation signal.
- a second audio signal is provided, which is correlated with the first audio signal.
- a set of candidate excitation signals is derived in step 214 depending on the first CELP excitation signal.
- the pulse positions of the candidate excitation signals are related to the pulse positions of the CELP excitation signal of the first audio signal.
- in step 216, a CELP encoding is performed on the second audio signal, using the reduced set of candidate excitation signals derived in step 214.
- the representation, i.e. typically an index, of the CELP excitation signal for the second audio signal is encoded, using references to the reduced candidate set. The procedure ends in step 299.
- Fig. 10 illustrates another embodiment of an encoding method according to the present invention.
- the procedure starts in step 200.
- in step 211, an audio signal is provided.
- in step 213, a representation of a first CELP excitation signal for the same audio signal is provided.
- a set of candidate excitation signals is derived in step 215 depending on the first CELP excitation signal.
- the pulse positions of the candidate excitation signals are related to the pulse positions of the CELP excitation signal of the first audio signal.
- a CELP re-encoding is performed on the audio signal, using the reduced set of candidate excitation signals derived in step 215.
- the representation, i.e. typically an index, of the second CELP excitation signal for the audio signal is encoded, using references to the non-reduced candidate set, i.e. the set used for the first CELP encoding.
- the procedure ends in step 299.
- Fig. 11 illustrates an embodiment of a decoding method according to the present invention.
- the procedure starts in step 200.
- a representation of a first CELP excitation signal for a first audio signal is provided.
- a representation of a second CELP excitation signal for a second audio signal is provided.
- a second excitation signal is derived from the representation of the second CELP excitation signal, with knowledge of the first excitation signal.
- a reduced set of candidate excitation signals is derived depending on the first CELP excitation signal, from which a second excitation signal is selected by use of an index for the second CELP excitation signal.
- the second audio signal is reconstructed using the second excitation signal.
- the procedure ends in step 299.
- the invention allows a dramatic reduction of complexity (both memory and arithmetic operations) as well as bit-rate when encoding multiple audio channels by using algebraic codebooks and CELP.
Priority Applications (8)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2008500663A JP5174651B2 (en) | 2005-03-09 | 2005-03-09 | Low complexity code-excited linear predictive coding |
AT05722196T ATE513290T1 (en) | 2005-03-09 | 2005-03-09 | LESS COMPLEX CODE EXCITED LINEAR PREDICTION CODING |
CN2005800489816A CN101138022B (en) | 2005-03-09 | 2005-03-09 | Low-complexity code excited linear prediction encoding and decoding method and device |
PCT/SE2005/000349 WO2006096099A1 (en) | 2005-03-09 | 2005-03-09 | Low-complexity code excited linear prediction encoding |
BRPI0520115A BRPI0520115B1 (en) | 2005-03-09 | 2005-03-09 | methods for encoding and decoding audio signals and encoder and decoder for audio signals |
KR1020077023047A KR101235425B1 (en) | 2005-03-09 | 2005-03-09 | Low-complexity code excited linear prediction encoding |
EP05722196A EP1859441B1 (en) | 2005-03-09 | 2005-03-09 | Low-complexity code excited linear prediction encoding |
TW094144472A TW200639801A (en) | 2005-03-09 | 2005-12-15 | Low-complexity code excited linear prediction encoding |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/SE2005/000349 WO2006096099A1 (en) | 2005-03-09 | 2005-03-09 | Low-complexity code excited linear prediction encoding |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2006096099A1 (en) | 2006-09-14 |
Family
ID=36953623
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/SE2005/000349 WO2006096099A1 (en) | 2005-03-09 | 2005-03-09 | Low-complexity code excited linear prediction encoding |
Country Status (8)
Country | Link |
---|---|
EP (1) | EP1859441B1 (en) |
JP (1) | JP5174651B2 (en) |
KR (1) | KR101235425B1 (en) |
CN (1) | CN101138022B (en) |
AT (1) | ATE513290T1 (en) |
BR (1) | BRPI0520115B1 (en) |
TW (1) | TW200639801A (en) |
WO (1) | WO2006096099A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6192334B1 (en) * | 1997-04-04 | 2001-02-20 | Nec Corporation | Audio encoding apparatus and audio decoding apparatus for encoding in multiple stages a multi-pulse signal |
EP1132893A2 (en) * | 2000-02-15 | 2001-09-12 | Lucent Technologies Inc. | Constraining pulse positions in CELP vocoding |
US20040024595A1 (en) * | 1997-01-27 | 2004-02-05 | Toshiyuki Nomura | Speech coder/decoder |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3139602B2 (en) * | 1995-03-24 | 2001-03-05 | 日本電信電話株式会社 | Acoustic signal encoding method and decoding method |
JPH1097295A (en) * | 1996-09-24 | 1998-04-14 | Nippon Telegr & Teleph Corp <Ntt> | Coding method and decoding method of acoustic signal |
JP3622365B2 (en) * | 1996-09-26 | 2005-02-23 | ヤマハ株式会社 | Voice encoding transmission system |
JP3329216B2 (en) * | 1997-01-27 | 2002-09-30 | 日本電気株式会社 | Audio encoding device and audio decoding device |
JP3134817B2 (en) * | 1997-07-11 | 2001-02-13 | 日本電気株式会社 | Audio encoding / decoding device |
US6161086A (en) * | 1997-07-29 | 2000-12-12 | Texas Instruments Incorporated | Low-complexity speech coding with backward and inverse filtered target matching and a tree structured mutitap adaptive codebook search |
SE521225C2 (en) * | 1998-09-16 | 2003-10-14 | Ericsson Telefon Ab L M | Method and apparatus for CELP encoding / decoding |
JP3343082B2 (en) * | 1998-10-27 | 2002-11-11 | 松下電器産業株式会社 | CELP speech encoder |
JP2004302259A (en) * | 2003-03-31 | 2004-10-28 | Matsushita Electric Ind Co Ltd | Hierarchical encoding method and hierarchical decoding method for sound signal |
2005
- 2005-03-09 AT AT05722196T patent/ATE513290T1/en not_active IP Right Cessation
- 2005-03-09 CN CN2005800489816A patent/CN101138022B/en not_active Expired - Fee Related
- 2005-03-09 EP EP05722196A patent/EP1859441B1/en not_active Not-in-force
- 2005-03-09 JP JP2008500663A patent/JP5174651B2/en not_active Expired - Fee Related
- 2005-03-09 KR KR1020077023047A patent/KR101235425B1/en active IP Right Grant
- 2005-03-09 BR BRPI0520115A patent/BRPI0520115B1/en not_active IP Right Cessation
- 2005-03-09 WO PCT/SE2005/000349 patent/WO2006096099A1/en active Application Filing
- 2005-12-15 TW TW094144472A patent/TW200639801A/en unknown
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9912350B2 (en) | 2007-04-29 | 2018-03-06 | Huawei Technologies Co., Ltd. | Coding method, decoding method, coder, and decoder |
US10666287B2 (en) | 2007-04-29 | 2020-05-26 | Huawei Technologies Co., Ltd. | Coding method, decoding method, coder, and decoder |
US8988256B2 (en) | 2007-04-29 | 2015-03-24 | Huawei Technologies Co., Ltd. | Coding method, decoding method, coder, and decoder |
JP2013148913A (en) * | 2007-04-29 | 2013-08-01 | Huawei Technologies Co Ltd | Encoding method, decoding method, encoder, and decoder |
US9225354B2 (en) | 2007-04-29 | 2015-12-29 | Huawei Technologies Co., Ltd. | Coding method, decoding method, coder, and decoder |
US9444491B2 (en) | 2007-04-29 | 2016-09-13 | Huawei Technologies Co., Ltd. | Coding method, decoding method, coder, and decoder |
US10425102B2 (en) | 2007-04-29 | 2019-09-24 | Huawei Technologies Co., Ltd. | Coding method, decoding method, coder, and decoder |
US10153780B2 (en) | 2007-04-29 | 2018-12-11 | Huawei Technologies Co.,Ltd. | Coding method, decoding method, coder, and decoder |
US9020814B2 (en) | 2010-06-24 | 2015-04-28 | Huawei Technologies Co., Ltd. | Pulse encoding and decoding method and pulse codec |
US9858938B2 (en) | 2010-06-24 | 2018-01-02 | Huawei Technologies Co., Ltd. | Pulse encoding and decoding method and pulse codec |
US9508348B2 (en) | 2010-06-24 | 2016-11-29 | Huawei Technologies Co., Ltd. | Pulse encoding and decoding method and pulse codec |
US10446164B2 (en) | 2010-06-24 | 2019-10-15 | Huawei Technologies Co., Ltd. | Pulse encoding and decoding method and pulse codec |
US8959018B2 (en) | 2010-06-24 | 2015-02-17 | Huawei Technologies Co.,Ltd | Pulse encoding and decoding method and pulse codec |
Also Published As
Publication number | Publication date |
---|---|
CN101138022A (en) | 2008-03-05 |
EP1859441A1 (en) | 2007-11-28 |
CN101138022B (en) | 2011-08-10 |
KR101235425B1 (en) | 2013-02-20 |
JP2008533522A (en) | 2008-08-21 |
TW200639801A (en) | 2006-11-16 |
ATE513290T1 (en) | 2011-07-15 |
BRPI0520115A2 (en) | 2009-09-15 |
BRPI0520115B1 (en) | 2018-07-17 |
EP1859441B1 (en) | 2011-06-15 |
KR20070116869A (en) | 2007-12-11 |
JP5174651B2 (en) | 2013-04-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8000967B2 (en) | Low-complexity code excited linear prediction encoding | |
US7778827B2 (en) | Method and device for gain quantization in variable bit rate wideband speech coding | |
US8856012B2 (en) | Apparatus and method of encoding and decoding signals | |
US5778335A (en) | Method and apparatus for efficient multiband celp wideband speech and music coding and decoding | |
US9928843B2 (en) | Method and apparatus for encoding/decoding speech signal using coding mode | |
Atal et al. | Speech and audio coding for wireless and network applications | |
JP2006525533A5 (en) | ||
CN101218628A (en) | Apparatus and method of encoding and decoding an audio signal | |
KR20020077389A (en) | Indexing pulse positions and signs in algebraic codebooks for coding of wideband signals | |
US20050258983A1 (en) | Method and apparatus for voice trans-rating in multi-rate voice coders for telecommunications | |
WO2015157843A1 (en) | Methods, encoder and decoder for linear predictive encoding and decoding of sound signals upon transition between frames having different sampling rates | |
US7634402B2 (en) | Apparatus for coding of variable bitrate wideband speech and audio signals, and a method thereof | |
JP3396480B2 (en) | Error protection for multimode speech coders | |
JP2002268686A (en) | Voice coder and voice decoder | |
EP1859441B1 (en) | Low-complexity code excited linear prediction encoding | |
AU2018338424B2 (en) | Method and device for efficiently distributing a bit-budget in a CELP codec | |
US20070276655A1 (en) | Method and apparatus to search fixed codebook and method and apparatus to encode/decode a speech signal using the method and apparatus to search fixed codebook | |
KR100389898B1 (en) | Method for quantizing linear spectrum pair coefficient in coding voice | |
Rutherford | Improving the performance of Federal Standard 1016 (CELP) | |
Zhou et al. | A unified framework for ACELP codebook search based on low-complexity multi-rate lattice vector quantization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | WWE | Wipo information: entry into national phase | Ref document number: 200580048981.6; Country of ref document: CN |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | WWE | Wipo information: entry into national phase | Ref document number: 2005722196; Country of ref document: EP |
| | WWE | Wipo information: entry into national phase | Ref document number: 2008500663; Country of ref document: JP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | NENP | Non-entry into the national phase | Ref country code: RU |
| | WWE | Wipo information: entry into national phase | Ref document number: 1020077023047; Country of ref document: KR |
| | WWP | Wipo information: published in national office | Ref document number: 2005722196; Country of ref document: EP |
| | ENP | Entry into the national phase | Ref document number: PI0520115; Country of ref document: BR; Kind code of ref document: A2 |