EP1933306A1 - Method and apparatus for transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format - Google Patents

Publication number
EP1933306A1
Authority
EP
European Patent Office
Prior art keywords
speech signal
parameter
pitch
pitch parameter
pcm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP06025955A
Other languages
German (de)
French (fr)
Inventor
Christophe Beaugeant
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nokia Solutions and Networks GmbH and Co KG
Original Assignee
Nokia Siemens Networks GmbH and Co KG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Siemens Networks GmbH and Co KG
Priority to EP06025955A
Publication of EP1933306A1
Legal status: Withdrawn

Classifications

    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G10L19/16Vocoder architecture
    • G10L19/173Transcoding, i.e. converting between two coded representations avoiding cascaded coding-decoding

Definitions

  • the present invention relates to a method and apparatus for transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format.
  • CELP code excited linear prediction
  • CELP coding Many different formats of CELP coding are in use today. In order to successfully decode a CELP-coded speech signal, the decoder must employ the same CELP coding model, referred to in the following as "format", as the encoder that produced the signal. When communications systems employing different CELP formats must share speech data, it is often desirable to convert the speech signal from one CELP coding format to another.
  • tandem transcoding A known method to provide interconnectivity consists of decoding one standard compressed bitstream and re-encoding it into the other standard bitstream.
  • This known method is called tandem transcoding.
  • tandem transcoding system includes an input CELP format decoder and an output CELP format encoder.
  • the input format CELP decoder receives a speech signal that has been encoded using one CELP format.
  • the decoder of the tandem coding system decodes the input coded speech signal to produce a pcm speech signal.
  • the output CELP format encoder of the tandem coding system receives the decoded pcm speech signal and encodes it using the output CELP format to produce a compressed output signal in the output CELP format.
  • the primary disadvantage of this approach is the perceptual degradation experienced by the speech signal in passing through multiple encoders and decoders. Further, the known tandem transcoding scheme suffers from the problems of complexity and delay.
  • the basic idea of this internal smart transcoding solution of the applicant is to use the redundancy on the standard to avoid computing parameters that were already computed. For example, it is possible to use parameters already coded at the encoder of the sending apparatus at the encoder of the transcoding system or apparatus to drive the re-encoding.
  • One of these parameters mapped between the speech codecs is the pitch parameter.
  • the pitch mapping is provided by copying the pitch or pitch parameter from the bitstream of a first codec to the encoder of a second codec.
  • the pitch estimation is done in two steps in standardized CELP coding.
  • An open-loop search gives a first estimate of the pitch T O .
  • a closed loop pitch T OP is obtained as a refinement of the pitch parameter T O by a search in an interval [T O -T LOW ; T O +T HIGH ].
  • a further enhanced internal solution is to provide a mapping skipping either the open-loop search or both the closed loop-search and the open-loop search dependent on predefined parameters.
  • the pitch parameter of the first codec T OP (A) is taken as the output of the open-loop search so that the closed loop search at the encoder of the second codec is done in an interval around T OP (A).
  • the pitch T OP (A) is directly taken as the output of the closed loop search and is quantized at the encoder of the second codec.
  • More advanced approaches try to estimate more accurately the pitch or pitch parameter at the encoder of the second codec given the pitch computed by the first codec.
  • Such approaches are known, for example, from "An Efficient transcoding algorithm for G.723.1 and EVRC speech coders", Kyung Tae Kim et al., IEEE 54th, or from "A novel scheme from EVRC to G.729AB", Pankaj K. R., 37th Asilomar Conf. on Signals, Systems and Computers, 2003.
  • Said advanced approaches could be called "pitch smoothing" methods.
  • the open-loop pitch computation at the encoder of the second codec is driven by the pitch parameter T OP (A) of the first codec.
  • T OP (A) the pitch parameter of the first codec.
  • an open-loop search at the encoder of the second codec is also driven by the pitch or pitch parameter T OP (A), by limiting the closed loop search to a restricted interval [T O -T' LOW ; T O +T' HIGH ] with T' LOW < T LOW and T' HIGH < T HIGH . All the previously mentioned solutions work at the encoder of the second codec either on the output of the open-loop search or on the output of the closed loop search.
  • An object of the present invention is to provide an optimal compromise for a transcoding scheme between the quality of the transmission of the speech signal and the complexity of the generating of the pitch parameter.
  • a method of transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format comprises the following steps:
  • transcoding apparatus for transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format
  • the transcoding apparatus comprises:
  • Voiced and unvoiced characteristics are defined by the action of the vocal cords.
  • the vocal cords vibrate for voiced sounds, but do not vibrate for unvoiced sounds. For example, all the vowels in English are voiced sounds. Some of the consonants such as "b", "d" are partially voiced.
  • the beginning of the phoneme [b] or [d] is plosive, the end is voiced, while "p", "f" for instance are completely unvoiced.
  • the estimation of the pitch parameter at the encoder of the second codec is all the more accurate the more thoroughly the open-loop search and the closed-loop search are carried out.
  • Definitions for the closed-loop search and the open-loop search can be found in the speech codec specifications of ITU-T or 3GPP, for instance in "Mandatory Speech Codec speech processing functions; Adaptive Multi-Rate (AMR) speech codec", release 6, 3GPP TS 26.090, v6.0.0 or in "Coding of Speech at 8 kbit/s using conjugate-structure algebraic-excited linear prediction", ITU-T recommendation G.729 03/1996.
  • AMR Adaptive Multi-Rate
  • the pitch parameter T OP (A) is taken as the output of the open loop at the encoder of the second codec and it is possible to skip the closed loop search depending on the energy level of the decoded speech signal and the voiced level of the speech signal.
  • the influence of the pitch is less so that an accurate estimation of the pitch parameter for the second CELP format is not needed according to the present invention.
  • the pitch parameter of the first codec can be used as the output of the closed loop of the encoder of the second codec.
  • an optimal compromise between the quality of the transmission of the speech signal and the complexity of the generating of the pitch parameter of the encoder of the second codec is provided.
  • the method comprises the steps of:
  • the transcoding apparatus comprises:
  • the energy of the signal is an important factor in determining whether or not the quality needs to be optimal at the encoder of the second codec.
  • Artefacts are in principle more acceptable on low energy signals than on high energy signals. Indeed, low energy signals are less audible and can in principle be degraded more than energetic signals. Accordingly, in the present invention an accurate estimation of the pitch parameter at the encoder of the second codec is in principle only applied for high energy signals.
  • An advantage of the present invention is to provide an adaptive compromise between the closed loop search and the open loop search depending on the first pitch parameter of the first compressed speech signal encoded using the first CELP format and depending on its energy level.
  • the method further comprises the step of encoding the decoded pcm speech signal using the second CELP format to a second coded speech signal including at least a second pitch parameter.
  • the closed loop search process is performed in a restricted interval [T OP (A)-T' LOW ; T OP (A)+T' HIGH ] around the first pitch parameter T OP (A), wherein T' LOW designates a preselected lower pitch threshold value and T' HIGH designates a preselected upper pitch threshold value.
  • the lower and the upper pitch first threshold values are preselected the greater the detected voiced level is.
  • the first parameter and/or the second parameter are provided as a predetermined threshold value, respectively.
  • the first CELP format is provided by a first codec and the second CELP format is provided by a second codec which is different to the first codec.
  • Suitable examples for the first codec and the second codec are the following:
  • the first codec and the second codec are selected from the group of AMR, AMR-WB/G.722.2, G.729, annexes of G.729, G.723.1, EVRC and VMR-WB.
  • the first coded speech signal includes at least the first pitch parameter T op (A) and an additional parameter set comprising a linear prediction code (LPC) parameter and/or at least one fixed gain parameter and/or at least one adaptive gain parameter and/or one adaptive code-book parameter.
  • A first pitch parameter
  • LPC linear prediction code
  • the voiced level of the pcm speech signal is detected by means of using a variability of the first pitch parameter at a predetermined frame or for the predetermined time and/or by means of using at least one parameter of the additional parameter set.
  • the energy level of the pcm speech signal is detected by means of using the fixed gain parameter of the first coded speech signal and/or by means of computing the energy level of the decoded pcm speech signal.
  • Figure 1 shows a schematic flow diagram of a first embodiment of the method for transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format.
  • CELP code excited linear prediction
  • inventive method for transcoding the speech signal is explained by means of the schematic flow diagram of figure 1 referring to the block diagram of figure 4 .
  • the method of transcoding the speech signal of the present invention comprises the method steps S1-S8:
  • a first coded speech signal DS1 is received, in particular the frame of a speech signal.
  • Said coded speech signal DS1 is encoded using the first CELP format and includes at least a first pitch parameter T OP (A).
  • the first coded speech signal DS1 includes at least the first pitch parameter T OP (A) and an additional parameter set.
  • the additional parameter set comprises a linear prediction code (LPC) parameter and/or at least one fixed gain parameter and/or at least one adaptive gain parameter and/or at least one adaptive code-book parameter.
  • LPC linear prediction code
  • the received first coded speech signal DS1 is decoded to a decoded pcm speech signal AS2.
  • a voiced level VL of the pcm speech signal AS2 is detected within a predetermined time window T.
  • the voiced level VL of the pcm speech signal AS2 is detected by means of using a variability of the first pitch parameter T OP (A) at the predetermined frame or for the predetermined time window T.
  • the voiced level VL can be detected by means of using at least one parameter of said additional parameter set.
  • the first parameter P1 can be provided as a predetermined threshold value.
  • an energy level EL of the pcm speech signal AS2 is detected within the predetermined time window T.
  • the energy level EL of the pcm speech signal AS2 is detected by means of using the fixed gain parameter of the first coded speech signal DS1 and/or by means of computing the energy level of the decoded pcm speech signal AS2.
  • the energy level EL of the pcm speech signal AS2 is high or low dependent on at least a second parameter P2.
  • the second parameter P2 is provided as a predetermined threshold value.
  • a closed loop search process is performed which receives at least the first pitch parameter T OP (A) and estimates a second pitch parameter T OP (B) for the second CELP format dependent on at least the first pitch parameter T OP (A).
  • the closed loop search process is performed in a restricted interval [T OP (A)-T' LOW ; T OP (A)+T' HIGH ] around the first pitch parameter T OP (A), wherein T' LOW designates a pre-selected lower pitch threshold value and T' HIGH designates an upper pitch threshold value.
  • the lower and the upper pitch threshold values T' HIGH , T' LOW are pre-selected the greater the detected voiced level VL is.
  • the first pitch parameter T OP (A) is copied as second pitch parameter T OP (B) for the second CELP format.
  • a preferable value for energy level EL on a frame is energy level EL ⁇ 20 dB.
  • Figure 2 is a schematic flow diagram of a second embodiment of the method of the present invention.
  • the second embodiment of figure 2 comprises the method steps S1-S8 as shown in figure 1 and as explained above. Further, the second embodiment of the method of the present invention of figure 2 comprises the additional method step S9.
  • the decoded pcm speech signal AS2 is encoded using the second CELP format to a second coded speech signal DS2 including at least the second pitch parameter T OP (B).
  • Figure 3 is a diagram showing the pitch parameter T OP over the time t.
  • Figure 3 shows that the amplitude of the pitch parameter T OP is low in the voiced periods VP.
  • the pitch parameter T OP is high in the unvoiced periods UP.
  • FIG. 4 is a schematic block diagram of an embodiment of the transcoding apparatus 1 of the present invention.
  • the transcoding apparatus 1 of figure 4 is adapted to execute the method of figure 1 , respectively, of figure 2 .
  • the transcoding apparatus 1 comprises a receiving means 2, a decoding means 3, a first detecting means 4, a first determining means 5, a second detecting means 6, a second determining means 7 and a pitch parameter providing means 8 comprising at least a closed loop search means 8a and a copying means 8b (see figure 6 ).
  • the receiving means 2 is adapted to receive the first coded speech signal DS1 encoded using the first CELP format and including at least the first pitch parameter T OP (A).
  • the decoding means 3 is adapted to decode the received first coded speech signal DS1 to provide a pcm speech signal AS2.
  • the first detecting means 4 is adapted to detect the voiced level VL of the pcm speech signal AS2 within the predetermined time window T.
  • the first determining means 5 is adapted to determine, if the pcm speech signal AS2 is a voiced speech signal or an unvoiced speech signal dependent on at least the first parameter P1.
  • the second detecting means 6 is adapted to detect an energy level EL of the pcm speech signal AS2 within the predetermined time window T.
  • the second determining means 7 is adapted to determine, if the energy level EL of the pcm speech signal AS2 is high or low dependent on at least the second parameter P2.
  • the closed loop search means 8a is adapted to perform a closed loop search.
  • the closed loop search means 8a receives at least the first pitch parameter T OP (A) and estimates a second pitch parameter T OP (B) for the second CELP format dependent on at least the first pitch parameter T OP (A), if the pcm speech signal AS2 is voiced and its energy level EL is high (see figure 6 ).
  • the copying means 8b is adapted to copy the first pitch parameter T OP (A) as the second pitch parameter T OP (B) for the second CELP format, if the pcm speech signal AS2 is unvoiced or its energy level EL is low (see figure 6 ).
  • the closed loop search means 8a and the copying means 8b are shown in detail in figure 6 .
  • the pitch parameter providing means 8 comprises further a decision means 8c.
  • the decision means 8c receives the signals EL' and VL'.
  • EL' designates the detection result of the second determining means 7. For example, if the energy level EL is greater than the second parameter P2, which is a threshold value, the decision signal EL' is high. If, on the other hand, the energy level EL is smaller than or equal to the second parameter P2, the decision signal EL' is low. Further, the signal VL' is the decision signal of the first detecting means 4. If, for example, the voiced level VL is greater than the first parameter P1, which is a threshold value, the decision signal VL' is high. Otherwise, the decision signal VL' is low.
  • the closed loop search means 8a performs the closed loop search in a restricted interval [T OP (A)-T' LOW ; T OP (A)+T' HIGH ] around the first pitch parameter T OP (A).
  • FIG. 5 shows a schematic block diagram of the transcoding apparatus 1 coupled between two terminal units 11 and 14.
  • a first terminal unit 11 comprises an encoding means 12 and a sending means 13.
  • the encoding means 12 receives a first pcm speech signal AS1 and encodes said first pcm speech signal AS1 using the first CELP format to a first coded speech signal DS1.
  • the sending means 13 receives the first coded speech signal DS1 encoded with the first CELP format and sends it to the transcoding apparatus 1.
  • the second terminal unit 14 comprises a receiving means 15 and a decoding means 16.
  • the receiving means 15 receives the second coded speech signal DS2 encoded with the second CELP format.
  • the receiving means 15 transfers the received second coded speech signal DS2 to the decoding means 16 for decoding.
  • the decoding means 16 works with the second CELP format.
  • the present invention is not limited to the use of one transcoding apparatus between the terminal units; several different transcoding apparatuses could also be provided, wherein neighbouring transcoding apparatuses which are coupled to each other work on the same CELP format.

Abstract

The method of the present invention comprises the following steps: receiving a first coded speech signal using the first CELP format and including at least a first pitch parameter; decoding the received first coded speech signal to a decoded pcm speech signal; detecting a voiced level of the pcm speech signal within a predetermined time window; determining, if the pcm speech signal is a voiced speech signal or an unvoiced speech signal dependent on at least a first parameter; if the pcm speech signal is voiced, performing a closed loop search process which receives at least the first pitch parameter and estimates a second pitch parameter for the second CELP format dependent on at least the first pitch parameter; and if the pcm speech signal is unvoiced, copying the first pitch parameter as the second pitch parameter for the second CELP format.

Description

  • The present invention relates to a method and apparatus for transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format.
  • In the past, several types of networks have been developed, such as mobile GSM, UMTS, CDMA and IP networks, all of which provide an alternative to the known circuit-switched network. The interconnection of all these networks raises an interoperability problem regarding the transmission of speech. Indeed, incompatible speech standards were adopted in the different networks, although most of the codecs at medium rate (5-16.5 kbit/s) are based on the same model, namely code excited linear prediction (CELP).
  • Many different formats of CELP coding are in use today. In order to successfully decode a CELP-coded speech signal, the decoder must employ the same CELP coding model, referred to in the following as "format", as the encoder that produced the signal. When communications systems employing different CELP formats must share speech data, it is often desirable to convert the speech signal from one CELP coding format to another.
  • A known method to provide interconnectivity consists of decoding one standard compressed bitstream and re-encoding it into the other standard bitstream. This known method is called tandem transcoding. In a tandem coding system an input CELP format is converted to an output CELP format. Therefore, a tandem transcoding system includes an input CELP format decoder and an output CELP format encoder. The input format CELP decoder receives a speech signal that has been encoded using one CELP format. The decoder of the tandem coding system decodes the input coded speech signal to produce a pcm speech signal. The output CELP format encoder of the tandem coding system receives the decoded pcm speech signal and encodes it using the output CELP format to produce a compressed output signal in the output CELP format. The primary disadvantage of this approach is the perceptual degradation experienced by the speech signal in passing through multiple encoders and decoders. Further, the known tandem transcoding scheme suffers from the problems of complexity and delay.
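The tandem scheme described above can be sketched as follows. This is a minimal illustration only: the names `tandem_transcode`, `toy_decode_a` and `toy_encode_b` are hypothetical, and the toy codecs merely stand in for real CELP decoders and encoders so that the example runs.

```python
from typing import Callable, List, Sequence

# Hypothetical codec interfaces: a decoder maps a coded frame to PCM samples,
# an encoder maps PCM samples back to a coded frame.
Decoder = Callable[[bytes], Sequence[int]]
Encoder = Callable[[Sequence[int]], bytes]


def tandem_transcode(frame_a: bytes, decode_a: Decoder, encode_b: Encoder) -> bytes:
    """Fully decode the input format to PCM, then fully re-encode in the output format."""
    pcm = decode_a(frame_a)   # complete decoder of the input format runs
    return encode_b(pcm)      # complete encoder of the output format runs


# Toy stand-ins (assumptions for illustration, not real CELP codecs).
def toy_decode_a(frame: bytes) -> List[int]:
    return [b - 128 for b in frame]


def toy_encode_b(pcm: Sequence[int]) -> bytes:
    return bytes((s + 128) & 0xFF for s in pcm)
```

Because the output encoder only sees PCM samples, it has to recompute every parameter from scratch, including the pitch, which is where the complexity and delay mentioned above come from.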
  • Furthermore, smart transcoding solutions are known internally to the applicant; they exploit the fact that the different standards are based on the CELP principle. Their aim is to reduce the complexity of the transcoding, as many functions at the decoder of the transcoding system can be skipped. Further, they aim to decrease the delay and to enhance the quality, or at least to match the quality of the known transcoding scheme.
  • The basic idea of this internal smart transcoding solution of the applicant is to use the redundancy between the standards to avoid recomputing parameters that were already computed. For example, parameters already coded at the encoder of the sending apparatus can be used at the encoder of the transcoding system or apparatus to drive the re-encoding. One of these parameters mapped between the speech codecs is the pitch parameter.
  • According to an internal solution of the applicant, the pitch mapping is provided by copying the pitch or pitch parameter from the bitstream of a first codec to the encoder of a second codec.
  • The pitch estimation is done in two steps in standardized CELP coding. An open-loop search gives a first estimate of the pitch TO. In a second step, a closed loop search refines the pitch parameter TO within an interval [TO-TLOW; TO+THIGH] to obtain a closed loop pitch TOP.
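The two-step estimation can be illustrated with a simplified sketch. It assumes a plain correlation criterion for both steps, whereas standardized codecs use a perceptually weighted analysis-by-synthesis criterion in the closed loop; it also assumes the refinement interval runs from `t_low` below to `t_high` above the open-loop value. All function names, the coarse lag grid and the numeric values are illustrative assumptions.

```python
import math


def correlation_score(x, lag, window):
    """Correlate the last `window` samples with the signal delayed by `lag`."""
    start = len(x) - window
    if start < lag:
        raise ValueError("signal too short for this lag and window")
    num = sum(x[n] * x[n - lag] for n in range(start, len(x)))
    den = math.sqrt(sum(x[n - lag] ** 2 for n in range(start, len(x)))) or 1.0
    return num / den


def open_loop_pitch(x, min_lag, max_lag, window, step=4):
    """Coarse first estimate: best lag on a sparse grid (stand-in for the open-loop search)."""
    lags = range(min_lag, max_lag + 1, step)
    return max(lags, key=lambda lag: correlation_score(x, lag, window))


def closed_loop_pitch(x, t_o, t_low, t_high, window):
    """Refinement: best lag among all integers in [t_o - t_low, t_o + t_high]."""
    lags = range(t_o - t_low, t_o + t_high + 1)
    return max(lags, key=lambda lag: correlation_score(x, lag, window))


# Synthetic periodic signal with a true period of 50 samples.
signal = [math.sin(2 * math.pi * n / 50) for n in range(300)]
t_o = open_loop_pitch(signal, min_lag=20, max_lag=90, window=200)
t_op = closed_loop_pitch(signal, t_o, t_low=3, t_high=3, window=200)
```

On this synthetic signal the coarse grid does not contain the true period, so the open-loop step lands on a neighbouring grid point and the closed-loop step recovers the period of 50 samples within the narrow interval.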
  • A further enhanced internal solution is to provide a mapping skipping either the open-loop search or both the closed loop search and the open-loop search dependent on predefined parameters. For example, the pitch parameter of the first codec TOP (A) is taken as the output of the open-loop search so that the closed loop search at the encoder of the second codec is done in an interval around TOP (A). In the case where the closed loop search is also skipped, the pitch TOP (A) is directly taken as the output of the closed loop search and is quantized at the encoder of the second codec.
  • More advanced approaches try to estimate the pitch or pitch parameter at the encoder of the second codec more accurately, given the pitch computed by the first codec. Such approaches are known, for example, from "An Efficient transcoding algorithm for G.723.1 and EVRC speech coders", Kyung Tae Kim et al., IEEE 54th, or from "A novel scheme from EVRC to G.729AB", Pankaj K. R., 37th Asilomar Conf. on Signals, Systems and Computers, 2003. Said advanced approaches could be called "pitch smoothing" methods. Therein, the open-loop pitch computation at the encoder of the second codec is driven by the pitch parameter TOP (A) of the first codec. There is no direct mapping, but rather a search for the open-loop pitch taking into account the difference between the pitch TOP (A) and the one computed at the encoder of the second codec for the previous frame of the bitstream.
  • Further, in "Improvement issues on transcoding algorithms, for the flexible usage to the various pairs of speech codec", Jin-Kyu Choi et al., ICASSP 2004, an open-loop search at the encoder of the second codec is also driven by the pitch or pitch parameter TOP (A), by limiting the closed loop search to a restricted interval [TO-T'LOW; TO+T'HIGH] with T'LOW < TLOW and T'HIGH < THIGH. All the previously mentioned solutions work at the encoder of the second codec either on the output of the open-loop search or on the output of the closed loop search.
  • An object of the present invention is to provide an optimal compromise for a transcoding scheme between the quality of the transmission of the speech signal and the complexity of the generating of the pitch parameter.
  • The above-mentioned object is achieved by means of a method with the features of claim 1 and/or by means of a transcoding apparatus of claim 12.
  • According to the present invention, a method of transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format is provided, wherein the method comprises the following steps:
    a) receiving a first encoded speech signal using the first CELP format and including at least a first pitch parameter;
    b) decoding the received first compressed speech signal to a decoded pcm speech signal;
    c) detecting a voiced level of the pcm speech signal within a predetermined time window;
    d) determining, if the pcm speech signal is a voiced speech signal or an unvoiced speech signal dependent on at least a first parameter;
    e) if the pcm speech signal is voiced, performing a closed loop search process which receives at least the first pitch parameter and estimates a second pitch parameter for the second CELP format dependent on at least the first pitch parameter; and
    f) if the pcm speech signal is unvoiced, copying the first pitch parameter as the second pitch parameter for the second CELP format.
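Steps c) to f) above can be sketched as a small decision routine. The sketch rests on stated assumptions: the voiced level is derived from the variability of recent first pitch parameters (a stable pitch is treated as voiced), the mapping `1 / (1 + standard deviation)` is a hypothetical choice, and `closed_loop_search` is a placeholder for the second encoder's own search. None of these names or formulas are prescribed by the text.

```python
import statistics
from typing import Callable, Sequence


def voiced_level(pitch_track: Sequence[int]) -> float:
    """Hypothetical voiced level in (0, 1]: a stable pitch track gives a value near 1."""
    return 1.0 / (1.0 + statistics.pstdev(pitch_track))


def second_pitch_parameter(first_pitch: int,
                           pitch_track: Sequence[int],
                           p1: float,
                           closed_loop_search: Callable[[int], int]) -> int:
    """Refine the pitch when the window is voiced, otherwise copy it."""
    if voiced_level(pitch_track) > p1:           # steps c) and d): voiced decision
        return closed_loop_search(first_pitch)   # step e): closed loop search
    return first_pitch                           # step f): copy the first pitch parameter


# Placeholder search that shifts the pitch by one sample, just to make the branch visible.
def refine(t):
    return t + 1


voiced_result = second_pitch_parameter(50, [50, 50, 51, 50], p1=0.5, closed_loop_search=refine)
unvoiced_result = second_pitch_parameter(40, [40, 90, 25, 120], p1=0.5, closed_loop_search=refine)
```

With the illustrative threshold of 0.5, the stable track takes the closed loop branch and the erratic track takes the copy branch.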
  • Further, a transcoding apparatus for transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format is provided, wherein the transcoding apparatus comprises:
    a) a receiving means which receives the first speech signal encoded using the first CELP format and including at least the first pitch parameter;
    b) a decoding means which decodes the received first compressed speech signal to the decoded pcm speech signal;
    c) a first detecting means which detects the voiced level of the pcm speech signal within the predetermined time window;
    d) a first determining means which determines if the pcm speech signal is the voiced speech signal or the unvoiced speech signal dependent on at least the first parameter;
    e) a closed loop search means which performs a closed loop search, wherein the closed loop search means receives at least the first pitch parameter and estimates a second pitch parameter for the second CELP format dependent on at least the first pitch parameter, if the pcm speech signal is voiced; and
    f) a copying means which copies the first pitch parameter as the second pitch parameter for the second CELP format, if the pcm speech signal is unvoiced.
  • An experiment of the applicant has shown that an accurate pitch value is necessary at encoding during voiced periods to assure a good quality of the coded signal. During unvoiced periods the CELP encoders are less sensitive to a wrong estimation of the pitch.
  • Voiced and unvoiced characteristics are defined by the action of the vocal cords. The vocal cords vibrate for voiced sounds, but do not vibrate for unvoiced sounds. For example, all the vowels in English are voiced sounds. Some of the consonants such as "b", "d" are partially voiced. The beginning of the phoneme [b] or [d] is plosive, the end is voiced, while "p", "f" for instance are completely unvoiced.
  • The estimation of the pitch parameter at the encoder of the second codec is all the more accurate the more thoroughly the open-loop search and the closed-loop search are carried out.
  • Definitions for the closed-loop search and the open-loop search can be found in the speech codec specifications of ITU-T or 3GPP, for instance in "Mandatory Speech Codec speech processing functions; Adaptive Multi-Rate (AMR) speech codec", release 6, 3GPP TS 26.090, v6.0.0 or in "Coding of Speech at 8 kbit/s using conjugate-structure algebraic-excited linear prediction", ITU-T recommendation G.729 03/1996.
  • The more the process of the open-loop search and the closed-loop search is applied, the more accurate, but also the more complex, the estimation of the pitch parameter is. According to the present invention, the closed loop search is kept in voiced periods, whereas during unvoiced periods the whole closed loop search can in principle be skipped. Taking this basic inventive principle into account in a transcoding scheme, the pitch parameter TOP (A) is taken as the output of the open loop at the encoder of the second codec, and the closed loop search can be skipped depending on the energy level of the decoded speech signal and the voiced level of the speech signal. In the case of an unvoiced signal, the influence of the pitch is smaller, so that an accurate estimation of the pitch parameter for the second CELP format is not needed according to the present invention. In a transcoding scheme of the present invention, the pitch parameter of the first codec can in such a case be used as the output of the closed loop of the encoder of the second codec.
  • Advantageously, according to the present invention, an optimal compromise between the quality of the transmission of the speech signal and the complexity of the generating of the pitch parameter of the encoder of the second codec is provided.
  • Advantages, developments and improvements of the present invention are found in the subclaims.
  • According to an embodiment of the present invention, the method comprises the steps of:
    1. a) receiving the first coded speech signal encoded using the first CELP format and including at least the first pitch parameter;
    2. b) decoding the received first coded speech signal to the decoded pcm speech signal;
    3. c) detecting the voiced level of the pcm speech signal within the predetermined time window;
    4. d) determining, if the pcm speech signal is a voiced speech signal or an unvoiced speech signal dependent on at least the first parameter;
    5. e) detecting an energy level of the pcm speech signal within the predetermined time window;
    6. f) determining, if the energy level of the pcm speech signal is high or low dependent on at least a second parameter;
    7. g) if the pcm speech signal is voiced and its energy level is high, performing the closed loop search process which receives at least the first pitch parameter and estimates the second pitch parameter for the second CELP format dependent on at least the first pitch parameter;
    8. h) if the pcm speech signal is unvoiced or its energy level is low, copying the first pitch parameter as the second pitch parameter for the second CELP format.
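  • Steps g) and h) amount to a two-way selection, which can be sketched as follows. This is only an illustrative sketch: the function name `select_second_pitch` and the `closed_loop_search` callable are hypothetical and do not appear in the description, and the voiced and energy decisions of steps d) and f) are assumed to be available as booleans.

```python
from typing import Callable


def select_second_pitch(first_pitch: int,
                        voiced: bool,
                        high_energy: bool,
                        closed_loop_search: Callable[[int], int]) -> int:
    """Return the pitch parameter for the second CELP format.

    closed_loop_search is a hypothetical stand-in for the second
    encoder's closed loop search, seeded with the first pitch parameter.
    """
    if voiced and high_energy:
        # Step g): refine the pitch starting from the first codec's value.
        return closed_loop_search(first_pitch)
    # Step h): unvoiced or low energy, so the first pitch is copied as is.
    return first_pitch
```

  • With this structure, the costly search is invoked only for voiced, high energy frames; all other frames reuse the first pitch parameter unchanged.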
  • According to an embodiment of the present invention, the transcoding apparatus comprises:
    1. a) a receiving means which receives the first coded speech signal encoded using the first CELP format and including at least the first pitch parameter;
    2. b) a decoding means which decodes the received first coded speech signal to the decoded pcm speech signal;
    3. c) a first detecting means which detects the voiced level of the pcm speech signal within the predetermined time window;
    4. d) a first determining means which determines if the pcm speech signal is the voiced speech signal or the unvoiced speech signal dependent on at least the first parameter;
    5. e) a second detecting means which detects an energy level of the pcm speech signal within the predetermined time window;
    6. f) a second determining means which determines if the energy level of the pcm speech signal is high or low dependent on at least the second parameter;
    7. g) a closed loop search means which performs a closed loop search, wherein the closed loop search means receives at least the first pitch parameter and estimates a second pitch parameter for the second CELP format dependent on at least the first pitch parameter, if the pcm speech signal is voiced and its energy level is high; and
    8. h) a copying means which copies the first pitch parameter as the second pitch parameter for the second CELP format, if the pcm speech signal is unvoiced or its energy level is low.
  • According to the present invention, the energy of the signal is an important factor in determining whether or not the quality needs to be optimal at the encoder of the second codec. Artefacts are in principle more acceptable on low energy signals than on high energy signals. Indeed, low energy signals are less audible and can in principle be degraded more than energetic signals. Accordingly, an accurate estimation of the pitch parameter at the encoder of the second codec is in principle applied only for high energy signals.
  • An advantage of the present invention is to provide an adaptive compromise between the closed loop search and the open loop search depending on the first pitch parameter of the first compressed speech signal encoded using the first CELP format and depending on its energy level.
  • According to a further embodiment of the present invention, the method further comprises the step of encoding the decoded pcm speech signal using the second CELP format to a second coded speech signal including at least a second pitch parameter.
  • According to a further embodiment, the closed loop search process is performed in a restricted interval [Top(A)-T'LOW; Top(A)+T'HIGH] around the first pitch parameter (Top(A)), wherein T'LOW designates a preselected lower pitch threshold value and T'HIGH designates a preselected upper pitch threshold value.
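  • A search over such a restricted interval can be sketched as below. The selection criterion is an assumption made for illustration: the sketch maximises the normalised correlation between the current frame and the delayed signal, which is a simplified stand-in for the analysis-by-synthesis criterion of an actual CELP encoder. The function name and its arguments are hypothetical.

```python
def restricted_pitch_search(signal, start, frame_len, first_pitch, t_low, t_high):
    """Return the lag in [first_pitch - t_low, first_pitch + t_high] that best
    predicts the frame signal[start:start + frame_len] from its own past."""
    best_lag = first_pitch          # fall back to the first codec's pitch
    best_score = float("-inf")
    lowest = max(1, first_pitch - t_low)
    for lag in range(lowest, first_pitch + t_high + 1):
        if start - lag < 0:
            continue                # not enough past samples for this lag
        num = 0.0                   # correlation of frame with delayed signal
        den = 0.0                   # energy of the delayed signal
        for n in range(start, start + frame_len):
            num += signal[n] * signal[n - lag]
            den += signal[n - lag] * signal[n - lag]
        if num <= 0.0 or den <= 0.0:
            continue                # no usable periodicity at this lag
        score = num * num / den
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

  • Only T'LOW + T'HIGH + 1 candidate lags are evaluated, so the cost of the search scales with the width of the interval rather than with the full pitch range of the second codec.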
  • According to a further embodiment, the greater the detected voiced level is, the greater the lower and the upper pitch threshold values are preselected.
  • According to a further embodiment, the first parameter and/or the second parameter are provided as a predetermined threshold value, respectively.
  • According to a further embodiment, the first CELP format is provided by a first codec and the second CELP format is provided by a second codec which is different to the first codec. Suitable examples for the first codec and the second codec are the following:
    1. AMR: Mandatory Speech Codec speech processing functions; Adaptive Multi-Rate (AMR) speech codec (Release 6), 3GPP TS 26.090, v6.0.0.
    2. AMR-WB/G.722.2: Wideband coding of speech at around 16 kbit/s using Adaptive Multi-Rate Wideband (AMR-WB), ITU-T Recommendation G.722.2, 07/2003.
    3. G.729: Coding of Speech at 8 kbit/s using conjugate-structure algebraic-excited linear prediction, ITU-T Recommendation G.729, 03/1996.
    4. Annexes of G.729: Annex A, B, D, E, J.
    5. G.723.1: Dual rate speech coder for multimedia communications transmitting at 5.3 and 6.3 kbit/s, ITU-T Recommendation G.723.1, 03/1996.
    6. EVRC: Enhanced Variable Rate Codec (EVRC), 3GPP2 C.S0014-0, Version 1.0, 2003.
    7. VMR-WB: Source-Controlled Variable-Rate Multimode Wideband Speech Codec (VMR-WB), 3GPP2 C.S0052-0, Version 1.0, 2004.
  • According to a further embodiment, the first codec and the second codec are selected from the group of AMR, AMR-WB/G.722.2, G.729, Annexes of G.729, G.723.1, EVRC and VMR-WB.
  • According to a further embodiment, the first coded speech signal includes at least the first pitch parameter Top (A) and an additional parameter set comprising a linear prediction code (LPC) parameter and/or at least one fixed gain parameter and/or at least one adaptive gain parameter and/or at least one adaptive code-book parameter.
  • According to a further embodiment, the voiced level of the pcm speech signal is detected by means of using a variability of the first pitch parameter at a predetermined frame or for the predetermined time window and/or by means of using at least one parameter of the additional parameter set.
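  • One possible way of turning the variability of the first pitch parameter into a voiced level is sketched below. The description does not give a formula, so the mapping is purely an assumption: a stable pitch over the time window is taken as a sign of voicing, and the level is defined as 1 / (1 + standard deviation of the pitch lags). The function name is hypothetical.

```python
from statistics import pstdev


def voiced_level_from_pitch(pitch_lags):
    """Map the variability of recent pitch lags to a level in (0, 1].

    Assumption for illustration only: the steadier the pitch over the
    window, the higher the returned voiced level.
    """
    if len(pitch_lags) < 2:
        return 0.0  # too little history to judge, treated as not voiced
    return 1.0 / (1.0 + pstdev(pitch_lags))
```

  • Because the pitch lags are already carried in the first coded speech signal, such a measure needs no analysis of the decoded pcm samples.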
  • According to a further embodiment, the energy level of the pcm speech signal is detected by means of using the fixed gain parameter of the first coded speech signal and/or by means of computing the energy level of the decoded pcm speech signal.
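  • Computing the energy level from the decoded pcm speech signal can be sketched as follows. The description does not fix the reference level of the measure, so this sketch assumes the mean square of the samples expressed in dB relative to unit amplitude; the function name and the `floor` argument are hypothetical.

```python
import math


def frame_energy_db(frame, floor=1e-12):
    """Return the mean-square energy of a pcm frame in dB.

    The reference (unit amplitude) and the floor that avoids log10(0)
    for silent frames are assumptions made for this sketch.
    """
    if not frame:
        raise ValueError("frame must contain at least one sample")
    mean_square = sum(x * x for x in frame) / len(frame)
    return 10.0 * math.log10(max(mean_square, floor))
```

  • The alternative named in the text, using the fixed gain parameter of the first coded speech signal, would avoid this pass over the samples.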
  • Exemplary embodiments of the invention are illustrated in the drawings and explained in more detail in the description below.
  • In the figures:
  • Figure 1 shows a schematic flow diagram of a first embodiment of the method of the present invention;
  • Figure 2 shows a schematic flow diagram of a second embodiment of the method of the present invention;
  • Figure 3 shows a diagram illustrating the pitch parameter over time;
  • Figure 4 shows a schematic block diagram of an embodiment of the transcoding apparatus of the present invention;
  • Figure 5 shows a schematic block diagram of the transcoding apparatus of figure 4 coupled between two terminal units; and
  • Figure 6 shows a detailed schematic block diagram of the pitch parameter providing device of figure 4.
  • In the figures, identical reference symbols designate identical or functionally identical elements.
  • Figure 1 shows a schematic flow diagram of a first embodiment of the method for transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format. In the following, the inventive method for transcoding the speech signal is explained by means of the schematic flow diagram of figure 1 referring to the block diagram of figure 4. The method of transcoding the speech signal of the present invention comprises the method steps S1-S8:
  • Method step S1:
  • A first coded speech signal DS1 is received, in particular a frame of a speech signal. Said coded speech signal DS1 is encoded using the first CELP format and includes at least a first pitch parameter TOP (A). Preferably, the first coded speech signal DS1 includes at least the first pitch parameter TOP (A) and an additional parameter set. The additional parameter set comprises a linear prediction code (LPC) parameter and/or at least one fixed gain parameter and/or at least one adaptive gain parameter and/or at least one adaptive code-book parameter.
  • Method step S2:
  • The received first coded speech signal DS1 is decoded to a decoded pcm speech signal AS2.
  • Method step S3:
  • A voiced level VL of the pcm speech signal AS2 is detected within a predetermined time window T. Particularly, the voiced level VL of the pcm speech signal AS2 is detected by means of using a variability of the first pitch parameter TOP (A) at the predetermined frame or for the predetermined time window T. Alternatively, the voiced level VL can be detected by means of using at least one parameter of said additional parameter set.
  • Method step S4:
  • It is determined, if the pcm speech signal AS2 is a voiced speech signal or an unvoiced speech signal dependent on at least a first parameter P1. The first parameter P1 can be provided as a predetermined threshold value.
  • Method step S5:
  • Preferably, an energy level EL of the pcm speech signal AS2 is detected within the predetermined time window T. Preferably, the energy level EL of the pcm speech signal AS2 is detected by means of using the fixed gain parameter of the first coded speech signal DS1 and/or by means of computing the energy level of the decoded pcm speech signal AS2.
  • Method step S6:
  • Preferably, it is determined, if the energy level EL of the pcm speech signal AS2 is high or low dependent on at least a second parameter P2. Particularly, the second parameter P2 is provided as a predetermined threshold value.
  • Method step S7:
  • If the pcm speech signal AS2 is voiced and preferably its energy level EL is high, a closed loop search process is performed which receives at least the first pitch parameter TOP(A) and estimates a second pitch parameter TOP(B) for the second CELP format dependent on at least the first pitch parameter TOP(A).
  • Preferably, the closed loop search process is performed in a restricted interval [TOP(A)-T'LOW; TOP(A)+T'HIGH] around the first pitch parameter TOP (A), wherein T'LOW designates a pre-selected lower pitch threshold value and T'HIGH designates an upper pitch threshold value.
  • Particularly, the greater the detected voiced level VL is, the greater the lower and the upper pitch threshold values T'LOW, T'HIGH are pre-selected. Typically, T'LOW and T'HIGH vary between 1 and 3 samples for codecs like AMR, G.729 and EVRC. Indeed, for such codecs, the maximal range is the complete closed-loop search made in the interval [TOP (A) -TLOW; TOP (A) +THIGH], where TLOW = 3 and THIGH = 3.
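  • The dependence of the threshold values on the voiced level can be illustrated with a small sketch. The text only states that the thresholds grow with the voiced level and typically lie between 1 and 3 samples; the linear mapping, the normalisation of the voiced level to the range 0 to 1 and the function name below are assumptions.

```python
def search_half_width(voiced_level, min_width=1, max_width=3):
    """Map a voiced level, assumed to be normalised to [0, 1], to a
    threshold value between min_width and max_width samples.

    The linear mapping is an illustrative assumption; values outside
    [0, 1] are clipped before the mapping is applied.
    """
    clipped = min(max(voiced_level, 0.0), 1.0)
    return min_width + round(clipped * (max_width - min_width))
```

  • A strongly voiced frame thus receives the full range of 3 samples on either side of TOP (A), while a weakly voiced frame is searched over 1 sample on either side only.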
  • Method step S8:
  • If the pcm speech signal AS2 is unvoiced or preferably its energy level EL is low, the first pitch parameter TOP (A) is copied as the second pitch parameter TOP (B) for the second CELP format. Typically, the energy level EL of a frame is regarded as low if EL < 20 dB.
  • Figure 2 is a schematic flow diagram of a second embodiment of the method of the present invention. The second embodiment of figure 2 comprises the method steps S1-S8 as shown in figure 1 and as explained above. Further, the second embodiment of the method of the present invention of figure 2 comprises the additional method step S9.
  • Method step S9:
  • The decoded pcm speech signal AS2 is encoded using the second CELP format to a second coded speech signal DS2 including at least the second pitch parameter TOP (B).
  • Figure 3 is a diagram showing the pitch parameter TOP over the time t. Figure 3 shows that the amplitude of the pitch parameter TOP is low in the voiced periods VP. On the other hand, the pitch parameter TOP is high in the unvoiced periods UP.
  • Figure 4 is a schematic block diagram of an embodiment of the transcoding apparatus 1 of the present invention.
  • The transcoding apparatus 1 of figure 4 is adapted to execute the method of figure 1, respectively, of figure 2.
  • Therefore, the transcoding apparatus 1 comprises a receiving means 2, a decoding means 3, a first detecting means 4, a first determining means 5, a second detecting means 6, a second determining means 7 and a pitch parameter providing means 8 comprising at least a closed loop search means 8a and a copying means 8b (see figure 6).
  • The receiving means 2 is adapted to receive the first coded speech signal DS1 encoded using the first CELP format and including at least the first pitch parameter TOP(A).
  • The decoding means 3 is adapted to decode the received first coded speech signal DS1 to provide a pcm speech signal AS2.
  • The first detecting means 4 is adapted to detect the voiced level VL of the pcm speech signal AS2 within the predetermined time window T.
  • The first determining means 5 is adapted to determine, if the pcm speech signal AS2 is a voiced speech signal or an unvoiced speech signal dependent on at least the first parameter P1.
  • The second detecting means 6 is adapted to detect an energy level EL of the pcm speech signal AS2 within the predetermined time window T.
  • The second determining means 7 is adapted to determine, if the energy level EL of the pcm speech signal AS2 is high or low dependent on at least the second parameter P2.
  • The closed loop search means 8a is adapted to perform a closed loop search. The closed loop search means 8a receives at least the first pitch parameter TOP (A) and estimates a second pitch parameter TOP (B) for the second CELP format dependent on at least the first pitch parameter TOP (A), if the pcm speech signal AS2 is voiced and its energy level EL is high (see figure 6).
  • The copying means 8b is adapted to copy the first pitch parameter TOP (A) as the second pitch parameter TOP (B) for the second CELP format, if the pcm speech signal AS2 is unvoiced or its energy level EL is low (see figure 6).
  • The closed loop search means 8a and the copying means 8b are shown in detail in figure 6. The pitch parameter providing means 8 further comprises a decision means 8c. The decision means 8c receives the signals EL' and VL'. EL' designates the detection result of the second determining means 7. For example, if the energy level EL is greater than the second parameter P2, which is a threshold value, the decision signal EL' is high. On the other hand, if the energy level EL is smaller than or equal to the second parameter P2, the decision signal EL' is low. Further, the signal VL' is the decision signal of the first detecting means 4. If, for example, the voiced level VL is greater than the first parameter P1, which is a threshold value, the decision signal VL' is high. Otherwise, the decision signal VL' is low.
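  • The threshold comparisons feeding the decision means 8c can be sketched as follows. The strict "greater than" comparisons follow the example given above; the function names and the string labels returned for the two branches are hypothetical.

```python
def decision_signals(energy_level, voiced_level, p1, p2):
    """Return (EL', VL') as booleans, where True stands for 'high'.

    A level exactly equal to its threshold yields a low decision signal,
    as in the example for EL and P2 above.
    """
    el_high = energy_level > p2
    vl_high = voiced_level > p1
    return el_high, vl_high


def select_means(el_high, vl_high):
    """Pick the branch of the pitch parameter providing means 8:
    the closed loop search (8a) or the copy of the first pitch (8b)."""
    return "closed_loop_search" if (el_high and vl_high) else "copy"
```

  • Both signals must be high for the closed loop search to run; a single low signal is sufficient to select the copy.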
  • Particularly, the closed loop search means 8a performs the closed loop search in a restricted interval [TOP (A) -T'LOW; Top (A) +T'HIGH] around the first pitch parameter TOP (A).
  • Figure 5 shows a schematic block diagram of the transcoding apparatus 1 coupled between two terminal units 11 and 14. A first terminal unit 11 comprises an encoding means 12 and a sending means 13. The encoding means 12 receives a first pcm speech signal AS1 and encodes said first pcm speech signal AS1 using the first CELP format to a first coded speech signal DS1. The sending means 13 receives the first coded speech signal DS1 encoded with the first CELP format and sends it to the transcoding apparatus 1.
  • The second terminal unit 14 comprises a receiving means 15 and a decoding means 16. The receiving means 15 receives the second coded speech signal DS2 encoded with the second CELP format. The receiving means 15 transfers the received second coded speech signal DS2 to the decoding means 16 for decoding. The decoding means 16 works with the second CELP format.
  • Although the present invention has been explained on the basis of particular exemplary embodiments, it is not restricted thereto, but rather can be modified in any desired manner without departing from the basic principle of the invention.
  • In particular, the present invention is not limited to the use of one transcoding apparatus between the terminal units; a plurality of different transcoding apparatuses could also be provided, wherein neighbouring transcoding apparatuses which are coupled to each other work on the same CELP format.

Claims (13)

  1. Method of transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format, comprising:
    a) receiving a first coded speech signal (DS1) encoded using the first CELP format and including at least a first pitch parameter (Top (A));
    b) decoding the received first coded speech signal (DS1) to a decoded pcm speech signal (AS2);
    c) detecting a voiced level (VL) of the pcm speech signal (AS2) within a predetermined time window (T);
    d) determining, if the pcm speech signal (AS2) is a voiced speech signal or an unvoiced speech signal dependent on at least a first parameter (P1);
    e) if the pcm speech signal (AS2) is voiced, performing a closed loop search process which receives at least the first pitch parameter (Top(A)) and estimates a second pitch parameter (Top(B)) for the second CELP format dependent on at least the first pitch parameter (Top(A));
    f) if the pcm speech signal (AS2) is unvoiced, copying the first pitch parameter (Top(A)) as the second pitch parameter (Top(B)) for the second CELP format.
  2. The method of claim 1, comprising:
    a) receiving the first coded speech signal (DS1) encoded using the first CELP format and including at least the first pitch parameter (Top (A));
    b) decoding the received first coded speech signal (DS1) to the decoded pcm speech signal (AS2);
    c) detecting the voiced level (VL) of the pcm speech signal (AS2) within the predetermined time window (T);
    d) determining, if the pcm speech signal (AS2) is a voiced speech signal or an unvoiced speech signal dependent on at least the first parameter (P1);
    e) detecting an energy level (EL) of the pcm speech signal (AS2) within the predetermined time window (T);
    f) determining, if the energy level (EL) of the pcm speech signal (AS2) is high or low dependent on at least a second parameter (P2);
    g) if the pcm speech signal (AS2) is voiced and its energy level (EL) is high, performing the closed loop search process which receives at least the first pitch parameter (Top(A)) and estimates the second pitch parameter (Top(B)) for the second CELP format dependent on at least the first pitch parameter (Top (A));
    h) if the pcm speech signal (AS2) is unvoiced or its energy level (EL) is low, copying the first pitch parameter (Top (A)) as the second pitch parameter (Top (B)) for the second CELP format.
  3. The method of claim 1 or 2,
    comprising further the step of
    encoding the decoded pcm speech signal (AS2) using the second CELP format to a second coded speech signal (DS2) including at least the second pitch parameter (Top (B)).
  4. The method of claims 1, 2 or 3,
    wherein the closed loop search process is performed in a restricted interval [Top (A) -T'LOW; Top (A) +T'HIGH] around the first pitch parameter (Top(A)), wherein T'LOW designates a preselected lower pitch threshold value and T'HIGH designates an upper pitch threshold value.
  5. The method of claim 4,
    wherein the lower and the upper pitch threshold values (T'HIGH, T'LOW) are preselected the greater the detected voiced level (VL) is.
  6. The method of claim 1 or one of claims 2 to 5,
    wherein the first parameter (P1) and/or second parameter (P2) are provided as a predetermined threshold value, respectively.
  7. The method of claim 1 or one of claims 3 to 6,
    wherein the first CELP format is provided by a first codec (3, 12) and the second CELP format is provided by a second codec (9, 16) which is different to the first codec (3, 12).
  8. The method of claim 7,
    wherein the first codec (3, 12) and the second codec (9, 16) are selected from the group of AMR, AMR-WB/G.722.2, G.729, Annexes of G.729, G.723.1, EVRC and VMR-WB.
  9. The method of claim 1 or one of claims 2 to 8,
    wherein the first coded speech signal (DS1) includes at least the first pitch parameter (Top (A)) and an additional parameter set comprising a linear prediction code (LPC) parameter and/or at least one fixed gain parameter and/or at least one adaptive gain parameter and/or at least one adaptive code-book parameter.
  10. The method of claim 9,
    wherein the voiced level (VL) of the pcm speech signal (AS2) is detected by means of using a variability of the first pitch parameter (Top (A)) at a predetermined frame or for the predetermined time window (T) and/or by means of using at least one parameter of the additional parameter set.
  11. The method of claim 9 or 10,
    wherein the energy level (EL) of the pcm speech signal (AS2) is detected by means of using the fixed gain parameter of the first coded speech signal (DS1) and/or by means of computing the energy level of the decoded pcm speech signal (AS2).
  12. Transcoding apparatus (1) for executing the method of claim 1, comprising:
    a) a receiving means (2) which receives the first coded speech signal (DS1) encoded using the first CELP format and including at least the first pitch parameter;
    b) a decoding means (3) which decodes the received first coded speech signal (DS1) to the decoded pcm speech signal (AS2);
    c) a first detecting means (4) which detects the voiced level (VL) of the pcm speech signal (AS2) within the predetermined time window (T);
    d) a first determining means (5) which determines if the pcm speech signal (AS2) is the voiced speech signal or the unvoiced speech signal dependent on at least the first parameter (P1);
    e) a closed loop search means (8a) which performs a closed loop search, wherein the closed loop search means (8a) receives at least the first pitch parameter (Top(A)) and estimates the second pitch parameter (Top(B)) for the second CELP format dependent on at least the first pitch parameter (Top (A)), if the pcm speech signal (AS2) is voiced; and
    f) a copying means (8b) which copies the first pitch parameter (Top (A)) as the second pitch parameter for the second CELP format, if the pcm speech signal (AS2) is unvoiced.
  13. Transcoding apparatus (1) for executing the method of claim 2 or one of claims 3 to 11, comprising:
    a) a receiving means (2) which receives the first coded speech signal (DS1) encoded using the first CELP format and including at least the first pitch parameter;
    b) a decoding means (3) which decodes the received first coded speech signal (DS1) to the decoded pcm speech signal (AS2);
    c) a first detecting means (4) which detects the voiced level (VL) of the pcm speech signal (AS2) within the predetermined time window (T);
    d) a first determining means (5) which determines if the pcm speech signal (AS2) is the voiced speech signal or the unvoiced speech signal dependent on at least the first parameter (P1);
    e) a second detecting means (6) which detects an energy level (EL) of the pcm speech signal (AS2) within the predetermined time window (T);
    f) a second determining means (7) which determines if the energy level (EL) of the pcm speech signal (AS2) is high or low dependent on at least the second parameter (P2);
    g) a closed loop search means (8a) which performs a closed loop search, wherein the closed loop search means (8a) receives at least the first pitch parameter (Top(A)) and estimates a second pitch parameter (Top(B)) for the second CELP format dependent on at least the first pitch parameter (Top (A)), if the pcm speech signal (AS2) is voiced and its energy level (EL) is high; and
    h) a copying means (8b) which copies the first pitch parameter (Top(A)) as the second pitch parameter for the second CELP format, if the pcm speech signal (AS2) is unvoiced or its energy level (EL) is low.
EP06025955A 2006-12-14 2006-12-14 Method and apparatus for transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format Withdrawn EP1933306A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP06025955A EP1933306A1 (en) 2006-12-14 2006-12-14 Method and apparatus for transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format

Publications (1)

Publication Number Publication Date
EP1933306A1 true EP1933306A1 (en) 2008-06-18

Family

ID=37909828

Family Applications (1)

Application Number Title Priority Date Filing Date
EP06025955A Withdrawn EP1933306A1 (en) 2006-12-14 2006-12-14 Method and apparatus for transcoding a speech signal from a first code excited linear prediction (CELP) format to a second code excited linear prediction (CELP) format

Country Status (1)

Country Link
EP (1) EP1933306A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2003058407A2 (en) * 2002-01-08 2003-07-17 Dilithium Networks Pty Limited A transcoding scheme between celp-based speech codes

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DALWON JANG ET AL: "A Novel Rate Selection Algorithm for Transcoding CELP-type Codec and SMV", EUROSPEECH 2003, September 2003 (2003-09-01), Geneva, CH, pages 2865, XP007007018 *
GHENANIA M ET AL: "TRANSCODAGE INTELLIGENT A FAIBLE COMPLEXITE EXTRE LES CODEURS UIT-T G.729 ET 3GPP NB-AMR (12.2 KBIT/S)", CORESA. COMPRESSION ET REPRESENTATION DES SIGNAUX AUDIOVISUELS, 25 May 2004 (2004-05-25), pages 85 - 88, XP001199662 *
JIN-KYU CHOI ET AL: "Improvement issues on transcoding algorithms : for the flexible usage to the various pairs of speech codec", IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2004. PROCEEDINGS. (ICASSP '04)., 17 May 2004 (2004-05-17) - 21 May 2004 (2004-05-21), MONTREAL, QUEBEC, CANADA, pages 269 - 272, XP010717617, ISBN: 0-7803-8484-9 *
KYUNG TAE KIM ET AL: "An efficient transcoding algorithm for G.723.1 and EVRC speech coders", VTC FALL 2001. IEEE 54TH. VEHICULAR TECHNOLOGY CONFERENCE. PROCEEDINGS. ATLANTIC CITY, NJ, OCT. 7 - 11, 2001, IEEE VEHICULAR TECHNOLGY CONFERENCE, NEW YORK, NY : IEEE, US, vol. VOL. 1 OF 4. CONF. 54, 7 October 2001 (2001-10-07), pages 1561 - 1564, XP010562224, ISBN: 0-7803-7005-8 *
PANKAJ K R ED - MATTHEWS M B (ED) INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS: "A novel transcoding scheme from EVRC to G.729AB", CONFERENCE RECORD OF THE 37TH. ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS, & COMPUTERS. PACIFIC GROOVE, CA, NOV. 9 - 12, 2003, ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, NEW YORK, NY : IEEE, US, vol. VOL. 1 OF 2. CONF. 37, 9 November 2003 (2003-11-09), pages 533 - 536, XP010702678, ISBN: 0-7803-8104-1 *

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU LV MC NL PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA HR MK RS

AKX Designation fees paid
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 20081219