EP3447766B1 - Encoding method, encoding apparatus, corresponding program and recording medium - Google Patents

Encoding method, encoding apparatus, corresponding program and recording medium

Info

Publication number
EP3447766B1
EP3447766B1 (application EP18200102.4A)
Authority
EP
European Patent Office
Prior art keywords
lsp
quantized
parameter sequence
adjusted
encoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
EP18200102.4A
Other languages
German (de)
English (en)
French (fr)
Other versions
EP3447766A1 (en)
Inventor
Takehiro Moriya
Yutaka Kamamoto
Noboru Harada
Hirokazu Kameoka
Ryosuke Sugiura
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
University of Tokyo NUC
Original Assignee
Nippon Telegraph and Telephone Corp
University of Tokyo NUC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Family has litigation
First worldwide family litigation filed: https://patents.darts-ip.com/?family=54332153&utm_source=***_patent&utm_medium=platform_link&utm_campaign=public_patent_search&patent=EP3447766(B1). "Global patent litigation dataset" by Darts-ip is licensed under a Creative Commons Attribution 4.0 International License.
Application filed by Nippon Telegraph and Telephone Corp and University of Tokyo NUC.
Priority to EP19216781.5A (EP3648103B1)
Priority to PL18200102T (PL3447766T3)
Priority to PL19216781T (PL3648103T3)
Publication of EP3447766A1
Application granted
Publication of EP3447766B1
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
    • G10L19/07 Line spectrum pair [LSP] vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G10L19/12 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/06 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being correlation coefficients
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/12 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being prediction coefficients

Definitions

  • the present invention relates to encoding techniques, and more particularly to techniques for converting frequency domain parameters equivalent to linear prediction coefficients.
  • In Non-Patent Literatures 1 and 2, input sound signals in each frame are coded by either a frequency domain encoding method or a time domain encoding method. Whether to use the frequency domain or the time domain encoding method is determined in accordance with the characteristics of the input sound signals in each frame.
  • linear prediction coefficients obtained by linear prediction analysis of the input sound signal are converted to a sequence of LSP parameters, which is then coded to obtain LSP codes, and a quantized LSP parameter sequence corresponding to the LSP codes is also generated.
  • encoding is carried out by using linear prediction coefficients determined from a quantized LSP parameter sequence for the current frame and a quantized LSP parameter sequence for the preceding frame as the filter coefficients for a synthesis filter serving as a time-domain filter, applying the synthesis filter to a signal generated by synthesis of the waveforms contained in an adaptive codebook and the waveforms contained in a fixed codebook so as to determine a synthesized signal, and determining indices for the respective codebooks such that the distortion between the synthesized signal determined and the input sound signal is minimized.
  • a quantized LSP parameter sequence is converted to linear prediction coefficients to determine a quantized linear prediction coefficient sequence; the quantized linear prediction coefficient sequence is smoothed to determine an adjusted quantized linear prediction coefficient sequence; a signal from which the effect of the spectral envelope has been removed is determined by normalizing each value in a frequency domain signal series which is determined by converting the input sound signal to the frequency domain using each value in a power spectral envelope series, which is a series in the frequency domain corresponding to the adjusted quantized linear prediction coefficients; and the determined signal is coded by variable length encoding taking into account spectral envelope information.
  • linear prediction coefficients determined through linear prediction analysis of the input sound signal are employed in common in the frequency domain and time domain encoding methods.
  • Linear prediction coefficients are converted into a sequence of frequency domain parameters equivalent to the linear prediction coefficients, such as LSP (Line Spectrum Pair) parameters or ISP (Immittance Spectrum Pairs) parameters.
  • LSP codes or ISP codes generated by encoding the LSP parameter sequence (or ISP parameter sequence) are transmitted to a decoding apparatus.
  • LSF: LSP frequencies
  • ISF: ISP frequencies
  • an LSP parameter sequence consisting of p LSP parameters will be represented as θ[1], θ[2], ..., θ[p].
  • "p" represents the order of prediction, which is an integer equal to or greater than 1.
  • the symbol in brackets ([]) represents an index.
  • θ[i] indicates the ith LSP parameter in an LSP parameter sequence θ[1], θ[2], ..., θ[p].
  • a symbol written in the upper right of θ in brackets indicates the frame number.
  • an LSP parameter sequence generated for the sound signals in the fth frame is represented as θ[f][1], θ[f][2], ..., θ[f][p].
  • θ^k[i] means the kth power of θ[i].
  • a speech sound digital signal (hereinafter referred to as input sound signal) in the time domain per frame, which defines a predetermined time segment, is input to a conventional encoding apparatus 9.
  • the encoding apparatus 9 performs processing in the processing units described below on the input sound signal on a per-frame basis.
  • a per-frame input sound signal is input to a linear prediction analysis unit 105, a feature amount extracting unit 120, a frequency domain encoding unit 150, and a time domain encoding unit 170.
  • the linear prediction analysis unit 105 performs linear prediction analysis on the per-frame input sound signal to determine a linear prediction coefficient sequence a[1], a[2], ..., a[p], and outputs it.
  • a[i] is a linear prediction coefficient of the ith order.
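  • The description above does not fix a particular algorithm for the linear prediction analysis; as a minimal sketch under the common assumption of the autocorrelation method with the Levinson-Durbin recursion (the function and variable names below are illustrative), the coefficients a[1], ..., a[p] of A(z) = 1 + a[1]z^(-1) + ... + a[p]z^(-p) can be obtained per frame as follows:

```python
import numpy as np

def lpc_analysis(frame, p):
    """Estimate LP coefficients a[1..p] of A(z) = 1 + a[1]z^-1 + ... + a[p]z^-p for one
    (already windowed) frame via the autocorrelation method and Levinson-Durbin.
    Returns (a[1..p], prediction residual energy)."""
    frame = np.asarray(frame, dtype=float)
    # Autocorrelation r[0..p] of the frame.
    r = np.array([frame[: len(frame) - k] @ frame[k:] for k in range(p + 1)])
    a = np.zeros(p + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, p + 1):
        # Reflection coefficient for order i.
        acc = r[i] + a[1:i] @ r[i - 1:0:-1]
        k = -acc / err
        # Order update of a[1..i-1] (right-hand side uses the previous values).
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a[1:], err
```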
  • the linear prediction coefficient sequence a[1], a[2], ..., a[p] output by the linear prediction analysis unit 105 is input to an LSP generating unit 110.
  • the LSP generating unit 110 determines and outputs a series of LSP parameters, ⁇ [1], ⁇ [2], ..., ⁇ [p], corresponding to the linear prediction coefficient sequence a[1], a[2], ..., a[p] output from the linear prediction analysis unit 105.
  • the series of LSP parameters, ⁇ [1], ⁇ [2], ..., ⁇ [p] will be referred to as an LSP parameter sequence.
  • the LSP parameter sequence θ[1], θ[2], ..., θ[p] is a series of parameters defined as the roots of the sum polynomial F1(z) given by Formula (2) and the difference polynomial F2(z) given by Formula (3), where A(z) = 1 + a[1]z^(-1) + ... + a[p]z^(-p).
  • F1(z) = A(z) + z^(-(p+1)) A(z^(-1))      (2)
  • F2(z) = A(z) - z^(-(p+1)) A(z^(-1))      (3)
  • the LSP parameter sequence θ[1], θ[2], ..., θ[p] is a series in which values are arranged in ascending order. That is, it satisfies 0 < θ[1] < θ[2] < ... < θ[p] < π.
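  • As a numerical illustration of Formulas (2) and (3) and of this ordering property (a sketch only; numerical root finding is one of several possible methods and the names are illustrative), the LSP parameters are the angles in (0, π) of the unit-circle roots of F1(z) and F2(z), sorted in ascending order:

```python
import numpy as np

def lpc_to_lsp(a):
    """LSP parameters theta[1..p] in (0, pi) from LP coefficients a[1..p]."""
    A = np.concatenate(([1.0], np.asarray(a, dtype=float)))  # A(z) in powers of z^-1
    # z^{-(p+1)} A(z^{-1}) has ascending coefficients [0, a_p, ..., a_1, 1].
    mirrored = np.concatenate(([0.0], A[::-1]))
    A_pad = np.concatenate((A, [0.0]))
    F1 = A_pad + mirrored   # sum polynomial, Formula (2)
    F2 = A_pad - mirrored   # difference polynomial, Formula (3)
    eps = 1e-7
    lsp = []
    for F in (F1, F2):
        roots = np.roots(F[::-1])          # np.roots expects descending powers
        ang = np.angle(roots)
        # keep one angle per conjugate pair, drop the trivial roots at z = +1 and z = -1
        lsp.extend(t for t in ang if eps < t < np.pi - eps)
    return np.sort(lsp)
```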
  • the LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] output by the LSP generating unit 110 is input to an LSP encoding unit 115.
  • the LSP encoding unit 115 encodes the LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] output by the LSP generating unit 110, determines LSP code C1 and a quantized LSP parameter series ⁇ [1], ⁇ [2], ..., ⁇ [p] corresponding to the LSP code C1, and outputs them.
  • the quantized LSP parameter series ⁇ [1], ⁇ [2], ..., ⁇ [p] will be referred to as a quantized LSP parameter sequence.
  • the quantized LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] output by the LSP encoding unit 115 is input to a quantized linear prediction coefficient generating unit 900, a delay input unit 165, and a time domain encoding unit 170.
  • the LSP code C1 output by the LSP encoding unit 115 is input to an output unit 175.
  • the feature amount extracting unit 120 extracts the magnitude of the temporal variation in the input sound signal as the feature amount.
  • when the feature amount is smaller than a predetermined threshold (i.e., when the temporal variation in the input sound signal is small), the feature amount extracting unit 120 implements control so that the quantized linear prediction coefficient generating unit 900 will perform the subsequent processing.
  • the feature amount extracting unit 120 inputs information indicating the frequency domain encoding method to the output unit 175 as identification code Cg.
  • when the feature amount is equal to or greater than the predetermined threshold (i.e., when the temporal variation in the input sound signal is large), the feature amount extracting unit 120 implements control so that the time domain encoding unit 170 will perform the subsequent processing.
  • the feature amount extracting unit 120 inputs information indicating the time domain encoding method to the output unit 175 as identification code Cg.
  • Processes in the quantized linear prediction coefficient generating unit 900, a quantized linear prediction coefficient adjusting unit 905, an approximate smoothed power spectral envelope series calculating unit 910, and the frequency domain encoding unit 150 are executed when the feature amount extracted by the feature amount extracting unit 120 is smaller than the predetermined threshold (i.e., when the temporal variation in the input sound signal is small) (step S121).
  • the quantized linear prediction coefficient generating unit 900 determines a series of linear prediction coefficients, ⁇ a[1], ⁇ a[2], ..., ⁇ a[p], from the quantized LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] output by the LSP encoding unit 115, and outputs it.
  • the linear prediction coefficient series ⁇ a[1], ⁇ a[2], ..., ⁇ a[p] will be referred to as a quantized linear prediction coefficient sequence.
  • the quantized linear prediction coefficient sequence ⁇ a[1], ⁇ a[2], ..., ⁇ a[p] output by the quantized linear prediction coefficient generating unit 900 is input to the quantized linear prediction coefficient adjusting unit 905.
  • the adjustment factor γR is a predetermined positive constant equal to or smaller than 1.
  • the series ⁇ a[1] ⁇ ( ⁇ R), ⁇ a[2] ⁇ ( ⁇ R) 2 , ..., ⁇ a[p] ⁇ ( ⁇ R) p will be referred to as an adjusted quantized linear prediction coefficient sequence.
  • the adjusted quantized linear prediction coefficient sequence ⁇ a[1] ⁇ ( ⁇ R), ⁇ a[2] ⁇ ( ⁇ R) 2 , ..., ⁇ a[p] ⁇ ( ⁇ R) p output by the quantized linear prediction coefficient adjusting unit 905 is input to the approximate smoothed power spectral envelope series calculating unit 910.
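  • The adjustment itself is a simple per-order scaling of the quantized coefficients by powers of the adjustment factor; a minimal sketch of what the quantized linear prediction coefficient adjusting unit 905 computes (names are illustrative):

```python
import numpy as np

def adjust_lpc(a_hat, gamma_r):
    """Adjusted coefficients ^a[i] * (gamma_R)^i for i = 1..p (bandwidth expansion)."""
    a_hat = np.asarray(a_hat, dtype=float)
    i = np.arange(1, len(a_hat) + 1)
    return a_hat * gamma_r ** i
```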
  • step S910 using each coefficient ⁇ a[i] ⁇ ( ⁇ R) i in the adjusted quantized linear prediction coefficient sequence ⁇ a[1] ⁇ ( ⁇ R), ⁇ a[2] ⁇ ( ⁇ R) 2 , ..., ⁇ a[p] ⁇ ( ⁇ R) p output by the quantized linear prediction coefficient adjusting unit 905, the approximate smoothed power spectral envelope series calculating unit 910 generates an approximate smoothed power spectral envelope series ⁇ W ⁇ R [1], ⁇ W ⁇ R [2], ..., ⁇ W ⁇ R [N] by Formula (4) and outputs it.
  • exp(·) is the exponential function whose base is Napier's constant,
  • j is the imaginary unit, and
  • σ² is the prediction residual energy.
  • the approximate smoothed power spectral envelope series ⁇ W ⁇ R [1], ⁇ W ⁇ R [2], ..., ⁇ W ⁇ R [N] is a frequency-domain series corresponding to the adjusted quantized linear prediction coefficient sequence ⁇ a[1] ⁇ ( ⁇ R), ⁇ a[2] ⁇ ( ⁇ R) 2 , ..., ⁇ a[p] ⁇ ( ⁇ R) p .
  • the approximate smoothed power spectral envelope series ⁇ W ⁇ R [1], ⁇ W ⁇ R [2], ..., ⁇ W ⁇ R [N] output by the approximate smoothed power spectral envelope series calculating unit 910 is input to the frequency domain encoding unit 150.
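  • A sketch of the kind of computation Formula (4) performs, assuming the standard all-pole power spectral envelope expression evaluated at N frequency points; the exact frequency grid and scaling used in Formula (4) are not reproduced in the text above and are assumptions of this sketch:

```python
import numpy as np

def smoothed_power_spectral_envelope(adj_coefs, sigma2, N):
    """All-pole power spectral envelope on N frequency points from the adjusted
    coefficients c[i] = ^a[i] * (gamma_R)^i (cf. Formula (4); grid and scaling assumed)."""
    adj_coefs = np.asarray(adj_coefs, dtype=float)
    p = len(adj_coefs)
    omega = np.pi * (np.arange(N) + 0.5) / N       # assumed frequency grid over (0, pi)
    i = np.arange(1, p + 1)
    # Denominator 1 + sum_i c[i] * exp(-j * omega * i), evaluated for every omega.
    denom = 1.0 + np.exp(-1j * np.outer(omega, i)) @ adj_coefs
    return sigma2 / (2.0 * np.pi) / np.abs(denom) ** 2

# Example usage (gamma_R = 0.92 is an arbitrary illustrative value):
# env = smoothed_power_spectral_envelope(adjust_lpc(a_hat, 0.92), sigma2, N)
```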
  • the input sound signal x[t] at time t is represented by Formula (5) using its own past values x[t-1], ..., x[t-p], a prediction residual e[t], and linear prediction coefficients a[1], a[2], ..., a[p].
  • processing for adjusting a linear prediction coefficient by multiplying the linear prediction coefficient a[i] by the ith power of the adjustment factor γR is equivalent to processing that flattens the amplitude fluctuations of the power spectral envelope in the frequency domain (processing for smoothing the power spectral envelope).
  • the series W ⁇ R [1], W ⁇ R [2], ..., W ⁇ R [N] defined by Formula (7) is called a smoothed power spectral envelope series.
  • the series ⁇ W ⁇ R [1], ⁇ W ⁇ R [2], ..., ⁇ W ⁇ R [N] defined by Formula (4) is equivalent to a series of approximations of the individual values in the smoothed power spectral envelope series W ⁇ R [1], W ⁇ R [2], ..., W ⁇ R [N] defined by Formula (7). Accordingly, the series ⁇ W ⁇ R [1], ⁇ W ⁇ R [2], ..., ⁇ W ⁇ R [N] defined by Formula (4) is called an approximate smoothed power spectral envelope series.
  • sqrt(y) represents the square root of y.
  • the frequency domain encoding unit 150 then encodes the normalized frequency domain signal sequence X N [1], X N [2], ..., X N [N] by variable length encoding to generate frequency domain signal codes.
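  • The normalization that produces X N [1], X N [2], ..., X N [N] (removing the effect of the spectral envelope before variable length encoding) amounts to dividing each frequency-domain value by the square root of the corresponding envelope value; a minimal sketch, with the frequency transform itself taken as given:

```python
import numpy as np

def normalize_spectrum(X, envelope):
    """X_N[n] = X[n] / sqrt(W[n]): divide each frequency-domain value by the square
    root of the corresponding power-spectral-envelope value."""
    return np.asarray(X, dtype=float) / np.sqrt(np.asarray(envelope, dtype=float))
```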
  • the frequency domain signal codes output by the frequency domain encoding unit 150 are input to the output unit 175.
  • Processes in the delay input unit 165 and the time domain encoding unit 170 are executed when the feature amount extracted by the feature amount extracting unit 120 is equal to or greater than the predetermined threshold (i.e., when the temporal variation in the input sound signal is large) (step S121).
  • the delay input unit 165 holds the input quantized LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p], and outputs it to the time domain encoding unit 170 with a delay equivalent to the duration of one frame. For example, if the current frame is the fth frame, the quantized LSP parameter sequence for the f-1th frame, ⁇ [f-1] [1], ⁇ [f-1] [2], ..., ⁇ [f-1] [p], is output to the time domain encoding unit 170.
  • the time domain encoding unit 170 carries out encoding by determining a synthesized signal by applying the synthesis filter to a signal generated by synthesis of the waveforms contained in the adaptive codebook and the waveforms contained in the fixed codebook, and determining the indices for the respective codebooks so that the distortion between the synthesized signal determined and the input sound signal is minimized.
  • the codebook indices are determined so as to minimize the value given by applying an auditory weighting filter to a signal representing the difference of the synthesized signal from the input sound signal.
  • the auditory weighting filter is a filter for determining distortion when selecting the adaptive codebook and/or the fixed codebook.
  • the filter coefficients of the synthesis filter and the auditory weighting filter are generated by use of the quantized LSP parameter sequence for the fth frame, ⁇ [1], ⁇ [2], ..., ⁇ [p], and the quantized LSP parameter sequence for the f-1th frame, ⁇ [f-1] [1], ⁇ [f-1] [2], ..., ⁇ 0 [f-1] [p].
  • a frame is first divided into two subframes, and the filter coefficients for the synthesis filter and the auditory weighting filter are determined as follows.
  • each coefficient ⁇ a[i] in a quantized linear prediction coefficient sequence ⁇ a[1], ⁇ a[2], ..., ⁇ a[p], which is a coefficient sequence obtained by converting the quantized LSP parameter sequence for the fth frame, ⁇ [1], ⁇ [2], ..., ⁇ [p], into linear prediction coefficients, is employed for the filter coefficient of the synthesis filter.
  • a series of values ^a[1]·γR, ^a[2]·(γR)^2, ..., ^a[p]·(γR)^p is employed, which is determined by multiplying each coefficient ^a[i] in the quantized linear prediction coefficient sequence ^a[1], ^a[2], ..., ^a[p] by the ith power of the adjustment factor γR.
  • each coefficient ⁇ a[i] in an interpolated quantized linear prediction coefficient sequence ⁇ a[1], ⁇ a[2], ..., ⁇ a[p], which is a coefficient sequence obtained by converting an interpolated quantized LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] into linear prediction coefficients, is employed for the filter coefficient of the synthesis filter.
  • the interpolated quantized LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] is a series of intermediate values between each value ⁇ [i] in the quantized LSP parameter sequence for the fth frame, ⁇ [1], ⁇ [2], ..., ⁇ [p], and each value ⁇ [f-1] [i] in the quantized LSP parameter sequence for the f-1th frame, ⁇ [f-1] [1], ⁇ [f-1] [2], ..., ⁇ [f-1] [p], namely a series of values obtained by interpolating between the values ⁇ [i] and ⁇ [f-1] [i].
  • a series of values ^a[1]·γR, ^a[2]·(γR)^2, ..., ^a[p]·(γR)^p is employed, which is determined by multiplying each coefficient ^a[i] in the interpolated quantized linear prediction coefficient sequence ^a[1], ^a[2], ..., ^a[p] by the ith power of the adjustment factor γR.
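  • A sketch of how the subframe filter coefficients described above could be assembled; the midpoint interpolation between the previous and current quantized LSPs, the assignment of the interpolated and current coefficients to the two subframes, and all names are assumptions, and the LSP-to-LP-coefficient conversion routine is taken as given:

```python
import numpy as np

def subframe_filter_coefficients(theta_prev, theta_curr, gamma_r, lsp_to_lpc):
    """Synthesis and auditory-weighting filter coefficients for the two subframes.

    theta_prev, theta_curr : quantized LSP sequences of frames f-1 and f
    lsp_to_lpc             : LSP -> LP coefficient conversion (assumed available)
    """
    # Interpolated quantized LSPs between the previous and current frame
    # (midpoint interpolation is an assumption of this sketch).
    theta_interp = 0.5 * (np.asarray(theta_prev, float) + np.asarray(theta_curr, float))

    a_interp = np.asarray(lsp_to_lpc(theta_interp), float)  # one subframe
    a_curr = np.asarray(lsp_to_lpc(theta_curr), float)      # the other subframe

    i = np.arange(1, len(a_curr) + 1)
    weight = gamma_r ** i          # per-order weighting used by the weighting filter
    return {
        "synthesis": (a_interp, a_curr),
        "weighting": (a_interp * weight, a_curr * weight),
    }
```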
  • the encoding apparatus 9 transmits, by way of the output unit 175, the LSP code C1 output by the LSP encoding unit 115, the identification code Cg output by the feature amount extracting unit 120, and either the frequency domain signal codes output by the frequency domain encoding unit 150 or the time domain signal codes output by the time domain encoding unit 170, to the decoding apparatus.
  • Patent Literature 1 discloses a speech processing apparatus that is able to enhance formants more naturally.
  • a speech analyzing unit analyzes an input speech signal to find LPCs and converts the LPCs to LSPs
  • a speech decoding unit calculates a distance between adjacent orders of the LSPs by an LSP analytical processing unit and calculates LSP adjusting amounts of larger values for LSPs of adjacent orders closer in distance by an LSP adjusting amount calculating unit
  • an LSP adjusting unit adjusts the LSPs based on the LSP adjusting amounts such that the LSPs of adjacent orders closer in distance become closer
  • an LSP-LPC converting unit converts the adjusted LSPs to LPCs
  • an LPC combining unit uses the LPCs and sound source parameters to obtain formant-enhanced speech.
  • Patent Literature 2 discloses a speech synthesis apparatus in which spectrum emphasis characteristics can be set taking into account the frequency response and psychoacoustic hearing sense and in which the degree of freedom in setting the response is larger.
  • An excitation signal ex(n) is synthesized by a synthesis filter to give a synthesized speech signal which is sent to a spectrum emphasis filter.
  • the spectrum emphasis filter spectrum-emphasizes the synthesized speech signal and outputs the resulting spectrum-emphasized signal.
  • the vocal tract parameters from an input terminal are converted by a parameter conversion circuit into LSP frequencies which are interpolated by an LSP interpolation circuit with equal-interval line spectral pair frequencies to produce interpolated LSP frequencies.
  • the transfer function of the spectrum emphasis filter is determined on the basis of the interpolated LSP frequencies.
  • the adjustment factor γR serves to achieve encoding with small distortion that takes the sense of hearing into account to an increased degree, by flattening the amplitude fluctuations of the power spectral envelope more at higher frequencies when eliminating the influence of the power spectral envelope from the input sound signal.
  • the adjusted quantized linear prediction coefficient sequence ⁇ a[1] ⁇ ( ⁇ R), ⁇ a[2] ⁇ ( ⁇ R) 2 , ..., ⁇ a[p] ⁇ ( ⁇ R) p is a series that approximates the adjusted linear prediction coefficient sequence a ⁇ R [1], a ⁇ R [2], ..., a ⁇ R [p] with high accuracy.
  • the LSP encoding unit of a conventional encoding apparatus performs encoding processing so that the distortion between the quantized LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] and the LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] is minimized.
  • the distortion between the adjusted quantized linear prediction coefficient sequence ⁇ a[1] ⁇ ( ⁇ R), ⁇ a [2] ⁇ ( ⁇ R) 2 , ..., ⁇ a[p] ⁇ ( ⁇ R) p generated from the quantized LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] and the adjusted linear prediction coefficient sequence a ⁇ R [1], a ⁇ R [2], ..., a ⁇ R [p] is not minimized, leading to large encoding distortion in the frequency domain encoding unit.
  • An object of the present invention is to provide encoding techniques that selectively use frequency domain encoding and time domain encoding in accordance with the characteristics of the input sound signal and that are capable of reducing the encoding distortion in frequency domain encoding compared to conventional techniques, and also generating LSP parameters that correspond to quantized LSP parameters for the preceding frame and are to be used in time domain encoding, from linear prediction coefficients resulting from frequency domain encoding or coefficients equivalent to linear prediction coefficients, typified by LSP parameters.
  • Another object of the present invention is to generate coefficients equivalent to linear prediction coefficients having varying degrees of smoothing effect from coefficients equivalent to linear prediction coefficients used, for example, in the above-described encoding technique.
  • the present invention provides an encoding method and an encoding apparatus, as well as a corresponding program and a computer-readable recording medium, having the features of the respective independent claim. Preferred embodiments are described in the dependent claims.
  • a frequency domain parameter sequence generating method includes, where p is an integer equal to or greater than 1, ⁇ 1 is a positive constant equal to or smaller than 1, a[1], a[2], ..., a[p] are a linear prediction coefficient sequence which is obtained by linear prediction analysis of audio signals in a predetermined time segment, and ⁇ [1], ⁇ [2], ..., ⁇ [p] are a frequency domain parameter sequence derived from the linear prediction coefficient sequence a[1], a[2], ..., a[p], a parameter sequence conversion step of determining a converted frequency domain parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] using the frequency domain parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] as input.
  • An encoding method includes, where ⁇ is an adjustment factor which is a positive constant equal to or smaller than 1, a linear prediction coefficient adjustment step of generating an adjusted linear prediction coefficient sequence a ⁇ [1], a ⁇ [2], ..., a ⁇ [p] by adjusting the linear prediction coefficient sequence a[1], a[2], ..., a[p] using the adjustment factor y; an adjusted LSP generation step of generating an adjusted LSP parameter sequence ⁇ ⁇ [1], ⁇ ⁇ [2], ..., ⁇ ⁇ [p] using the adjusted linear prediction coefficient sequence a ⁇ [1], a ⁇ [2], ..., a ⁇ [p]; an adjusted LSP encoding step of encoding the adjusted LSP parameter sequence ⁇ ⁇ [1], ⁇ ⁇ [2], ..., ⁇ ⁇ [p] to generate adjusted LSP codes and an adjusted quantized LSP parameter sequence ⁇ ⁇ [1], ⁇ ⁇ [2], ...,
  • An encoding method includes, where ⁇ is an adjustment factor which is a positive constant equal to or smaller than 1, a linear prediction coefficient adjustment step of generating an adjusted linear prediction coefficient sequence a ⁇ [1], a ⁇ [2], ..., a ⁇ [p] by adjusting the linear prediction coefficient sequence a[1], a[2], ..., a[p] using the adjustment factor y; an adjusted LSP generation step of generating an adjusted LSP parameter sequence ⁇ ⁇ [1], ⁇ ⁇ [2], ..., ⁇ ⁇ [p] using the adjusted linear prediction coefficient sequence a ⁇ [1], a ⁇ [2], ..., a ⁇ [p]; an adjusted LSP encoding step of encoding the adjusted LSP parameter sequence ⁇ ⁇ [1], ⁇ ⁇ [2], ..., ⁇ ⁇ [p] to generate adjusted LSP codes and an adjusted quantized LSP parameter sequence ⁇ ⁇ [1], ⁇ ⁇ [2], ...
  • the encoding techniques of the present invention it is possible to reduce the encoding distortion in frequency domain encoding compared to conventional techniques, and also obtain LSP parameters that correspond to quantized LSP parameters for the preceding frame and are to be used in time domain encoding from linear prediction coefficients resulting from frequency domain encoding or coefficients equivalent to linear prediction coefficients, typified by LSP parameters. It is also possible to generate coefficients equivalent to linear prediction coefficients having varying degrees of smoothing effect from coefficients equivalent to linear prediction coefficients used in, for example, the above-described encoding technique.
  • An encoding apparatus obtains, in a frame for which time domain encoding is performed, LSP codes by encoding LSP parameters that have been converted from linear prediction coefficients.
  • the encoding apparatus obtains adjusted LSP codes by encoding adjusted LSP parameters that have been converted from adjusted linear prediction coefficients.
  • linear prediction coefficients generated by inverse adjustment of linear prediction coefficients that correspond to LSP parameters corresponding to adjusted LSP codes are converted to LSPs, which are then used as LSP parameters in the time domain encoding for the following frame.
  • a decoding apparatus obtains, in a frame for which time domain decoding is performed, linear prediction coefficients that have been converted from LSP parameters resulting from decoding of LSP codes and uses them for time domain decoding.
  • the decoding apparatus uses adjusted LSP parameters generated by decoding adjusted LSP codes for the frequency domain decoding.
  • time domain decoding is to be performed in a frame following a frame for which frequency domain decoding was performed, linear prediction coefficients generated by inverse adjustment of linear prediction coefficients that correspond to LSP parameters corresponding to the adjusted LSP codes are converted to LSPs, which are then used as LSP parameters in the time domain decoding for the following frame.
  • Encoding and decoding apparatuses: As illustrated in Fig. 3, input sound signals input to an encoding apparatus 1 are coded into a code sequence, which is then sent from the encoding apparatus 1 to the decoding apparatus 2, where the code sequence is decoded into decoded sound signals and output.
  • the encoding apparatus 1 includes, as with the conventional encoding apparatus 9, an input unit 100, a linear prediction analysis unit 105, an LSP generating unit 110, an LSP encoding unit 115, a feature amount extracting unit 120, a frequency domain encoding unit 150, a delay input unit 165, a time domain encoding unit 170, and an output unit 175, for example.
  • the encoding apparatus 1 further includes a linear prediction coefficient adjusting unit 125, an adjusted LSP generating unit 130, an adjusted LSP encoding unit 135, a quantized linear prediction coefficient generating unit 140, a first quantized smoothed power spectral envelope series calculating unit 145, a quantized linear prediction coefficient inverse adjustment unit 155, and an inverse-adjusted LSP generating unit 160, for example.
  • the encoding apparatus 1 is a specialized device built by incorporating special programs into a known or dedicated computer having a central processing unit (CPU), main memory (random access memory or RAM), and the like, for example.
  • the encoding apparatus 1 performs various kinds of processing under the control of the central processing unit, for example.
  • Data input to the encoding apparatus 1 or data resulting from various kinds of processing are stored in the main memory, for example, and data stored in the main memory are retrieved for use in other processing as necessary.
  • At least some of the processing components of the encoding apparatus 1 may be implemented by hardware such as an integrated circuit.
  • the encoding apparatus 1 in the first embodiment differs from the conventional encoding apparatus 9 in that, when the feature amount extracted by the feature amount extracting unit 120 is smaller than a predetermined threshold (i.e., when the temporal variation in the input sound signal is small), the encoding apparatus 1 encodes an adjusted LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p], which is a series generated by converting an adjusted linear prediction coefficient sequence a ⁇ R [1], a ⁇ R [2], ..., a ⁇ R [p] into LSP parameters, and outputs adjusted LSP code Cy, instead of encoding an LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] which is a series generated by converting linear prediction coefficient sequence a[1], a[2], ..., a[p] into LSP parameters and outputting LSP code C1.
  • when the feature amount extracted by the feature amount extracting unit 120 is smaller than the predetermined threshold (i.e., when the temporal variation in the input sound signal is small), the quantized LSP parameter sequence ^θ[1], ^θ[2], ..., ^θ[p] is not generated and thus cannot be input to the delay input unit 165.
  • the quantized linear prediction coefficient inverse adjustment unit 155 and the inverse-adjusted LSP generating unit 160 are processing components added for addressing this: when the feature amount extracted by the feature amount extracting unit 120 in the preceding frame was smaller than the predetermined threshold (i.e., when temporal variation in the input sound signal was small), they generate a series of approximations of the quantized LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] for the preceding frame to be used in the time domain encoding unit 170, from the adjusted quantized linear prediction coefficient sequence ⁇ a ⁇ R [1], ⁇ a ⁇ R [2], ..., ⁇ a ⁇ R [p].
  • an inverse-adjusted LSP parameter sequence ⁇ '[1], ⁇ '[2], ..., ⁇ '[p] is the series of approximations of the quantized LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p].
  • the series a ⁇ R [1], a ⁇ R [2], ..., a ⁇ R [p] determined will be called an adjusted linear prediction coefficient sequence.
  • the adjusted linear prediction coefficient sequence a ⁇ R [1], a ⁇ R [2], ..., a ⁇ R [p] output by the linear prediction coefficient adjusting unit 125 is input to the adjusted LSP generating unit 130.
  • the adjusted LSP generating unit 130 determines and outputs an adjusted LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p], which is a series of LSP parameters corresponding to the adjusted linear prediction coefficient sequence a ⁇ R [1], a ⁇ R [2], ..., a ⁇ R [p] output by the linear prediction coefficient adjusting unit 125.
  • the adjusted LSP parameter sequence θγR[1], θγR[2], ..., θγR[p] is a series in which values are arranged in ascending order. That is, it satisfies 0 < θγR[1] < θγR[2] < ... < θγR[p] < π.
  • the adjusted LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] output by the adjusted LSP generating unit 130 is input to the adjusted LSP encoding unit 135.
  • the adjusted LSP encoding unit 135 encodes the adjusted LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] output by the adjusted LSP generating unit 130, and generates adjusted LSP code C ⁇ and a series of quantized adjusted LSP parameters, ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p], corresponding to the adjusted LSP code C ⁇ , and outputs them.
  • the series ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] will be called an adjusted quantized LSP parameter sequence.
  • the adjusted quantized LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] output by the adjusted LSP encoding unit 135 is input to the quantized linear prediction coefficient generating unit 140.
  • the adjusted LSP code C ⁇ output by the adjusted LSP encoding unit 135 is input to the output unit 175.
  • the quantized linear prediction coefficient generating unit 140 generates and outputs a series of linear prediction coefficients, ⁇ a ⁇ R [1], ⁇ a ⁇ R [2], ..., ⁇ a ⁇ R [p], from the adjusted quantized LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] output by the adjusted LSP encoding unit 135.
  • the series ⁇ a ⁇ R [1], ⁇ a ⁇ R [2], ..., ⁇ a ⁇ R [p] will be called an adjusted quantized linear prediction coefficient sequence.
  • the adjusted quantized linear prediction coefficient sequence ⁇ a ⁇ [1], ⁇ a ⁇ [2], ..., ⁇ a ⁇ [p] output by the quantized linear prediction coefficient generating unit 140 is input to the first quantized smoothed power spectral envelope series calculating unit 145 and the quantized linear prediction coefficient inverse adjustment unit 155.
  • the first quantized smoothed power spectral envelope series calculating unit 145 generates and outputs a quantized smoothed power spectral envelope series ⁇ W ⁇ R [1], ⁇ W ⁇ R [2], ..., ⁇ W ⁇ R [N] according to Formula (8) using each coefficient ⁇ a ⁇ R [i] in the adjusted quantized linear prediction coefficient sequence ⁇ a ⁇ R [1], ⁇ a ⁇ R [2], ..., ⁇ a ⁇ R [p] output by the quantized linear prediction coefficient generating unit 140.
  • the quantized smoothed power spectral envelope series ⁇ W ⁇ R [1], ⁇ W ⁇ R [2], ..., ⁇ W ⁇ R [N] output by the first quantized smoothed power spectral envelope series calculating unit 145 is input to the frequency domain encoding unit 150.
  • Processing in the frequency domain encoding unit 150 is the same as that performed by the frequency domain encoding unit 150 of the conventional encoding apparatus 9 except that it uses the quantized smoothed power spectral envelope series ⁇ W ⁇ R [1], ⁇ W ⁇ R [2], ..., ⁇ W ⁇ R [N] in place of the approximate smoothed power spectral envelope series ⁇ W ⁇ R [1], ⁇ W ⁇ R [2], ..., ⁇ W ⁇ R [N].
  • the quantized linear prediction coefficient inverse adjustment unit 155 determines a series ⁇ a ⁇ [1]/( ⁇ R), ⁇ a ⁇ [2]/( ⁇ R) 2 , ..., ⁇ a ⁇ [p]/( ⁇ R) p of value a ⁇ [i]/( ⁇ R) i determined by dividing each value ⁇ a ⁇ R [i] in the adjusted quantized linear prediction coefficient sequence ⁇ a ⁇ R [1], ⁇ a ⁇ R [2], ..., ⁇ a ⁇ R [p] output by the quantized linear prediction coefficient generating unit 140 by the ith power of the adjustment factor ⁇ R, and outputs it.
  • the series ⁇ a ⁇ [1]/( ⁇ R), ⁇ a ⁇ [2]/( ⁇ R) 2 , ..., ⁇ a ⁇ [p]/( ⁇ R) p will be called an inverse-adjusted linear prediction coefficient sequence.
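  • The inverse adjustment simply undoes the earlier per-order scaling; a minimal sketch (illustrative names):

```python
import numpy as np

def inverse_adjust_lpc(a_hat_gamma, gamma_r):
    """Inverse-adjusted coefficients ^a_gammaR[i] / (gamma_R)^i for i = 1..p."""
    a_hat_gamma = np.asarray(a_hat_gamma, dtype=float)
    i = np.arange(1, len(a_hat_gamma) + 1)
    return a_hat_gamma / gamma_r ** i
```

  • Note that because the adjusted LSP parameters were quantized in between, the result is only an approximation of the quantized LSP parameter sequence, as stated above.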
  • the adjustment factor ⁇ R is set to the same value as the adjustment factor ⁇ R used in the linear prediction coefficient adjusting unit 125.
  • the inverse-adjusted linear prediction coefficient sequence ⁇ a ⁇ [1]/( ⁇ R), ⁇ a ⁇ [2]/( ⁇ R) 2 , ..., ⁇ a ⁇ [p]/( ⁇ R) p output by the quantized linear prediction coefficient inverse adjustment unit 155 is input to the inverse-adjusted LSP generating unit 160.
  • the inverse-adjusted LSP generating unit 160 determines and outputs a series of LSP parameters, ⁇ '[1], ⁇ '[2], ..., ⁇ '[p], from the inverse-adjusted linear prediction coefficient sequence ⁇ a ⁇ [1]/( ⁇ R), ⁇ a ⁇ [2]/( ⁇ R) 2 , ..., ⁇ a ⁇ [p]/( ⁇ R) p output by the quantized linear prediction coefficient inverse adjustment unit 155.
  • the LSP parameter series ⁇ '[1], ⁇ '[2], ..., ⁇ '[p] will be called an inverse-adjusted LSP parameter sequence.
  • the inverse-adjusted LSP parameter sequence θ'[1], θ'[2], ..., θ'[p] is a series in which values are arranged in ascending order. That is, it is a series that satisfies 0 < θ'[1] < θ'[2] < ... < θ'[p] < π.
  • the inverse-adjusted LSP parameters ⁇ '[1], ⁇ '[2], ..., ⁇ '[p] output by the inverse-adjusted LSP generating unit 160 are input to the delay input unit 165 as a quantized LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p]. That is, the inverse-adjusted LSP parameters ⁇ '[1], ⁇ '[2], ..., ⁇ '[p] are used in place of the quantized LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p].
  • the encoding apparatus 1 sends, by way of the output unit 175, the LSP code C1 output by the LSP encoding unit 115, the identification code Cg output by the feature amount extracting unit 120, the adjusted LSP code C ⁇ output by the adjusted LSP encoding unit 135, and either the frequency domain signal codes output by the frequency domain encoding unit 150 or the time domain signal codes output by the time domain encoding unit 170, to the decoding apparatus 2.
  • the decoding apparatus 2 includes an input unit 200, an identification code decoding unit 205, an LSP code decoding unit 210, an adjusted LSP code decoding unit 215, a decoded linear prediction coefficient generating unit 220, a first decoded smoothed power spectral envelope series calculating unit 225, a frequency domain decoding unit 230, a decoded linear prediction coefficient inverse adjustment unit 235, a decoded inverse-adjusted LSP generating unit 240, a delay input unit 245, a time domain decoding unit 250, and an output unit 255, for example.
  • the decoding apparatus 2 is a specialized device built by incorporating special programs into a known or dedicated computer having a central processing unit (CPU), main memory (random access memory or RAM), and the like, for example.
  • the decoding apparatus 2 performs various kinds of processing under the control of the central processing unit, for example.
  • Data input to the decoding apparatus 2 or data resulting from various kinds of processing are stored in the main memory, for example, and data stored in the main memory are retrieved for use in other processing as necessary.
  • At least some of the processing components of the decoding apparatus 2 may be implemented by hardware such as an integrated circuit.
  • a code sequence generated in the encoding apparatus 1 is input to the decoding apparatus 2.
  • the code sequence contains the LSP code C1, identification code Cg, adjusted LSP code C ⁇ , and either frequency domain signal codes or time domain signal codes.
  • the identification code decoding unit 205 implements control so that the adjusted LSP code decoding unit 215 will execute the subsequent processing if the identification code Cg contained in the input code sequence corresponds to information indicating the frequency domain encoding method, and so that the LSP code decoding unit 210 will execute the subsequent processing if the identification code Cg corresponds to information indicating the time domain encoding method.
  • the adjusted LSP code decoding unit 215, the decoded linear prediction coefficient generating unit 220, the first decoded smoothed power spectral envelope series calculating unit 225, the frequency domain decoding unit 230, the decoded linear prediction coefficient inverse adjustment unit 235, and the decoded inverse-adjusted LSP generating unit 240 are executed when the identification code Cg contained in the input code sequence corresponds to information indicating the frequency domain encoding method (step S206).
  • the adjusted LSP code decoding unit 215 obtains a decoded adjusted LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] by decoding the adjusted LSP code C ⁇ contained in the input code sequence, and outputs it. That is, it obtains and outputs a decoded adjusted LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] which is a sequence of LSP parameters corresponding to the adjusted LSP code C ⁇ .
  • the decoded adjusted LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] obtained here is identical to the adjusted quantized LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] generated by the encoding apparatus 1 if the adjusted LSP code C ⁇ output by the encoding apparatus 1 is accurately input to the decoding apparatus 2 without being affected by code errors or the like.
  • the decoded adjusted LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] output by the adjusted LSP code decoding unit 215 is input to the decoded linear prediction coefficient generating unit 220.
  • the decoded linear prediction coefficient generating unit 220 generates and outputs a series of linear prediction coefficients, ⁇ a ⁇ R [1], ⁇ a ⁇ R [2], ..., ⁇ a ⁇ R [p], from the decoded adjusted LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] output by the adjusted LSP code decoding unit 215.
  • the series ⁇ a ⁇ R [1], ⁇ a ⁇ R [2], ..., ⁇ a ⁇ R [p] will be called a decoded adjusted linear prediction coefficient sequence.
  • the decoded linear prediction coefficient sequence ⁇ a ⁇ R [1], ⁇ a ⁇ R [2], ..., ⁇ a ⁇ R [p] output by the decoded linear prediction coefficient generating unit 220 is input to the first decoded smoothed power spectral envelope series calculating unit 225 and the decoded linear prediction coefficient inverse adjustment unit 235.
  • the first decoded smoothed power spectral envelope series calculating unit 225 generates and outputs a decoded smoothed power spectral envelope series ⁇ W ⁇ R [1], ⁇ W ⁇ R [2], ..., ⁇ W ⁇ R [N] according to Formula (8) using each coefficient ⁇ a ⁇ R [i] in the decoded adjusted linear prediction coefficient sequence ⁇ a ⁇ R [1], ⁇ a ⁇ R [2], ..., ⁇ a ⁇ R [p] output by the decoded linear prediction coefficient generating unit 220.
  • the decoded smoothed power spectral envelope series ⁇ W ⁇ R [1], ⁇ W ⁇ R [2], ..., ⁇ W ⁇ R [N] output by the first decoded smoothed power spectral envelope series calculating unit 225 is input to the frequency domain decoding unit 230.
  • the frequency domain decoding unit 230 decodes the frequency domain signal codes contained in the input code sequence to determine a decoded normalized frequency domain signal sequence X N [1], X N [2], ..., X N [N].
  • the decoded linear prediction coefficient inverse adjustment unit 235 determines and outputs a series, ⁇ a ⁇ R [1]/( ⁇ R), ⁇ a ⁇ R [2]/( ⁇ R) 2 , ..., ⁇ a ⁇ R [p]/( ⁇ R) p , of value ⁇ a ⁇ [i]/( ⁇ R) i by dividing each value ⁇ a ⁇ R [i] in the decoded adjusted linear prediction coefficient sequence ⁇ a ⁇ R [1], ⁇ a ⁇ R [2], ..., ⁇ a ⁇ R [p] output by the decoded linear prediction coefficient generating unit 220 by the ith power of the adjustment factor ⁇ R.
  • the series ⁇ a ⁇ R [1]/( ⁇ R), ⁇ a ⁇ R [2]/( ⁇ R) 2 , ..., ⁇ a ⁇ R [p]/( ⁇ R) p will be called a decoded inverse-adjusted linear prediction coefficient sequence.
  • the adjustment factor ⁇ R is set to the same value as the adjustment factor ⁇ R used in the linear prediction coefficient adjusting unit 125 of the encoding apparatus 1.
  • the decoded inverse-adjusted linear prediction coefficient sequence ⁇ a ⁇ R [1]/( ⁇ R), ⁇ a ⁇ R [2]/( ⁇ R) 2 , ..., ⁇ a ⁇ R [p]/( ⁇ R) p output by the decoded linear prediction coefficient inverse adjustment unit 235 is input to the decoded inverse-adjusted LSP generating unit 240.
  • the decoded inverse-adjusted LSP generating unit 240 determines an LSP parameter series ⁇ '[1], ⁇ '[2], ..., ⁇ '[p] from the decoded inverse-adjusted linear prediction coefficient sequence ⁇ a ⁇ R [1]/( ⁇ R), ⁇ a ⁇ R [2]/( ⁇ R) 2 , ..., ⁇ a ⁇ R [p]/( ⁇ R) p , and outputs it.
  • the LSP parameter series ⁇ '[1], ⁇ '[2], ..., ⁇ '[p] will be called a decoded inverse-adjusted LSP parameter sequence.
  • the decoded inverse-adjusted LSP parameters ⁇ '[1], ⁇ '[2], ..., ⁇ '[p] output by the decoded inverse-adjusted LSP generating unit 240 are input to the delay input unit 245 as a decoded LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p].
  • the LSP code decoding unit 210, the delay input unit 245, and the time domain decoding unit 250 are executed when the identification code Cg contained in the input code sequence corresponds to information indicating the time domain encoding method (step S206).
  • the LSP code decoding unit 210 decodes the LSP code C1 contained in the input code sequence to obtain a decoded LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p], and outputs it. That is, it obtains and outputs a decoded LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p], which is a sequence of LSP parameters corresponding to the LSP code C1.
  • the decoded LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] output by the LSP code decoding unit 210 is input to the delay input unit 245 and the time domain decoding unit 250.
  • the delay input unit 245 holds the input decoded LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] and outputs it to the time domain decoding unit 250 with a delay equivalent to the duration of one frame. For instance, if the current frame is the fth frame, the decoded LSP parameter sequence for the f-1th frame, ⁇ [f-1] [1], ⁇ [f-1] [2], ..., ⁇ [f-1] [p], is output to the time domain decoding unit 250.
  • the decoded inverse-adjusted LSP parameter sequence ⁇ '[1], ⁇ '[2], ..., ⁇ '[p] output by the decoded inverse-adjusted LSP generating unit 240 is input to the delay input unit 245 as the decoded LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p].
  • the time domain decoding unit 250 identifies the waveforms contained in the adaptive codebook and waveforms in the fixed codebook from the time domain signal codes contained in the input code sequence.
  • by applying the synthesis filter to a signal generated by synthesis of the identified waveforms in the adaptive codebook and the identified waveforms in the fixed codebook (a signal from which the effect of the spectral envelope has been removed), a synthesized signal is determined, and the synthesized signal determined is output as a decoded sound signal.
  • the filter coefficients for the synthesis filter are generated using the decoded LSP parameter sequence for the fth frame, ⁇ [1], ⁇ [2], ..., ⁇ [p], and the decoded LSP parameter sequence for the f-1th frame, ⁇ [f-1] [1], ⁇ [f-1] [2], ..., ⁇ [f-1] [p].
  • a frame is first divided into two subframes, and the filter coefficients for the synthesis filter are determined as follows.
  • a series of values ^a[1]·γR, ^a[2]·(γR)^2, ..., ^a[p]·(γR)^p is used as the filter coefficients for the synthesis filter. This series is obtained by multiplying each coefficient ^a[i] of the decoded linear prediction coefficients ^a[1], ^a[2], ..., ^a[p], which is a coefficient sequence generated by converting the decoded LSP parameter sequence for the fth frame, ^θ[1], ^θ[2], ..., ^θ[p], into linear prediction coefficients, by the ith power of the adjustment factor γR.
  • a series of values ^a[1]·γR, ^a[2]·(γR)^2, ..., ^a[p]·(γR)^p, which is obtained by multiplying each coefficient ^a[i] of the decoded interpolated linear prediction coefficients ^a[1], ^a[2], ..., ^a[p] by the ith power of the adjustment factor γR, is used as the filter coefficients for the synthesis filter.
  • the adjusted LSP encoding unit 135 of the encoding apparatus 1 determines such an adjusted quantized LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] that minimizes the quantizing distortion between the adjusted LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] and the adjusted quantized LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p].
  • the quantized smoothed power spectral envelope series ⁇ W ⁇ R [1], ⁇ W ⁇ R [2], ..., ⁇ W ⁇ R [N], which is a power spectral envelope series obtained by expanding the adjusted quantized LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] into the frequency domain, can approximate the smoothed power spectral envelope series W ⁇ R [1], W ⁇ R [2], ..., W ⁇ R [N] with high accuracy.
  • when the code amount of the LSP code C1 is the same as that of the adjusted LSP code Cγ, the first embodiment yields smaller encoding distortion in frequency domain encoding than the conventional technique.
  • the adjusted LSP code C ⁇ achieves a further smaller code amount compared to the conventional method than the LSP code C1 does.
  • with the same encoding distortion as the conventional method, the code amount can be reduced compared to the conventional method, whereas with the same code amount as the conventional method, encoding distortion can be reduced compared to the conventional method.
  • the encoding apparatus 1 and the decoding apparatus 2 of the first embodiment are computationally expensive, particularly in the inverse-adjusted LSP generating unit 160 and the decoded inverse-adjusted LSP generating unit 240.
  • an encoding apparatus 3 in a second embodiment directly generates an approximate quantized LSP parameter sequence ⁇ [1] app , ⁇ [2] app , ..., ⁇ [p] app , which is a series of approximations of the values in the quantized LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p], from the adjusted quantized LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] without the intermediation of linear prediction coefficients.
  • a decoding apparatus 4 in the second embodiment directly generates a decoded approximate LSP parameter sequence ⁇ [1] app , ⁇ [2] app , ..., ⁇ [p] app , which is a series of approximations of the values in the decoded LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p], from the decoded adjusted LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] without the intermediation of linear prediction coefficients.
  • Fig. 8 shows the functional configuration of the encoding apparatus 3 in the second embodiment.
  • the encoding apparatus 3 differs from the encoding apparatus 1 of the first embodiment in that it does not include the quantized linear prediction coefficient inverse adjustment unit 155 and the inverse-adjusted LSP generating unit 160 but includes an LSP linear transformation unit 300 instead.
  • the LSP linear transformation unit 300 applies approximate linear transformation to an adjusted quantized LSP parameter sequence ⁇ ⁇ R [1], ⁇ ⁇ R [2], ..., ⁇ ⁇ R [p] to generate an approximate quantized LSP parameter sequence ⁇ [1] app , ⁇ [2] app , ..., ⁇ [p] app .
  • although the LSP linear transformation unit 300 applies approximate transformation to a series of quantized LSP parameters, the nature of an unquantized LSP parameter sequence will be discussed first, because the nature of a quantized LSP parameter series is basically the same as that of an unquantized LSP parameter sequence.
  • An LSP parameter sequence ⁇ [1], ⁇ [2], ..., ⁇ [p] is a parameter sequence in the frequency domain that is correlated with the power spectral envelope of the input sound signal.
  • Each value in the LSP parameter sequence is correlated with the frequency position of an extremum of the power spectral envelope of the input sound signal.
  • an extremum of the power spectral envelope is present at a frequency position between θ[i] and θ[i+1]; and the steeper the slope of the tangent around that extremum, the smaller the interval between θ[i] and θ[i+1] (i.e., the value of θ[i+1] - θ[i]) becomes.
  • the adjusted LSP parameters satisfy the property 0 < θγ[1] < θγ[2] < ... < θγ[p] < π.
  • the horizontal axis represents the value of adjustment factor ⁇ and the vertical axis represents the adjusted LSP parameter value.
  • each ⁇ ⁇ [i] is derived by determining an adjusted linear prediction coefficient sequence a ⁇ [1], a ⁇ [2], ..., a ⁇ [p] for each value of ⁇ through processing similar to the linear prediction coefficient adjusting unit 125 by use of a linear prediction coefficient sequence a[1], a[2], ..., a[p] which has been obtained by linear prediction analysis on a certain speech sound signal, and then converting the adjusted linear prediction coefficient sequence a ⁇ [1], a ⁇ [2], ..., a ⁇ [p] into LSP parameters through similar processing to the adjusted LSP generating unit 130.
  • each LSP parameter θγ[i], when seen locally, is in a linear relationship with the increase or decrease of γ.
  • the magnitude of the slope of a straight line connecting a point ( ⁇ 1, ⁇ ⁇ 1 [i]) and a point ( ⁇ 2, ⁇ ⁇ 2 [i]) on the two-dimensional plane is correlated with the relative interval between the LSP parameters that precede and follow ⁇ ⁇ 1 [i] in the LSP parameter sequence, ⁇ ⁇ 1 [1], ⁇ ⁇ 1 [2], ..., ⁇ ⁇ 1 [p] (i.e., ⁇ ⁇ 1 [i-1] and ⁇ ⁇ 1 [i+1]), and ⁇ ⁇ 1 [i].
  • Formulas (9) and (10) indicate that when θγ1[i] lies on the θγ1[i+1] side of the midpoint between θγ1[i+1] and θγ1[i-1], θγ2[i] will assume a value that is even closer to θγ2[i+1] (see Fig. 10).
  • Formulas (11) and (12) indicate that when θγ1[i] lies on the θγ1[i-1] side of the midpoint between θγ1[i+1] and θγ1[i-1], θγ2[i] will assume a value that is even closer to θγ2[i-1].
  • This means that on a two-dimensional plane with the horizontal axis being the ⁇ value and the vertical axis being the LSP parameter value, the slope of straight line connecting the point ( ⁇ 1, ⁇ ⁇ 1 [i]) and the point ( ⁇ 2, ⁇ ⁇ 2 [i]) is smaller than the slope of a straight line connecting the point (0, ⁇ ⁇ 0 [i]) and the point ( ⁇ 1, ⁇ ⁇ 1 [i]).
  • Formulas (9) to (12) describe the relationships on the assumption that γ1 < γ2; the model of Formula (13), however, places no limitation on the relation of magnitude between γ1 and γ2, which may satisfy either γ1 < γ2 or γ1 > γ2.
  • The matrix K is a band matrix that has non-zero values only in the diagonal components and the elements adjacent to them, and it represents the correlations described above between the LSP parameter corresponding to each diagonal component and its neighboring LSP parameters. Note that although Formula (14) illustrates a band matrix with a band width of three, the band width is not limited to three.
  • When γ1 > γ2, this corresponds to straight-line interpolation, while when γ1 < γ2, it corresponds to straight-line extrapolation.
  • Formula (17) means adjusting the value of θ_γ1[i] by weighting the differences between the ith LSP parameter θ_γ1[i] in the LSP parameter sequence θ_γ1[1], θ_γ1[2], ..., θ_γ1[p] and its preceding and following LSP parameter values (i.e., θ_γ1[i] − θ_γ1[i−1] and θ_γ1[i+1] − θ_γ1[i]) so as to obtain θ_γ2[i]. That is to say, correlations such as those shown in Formulas (9) through (12) above are reflected in the elements in the band portion (the non-zero elements) of the matrix K in Formula (13a).
  • The values θ_γ2[1], θ_γ2[2], ..., θ_γ2[p] given by Formula (13a) are approximate values (estimated values) of the LSP parameters obtained when the linear prediction coefficient sequence a[1]×(γ2), a[2]×(γ2)^2, ..., a[p]×(γ2)^p is converted to LSP parameters.
  • The matrix K in Formula (14) tends to have positive values in the diagonal components and negative values in the elements in their vicinity, as indicated by Formulas (16) and (17).
  • The matrix K is a preset matrix, which is pre-learned using learning data, for example. How to learn the matrix K will be discussed later.
  • The LSP linear transformation unit 300 included in the encoding apparatus 3 of the second embodiment generates an approximate quantized LSP parameter sequence ^θ[1]_app, ^θ[2]_app, ..., ^θ[p]_app from the adjusted quantized LSP parameter sequence ^θ_γR[1], ^θ_γR[2], ..., ^θ_γR[p] based on Formula (13b).
  • The adjustment factor γR used in generation of the adjusted quantized LSP parameter sequence ^θ_γR[1], ^θ_γR[2], ..., ^θ_γR[p] is the same as the adjustment factor γR used in the linear prediction coefficient adjusting unit 125.
  • The adjusted quantized LSP parameter sequence ^θ_γR[1], ^θ_γR[2], ..., ^θ_γR[p] output by the adjusted LSP encoding unit 135 is also input to the LSP linear transformation unit 300, in addition to the quantized linear prediction coefficient generating unit 140.
  • The approximate quantized LSP parameter sequence ^θ[1]_app, ^θ[2]_app, ..., ^θ[p]_app output by the LSP linear transformation unit 300 is input to the delay input unit 165 as the quantized LSP parameter sequence ^θ[1], ^θ[2], ..., ^θ[p].
  • The approximate quantized LSP parameter sequence ^θ[1]_app, ^θ[2]_app, ..., ^θ[p]_app for the preceding frame is thus used in place of the quantized LSP parameter sequence ^θ[1], ^θ[2], ..., ^θ[p] for the preceding frame.
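  • A minimal sketch of the kind of approximate linear transformation performed by the LSP linear transformation unit 300, assuming a first-order model of the same shape as Formula (20) of the fifth embodiment below (the exact form of Formula (13b) is not reproduced in this excerpt): the adjusted quantized LSP sequence obtained with the factor γR is corrected toward the unadjusted sequence by a band-matrix term proportional to the change in the adjustment factor.

```python
import numpy as np

def lsp_linear_transform(theta_src, K, gamma_src, gamma_dst):
    """Approximate the LSP sequence for adjustment factor gamma_dst from
    the LSP sequence theta_src obtained with factor gamma_src, using a
    pre-learned matrix K (first-order model assumed in the lead-in).
    Because K is a band matrix, only its band elements contribute."""
    theta_src = np.asarray(theta_src, dtype=float)
    p = len(theta_src)
    v = np.pi * np.arange(1, p + 1) / (p + 1)   # equispaced reference LSPs
    return theta_src + (gamma_dst - gamma_src) * (K @ (theta_src - v))

# Inside the encoder, gamma_src = gamma_R and gamma_dst = 1 would yield the
# approximate quantized LSP parameter sequence ^theta[1]_app, ..., ^theta[p]_app.
```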
  • Fig. 13 shows the functional configuration of the decoding apparatus 4 in the second embodiment.
  • The decoding apparatus 4 differs from the decoding apparatus 2 in the first embodiment in that it does not include the decoded linear prediction coefficient inverse adjustment unit 235 and the decoded inverse-adjusted LSP generating unit 240 but includes a decoded LSP linear transformation unit 400 instead.
  • Processing in the adjusted LSP code decoding unit 215 is the same as in the first embodiment. However, the decoded adjusted LSP parameter sequence ^θ_γR[1], ^θ_γR[2], ..., ^θ_γR[p] output by the adjusted LSP code decoding unit 215 is also input to the decoded LSP linear transformation unit 400, in addition to the decoded linear prediction coefficient generating unit 220.
  • The decoded approximate LSP parameter sequence ^θ[1]_app, ^θ[2]_app, ..., ^θ[p]_app output by the decoded LSP linear transformation unit 400 is input to the delay input unit 245 as a decoded LSP parameter sequence ^θ[1], ^θ[2], ..., ^θ[p].
  • The decoded approximate LSP parameter sequence ^θ[1]_app, ^θ[2]_app, ..., ^θ[p]_app for the preceding frame is used in place of the decoded LSP parameter sequence ^θ[1], ^θ[2], ..., ^θ[p] for the preceding frame.
  • The transformation matrix K used in the LSP linear transformation unit 300 and the decoded LSP linear transformation unit 400 is determined in advance through the following process and prestored in respective storages (not shown) of the encoding apparatus 3 and the decoding apparatus 4.
  • Step 1: For prepared sample data of speech sound signals corresponding to M frames, each sample data item is subjected to linear prediction analysis to obtain linear prediction coefficients.
  • A linear prediction coefficient sequence produced by linear prediction analysis of the mth (1 ≤ m ≤ M) sample data item is represented as a^(m)[1], a^(m)[2], ..., a^(m)[p] and referred to as the linear prediction coefficient sequence a^(m)[1], a^(m)[2], ..., a^(m)[p] corresponding to the mth sample data item.
  • Step 4: For each m, an adjusted LSP parameter sequence θ_γL^(m)[1], ..., θ_γL^(m)[p] is determined from the adjusted linear prediction coefficient sequence a_γL^(m)[1], ..., a_γL^(m)[p].
  • The adjusted LSP parameter sequence θ_γL^(m)[1], ..., θ_γL^(m)[p] is encoded in a manner similar to the adjusted LSP encoding unit 135, thereby generating a quantized LSP parameter sequence ^θ_γL^(m)[1], ..., ^θ_γL^(m)[p].
  • ^θ^(m)_γ2 = (^θ_γL^(m)[1], ..., ^θ_γL^(m)[p])^T.
  • Through Steps 1 to 4, M pairs of quantized LSP parameter sequences (^θ^(m)_γ1, ^θ^(m)_γ2) are obtained.
  • The learning data set is Q = {(^θ^(m)_γ1, ^θ^(m)_γ2) | m = 1, ..., M}. Note that all of the values of the adjustment factor γL used in generation of the learning data set Q are a common fixed value.
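  • How K is learned from the data set Q is not spelled out in this excerpt. One natural choice, sketched below under the same first-order model as the sketch above, is to fit the band elements of K row by row with ordinary least squares over the M pairs; the objective, the model, and the half-width parameter are assumptions made for illustration.

```python
import numpy as np

def learn_band_matrix(theta_1, theta_2, gamma_1, gamma_2, halfwidth=1):
    """Fit a band matrix K so that, over the M training pairs,
    theta_2 ~= theta_1 + (gamma_2 - gamma_1) * K @ (theta_1 - v),
    where v[i] = i * pi / (p + 1).
    theta_1, theta_2: arrays of shape (M, p) holding the paired sequences."""
    theta_1 = np.asarray(theta_1, dtype=float)
    theta_2 = np.asarray(theta_2, dtype=float)
    M, p = theta_1.shape
    v = np.pi * np.arange(1, p + 1) / (p + 1)
    X = (gamma_2 - gamma_1) * (theta_1 - v)      # regressors, shape (M, p)
    Y = theta_2 - theta_1                        # targets, shape (M, p)
    K = np.zeros((p, p))
    for i in range(p):                           # one least-squares fit per row
        cols = list(range(max(0, i - halfwidth), min(p, i + halfwidth + 1)))
        coeffs, *_ = np.linalg.lstsq(X[:, cols], Y[:, i], rcond=None)
        K[i, cols] = coeffs
    return K
```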
  • The matrix K used in the LSP linear transformation unit 300 does not have to be one that has been learned using the same value as the adjustment factor γR used in the encoding apparatus 3.
  • The products of the values x_1, x_2, ..., x_15, y_1, y_2, ..., y_14, z_2, z_3, ..., z_15 in Formula (14) and (γ2 − γ1) are the values xx_1, xx_2, ..., xx_15, yy_1, yy_2, ..., yy_14, zz_2, zz_3, ..., zz_15 below:
  • The encoding apparatus 3 provides effects similar to those of the encoding apparatus 1 in the first embodiment because, as with the first embodiment, it has a configuration in which the quantized linear prediction coefficient generating unit 900, the quantized linear prediction coefficient adjusting unit 905, and the approximate smoothed power spectral envelope series calculating unit 910 of the conventional encoding apparatus 9 are replaced with the linear prediction coefficient adjusting unit 125, the adjusted LSP generating unit 130, the adjusted LSP encoding unit 135, the quantized linear prediction coefficient generating unit 140, and the first quantized smoothed power spectral envelope series calculating unit 145. That is, when the encoding distortion is equal to that of a conventional method, the code amount can be reduced compared to the conventional method, whereas when the code amount is the same as in the conventional method, the encoding distortion can be reduced compared to the conventional method.
  • The calculation cost of the encoding apparatus 3 in the second embodiment is low because K is a band matrix in the calculation of Formula (18).
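  • To make the cost point concrete, the sketch below multiplies a band matrix of band width three by a vector using only its three stored diagonals, so the work per frame is O(p) rather than O(p^2); the diagonal storage layout is an assumption made for illustration.

```python
import numpy as np

def band_matvec(lower, main, upper, x):
    """y = K @ x for a band matrix K of band width three, stored as its
    sub-diagonal `lower` (length p - 1), main diagonal `main` (length p)
    and super-diagonal `upper` (length p - 1).  Runs in O(p)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(main, dtype=float) * x
    y[1:] += np.asarray(lower, dtype=float) * x[:-1]
    y[:-1] += np.asarray(upper, dtype=float) * x[1:]
    return y
```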
  • The encoding apparatus 3 in the second embodiment decides whether to encode in the time domain or in the frequency domain based on the magnitude of temporal variation in the input sound signal for each frame.
  • Even when the temporal variation in the input sound signal was large and frequency domain encoding was selected, it is possible that a sound signal reproduced by encoding in the time domain would actually exhibit smaller distortion relative to the input sound signal than a signal reproduced by encoding in the frequency domain.
  • Likewise, even when the temporal variation in the input sound signal was small and time domain encoding was selected, it is possible that a sound signal reproduced by encoding in the frequency domain would actually exhibit smaller distortion relative to the input sound signal than a sound signal reproduced by encoding in the time domain.
  • In other words, the encoding apparatus 3 in the second embodiment cannot always select the one of the time domain and frequency domain encoding methods that provides the smaller distortion relative to the input sound signal.
  • Accordingly, an encoding apparatus 8 in a modification of the second embodiment performs both time domain and frequency domain encoding on each frame and selects whichever yields the smaller distortion relative to the input sound signal.
  • Fig. 15 shows the functional configuration of the encoding apparatus 8 in a modification of the second embodiment.
  • The encoding apparatus 8 differs from the encoding apparatus 3 in the second embodiment in that it does not include the feature amount extracting unit 120 and includes a code selection and output unit 375 in place of the output unit 175.
  • In addition to the input unit 100 and the linear prediction analysis unit 105, the LSP generating unit 110, the LSP encoding unit 115, the linear prediction coefficient adjusting unit 125, the adjusted LSP generating unit 130, the adjusted LSP encoding unit 135, the quantized linear prediction coefficient generating unit 140, the first quantized smoothed power spectral envelope series calculating unit 145, the delay input unit 165, and the LSP linear transformation unit 300 also operate for all frames, regardless of whether the temporal variation in the input sound signal is large or small.
  • The operations of these components are the same as in the second embodiment.
  • The approximate quantized LSP parameter sequence ^θ[1]_app, ^θ[2]_app, ..., ^θ[p]_app generated by the LSP linear transformation unit 300 is input to the delay input unit 165.
  • The delay input unit 165 holds the quantized LSP parameter sequence ^θ[1], ^θ[2], ..., ^θ[p] input from the LSP encoding unit 115 and the approximate quantized LSP parameter sequence ^θ[1]_app, ^θ[2]_app, ..., ^θ[p]_app input from the LSP linear transformation unit 300 for at least the duration of one frame.
  • The delay input unit 165 outputs the approximate quantized LSP parameter sequence ^θ[1]_app, ^θ[2]_app, ..., ^θ[p]_app for the preceding frame input from the LSP linear transformation unit 300 to the time domain encoding unit 170 as the quantized LSP parameter sequence ^θ[1], ^θ[2], ..., ^θ[p] for the preceding frame.
  • Alternatively, the delay input unit 165 outputs the quantized LSP parameter sequence ^θ[1], ^θ[2], ..., ^θ[p] for the preceding frame input from the LSP encoding unit 115 to the time domain encoding unit 170 (step S165).
  • The frequency domain encoding unit 150 generates and outputs frequency domain signal codes, and also determines and outputs the distortion, or an estimated value of the distortion, of the sound signal corresponding to the frequency domain signal codes relative to the input sound signal.
  • The distortion or an estimate thereof may be determined either in the time domain or in the frequency domain. This means that the frequency domain encoding unit 150 may determine the distortion, or an estimated value of the distortion, of a frequency-domain sound signal series corresponding to the frequency domain signal codes relative to the frequency-domain sound signal series obtained by converting the input sound signal into the frequency domain.
  • The time domain encoding unit 170, as with the time domain encoding unit 170 in the second embodiment, generates and outputs time domain signal codes, and also determines the distortion, or an estimated value of the distortion, of the sound signal corresponding to the time domain signal codes relative to the input sound signal.
  • Input to the code selection and output unit 375 are the frequency domain signal codes generated by the frequency domain encoding unit 150, the distortion or an estimated value of distortion determined by the frequency domain encoding unit 150, the time domain signal codes generated by the time domain encoding unit 170, and the distortion or an estimated value of distortion determined by the time domain encoding unit 170.
  • When the distortion or the estimated value of distortion input from the frequency domain encoding unit 150 is smaller than the distortion or the estimated value of distortion input from the time domain encoding unit 170, the code selection and output unit 375 outputs the frequency domain signal codes and the identification code Cg, which is information indicating the frequency domain encoding method. When the distortion or the estimated value of distortion input from the frequency domain encoding unit 150 is greater than that input from the time domain encoding unit 170, the code selection and output unit 375 outputs the time domain signal codes and the identification code Cg, which is information indicating the time domain encoding method.
  • The code selection and output unit 375 outputs either the time domain signal codes or the frequency domain signal codes according to predetermined rules, as well as the identification code Cg, which is information indicating the encoding method corresponding to the codes being output.
  • In short, the code selection and output unit 375 outputs whichever codes lead to the smaller distortion of the sound signal reproduced from the codes relative to the input sound signal, and also outputs, as the identification code Cg, information indicating the encoding method that yields the smaller distortion (step S375).
  • The code selection and output unit 375 may also be configured to select whichever of the sound signals reproduced from the respective codes has the smaller distortion relative to the input sound signal.
  • In this case, the frequency domain encoding unit 150 and the time domain encoding unit 170 reproduce sound signals from the codes and output them instead of the distortion or an estimated value of the distortion.
  • The code selection and output unit 375 then outputs whichever of the sound signal reproduced by the frequency domain encoding unit 150 from the frequency domain signal codes and the sound signal reproduced by the time domain encoding unit 170 from the time domain signal codes has the smaller distortion relative to the input sound signal, and also outputs, as the identification code Cg, information indicating the encoding method that yields the smaller distortion.
  • Alternatively, the code selection and output unit 375 may be configured to select whichever codes have the smaller code amount.
  • In this case, the frequency domain encoding unit 150 outputs the frequency domain signal codes as in the second embodiment.
  • Likewise, the time domain encoding unit 170 outputs the time domain signal codes as in the second embodiment.
  • The code selection and output unit 375 outputs whichever of the frequency domain signal codes and the time domain signal codes has the smaller code amount, and also outputs, as the identification code Cg, information indicating the encoding method that yields the smaller code amount.
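  • A minimal sketch of the selection logic of the code selection and output unit 375, covering the distortion-based and code-amount-based criteria described above; the Candidate container used to carry each unit's codes together with its measure is a hypothetical structure introduced only for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    codes: bytes      # signal codes produced by one encoding unit
    measure: float    # distortion (or its estimate), or code amount

def select_codes(freq: Candidate, time: Candidate):
    """Return (identification code Cg, selected codes): pick whichever
    candidate has the smaller measure; a tie falls back to a predetermined
    rule (here: the frequency domain codes)."""
    if freq.measure <= time.measure:
        return "frequency", freq.codes
    return "time", time.codes

# Distortion-based selection: measure = distortion (or estimate) from each unit.
# Code-amount-based selection: measure = len(codes) of each candidate.
```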
  • A code sequence output by the encoding apparatus 8 in the modification of the second embodiment can be decoded by the decoding apparatus 4 of the second embodiment, as can a code sequence output by the encoding apparatus 3 of the second embodiment.
  • The encoding apparatus 8 in the modification of the second embodiment provides effects similar to those of the encoding apparatus 3 of the second embodiment and further has the effect of reducing the code amount to be output compared to the encoding apparatus 3 of the second embodiment.
  • The encoding apparatus 1 of the first embodiment and the encoding apparatus 3 of the second embodiment first convert the adjusted quantized LSP parameter sequence ^θ_γR[1], ^θ_γR[2], ..., ^θ_γR[p] into linear prediction coefficients and then calculate the quantized smoothed power spectral envelope series ^W_γR[1], ^W_γR[2], ..., ^W_γR[N].
  • An encoding apparatus 5 in the third embodiment directly calculates the quantized smoothed power spectral envelope series ^W_γR[1], ^W_γR[2], ..., ^W_γR[N] from the adjusted quantized LSP parameter sequence ^θ_γR[1], ^θ_γR[2], ..., ^θ_γR[p] without converting the adjusted quantized LSP parameter sequence to linear prediction coefficients.
  • Similarly, a decoding apparatus 6 in the third embodiment directly calculates the decoded smoothed power spectral envelope series ^W_γR[1], ^W_γR[2], ..., ^W_γR[N] from the decoded adjusted LSP parameter sequence ^θ_γR[1], ^θ_γR[2], ..., ^θ_γR[p] without converting the decoded adjusted LSP parameter sequence to linear prediction coefficients.
  • Fig. 17 shows the functional configuration of the encoding apparatus 5 according to the third embodiment.
  • The encoding apparatus 5 differs from the encoding apparatus 3 in the second embodiment in that it does not include the quantized linear prediction coefficient generating unit 140 and the first quantized smoothed power spectral envelope series calculating unit 145 but includes a second quantized smoothed power spectral envelope series calculating unit 146 instead.
  • The second quantized smoothed power spectral envelope series calculating unit 146 uses the adjusted quantized LSP parameter sequence ^θ_γR[1], ^θ_γR[2], ..., ^θ_γR[p] output by the adjusted LSP encoding unit 135 to determine a quantized smoothed power spectral envelope series ^W_γR[1], ^W_γR[2], ..., ^W_γR[N] according to Formula (19) and outputs it.
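  • Formula (19) itself is not reproduced in this excerpt. As a hedged illustration of determining a power spectral envelope directly from LSP values without converting them back to linear prediction coefficients, the sketch below uses the standard closed-form relation between an LPC spectrum and its LSP parameters (p assumed even, and the odd-indexed parameters assumed to be roots of the symmetric polynomial); any gain or normalization terms that Formula (19) may contain are omitted.

```python
import numpy as np

def envelope_from_lsp(theta, n_bins):
    """Power spectral envelope 1 / |A(e^{jw})|^2 evaluated at n_bins
    frequencies w_k = pi * (k + 0.5) / n_bins, computed directly from the
    LSP parameters theta[1..p] (p even), with gain terms omitted."""
    theta = np.asarray(theta, dtype=float)
    p = len(theta)
    w = np.pi * (np.arange(n_bins) + 0.5) / n_bins
    cos_w = np.cos(w)[:, None]
    odd = np.cos(theta[0::2])[None, :]    # assumed roots of the symmetric P(z)
    even = np.cos(theta[1::2])[None, :]   # assumed roots of the antisymmetric Q(z)
    a2 = (2.0 ** p) * (
        np.cos(w / 2) ** 2 * np.prod((cos_w - odd) ** 2, axis=1)
        + np.sin(w / 2) ** 2 * np.prod((cos_w - even) ** 2, axis=1)
    )
    return 1.0 / a2
```

Applied to the adjusted quantized LSP parameter sequence ^θ_γR[1], ..., ^θ_γR[p], a routine of this kind yields, up to the omitted gain terms, a smoothed envelope of the sort the unit 146 outputs.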
  • Fig. 19 shows the functional configuration of the decoding apparatus 6 in the third embodiment.
  • The decoding apparatus 6 differs from the decoding apparatus 4 in the second embodiment in that it does not include the decoded linear prediction coefficient generating unit 220 and the first decoded smoothed power spectral envelope series calculating unit 225 but includes a second decoded smoothed power spectral envelope series calculating unit 226 instead.
  • The second decoded smoothed power spectral envelope series calculating unit 226 uses the decoded adjusted LSP parameter sequence ^θ_γR[1], ^θ_γR[2], ..., ^θ_γR[p] to determine a decoded smoothed power spectral envelope series ^W_γR[1], ^W_γR[2], ..., ^W_γR[N] according to Formula (19) above and outputs it.
  • The quantized LSP parameter sequence ^θ[1], ^θ[2], ..., ^θ[p] is a sequence that satisfies 0 < ^θ[1] < ... < ^θ[p] < π. That is, it is a sequence in which the parameters are arranged in ascending order.
  • However, the approximate quantized LSP parameter sequence ^θ[1]_app, ^θ[2]_app, ..., ^θ[p]_app generated by the LSP linear transformation unit 300 is produced through an approximate transformation, so it may not be in ascending order.
  • The fourth embodiment therefore adds processing for rearranging the approximate quantized LSP parameter sequence ^θ[1]_app, ^θ[2]_app, ..., ^θ[p]_app output by the LSP linear transformation unit 300 into ascending order.
  • Fig. 21 shows the functional configuration of an encoding apparatus 7 in the fourth embodiment.
  • The encoding apparatus 7 differs from the encoding apparatus 5 in the second embodiment in that it further includes an approximate LSP series modifying unit 700.
  • The approximate LSP series modifying unit 700 outputs, as a modified approximate quantized LSP parameter sequence ^θ'[1]_app, ^θ'[2]_app, ..., ^θ'[p]_app, a sequence in which the values ^θ[i]_app in the approximate quantized LSP parameter sequence ^θ[1]_app, ^θ[2]_app, ..., ^θ[p]_app output by the LSP linear transformation unit 300 have been rearranged in ascending order.
  • The modified approximate quantized LSP parameter sequence ^θ'[1]_app, ^θ'[2]_app, ..., ^θ'[p]_app output by the approximate LSP series modifying unit 700 is input to the delay input unit 165 as the quantized LSP parameter sequence ^θ[1], ^θ[2], ..., ^θ[p].
  • Alternatively, each value ^θ[i]_app may be adjusted to ^θ'[i]_app such that ^θ'[i+1]_app − ^θ'[i]_app is equal to or greater than a predetermined threshold for each value of i = 1, ..., p−1.
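  • A small sketch of the modification performed by the approximate LSP series modifying unit 700: sort into ascending order and, optionally, enforce a minimum separation between adjacent values; the parameter name min_gap is hypothetical.

```python
import numpy as np

def modify_approximate_lsp(theta_app, min_gap=None):
    """Rearrange the approximate quantized LSP values into ascending order
    and, if min_gap is given, push adjacent values apart so that each
    difference theta'[i + 1] - theta'[i] is at least min_gap."""
    theta = np.sort(np.asarray(theta_app, dtype=float))
    if min_gap is not None:
        for i in range(1, len(theta)):
            theta[i] = max(theta[i], theta[i - 1] + min_gap)
    return theta
```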
  • An ISP parameter sequence may be employed instead of an LSP parameter sequence.
  • An ISP parameter sequence ISP[1], ..., ISP[p] is equivalent to a series consisting of an LSP parameter sequence of the (p−1)th order and the PARCOR coefficient k_p of the pth order (the highest order).
  • In this case, input to the LSP linear transformation unit 300 is an adjusted quantized ISP parameter sequence ^ISP_γR[1], ^ISP_γR[2], ..., ^ISP_γR[p].
  • The LSP linear transformation unit 300 determines an approximate quantized ISP parameter sequence ^ISP[1]_app, ..., ^ISP[p]_app through the following process and outputs it.
  • Step 2: ^ISP[p]_app defined by the formula below is determined.
  • ^ISP[p]_app = ^ISP_γR[p] × (1/γR)^p.
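  • Step 1 of the ISP variant is not reproduced in this excerpt; a natural reading, assumed in the sketch below, is that the first p−1 entries are transformed in the same way as the LSP parameters (reusing lsp_linear_transform from the earlier sketch), while the pth entry is rescaled according to the formula of Step 2.

```python
import numpy as np

def isp_linear_transform(isp_src, K, gamma_R):
    """Approximate unadjusted ISP parameters from the adjusted quantized ISP
    parameters isp_src obtained with adjustment factor gamma_R.
    The first p - 1 entries are treated like LSP parameters (an assumption;
    K has shape (p - 1, p - 1)); the last entry follows Step 2:
    ^ISP[p]_app = ^ISP_gammaR[p] * (1 / gamma_R) ** p."""
    isp_src = np.asarray(isp_src, dtype=float)
    p = len(isp_src)
    head = lsp_linear_transform(isp_src[:-1], K, gamma_src=gamma_R, gamma_dst=1.0)
    tail = isp_src[-1] * (1.0 / gamma_R) ** p
    return np.concatenate((head, [tail]))
```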
  • The LSP linear transformation unit 300 included in the encoding apparatuses 3, 5, 7, and 8 and the decoded LSP linear transformation unit 400 included in the decoding apparatuses 4 and 6 may also be implemented as a separate frequency domain parameter sequence generating apparatus.
  • The following description illustrates a case where the LSP linear transformation unit 300 included in the encoding apparatuses 3, 5, 7, and 8 and the decoded LSP linear transformation unit 400 included in the decoding apparatuses 4 and 6 are implemented as a separate frequency domain parameter sequence generating apparatus.
  • A frequency domain parameter sequence generating apparatus 10 includes a parameter sequence converting unit 20, for example, as shown in Fig. 23, and receives frequency domain parameters ω[1], ω[2], ..., ω[p] as input and outputs converted frequency domain parameters ~ω[1], ~ω[2], ..., ~ω[p].
  • The frequency domain parameters ω[1], ω[2], ..., ω[p] to be input are a frequency domain parameter sequence derived from linear prediction coefficients a[1], a[2], ..., a[p], which are obtained by linear prediction analysis of a sound signal in a predetermined time segment.
  • The frequency domain parameters ω[1], ω[2], ..., ω[p] may be, for example, the LSP parameter sequence θ[1], θ[2], ..., θ[p] used in conventional encoding methods, or the quantized LSP parameter sequence ^θ[1], ^θ[2], ..., ^θ[p].
  • Alternatively, they may be the adjusted LSP parameter sequence θ_γR[1], θ_γR[2], ..., θ_γR[p] or the adjusted quantized LSP parameter sequence ^θ_γR[1], ^θ_γR[2], ..., ^θ_γR[p] used in the aforementioned embodiments, for example.
  • Further, they may be frequency domain parameters equivalent to LSP parameters, such as the ISP parameter sequence described in the modification above, for example.
  • A frequency domain parameter sequence derived from the linear prediction coefficients a[1], a[2], ..., a[p] is a series in the frequency domain that is derived from a linear prediction coefficient sequence and represented by the same number of elements as the prediction order. It is typified by an LSP parameter sequence, an ISP parameter sequence, an LSF parameter sequence, or an ISF parameter sequence, each derived from the linear prediction coefficient sequence a[1], a[2], ..., a[p], or by a frequency domain parameter sequence in which all of the frequency domain parameters ω[1], ω[2], ..., ω[p−1] are present between 0 and π and in which, when all of the linear prediction coefficients contained in the linear prediction coefficient sequence are 0, the frequency domain parameters ω[1], ω[2], ..., ω[p−1] are present between 0 and π at equal intervals.
  • The parameter sequence converting unit 20, similarly to the LSP linear transformation unit 300 and the decoded LSP linear transformation unit 400, applies an approximate linear transformation to the frequency domain parameter sequence ω[1], ω[2], ..., ω[p], making use of the nature of LSP parameters, to generate a converted frequency domain parameter sequence ~ω[1], ~ω[2], ..., ~ω[p].
  • The parameter sequence converting unit 20 determines the converted frequency domain parameters ~ω[1], ~ω[2], ..., ~ω[p] according to Formula (20) below and outputs them.
  • (~ω[1], ~ω[2], ..., ~ω[p])^T = K (ω[1] − π/(p+1), ω[2] − 2π/(p+1), ..., ω[p] − pπ/(p+1))^T (γ2 − γ1) + (ω[1], ω[2], ..., ω[p])^T   (20)
  • The frequency domain parameters ω[1], ω[2], ..., ω[p] are a frequency-domain parameter sequence, or the quantized values thereof, equivalent to a[1]×γ1, a[2]×(γ1)^2, ..., a[p]×(γ1)^p, which is a coefficient sequence that has been adjusted by multiplying each coefficient a[i] of the linear prediction coefficients a[1], a[2], ..., a[p] by the ith power of the factor γ1.
  • The converted frequency domain parameters ~ω[1], ~ω[2], ..., ~ω[p] are a series that approximates a frequency-domain parameter sequence equivalent to a[1]×γ2, a[2]×(γ2)^2, ..., a[p]×(γ2)^p, which is a coefficient sequence that has been adjusted by multiplying each coefficient a[i] of the linear prediction coefficients a[1], a[2], ..., a[p] by the ith power of the factor γ2.
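  • Read directly off Formula (20), the parameter sequence converting unit 20 can be sketched as follows; K is the pre-learned transformation matrix discussed above, and the equispaced offsets iπ/(p+1) come from the formula.

```python
import numpy as np

def parameter_sequence_convert(omega, K, gamma_1, gamma_2):
    """Formula (20): omega~ = K (omega - v) (gamma_2 - gamma_1) + omega,
    where v[i] = i * pi / (p + 1)."""
    omega = np.asarray(omega, dtype=float)
    p = len(omega)
    v = np.pi * np.arange(1, p + 1) / (p + 1)
    return (K @ (omega - v)) * (gamma_2 - gamma_1) + omega
```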
  • The frequency domain parameter sequence generating apparatus in the fifth embodiment is able to determine the converted frequency domain parameters from the frequency domain parameters with a smaller amount of calculation than when the converted frequency domain parameters are determined from the frequency domain parameters by way of linear prediction coefficients as in the encoding apparatus 1 and the decoding apparatus 2.
  • A program describing the processing details can be recorded in a computer-readable recording medium.
  • The computer-readable recording medium may be any kind of medium, such as a magnetic recording device, an optical disk, a magneto-optical recording medium, or a semiconductor memory, for example.
  • Such a program may be distributed by selling, granting, or lending a portable recording medium, such as a DVD or CD-ROM for example, having the program recorded thereon.
  • the program may be stored in a storage device at a server computer and transferred to other computers from the server computer over a network so as to distribute the program.
  • When a computer is to execute such a program, the computer first stores the program recorded on a portable recording medium, or the program transferred from the server computer, in its own storage device, for example. Then, when it carries out processing, the computer reads the program stored in its recording medium and performs processing in accordance with the program that has been read. As an alternative form of execution, the computer may read the program directly from a portable recording medium and perform processing in accordance with it, or the computer may perform processing sequentially in accordance with the received program every time a program is transferred to it from the server computer.
  • ASP (application service provider)
  • The term "program" in the embodiments described herein is intended to cover information that is used for processing by an electronic computer and is equivalent to a program (such as data that is not a direct instruction to a computer but has properties that govern the processing of the computer).

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
EP18200102.4A 2014-04-24 2015-02-16 Encoding method, encoding apparatus, corresponding program and recording medium Active EP3447766B1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP19216781.5A EP3648103B1 (en) 2014-04-24 2015-02-16 Decoding method, decoding apparatus, corresponding program and recording medium
PL18200102T PL3447766T3 (pl) 2014-04-24 2015-02-16 Sposób kodowania, urządzenie kodujące, odpowiedni program i nośnik zapisu
PL19216781T PL3648103T3 (pl) 2014-04-24 2015-02-16 Sposób dekodowania, urządzenie dekodujące, odpowiedni program i nośnik zapisu

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
JP2014089895 2014-04-24
EP15783646.1A EP3136387B1 (en) 2014-04-24 2015-02-16 Frequency domain parameter sequence generating method, encoding method, decoding method, frequency domain parameter sequence generating apparatus, encoding apparatus, decoding apparatus, program, and recording medium
PCT/JP2015/054135 WO2015162979A1 (ja) 2014-04-24 2015-02-16 周波数領域パラメータ列生成方法、符号化方法、復号方法、周波数領域パラメータ列生成装置、符号化装置、復号装置、プログラム及び記録媒体

Related Parent Applications (2)

Application Number Title Priority Date Filing Date
EP15783646.1A Division-Into EP3136387B1 (en) 2014-04-24 2015-02-16 Frequency domain parameter sequence generating method, encoding method, decoding method, frequency domain parameter sequence generating apparatus, encoding apparatus, decoding apparatus, program, and recording medium
EP15783646.1A Division EP3136387B1 (en) 2014-04-24 2015-02-16 Frequency domain parameter sequence generating method, encoding method, decoding method, frequency domain parameter sequence generating apparatus, encoding apparatus, decoding apparatus, program, and recording medium

Related Child Applications (2)

Application Number Title Priority Date Filing Date
EP19216781.5A Division EP3648103B1 (en) 2014-04-24 2015-02-16 Decoding method, decoding apparatus, corresponding program and recording medium
EP19216781.5A Division-Into EP3648103B1 (en) 2014-04-24 2015-02-16 Decoding method, decoding apparatus, corresponding program and recording medium

Publications (2)

Publication Number Publication Date
EP3447766A1 EP3447766A1 (en) 2019-02-27
EP3447766B1 true EP3447766B1 (en) 2020-04-08

Family

ID=54332153

Family Applications (3)

Application Number Title Priority Date Filing Date
EP19216781.5A Active EP3648103B1 (en) 2014-04-24 2015-02-16 Decoding method, decoding apparatus, corresponding program and recording medium
EP15783646.1A Active EP3136387B1 (en) 2014-04-24 2015-02-16 Frequency domain parameter sequence generating method, encoding method, decoding method, frequency domain parameter sequence generating apparatus, encoding apparatus, decoding apparatus, program, and recording medium
EP18200102.4A Active EP3447766B1 (en) 2014-04-24 2015-02-16 Encoding method, encoding apparatus, corresponding program and recording medium

Family Applications Before (2)

Application Number Title Priority Date Filing Date
EP19216781.5A Active EP3648103B1 (en) 2014-04-24 2015-02-16 Decoding method, decoding apparatus, corresponding program and recording medium
EP15783646.1A Active EP3136387B1 (en) 2014-04-24 2015-02-16 Frequency domain parameter sequence generating method, encoding method, decoding method, frequency domain parameter sequence generating apparatus, encoding apparatus, decoding apparatus, program, and recording medium

Country Status (9)

Country Link
US (3) US10332533B2 (es)
EP (3) EP3648103B1 (es)
JP (4) JP6270992B2 (es)
KR (3) KR101872905B1 (es)
CN (3) CN110503963B (es)
ES (3) ES2795198T3 (es)
PL (3) PL3447766T3 (es)
TR (1) TR201900472T4 (es)
WO (1) WO2015162979A1 (es)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TR201900472T4 (tr) * 2014-04-24 2019-02-21 Nippon Telegraph & Telephone Frekans alanı parametre dizisi oluşturma metodu, kodlama metodu, kod çözme metodu, frekans alanı parametre dizisi oluşturma aparatı, kodlama aparatı, kod çözme aparatı, programı ve kayıt ortamı.
JP6517924B2 (ja) * 2015-04-13 2019-05-22 日本電信電話株式会社 線形予測符号化装置、方法、プログラム及び記録媒体
JP7395901B2 (ja) * 2019-09-19 2023-12-12 ヤマハ株式会社 コンテンツ制御装置、コンテンツ制御方法およびプログラム
CN116151130B (zh) * 2023-04-19 2023-08-15 国网浙江新兴科技有限公司 风电场最大频率阻尼系数计算方法、装置、设备及介质

Family Cites Families (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS58181096A (ja) * 1982-04-19 1983-10-22 株式会社日立製作所 音声分析合成方式
US5003604A (en) * 1988-03-14 1991-03-26 Fujitsu Limited Voice coding apparatus
JP2659605B2 (ja) * 1990-04-23 1997-09-30 三菱電機株式会社 音声復号化装置及び音声符号化・復号化装置
US5504833A (en) * 1991-08-22 1996-04-02 George; E. Bryan Speech approximation using successive sinusoidal overlap-add models and pitch-scale modifications
US5327518A (en) * 1991-08-22 1994-07-05 Georgia Tech Research Corporation Audio analysis/synthesis system
JP2993396B2 (ja) * 1995-05-12 1999-12-20 三菱電機株式会社 音声加工フィルタ及び音声合成装置
JP2778567B2 (ja) * 1995-12-23 1998-07-23 日本電気株式会社 信号符号化装置及び方法
JPH09230896A (ja) 1996-02-28 1997-09-05 Sony Corp 音声合成装置
FI964975A (fi) * 1996-12-12 1998-06-13 Nokia Mobile Phones Ltd Menetelmä ja laite puheen koodaamiseksi
US7272556B1 (en) * 1998-09-23 2007-09-18 Lucent Technologies Inc. Scalable and embedded codec for speech and audio signals
JP2000242298A (ja) * 1999-02-24 2000-09-08 Mitsubishi Electric Corp Lsp補正装置,音声符号化装置及び音声復号化装置
JP2000250597A (ja) * 1999-02-24 2000-09-14 Mitsubishi Electric Corp Lsp補正装置,音声符号化装置及び音声復号化装置
DE60128677T2 (de) * 2000-04-24 2008-03-06 Qualcomm, Inc., San Diego Verfahren und vorrichtung zur prädiktiven quantisierung von stimmhaften sprachsignalen
KR100910282B1 (ko) * 2000-11-30 2009-08-03 파나소닉 주식회사 Lpc 파라미터의 벡터 양자화 장치, lpc 파라미터복호화 장치, 기록 매체, 음성 부호화 장치, 음성 복호화장치, 음성 신호 송신 장치, 및 음성 신호 수신 장치
US7003454B2 (en) * 2001-05-16 2006-02-21 Nokia Corporation Method and system for line spectral frequency vector quantization in speech codec
JP3859462B2 (ja) * 2001-05-18 2006-12-20 株式会社東芝 予測パラメータ分析装置および予測パラメータ分析方法
JP4413480B2 (ja) * 2002-08-29 2010-02-10 富士通株式会社 音声処理装置及び移動通信端末装置
KR20070009644A (ko) * 2004-04-27 2007-01-18 마츠시타 덴끼 산교 가부시키가이샤 스케일러블 부호화 장치, 스케일러블 복호화 장치 및 그방법
CN101656075B (zh) * 2004-05-14 2012-08-29 松下电器产业株式会社 音频解码装置、音频解码方法以及通信终端和基站装置
US7742912B2 (en) * 2004-06-21 2010-06-22 Koninklijke Philips Electronics N.V. Method and apparatus to encode and decode multi-channel audio signals
US8239190B2 (en) * 2006-08-22 2012-08-07 Qualcomm Incorporated Time-warping frames of wideband vocoder
KR101565919B1 (ko) * 2006-11-17 2015-11-05 삼성전자주식회사 고주파수 신호 부호화 및 복호화 방법 및 장치
US8688437B2 (en) * 2006-12-26 2014-04-01 Huawei Technologies Co., Ltd. Packet loss concealment for speech coding
JP5006774B2 (ja) * 2007-12-04 2012-08-22 日本電信電話株式会社 符号化方法、復号化方法、これらの方法を用いた装置、プログラム、記録媒体
EP2077551B1 (en) * 2008-01-04 2011-03-02 Dolby Sweden AB Audio encoder and decoder
EP2234273B8 (en) * 2008-01-24 2013-08-07 Nippon Telegraph and Telephone Corporation Coding method, decoding method, apparatuses thereof, programs thereof, and recording medium
US8909521B2 (en) * 2009-06-03 2014-12-09 Nippon Telegraph And Telephone Corporation Coding method, coding apparatus, coding program, and recording medium therefor
JP5223786B2 (ja) * 2009-06-10 2013-06-26 富士通株式会社 音声帯域拡張装置、音声帯域拡張方法及び音声帯域拡張用コンピュータプログラムならびに電話機
EP2551848A4 (en) * 2010-03-23 2016-07-27 Lg Electronics Inc METHOD AND APPARATUS FOR PROCESSING AUDIO SIGNAL
CA2793140C (en) * 2010-04-09 2016-05-31 Dolby International Ab Mdct-based complex prediction stereo coding
EP2596494B1 (en) * 2010-07-20 2020-08-05 Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung e.V. Audio decoder, audio decoding method and computer program
KR101747917B1 (ko) * 2010-10-18 2017-06-15 삼성전자주식회사 선형 예측 계수를 양자화하기 위한 저복잡도를 가지는 가중치 함수 결정 장치 및 방법
JP5694751B2 (ja) * 2010-12-13 2015-04-01 日本電信電話株式会社 符号化方法、復号方法、符号化装置、復号装置、プログラム、記録媒体
CN103329199B (zh) * 2011-01-25 2015-04-08 日本电信电话株式会社 编码方法、编码装置、周期性特征量决定方法、周期性特征量决定装置、程序、记录介质
RU2559709C2 (ru) * 2011-02-16 2015-08-10 Ниппон Телеграф Энд Телефон Корпорейшн Способ кодирования, способ декодирования, кодер, декодер, программа и носитель записи
US10515643B2 (en) * 2011-04-05 2019-12-24 Nippon Telegraph And Telephone Corporation Encoding method, decoding method, encoder, decoder, program, and recording medium
TWI672691B (zh) * 2011-04-21 2019-09-21 南韓商三星電子股份有限公司 解碼方法
US9916538B2 (en) * 2012-09-15 2018-03-13 Z Advanced Computing, Inc. Method and system for feature detection
US9524725B2 (en) * 2012-10-01 2016-12-20 Nippon Telegraph And Telephone Corporation Encoding method, encoder, program and recording medium
WO2014144579A1 (en) * 2013-03-15 2014-09-18 Apple Inc. System and method for updating an adaptive speech recognition model
TR201900472T4 (tr) * 2014-04-24 2019-02-21 Nippon Telegraph & Telephone Frekans alanı parametre dizisi oluşturma metodu, kodlama metodu, kod çözme metodu, frekans alanı parametre dizisi oluşturma aparatı, kodlama aparatı, kod çözme aparatı, programı ve kayıt ortamı.
US20160292445A1 (en) * 2015-03-31 2016-10-06 Secude Ag Context-based data classification
US20170154188A1 (en) * 2015-03-31 2017-06-01 Philipp MEIER Context-sensitive copy and paste block
US10542961B2 (en) * 2015-06-15 2020-01-28 The Research Foundation For The State University Of New York System and method for infrasonic cardiac monitoring
US10839302B2 (en) * 2015-11-24 2020-11-17 The Research Foundation For The State University Of New York Approximate value iteration with complex returns by bounding
US11205103B2 (en) * 2016-12-09 2021-12-21 The Research Foundation for the State University Semisupervised autoencoder for sentiment analysis
US11568236B2 (en) * 2018-01-25 2023-01-31 The Research Foundation For The State University Of New York Framework and methods of diverse exploration for fast and safe policy improvement

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Digital cellular telecommunications system (Phase 2+); Universal Mobile Telecommunications System (UMTS); LTE; Audio codec processing functions; Extended Adaptive Multi-Rate - Wideband (AMR-WB+) codec; Transcoding functions (3GPP TS 26.290 version 11.0.0 Release 11)", TECHNICAL SPECIFICATION, EUROPEAN TELECOMMUNICATIONS STANDARDS INSTITUTE (ETSI), 650, ROUTE DES LUCIOLES ; F-06921 SOPHIA-ANTIPOLIS ; FRANCE, vol. 3GPP SA 4, no. V11.0.0, 1 October 2012 (2012-10-01), XP014075402 *
"Digital cellular telecommunications system (Phase 2+); Universal Mobile Telecommunications System (UMTS); LTE; Speech codec speech processing functions; Adaptive Multi-Rate - Wideband (AMR-WB) speech codec; Transcoding functions (3GPP TS 26.190 version 11.0.0 Release 11)", TECHNICAL SPECIFICATION, EUROPEAN TELECOMMUNICATIONS STANDARDS INSTITUTE (ETSI), 650, ROUTE DES LUCIOLES ; F-06921 SOPHIA-ANTIPOLIS ; FRANCE, vol. 3GPP SA 4, no. V11.0.0, 1 October 2012 (2012-10-01), XP014075377 *

Also Published As

Publication number Publication date
JP2019091075A (ja) 2019-06-13
CN110503963B (zh) 2022-10-04
EP3648103A1 (en) 2020-05-06
PL3447766T3 (pl) 2020-08-24
PL3136387T3 (pl) 2019-05-31
JP2018077501A (ja) 2018-05-17
KR20180074810A (ko) 2018-07-03
US20170249947A1 (en) 2017-08-31
ES2901749T3 (es) 2022-03-23
EP3136387A4 (en) 2017-09-13
CN110503963A (zh) 2019-11-26
US20200043506A1 (en) 2020-02-06
WO2015162979A1 (ja) 2015-10-29
US10332533B2 (en) 2019-06-25
JP2018067010A (ja) 2018-04-26
EP3136387A1 (en) 2017-03-01
CN110503964A (zh) 2019-11-26
CN106233383B (zh) 2019-11-01
ES2795198T3 (es) 2020-11-23
EP3136387B1 (en) 2018-12-12
US20190259403A1 (en) 2019-08-22
US10504533B2 (en) 2019-12-10
KR20180074811A (ko) 2018-07-03
CN110503964B (zh) 2022-10-04
JPWO2015162979A1 (ja) 2017-04-13
ES2713410T3 (es) 2019-05-21
JP6486450B2 (ja) 2019-03-20
KR101872905B1 (ko) 2018-08-03
CN106233383A (zh) 2016-12-14
TR201900472T4 (tr) 2019-02-21
PL3648103T3 (pl) 2022-02-07
US10643631B2 (en) 2020-05-05
EP3648103B1 (en) 2021-10-20
JP6484325B2 (ja) 2019-03-13
JP6270992B2 (ja) 2018-01-31
EP3447766A1 (en) 2019-02-27
KR101972087B1 (ko) 2019-04-24
KR20160135328A (ko) 2016-11-25
JP6650540B2 (ja) 2020-02-19
KR101972007B1 (ko) 2019-04-24

Similar Documents

Publication Publication Date Title
US10643631B2 (en) Decoding method, apparatus and recording medium
EP1995723B1 (en) Neuroevolution training system
CN107248411A (zh) 丢帧补偿处理方法和装置
EP3226243B1 (en) Encoding apparatus, decoding apparatus, and method and program for the same
EP2869299B1 (en) Decoding method, decoding apparatus, program, and recording medium therefor
JPH0782360B2 (ja) 音声分析合成方法
JP2000235400A (ja) 音響信号符号化装置、復号化装置、これらの方法、及びプログラム記録媒体
JP3024467B2 (ja) 音声符号化装置
JP2016509694A (ja) 音声信号を合成するための装置及び方法、デコーダ、エンコーダ、システム及びコンピュータプログラム
JP3144244B2 (ja) 音声符号化装置
JPH0455899A (ja) 音声信号符号化方式
JPH0844397A (ja) 音声符号化装置
JP2002244700A (ja) 音声符号化装置、音声符号化方法および記憶素子

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20181012

AC Divisional application: reference to earlier application

Ref document number: 3136387

Country of ref document: EP

Kind code of ref document: P

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: EXAMINATION IS IN PROGRESS

17Q First examination report despatched

Effective date: 20190617

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: GRANT OF PATENT IS INTENDED

INTG Intention to grant announced

Effective date: 20191007

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE PATENT HAS BEEN GRANTED

AC Divisional application: reference to earlier application

Ref document number: 3136387

Country of ref document: EP

Kind code of ref document: P

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

REG Reference to a national code

Ref country code: AT

Ref legal event code: REF

Ref document number: 1255446

Country of ref document: AT

Kind code of ref document: T

Effective date: 20200415

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602015050579

Country of ref document: DE

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: NL

Ref legal event code: FP

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG4D

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200817

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200709

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200708

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200808

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 1255446

Country of ref document: AT

Kind code of ref document: T

Effective date: 20200408

REG Reference to a national code

Ref country code: ES

Ref legal event code: FG2A

Ref document number: 2795198

Country of ref document: ES

Kind code of ref document: T3

Effective date: 20201123

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200708

Ref country code: RS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: AL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602015050579

Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

Ref country code: SM

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

26N No opposition filed

Effective date: 20210112

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

REG Reference to a national code

Ref country code: BE

Ref legal event code: MM

Effective date: 20210228

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210216

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210228

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210228

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210216

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210228

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO

Effective date: 20150216

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: ES

Payment date: 20230427

Year of fee payment: 9

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: NL

Payment date: 20240219

Year of fee payment: 10

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20200408

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20240219

Year of fee payment: 10

Ref country code: GB

Payment date: 20240219

Year of fee payment: 10

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: TR

Payment date: 20240210

Year of fee payment: 10

Ref country code: PL

Payment date: 20240208

Year of fee payment: 10

Ref country code: IT

Payment date: 20240228

Year of fee payment: 10

Ref country code: FR

Payment date: 20240222

Year of fee payment: 10