CN109256143A - Speech parameter quantization method, device, computer equipment and storage medium - Google Patents
Speech parameter quantization method, device, computer equipment and storage medium
- Publication number
- CN109256143A (application CN201811109230.6A)
- Authority
- CN
- China
- Prior art keywords
- frame
- parameter
- quantization
- voice signal
- lsf
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/04—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
- G10L19/16—Vocoder architecture
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/032—Quantisation or dequantisation of spectral components
Abstract
The present application relates to a speech parameter quantization method and apparatus, a computer device, and a storage medium. The method includes: obtaining a speech parameter of a speech signal using a preset parameter analysis method; and, according to the transmission rate of the speech signal and the speech parameter, determining a quantization method corresponding to the speech parameter and quantizing the speech parameter with it. Because different quantization methods can be set flexibly for different transmission rates and different speech parameters, the method adapts to a variety of scenarios and meets differing user requirements.
Description
Technical field
The present application relates to the field of audio technology, and in particular to a speech parameter quantization method and apparatus, a computer device, and a storage medium.
Background technique
A vocoder is a codec that analyzes and synthesizes speech, also known as a speech analysis-synthesis system or a speech-band compression system; it compresses the voice communication band and is a powerful tool for secure communication.
A vocoder can be divided into an encoder and a decoder. The encoder converts the audio signal into a bitstream for channel transmission, while the decoder recovers the speech-synthesis parameters from the bitstream, performs speech synthesis, and outputs audio data. Currently, common speech parameter quantization schemes used in encoders include the Internet Low Bitrate Codec (iLBC), the Enhanced Variable Rate Codec (EVRC), and Adaptive Multi-Rate (AMR) speech coding.
However, these speech parameter quantization schemes are inflexible and do not adapt well to varied scenarios.
Summary of the invention
Accordingly, in view of the above technical problems, it is necessary to provide a speech parameter quantization method, apparatus, computer device, and storage medium that can flexibly adapt to multiple scenarios.
A speech parameter quantization method, the method comprising:
obtaining a speech parameter of a speech signal using a preset parameter analysis method;
according to the transmission rate of the speech signal and the speech parameter, determining a quantization method corresponding to the speech parameter and quantizing the speech parameter with it.
In one embodiment, determining a quantization method corresponding to the speech parameter according to the transmission rate of the speech signal and the speech parameter, and quantizing the speech parameter, comprises:
if the transmission rate of the speech signal is 2400 bps and the speech parameter is the pitch period, quantizing the pitch period according to the periodicity attribute of the speech signal or the voiced/unvoiced type of the current frame of the transmitted speech signal.
In one embodiment, quantizing the pitch period according to the periodicity attribute of the speech signal or the voiced/unvoiced type of the current frame of the transmitted speech signal comprises:
if the current frame is a voiced frame, applying a logarithmic transform to the pitch period of the current frame and uniformly quantizing the transform result with a preset number of levels;
if the current frame is an unvoiced frame, or the periodicity attribute of the speech signal is aperiodic, applying bit quantization to the pitch period of the speech signal.
In one embodiment, applying bit quantization to the pitch period when the current frame is an unvoiced frame, or when the periodicity attribute of the speech signal is aperiodic, comprises:
if the current frame is an unvoiced frame, setting the bits corresponding to the pitch period of the speech signal to a first value;
if the periodicity attribute of the speech signal is aperiodic, setting the bits corresponding to the pitch period of the speech signal to a second value.
In one embodiment, determining a quantization method corresponding to the speech parameter according to the transmission rate of the speech signal and the speech parameter, and quantizing the speech parameter, comprises:
if the transmission rate of the speech signal is 2400 bps and the speech parameter is a line spectral frequency (LSF) parameter, quantizing the LSF parameter using a preset three-stage vector codebook.
In one embodiment, determining a quantization method corresponding to the speech parameter according to the transmission rate of the speech signal and the speech parameter, and quantizing the speech parameter, comprises:
if the transmission rate of the speech signal is 1200 bps, determining the superframe mode of the speech signal according to the voiced/unvoiced types of the time frames carrying the speech signal;
according to the speech parameter and the superframe mode, determining a quantization method corresponding to the speech parameter and quantizing the speech parameter with it.
In one embodiment, determining a quantization method corresponding to the speech parameter according to the speech parameter and the superframe mode, and quantizing the speech parameter, comprises:
if the speech parameter is an LSF parameter, quantizing the LSF parameter with a preset codebook quantization method according to the superframe mode.
In one embodiment, quantizing the LSF parameter with a preset codebook quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames of which at least two are voiced frames, quantizing the LSF parameter of the third frame of the current superframe using a preset three-stage quantization codebook, and quantizing the LSF parameters of the first and second frames according to the quantized LSF value of the third frame of the previous superframe and the quantized LSF value of the third frame of the current superframe, the third frame being the last time frame of the superframe in temporal order.
In one embodiment, quantizing the LSF parameters of the first and second frames according to the quantized LSF value of the third frame of the previous superframe and the quantized LSF value of the third frame of the current superframe comprises:
for each predictive coefficient in a predictive-coefficient codebook, determining the corresponding quantized LSF values of the first frame and the second frame from the quantized LSF value of the third frame of the previous superframe and the quantized LSF value of the third frame of the current superframe;
determining a target predictive coefficient from the LSF parameters of the first and second frames and their quantized LSF values;
determining a residual vector from the target predictive coefficient, and quantizing the residual vector using a preset two-stage vector codebook.
In one embodiment, determining the target predictive coefficient from the LSF parameters of the first and second frames and their quantized LSF values comprises:
determining the prediction error corresponding to each predictive coefficient in the predictive-coefficient codebook from the LSF parameters of the first and second frames and their quantized LSF values;
taking the predictive coefficient with the smallest prediction error as the target predictive coefficient.
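The predictive-coefficient search just described can be sketched in Python. The excerpt does not give the exact predictor form, so this sketch assumes frames 1 and 2 are predicted by linearly interpolating between the quantized third-frame LSFs of the previous and current superframes, scaled by the candidate coefficient rho; the function and variable names (`select_predictor`, `coeff_book`) are illustrative, not from the patent.

```python
import numpy as np

def select_predictor(f1, f2, q_prev3, q_cur3, coeff_book):
    """Pick the predictive coefficient with the smallest total prediction
    error over frames 1 and 2 of the current superframe.

    Assumed predictor (not specified in this excerpt): frame k is
    predicted as q_prev3 + rho * (k/3) * (q_cur3 - q_prev3).
    """
    best_rho, best_err = None, np.inf
    for rho in coeff_book:
        p1 = q_prev3 + rho * (1.0 / 3.0) * (q_cur3 - q_prev3)
        p2 = q_prev3 + rho * (2.0 / 3.0) * (q_cur3 - q_prev3)
        err = np.sum((f1 - p1) ** 2) + np.sum((f2 - p2) ** 2)
        if err < best_err:
            best_rho, best_err = rho, err
    # residual vectors, to be quantized by the two-stage vector codebook
    r1 = f1 - (q_prev3 + best_rho * (1.0 / 3.0) * (q_cur3 - q_prev3))
    r2 = f2 - (q_prev3 + best_rho * (2.0 / 3.0) * (q_cur3 - q_prev3))
    return best_rho, r1, r2
```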
In one embodiment, quantizing the LSF parameter with a preset codebook quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames of which exactly one is a voiced frame, quantizing the LSF parameter of the voiced frame in the superframe using a preset three-stage vector codebook, and quantizing the LSF parameters of the unvoiced frames in the superframe using a preset one-stage vector codebook.
In one embodiment, quantizing the LSF parameter with a preset codebook quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames that are all unvoiced frames, quantizing the LSF parameters of the unvoiced frames in the superframe using a preset one-stage vector codebook.
In one embodiment, determining a quantization method corresponding to the speech parameter according to the speech parameter and the superframe mode, and quantizing the speech parameter, comprises:
if the speech parameters are the pitch period and the voiced/unvoiced type, quantizing the pitch period and the voiced/unvoiced type with a preset bit quantization method according to the superframe mode.
In one embodiment, quantizing the pitch period and the voiced/unvoiced type with the preset 5-bit quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames and all three are unvoiced frames, setting the bits corresponding to the pitch period and the voiced/unvoiced type to the second value.
In one embodiment, quantizing the pitch period and the voiced/unvoiced type with the preset 5-bit quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames containing one voiced frame, setting the bit corresponding to the voiced/unvoiced type to the second value, applying a logarithmic transform to the pitch period of the voiced frame in the superframe, and determining the target quantized value according to the transform result and the voiced/unvoiced type.
In one embodiment, determining the target quantized value according to the transform result and the voiced/unvoiced type comprises:
uniformly quantizing the transform result to obtain a uniform quantization coefficient;
determining the target quantized value according to the correspondence among the uniform quantization coefficient, the voiced/unvoiced type, and preset codebook indices.
In one embodiment, quantizing the pitch period and the voiced/unvoiced type with the preset 5-bit quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames of which two are voiced frames, setting the bits corresponding to the voiced/unvoiced types of the voiced frames to the second value and the bit corresponding to the voiced/unvoiced type of the unvoiced frame to the first value, and quantizing the pitch periods of the three adjacent time frames with a preset vector codebook.
In one embodiment, quantizing the pitch period and the voiced/unvoiced type with the preset 5-bit quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames including a voiced frame, obtaining an N-bit codebook index from a preset vector codebook, quantizing the voiced/unvoiced type with the upper M bits of the N-bit codebook index, and taking the lower N−M bits of the N-bit codebook index as the quantized value of the pitch period, where N and M are positive integers and N is greater than M.
A speech parameter quantization apparatus, comprising:
an obtaining module, configured to obtain a speech parameter of a speech signal using a preset parameter analysis method;
a determining module, configured to determine, according to the transmission rate of the speech signal and the speech parameter, a quantization method corresponding to the speech parameter and to quantize the speech parameter with it.
A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the following steps:
obtaining a speech parameter of a speech signal using a preset parameter analysis method;
according to the transmission rate of the speech signal and the speech parameter, determining a quantization method corresponding to the speech parameter and quantizing the speech parameter with it.
A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the following steps:
obtaining a speech parameter of a speech signal using a preset parameter analysis method;
according to the transmission rate of the speech signal and the speech parameter, determining a quantization method corresponding to the speech parameter and quantizing the speech parameter with it.
With the above speech parameter quantization method, apparatus, computer device, and storage medium, a speech parameter of the speech signal is obtained with a preset parameter analysis method, and a quantization method corresponding to the speech parameter is determined according to the transmission rate of the speech signal and the speech parameter, which is then quantized with it. Because different quantization methods can be set flexibly for different transmission rates and different speech parameters, the approach suits different scenarios and meets differing user requirements.
Brief description of the drawings
- Fig. 1 is a flowchart of a speech parameter quantization method provided by one embodiment of the application;
- Fig. 2 is a flowchart of a speech parameter quantization method provided by another embodiment of the application;
- Fig. 3 is a block diagram of a speech parameter quantization apparatus provided by one embodiment of the application;
- Fig. 4 is a block diagram of a speech parameter quantization apparatus provided by another embodiment of the application;
- Fig. 5 is a block diagram of a speech parameter quantization apparatus provided by another embodiment of the application;
- Fig. 6 is an internal structure diagram of a computer device in one embodiment.
Detailed description of the embodiments
In order to make the objects, technical solutions, and advantages of the present application clearer, the application is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the application, not to limit it.
A vocoder can generally be divided into an encoder and a decoder. The encoder converts an 8000 Hz, 16-bit uniformly quantized audio signal into a 2400 bps or 1200 bps bitstream for transmission; the decoder recovers the speech-synthesis parameters from the bitstream, synthesizes the audio signal, and outputs 8000 Hz, 16-bit audio data. In this application the encoder is the executing body, and the description focuses on the quantization schemes in the encoder; the decoding process is the inverse of encoding.
Fig. 1 is a flowchart of a speech parameter quantization method provided by an embodiment of the application; the executing body of the method is the encoder. As shown in Fig. 1, the method comprises:
Step 101: obtain the speech parameters of the speech signal using a preset parameter analysis method.
The speech parameters may include the pitch period, sub-band voiced/unvoiced types, linear predictive coding (LPC) parameters, line spectral frequency (LSF) parameters, gain parameters, Fourier magnitude parameters, and so on.
In this embodiment, LPC parameters are extracted with a 10th-order linear filter. Specifically, the LPC coefficients a_i can be extracted via the synthesis filter transfer function (the original formula image is not reproduced in this text; for a 10th-order predictor it takes the standard form H(z) = 1 / (1 − Σ_{i=1..10} a_i z^−i)). After LPC analysis, the LPC coefficients can be converted to LSF parameters. The pitch period of the speech signal can be obtained with the normalized average magnitude difference method or the autocorrelation function method, and can be smoothed to yield an integer pitch period. For the sub-band voiced/unvoiced types, the speech signal can be divided into 5 sub-bands; the voicing strength is computed on each sub-band separately and a voiced/unvoiced decision is made, giving the voiced/unvoiced type. For each input frame, two gain values G1 and G2 can be computed at different window positions as the gain parameters. The magnitudes of the first ten harmonics of the pitch frequency in the residual spectrum of the speech signal can also be extracted as the Fourier magnitude parameters. The application does not limit how the speech parameters are obtained.
Step 102: according to the transmission rate of the speech signal and the speech parameter, determine the quantization method corresponding to the speech parameter and quantize the speech parameter.
The transmission rate of the speech signal is usually 2400 bps or 1200 bps, but other rates are possible; the application is not limited in this respect.
In this embodiment, different quantization methods can be set as needed for different transmission rates and different speech parameters of the speech signal. For example, the LSF parameters at a 2400 bps transmission rate can be quantized with a three-stage codebook; for the LSF parameters at 1200 bps, different quantization methods can be set for different superframe modes of the speech signal; at 2400 bps the pitch period and the aperiodic flag can be quantized jointly; at 1200 bps the pitch period and the voiced/unvoiced type can be quantized jointly; and so on.
With the speech parameter quantization method provided by this embodiment, the speech parameters of the speech signal are obtained with a preset parameter analysis method, and a quantization method corresponding to each parameter is determined according to the transmission rate of the signal and the parameter, which is then quantized with it. Different quantization methods can be set flexibly for different transmission rates and speech parameters, suiting different scenarios and meeting differing user requirements.
The following describes the quantization of the pitch period when the transmission rate is 2400 bps.
In one embodiment, determining the quantization method corresponding to the speech parameter according to the transmission rate of the speech signal and the speech parameter, and quantizing the speech parameter, comprises: if the transmission rate of the speech signal is 2400 bps and the speech parameter is the pitch period, quantizing the pitch period according to the periodicity attribute of the speech signal or the voiced/unvoiced type of the current frame of the transmitted speech signal.
In this embodiment, if the transmission rate of the speech signal is 2400 bps, the pitch period and the aperiodic flag of the speech signal can be quantized jointly according to the periodicity attribute of the signal or the voiced/unvoiced type of the current frame. For example, the pitch period is quantized and encoded with 8 bits, and the resulting quantized value simultaneously indicates the voiced/unvoiced type of the frame and the aperiodic flag.
Optionally, the step of quantizing the pitch period according to the periodicity attribute of the speech signal or the voiced/unvoiced type of the current frame comprises: if the current frame is a voiced frame, applying a logarithmic transform to the pitch period of the current frame and uniformly quantizing the transform result with a preset number of levels; if the current frame is an unvoiced frame, or the periodicity attribute of the speech signal is aperiodic, applying bit quantization to the pitch period of the speech signal.
In this embodiment, the voiced/unvoiced type of the current frame is first determined. If the current frame is voiced, the pitch period of the current frame is first converted to its logarithm (base 10 or base 2) and then uniformly quantized with 254 levels; if the pitch period range is lg(20) to lg(144), the quantization result is an index between 2 and 255. If the current frame is unvoiced, or the periodicity attribute of the signal is aperiodic, the pitch period of the speech signal is bit-quantized with preset quantized values.
Optionally, applying bit quantization to the pitch period when the current frame is an unvoiced frame, or when the periodicity attribute of the speech signal is aperiodic, comprises: if the current frame is an unvoiced frame, setting the bits corresponding to the pitch period of the speech signal to a first value; if the periodicity attribute of the speech signal is aperiodic, setting the bits corresponding to the pitch period of the speech signal to a second value.
In this embodiment, the first value can be 1 and the second value 0, or the first value can be 0 and the second value 1; the examples in this application take the first value as 1 and the second value as 0, without limitation. For example, with 8-bit quantization encoding, if the current frame is unvoiced, all 8 bits are set to 1; if the aperiodic flag of the current frame is 1, all 8 bits are set to 0.
In this embodiment of the application, when the transmission rate of the speech signal is 2400 bps, the pitch period and the aperiodic flag are quantized jointly, so the quantized pitch value indicates not only the voiced/unvoiced type of the frame but also the aperiodic flag. The range the pitch quantization can represent is therefore wider, and the computational load of the method is low.
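The 8-bit joint quantization just described can be sketched as follows in Python. The special codewords for the unvoiced and aperiodic cases (all ones / all zeros) follow the example above; the rounding convention is an assumption.

```python
import math

LOG_PMIN = math.log10(20.0)    # lg(20)
LOG_PMAX = math.log10(144.0)   # lg(144)

def quantize_pitch_8bit(pitch, voiced, aperiodic):
    """Jointly quantize the pitch period, voicing and the aperiodic flag
    into one 8-bit codeword, following the example above."""
    if aperiodic:
        return 0b00000000            # all bits 0: aperiodic signal
    if not voiced:
        return 0b11111111            # all bits 1: unvoiced frame
    # voiced frame: 254-level uniform quantization of lg(pitch), indices 2..255
    t = min(max(math.log10(pitch), LOG_PMIN), LOG_PMAX)
    return 2 + round(253 * (t - LOG_PMIN) / (LOG_PMAX - LOG_PMIN))
```

Note that the voiced index range 2-255 stated in the text overlaps the all-ones unvoiced code at 255; a real coder would resolve this (e.g. by capping voiced indices at 254), but the sketch follows the text as written.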
The following describes the quantization of the LSF parameters when the transmission rate is 2400 bps.
In one embodiment, determining the quantization method corresponding to the speech parameter according to the transmission rate of the speech signal and the speech parameter, and quantizing the speech parameter, comprises: if the transmission rate of the speech signal is 2400 bps and the speech parameter is an LSF parameter, quantizing the LSF parameter using a preset three-stage vector codebook.
In this embodiment, based on a multi-stage vector quantization (MSVQ) scheme, a three-stage (7+6+6) vector codebook can be used to quantize the LSF parameters, containing 128, 64, and 64 LSF codebook vectors respectively; the quantization result is the set of codebook vector indices, occupying 19 bits in total. The quantization proceeds as follows. First, the LSF parameters converted from the LPC parameters are sorted in ascending order, keeping a spacing of at least 50 Hz between adjacent LSF parameters, which gives the sorted LSF parameter vector f. Then each vector codebook is searched under a weighted Euclidean distance criterion; the original formula image is not reproduced in this text, but the criterion has the standard form
d_lsp(f, f') = Σ_{i=1..10} w(i) [f(i) − f'(i)]²,
where f' is the accumulated codebook vector over the three stages. The codebook vector f' that minimizes d_lsp(f, f') is the search result, i.e. the quantized value of the LSF parameter. w(i) is a weighting coefficient computed from the power spectrum; the original formula image is likewise lost, but in MELP-style coders it is commonly a fractional power of p(f(i)), where p(f(i)) is the power spectral density of the prediction filter output at the frequency of the i-th line spectral frequency parameter.
In this embodiment of the application, if the transmission rate of the speech signal is 2400 bps and the speech parameter is an LSF parameter, the LSF parameter is quantized with a preset three-stage vector codebook, and the computational load is small.
The following describes the gain quantization when the transmission rate is 2400 bps.
In this embodiment, two gains G1 and G2 must be quantized per frame. G2 is uniformly quantized with 5 bits over the range 10 dB to 77 dB. Quantizing G1 requires G2 and PG2 (the G2 of the previous frame), using the following adaptive algorithm with 3 bits: if |G2 − PG2| < 5 and |G1 − 0.5(G2 + PG2)| < 3.0, the 3-bit quantization code of G1 is 0; otherwise, G1 is uniformly quantized to 7 levels over the range (g_min, g_max), the result being an index between 1 and 7. Here g_min and g_max are computed as g_min = min(G2, PG2) − 6.0 and g_max = max(G2, PG2) + 6.0, and are limited to the range 10 to 77.
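The gain rules above can be sketched as follows in Python; the rounding convention of the uniform quantizer is an assumption.

```python
def uniform_index(x, lo, hi, levels):
    """Uniform quantizer index in 0..levels-1 over [lo, hi]."""
    x = min(max(x, lo), hi)
    step = (hi - lo) / (levels - 1)
    return round((x - lo) / step)

def quantize_gains(g1, g2, pg2):
    """Quantize G2 with 5 bits (10..77 dB) and G1 adaptively with 3 bits,
    following the rules above; pg2 is the previous frame's G2."""
    g2_idx = uniform_index(g2, 10.0, 77.0, 32)
    if abs(g2 - pg2) < 5.0 and abs(g1 - 0.5 * (g2 + pg2)) < 3.0:
        g1_idx = 0                    # steady-gain case: code 0
    else:
        g_min = min(max(min(g2, pg2) - 6.0, 10.0), 77.0)
        g_max = min(max(max(g2, pg2) + 6.0, 10.0), 77.0)
        g1_idx = 1 + uniform_index(g1, g_min, g_max, 7)   # index 1..7
    return g1_idx, g2_idx
```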
The following describes the quantization of the voiced/unvoiced types when the transmission rate is 2400 bps.
In this embodiment, the voicing type of the subbands can be quantized using the voiced/unvoiced (U/V) decisions of 5 subbands, with each subband indicated by 1 bit; for example, a bit value of 1 indicates voiced and a bit value of 0 indicates unvoiced. The 5 subbands would require 5 bits in total, but since the voicing of the lowest-frequency subband is determined from the quantized value of the pitch period, only the voicing of the 4 higher-frequency subbands needs to be quantized, so 4 bits are actually required. The specific quantization rules are as follows:
If the voicing decision of the lowest-frequency subband is unvoiced, the current frame is an unvoiced frame, and the U/V decisions of the 4 higher-frequency subbands are all quantized to 0. If the voicing decision of the lowest-frequency subband is voiced, the current frame is a voiced frame, and the U/V decision of each of the 4 higher-frequency subbands is made from its voicing degree Vbpi: if Vbpi > 0.6, the bit corresponding to that subband is quantized to 1 (voiced); otherwise it is quantized to 0 (unvoiced). If the result of the 4 quantized subbands is 0001, it is forcibly set to 0000; that is, the highest-frequency band is forced to be regarded as an unvoiced subband.
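The subband voicing rules above can be sketched as a short function (a minimal illustration; the voicing degrees Vbp_i for subbands 2 through 5 are assumed to be given):

```python
def quantize_band_voicing(lowband_voiced, vbp):
    """Quantize the U/V decisions of the 4 higher subbands to 4 bits.
    `vbp` holds the voicing degrees Vbp_i of subbands 2..5."""
    if not lowband_voiced:
        return [0, 0, 0, 0]          # unvoiced frame: all higher bands 0
    bits = [1 if v > 0.6 else 0 for v in vbp]
    if bits == [0, 0, 0, 1]:         # lone voiced top band forced unvoiced
        bits = [0, 0, 0, 0]
    return bits

```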
The quantization method of the Fourier magnitudes at a transmission rate of 2400 bps is introduced below.
In this embodiment, the Fourier magnitudes of the first 10 harmonics of the prediction residual signal are detected and quantized; if the number of harmonics is less than 10, the spectral amplitudes of the higher harmonics are set to 1, and the values are combined into a 10-dimensional vector for quantization. During decoding, spectral flattening is applied to the portion above the 10th harmonic, and the normalized energy of the harmonics in the sequence obtained by the decoder is represented by 1. This 10-dimensional harmonic amplitude vector can be quantized using an 8-bit vector codebook; the codebook search uses a full-search algorithm, and the distortion measure uses the weighted Euclidean distance criterion shown in the following formula:
where A is a vector in the codebook and w(i) is a weighting coefficient calculated from f_i = 8000i/60, the frequency corresponding to the i-th harmonic when the pitch period is 60.
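As a sketch of this step (codebook and weights are caller-supplied placeholders; the patent's weight formula is not reproduced), the padding and full search might look like:

```python
import numpy as np

def quantize_fourier_magnitudes(mags, codebook, w):
    """Pad the harmonic magnitudes to 10 dimensions (missing higher
    harmonics set to 1), then full-search the codebook under the weighted
    Euclidean distance; returns the winning codebook index."""
    v = np.ones(10)
    n = min(len(mags), 10)
    v[:n] = np.asarray(mags, dtype=float)[:n]
    d = (w * (codebook - v) ** 2).sum(axis=1)
    return int(np.argmin(d))

```

An 8-bit codebook would be a 256 × 10 array; full search simply evaluates the distance for all 256 rows.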
Table 1 illustrates the quantized results of the speech parameters at a transmission rate of 2400 bps, and Table 2 illustrates the transmission order of a speech frame. As shown in Table 1 and Table 2, each frame contains 48 bits, of which 1 bit indicates the frame type, i.e., whether the current frame is a speech frame or a non-speech frame, and the remaining bits are used to represent the speech parameters.
Table 1
Table 2
The above embodiments mainly describe the speech parameter quantization methods at a transmission rate of 2400 bps; the speech parameter quantization methods at 1200 bps are emphasized below. In 1200 bps mode, a three-frame joint quantization scheme is adopted: each 60 ms superframe is quantized in one pass, outputting 72 bits. Table 3 illustrates the superframe modes, and Table 4 illustrates the bit allocation corresponding to the 5 superframe modes. As shown in Table 3, a superframe can be divided into 5 modes according to the voicing types of its 3 subframes, where U represents an unvoiced frame and V represents a voiced frame.
Table 3
Table 4
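The classification step can be sketched as follows; the mapping of each U/V pattern to Mode1–Mode5 is defined by Table 3 (not reproduced here), so this helper only derives the pattern string from the 3 subframe decisions:

```python
def superframe_pattern(subframe_voicing):
    """Reduce the 3 subframe U/V decisions of a 60 ms superframe to a
    pattern string such as 'VVU'; Table 3 maps these patterns to modes."""
    return "".join("V" if voiced else "U" for voiced in subframe_voicing)

```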
In one of the embodiments, determining, according to the transmission rate of the speech signal and the speech parameter, a quantization method corresponding to the speech parameter and quantizing the speech parameter comprises: if the transmission rate of the speech signal is 1200 bps, determining the superframe mode of the speech signal according to the voicing types of the time frames of the transmitted speech signal; and determining, according to the speech parameter and the superframe mode, the quantization method corresponding to the speech parameter and quantizing the speech parameter.
In this embodiment, if the transmission rate of the speech signal is 1200 bps, the superframe can be divided into 5 modes according to the voicing types of the time frames, as shown in Table 3; the quantization method for each speech parameter is then determined according to the speech parameter and the superframe mode.
The quantization method of the LSF parameters when the transmission rate of the speech signal is 1200 bps is introduced below.
In one embodiment, determining, according to the speech parameter and the superframe mode, a quantization method corresponding to the speech parameter and quantizing the speech parameter comprises: if the speech parameter is the LSF parameter, quantizing the LSF parameter using a preset codebook quantization method according to the superframe mode.
Optionally, quantizing the LSF parameter using the preset codebook quantization method according to the superframe mode comprises: if the superframe mode includes three adjacent time frames of which at least two are voiced frames, quantizing the LSF parameters of the third frame in the current superframe using a preset three-stage quantization codebook, and quantizing the LSF parameters of the first frame and the second frame according to the quantized LSF parameters of the third frame in the previous superframe and the quantized LSF parameters of the third frame in the current superframe, the third frame being the time frame located last in the superframe mode in temporal order.
In this embodiment, if the superframe mode is Mode1 or Mode2 in Table 3, the LSF parameters of the third frame are quantized using a 7-6-6-bit three-stage vector codebook; the method and codebook used are the same as those for LSF quantization at 2400 bps. If the superframe mode is Mode3, the LSF parameters of the third frame are quantized using a 9-bit one-stage vector codebook. For the first and second frames, the LSF parameters are quantized in a predictive manner using the quantized value of the third-frame LSF in the current superframe and the quantized value of the third-frame LSF in the previous superframe.
Optionally, as shown in Fig. 2, the step of quantizing the LSF parameters of the first frame and the second frame according to the quantized LSF parameters of the third frame in the previous superframe and the quantized LSF parameters of the third frame in the current superframe comprises:
Step 201: according to the quantized LSF parameters of the third frame in the previous superframe and the quantized LSF parameters of the third frame in the current superframe, determine the quantized first-frame LSF parameters and quantized second-frame LSF parameters corresponding to each predictive coefficient in the predictive coefficient codebook.
In this embodiment, let f̂3(i) be the quantized value of the third-frame LSF in the current superframe and f̂p(i) the quantized value of the third-frame LSF in the previous superframe; the quantized value f̂1(i) of the first-frame LSF parameters and the quantized value f̂2(i) of the second-frame LSF parameters are then calculated from f̂3(i) and f̂p(i) according to the corresponding prediction formulas, where i = 1, ..., 10, and a1 and a2 are predictive coefficients from the 2-bit prediction codebook. The predictive coefficient codebook contains 4 predictive coefficient vectors in total, and the corresponding f̂1 and f̂2 are obtained for each predictive coefficient in turn.
Step 202: determine the target predictive coefficient according to the LSF parameters of the first frame, the LSF parameters of the second frame, the quantized LSF parameters of the first frame, and the quantized LSF parameters of the second frame.
Optionally, determining the target predictive coefficient comprises: determining, from the LSF parameters of the first frame, the LSF parameters of the second frame, the quantized LSF parameters of the first frame, and the quantized LSF parameters of the second frame, the prediction error corresponding to each predictive coefficient in the predictive coefficient codebook; and determining the predictive coefficient corresponding to the smallest prediction error as the target predictive coefficient.
In this embodiment, from the first-frame LSF parameters f1(i), the second-frame LSF parameters f2(i), the quantized first-frame LSF parameters f̂1(i), and the quantized second-frame LSF parameters f̂2(i), the prediction error E = Σ_{i=1}^{10} [f1(i) − f̂1(i)]² + [f2(i) − f̂2(i)]² is calculated. The predictive coefficient corresponding to the minimum prediction error E is determined as the target predictive coefficient, and the codebook serial number corresponding to the minimum of E is the quantized result of the codebook containing the target predictive coefficient.
Step 203: determine the residual vector according to the target predictive coefficient, and quantize the residual vector using a preset two-stage vector codebook.
In this embodiment, once the target predictive coefficient is determined, i.e., after the search obtains the optimal predictive coefficient vector, the prediction residuals corresponding to that predictive coefficient vector are calculated as r1(i) = f1(i) − f̂1(i) and r2(i) = f2(i) − f̂2(i), where i = 1, ..., 10. Finally, r1(i) and r2(i) are combined into the residual vector R = [r1(1), ..., r1(10), r2(1), ..., r2(10)], which is vector-quantized using an 8-6-bit two-stage vector codebook.
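Steps 201–203 can be sketched per candidate coefficient pair as follows. Note the interpolation form f̂ = a·f̂p + (1 − a)·f̂3 is an assumption for illustration only (the patent's exact prediction formula is given in its figures); the error and residual definitions match the description above.

```python
import numpy as np

def prediction_error_and_residual(f1, f2, f3_q, fp_q, a1, a2):
    """For one candidate coefficient pair (a1, a2): predict frames 1 and 2
    from the previous superframe's quantized third frame fp_q and the
    current quantized third frame f3_q, then return the prediction error E
    and the 20-dimensional residual vector R."""
    f1_hat = a1 * fp_q + (1.0 - a1) * f3_q   # assumed interpolation form
    f2_hat = a2 * fp_q + (1.0 - a2) * f3_q
    r1, r2 = f1 - f1_hat, f2 - f2_hat
    E = float(np.sum(r1 ** 2) + np.sum(r2 ** 2))
    R = np.concatenate([r1, r2])             # R = [r1(1..10), r2(1..10)]
    return E, R

```

The search loops this over all entries of the prediction codebook, keeps the pair with minimum E, and passes its R to the two-stage residual VQ.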
Optionally, in Mode2, the LSF parameters of the third frame are first quantized using the 7-6-6-bit three-stage vector codebook, and the LSF parameters of the first and second frames are then quantized by the above prediction method. Unlike Mode1, when searching the predictive coefficients a1 and a2, Mode2 uses a 4-bit, 16-entry prediction codebook, whereas Mode1 uses the 2-bit, 4-entry prediction codebook.
Optionally, in Mode3, the LSF parameters of the third frame are first quantized using a 9-bit one-stage vector codebook, and the LSF parameters of the first and second frames are then quantized by the above prediction method. The prediction method is the same as for Mode1 and Mode2, and the codebook used to search the predictive coefficients a1 and a2 is the same as in Mode1, but the codebook used for prediction residual quantization is an 8-6-6-6-bit four-stage vector codebook.
Optionally, quantizing the LSF parameter using the preset codebook quantization method according to the superframe mode comprises: if the superframe mode includes three adjacent time frames of which one is a voiced frame, quantizing the LSF parameters of the voiced frame in the superframe mode using a preset three-stage vector codebook, and quantizing the LSF parameters of the unvoiced frames in the superframe mode using a preset one-stage vector codebook.
In this embodiment, in Mode4 the LSF parameters of the three frames are quantized directly, and the quantized results are transmitted in the bitstream in frame order. The codebook used is the 7-6-6-bit three-stage vector codebook for the voiced frame and a 9-bit one-stage vector codebook for the unvoiced frames.
Optionally, quantizing the LSF parameter using the preset codebook quantization method according to the superframe mode comprises: if the superframe mode includes three adjacent time frames that are all unvoiced frames, quantizing the LSF parameters of the unvoiced frames in the superframe mode using a preset one-stage vector codebook. For example, in Mode5, a 9-bit one-stage vector codebook can be used to quantize the LSF parameters of the three frames in turn.
The quantization method of the pitch period and the voicing type when the transmission rate of the speech signal is 1200 bps is introduced below.
In one embodiment, determining, according to the speech parameter and the superframe mode, a quantization method corresponding to the speech parameter and quantizing the speech parameter comprises: if the speech parameters are the pitch period and the voicing type, quantizing the pitch period and the voicing type using a preset quantization method according to the superframe mode.
In this embodiment, if the transmission rate of the speech signal is 1200 bps and the speech parameters are the pitch period and the voicing type, different quantization methods can be selected according to the different superframe modes in Table 3. Table 5 illustrates the quantized results of the pitch period and the voicing type; as shown in Table 5, 12 bits can be used to jointly quantize the pitch period and the voicing type, of which 9 bits represent the pitch period and 3 bits represent the voicing type.
Table 5
Optionally, quantizing the pitch period and the voicing type using the preset quantization method according to the superframe mode comprises: if the superframe mode includes three adjacent time frames that are all unvoiced frames, quantizing the bits corresponding to the pitch period and the voicing type to the second value. For example, for a superframe in UUU mode, the pitch period and voicing type are directly quantized to 12 bits of 0.
Optionally, quantizing the pitch period and the voicing type using the preset quantization method according to the superframe mode comprises: if the superframe mode includes three adjacent time frames of which one is a voiced frame, quantizing the bits corresponding to the voicing type to the second value, applying a logarithmic transformation to the pitch period of the voiced frame in the superframe mode, and determining the target quantized value according to the transformation result and the voicing type.
Further, determining the target quantized value according to the transformation result and the voicing type comprises: uniformly quantizing the transformation result to obtain a uniform quantization coefficient; and determining the target quantized value according to the correspondence among the uniform quantization coefficient, the voicing type, and a preset codebook serial number.
In this embodiment, the three superframe modes UUV, UVU, and VUU contain only one voiced frame, and the 3-bit voicing type indication is quantized to 000. For the pitch period, the pitch period of the voiced frame is first converted to a base-10 logarithm and quantized with 99-level uniform quantization over the range lg(20) to lg(144), the quantization result being a serial number between 1 and 99; the uniform quantization coefficient is then combined with the voicing type and mapped to a 512-level codebook serial number, which serves as the final target quantized value.
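The log-domain pitch quantization for single-voiced-frame superframes can be sketched as follows (the subsequent mapping of the serial number and voicing type to the 512-level codebook serial number follows the description above and is not shown):

```python
import math

def quantize_log_pitch(pitch):
    """99-level uniform quantization of lg(pitch) over [lg 20, lg 144];
    returns the serial number 1..99 described above."""
    lo, hi = math.log10(20.0), math.log10(144.0)
    x = min(max(math.log10(pitch), lo), hi)
    level = int((x - lo) / (hi - lo) * 99.0)
    return min(level, 98) + 1

```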
Optionally, quantizing the pitch period and the voicing type using the preset quantization method according to the superframe mode comprises: if the superframe mode includes three adjacent time frames of which two are voiced frames, quantizing the bits corresponding to the voicing type of the voiced frames to the second value and the bit corresponding to the voicing type of the unvoiced frame to the first value; and quantizing the pitch periods of the three adjacent time frames using a preset vector codebook.
In this embodiment, the three superframe modes VVU, VUV, and UVV contain two voiced frames, and the 3-bit voicing type indications are quantized to 001, 010, and 100 respectively; the pitch periods of the three frames are combined in order into a three-dimensional vector, which is then quantized using a 512-level codebook.
Optionally, quantizing the pitch period and the voicing type using the preset quantization method according to the superframe mode comprises: if the superframe mode includes three adjacent time frames that are all voiced frames, obtaining an N-bit codebook serial number from a preset vector codebook; quantizing the voicing type according to the value of the top M bits of the N-bit codebook serial number, and determining the low N−M bits of the N-bit codebook serial number as the quantized value of the pitch period, where N and M are positive integers and N is greater than M.
In this embodiment, the VVV superframe type is more special: a codebook search is performed first, obtaining an 11-bit codebook serial number; the voicing type indication values are then mapped, according to the value of the top 2 bits of the codebook serial number, from 00–11 to 011, 101, 110, and 111 respectively, and the low 9 bits of the codebook serial number are used as the quantized value of the pitch period.
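The VVV bit-splitting described above is a simple shift-and-mask operation; a sketch:

```python
def split_vvv_serial(serial):
    """Split the 11-bit codebook serial number of a VVV superframe: the top
    2 bits select the voicing indication (00..11 -> 011, 101, 110, 111)
    and the low 9 bits are the pitch-period quantized value."""
    voicing = ("011", "101", "110", "111")[(serial >> 9) & 0b11]
    pitch_q = serial & 0x1FF
    return voicing, pitch_q

```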
The subband voicing decision quantization method when the transmission rate of the speech signal is 1200 bps is emphasized below.
In this embodiment, Table 6 illustrates the bit allocation corresponding to each frame pattern in voicing decision quantization. As shown in Table 6, the number of bits used for the subband voicing decision quantization depends on the number of voiced frames in the superframe; for example, a superframe in VVV mode has 3 voiced frames, and each frame is allocated 2 bits, occupying 6 bits in total. The bit numbers of the other superframe modes are calculated in the same way.
Table 6
Frame pattern | VVV | VVU, VUV, UVV | VUU, UVU, UUV | UUU |
Allocated bits | 6 | 4 | 2 | 0 |
Since a voiced frame contains 5 subbands and the voicing decision of the lowest-frequency subband can be obtained by the decoder from the quantized value of the pitch period, fully representing the voicing decision information of the 4 higher-frequency subbands requires 4 bits, i.e., 16 types, which must be mapped to 4 types represented with 2 bits during quantization. Table 7 gives the rules of the voicing decision quantization mapping: the 4 voicing decision bits in Table 7 represent, in turn, the voicing types from the 2nd subband to the 5th subband, with 1 indicating a voiced subband and 0 indicating an unvoiced subband. After mapping, only four U/V decision types remain, which are represented with 2 bits during quantization.
Table 7
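The packing of the 4 higher-subband bits ahead of the Table 7 lookup can be sketched as follows; the mapping dictionary is a placeholder, since Table 7's 16 → 4 assignments are not reproduced here:

```python
def pack_higher_band_voicing(bits):
    """Pack the U/V bits of subbands 2..5 (MSB = 2nd subband) into the
    4-bit value that Table 7 maps to one of four 2-bit classes."""
    value = 0
    for b in bits:
        value = (value << 1) | (1 if b else 0)
    return value

# Placeholder for Table 7: the real table assigns each of the 16 values
# a 2-bit class; only the all-unvoiced entry is shown here as an example.
TABLE7_CLASS = {0b0000: 0b00}

```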
In one embodiment, for gain quantization when the transmission rate of the speech signal is 1200 bps, each frame in the superframe requires the quantization of the two gains G1 and G2, computed in the same way as in the gain quantization method of the 2400 bps mode. The G1 and G2 of the 3 frames are arranged in frame order into a 6-dimensional gain vector, which is then quantized using a 10-bit one-stage vector quantization codebook.
The quantization method of the Fourier magnitudes when the transmission rate of the speech signal is 1200 bps is emphasized below.
In one embodiment, if the superframe is in UUU mode, i.e., contains no voiced frame, the Fourier magnitudes are not quantized or transmitted. For a superframe containing voiced frames, one voiced frame is selected and its Fourier magnitudes are quantized; the quantization method and codebook are the same as those of the 2400 bps mode, and the Fourier magnitudes of the other voiced frames are derived from the quantized value during decoding. For example, let the first quantity be the normalized Fourier magnitude analysis result of the i-th frame in the current superframe, the second the quantized normalized Fourier magnitude result of the i-th frame, and the third the quantized result of the last voiced frame in the previous superframe, with Q[·] denoting quantization by the same method as at 2400 bps; the Fourier magnitude quantization and derivation rules of the current superframe are then as shown in Table 8.
Table 8
The quantization method of the aperiodic flag when the transmission rate of the speech signal is 1200 bps is emphasized below.
In one embodiment, the aperiodic flags of the 3 frames in a superframe are quantized with 1 bit. First, the original aperiodic flag of each frame is obtained from the voicing degree Vbp1 of the lowest-frequency subband: if Vbp1 < 0.5, the original aperiodic flag of the current frame is quantized to 1; otherwise it is quantized to 0. Then the final 1-bit quantized value is obtained according to the superframe type and the original aperiodic flags; the quantization rules are shown in Table 9.
Table 9
Correspondingly, during decoding the aperiodic flags of the 3 frames are derived from the 1-bit quantized value; the decoding rules are shown in Table 10.
Table 10
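The first stage of the aperiodic-flag quantization above reduces to a simple threshold per frame (the Table 9 combining rule that produces the final single bit is not reproduced here):

```python
def original_aperiodic_flags(vbp1_frames):
    """Per-frame original aperiodic flags from the lowest-band voicing
    degree Vbp1: flag = 1 when Vbp1 < 0.5, else 0. The final 1-bit value
    then follows the superframe-type rules of Table 9."""
    return [1 if v < 0.5 else 0 for v in vbp1_frames]

```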
It should be understood that although the steps in the flowcharts of Figs. 1-2 are shown sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless expressly stated otherwise herein, the execution of these steps is not strictly limited in order, and the steps may be executed in other orders. Moreover, at least some of the steps in Figs. 1-2 may include multiple sub-steps or multiple stages, which are not necessarily completed at the same moment but may be executed at different times; the execution order of these sub-steps or stages is likewise not necessarily sequential, and they may be executed in turn or alternately with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 3, a speech parameter quantization apparatus is provided, comprising an acquisition module 11 and a determination module 12, wherein:
the acquisition module 11 is configured to obtain the speech parameters of a speech signal using a preset parameter analysis method; and
the determination module 12 is configured to determine, according to the transmission rate of the speech signal and the speech parameters, a quantization method corresponding to the speech parameters and quantize the speech parameters.
In one embodiment, the determination module 12 is specifically configured to: if the transmission rate of the speech signal is 2400 bps and the speech parameter is the pitch period, quantize the pitch period according to the periodicity attribute of the speech signal or the voicing type of the current frame of the transmitted speech signal.
In one embodiment, as shown in Fig. 4, the determination module 12 comprises:
a first quantization submodule 121, configured to, if the current frame is a voiced frame, apply a logarithmic transformation to the pitch period of the current frame of the speech signal and uniformly quantize the transformation result using a preset order; and
a second quantization submodule 122, configured to, if the current frame is an unvoiced frame, or if the periodicity attribute of the speech signal is aperiodic, perform bit quantization on the pitch period of the speech signal.
In one embodiment, the second quantization submodule 122 is specifically configured to: if the current frame is an unvoiced frame, quantize the bits corresponding to the pitch period of the speech signal to the first value; and if the periodicity attribute of the speech signal is aperiodic, quantize the bits corresponding to the pitch period of the speech signal to the second value.
In one embodiment, the determination module 12 is specifically configured to: if the transmission rate of the speech signal is 2400 bps and the speech parameter is the line spectrum pair (LSF) parameter, quantize the LSF parameter using the preset three-stage vector codebook.
In one embodiment, as shown in Fig. 5, the determination module 12 comprises:
a first determination submodule 123, configured to, if the transmission rate of the speech signal is 1200 bps, determine the superframe mode of the speech signal according to the voicing types of the time frames of the transmitted speech signal; and
a second determination submodule 124, configured to determine, according to the speech parameter and the superframe mode, the quantization method corresponding to the speech parameter and quantize the speech parameter.
In one embodiment, the second determination submodule 124 is specifically configured to: if the speech parameter is the LSF parameter, quantize the LSF parameter using the preset codebook quantization method according to the superframe mode.
In one embodiment, the superframe mode includes three adjacent time frames of which at least two are voiced frames, and the second determination submodule 124 quantizing the LSF parameter using the preset codebook quantization method according to the superframe mode comprises: the second determination submodule 124 quantizes the LSF parameters of the third frame in the current superframe using the preset three-stage quantization codebook, and quantizes the LSF parameters of the first frame and the second frame according to the quantized LSF parameters of the third frame in the previous superframe and the quantized LSF parameters of the third frame in the current superframe, the third frame being the time frame located last in the superframe mode in temporal order.
In one embodiment, the second determination submodule 124 quantizing the LSF parameters of the first frame and the second frame according to the quantized LSF parameters of the third frame in the previous superframe and the quantized LSF parameters of the third frame in the current superframe comprises:
the second determination submodule 124 determines, according to the quantized LSF parameters of the third frame in the previous superframe and the quantized LSF parameters of the third frame in the current superframe, the quantized first-frame LSF parameters and quantized second-frame LSF parameters corresponding to each predictive coefficient in the predictive coefficient codebook; determines the target predictive coefficient according to the LSF parameters of the first frame, the LSF parameters of the second frame, the quantized LSF parameters of the first frame, and the quantized LSF parameters of the second frame; determines the residual vector according to the target predictive coefficient; and quantizes the residual vector using the preset two-stage vector codebook.
In one embodiment, the second determination submodule 124 determining the target predictive coefficient according to the LSF parameters of the first frame, the LSF parameters of the second frame, the quantized LSF parameters of the first frame, and the quantized LSF parameters of the second frame comprises:
the second determination submodule 124 determines, from the LSF parameters of the first frame, the LSF parameters of the second frame, the quantized LSF parameters of the first frame, and the quantized LSF parameters of the second frame, the prediction error corresponding to each predictive coefficient in the predictive coefficient codebook, and determines the predictive coefficient corresponding to the smallest prediction error as the target predictive coefficient.
In one embodiment, the superframe mode includes three adjacent time frames of which one is a voiced frame, and the second determination submodule 124 quantizing the LSF parameter using the preset codebook quantization method according to the superframe mode comprises:
the second determination submodule 124 quantizes the LSF parameters of the voiced frame in the superframe mode using the preset three-stage vector codebook, and quantizes the LSF parameters of the unvoiced frames in the superframe mode using the preset one-stage vector codebook.
In one embodiment, the superframe mode includes three adjacent time frames that are all unvoiced frames, and the second determination submodule 124 quantizing the LSF parameter using the preset codebook quantization method according to the superframe mode comprises: the second determination submodule 124 quantizes the LSF parameters of the unvoiced frames in the superframe mode using the preset one-stage vector codebook.
In one embodiment, the second determination submodule 124 is specifically configured to: if the speech parameters are the pitch period and the voicing type, quantize the pitch period and the voicing type using the preset quantization method according to the superframe mode.
In one embodiment, if the superframe mode includes three adjacent time frames that are all unvoiced frames, the second determination submodule 124 quantizing the pitch period and the voicing type using the preset quantization method according to the superframe mode comprises: the second determination submodule 124 quantizes the bits corresponding to the pitch period and the voicing type to the second value.
In one embodiment, if the superframe mode includes three adjacent time frames of which one is a voiced frame, the second determination submodule 124 quantizing the pitch period and the voicing type using the preset bit quantization method according to the superframe mode comprises: the second determination submodule 124 quantizes the bits corresponding to the voicing type to the second value, applies a logarithmic transformation to the pitch period of the voiced frame in the superframe mode, and determines the target quantized value according to the transformation result and the voicing type.
In one embodiment, the second determination submodule 124 determining the target quantized value according to the transformation result and the voicing type comprises: the second determination submodule 124 uniformly quantizes the transformation result to obtain the uniform quantization coefficient, and determines the target quantized value according to the correspondence among the uniform quantization coefficient, the voicing type, and the preset codebook serial number.
In one embodiment, if the superframe mode includes three adjacent time frames of which two are voiced frames, the second determination submodule 124 quantizing the pitch period and the voicing type using the preset bit quantization method according to the superframe mode comprises: the second determination submodule 124 quantizes the bits corresponding to the voicing type of the voiced frames to the second value and the bit corresponding to the voicing type of the unvoiced frame to the first value, and quantizes the pitch periods of the three adjacent time frames using the preset vector codebook.
In one embodiment, if the superframe mode includes three adjacent time frames that are all voiced frames, the second determination submodule 124 quantizing the pitch period and the voicing type using the preset bit quantization method according to the superframe mode comprises: the second determination submodule 124 obtains the N-bit codebook serial number from the preset vector codebook, quantizes the voicing type according to the value of the top M bits of the N-bit codebook serial number, and determines the low N−M bits of the N-bit codebook serial number as the quantized value of the pitch period, where N and M are positive integers and N is greater than M.
For specific limitations of the speech parameter quantization apparatus, reference may be made to the limitations of the speech parameter quantization method above, which are not repeated here. Each module in the above speech parameter quantization apparatus may be implemented wholly or partly by software, hardware, or a combination thereof. The above modules may be embedded in hardware form in, or be independent of, a processor in a computer device, or may be stored in software form in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in Fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store speech parameter data. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a speech parameter quantization method.
Those skilled in the art will understand that the structure shown in Fig. 6 is merely a block diagram of the partial structure related to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different component arrangement.
In one embodiment, a computer device is provided, including a memory and a processor, the memory storing a computer program. The processor, when executing the computer program, performs the following steps:
obtaining speech parameters of a voice signal using a preset parameter analysis method; and
determining, according to a transmission rate of the voice signal and the speech parameters, a quantization method corresponding to the speech parameters to quantize the speech parameters.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. The computer program, when executed by a processor, performs the following steps:
obtaining speech parameters of a voice signal using a preset parameter analysis method; and
determining, according to a transmission rate of the voice signal and the speech parameters, a quantization method corresponding to the speech parameters to quantize the speech parameters.
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of the present application patent shall be subject to the appended claims.
Claims (21)
1. A speech parameter quantization method, the method comprising:
obtaining speech parameters of a voice signal using a preset parameter analysis method; and
determining, according to a transmission rate of the voice signal and the speech parameters, a quantization method corresponding to the speech parameters to quantize the speech parameters.
2. The method according to claim 1, wherein determining, according to the transmission rate of the voice signal and the speech parameters, a quantization method corresponding to the speech parameters to quantize the speech parameters comprises:
if the transmission rate of the voice signal is 2400 bps and the speech parameter is a pitch period, quantizing the pitch period according to a periodicity attribute of the voice signal or a voiced/unvoiced type of a current frame carrying the voice signal.
3. The method according to claim 2, wherein quantizing the pitch period according to the periodicity attribute of the voice signal or the voiced/unvoiced type of the current frame carrying the voice signal comprises:
if the current frame is a voiced frame, performing a logarithmic transformation on the pitch period of the voice signal in the current frame and uniformly quantizing the transformation result with a preset order; and
if the current frame is an unvoiced frame, or the periodicity attribute of the voice signal is aperiodic, performing bit quantization on the pitch period of the voice signal.
4. The method according to claim 3, wherein, if the current frame is an unvoiced frame, or the periodicity attribute of the voice signal is aperiodic, performing bit quantization on the pitch period of the voice signal comprises:
if the current frame is an unvoiced frame, quantizing all bits corresponding to the pitch period of the voice signal to a first value; and
if the periodicity attribute of the voice signal is aperiodic, quantizing all bits corresponding to the pitch period of the voice signal to a second value.
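For illustration only (not part of the claims), the branching of claims 3-4 can be sketched as below. The field width, pitch range, and the concrete first/second bit values are assumptions, not values from the application.

```python
import math

PITCH_BITS = 7                    # assumed width of the pitch-period field
FIRST_VALUE, SECOND_VALUE = 0, 1  # assumed concrete bit values

def quantize_pitch(pitch, frame_voiced, periodic):
    """Sketch of claims 3-4: voiced frames get a log-transformed, uniformly
    quantized pitch; unvoiced frames get an all-first-value bit pattern;
    aperiodic signals get an all-second-value bit pattern."""
    if frame_voiced and periodic:
        # Logarithmic transformation, then uniform quantization of assumed order.
        lo, hi = math.log2(20.0), math.log2(160.0)  # assumed pitch range
        step = (hi - lo) / (2**PITCH_BITS - 1)
        idx = min(2**PITCH_BITS - 1, max(0, round((math.log2(pitch) - lo) / step)))
        return [(idx >> b) & 1 for b in reversed(range(PITCH_BITS))]
    if not periodic:
        return [SECOND_VALUE] * PITCH_BITS  # aperiodic signal
    return [FIRST_VALUE] * PITCH_BITS       # unvoiced frame
```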
5. The method according to claim 1, wherein determining, according to the transmission rate of the voice signal and the speech parameters, a quantization method corresponding to the speech parameters to quantize the speech parameters comprises:
if the transmission rate of the voice signal is 2400 bps and the speech parameter is a line spectral frequency (LSF) parameter, quantizing the LSF parameter using a preset three-stage vector codebook.
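For illustration only (not part of the claims), multi-stage vector quantization of the kind claim 5 names works by having each stage quantize the residual left by the previous stage. The codebook contents below are toy values, and the two-stage example generalizes directly to three stages.

```python
import numpy as np

def msvq_quantize(x, codebooks):
    """Multi-stage VQ sketch: each stage picks the nearest codevector to the
    current residual; the reconstruction is the sum of the chosen codevectors."""
    residual = x.copy()
    indices, recon = [], np.zeros_like(x)
    for cb in codebooks:  # e.g. three stages for the 2400 bps LSF codebook
        dists = np.sum((cb - residual) ** 2, axis=1)  # squared error per entry
        i = int(np.argmin(dists))
        indices.append(i)
        recon = recon + cb[i]
        residual = x - recon
    return indices, recon

# Toy 2-D example with two tiny assumed stages:
cb1 = np.array([[0.0, 0.0], [1.0, 1.0]])
cb2 = np.array([[0.0, 0.0], [0.5, 0.0]])
idx, xq = msvq_quantize(np.array([1.4, 1.0]), [cb1, cb2])
# idx == [1, 1]; xq == array([1.5, 1.0])
```

Real LSF codebooks are trained on speech data and are typically 10-dimensional; the structure of the search is unchanged.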
6. The method according to claim 1, wherein determining, according to the transmission rate of the voice signal and the speech parameters, a quantization method corresponding to the speech parameters to quantize the speech parameters comprises:
if the transmission rate of the voice signal is 1200 bps, determining a superframe mode of the voice signal according to the voiced/unvoiced types of the time frames carrying the voice signal; and
determining, according to the speech parameters and the superframe mode, a quantization method corresponding to the speech parameters to quantize the speech parameters.
7. The method according to claim 6, wherein determining, according to the speech parameters and the superframe mode, a quantization method corresponding to the speech parameters to quantize the speech parameters comprises:
if the speech parameter is an LSF parameter, quantizing the LSF parameter according to the superframe mode using a preset codebook quantization method.
8. The method according to claim 7, wherein quantizing the LSF parameter according to the superframe mode using a preset codebook quantization method comprises:
if the superframe mode includes three adjacent time frames and the three adjacent time frames include at least two voiced frames, quantizing the LSF parameter of a third frame in the current superframe using a preset three-stage quantization codebook, and quantizing the LSF parameters of a first frame and a second frame according to the quantized LSF parameter of the third frame in the previous superframe and the quantized LSF parameter of the third frame in the current superframe, the third frame being the time frame that is last in temporal order in the superframe mode.
9. The method according to claim 8, wherein quantizing the LSF parameters of the first frame and the second frame according to the quantized LSF parameter of the third frame in the previous superframe and the quantized LSF parameter of the third frame in the current superframe comprises:
determining, according to the quantized LSF parameter of the third frame in the previous superframe and the quantized LSF parameter of the third frame in the current superframe, the quantized LSF parameter of the first frame and the quantized LSF parameter of the second frame corresponding to each prediction coefficient in a prediction coefficient codebook;
determining a target prediction coefficient according to the LSF parameter of the first frame, the LSF parameter of the second frame, the quantized LSF parameter of the first frame, and the quantized LSF parameter of the second frame; and
determining a residual vector according to the target prediction coefficient and quantizing the residual vector using a preset two-stage vector codebook.
10. The method according to claim 9, wherein determining a target prediction coefficient according to the LSF parameter of the first frame, the LSF parameter of the second frame, the quantized LSF parameter of the first frame, and the quantized LSF parameter of the second frame comprises:
determining, according to the LSF parameter of the first frame, the LSF parameter of the second frame, the quantized LSF parameter of the first frame, and the quantized LSF parameter of the second frame, a prediction error corresponding to each prediction coefficient in the prediction coefficient codebook; and
determining the prediction coefficient corresponding to the smallest prediction error as the target prediction coefficient.
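For illustration only (not part of the claims), the coefficient search of claims 9-10 can be sketched as below. The linear interpolation between the neighboring third frames is an assumed instantiation of the prediction; the application does not specify the prediction form, and the data and codebook values are toys.

```python
import numpy as np

def select_target_coefficient(lsf1, lsf2, prev_q3, cur_q3, coeff_codebook):
    """Sketch of claims 9-10: for each candidate prediction coefficient a,
    predict the first/second-frame LSFs from the quantized third frames of
    the previous and current superframes (assumed linear form), then keep
    the coefficient with the smallest total squared prediction error."""
    best_a, best_err = None, float("inf")
    for a in coeff_codebook:
        q1 = (1 - a) * prev_q3 + a * cur_q3  # predicted first-frame LSF
        q2 = (1 - a) * q1 + a * cur_q3       # predicted second-frame LSF
        err = float(np.sum((lsf1 - q1) ** 2) + np.sum((lsf2 - q2) ** 2))
        if err < best_err:
            best_a, best_err = a, err
    # Residual vector to be quantized by the two-stage VQ of claim 9:
    q1 = (1 - best_a) * prev_q3 + best_a * cur_q3
    residual1 = lsf1 - q1
    return best_a, residual1

# Toy example (assumed data): third-frame quantized LSFs and a tiny codebook.
best_a, res1 = select_target_coefficient(
    np.full(3, 0.5), np.full(3, 0.75), np.zeros(3), np.ones(3), [0.0, 0.5, 1.0]
)
```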
11. The method according to claim 7, wherein quantizing the LSF parameter according to the superframe mode using a preset codebook quantization method comprises:
if the superframe mode includes three adjacent time frames and the three adjacent time frames include one voiced frame, quantizing the LSF parameter of the voiced frame in the superframe mode using a preset three-stage vector codebook and quantizing the LSF parameters of the unvoiced frames in the superframe mode using a preset one-stage vector codebook.
12. The method according to claim 7, wherein quantizing the LSF parameter according to the superframe mode using a preset codebook quantization method comprises:
if the superframe mode includes three adjacent time frames and the three adjacent time frames are all unvoiced frames, quantizing the LSF parameters of the unvoiced frames in the superframe mode using a preset one-stage vector codebook.
13. The method according to claim 6, wherein determining, according to the speech parameters and the superframe mode, a quantization method corresponding to the speech parameters to quantize the speech parameters comprises:
if the speech parameters are the pitch period and the voiced/unvoiced type, quantizing the pitch period and the voiced/unvoiced type according to the superframe mode using a preset bit quantization method.
14. The method according to claim 13, wherein quantizing the pitch period and the voiced/unvoiced type according to the superframe mode using a preset bit quantization method comprises:
if the superframe mode includes three adjacent time frames and the three adjacent time frames are all unvoiced frames, quantizing the bits corresponding to the pitch period and the voiced/unvoiced type to a second value.
15. The method according to claim 13, wherein quantizing the pitch period and the voiced/unvoiced type according to the superframe mode using a preset bit quantization method comprises:
if the superframe mode includes three adjacent time frames and the three adjacent time frames include one unvoiced frame, quantizing the bit corresponding to the voiced/unvoiced type to the second value, performing a logarithmic transformation on the pitch period of the voiced frames in the superframe mode, and determining a target quantization value according to the transformation result and the voiced/unvoiced type.
16. The method according to claim 15, wherein determining a target quantization value according to the transformation result and the voiced/unvoiced type comprises:
performing uniform quantization on the transformation result to obtain a uniform quantization coefficient; and
determining the target quantization value according to the correspondence among the uniform quantization coefficient, the voiced/unvoiced type, and preset codebook indices.
17. The method according to claim 13, wherein quantizing the pitch period and the voiced/unvoiced type according to the superframe mode using a preset bit quantization method comprises:
if the superframe mode includes three adjacent time frames and the three adjacent time frames include two voiced frames, quantizing the bit corresponding to the voiced/unvoiced type of each voiced frame to the second value and the bit corresponding to the voiced/unvoiced type of the unvoiced frame to the first value, and quantizing the pitch periods of the three adjacent time frames using a preset vector codebook.
18. The method according to claim 13, wherein quantizing the pitch period and the voiced/unvoiced type according to the superframe mode using a preset bit quantization method comprises:
if the superframe mode includes three adjacent time frames and the three adjacent time frames include one voiced frame, obtaining an N-bit codebook index from a preset vector codebook, quantizing the voiced/unvoiced type according to the value of the upper M bits of the N-bit codebook index, and determining the lower N-M bits of the N-bit codebook index as the quantized value of the pitch period, where N and M are positive integers and N is greater than M.
19. A speech parameter quantization apparatus, comprising:
an obtaining module, configured to obtain speech parameters of a voice signal using a preset parameter analysis method; and
a determining module, configured to determine, according to a transmission rate of the voice signal and the speech parameters, a quantization method corresponding to the speech parameters to quantize the speech parameters.
20. A computer device, including a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 18.
21. A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 18.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811109230.6A CN109256143A (en) | 2018-09-21 | 2018-09-21 | Speech parameter quantization method, device, computer equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811109230.6A CN109256143A (en) | 2018-09-21 | 2018-09-21 | Speech parameter quantization method, device, computer equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109256143A true CN109256143A (en) | 2019-01-22 |
Family
ID=65047672
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811109230.6A Pending CN109256143A (en) | 2018-09-21 | 2018-09-21 | Speech parameter quantization method, device, computer equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109256143A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6094629A (en) * | 1998-07-13 | 2000-07-25 | Lockheed Martin Corp. | Speech coding system and method including spectral quantizer |
CN1975861A (en) * | 2006-12-15 | 2007-06-06 | 清华大学 | Vocoder fundamental tone cycle parameter channel error code resisting method |
CN101030377A (en) * | 2007-04-13 | 2007-09-05 | 清华大学 | Method for increasing base-sound period parameter quantified precision of 0.6kb/s voice coder |
CN102855878A (en) * | 2012-09-21 | 2013-01-02 | 山东省计算中心 | Quantification method of pure and impure pitch parameters of narrow-band voice sub-band |
CN103050121A (en) * | 2012-12-31 | 2013-04-17 | 北京迅光达通信技术有限公司 | Linear prediction speech coding method and speech synthesis method |
CN103247293A (en) * | 2013-05-14 | 2013-08-14 | 中国科学院自动化研究所 | Coding method and decoding method for voice data |
CN103325375A (en) * | 2013-06-05 | 2013-09-25 | 上海交通大学 | Coding and decoding device and method of ultralow-bit-rate speech |
CN106098072A (en) * | 2016-06-02 | 2016-11-09 | 重庆邮电大学 | A kind of 600bps very low speed rate encoding and decoding speech method based on MELP |
CN106935243A (en) * | 2015-12-29 | 2017-07-07 | 航天信息股份有限公司 | A kind of low bit digital speech vector quantization method and system based on MELP |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101425944B1 (en) | Improved coding/decoding of digital audio signal | |
CN101057275B (en) | Vector conversion device and vector conversion method | |
CN1957398B (en) | Methods and devices for low-frequency emphasis during audio compression based on acelp/tcx | |
US8364495B2 (en) | Voice encoding device, voice decoding device, and methods therefor | |
CN101836251B (en) | Scalable speech and audio encoding using combinatorial encoding of MDCT spectrum | |
KR101175651B1 (en) | Method and apparatus for multiple compression coding | |
KR101180202B1 (en) | Method and apparatus for generating an enhancement layer within a multiple-channel audio coding system | |
JP6980871B2 (en) | Signal coding method and its device, and signal decoding method and its device | |
TWI605448B (en) | Apparatus for generating bandwidth extended signal | |
JP2014016625A (en) | Audio coding system, audio decoder, audio coding method, and audio decoding method | |
JP2012509515A (en) | Encoding audio digital signals with noise conversion in a scalable encoder | |
US10283133B2 (en) | Audio classification based on perceptual quality for low or medium bit rates | |
US8719011B2 (en) | Encoding device and encoding method | |
US20240127832A1 (en) | Decoder | |
US20100153099A1 (en) | Speech encoding apparatus and speech encoding method | |
JP5388849B2 (en) | Speech coding apparatus and speech coding method | |
US20100049508A1 (en) | Audio encoding device and audio encoding method | |
CN109256143A (en) | Speech parameter quantization method, device, computer equipment and storage medium | |
JPWO2008072733A1 (en) | Encoding apparatus and encoding method | |
US10176816B2 (en) | Vector quantization of algebraic codebook with high-pass characteristic for polarity selection | |
US20100094623A1 (en) | Encoding device and encoding method | |
US20100280830A1 (en) | Decoder | |
KR102539165B1 (en) | Residual coding method of linear prediction coding coefficient based on collaborative quantization, and computing device for performing the method | |
CN105122358A (en) | Apparatus and method for processing an encoded signal and encoder and method for generating an encoded signal | |
Liang et al. | A new 1.2 kb/s speech coding algorithm and its real-time implementation on TMS320LC548 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20190122 |