CN109256143A - Speech parameter quantization method, device, computer equipment and storage medium - Google Patents

Publication number
CN109256143A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811109230.6A
Other languages
Chinese (zh)
Inventor
袁念德
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xi'an Bee Language Mdt Infotech Ltd
Original Assignee
Xi'an Bee Language Mdt Infotech Ltd

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G10L19/032: Quantisation or dequantisation of spectral components
    • G10L19/04: Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16: Vocoder architecture

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

This application relates to a speech parameter quantization method and device, computer equipment, and a storage medium. The method includes: obtaining the speech parameters of a speech signal using a preset parameter analysis method; and, according to the transmission rate of the speech signal and the speech parameter, determining the quantization method corresponding to the speech parameter and quantizing the speech parameter with it. With this method, different quantization methods can be configured flexibly for different transmission rates and different speech parameters, so the method suits different scenarios and meets different user needs.

Description

Speech parameter quantization method, device, computer equipment and storage medium
Technical field
This application relates to the technical field of audio, and in particular to a speech parameter quantization method and device, computer equipment, and a storage medium.
Background
A vocoder is a codec that analyzes and synthesizes speech, also called a speech analysis-synthesis system or a speech band compression system. It is a powerful tool for compressing the bandwidth of voice communication and for secure communication.
A vocoder can be divided into an encoder and a decoder. The encoder converts an audio signal into a bit stream for channel transmission; the decoder recovers the speech-synthesis parameters from the bit stream, synthesizes speech, and outputs audio data. Currently, the speech parameter quantization methods commonly used by encoders include the Internet Low Bitrate Codec (ILBC), the Enhanced Variable Rate Codec (EVRC), and Adaptive Multi-Rate (AMR) speech coding.
However, these speech parameter quantization methods are inflexible and cannot adapt to varied scenarios.
Summary of the invention
In view of the above technical problem, it is necessary to provide a speech parameter quantization method and device, computer equipment, and a storage medium that can adapt flexibly to multiple scenarios.
A speech parameter quantization method, the method comprising:
obtaining the speech parameters of a speech signal using a preset parameter analysis method;
according to the transmission rate of the speech signal and the speech parameter, determining the quantization method corresponding to the speech parameter and quantizing the speech parameter.
In one embodiment, determining the quantization method corresponding to the speech parameter according to the transmission rate of the speech signal and the speech parameter, and quantizing the speech parameter, comprises:
if the transmission rate of the speech signal is 2400 bps and the speech parameter is the pitch period, quantizing the pitch period according to the periodicity attribute of the speech signal or the voiced/unvoiced type of the current frame carrying the speech signal.
In one embodiment, quantizing the pitch period according to the periodicity attribute of the speech signal or the voiced/unvoiced type of the current frame carrying the speech signal comprises:
if the current frame is a voiced frame, converting the pitch period of the speech signal in the current frame to its logarithm and uniformly quantizing the result with a preset number of levels;
if the current frame is an unvoiced frame, or the periodicity attribute of the speech signal is aperiodic, performing bit quantization on the pitch period of the speech signal.
In one embodiment, performing bit quantization on the pitch period of the speech signal if the current frame is an unvoiced frame, or the periodicity attribute of the speech signal is aperiodic, comprises:
if the current frame is an unvoiced frame, quantizing the bits corresponding to the pitch period of the speech signal to a first value;
if the periodicity attribute of the speech signal is aperiodic, quantizing the bits corresponding to the pitch period of the speech signal to a second value.
In one embodiment, determining the quantization method corresponding to the speech parameter according to the transmission rate of the speech signal and the speech parameter, and quantizing the speech parameter, comprises:
if the transmission rate of the speech signal is 2400 bps and the speech parameter is a line spectral frequency (LSF) parameter, quantizing the LSF parameter using a preset three-stage vector codebook.
In one embodiment, determining the quantization method corresponding to the speech parameter according to the transmission rate of the speech signal and the speech parameter, and quantizing the speech parameter, comprises:
if the transmission rate of the speech signal is 1200 bps, determining the superframe mode of the speech signal according to the voiced/unvoiced types of the time frames carrying the speech signal;
according to the speech parameter and the superframe mode, determining the quantization method corresponding to the speech parameter and quantizing the speech parameter.
In one embodiment, determining the quantization method corresponding to the speech parameter according to the speech parameter and the superframe mode, and quantizing the speech parameter, comprises:
if the speech parameter is an LSF parameter, quantizing the LSF parameter with a preset codebook quantization method according to the superframe mode.
In one embodiment, quantizing the LSF parameter with a preset codebook quantization method according to the superframe mode comprises:
when the superframe mode comprises three adjacent time frames and the three adjacent time frames include at least two voiced frames, quantizing the LSF parameter of the third frame of the current superframe using a preset three-stage quantization codebook, and quantizing the LSF parameters of the first frame and the second frame according to the quantized LSF value of the third frame of the previous superframe and the quantized LSF value of the third frame of the current superframe, the third frame being the last time frame of the superframe in temporal order.
In one embodiment, quantizing the LSF parameters of the first frame and the second frame according to the quantized LSF value of the third frame of the previous superframe and the quantized LSF value of the third frame of the current superframe comprises:
according to the quantized LSF value of the third frame of the previous superframe and the quantized LSF value of the third frame of the current superframe, determining, for each prediction coefficient in a prediction coefficient codebook, the corresponding quantized LSF value of the first frame and quantized LSF value of the second frame;
determining a target prediction coefficient according to the LSF parameter of the first frame, the LSF parameter of the second frame, the quantized LSF value of the first frame, and the quantized LSF value of the second frame;
determining a residual vector according to the target prediction coefficient, and quantizing the residual vector using a preset two-stage vector codebook.
In one embodiment, determining the target prediction coefficient according to the LSF parameter of the first frame, the LSF parameter of the second frame, the quantized LSF value of the first frame, and the quantized LSF value of the second frame comprises:
according to the LSF parameter of the first frame, the LSF parameter of the second frame, the quantized LSF value of the first frame, and the quantized LSF value of the second frame, determining the prediction error corresponding to each prediction coefficient in the prediction coefficient codebook;
taking the prediction coefficient with the smallest prediction error as the target prediction coefficient.
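The selection of the target prediction coefficient can be sketched in Python as follows. The text does not say how a prediction coefficient combines the two third-frame values, so the linear interpolation used here, the pair-per-entry codebook layout, the squared-error measure, and all function names are assumptions made for illustration only.

```python
def interpolate(prev_q, cur_q, alpha):
    """Predict an LSF vector as a weighted mix of two quantized third-frame vectors.

    Linear interpolation is an assumption; the source only says that a
    prediction coefficient maps the two quantized vectors to a prediction.
    """
    return [alpha * p + (1.0 - alpha) * c for p, c in zip(prev_q, cur_q)]


def select_predictor(lsf1, lsf2, prev_q, cur_q, coeff_codebook):
    """Return (index of the best coefficient pair, residual vector).

    coeff_codebook is a hypothetical list of (alpha_frame1, alpha_frame2) pairs.
    """
    best = None
    for idx, (a1, a2) in enumerate(coeff_codebook):
        pred1 = interpolate(prev_q, cur_q, a1)
        pred2 = interpolate(prev_q, cur_q, a2)
        # Prediction error summed over both frames (squared error assumed).
        err = sum((x - y) ** 2 for x, y in zip(lsf1 + lsf2, pred1 + pred2))
        if best is None or err < best[0]:
            best = (err, idx, pred1 + pred2)
    _, idx, pred = best
    # The residual would then go to the two-stage vector codebook.
    residual = [x - y for x, y in zip(lsf1 + lsf2, pred)]
    return idx, residual
```

In this sketch the residual concatenates the first-frame and second-frame errors into one vector; whether the patent quantizes them jointly or separately is not stated.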
In one embodiment, quantizing the LSF parameter with a preset codebook quantization method according to the superframe mode comprises:
when the superframe mode comprises three adjacent time frames and the three adjacent time frames include one voiced frame, quantizing the LSF parameter of the voiced frame in the superframe mode using a preset three-stage vector codebook, and quantizing the LSF parameters of the unvoiced frames in the superframe mode using a preset one-stage vector codebook.
In one embodiment, quantizing the LSF parameter with a preset codebook quantization method according to the superframe mode comprises:
when the superframe mode comprises three adjacent time frames and all three adjacent time frames are unvoiced frames, quantizing the LSF parameters of the unvoiced frames in the superframe mode using a preset one-stage vector codebook.
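The three LSF cases above (at least two voiced frames, exactly one voiced frame, no voiced frame) can be summarized with a small Python sketch. The "U"/"V" string notation follows the superframe description later in the document; the function name and the returned labels are hypothetical and only describe which codebook each frame would use.

```python
def lsf_plan(voicing):
    """Map a three-frame voicing pattern such as "UVV" to per-frame LSF strategies."""
    if len(voicing) != 3 or set(voicing) - {"U", "V"}:
        raise ValueError("expected three characters, each 'U' or 'V'")
    voiced = voicing.count("V")
    if voiced >= 2:
        # Third frame: three-stage codebook; frames 1 and 2: prediction from the
        # previous and current third frames plus a two-stage residual codebook.
        return ["predicted + 2-stage residual",
                "predicted + 2-stage residual",
                "3-stage VQ"]
    if voiced == 1:
        return ["3-stage VQ" if v == "V" else "1-stage VQ" for v in voicing]
    return ["1-stage VQ"] * 3
```

For example, `lsf_plan("UVU")` assigns the three-stage codebook to the middle frame only.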
In one embodiment, determining the quantization method corresponding to the speech parameter according to the speech parameter and the superframe mode, and quantizing the speech parameter, comprises:
if the speech parameters are the pitch period and the voiced/unvoiced type, quantizing the pitch period and the voiced/unvoiced type with a preset bit quantization method according to the superframe mode.
In one embodiment, quantizing the pitch period and the voiced/unvoiced type with the preset 5-bit quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames and all three adjacent time frames are unvoiced frames, quantizing the bits corresponding to the pitch period and the voiced/unvoiced type to the second value.
In one embodiment, quantizing the pitch period and the voiced/unvoiced type with the preset 5-bit quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames and the three adjacent time frames include one unvoiced frame, quantizing the bit corresponding to the voiced/unvoiced type to the second value, converting the pitch period of the voiced frame in the superframe mode to its logarithm, and determining a target quantization value according to the conversion result and the voiced/unvoiced type.
In one embodiment, determining the target quantization value according to the conversion result and the voiced/unvoiced type comprises:
uniformly quantizing the conversion result to obtain a uniform quantization coefficient;
determining the target quantization value according to the correspondence between the uniform quantization coefficient, the voiced/unvoiced type, and a preset codebook index.
In one embodiment, quantizing the pitch period and the voiced/unvoiced type with the preset 5-bit quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames and the three adjacent time frames include two voiced frames, quantizing the bit corresponding to the voiced/unvoiced type of each voiced frame to the second value and the bit corresponding to the voiced/unvoiced type of the unvoiced frame to the first value, and quantizing the pitch periods of the three adjacent time frames using a preset vector codebook.
In one embodiment, quantizing the pitch period and the voiced/unvoiced type with the preset 5-bit quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames and the three adjacent time frames include one voiced frame, obtaining an N-bit codebook index according to a preset vector codebook; quantizing the voiced/unvoiced type according to the value of the first M bits of the N-bit codebook index, and taking the low N-M bits of the N-bit codebook index as the quantized value of the pitch period, where N and M are positive integers and N is greater than M.
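The split of an N-bit codebook index into a voicing field (the first M bits) and a pitch field (the low N-M bits) is plain bit arithmetic. The Python sketch below shows one way to do it; the function name is hypothetical and the source does not fix concrete values of N and M.

```python
def split_codebook_index(index, n_bits, m_bits):
    """Split an N-bit index into (first M bits, low N-M bits)."""
    if not (0 < m_bits < n_bits):
        raise ValueError("require 0 < M < N")
    if not (0 <= index < (1 << n_bits)):
        raise ValueError("index does not fit in N bits")
    low_bits = n_bits - m_bits
    voicing_field = index >> low_bits            # first (most significant) M bits
    pitch_field = index & ((1 << low_bits) - 1)  # low N-M bits
    return voicing_field, pitch_field
```

With N = 6 and M = 2, the index 0b101101 splits into the voicing field 0b10 and the pitch field 0b1101.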
A speech parameter quantization device, comprising:
an obtaining module, configured to obtain the speech parameters of a speech signal using a preset parameter analysis method;
a determining module, configured to determine, according to the transmission rate of the speech signal and the speech parameter, the quantization method corresponding to the speech parameter and to quantize the speech parameter.
Computer equipment, comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the following steps when executing the computer program:
obtaining the speech parameters of a speech signal using a preset parameter analysis method;
according to the transmission rate of the speech signal and the speech parameter, determining the quantization method corresponding to the speech parameter and quantizing the speech parameter.
A computer-readable storage medium storing a computer program, wherein the following steps are implemented when the computer program is executed by a processor:
obtaining the speech parameters of a speech signal using a preset parameter analysis method;
according to the transmission rate of the speech signal and the speech parameter, determining the quantization method corresponding to the speech parameter and quantizing the speech parameter.
With the above speech parameter quantization method, device, computer equipment, and storage medium, the speech parameters of a speech signal are obtained using a preset parameter analysis method, and the quantization method corresponding to each speech parameter is determined according to the transmission rate of the speech signal and the speech parameter and used to quantize it. Different quantization methods can therefore be configured flexibly for different transmission rates and different speech parameters, which suits different scenarios and meets different user needs.
Brief description of the drawings
Fig. 1 is a flowchart of a speech parameter quantization method provided by one embodiment of this application;
Fig. 2 is a flowchart of a speech parameter quantization method provided by another embodiment of this application;
Fig. 3 is a block diagram of a speech parameter quantization device provided by one embodiment of this application;
Fig. 4 is a block diagram of a speech parameter quantization device provided by another embodiment of this application;
Fig. 5 is a block diagram of a speech parameter quantization device provided by a further embodiment of this application;
Fig. 6 is an internal structure diagram of the computer equipment in one embodiment.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the application and are not intended to limit it.
A vocoder can generally be divided into an encoder and a decoder. The encoder converts an audio signal sampled at 8000 Hz with 16-bit uniform quantization into a 2400 bps or 1200 bps bit stream for transmission; the decoder recovers the speech-synthesis parameters from the bit stream, synthesizes the audio signal, and outputs 8000 Hz, 16-bit audio data. In this application the encoder is the executing entity and the description focuses on the quantization schemes in the encoder; decoding is the inverse of encoding.
Fig. 1 is a flowchart of a speech parameter quantization method provided by one embodiment of this application. The method is executed by the encoder. As shown in Fig. 1, the method includes:
Step 101: obtain the speech parameters of the speech signal using a preset parameter analysis method.
The speech parameters may include the pitch period, the voiced/unvoiced types of the subbands, linear predictive coding (LPC) parameters, line spectral frequency (LSF) parameters, gain parameters, Fourier magnitude parameters, and so on.
In this embodiment, LPC parameters are extracted based on a 10th-order linear filter. Specifically, the synthesis-filter transfer function [formula omitted in the source text] can be used to extract the LPC parameters, where a_i are the LPC (linear prediction) coefficients. After LPC analysis, the LPC coefficients can be converted to LSF parameters. The pitch period of the speech signal can be obtained with the normalized amplitude-difference energy method or the autocorrelation method, and the pitch period can also be smoothed to obtain an integer pitch period. For the subband voiced/unvoiced types, the speech signal can be divided into 5 subbands; a voicing strength is computed separately on each subband and a voiced/unvoiced decision is made to obtain the voiced/unvoiced type. For each input frame of the speech signal, two gain values G1 and G2 can be computed at different window positions and used as the gain parameters. The magnitudes of the first ten harmonics of the pitch frequency in the residual spectrum of the speech signal can also be extracted as the Fourier magnitude parameters. This application does not limit how the speech parameters are obtained.
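As a rough illustration of the autocorrelation approach to pitch estimation mentioned above, the following Python sketch picks the lag with the largest autocorrelation. The lag range of 20 to 144 samples is borrowed from the pitch range quoted for the 2400 bps pitch quantizer; the function name, the use of the plain (biased) autocorrelation, and the absence of smoothing or voicing checks are simplifying assumptions, not the patented analysis.

```python
import math


def estimate_pitch_autocorr(frame, min_lag=20, max_lag=144):
    """Return the lag (in samples) that maximizes the biased autocorrelation.

    The frame must be longer than max_lag samples.
    """
    best_lag, best_r = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag


# Synthetic input: a sine with a 50-sample period (160 Hz at 8000 Hz sampling).
tone = [math.sin(2 * math.pi * n / 50) for n in range(400)]
pitch = estimate_pitch_autocorr(tone)
```

Because fewer products are summed at larger lags, the biased autocorrelation favors the fundamental period over its multiples, which keeps the sketch from locking onto a lag of 100 in this example.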
Step 102: according to the transmission rate of the speech signal and the speech parameter, determine the quantization method corresponding to the speech parameter and quantize the speech parameter.
The transmission rate of the speech signal is typically 2400 bps or 1200 bps but may be another rate; this application does not limit it.
In this embodiment, different quantization methods can be configured, as needed, for different transmission rates and different speech parameters of the speech signal. For example, at a transmission rate of 2400 bps the LSF parameters can be quantized with a 3-stage codebook; at 1200 bps, different quantization methods can additionally be set for the different superframe modes of the speech signal; at 2400 bps, the pitch period and the aperiodic flag can be uniformly quantized; at 1200 bps, the pitch period and the voiced/unvoiced type can be jointly quantized; and so on.
In the speech parameter quantization method provided by the embodiments of this application, the speech parameters of a speech signal are obtained using a preset parameter analysis method, and the quantization method corresponding to each speech parameter is determined according to the transmission rate of the speech signal and the speech parameter and used to quantize it. Different quantization methods can therefore be configured flexibly for different transmission rates and different speech parameters, which suits different scenarios and meets different user needs.
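One simple way to realize "a quantization method per transmission rate and speech parameter" is a lookup table keyed by both. The Python sketch below is only an illustration of that idea: the table entries are placeholder functions that return a label, and every name in it is hypothetical rather than taken from the patent.

```python
from typing import Callable, Dict, Tuple

Quantizer = Callable[[object], str]


def _placeholder(name: str) -> Quantizer:
    """Build a stand-in quantizer that just reports which scheme was chosen."""
    return lambda value: f"{name}({value})"


# (transmission rate in bps, speech parameter) -> quantization method
QUANTIZERS: Dict[Tuple[int, str], Quantizer] = {
    (2400, "lsf"): _placeholder("three_stage_vq"),
    (2400, "pitch"): _placeholder("pitch_with_aperiodic_flag"),
    (1200, "lsf"): _placeholder("superframe_mode_dependent_vq"),
    (1200, "pitch"): _placeholder("joint_pitch_voicing"),
}


def quantize(rate_bps: int, parameter: str, value: object) -> str:
    """Select the quantizer for this rate and parameter, then apply it."""
    try:
        quantizer = QUANTIZERS[(rate_bps, parameter)]
    except KeyError:
        raise ValueError(f"no quantizer for {parameter} at {rate_bps} bps") from None
    return quantizer(value)
```

Supporting another rate or parameter then means adding a table entry rather than changing the selection logic.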
The quantization of the pitch period at a transmission rate of 2400 bps is described below.
In one embodiment, determining the quantization method corresponding to the speech parameter according to the transmission rate of the speech signal and the speech parameter, and quantizing the speech parameter, includes: if the transmission rate of the speech signal is 2400 bps and the speech parameter is the pitch period, quantizing the pitch period according to the periodicity attribute of the speech signal or the voiced/unvoiced type of the current frame carrying the speech signal.
In this embodiment, if the transmission rate of the speech signal is 2400 bps, the pitch period of the speech signal and the aperiodic flag can be jointly quantized according to the periodicity attribute of the speech signal or the voiced/unvoiced type of the current frame. For example, the pitch period is quantized and coded with 8 bits, and the quantized value also indicates the voiced/unvoiced type of the frame and the aperiodic flag.
Optionally, the step "quantizing the pitch period according to the periodicity attribute of the speech signal or the voiced/unvoiced type of the current frame carrying the speech signal" includes: if the current frame is a voiced frame, converting the pitch period of the speech signal in the current frame to its logarithm and uniformly quantizing the result with a preset number of levels; if the current frame is an unvoiced frame, or the periodicity attribute of the speech signal is aperiodic, performing bit quantization on the pitch period of the speech signal.
In this embodiment, the voiced/unvoiced type of the current frame is determined first. If the current frame is a voiced frame, its pitch period can be converted to its logarithm and the result uniformly quantized with a preset number of levels. For example, if the current frame is a voiced frame, the pitch period of the voiced frame is first converted to a base-10 or base-2 logarithm and then uniformly quantized with 254 levels; if the pitch period range is lg(20) to lg(144), the quantization result is an index between 2 and 255. If the current frame is an unvoiced frame, or the periodicity attribute of the speech signal is aperiodic, the pitch period of the speech signal is bit-quantized using a preset quantization value.
Optionally, performing bit quantization on the pitch period of the speech signal if the current frame is an unvoiced frame, or the periodicity attribute of the speech signal is aperiodic, includes: if the current frame is an unvoiced frame, quantizing the bits corresponding to the pitch period of the speech signal to a first value; if the periodicity attribute of the speech signal is aperiodic, quantizing the bits corresponding to the pitch period of the speech signal to a second value.
In this embodiment, the first value may be 1 and the second value 0, or the first value may be 0 and the second value 1. The embodiments of this application mainly use a first value of 1 and a second value of 0 as the example, but are not limited to this. For example, with 8-bit quantization coding, if the current frame is an unvoiced frame, all 8 coded bits are set to 1; if the aperiodic flag of the current frame is 1, all 8 coded bits are set to 0.
In the embodiments of this application, when the transmission rate of the speech signal is 2400 bps, the pitch period and the aperiodic flag of the speech signal are jointly quantized. The quantized pitch value indicates not only the voiced/unvoiced type of the frame but also the aperiodic flag, so the pitch code covers a wider range of representation, and the computational cost of the method is low.
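The voiced-frame branch of this pitch quantizer (logarithm, then 254-level uniform quantization onto indices 2 to 255) can be sketched in Python as below. Base-10 logarithms and the 20 to 144 range follow the example in the text; placing the levels at the range end points, clamping out-of-range input, and the function names are assumptions. The special codes for unvoiced and aperiodic frames are deliberately not modelled here.

```python
import math

LOG_MIN = math.log10(20.0)    # lower end of the pitch range in the example
LOG_MAX = math.log10(144.0)   # upper end of the pitch range in the example
LEVELS = 254                  # number of uniform quantization levels
FIRST_INDEX = 2               # voiced indices run from 2 to 255
STEP = (LOG_MAX - LOG_MIN) / (LEVELS - 1)


def quantize_voiced_pitch(pitch_period):
    """Map a voiced-frame pitch period (in samples) to an index in 2..255."""
    x = min(max(math.log10(pitch_period), LOG_MIN), LOG_MAX)
    return FIRST_INDEX + int(round((x - LOG_MIN) / STEP))


def dequantize_voiced_pitch(index):
    """Recover the pitch period represented by a voiced index."""
    return 10.0 ** (LOG_MIN + (index - FIRST_INDEX) * STEP)
```

Quantizing in the log domain keeps the relative error roughly constant across the pitch range, which is the usual motivation for this kind of design.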
The quantization of the LSF parameters at a transmission rate of 2400 bps is described below.
In one embodiment, determining the quantization method corresponding to the speech parameter according to the transmission rate of the speech signal and the speech parameter, and quantizing the speech parameter, includes: if the transmission rate of the speech signal is 2400 bps and the speech parameter is an LSF parameter, quantizing the LSF parameter using a preset three-stage vector codebook.
In this embodiment, based on a multi-stage vector quantization (Multi-Stage Vector Quantizer, MSVQ) scheme, a 3-stage (7+6+6) vector codebook can be used to quantize the LSF parameters; the stages contain 128, 64, and 64 LSF codebook vectors respectively. The quantization result is the set of codebook vector indices, occupying 19 bits in total. The specific quantization method is as follows: first, the LSF parameters obtained by converting the LPC parameters are sorted in ascending order, and adjacent LSF parameters are kept at least 50 Hz apart, giving the sorted LSF parameter vector f; then each vector codebook is searched according to the weighted Euclidean distance criterion given by the following formula: [formula omitted in the source text]
Here f̂ is a vector from the codebook, formed as the sum of the vectors from the 3 codebook stages. The codebook vectors for which d_lsp(f, f̂) is smallest are the search result, that is, the quantization result of the LSF parameters. w(i) is the weighting coefficient, which can be calculated by the following formula: [formula omitted in the source text]
Here p(f(i)) is the power spectral density of the prediction filter output at the frequency corresponding to the i-th line spectral frequency parameter.
In the embodiments of this application, if the transmission rate of the speech signal is 2400 bps and the speech parameter is an LSF parameter, the LSF parameter is quantized using a preset three-stage vector codebook, which keeps the computational cost low.
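A minimal Python sketch of this LSF quantizer follows. Since the distance formula is not reproduced in the text, the sketch assumes the common weighted squared form d(f, f̂) = Σ w(i)·(f(i) − f̂(i))², takes the weights w as an input instead of deriving them from the power spectral density, and searches the stages one after another (a greedy search) rather than jointly. All function names are hypothetical.

```python
def enforce_min_spacing(lsf_hz, min_gap=50.0):
    """Sort LSFs in ascending order and push neighbours at least min_gap Hz apart."""
    out = sorted(lsf_hz)
    for i in range(1, len(out)):
        if out[i] - out[i - 1] < min_gap:
            out[i] = out[i - 1] + min_gap
    return out


def weighted_distance(f, f_hat, w):
    """Weighted squared Euclidean distance (assumed form of d_lsp)."""
    return sum(wi * (a - b) ** 2 for wi, a, b in zip(w, f, f_hat))


def msvq_search(f, stages, w):
    """Greedy multi-stage search: each stage refines the running sum f_hat.

    stages is a list of codebooks, e.g. with 128, 64 and 64 entries (7+6+6 bits).
    Returns the per-stage indices and the accumulated approximation.
    """
    f_hat = [0.0] * len(f)
    indices = []
    for codebook in stages:
        def cost(k):
            candidate = [a + c for a, c in zip(f_hat, codebook[k])]
            return weighted_distance(f, candidate, w)
        best = min(range(len(codebook)), key=cost)
        f_hat = [a + c for a, c in zip(f_hat, codebook[best])]
        indices.append(best)
    return indices, f_hat
```

A greedy search is cheaper than testing every combination of the three stages but may miss the jointly optimal combination; practical MSVQ coders often keep several candidates per stage as a compromise.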
The quantization of the gain at a transmission rate of 2400 bps is described below.
In this embodiment, two gains, G1 and G2, are quantized for each frame. G2 is uniformly quantized with 5 bits over the range 10 dB to 77 dB. Quantizing G1 requires G2 and PG2 (PG2 is the G2 of the previous frame) and uses the following adaptive algorithm with 3 bits of uniform quantization: if |G2 - PG2| < 5 and |G1 - 0.5(G2 + PG2)| < 3.0, G1 is coded as the 3-bit value 0; otherwise, G1 is uniformly quantized with 7 levels (3 bits) within the range (g_min, g_max), and the quantization result is an index between 1 and 7. g_min and g_max are calculated by the following formulas and limited to the range 10 to 77: g_min = min(G2, PG2) - 6.0, g_max = max(G2, PG2) + 6.0.
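The adaptive gain rule above translates almost directly into code. The Python sketch below assumes that the uniform levels sit on the range end points, that the unquantized G2 and PG2 are used in the comparison, and that both lie within 10 to 77 dB; the function names are hypothetical.

```python
def uniform_index(x, lo, hi, levels):
    """Index of the nearest of `levels` equally spaced values in [lo, hi]."""
    x = min(max(x, lo), hi)
    step = (hi - lo) / (levels - 1)
    return int(round((x - lo) / step))


def quantize_gains(g1, g2, prev_g2):
    """Return (G1 code in 0..7, G2 code in 0..31) for one frame.

    g2 and prev_g2 are assumed to lie in the 10..77 dB range.
    """
    g2_code = uniform_index(g2, 10.0, 77.0, 32)   # 5 bits
    if abs(g2 - prev_g2) < 5.0 and abs(g1 - 0.5 * (g2 + prev_g2)) < 3.0:
        return 0, g2_code                          # G1 inferred from G2 and PG2
    g_min = min(max(min(g2, prev_g2) - 6.0, 10.0), 77.0)
    g_max = min(max(max(g2, prev_g2) + 6.0, 10.0), 77.0)
    g1_code = 1 + uniform_index(g1, g_min, g_max, 7)  # 7 levels -> codes 1..7
    return g1_code, g2_code
```

Code 0 acts as a "steady gain" signal: when the gain barely changes between frames, the decoder can reconstruct G1 from the mean of G2 and PG2 without spending levels on it.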
The quantization of the voiced/unvoiced type at a transmission rate of 2400 bps is described below.
In this embodiment, the voiced/unvoiced types of the subbands can be quantized using the unvoiced/voiced (U/V) decisions of the 5 subbands. Each subband uses 1 bit to indicate unvoiced or voiced; for example, a bit value of 1 indicates voiced and a bit value of 0 indicates unvoiced. The 5 subbands would need 5 bits in total, but because the voicing of the lowest-frequency subband is determined from the quantized pitch value, only the voicing of the 4 higher-frequency subbands needs to be quantized, so 4 bits are actually required. The quantization rules are as follows:
If the voicing decision for the lowest-frequency subband is unvoiced, the current frame is an unvoiced frame, and the U/V decisions of the 4 higher-frequency subbands are all quantized to 0. If the voicing decision for the lowest-frequency subband is voiced, the current frame is a voiced frame, and the U/V decision of each of the 4 higher-frequency subbands is based on its voicing strength Vbp_i: if Vbp_i > 0.6, the bit for that subband is quantized to 1 (voiced); otherwise it is quantized to 0 (unvoiced). If the quantization result for the 4 subbands is 0001, it is forced to 0000, that is, the highest-frequency subband is forced to be treated as an unvoiced subband.
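The band-voicing rules above can be sketched in a few lines of Python. The bit order (lowest of the four higher bands first, highest band last) is an assumption chosen so that the pattern 0001 means "only the highest band voiced"; the function name is hypothetical.

```python
def quantize_band_voicing(lowest_band_voiced, strengths, threshold=0.6):
    """Return the 4 voicing bits for the higher-frequency subbands.

    strengths holds the voicing strengths Vbp_i of the 4 higher bands,
    ordered from low to high frequency.
    """
    if len(strengths) != 4:
        raise ValueError("expected voicing strengths for 4 subbands")
    if not lowest_band_voiced:
        return [0, 0, 0, 0]           # unvoiced frame: all bands unvoiced
    bits = [1 if v > threshold else 0 for v in strengths]
    if bits == [0, 0, 0, 1]:
        bits = [0, 0, 0, 0]           # an isolated voiced top band is suppressed
    return bits
```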
The following describes the quantization method for the Fourier magnitudes when the transmission rate is 2400 bps.
In this embodiment, the Fourier magnitudes of the first 10 harmonics of the prediction residual signal are detected and quantized. If there are fewer than 10 harmonics, the magnitudes of the missing higher harmonics are set to 1 so that a 10-dimensional vector is formed for quantization. At decoding, the spectrum above the 10th harmonic is flattened, and the normalized energy of those harmonics is represented by 1 in the sequence obtained by the decoder. The 10-dimensional harmonic magnitude vector may be quantized using an 8-bit vector codebook. The codebook search uses a full-search algorithm, and the distortion measure is the weighted Euclidean distance criterion shown in the following formula:
where A is a vector in the codebook and w(i) is the weighting coefficient, which is calculated by the following formula:
where f_i = 8000i/60 is the frequency corresponding to the i-th harmonic for a pitch period of 60.
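A minimal Python sketch of the padding and full-search steps is given below for illustration only. The weighting formula for w(i) is not reproduced in this text, so the weights are passed in as an argument rather than computed. The squared-difference form of the weighted Euclidean distance, and the names `harmonic_frequencies`, `pad_magnitudes` and `full_search`, are assumptions of this sketch.

```python
def harmonic_frequencies(num_harmonics=10, sample_rate=8000, pitch_period=60):
    """f_i = 8000 * i / 60 for i = 1..10, as stated in the description."""
    return [sample_rate * i / pitch_period for i in range(1, num_harmonics + 1)]


def pad_magnitudes(magnitudes, size=10):
    """Keep the first `size` harmonics; missing higher harmonics are set to 1."""
    return (list(magnitudes) + [1.0] * size)[:size]


def full_search(target, codebook, weights):
    """Exhaustively search the codebook for the entry with the smallest
    weighted squared Euclidean distance to `target` (assumed distance form)."""
    best_index, best_dist = -1, float("inf")
    for index, code in enumerate(codebook):
        dist = sum(w * (x - a) ** 2 for w, x, a in zip(weights, target, code))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index, best_dist
```

With an 8-bit codebook the search would iterate over 256 entries; the returned index is the value that would be transmitted.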
Table 1 shows the quantization result for the speech parameters at a transmission rate of 2400 bps, and Table 2 shows the transmission order of a speech frame. As shown in Tables 1 and 2, each frame comprises 48 bits, of which 1 bit indicates the frame type, that is, whether the current frame is a speech frame or a non-speech frame, and the remaining bits represent the speech parameters.
Table 1
Table 2
The above embodiments mainly describe the speech parameter quantization method at a transmission rate of 2400 bps. The following focuses on the speech parameter quantization method at a transmission rate of 1200 bps. In 1200 bps mode, three-frame joint quantization is used: each 60 ms superframe is quantized once and 72 bits are output. Table 3 shows the superframe modes, and Table 4 shows the bit allocation for the 5 superframe modes. As shown in Table 3, a superframe can be classified into 5 modes according to the voiced/unvoiced types of its 3 frames, where U denotes an unvoiced frame and V denotes a voiced frame.
Table 3
Table 4
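The bit budgets quoted for the two rates can be checked with a short calculation, sketched below in Python. The helper name `bits_per_block` is hypothetical, and the 20 ms frame duration is an inference (one third of the 60 ms superframe), not a figure stated in the text.

```python
def bits_per_block(rate_bps, duration_ms):
    """Number of bits produced in `duration_ms` milliseconds at `rate_bps`."""
    bits, remainder = divmod(rate_bps * duration_ms, 1000)
    if remainder:
        raise ValueError("rate and duration do not give a whole number of bits")
    return bits


# 1200 bps over one 60 ms superframe gives the 72 bits stated above.
superframe_bits = bits_per_block(1200, 60)
# 2400 bps over an assumed 20 ms frame gives the 48 bits per frame stated above.
frame_bits = bits_per_block(2400, 20)
```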
In one embodiment, determining the quantization method corresponding to the speech parameter according to the transmission rate of the voice signal and the speech parameter, and quantizing the speech parameter, comprises: if the transmission rate of the voice signal is 1200 bps, determining the superframe mode of the voice signal according to the voiced/unvoiced types of the time frames carrying the voice signal; and determining, according to the speech parameter and the superframe mode, the quantization method corresponding to the speech parameter, and quantizing the speech parameter.
In this embodiment, if the transmission rate of the voice signal is 1200 bps, superframes can be divided into 5 modes according to the voiced/unvoiced types of the time frames, as shown in Table 3, and the quantization method for each speech parameter is then determined according to the speech parameter and the superframe mode.
The following describes the quantization method for the LSF parameters when the transmission rate of the voice signal is 1200 bps.
In one embodiment, determining the quantization method corresponding to the speech parameter according to the speech parameter and the superframe mode, and quantizing the speech parameter, comprises: if the speech parameter is an LSF parameter, quantizing the LSF parameter using a preset codebook quantization method according to the superframe mode.
Optionally, quantizing the LSF parameter using a preset codebook quantization method according to the superframe mode comprises: if the superframe comprises three adjacent time frames of which at least two are voiced frames, quantizing the LSF parameters of the third frame of the current superframe using a preset three-stage quantization codebook, and quantizing the LSF parameters of the first frame and the second frame according to the quantized LSF parameters of the third frame of the previous superframe and the quantized LSF parameters of the third frame of the current superframe, where the third frame is the temporally last time frame in the superframe.
In this embodiment, if the superframe mode is Mode1 or Mode2 in Table 3, the LSF parameters of the third frame are quantized using a 7-6-6-bit three-stage vector codebook; the method and codebook are identical to those of the 2400 bps LSF quantization. If the superframe mode is Mode3, the LSF parameters of the third frame are quantized using a 9-bit single-stage vector codebook. The LSFs of the first and second frames are then quantized predictively from the quantized third-frame LSF of the current superframe and the quantized third-frame LSF of the previous superframe.
Optionally, as shown in Fig. 2, the step of "quantizing the LSF parameters of the first frame and the second frame according to the quantized LSF parameters of the third frame of the previous superframe and the quantized LSF parameters of the third frame of the current superframe" comprises:
Step 201: according to the quantized LSF parameters of the third frame of the previous superframe and the quantized LSF parameters of the third frame of the current superframe, determine, for each prediction coefficient in the prediction coefficient codebook, the corresponding quantized LSF parameters of the first frame and of the second frame.
In this embodiment, let $\hat{f}_3(i)$ be the quantized value of the third-frame LSF in the current superframe and $\hat{f}_p(i)$ the quantized value of the third-frame LSF in the previous superframe. The quantized value $\hat{f}_1(i)$ of the first-frame LSF parameters and the quantized value $\hat{f}_2(i)$ of the second-frame LSF parameters are then calculated according to the following formula:
where i = 1, ..., 10, and a1 and a2 are prediction coefficients from the 2-bit prediction codebook. The prediction coefficient codebook contains 4 prediction coefficient vectors in total, and the corresponding $\hat{f}_1(i)$ and $\hat{f}_2(i)$ are computed in turn for each of them.
Step 202: determine the target prediction coefficient according to the LSF parameters of the first frame, the LSF parameters of the second frame, the quantized LSF parameters of the first frame, and the quantized LSF parameters of the second frame.
Optionally, determining the target prediction coefficient according to the LSF parameters of the first frame, the LSF parameters of the second frame, the quantized LSF parameters of the first frame and the quantized LSF parameters of the second frame comprises: determining, from these four quantities, the prediction error corresponding to each prediction coefficient in the prediction coefficient codebook; and determining the prediction coefficient with the smallest prediction error to be the target prediction coefficient.
In this embodiment, the prediction error E is calculated from the LSF parameters f1(i) of the first frame, the LSF parameters f2(i) of the second frame, the quantized LSF parameters of the first frame and the quantized LSF parameters of the second frame. The prediction coefficient corresponding to the minimum value of E is determined to be the target prediction coefficient, and the prediction codebook index corresponding to the minimum value of E is the quantization result, that is, the index of the target prediction coefficient in its codebook.
Step 203: determine the residual vector according to the target prediction coefficient, and quantize the residual vector using a preset two-stage vector codebook.
In this embodiment, once the target prediction coefficient has been determined, that is, after the search has found the optimal prediction coefficient vector, the prediction residuals r1(i) and r2(i) corresponding to that prediction coefficient vector are calculated, where i = 1, ..., 10. Finally, r1(i) and r2(i) are combined into the residual vector R = [r1(1), ..., r1(10), r2(1), ..., r2(10)], which is vector-quantized using an 8-6-bit two-stage vector codebook.
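Steps 201 to 203 may be sketched in Python as below. The prediction, error and residual formulas are not reproduced in this text, so the sketch makes three labeled assumptions: each predicted frame is a linear interpolation between the previous and current third-frame quantized LSFs, the error E is an unweighted sum of squared differences, and the residual is the difference between the original and predicted LSFs. The function names and the use of scalar coefficient pairs (a1, a2) per codebook entry are likewise hypothetical.

```python
def predict(a, prev3_q, cur3_q):
    """ASSUMED prediction form: a * previous + (1 - a) * current, per LSF index."""
    return [a * p + (1.0 - a) * c for p, c in zip(prev3_q, cur3_q)]


def squared_error(x, y):
    """Unweighted squared error (ASSUMED; the source formula is not shown)."""
    return sum((u - v) ** 2 for u, v in zip(x, y))


def search_predictor(f1, f2, prev3_q, cur3_q, coeff_codebook):
    """Try every (a1, a2) entry, keep the one with the smallest error E,
    and return its index together with the stacked residual vector R."""
    best = None
    for index, (a1, a2) in enumerate(coeff_codebook):
        f1_hat = predict(a1, prev3_q, cur3_q)   # step 201, first frame
        f2_hat = predict(a2, prev3_q, cur3_q)   # step 201, second frame
        err = squared_error(f1, f1_hat) + squared_error(f2, f2_hat)  # step 202
        if best is None or err < best[1]:
            best = (index, err, f1_hat, f2_hat)
    index, _, f1_hat, f2_hat = best
    # Step 203: R = [r1(1..n), r2(1..n)], to be vector-quantized afterwards.
    residual = [x - y for x, y in zip(f1, f1_hat)]
    residual += [x - y for x, y in zip(f2, f2_hat)]
    return index, residual
```

With 10 LSF parameters per frame the residual has 20 components, matching the dimension of R described above; the subsequent two-stage vector quantization of R is not modeled here.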
Optionally, in Mode2, the LSF parameters of the third frame are first quantized using the 7-6-6-bit three-stage vector codebook, and the LSF parameters of the first and second frames are then quantized using the above prediction method. Unlike Mode1, when searching for the prediction coefficients a1 and a2, Mode2 uses a 4-bit, 16-entry prediction codebook, whereas Mode1 uses a 2-bit, 4-entry prediction codebook.
Optionally, in Mode3, the LSF parameters of the third frame are first quantized using a 9-bit single-stage vector codebook, and the LSF parameters of the first and second frames are then quantized using the above prediction method. The prediction method is the same as in Mode1 and Mode2, and the codebook used to search for the prediction coefficients a1 and a2 is the same as in Mode1, but the prediction residual is quantized using an 8-6-6-6 four-stage vector codebook.
Optionally, quantizing the LSF parameter using a preset codebook quantization method according to the superframe mode comprises: if the superframe comprises three adjacent time frames of which one is a voiced frame, quantizing the LSF parameters of the voiced frame in the superframe using a preset three-stage vector codebook, and quantizing the LSF parameters of the unvoiced frames in the superframe using a preset single-stage vector codebook.
In this embodiment, in Mode4 the LSF parameters of the three frames are quantized directly, and the quantization results are transmitted in the bitstream in frame order. A 7-6-6-bit three-stage vector codebook is used for a voiced frame, and a 9-bit single-stage vector codebook is used for an unvoiced frame.
Optionally, quantizing the LSF parameter using a preset codebook quantization method according to the superframe mode comprises: if the superframe comprises three adjacent time frames that are all unvoiced frames, quantizing the LSF parameters of the unvoiced frames in the superframe using a preset single-stage vector codebook. For example, in Mode5, a 9-bit single-stage vector codebook may be used to quantize the LSF parameters of the three frames in turn.
The following describes the quantization method for the pitch period and the voiced/unvoiced type when the transmission rate of the voice signal is 1200 bps.
In one embodiment, determining the quantization method corresponding to the speech parameter according to the speech parameter and the superframe mode, and quantizing the speech parameter, comprises: if the speech parameters are the pitch period and the voiced/unvoiced type, quantizing the pitch period and the voiced/unvoiced type using a preset 5-bit quantization method according to the superframe mode.
In this embodiment, if the transmission rate of the voice signal is 1200 bps and the speech parameters are the pitch period and the voiced/unvoiced type, different quantization methods may be selected according to the different superframe modes in Table 3. Table 5 shows the quantization result for the pitch period and the voiced/unvoiced type. As shown in Table 5, 12 bits may be used to jointly quantize the pitch period and the voiced/unvoiced type, of which 9 bits represent the pitch period and 3 bits represent the voiced/unvoiced type.
Table 5
Optionally, quantizing the pitch period and the voiced/unvoiced type using a preset 5-bit quantization method according to the superframe mode comprises: if the superframe comprises three adjacent time frames that are all unvoiced frames, quantizing the bits corresponding to the pitch period and the voiced/unvoiced type to the second value. For example, for a superframe in UUU mode, the pitch period and the voiced/unvoiced type are directly quantized to 12 zero bits.
Optionally, quantizing the pitch period and the voiced/unvoiced type using a preset 5-bit quantization method according to the superframe mode comprises: if the superframe comprises three adjacent time frames of which one is a voiced frame, quantizing the bits corresponding to the voiced/unvoiced type to the second value, applying logarithmic conversion to the pitch period of the voiced frame in the superframe, and determining the target quantized value according to the conversion result and the voiced/unvoiced type.
Further, determining the target quantized value according to the conversion result and the voiced/unvoiced type comprises: uniformly quantizing the conversion result to obtain a uniform quantization coefficient; and determining the target quantized value according to the correspondence between the uniform quantization coefficient, the voiced/unvoiced type and a preset codebook index.
In this embodiment, the three superframe modes UUV, UVU and VUU each contain only 1 voiced frame, and the 3-bit voiced/unvoiced type indicator is quantized to 000. For the pitch period, the pitch period of the voiced frame is first converted to its base-10 logarithm and uniformly quantized with 99 levels over the range lg(20) to lg(144), giving an index between 1 and 99. The uniform quantization coefficient is then combined with the voiced/unvoiced type and mapped to an index of a 512-entry mapping codebook, which is the final target quantized value.
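The logarithmic uniform quantization step may be sketched in Python as follows. The range lg(20) to lg(144), the 99 levels and the index range 1 to 99 come from the text; placing the first and last levels exactly on the range endpoints, clamping out-of-range pitch values, and the names `quantize_log_pitch` and `dequantize_log_pitch` are assumptions of this sketch. The subsequent mapping into the 512-entry codebook is not modeled because its table is not reproduced here.

```python
import math

LOG_MIN = math.log10(20.0)    # lower end of the quantization range
LOG_MAX = math.log10(144.0)   # upper end of the quantization range
LEVELS = 99                   # number of uniform levels
STEP = (LOG_MAX - LOG_MIN) / (LEVELS - 1)  # ASSUMED: levels span both endpoints


def quantize_log_pitch(pitch):
    """Map a pitch period (in samples) to a uniform log-domain index 1..99."""
    pitch = min(max(pitch, 20.0), 144.0)  # clamp to the stated range (assumption)
    return 1 + int(round((math.log10(pitch) - LOG_MIN) / STEP))


def dequantize_log_pitch(index):
    """Inverse mapping from an index 1..99 back to a pitch period."""
    if not 1 <= index <= LEVELS:
        raise ValueError("index must be between 1 and 99")
    return 10.0 ** (LOG_MIN + (index - 1) * STEP)
```

Quantizing in the log domain keeps the relative error roughly constant across short and long pitch periods, which is a common motivation for this kind of design.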
Optionally, quantizing the pitch period and the voiced/unvoiced type using a preset 5-bit quantization method according to the superframe mode comprises: if the superframe comprises three adjacent time frames of which two are voiced frames, quantizing the bit corresponding to the voiced/unvoiced type of each voiced frame to the second value and the bit corresponding to the voiced/unvoiced type of the unvoiced frame to the first value; and quantizing the pitch periods of the three adjacent time frames using a preset vector codebook.
In this embodiment, the three superframe modes VVU, VUV and UVV each contain two voiced frames, and the 3-bit voiced/unvoiced type indicator is quantized to 001, 010 and 100 respectively. The pitch periods of the three frames are combined in order into a three-dimensional vector, which is then quantized using a 512-entry codebook.
Optionally, quantizing the pitch period and the voiced/unvoiced type using a preset 5-bit quantization method according to the superframe mode comprises: if the superframe comprises three adjacent time frames and the three adjacent time frames are voiced frames, obtaining an N-bit codebook index from a preset vector codebook; quantizing the voiced/unvoiced type according to the value of the first M bits of the N-bit codebook index, and determining the low N-M bits of the N-bit codebook index to be the quantized value of the pitch period, where N and M are positive integers and N is greater than M.
In this embodiment, the VVV superframe type is a special case. A 12-order codebook search is performed first, yielding an 11-bit codebook index. Then, according to the value of the first 2 bits of the codebook index, from 00 to 11, the voiced/unvoiced type indicator is mapped to 011, 101, 110 and 111 respectively, and the low 9 bits of the codebook index are taken as the quantized value of the pitch period.
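The 3-bit voiced/unvoiced indicator values described for the different superframe patterns, and the split of the 11-bit index in the VVV case (N = 11, M = 2), may be sketched in Python as below. The bit values are taken from the text; the dictionary and function names are hypothetical, and the codebook search that produces the 11-bit index is not modeled.

```python
# Indicator values for patterns with two voiced frames, as listed in the text.
TWO_VOICED_CODE = {"VVU": 0b001, "VUV": 0b010, "UVV": 0b100}
# VVV case: value of the top 2 index bits -> 3-bit indicator.
VVV_CODE = {0b00: 0b011, 0b01: 0b101, 0b10: 0b110, 0b11: 0b111}


def split_vvv_index(index11):
    """Split an 11-bit codebook index into (3-bit indicator, 9-bit pitch value)."""
    if not 0 <= index11 < (1 << 11):
        raise ValueError("index must fit in 11 bits")
    top2 = index11 >> 9        # first M = 2 bits
    low9 = index11 & 0x1FF     # low N - M = 9 bits
    return VVV_CODE[top2], low9


def voicing_indicator(pattern, index11=None):
    """Return the 3-bit indicator for a pattern such as 'UVU' or 'VVV'."""
    voiced = pattern.count("V")
    if voiced <= 1:
        return 0b000           # UUU and the one-voiced-frame patterns
    if voiced == 2:
        return TWO_VOICED_CODE[pattern]
    if index11 is None:
        raise ValueError("the VVV pattern needs the 11-bit codebook index")
    return split_vvv_index(index11)[0]
```

For instance, an index whose top two bits are 10 and whose low nine bits encode 5 would give the indicator 110 and the pitch value 5 under this sketch.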
The following describes the quantization method for the subband voicing decisions when the transmission rate of the voice signal is 1200 bps.
In this embodiment, Table 6 shows the bit allocation for each frame pattern in voicing-decision quantization. As shown in Table 6, the number of bits used to quantize the subband voicing decisions depends on the number of voiced frames in the superframe. For example, a superframe in VVV mode has 3 voiced frames, each of which is allocated 2 bits, for 6 bits in total. The bit counts for the other superframe modes are calculated in the same way.
Table 6
Frame pattern | VVV | VVU, VUV, UVV | VUU, UVU, UUV | UUU
Allocated bits | 6 | 4 | 2 | 0
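The allocation in Table 6 follows directly from the rule of 2 bits per voiced frame, as the following Python sketch illustrates. The function name `bandpass_voicing_bits` is an assumption introduced for illustration.

```python
def bandpass_voicing_bits(pattern, bits_per_voiced_frame=2):
    """Bits allocated to subband voicing decisions for a superframe pattern
    such as 'VVU': 2 bits for every voiced (V) frame, none for unvoiced."""
    if len(pattern) != 3 or set(pattern) - {"U", "V"}:
        raise ValueError("pattern must be three characters of U or V")
    return bits_per_voiced_frame * pattern.count("V")
```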
Each voiced frame has 5 subbands, and the voicing decision of the lowest-frequency subband can be obtained by the decoder from the quantized value of the pitch period. Fully representing the voicing decisions of the 4 higher-frequency subbands takes 4 bits, that is, 16 types, which must be mapped to 2 bits, that is, 4 types, during quantization. Table 7 gives the mapping rules for voicing-decision quantization. The 4 voicing-decision bits in Table 7 indicate, in order, the voiced/unvoiced type of the 2nd to the 5th subband, where 1 denotes a voiced subband and 0 denotes an unvoiced subband. After mapping, only four U/V decision types remain, and these are represented with 2 bits during quantization.
Table 7
In one embodiment, for gain quantization when the transmission rate of the voice signal is 1200 bps, each frame in the superframe requires two gains, G1 and G2, to be quantized, and the method is the same as the gain quantization method of the 2400 bps mode. The G1 and G2 values of the 3 frames are arranged in frame order into a 6-dimensional gain vector, which is then quantized using a 10-bit single-stage vector quantization codebook.
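For illustration, assembling the 6-dimensional gain vector and searching a single-stage codebook may be sketched in Python as follows. The ordering [G1, G2] within each frame, the unweighted squared Euclidean distance and the names `build_gain_vector` and `nearest_codeword` are assumptions; the text states only that the gains are arranged in frame order and quantized with a 10-bit codebook.

```python
def build_gain_vector(frames):
    """frames: three (G1, G2) pairs in frame order -> 6-dimensional vector."""
    if len(frames) != 3 or any(len(pair) != 2 for pair in frames):
        raise ValueError("expected three (G1, G2) pairs")
    return [gain for pair in frames for gain in pair]


def nearest_codeword(vector, codebook):
    """Index of the codebook entry closest to `vector` (ASSUMED unweighted
    squared Euclidean distance; a 10-bit codebook would hold 1024 entries)."""
    def distance(k):
        return sum((x - c) ** 2 for x, c in zip(vector, codebook[k]))
    return min(range(len(codebook)), key=distance)
```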
The following describes the quantization method for the Fourier magnitudes when the transmission rate of the voice signal is 1200 bps.
In one embodiment, if the superframe is in UUU mode, that is, it contains no voiced frame, the Fourier magnitudes are neither quantized nor transmitted. For a superframe that contains voiced frames, the Fourier magnitudes of one voiced frame are selected and quantized, using the same method and codebook as the 2400 bps mode; the Fourier magnitudes of the other voiced frames are derived from the quantized value at decoding. For example, given the analysis result of the normalized Fourier magnitudes of the i-th frame in the current superframe, the quantized normalized Fourier magnitudes of the i-th frame in the preceding superframe, and the quantization result of the last voiced frame in the previous superframe, and with Q[·] denoting quantization by the same method as at 2400 bps, the Fourier magnitude quantization and derivation rules for the current superframe are as shown in Table 8.
Table 8
The following describes the quantization method for the aperiodic flag when the transmission rate of the voice signal is 1200 bps.
In one embodiment, the aperiodic flags of the 3 frames in a superframe are quantized with 1 bit. First, the original aperiodic flag of each of the three frames is obtained from the voicing strength Vbp_1 of the lowest-frequency subband: if Vbp_1 < 0.5, the original aperiodic flag of the frame is quantized to 1, otherwise to 0. Then the final 1-bit quantized value is obtained from the superframe type and the original aperiodic flags; the quantization rules are shown in Table 9.
Table 9
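The first step of the aperiodic flag quantization, deriving the original per-frame flags from the lowest-band voicing strength, may be sketched in Python as follows. The function name `raw_aperiodic_flags` is hypothetical; the second step, which reduces the three flags to 1 bit according to Table 9, is not modeled because the table contents are not reproduced in this text.

```python
def raw_aperiodic_flags(vbp1_per_frame, threshold=0.5):
    """Original aperiodic flag for each frame: 1 if the lowest-band voicing
    strength is below the threshold (0.5 in the text), otherwise 0."""
    return [1 if v < threshold else 0 for v in vbp1_per_frame]
```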
Correspondingly, at decoding, the aperiodic flags of the 3 frames are derived from the 1-bit quantized value; the decoding rules are shown in Table 10.
Table 10
It should be understood that although the steps in the flowcharts of Figs. 1-2 are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, there is no strict restriction on the order in which these steps are executed, and they may be executed in other orders. Moreover, at least some of the steps in Figs. 1-2 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different times, and they are not necessarily executed sequentially but may be executed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in Fig. 3, a speech parameter quantization device is provided, comprising an acquisition module 11 and a determination module 12, wherein:
the acquisition module 11 is configured to obtain the speech parameters of a voice signal using a preset parameter analysis method; and
the determination module 12 is configured to determine, according to the transmission rate of the voice signal and the speech parameter, the quantization method corresponding to the speech parameter, and to quantize the speech parameter.
In one embodiment, the determination module 12 is specifically configured to: if the transmission rate of the voice signal is 2400 bps and the speech parameter is the pitch period, quantize the pitch period according to the periodicity attribute of the voice signal or the voiced/unvoiced type of the current frame carrying the voice signal.
In one embodiment, as shown in Fig. 4, the determination module 12 comprises:
a first quantization submodule 121, configured to, if the current frame is a voiced frame, apply logarithmic conversion to the pitch period of the voice signal in the current frame and uniformly quantize the conversion result using a preset number of levels; and
a second quantization submodule 122, configured to, if the current frame is an unvoiced frame or the periodicity attribute of the voice signal is aperiodic, apply bit quantization to the pitch period of the voice signal.
In one embodiment, the second quantization submodule 122 is specifically configured to: if the current frame is an unvoiced frame, quantize the bits corresponding to the pitch period of the voice signal to the first value; and if the periodicity attribute of the voice signal is aperiodic, quantize the bits corresponding to the pitch period of the voice signal to the second value.
In one embodiment, the determination module 12 is specifically configured to: if the transmission rate of the voice signal is 2400 bps and the speech parameter is a line spectrum pair (LSF) parameter, quantize the LSF parameter using a preset three-stage vector codebook.
In one embodiment, as shown in Fig. 5, the determination module 12 comprises:
a first determination submodule 123, configured to, if the transmission rate of the voice signal is 1200 bps, determine the superframe mode of the voice signal according to the voiced/unvoiced types of the time frames carrying the voice signal; and
a second determination submodule 124, configured to determine, according to the speech parameter and the superframe mode, the quantization method corresponding to the speech parameter, and to quantize the speech parameter.
In one embodiment, the second determination submodule 124 is specifically configured to: if the speech parameter is an LSF parameter, quantize the LSF parameter using a preset codebook quantization method according to the superframe mode.
In one embodiment, the superframe comprises three adjacent time frames of which at least two are voiced frames, and the second determination submodule 124 quantizing the LSF parameter using a preset codebook quantization method according to the superframe mode comprises: the second determination submodule 124 quantizes the LSF parameters of the third frame of the current superframe using a preset three-stage quantization codebook, and quantizes the LSF parameters of the first frame and the second frame according to the quantized LSF parameters of the third frame of the previous superframe and the quantized LSF parameters of the third frame of the current superframe, where the third frame is the temporally last time frame in the superframe.
In one embodiment, the second determination submodule 124 quantizing the LSF parameters of the first frame and the second frame according to the quantized LSF parameters of the third frame of the previous superframe and the quantized LSF parameters of the third frame of the current superframe comprises:
the second determination submodule 124 determines, according to the quantized LSF parameters of the third frame of the previous superframe and the quantized LSF parameters of the third frame of the current superframe, the quantized LSF parameters of the first frame and of the second frame corresponding to each prediction coefficient in the prediction coefficient codebook; determines the target prediction coefficient according to the LSF parameters of the first frame, the LSF parameters of the second frame, the quantized LSF parameters of the first frame and the quantized LSF parameters of the second frame; and determines the residual vector according to the target prediction coefficient and quantizes the residual vector using a preset two-stage vector codebook.
In one embodiment, the second determination submodule 124 determining the target prediction coefficient according to the LSF parameters of the first frame, the LSF parameters of the second frame, the quantized LSF parameters of the first frame and the quantized LSF parameters of the second frame comprises:
the second determination submodule 124 determines, from these four quantities, the prediction error corresponding to each prediction coefficient in the prediction coefficient codebook, and determines the prediction coefficient with the smallest prediction error to be the target prediction coefficient.
In one embodiment, the superframe comprises three adjacent time frames of which one is a voiced frame, and the second determination submodule 124 quantizing the LSF parameter using a preset codebook quantization method according to the superframe mode comprises:
the second determination submodule 124 quantizes the LSF parameters of the voiced frame in the superframe using a preset three-stage vector codebook, and quantizes the LSF parameters of the unvoiced frames in the superframe using a preset single-stage vector codebook.
In one embodiment, the superframe comprises three adjacent time frames that are all unvoiced frames, and the second determination submodule 124 quantizing the LSF parameter using a preset codebook quantization method according to the superframe mode comprises: the second determination submodule 124 quantizes the LSF parameters of the unvoiced frames in the superframe using a preset single-stage vector codebook.
In one embodiment, the second determination submodule 124 is specifically configured to: if the speech parameters are the pitch period and the voiced/unvoiced type, quantize the pitch period and the voiced/unvoiced type using a preset 5-bit quantization method according to the superframe mode.
In one embodiment, if the superframe comprises three adjacent time frames that are all unvoiced frames, the second determination submodule 124 quantizing the pitch period and the voiced/unvoiced type using a preset 5-bit quantization method according to the superframe mode comprises: the second determination submodule 124 quantizes the bits corresponding to the pitch period and the voiced/unvoiced type to the second value.
In one embodiment, if the superframe comprises three adjacent time frames of which one is a voiced frame, the second determination submodule 124 quantizing the pitch period and the voiced/unvoiced type using a preset bit quantization method according to the superframe mode comprises: the second determination submodule 124 quantizes the bits corresponding to the voiced/unvoiced type to the second value, applies logarithmic conversion to the pitch period of the voiced frame in the superframe, and determines the target quantized value according to the conversion result and the voiced/unvoiced type.
In one embodiment, the second determination submodule 124 determining the target quantized value according to the conversion result and the voiced/unvoiced type comprises: the second determination submodule 124 uniformly quantizes the conversion result to obtain a uniform quantization coefficient, and determines the target quantized value according to the correspondence between the uniform quantization coefficient, the voiced/unvoiced type and a preset codebook index.
In one embodiment, if the superframe comprises three adjacent time frames of which two are voiced frames, the second determination submodule 124 quantizing the pitch period and the voiced/unvoiced type using a preset bit quantization method according to the superframe mode comprises: the second determination submodule 124 quantizes the bit corresponding to the voiced/unvoiced type of each voiced frame to the second value and the bit corresponding to the voiced/unvoiced type of the unvoiced frame to the first value, and quantizes the pitch periods of the three adjacent time frames using a preset vector codebook.
In one embodiment, if the superframe comprises three adjacent time frames and the three adjacent time frames are voiced frames, the second determination submodule 124 quantizing the pitch period and the voiced/unvoiced type using a preset bit quantization method according to the superframe mode comprises: the second determination submodule 124 obtains an N-bit codebook index from a preset vector codebook, quantizes the voiced/unvoiced type according to the value of the first M bits of the N-bit codebook index, and determines the low N-M bits of the N-bit codebook index to be the quantized value of the pitch period, where N and M are positive integers and N is greater than M.
For the specific limitations of the speech parameter quantization device, refer to the limitations of the speech parameter quantization method above, which are not repeated here. Each module in the above speech parameter quantization device may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or be independent of, a processor in the computer equipment in hardware form, or may be stored in a memory of the computer equipment in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer equipment is provided. The computer equipment may be a server, and its internal structure may be as shown in Fig. 6. The computer equipment includes a processor, a memory, a network interface and a database connected by a system bus. The processor of the computer equipment provides computing and control capabilities. The memory of the computer equipment includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program and a database. The internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The database of the computer equipment is used to store speech parameter data. The network interface of the computer equipment is used to communicate with an external terminal over a network connection. When executed by the processor, the computer program implements a speech parameter quantization method.
Those skilled in the art will understand that the structure shown in Fig. 6 is merely a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer equipment to which the solution is applied. A specific computer equipment may include more or fewer components than shown in the figure, may combine certain components, or may have a different arrangement of components.
In one embodiment, a computer equipment is provided, including a memory and a processor. The memory stores a computer program, and the processor performs the following steps when executing the computer program:
obtaining the speech parameters of a voice signal using a preset parameter analysis method;
determining, according to the transmission rate of the voice signal and the speech parameter, a quantization method corresponding to the speech parameter, and quantizing the speech parameter with it.
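The selection step above can be pictured as a lookup keyed on the transmission rate and the parameter type. The following is a minimal sketch, not the applicant's implementation: the function name `select_quantizer`, the parameter labels, and the descriptive strings are hypothetical, and only the rate/parameter pairings themselves are taken from the claims.

```python
# Hypothetical lookup table: (transmission rate in bps, parameter label) -> scheme.
# The labels and strings are illustrative; the pairings follow the claims.
QUANTIZER_TABLE = {
    (2400, "pitch"): "log-domain uniform quantization or fixed bit pattern",
    (2400, "lsf"): "three-stage vector codebook",
    (1200, "lsf"): "superframe-mode-dependent codebook quantization",
    (1200, "pitch_voicing"): "superframe-mode-dependent bit quantization",
}


def select_quantizer(rate_bps: int, parameter: str) -> str:
    """Return a description of the quantization scheme for a rate/parameter pair."""
    try:
        return QUANTIZER_TABLE[(rate_bps, parameter)]
    except KeyError:
        raise ValueError(
            f"no quantization scheme defined for {parameter!r} at {rate_bps} bps"
        ) from None
```

For example, `select_quantizer(2400, "lsf")` returns the three-stage codebook entry, while an unlisted pair raises `ValueError` instead of silently falling back to a default.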
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program performs the following steps:
obtaining the speech parameters of a voice signal using a preset parameter analysis method;
determining, according to the transmission rate of the voice signal and the speech parameter, a quantization method corresponding to the speech parameter, and quantizing the speech parameter with it.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination contains no contradiction, it should be considered within the scope of this specification.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not therefore be construed as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of the present application, and these all fall within the scope of protection of the present application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (21)

1. A speech parameter quantization method, the method comprising:
obtaining the speech parameters of a voice signal using a preset parameter analysis method;
determining, according to the transmission rate of the voice signal and the speech parameter, a quantization method corresponding to the speech parameter, and quantizing the speech parameter with it.
2. The method according to claim 1, wherein determining, according to the transmission rate of the voice signal and the speech parameter, a quantization method corresponding to the speech parameter and quantizing the speech parameter comprises:
if the transmission rate of the voice signal is 2400 bps and the speech parameter is the pitch period, quantizing the pitch period according to the periodicity attribute of the voice signal or the voicing type (voiced or unvoiced) of the current frame carrying the voice signal.
3. The method according to claim 2, wherein quantizing the pitch period according to the periodicity attribute of the voice signal or the voicing type of the current frame carrying the voice signal comprises:
if the current frame is a voiced frame, applying a logarithmic transform to the pitch period of the voice signal in the current frame, and uniformly quantizing the transform result with a preset number of quantization levels;
if the current frame is an unvoiced frame, or the periodicity attribute of the voice signal is aperiodic, applying bit quantization to the pitch period of the voice signal.
4. The method according to claim 3, wherein, if the current frame is an unvoiced frame or the periodicity attribute of the voice signal is aperiodic, applying bit quantization to the pitch period of the voice signal comprises:
if the current frame is an unvoiced frame, quantizing the bits corresponding to the pitch period of the voice signal to a first value;
if the periodicity attribute of the voice signal is aperiodic, quantizing all the bits corresponding to the pitch period of the voice signal to a second value.
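Illustrative sketch (not part of the claims): the pitch quantization of claims 2 to 4 can be pictured as a log-domain uniform quantizer for voiced frames plus two reserved codes. The claims give no numeric values, so everything numeric here is an assumption: a pitch range of 20 to 160 samples, 99 levels in a 7-bit field, an all-zeros code as the "first value", and an all-ones code as the "second value". The claims also do not say which rule wins when a frame is both unvoiced and aperiodic; the sketch checks the unvoiced case first.

```python
import math

# All constants below are assumptions for illustration; the claims name no values.
PITCH_MIN = 20.0            # assumed shortest pitch period, in samples
PITCH_MAX = 160.0           # assumed longest pitch period, in samples
LEVELS = 99                 # assumed "preset number of quantization levels"
UNVOICED_CODE = 0b0000000   # assumed "first value" (all pitch bits zero)
APERIODIC_CODE = 0b1111111  # assumed "second value" (all pitch bits one)

_LOG_MIN = math.log10(PITCH_MIN)
_STEP = (math.log10(PITCH_MAX) - _LOG_MIN) / (LEVELS - 1)


def quantize_pitch(pitch: float, voiced: bool, aperiodic: bool = False) -> int:
    """Map a pitch period to a 7-bit code: reserved patterns or a log-uniform level."""
    if not voiced:
        return UNVOICED_CODE
    if aperiodic:
        return APERIODIC_CODE
    clipped = min(max(pitch, PITCH_MIN), PITCH_MAX)
    level = round((math.log10(clipped) - _LOG_MIN) / _STEP)
    return level + 1  # codes 1..99, so the two reserved patterns stay distinct


def dequantize_pitch(code: int) -> float:
    """Invert quantize_pitch for the voiced, periodic codes 1..LEVELS."""
    if not 1 <= code <= LEVELS:
        raise ValueError("code does not carry a pitch value")
    return 10.0 ** (_LOG_MIN + (code - 1) * _STEP)
```

Quantizing in the log domain keeps the relative error roughly constant across the pitch range; with the assumed constants, the reconstructed pitch stays within about 1% of the input.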
5. The method according to claim 1, wherein determining, according to the transmission rate of the voice signal and the speech parameter, a quantization method corresponding to the speech parameter and quantizing the speech parameter comprises:
if the transmission rate of the voice signal is 2400 bps and the speech parameter is a line spectral frequency (LSF) parameter, quantizing the LSF parameter using a preset three-stage vector codebook.
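Illustrative sketch (not part of the claims): a multi-stage vector codebook quantizes a vector in successive refinements, each stage coding the residual left by the previous one. The code below is a minimal greedy version under stated assumptions: the function names are hypothetical, the codebooks are supplied by the caller, and each stage keeps only its single nearest codeword, whereas a production coder may keep several candidates per stage.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Codebook = Sequence[Vector]


def nearest(codebook: Codebook, target: Vector) -> int:
    """Index of the codeword with the smallest squared Euclidean distance."""
    best_index, best_dist = 0, float("inf")
    for i, codeword in enumerate(codebook):
        dist = sum((t - c) ** 2 for t, c in zip(target, codeword))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


def multistage_vq(x: Vector, stages: Sequence[Codebook]) -> Tuple[List[int], List[float]]:
    """Greedy multi-stage VQ: returns one index per stage and the final residual."""
    residual = list(x)
    indices = []
    for codebook in stages:
        i = nearest(codebook, residual)
        indices.append(i)
        residual = [r - c for r, c in zip(residual, codebook[i])]
    return indices, residual


def reconstruct(indices: Sequence[int], stages: Sequence[Codebook]) -> List[float]:
    """Decoder side: sum the selected codeword from every stage."""
    total = [0.0] * len(stages[0][0])
    for i, codebook in zip(indices, stages):
        total = [t + c for t, c in zip(total, codebook[i])]
    return total
```

Passing three codebooks as `stages` gives the three-stage case; the transmitted data are the three indices, and the decoder only needs `reconstruct`.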
6. The method according to claim 1, wherein determining, according to the transmission rate of the voice signal and the speech parameter, a quantization method corresponding to the speech parameter and quantizing the speech parameter comprises:
if the transmission rate of the voice signal is 1200 bps, determining the superframe mode of the voice signal according to the voicing types of the time frames carrying the voice signal;
determining, according to the speech parameter and the superframe mode, a quantization method corresponding to the speech parameter, and quantizing the speech parameter with it.
7. The method according to claim 6, wherein determining, according to the speech parameter and the superframe mode, a quantization method corresponding to the speech parameter and quantizing the speech parameter comprises:
if the speech parameter is an LSF parameter, quantizing the LSF parameter using a preset codebook quantization method according to the superframe mode.
8. The method according to claim 7, wherein quantizing the LSF parameter using a preset codebook quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames and the three adjacent time frames include at least two voiced frames, quantizing the LSF parameter of the third frame of the current superframe using a preset three-stage quantization codebook, and quantizing the LSF parameters of the first frame and the second frame according to the quantized LSF value of the third frame of the previous superframe and the quantized LSF value of the third frame of the current superframe, wherein the third frame is the time frame that comes last in time order in the superframe mode.
9. The method according to claim 8, wherein quantizing the LSF parameters of the first frame and the second frame according to the quantized LSF value of the third frame of the previous superframe and the quantized LSF value of the third frame of the current superframe comprises:
determining, according to the quantized LSF value of the third frame of the previous superframe and the quantized LSF value of the third frame of the current superframe, the quantized LSF value of the first frame and the quantized LSF value of the second frame corresponding to each prediction coefficient in a prediction coefficient codebook;
determining a target prediction coefficient according to the LSF parameter of the first frame, the LSF parameter of the second frame, the quantized LSF value of the first frame, and the quantized LSF value of the second frame;
determining a residual vector according to the target prediction coefficient, and quantizing the residual vector using a preset two-stage vector codebook.
10. The method according to claim 9, wherein determining the target prediction coefficient according to the LSF parameter of the first frame, the LSF parameter of the second frame, the quantized LSF value of the first frame, and the quantized LSF value of the second frame comprises:
determining, according to the LSF parameter of the first frame, the LSF parameter of the second frame, the quantized LSF value of the first frame, and the quantized LSF value of the second frame, the prediction error corresponding to each prediction coefficient in the prediction coefficient codebook;
determining the prediction coefficient with the smallest prediction error as the target prediction coefficient.
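Illustrative sketch (not part of the claims): claims 9 and 10 describe an exhaustive search over a prediction coefficient codebook followed by residual coding. The claims do not state the prediction formula, so the sketch assumes the simplest form consistent with them: each codebook entry is a pair of scalar weights `(w1, w2)`, the predicted LSF vector of frame k is `w_k * prev_q + (1 - w_k) * cur_q`, and the prediction error is the summed squared difference over both frames. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def interpolate(prev_q: Vector, cur_q: Vector, w: float) -> List[float]:
    """Assumed predictor: weighted mix of the two quantized third-frame vectors."""
    return [w * p + (1.0 - w) * c for p, c in zip(prev_q, cur_q)]


def choose_prediction(
    lsf1: Vector,
    lsf2: Vector,
    prev_q: Vector,
    cur_q: Vector,
    coeff_codebook: Sequence[Tuple[float, float]],
) -> Tuple[int, List[float]]:
    """Return the index of the minimum-error coefficient pair and the residual vector."""
    if not coeff_codebook:
        raise ValueError("prediction coefficient codebook is empty")
    best_index, best_err, best_residual = -1, float("inf"), []
    for index, (w1, w2) in enumerate(coeff_codebook):
        pred1 = interpolate(prev_q, cur_q, w1)
        pred2 = interpolate(prev_q, cur_q, w2)
        # Residual for frames 1 and 2, concatenated into one vector.
        residual = [a - b for a, b in zip(lsf1, pred1)]
        residual += [a - b for a, b in zip(lsf2, pred2)]
        err = sum(r * r for r in residual)
        if err < best_err:
            best_index, best_err, best_residual = index, err, residual
    return best_index, best_residual
```

The returned residual is what a two-stage vector codebook would then quantize; the decoder rebuilds frames 1 and 2 from the coefficient index, the two third-frame values it already holds, and the decoded residual.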
11. The method according to claim 7, wherein quantizing the LSF parameter using a preset codebook quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames and the three adjacent time frames include one voiced frame, quantizing the LSF parameter of the voiced frame in the superframe mode using a preset three-stage vector codebook, and quantizing the LSF parameters of the unvoiced frames in the superframe mode using a preset one-stage vector codebook.
12. The method according to claim 7, wherein quantizing the LSF parameter using a preset codebook quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames and all three adjacent time frames are unvoiced frames, quantizing the LSF parameters of the unvoiced frames in the superframe mode using a preset one-stage vector codebook.
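Illustrative sketch (not part of the claims): reading claims 8, 11 and 12 as the cases of at least two voiced frames, exactly one voiced frame, and no voiced frame in a three-frame superframe, the per-frame choice of LSF scheme can be written as a small function. The function name and the scheme labels are hypothetical.

```python
from typing import List, Sequence


def lsf_plan(voicing: Sequence[bool]) -> List[str]:
    """Per-frame LSF scheme for a three-frame superframe (True means voiced)."""
    if len(voicing) != 3:
        raise ValueError("a superframe holds exactly three frames")
    voiced_count = sum(bool(v) for v in voicing)
    if voiced_count >= 2:
        # Claims 8-9: frame 3 is coded directly; frames 1 and 2 are predicted
        # from the third frames of the previous and current superframes.
        return ["predicted + two-stage residual",
                "predicted + two-stage residual",
                "three-stage"]
    if voiced_count == 1:
        # Claim 11: the voiced frame gets the finer codebook.
        return ["three-stage" if v else "one-stage" for v in voicing]
    # Claim 12: all frames unvoiced.
    return ["one-stage"] * 3
```

For instance, `lsf_plan((False, True, False))` assigns the three-stage codebook to the middle frame only, which reflects the claims spending more codebook stages on voiced frames than on unvoiced ones.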
13. The method according to claim 6, wherein determining, according to the speech parameter and the superframe mode, a quantization method corresponding to the speech parameter and quantizing the speech parameter comprises:
if the speech parameters are the pitch period and the voicing type, quantizing the pitch period and the voicing type using a preset bit quantization method according to the superframe mode.
14. The method according to claim 13, wherein quantizing the pitch period and the voicing type using a preset bit quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames and all three adjacent time frames are unvoiced frames, quantizing the bits corresponding to the pitch period and the voicing type to a second value.
15. The method according to claim 13, wherein quantizing the pitch period and the voicing type using a preset bit quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames and the three adjacent time frames include one unvoiced frame, quantizing the bits corresponding to the voicing type to a second value, applying a logarithmic transform to the pitch period of the voiced frame in the superframe mode, and determining a target quantization value according to the transform result and the voicing type.
16. The method according to claim 15, wherein determining the target quantization value according to the transform result and the voicing type comprises:
uniformly quantizing the transform result to obtain a uniform quantization coefficient;
determining the target quantization value according to the correspondence among the uniform quantization coefficient, the voicing type, and preset codebook indices.
17. The method according to claim 13, wherein quantizing the pitch period and the voicing type using a preset bit quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames and the three adjacent time frames include two voiced frames, quantizing the bits corresponding to the voicing type of the voiced frames to a second value and the bits corresponding to the voicing type of the unvoiced frame to a first value, and quantizing the pitch periods of the three adjacent time frames using a preset vector codebook.
18. The method according to claim 13, wherein quantizing the pitch period and the voicing type using a preset bit quantization method according to the superframe mode comprises:
if the superframe mode comprises three adjacent time frames and the three adjacent time frames include one voiced frame, obtaining an N-bit codebook index according to a preset vector codebook; quantizing the voicing type according to the value of the first M bits of the N-bit codebook index, and taking the low N−M bits of the N-bit codebook index as the quantized value of the pitch period, wherein N and M are positive integers and N is greater than M.
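Illustrative sketch (not part of the claims): the index layout in claim 18 is a plain bit-field split, with the high M bits carrying the voicing pattern and the low N−M bits carrying the pitch value. The function name is hypothetical, and the claim does not fix N or M.

```python
from typing import Tuple


def split_index(index: int, n_bits: int, m_bits: int) -> Tuple[int, int]:
    """Split an N-bit codebook index into (high M bits, low N-M bits)."""
    if not 0 < m_bits < n_bits:
        raise ValueError("need positive integers with N greater than M")
    if not 0 <= index < (1 << n_bits):
        raise ValueError("index does not fit in N bits")
    low_width = n_bits - m_bits
    voicing_field = index >> low_width            # first (high) M bits
    pitch_field = index & ((1 << low_width) - 1)  # low N-M bits
    return voicing_field, pitch_field
```

With the purely illustrative choice N = 9 and M = 2, the index `0b101100110` splits into the voicing field `0b10` and the pitch field `0b1100110`.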
19. A speech parameter quantization device, comprising:
an obtaining module, configured to obtain the speech parameters of a voice signal using a preset parameter analysis method;
a determining module, configured to determine, according to the transmission rate of the voice signal and the speech parameter, a quantization method corresponding to the speech parameter, and to quantize the speech parameter with it.
20. A computer equipment, comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method according to any one of claims 1 to 18 when executing the computer program.
21. A computer-readable storage medium on which a computer program is stored, wherein the computer program implements the steps of the method according to any one of claims 1 to 18 when executed by a processor.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811109230.6A CN109256143A (en) 2018-09-21 2018-09-21 Speech parameter quantization method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN109256143A true CN109256143A (en) 2019-01-22

Family

ID=65047672

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811109230.6A Pending CN109256143A (en) 2018-09-21 2018-09-21 Speech parameter quantization method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109256143A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6094629A (en) * 1998-07-13 2000-07-25 Lockheed Martin Corp. Speech coding system and method including spectral quantizer
CN1975861A (en) * 2006-12-15 2007-06-06 清华大学 Vocoder fundamental tone cycle parameter channel error code resisting method
CN101030377A (en) * 2007-04-13 2007-09-05 清华大学 Method for increasing base-sound period parameter quantified precision of 0.6kb/s voice coder
CN102855878A (en) * 2012-09-21 2013-01-02 山东省计算中心 Quantification method of pure and impure pitch parameters of narrow-band voice sub-band
CN103050121A (en) * 2012-12-31 2013-04-17 北京迅光达通信技术有限公司 Linear prediction speech coding method and speech synthesis method
CN103247293A (en) * 2013-05-14 2013-08-14 中国科学院自动化研究所 Coding method and decoding method for voice data
CN103325375A (en) * 2013-06-05 2013-09-25 上海交通大学 Coding and decoding device and method of ultralow-bit-rate speech
CN106098072A (en) * 2016-06-02 2016-11-09 重庆邮电大学 A kind of 600bps very low speed rate encoding and decoding speech method based on MELP
CN106935243A (en) * 2015-12-29 2017-07-07 航天信息股份有限公司 A kind of low bit digital speech vector quantization method and system based on MELP

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190122
