WO2011144130A1 - Method and apparatus for band extension - Google Patents

Method and apparatus for band extension

Info

Publication number
WO2011144130A1
Authority
WO
WIPO (PCT)
Prior art keywords
band
statistical characteristic
frequency band
parameters
parameter
Prior art date
Application number
PCT/CN2011/075079
Other languages
English (en)
French (fr)
Inventor
凯瑟·本特
斯卡弗·马格纳斯
瓦里·皮特
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2011144130A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038: Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques

Definitions

  • The present invention relates to the field of communications technologies, and in particular, to a method and apparatus for band extension.
  • WB: Wideband
  • SWB: Super Wideband (also rendered as ultra-wideband)
  • ITU: International Telecommunication Union
  • G.722, G.722.1, G.722.2, and G.729.1: wideband speech codec standards introduced by the ITU
  • 3GPP: Third Generation Partnership Project
  • AMR-WB: Adaptive Multi-Rate Wideband, the 3GPP wideband speech codec standard (that is, ITU's G.722.2)
  • VMR-WB: Variable-Rate Multimode Wideband, introduced by 3GPP2
  • The ITU recently proposed the G.729.1 & G.718 joint ultra-wideband extension, the G.711, WB & G.722 joint ultra-wideband extension, etc.
  • CELP: Code-Excited Linear Prediction
  • MDCT: Modified Discrete Cosine Transform, a transform coding technique
  • TCX: Transform Coded Excitation, a transform coding technique
  • Band extension is widely used in the field of speech/audio coding, which can effectively improve the perceived quality of band-limited speech/music.
  • the band-extended quality enhancement technology used on the terminal is a good application example.
  • Band extension technology is also widely used in embedded variable rate speech coder, especially for audio bandwidth switching that occurs when transmission channel conditions change.
  • Common bandwidth switching mainly includes switching between Narrow band (NB), broadband, ultra-wideband, and Full Band (FB).
  • Band extension methods can be divided into two types: band extension with side information and band extension without side information.
  • Band extension with side information extracts some feature information of the band to be extended at the encoder and sends it to the decoder to guide the corresponding band extension.
  • Band extension without side information is also called blind extension. It does not need to extract information at the encoder; the information of the required extension band is generated artificially by an estimation algorithm from the partial-band information available at the decoder.
  • the method of band extension can also be divided into time domain based extension and frequency domain based extension.
  • the time domain based expansion is usually based on the time domain information of the partial frequency band obtained by the decoding end, and the time domain information of the required extended frequency band is obtained after shaping in the time domain and the frequency domain, thereby realizing the frequency band expansion.
  • the frequency domain-based extension is usually based on the frequency domain information of the partial frequency band obtained by the decoding end, and the frequency domain information is shaped in the frequency domain and the time domain to obtain the frequency domain information of the required extended frequency band, thereby realizing the frequency band expansion.
  • Band extension without side information is generally performed in the time domain; one such method is piecewise linear mapping band extension based on statistical characteristics.
  • The method is implemented in the following steps: 1. Extract the feature vector of the decoded partial frequency band;
  • 2. Classify the signal by comparing the extracted feature vector with the statistical characteristic classification feature vector set obtained by training before band extension. Training here means extracting useful information from a data set according to a certain rule, using that information to divide the data into classes, and representing all data of one class by its corresponding piece of useful information;
  • 3. According to the preset state transition matrix corresponding to the resulting class, obtain the parameter information of the required extended frequency band, thereby realizing the band extension.
  • In implementing the invention, the inventors found that, because the signals are divided into a limited number of classes, only a limited number of extended-band parameter sets can be generated. They cannot adapt to a wide range of signal characteristics, so the transition between frames is not smooth and the listening experience is poor.
  • a technical problem to be solved by embodiments of the present invention is to provide a method and apparatus for band extension for improving auditory experience.
  • the embodiment of the present invention provides the following technical solutions:
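  • a feature vector of the low-band signal is acquired
  • the low-band signal is classified according to the feature vector and a preset statistical characteristic classification feature vector set, to obtain a statistical characteristic classification result
  • estimated high-band parameters are obtained according to the statistical characteristic classification result, the feature vector, and a preset statistical characteristic classification state transition matrix
  • the estimated high-band parameters are adjusted according to the statistical characteristic classification result and a preset post-processing smoothing factor set, to obtain the adjusted high-band parameters
  • the high-band signal is reconstructed based on the adjusted high-band parameters.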
  • a band extending device comprising:
  • a vector obtaining unit configured to acquire a feature vector of the low frequency band signal
  • a classifying unit configured to classify the low-band signal according to the feature vector and a preset statistical characteristic classification feature vector set, to obtain a statistical characteristic classification result
  • an estimation unit configured to obtain estimated high-band parameters according to the statistical characteristic classification result, the feature vector, and a preset statistical characteristic classification state transition matrix
  • an adjusting unit configured to adjust the estimated high-band parameters according to the statistical characteristic classification result and a preset post-processing smoothing factor set, to obtain the adjusted high-band parameters
  • a signal reconstruction unit configured to reconstruct the high frequency band signal according to the adjusted high frequency band parameter.
  • Adaptive post-processing is added on top of the piecewise linear mapping band extension algorithm based on statistical characteristics, and it makes effective use of the classification information obtained in that algorithm.
  • The parameter information of the extended band obtained by the piecewise linear mapping band extension algorithm is adaptively post-processed per class, so that the extended-band parameter information is better targeted, the inter-frame transition is smoother, and the extended signal gives a better listening experience.
  • FIG. 1 is a schematic flowchart of a method for frequency band extension according to Embodiment 1 of the present invention
  • FIG. 2 is a schematic flowchart of a method for frequency band extension according to Embodiment 2 of the present invention
  • FIG. 3 is a schematic flowchart of a method for frequency band extension according to Embodiment 3 of the present invention
  • FIG. 4 is a schematic structural diagram of a device for band extension according to Embodiment 4 of the present invention
  • FIG. 5 is a schematic structural diagram of another device for band extension according to Embodiment 4 of the present invention
  • FIG. 6 is a schematic structural diagram of another apparatus for band extension according to Embodiment 4 of the present invention
  • FIG. 7 is a schematic structural diagram of another apparatus for band extension according to Embodiment 4 of the present invention.
  • the embodiment of the invention provides a method for frequency band expansion, as shown in FIG. 1 , including:
  • the feature vector may include: a time domain envelope and a linear prediction coefficient, wherein the time domain envelope represents an energy level of each subframe signal in the time domain, and the linear prediction coefficient represents a formant position and an amplitude of the signal.
  • The feature vector may further include a frequency-domain envelope of the low-band signal, frequency-domain linear prediction coefficients, and the like; this is not limited in this embodiment of the present invention.
  • Obtaining the estimated high-band parameters according to the statistical characteristic classification result, the feature vector, and the preset statistical characteristic classification state transition matrix may include:
  • querying the preset statistical characteristic classification state transition matrix for the state transition matrix corresponding to the classification result, and obtaining the estimated high-band parameters according to that state transition matrix and the feature vector.
  • The estimated high-band parameters may also be obtained in other ways, for example in either of the following two: (1) according to the statistical characteristic classification result, query the preset statistical characteristic classification state transition matrix for the state transition matrix corresponding to the classification result, and multiply that matrix by the feature vector of the low-band signal to get the estimated high-band parameters; (2) according to the statistical characteristic classification result, query the preset statistical characteristic classification state transition matrix for the state transition mapping index value corresponding to the classification result, and look up the corresponding estimated high-band parameters in a table using that index value.
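Way (1) above, multiplying the class's state transition matrix by the low-band feature vector, can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `estimate_high_band`, the nested-list matrix layout, and the numbers are all assumptions made for the example.

```python
def estimate_high_band(class_index, feature, transition_matrices):
    """Return y = H_i x for the state transition matrix H_i of the given class."""
    matrix = transition_matrices[class_index]
    # Each row of H_i produces one high-band parameter.
    return [sum(h * x for h, x in zip(row, feature)) for row in matrix]


# Illustrative data (assumed): two classes, a 3-dimensional feature vector,
# and 2 high-band parameters per frame.
matrices = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    [[0.5, 0.5, 0.0], [0.0, 0.0, 2.0]],
]
feature = [2.0, 4.0, 1.5]
print(estimate_high_band(1, feature, matrices))  # [3.0, 3.0]
```

Because each class has its own matrix, the overall mapping from feature vector to high-band parameters is linear within a class and changes from class to class, which is what makes it piecewise linear.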
  • The high-band parameters may include a time-domain envelope and a frequency-domain envelope; the time-domain envelope represents the energy level of each subframe signal in the time domain, and the frequency-domain envelope represents the gain of each sub-band signal in the frequency domain.
  • The high-band parameters may also include time-domain linear prediction coefficients of the high-band signal, frequency-domain linear prediction coefficients, and the like; this is not limited in this embodiment of the present invention.
  • When the high-band parameters include a time-domain envelope and a frequency-domain envelope, and the post-processing factor parameters include an intra-frame smoothing factor, adjusting the estimated high-band parameters to obtain the adjusted high-band parameters may be: adjusting the time-domain envelope parameter and the frequency-domain envelope parameter in the estimated high-band parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic classification, to obtain the adjusted high-band parameters.
  • Adjusting the estimated high-band parameters according to the statistical characteristic classification result and the preset post-processing smoothing factor set includes: querying the preset post-processing smoothing factor set for the post-processing factor parameter corresponding to the statistical characteristic classification result; and adjusting the estimated high-band parameters according to that post-processing factor parameter to obtain the adjusted high-band parameters.
  • The estimated high-band parameters may also be adjusted in other ways, for example in any of the following three. In each, the post-processing factor parameter corresponding to the statistical characteristic classification result is first queried in the preset post-processing smoothing factor set, and the estimated high-band parameters are then adaptively adjusted according to it to obtain the adjusted high-band parameters:
  • (1) the estimated high-band parameters are smoothed within a frame and/or between frames;
  • (2) the estimated high-band parameters are attenuated or enhanced;
  • (3) the estimated high-band parameters are smoothed within a frame and/or between frames and also attenuated or enhanced.
  • The post-processing factor parameters may include an intra-frame smoothing factor and an inter-frame smoothing factor.
  • In that case, adjusting the estimated high-band parameters includes: adjusting the time-domain envelope parameter in the estimated high-band parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic classification; initially adjusting the frequency-domain envelope parameter in the estimated high-band parameters according to the inter-frame smoothing factor corresponding to the statistical characteristic classification; and re-adjusting the initially adjusted high-band frequency-domain envelope parameter according to the intra-frame smoothing factor corresponding to the statistical characteristic classification, to obtain the adjusted high-band parameters.
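The three adjustment steps above can be sketched in code. The text does not give the smoothing formulas in recoverable form, so this sketch assumes a first-order recursive smoother: intra-frame smoothing mixes each value with the previously smoothed value in the same frame, and inter-frame smoothing mixes each value with the value at the same index in the previous frame. The names `smooth_intra`, `smooth_inter`, `adjust`, the `state` dictionary, and the factor values are hypothetical.

```python
def smooth_intra(values, factor, start):
    """Assumed intra-frame smoother: mix each value with the previous smoothed one."""
    out, prev = [], start
    for v in values:
        prev = factor * v + (1.0 - factor) * prev
        out.append(prev)
    return out


def smooth_inter(values, factor, previous_frame):
    """Assumed inter-frame smoother: mix each value with the previous frame's value."""
    return [factor * v + (1.0 - factor) * p for v, p in zip(values, previous_frame)]


def adjust(class_index, time_env, freq_env, factors, state):
    """Apply the per-class factors; `state` holds the previous frame's adjusted envelopes."""
    intra, inter = factors[class_index]
    # Step 1: intra-frame smoothing of the time-domain envelope.
    time_post = smooth_intra(time_env, intra, state["time"][-1])
    # Step 2: initial (inter-frame) adjustment of the frequency-domain envelope.
    freq_initial = smooth_inter(freq_env, inter, state["freq"])
    # Step 3: intra-frame re-adjustment across sub-bands.
    freq_post = smooth_intra(freq_initial, intra, freq_initial[0])
    state["time"], state["freq"] = time_post, freq_post
    return time_post, freq_post


# Illustrative per-class (intra, inter) factors; class 0 leaves parameters unchanged.
factors = [(1.0, 1.0), (0.5, 0.5)]
state = {"time": [0.0, 0.0], "freq": [0.0, 0.0]}
print(adjust(1, [4.0, 2.0], [8.0, 4.0], factors, state))  # ([2.0, 2.0], [4.0, 3.0])
```

Keeping one factor pair per class is what makes the post-processing adaptive: a class whose estimates are trusted can use factors near 1 (little smoothing), while a less reliable class can be smoothed more heavily.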
  • The method adds adaptive post-processing on top of the piecewise linear mapping band extension algorithm based on statistical characteristics.
  • It makes effective use of the classification information obtained in the piecewise linear mapping band extension algorithm.
  • The parameter information of the extended band obtained by that algorithm is adaptively post-processed per class, so that the extended-band parameter information is better targeted, the inter-frame transition is smoother, and the extended signal gives a better listening experience.
  • Embodiment 2 includes the following steps:
  • the feature vectors can be combined in various ways, and only need to be able to reflect the characteristics of the low-band signal.
  • the feature vector may contain a time domain envelope and a linear prediction coefficient of the low frequency band signal, and may also include a time domain envelope and a frequency domain envelope of the low frequency band signal.
  • The classification may be implemented by vector quantization: the preset statistical characteristic classification feature vector set is used as a codebook, the codebook is searched for the codeword with the smallest distance from the feature vector, and the index of that codeword in the codebook is the statistical characteristic classification i.
  • The codebook is an array containing the classification feature vectors of the M classes arranged in order; each feature vector is a codeword, so M feature vectors are M codewords.
  • The index of a codeword indicates its position in the codebook, and the index corresponds to the class number.
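The codebook search described above amounts to a nearest-neighbour lookup. The sketch below uses the mean squared error as the distance; the function name `classify` and the three example codewords are assumptions made for illustration.

```python
def classify(feature, codebook):
    """Return the index of the codeword with the smallest mean squared error."""
    def mse(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

    return min(range(len(codebook)), key=lambda j: mse(feature, codebook[j]))


# Illustrative codebook with M = 3 codewords of dimension 2 (assumed values).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.5]]
print(classify([0.9, 1.2], codebook))  # 1
```

The returned index is the class number i, which later selects both the state transition matrix and the post-processing factors.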
  • In the classification step, the distance is computed as the mean square error between the feature vector and each codeword.
  • In the estimation step, $H_i$ denotes the statistical characteristic classification state transition matrix corresponding to the statistical characteristic classification i,
  • and the superscript $T$ denotes transposition (the statistical characteristic classification i corresponds to the i-th vector in the state transition matrix $H$).
  • The statistical characteristic classification feature vector set and the statistical characteristic classification state transition matrix are trained at the same time and keep a one-to-one correspondence.
  • the above high band parameters may be combined in different ways as long as they can reflect the characteristics of the high band signal.
  • the high band parameters may include a time domain envelope and a linear prediction coefficient of the high band signal, and may also include a time domain envelope and a frequency domain envelope of the high band signal. It should be noted that the combination of the high-band parameters and the feature vectors in the above may be inconsistent, and does not affect the implementation of the embodiments of the present invention.
  • the preset post-processing factor set may have different combinations, and may include only an inter-frame smoothing factor, an inter-frame smoothing factor and an intra-frame smoothing factor, and may also include different post-processing factors such as a state hopping factor.
  • Each parameter in the preset post-processing factor set can be classified separately for different statistical characteristics, thereby embodying the adaptive characteristics of the post-processing method and fully utilizing the characteristics of statistical feature classification.
  • How the high-band signal is reconstructed from the obtained high-band parameters depends mainly on what the high-band parameters contain.
  • For example, if the high-band parameters include a time-domain envelope and a frequency-domain envelope, frequency-domain band extension may be used:
  • the low-band frequency-domain spectrum is shaped according to the frequency-domain envelope and transformed into the time domain, and the reconstructed high-band signal is then obtained by shaping according to the time-domain envelope. Time-domain band extension may also be used:
  • the low-band time-domain excitation signal is shaped according to the time-domain envelope, transformed into the frequency domain, shaped according to the frequency-domain envelope, and finally transformed back to the time domain to obtain the reconstructed high-band signal.
  • If the high-band parameters include linear prediction coefficients, the reconstructed high-band signal can be obtained by passing a low-band time-domain excitation signal through a synthesis filter composed of the high-band linear prediction coefficients.
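The synthesis-filter option can be illustrated with a direct-form all-pole filter. The sign convention is an assumption: the sketch takes the filter as 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_K z^-K, and the function name `synthesize` and the sample values are hypothetical.

```python
def synthesize(excitation, lpc):
    """Filter the excitation through 1/A(z), A(z) = 1 + sum_k lpc[k-1] * z**-k."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                # Feedback from previously synthesized samples.
                acc -= a * out[n - k]
        out.append(acc)
    return out


# Impulse response of a one-pole filter with a_1 = -0.5 (assumed values).
print(synthesize([1.0, 0.0, 0.0], [-0.5]))  # [1.0, 0.5, 0.25]
```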
  • The following describes how the preset statistical characteristic classification feature vector set, the preset statistical characteristic classification state transition matrix set, and the preset post-processing factor set are obtained from the statistical characteristics of a large number of signals.
  • The low-band feature vector of each signal and the corresponding high-band parameter vector are extracted from the training set, forming a low-band feature vector training set and a high-band parameter vector training set respectively.
  • The training set is the data set used for training, a pre-selected speech/audio corpus.
  • The low-band feature vector set $x_j\ (j \in \{1, \ldots, M\})$ is obtained by training on the low-band feature vector training set, and the feature vectors are classified according to this statistical characteristic classification feature vector set.
  • Clustering accordingly gives the corresponding high-band parameter vector set $y_j\ (j \in \{1, \ldots, M\})$.
  • The corresponding state transition matrix is then calculated for each class.
  • The state transition matrices corresponding to the statistical characteristic classifications $j \in \{1, \ldots, M\}$ constitute the state transition matrix set $H_j\ (j \in \{1, \ldots, M\})$.
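The formula used to compute each state transition matrix is not recoverable from this text. One common way to fit a linear map from the feature vectors of a class to their high-band parameter vectors is ordinary least squares, sketched below as an assumption rather than as the patent's method. The helper names `solve` and `train_transition_matrix` and the toy data are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: not enough training vectors")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[r][n] / m[r][r] for r in range(n)]


def train_transition_matrix(features, targets):
    """Least-squares H minimizing sum ||y - H x||^2 over one class (assumed criterion)."""
    d = len(features[0])
    # Normal equations: (sum x x^T) h_k = sum y_k x, solved per output row k.
    gram = [[sum(x[p] * x[q] for x in features) for q in range(d)] for p in range(d)]
    rows = []
    for k in range(len(targets[0])):
        rhs = [sum(y[k] * x[p] for x, y in zip(features, targets)) for p in range(d)]
        rows.append(solve(gram, rhs))
    return rows


# Toy class: targets were generated with H = [[2, -1]] (assumed data).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
targets = [[2.0], [-1.0], [1.0]]
H = train_transition_matrix(features, targets)
print([[round(v, 6) for v in row] for row in H])
```

Running this once per class j on the vectors assigned to that class would give one matrix per class, matching the one-to-one correspondence between classes and matrices that the text requires.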
  • The estimated high-band vector y of each signal in the training set is calculated by the piecewise linear mapping band extension method.
  • The average classification error corresponding to the statistical characteristic classification i is then computed,
  • where N is the number of all vectors in the training set that belong to the i-th statistical characteristic classification;
  • the other quantity it depends on is the dimension of the high-band parameter vector.
  • From the average classification error corresponding to i, the reliability factor corresponding to the statistical characteristic classification i can be obtained.
  • The reliability factors of all classes form the reliability factor vector, which is used as the post-processing factor set.
  • The number of groups of reliability factors is not limited in the embodiments of the present invention; multiple groups may be used. For example, if the high-band parameter vector includes time-domain envelope parameters and frequency-domain envelope parameters, the above method can be used to calculate one reliability factor for each of the two for every statistical characteristic classification, and the two together form the reliability factor vector.
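The error formula itself is missing from this text. A plausible reading, given that the text defines N as the number of class vectors and also names the dimension of the high-band parameter vector, is a squared error averaged over both. The sketch below computes that quantity under this assumption; the function name is hypothetical, and the mapping from error to reliability factor is not shown because the text does not preserve it.

```python
def average_classification_error(true_vectors, estimated_vectors):
    """Squared error averaged over the N vectors of one class and their dimension."""
    n = len(true_vectors)
    dim = len(true_vectors[0])
    total = sum(
        (t - e) ** 2
        for tv, ev in zip(true_vectors, estimated_vectors)
        for t, e in zip(tv, ev)
    )
    return total / (n * dim)


# Two training vectors of one class and their estimates (assumed values).
true_y = [[1.0, 2.0], [3.0, 4.0]]
est_y = [[1.0, 1.0], [2.0, 4.0]]
print(average_classification_error(true_y, est_y))  # 0.5
```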
  • The trained low-band feature vector set $x_j\ (j \in \{1, \ldots, M\})$ is the preset statistical characteristic classification feature vector set
  • the trained state transition matrix set $H_j\ (j \in \{1, \ldots, M\})$ is the preset statistical characteristic classification state transition matrix set
  • the trained post-processing factor set is the preset post-processing factor set.
  • The classification information obtained in the piecewise linear mapping band extension algorithm is used to adaptively post-process, per class, the extended-band parameter information obtained by that algorithm, so that the extended-band parameter information is better targeted and the inter-frame transition is smoother,
  • and the resulting extended signal gives a better listening experience.
  • This embodiment provides a band extension method that extends wideband to ultra-wideband in an ultra-wideband decoder. It can be understood that the method of this embodiment can also be applied to extending narrowband to wideband and narrowband to ultra-wideband.
  • The example is not to be construed as limiting the embodiments of the invention.
  • The low-band signal is a wideband signal
  • with a signal bandwidth of 0 to 7 kHz
  • the high-band signal is an ultra-wideband signal
  • with a signal bandwidth of 7 to 14 kHz.
  • The sampling rate of the synthesized signal is 32 kHz
  • The specific band extension method is shown in FIG. 3 and includes the following steps:
  • The wideband decoded signal of the current frame is obtained.
  • The feature vector of the wideband signal is illustrated here using the time-domain envelope and the linear prediction coefficients of the wideband signal as an example.
  • The linear prediction coefficients of the wideband signal are $LPC_{low} = \{LPC_{low}(0), LPC_{low}(1), \ldots, LPC_{low}(K-1)\}$.
  • The time-domain envelope is calculated by dividing one frame of N samples into L subframes of N/L samples each, calculating the energy of each subframe separately, and using the energies of the L subframes as the time-domain envelope of the signal.
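The subframe energy computation can be sketched as follows. The text does not say how the energy is normalized, so the sketch assumes a plain sum of squared samples per subframe; the function name `time_envelope` and the sample values are hypothetical.

```python
def time_envelope(frame, num_subframes):
    """Split a frame of N samples into L subframes and return each subframe's energy."""
    if len(frame) % num_subframes != 0:
        raise ValueError("frame length must be a multiple of the subframe count")
    size = len(frame) // num_subframes  # N / L samples per subframe
    return [
        sum(s * s for s in frame[k * size:(k + 1) * size])
        for k in range(num_subframes)
    ]


# A toy frame with N = 8 samples and L = 4 subframes (assumed values).
frame = [1.0, -1.0, 2.0, 2.0, 0.0, 3.0, 0.5, 0.5]
print(time_envelope(frame, 4))  # [2.0, 8.0, 9.0, 0.5]
```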
  • The feature vector of the wideband signal can be written as $X = \{E_{low}(0), E_{low}(1), \ldots, E_{low}(L-1), LPC_{low}(0), LPC_{low}(1), \ldots, LPC_{low}(K-1)\}$.
  • 302: According to the obtained feature vector of the wideband signal and the preset statistical characteristic classification feature vector set, the statistical characteristic classification i is obtained. This can be implemented by vector quantization: the preset statistical characteristic classification feature vector set is used as a codebook, the codebook is searched for the codeword with the smallest distance from the feature vector $X$, and the index of that codeword is the statistical characteristic classification i.
  • Ultra-wideband parameters can be combined in different ways, as long as they reflect the characteristics of the UWB signal.
  • the ultra-wideband parameters include a time domain envelope and a frequency domain envelope of the ultra-wideband signal.
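  • The estimated UWB parameters can be written as $\{E_{high}(0), E_{high}(1), \ldots, E_{high}(L-1), G_{high}(0), G_{high}(1), \ldots, G_{high}(P-1)\}$, where $E_{high}(0), E_{high}(1), \ldots, E_{high}(L-1)$ are the estimated time-domain envelope parameters of the ultra-wideband signal and
  • $G_{high}(0), G_{high}(1), \ldots, G_{high}(P-1)$ are the estimated frequency-domain envelope parameters of the ultra-wideband signal.
  • The estimated ultra-wideband parameters are adaptively adjusted to obtain the adjusted (post-processed) ultra-wideband parameters.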
  • the preset post-processing factor set may have different combinations, and may include only an inter-frame smoothing factor, an inter-frame smoothing factor and an intra-frame smoothing factor, and may also include different post-processing factors such as a state hopping factor.
  • the parameters of the preset post-processing factor set are classified according to different statistical characteristics, which embodies the adaptive characteristics of the post-processing method and fully utilizes the characteristics of statistical feature classification.
  • the post-processing factor set includes an intra-frame smoothing factor set and an inter-frame smoothing factor set.
  • For the statistical characteristic classification i, the corresponding intra-frame smoothing factor is written $\alpha_i$ and the corresponding inter-frame smoothing factor $\beta_i$.
  • The process of adaptively adjusting the estimated ultra-wideband parameters includes:
  • According to the intra-frame smoothing factor $\alpha_i$ corresponding to i, the time-domain envelope parameters in the estimated ultra-wideband parameters are adjusted:
  • $E_{post}(0) = \alpha_i \cdot E_{high}(0) + (1 - \alpha_i) \cdot E_{post}^{(-1)}(L-1)$
  • where $E_{post}$ represents the adjusted ultra-wideband time-domain envelope of the current frame and the superscript $(-1)$ marks the previous frame.
  • According to the inter-frame smoothing factor corresponding to i, the frequency-domain envelope parameters in the estimated ultra-wideband parameters are initially adjusted.
  • $G_{post}$ represents the ultra-wideband frequency-domain envelope of the current frame after the initial adjustment.
  • const is the statistical average of the sub-band energy in the ultra-wideband frequency domain.
  • Because the ultra-wideband parameters include a time-domain envelope and a frequency-domain envelope, the frequency-domain band extension method may be adopted.
  • The wideband frequency-domain spectrum is copied as the ultra-wideband spectrum, and the ultra-wideband spectrum is then shaped according to the adjusted frequency-domain envelope.
  • After shaping, the ultra-wideband spectrum is transformed into the time domain, and time-domain shaping is performed according to the adjusted time-domain envelope to obtain the reconstructed high-band signal.
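The copy-and-shape step in the frequency domain can be sketched as below. The sketch assumes that each frequency-domain envelope value is the target RMS level of one equal-width sub-band; that interpretation, the function name `extend_spectrum`, and the numbers are assumptions. The transform back to the time domain and the time-domain shaping are left out.

```python
import math


def extend_spectrum(wideband_spectrum, freq_envelope):
    """Copy the wideband spectrum and scale each sub-band to its target RMS level."""
    copied = list(wideband_spectrum)  # the copy serves as the raw high-band spectrum
    size = len(copied) // len(freq_envelope)  # coefficients per sub-band
    shaped = []
    for p, target in enumerate(freq_envelope):
        band = copied[p * size:(p + 1) * size]
        rms = math.sqrt(sum(c * c for c in band) / size)
        gain = target / rms if rms > 0.0 else 0.0
        shaped.extend(c * gain for c in band)
    return shaped


# Four spectral coefficients, P = 2 sub-bands (assumed values).
spectrum = [3.0, -3.0, 1.0, 1.0]
shaped = extend_spectrum(spectrum, [1.0, 2.0])
print([round(c, 6) for c in shaped])  # [1.0, -1.0, 2.0, 2.0]
```

In this sketch any trailing coefficients that do not fill a whole sub-band are dropped, so the spectrum length should be a multiple of the number of sub-bands.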
  • The method provided by this embodiment of the invention adds adaptive post-processing on top of the piecewise linear mapping band extension algorithm based on statistical characteristics, and it makes effective use of the classification information obtained by that algorithm.
  • The parameter information of the extended band obtained by the piecewise linear mapping band extension algorithm is adaptively post-processed per class, so that the extended-band parameter information is better targeted, the inter-frame transition is smoother, and the extended signal gives a better listening experience.
  • the embodiment of the present invention further provides a device for frequency band expansion, as shown in FIG. 4, comprising: a vector obtaining unit 401, configured to acquire a feature vector of a low frequency band signal;
  • the classifying unit 402 is configured to classify the low frequency band signal according to the feature vector and the preset statistical characteristic classification feature vector set to obtain a statistical characteristic classification result;
  • the estimating unit 403 is configured to obtain the estimated high-band parameters according to the statistical characteristic classification result, the feature vector, and the preset statistical characteristic classification state transition matrix;
  • the adjusting unit 404 is configured to adjust the estimated high-band parameters according to the statistical characteristic classification result and the preset post-processing smoothing factor set, to obtain the adjusted high-band parameters;
  • the signal reconstruction unit 405 is configured to reconstruct the high frequency band signal according to the adjusted high frequency band parameter.
  • the foregoing adjusting unit 404 shown in FIG. 5 includes:
  • the post-processing factor query unit 501 is configured to query, according to the statistical characteristic classification result, the post-processing factor parameter corresponding to the statistical characteristic classification result in the preset post-processing smoothing factor set;
  • the adjusting subunit 502 is configured to adjust the estimated high frequency band parameter according to the post processing factor parameter corresponding to the statistical characteristic classification result to obtain the adjusted high frequency band parameter.
  • the foregoing estimating unit 403 includes:
  • the matrix query unit 601 is configured to query, according to the statistical characteristic classification result, a state transition matrix corresponding to the classification result in a preset statistical characteristic classification state transition matrix;
  • the prediction sub-unit 602 is configured to obtain an estimated high-band parameter according to the state transition matrix and the feature vector.
  • the foregoing adjusting unit 404 includes:
  • a first adjusting unit 701, configured to adjust the time-domain envelope parameter in the estimated high-band parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic classification;
  • a second adjusting unit 702, configured to initially adjust the frequency-domain envelope parameter in the estimated high-band parameters according to the inter-frame smoothing factor corresponding to the statistical characteristic classification;
  • a third adjusting unit 703, configured to re-adjust the initially adjusted high-band frequency-domain envelope parameter according to the intra-frame smoothing factor corresponding to the statistical characteristic classification, to obtain the adjusted high-band parameters.
  • The apparatus adds adaptive post-processing on top of the piecewise linear mapping band extension algorithm based on statistical characteristics, and it makes effective use of the classification information obtained in that algorithm.
  • The parameter information of the extended band obtained by the piecewise linear mapping band extension algorithm is adaptively post-processed per class, so that the extended-band parameter information is better targeted, the inter-frame transition is smoother, and the extended signal gives a better listening experience.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Description

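Method and apparatus for band extension
This application claims priority to Chinese Patent Application No. 201010233033.2, filed with the Chinese Patent Office on July 16, 2010 and entitled "Method and apparatus for band extension", which is incorporated herein by reference in its entirety.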
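Technical Field
The present invention relates to the field of communications technologies, and in particular, to a method and apparatus for band extension.
Background
With the development of bearer technology, people are increasingly dissatisfied with the quality of narrowband speech codecs, so speech codecs have gradually been extended to wideband (WB) and super wideband (SWB). For example, the International Telecommunication Union (ITU) has introduced wideband speech codec standards such as G.722, G.722.1, G.722.2, and G.729.1; the Third Generation Partnership Project (3GPP) has introduced the wideband speech codec standard Adaptive Multi-Rate Wideband (AMR-WB), that is, ITU's G.722.2; and 3GPP2 has introduced Variable-Rate Multimode Wideband (VMR-WB). In addition, the ITU has recently proposed the G.729.1 & G.718 joint super wideband, the G.711, WB & G.722 joint super wideband, and so on. These standards are all extended from narrowband: the core layer generally uses Code-Excited Linear Prediction (CELP) coding, while the wideband and super wideband parts use transform coding. There are many kinds of transform coding, such as the Modified Discrete Cosine Transform (MDCT) and Transform Coded Excitation (TCX).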
频带扩展在语音 /音频编码领域非常广泛的使用, 可以有效地提高限带语 音 /音乐的感知质量, 在终端上使用的基于频带扩展的质量增强技术就是一类 很好的应用实例。
频带扩展技术还被广范应用于嵌入式变速率语音编码器中,特别是在传输 信道条件发生变化时产生的音频带宽切换。 常见的带宽切换主要有窄带 ( Narrow band, NB )、 宽带、 超宽带、 全带 (Full Band, FB)之间的切换。
实现频带扩展的方法可以分为有边信息的频带扩展和无边信息的频带扩 展两种。 有边信息的频带扩展需要在编码端提取待扩展频带的一些特征信息, 并将这些信息产送到解码端,指导解码端进行相应的频带扩展。无边信息的频 带扩展又称为盲扩, 不需要在编码端提取信息, 只需要根据解码端得到的部分 频带的信息通过一定的估计算法人工产生所需扩展频带的信息。
频带扩展的方法还可以分为基于时域的扩展和基于频域的扩展。基于时域 的扩展通常是基于解码端得到的部分频带的时域信息进行时域及频域的整形 后得到所需扩展频带的时域信息,从而实现频带扩展。基于频域的扩展通常是 基于解码端得到的部分频带的频域信息进行频域及时域的整形后得到所需扩 展频带的频域信息, 从而实现频带扩展。
目前, 无边信息的频带扩展技术一般在时域进行的处理, 其中有一种方法 是基于统计特性的分段线性映射频带扩展法。 这种方法的实现步骤如下: 1、 提取解码得到的部分频带的特征矢量;
2、 通过对提取的特征矢量与频带扩展前预先训练得到的统计特性分类特 征矢量集进行比较, 对信号进行分类; 上述训练是指: 根据一定的规则, 从一 个数据集中间提取出有用信息,使用这些有用信息的指导将这些数据分成不同 的类, 对于同一类的数据用其对应的一个有用信息来表示。
3、 根据上述分的类对应的预先设定的状态转移矩阵, 得到所需扩展频带 的参数信息, 从而实现频带扩展。
发明人在实现本发明的过程中发现: 由于将信号分成有限的几类, 因此可 以生成的扩展频带的参数信息只有有限的几种, 无法适配广泛的信号特征,致 使帧间过渡不平滑, 导致听觉感受差。
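Summary
The technical problem to be solved by the embodiments of the present invention is to provide a method and apparatus for band extension that improve the listening experience.
To solve the foregoing technical problem, the embodiments of the present invention provide the following technical solutions.
A method for band extension includes:
obtaining a feature vector of a low-band signal;
classifying the low-band signal according to the feature vector and a preset set of statistical-characteristic classification feature vectors, to obtain a statistical characteristic classification result;
obtaining estimated high-band parameters according to the statistical characteristic classification result, the feature vector, and a preset statistical-characteristic classification state transition matrix;
adjusting the estimated high-band parameters according to the statistical characteristic classification result and a preset post-processing smoothing factor set, to obtain adjusted high-band parameters; and
reconstructing a high-band signal according to the adjusted high-band parameters.
An apparatus for band extension includes:
a vector obtaining unit, configured to obtain a feature vector of a low-band signal;
a classification unit, configured to classify the low-band signal according to the feature vector and a preset set of statistical-characteristic classification feature vectors, to obtain a statistical characteristic classification result;
an estimation unit, configured to obtain estimated high-band parameters according to the statistical characteristic classification result, the feature vector, and a preset statistical-characteristic classification state transition matrix;
an adjustment unit, configured to adjust the estimated high-band parameters according to the statistical characteristic classification result and a preset post-processing smoothing factor set, to obtain adjusted high-band parameters; and
a signal reconstruction unit, configured to reconstruct a high-band signal according to the adjusted high-band parameters.
The foregoing technical solutions have the following beneficial effects. Adaptive post-processing is added to the piecewise linear mapping band extension algorithm based on statistical characteristics. The method makes effective use of the classification information obtained in that algorithm and applies class-specific adaptive post-processing to the extended-band parameter information the algorithm produces, so that the extended-band parameter information is better targeted, inter-frame transitions are smoother, and the extended signal provides a better listening experience.
Brief Description of the Drawings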
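To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a method for band extension according to Embodiment 1 of the present invention;
FIG. 2 is a schematic flowchart of a method for band extension according to Embodiment 2 of the present invention;
FIG. 3 is a schematic flowchart of a method for band extension according to Embodiment 3 of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus for band extension according to Embodiment 4 of the present invention;
FIG. 5 is a schematic structural diagram of another apparatus for band extension according to Embodiment 4 of the present invention;
FIG. 6 is a schematic structural diagram of another apparatus for band extension according to Embodiment 4 of the present invention;
FIG. 7 is a schematic structural diagram of another apparatus for band extension according to Embodiment 4 of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
Embodiment 1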
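This embodiment of the present invention provides a method for band extension. As shown in FIG. 1, the method includes:
101: Obtain a feature vector of a low-band signal.
Specifically, the feature vector may include a time-domain envelope and linear prediction coefficients. The time-domain envelope represents the energy of each subframe of the signal in the time domain, and the linear prediction coefficients represent the positions and amplitudes of the formants of the signal. The feature vector may also include other parameters, such as the frequency-domain envelope or the frequency-domain linear prediction coefficients of the low-band signal; this embodiment does not limit this.
102: Classify the low-band signal according to the feature vector and a preset set of statistical-characteristic classification feature vectors, to obtain a statistical characteristic classification result.
103: Obtain estimated high-band parameters according to the statistical characteristic classification result, the feature vector, and a preset statistical-characteristic classification state transition matrix.
Specifically, in step 103, obtaining the estimated high-band parameters according to the statistical characteristic classification result, the feature vector, and the preset statistical-characteristic classification state transition matrix includes:
querying the preset statistical-characteristic classification state transition matrix for the state transition matrix corresponding to the classification result, and obtaining the estimated high-band parameters according to that state transition matrix and the feature vector.
The estimated high-band parameters may also be obtained in other ways, for example in either of the following two ways: (1) According to the obtained statistical characteristic classification result, query the preset statistical-characteristic classification state transition matrix for the state transition matrix corresponding to the classification result; multiply that state transition matrix by the feature vector of the low-band signal to obtain the estimated high-band parameters. (2) According to the obtained statistical characteristic classification result, query the preset statistical-characteristic classification state transition matrix for the state transition mapping index value corresponding to the classification result; look up the corresponding estimated high-band parameters in a table using that index value.
Specifically, the high-band parameters may include a time-domain envelope and a frequency-domain envelope. The time-domain envelope represents the energy of each subframe of the signal in the time domain, and the frequency-domain envelope represents the gain of each subband of the signal in the frequency domain. The high-band parameters may also include other parameters, such as the time-domain linear prediction coefficients or the frequency-domain linear prediction coefficients of the high-band signal; this embodiment does not limit this.
104: Adjust the estimated high-band parameters according to the statistical characteristic classification result and a preset post-processing smoothing factor set, to obtain adjusted high-band parameters.
If the high-band parameters include a time-domain envelope and a frequency-domain envelope, and the post-processing factor parameters include an intra-frame smoothing factor, adjusting the estimated high-band parameters may be: adjusting the time-domain envelope parameters and the frequency-domain envelope parameters in the estimated high-band parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic class, to obtain the adjusted high-band parameters.
Specifically, adjusting the estimated high-band parameters according to the statistical characteristic classification result and the preset post-processing smoothing factor set includes: querying the preset post-processing smoothing factor set for the post-processing factor parameters corresponding to the statistical characteristic classification result, and adjusting the estimated high-band parameters according to those post-processing factor parameters, to obtain the adjusted high-band parameters.
The estimated high-band parameters may also be adjusted in other ways, for example in any of the following three ways. In each of them, the preset post-processing smoothing factor set is first queried for the post-processing factor parameters corresponding to the obtained statistical characteristic classification result, and the estimated high-band parameters are then adaptively adjusted according to those parameters to obtain the adjusted high-band parameters: (1) by intra-frame and/or inter-frame smoothing of the estimated high-band parameters; (2) by attenuating or boosting the estimated high-band parameters; (3) by intra-frame and/or inter-frame smoothing of the estimated high-band parameters combined with attenuation or boosting.
105: Reconstruct the high-band signal according to the adjusted high-band parameters.
Specifically, the post-processing factor parameters may include an intra-frame smoothing factor and an inter-frame smoothing factor.
More specifically, adjusting the estimated high-band parameters includes: adjusting the time-domain envelope parameters in the estimated high-band parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic class; preliminarily adjusting the frequency-domain envelope parameters in the estimated high-band parameters according to the inter-frame smoothing factor corresponding to the statistical characteristic class; and readjusting the preliminarily adjusted high-band frequency-domain envelope parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic class, to obtain the adjusted high-band parameters.
The method provided by this embodiment adds adaptive post-processing to the piecewise linear mapping band extension algorithm based on statistical characteristics. It makes effective use of the classification information obtained in that algorithm and applies class-specific adaptive post-processing to the extended-band parameter information the algorithm produces, so that the extended-band parameter information is better targeted, inter-frame transitions are smoother, and the extended signal provides a better listening experience.
Embodiment 2
As shown in FIG. 2, the band extension method of this embodiment includes the following steps: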
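201: Decode to obtain the low-band signal.
202: Extract the feature vector $x_f$ of the low-band signal.
The feature vector can be composed in various ways, as long as it reflects the characteristics of the low-band signal. For example, it may contain the time-domain envelope and the linear prediction coefficients of the low-band signal, or the time-domain envelope and the frequency-domain envelope of the low-band signal.
203: Classify the low-band signal according to the obtained feature vector $x_f$ and the preset set of statistical-characteristic classification feature vectors $x_{f,j}$ ($j \in \{1, \ldots, M\}$), to obtain the statistical characteristic class $i$.
This can be implemented by vector quantization, with the preset set of statistical-characteristic classification feature vectors used as the codebook. The codebook is an array containing the classification feature vectors of the $M$ classes in order; each feature vector is a codeword, so the $M$ feature vectors are $M$ codewords. The index of a codeword indicates its position in the codebook and corresponds to the class number. The codebook is searched for the codeword with the minimum distance to the feature vector $x_f$, and the index of that codeword in the codebook is the statistical characteristic class $i$:
$$i = \arg\min_{j \in \{1, \ldots, M\}} \left\| x_f - x_{f,j} \right\|^2$$
where $\|\cdot\|^2$ denotes computing the mean square error.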
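The codebook search in step 203 can be sketched as a nearest-codeword lookup. This is a minimal illustration only, not the patent's implementation: the function name `classify` and the list-of-lists codebook layout are assumptions, the index is 0-based rather than 1 to M, and the summed squared error is used because it selects the same codeword as the mean square error.

```python
def classify(feature, codebook):
    """Return the 0-based index of the codeword closest to `feature`.

    `codebook` is a list of M codewords, each the same length as `feature`
    (hypothetical layout chosen for this sketch).
    """
    best_index, best_dist = -1, float("inf")
    for index, codeword in enumerate(codebook):
        # Summed squared error; dividing by the dimension (mean square
        # error) would not change which codeword wins.
        dist = sum((a - b) ** 2 for a, b in zip(feature, codeword))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


if __name__ == "__main__":
    toy_codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]
    print(classify([0.9, 1.2], toy_codebook))
```

The returned index is what the later steps use to select the class-specific state transition matrix and post-processing factors.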
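204: Estimate the high-band parameters $\hat{y}$ according to the obtained statistical characteristic class $i$, the feature vector $x_f$ of the low-band signal, and the preset set of statistical-characteristic classification state transition matrices $H$: $\hat{y} = H_i^T \cdot x_f$.
Here $H_i^T$ denotes the transpose of the state transition matrix $H_i$ corresponding to statistical characteristic class $i$ (class $i$ corresponds to the $i$-th entry of the set $H$). The set of statistical characteristic classes and the state transition matrices are trained together, which keeps them in one-to-one correspondence.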
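As a rough sketch of step 204, the mapping is a matrix-vector product with the transposed matrix of the selected class. The names below (`estimate_high_band`, `transition_matrices`) and the nested-list layout, with one row per low-band feature and one column per high-band parameter, are assumptions made for illustration.

```python
def estimate_high_band(class_index, feature, transition_matrices):
    """Compute y_hat = H_i^T * x_f for the selected class.

    transition_matrices[class_index] is assumed to be a nested list with
    len(feature) rows and one column per high-band parameter.
    """
    h = transition_matrices[class_index]
    if len(h) != len(feature):
        raise ValueError("matrix row count must match feature length")
    n_out = len(h[0])
    # Column c of H_i dotted with x_f gives component c of H_i^T * x_f.
    return [
        sum(h[r][c] * feature[r] for r in range(len(feature)))
        for c in range(n_out)
    ]


if __name__ == "__main__":
    matrices = [[[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]]
    print(estimate_high_band(0, [2.0, 5.0], matrices))
```

Because each class has its own matrix, the overall mapping is piecewise linear in the feature vector.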
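The high-band parameters can be composed in different ways, as long as they reflect the characteristics of the high-band signal. For example, they may contain the time-domain envelope and the linear prediction coefficients of the high-band signal, or the time-domain envelope and the frequency-domain envelope of the high-band signal. Note that the composition of the high-band parameters need not match that of the feature vector described above; this does not affect the implementation of this embodiment.
205: Adaptively adjust the estimated high-band parameters according to the statistical characteristic class $i$ and the preset post-processing factor set, to obtain the adjusted high-band parameters $y_{post}$.
The preset post-processing factor set can be composed in different ways. It may contain only an inter-frame smoothing factor, it may contain an inter-frame smoothing factor and an intra-frame smoothing factor, and it may also contain other post-processing factors such as a state transition factor. Each parameter in the preset post-processing factor set can be specific to a statistical characteristic class, which makes the post-processing adaptive and takes full advantage of the statistical characteristic classification.
206: Reconstruct the high-band signal according to the adjusted high-band parameters.
How the high-band signal is reconstructed depends mainly on what the high-band parameters contain. For example, when the high-band parameters contain a time-domain envelope and a frequency-domain envelope, frequency-domain band extension can be used: the low-band spectrum is shaped according to the frequency-domain envelope, transformed to the time domain, and then shaped according to the time-domain envelope to obtain the reconstructed high-band signal. Time-domain band extension can also be used: the low-band time-domain excitation signal is shaped according to the time-domain envelope, transformed to the frequency domain, shaped according to the frequency-domain envelope, and finally transformed back to the time domain to obtain the reconstructed high-band signal. When the high-band parameters contain high-band linear prediction coefficients, the low-band time-domain excitation signal can be passed through the synthesis filter formed by the high-band linear prediction coefficients to obtain the reconstructed high-band signal.
The preset set of statistical-characteristic classification feature vectors, the preset set of statistical-characteristic classification state transition matrices, and the preset post-processing factor set are obtained from the statistical characteristics of a large number of signals. The training method is as follows.
First, the low-band feature vector and the corresponding high-band parameter vector of every signal in the training set are extracted, forming a low-band feature vector training set and a high-band parameter vector training set. The training set is the data set used for training and consists of preselected speech and audio corpora.
Then, according to different statistical characteristics and using a clustering method, the low-band feature vector set $x_{f,j}$ ($j \in \{1, \ldots, M\}$) is trained from the low-band feature vector training set, and the corresponding high-band parameter vector set $y_j$ ($j \in \{1, \ldots, M\}$) is obtained from the same clustering. From the training data of each statistical characteristic class $j \in \{1, \ldots, M\}$ (the low-band feature vectors $X_j$ and the high-band parameter vectors $Y_j$ belonging to class $j$), the corresponding state transition matrix is computed:
$$H_j = X_j^{+} Y_j$$
where $X_j^{+}$ denotes the pseudo-inverse of $X_j$, $X_j^{+} = (X_j^T X_j)^{-1} X_j^T$. The state transition matrices of all statistical characteristic classes $j \in \{1, \ldots, M\}$ form the state transition matrix set $H_j$ ($j \in \{1, \ldots, M\}$).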
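The pseudo-inverse in this training step is consistent with an ordinary least-squares fit. The derivation below is a sketch under stated assumptions, not text from the source: $X_j$ is taken to stack the low-band feature vectors of class $j$ as rows, $Y_j$ to stack the matching high-band parameter vectors as rows, and $X_j^T X_j$ is assumed invertible.

$$
\begin{aligned}
J(H_j) &= \left\| X_j H_j - Y_j \right\|_F^2 \\
\frac{\partial J}{\partial H_j} &= 2\, X_j^T \left( X_j H_j - Y_j \right) = 0 \\
X_j^T X_j H_j &= X_j^T Y_j \\
H_j &= \left( X_j^T X_j \right)^{-1} X_j^T Y_j = X_j^{+} Y_j
\end{aligned}
$$

Under this row convention a single training pair satisfies $y^T \approx x^T H_j$, which is the same relation as the run-time mapping $\hat{y} = H_j^T x_f$ used for estimation.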
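Using the trained low-band feature vector set and state transition matrix set, the estimated high-band vector $\hat{y}$ of every signal in the training set is computed by the classified linear mapping band extension method.
The reliability factor of each statistical characteristic class is then computed and used as a parameter in the post-processing factor set. The reliability factor is denoted $\alpha_i$. The post-processing factor set is broader than this: it may contain only $\alpha_i$, or it may also contain factors other than the reliability factor. The average classification error of statistical characteristic class $i$ is:
[Equation not recoverable from the source text: the average classification error of class $i$, a double sum over $n$ and $l$ of the error between $\hat{y}_i^{(n)}(l)$ and $y_i^{(n)}(l)$, normalized by $N \cdot N_y$]
where $\hat{y}_i^{(n)}(l)$ denotes the $l$-th component of the $n$-th estimated high-band vector belonging to statistical characteristic class $i$, $y_i^{(n)}(l)$ denotes the $l$-th component of the $n$-th actual high-band parameter vector belonging to class $i$, $N$ is the number of vectors in the training set that belong to class $i$, and $N_y$ is the dimension of the high-band parameter vector.
From the average classification error of statistical characteristic class $i$, the reliability factor of class $i$ is obtained:
[Equation not recoverable from the source text: the reliability factor $\alpha_i$ as a function of the average classification error of class $i$ and the constant $C$]
where $C$ is a constant.
In practice, a single group of reliability factors $\alpha_i$ may be computed, or several groups, such as $\alpha_i$ and $\beta_i$, may be computed separately for the different parameters contained in the high-band parameter vector and combined into a reliability factor vector that serves as the post-processing factor set. This embodiment does not limit the number of groups of reliability factors. As an example with several groups: if the high-band parameter vector contains time-domain envelope parameters and frequency-domain envelope parameters, the method above can be used to compute the reliability factor of each statistical characteristic class for each of them, denoted $\alpha_i$ and $\beta_i$, forming the reliability factor vector $\{\alpha_i, \beta_i\}$.
In the example above, the trained low-band feature vector set $x_{f,j}$ ($j \in \{1, \ldots, M\}$) is the preset set of statistical-characteristic classification feature vectors, the trained state transition matrix set $H_j$ ($j \in \{1, \ldots, M\}$) is the preset set of statistical-characteristic classification state transition matrices, and the trained post-processing factor set is the preset post-processing factor set.
The method provided by this embodiment adds adaptive post-processing to the piecewise linear mapping band extension algorithm based on statistical characteristics. It makes effective use of the classification information obtained in that algorithm and applies class-specific adaptive post-processing to the extended-band parameter information the algorithm produces, so that the extended-band parameter information is better targeted, inter-frame transitions are smoother, and the extended signal provides a better listening experience.
Embodiment 3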
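This embodiment provides a band extension method from wideband to super-wideband that is applied in a super-wideband decoder. The method can also be applied to extension from narrowband to wideband or from narrowband to super-wideband; this embodiment is an example and should not be construed as limiting the embodiments of the present invention. In this embodiment the low-band signal is the wideband signal, with a bandwidth of 0 to 7 kHz, and the high-band signal is the super-wideband signal, with a bandwidth of 7 to 14 kHz. The synthesized signal has a sampling rate of 32 kHz and is processed in frames of 20 ms, that is, N = 160 samples per frame. As shown in FIG. 3, the band extension method includes the following steps:
301: Decode to obtain the wideband signal, and extract the feature vector $x_f$ of the wideband signal.
The wideband decoded signal of the current frame, denoted $x_w$, is obtained using the wideband decoding method of the super-wideband decoder. In this embodiment, the feature vector of the wideband signal contains the time-domain envelope and the linear prediction coefficients of the wideband signal. First, the $K$-th order linear prediction coefficients of the wideband signal, $LPC_{low} = \{LPC_{low}(0), LPC_{low}(1), \ldots, LPC_{low}(K-1)\}$, are computed, for example with the Levinson-Durbin algorithm. In this embodiment $K = 64$; other orders may also be chosen. Then the time-domain envelope is computed: one frame of the signal is divided into $L$ subframes of $N/L$ samples each, the energy of each subframe is computed, and the energies of the $L$ subframes are used as the time-domain envelope of the signal, $E_{low} = \{E_{low}(0), E_{low}(1), \ldots, E_{low}(L-1)\}$, where the energy $E_{low}(i)$ of the $i$-th subframe is:
[Equation image imgf000012_0001 in the source: definition of the subframe energy $E_{low}(i)$]
In this embodiment $L = 8$; other time-domain subframe divisions may also be chosen.
The feature vector $x_f$ of the wideband signal can then be written as:
$$x_f = \{E_{low}(0), E_{low}(1), \ldots, E_{low}(L-1), LPC_{low}(0), LPC_{low}(1), \ldots, LPC_{low}(K-1)\}$$
302: Classify the wideband signal according to the obtained feature vector $x_f$ and the preset set of statistical-characteristic classification feature vectors $x_{f,j}$ ($j \in \{1, \ldots, M\}$), to obtain the statistical characteristic class $i$.
This can be implemented by vector quantization, with the preset set of statistical-characteristic classification feature vectors used as the codebook. The codebook is searched for the codeword with the minimum distance to the feature vector $x_f$, and the index of that codeword is the statistical characteristic class $i$:
$$i = \arg\min_{j \in \{1, \ldots, M\}} \left\| x_f - x_{f,j} \right\|^2$$
where $\|\cdot\|^2$ denotes computing the mean square error. In this embodiment $M = 8$.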
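The feature extraction in step 301 can be sketched as follows. Several points are assumptions made for this illustration and are not taken from the source: the subframe energy is taken as the plain sum of squared samples (the source equation is an image and could include a normalization), the autocorrelation is computed without windowing, and the feature vector keeps the predictor coefficients after the leading 1 of the prediction-error filter. All function names are hypothetical. The defaults L = 8 and K = 64 follow the embodiment.

```python
def subframe_energies(frame, num_subframes):
    """Energy (sum of squares, assumed) of each equal-length subframe."""
    if len(frame) % num_subframes != 0:
        raise ValueError("frame length must be a multiple of num_subframes")
    size = len(frame) // num_subframes
    return [
        sum(s * s for s in frame[i * size:(i + 1) * size])
        for i in range(num_subframes)
    ]


def autocorrelation(frame, max_lag):
    """Unwindowed autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [
        sum(frame[t] * frame[t - lag] for t in range(lag, n))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Levinson-Durbin recursion.

    Returns a[0..order] with a[0] = 1, the coefficients of the
    prediction-error filter A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate input (e.g. silence): keep remaining zeros
        acc = r[m] + sum(a[k] * r[m - k] for k in range(1, m))
        reflection = -acc / err
        new_a = a[:]
        for k in range(1, m):
            new_a[k] = a[k] + reflection * a[m - k]
        new_a[m] = reflection
        a = new_a
        err *= 1.0 - reflection * reflection
    return a


def feature_vector(frame, num_subframes=8, lpc_order=64):
    """Concatenate the time-domain envelope and the LPC coefficients."""
    energies = subframe_energies(frame, num_subframes)
    lpc = levinson_durbin(autocorrelation(frame, lpc_order), lpc_order)
    return energies + lpc[1:]
```

For a frame of N samples this yields a vector of L + K values, which is the layout the classification step 302 expects. A production coder would normally window the frame and apply lag windowing or bandwidth expansion before the recursion; those refinements are omitted here.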
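303: Estimate the super-wideband parameters $\hat{y}$ according to the obtained statistical characteristic class $i$, the feature vector $x_f$ of the wideband signal, and the preset set of statistical-characteristic classification state transition matrices $H$:
$$\hat{y} = H_i^T \cdot x_f$$
where $H_i^T$ denotes the transpose of the state transition matrix $H_i$ corresponding to statistical characteristic class $i$. The super-wideband parameters can be composed in different ways, as long as they reflect the characteristics of the super-wideband signal. In this embodiment, the super-wideband parameters contain the time-domain envelope and the frequency-domain envelope of the super-wideband signal. The estimated super-wideband parameters $\hat{y}$ can be written as:
$$\hat{y} = \{E_{high}(0), E_{high}(1), \ldots, E_{high}(L-1), G_{high}(0), G_{high}(1), \ldots, G_{high}(P-1)\}$$
where $E_{high}(0), E_{high}(1), \ldots, E_{high}(L-1)$ are the estimated time-domain envelope parameters of the super-wideband signal and $E_{high}(i)$ is the time-domain energy parameter of the $i$-th subframe, $i = 0, \ldots, L-1$. In this embodiment $L = 8$; other time-domain subframe divisions may also be chosen. $G_{high}(0), G_{high}(1), \ldots, G_{high}(P-1)$ are the estimated frequency-domain envelope parameters of the super-wideband signal and $G_{high}(i)$ is the frequency-domain gain factor of the $i$-th subband, $i = 0, \ldots, P-1$. In this embodiment $P = 18$; other frequency-domain subband divisions may also be chosen.
304: Adaptively adjust the estimated super-wideband parameters according to the statistical characteristic class $i$ and the preset post-processing factor set, to obtain the adjusted super-wideband parameters $y_{post}$.
The preset post-processing factor set can be composed in different ways. It may contain only an inter-frame smoothing factor, it may contain an inter-frame smoothing factor and an intra-frame smoothing factor, and it may also contain other post-processing factors such as a state transition factor. Each parameter in the preset post-processing factor set is specific to a statistical characteristic class, which makes the post-processing adaptive and takes full advantage of the statistical characteristic classification. In this embodiment, the post-processing factor set contains an intra-frame smoothing factor set and an inter-frame smoothing factor set. The intra-frame smoothing factor of statistical characteristic class $i$ is denoted $\alpha_i$, and the inter-frame smoothing factor is denoted $\beta_i$. Adaptively adjusting the estimated super-wideband parameters according to the statistical characteristic class $i$ and the preset post-processing factor set includes:
(1) Adjust the time-domain envelope parameters in the estimated super-wideband parameters according to the intra-frame smoothing factor $\alpha_i$ of statistical characteristic class $i$:
$$E^{(n)}_{high,post}(0) = \alpha_i \cdot E^{(n)}_{high}(0) + (1 - \alpha_i) \cdot E^{(n-1)}_{high}(L-1)$$
$$E^{(n)}_{high,post}(k) = \alpha_i \cdot E^{(n)}_{high}(k) + (1 - \alpha_i) \cdot E^{(n)}_{high}(k-1), \quad k = 1, \ldots, L-1$$
where $E^{(n-1)}_{high}(L-1)$ denotes the $(L-1)$-th unadjusted super-wideband time-domain envelope value of the previous frame, and $E^{(n)}_{high,post}$ denotes the adjusted super-wideband time-domain envelope of the current frame.
(2) Preliminarily adjust the frequency-domain envelope parameters in the estimated super-wideband parameters according to the inter-frame smoothing factor $\beta_i$ of statistical characteristic class $i$:
$$G^{(n)}_{high,post}(k) = \beta_i \cdot G^{(n)}_{high}(k) + (1 - \beta_i) \cdot G^{(n-1)}_{high}(k), \quad k = 0, \ldots, P-1$$
where $G^{(n)}_{high}$ denotes the unadjusted super-wideband frequency-domain envelope of the current frame, $G^{(n-1)}_{high}$ denotes the unadjusted super-wideband frequency-domain envelope of the previous frame, and $G^{(n)}_{high,post}$ denotes the preliminarily adjusted super-wideband frequency-domain envelope of the current frame.
(3) Readjust the preliminarily adjusted super-wideband frequency-domain envelope parameters according to the statistical characteristic class $i$:
[Equation garbled in the source: $G^{(n)}_{high,post2}(k)$ is computed from $G^{(n)}_{high,post}(k)$ and the constant factor $const$, for $k = 0, \ldots, P-1$]
where $const$ is a constant factor. In this embodiment, $const$ is the statistical average of the super-wideband frequency-domain subband energy.
The adjusted super-wideband parameters $y_{post}$ are then written as:
$$y_{post} = \{E^{(n)}_{high,post}(0), E^{(n)}_{high,post}(1), \ldots, E^{(n)}_{high,post}(L-1), G^{(n)}_{high,post2}(0), G^{(n)}_{high,post2}(1), \ldots, G^{(n)}_{high,post2}(P-1)\}$$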
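The first two adjustments of step 304 are first-order smoothing operations and can be sketched as below. This is an illustrative reading of the text rather than the patent's code: the function names are hypothetical, the inter-frame weight is taken to be the inter-frame smoothing factor of the selected class, and the readjustment of sub-step (3) is left out because its formula is not recoverable from the source.

```python
def smooth_time_envelope(envelope, prev_envelope, alpha):
    """Intra-frame smoothing of the time-domain envelope.

    Each subframe is mixed with the unadjusted value of the subframe before
    it; subframe 0 uses the last subframe of the previous frame.
    """
    out = [alpha * envelope[0] + (1.0 - alpha) * prev_envelope[-1]]
    for k in range(1, len(envelope)):
        out.append(alpha * envelope[k] + (1.0 - alpha) * envelope[k - 1])
    return out


def smooth_freq_envelope(gains, prev_gains, beta):
    """Inter-frame smoothing: mix each subband gain with the same subband
    of the previous (unadjusted) frame."""
    if len(gains) != len(prev_gains):
        raise ValueError("current and previous frames need equal subband counts")
    return [beta * g + (1.0 - beta) * p for g, p in zip(gains, prev_gains)]


if __name__ == "__main__":
    # Hypothetical per-class factors: class index -> (alpha, beta).
    factor_set = {0: (0.5, 0.25)}
    alpha, beta = factor_set[0]
    print(smooth_time_envelope([2.0, 4.0, 8.0], [0.0, 0.0, 6.0], alpha))
    print(smooth_freq_envelope([2.0, 4.0], [4.0, 8.0], beta))
```

A factor close to 1 leaves the estimate almost unchanged, while a smaller factor leans more on the neighbouring subframe or previous frame, which is how a per-class factor can smooth less reliable classes more strongly.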
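305: Reconstruct the high-band signal according to the adjusted super-wideband parameters.
How the super-wideband signal is reconstructed depends mainly on what the super-wideband parameters contain. In this embodiment, the super-wideband parameters contain a time-domain envelope and a frequency-domain envelope, so frequency-domain band extension can be used: first, the wideband spectrum is copied to serve as the super-wideband spectrum; the super-wideband spectrum is then shaped according to the adjusted frequency-domain envelope; the shaped spectrum is transformed to the time domain; and finally time-domain shaping is performed according to the adjusted time-domain envelope to obtain the reconstructed high-band signal.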
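The spectral copy and frequency-domain shaping in step 305 might look like the sketch below. It rests on assumptions the source does not state: subbands are taken to be of equal width, and shaping is taken to mean multiplying each copied coefficient by the gain factor of its subband. The function name is hypothetical, and the inverse transform and the time-domain shaping are not shown.

```python
def shape_copied_spectrum(wideband_spectrum, subband_gains):
    """Copy the wideband spectrum and scale each subband by its gain.

    Assumes equal-width subbands; the result stands in for the
    super-wideband spectrum before the inverse transform.
    """
    n_bins = len(wideband_spectrum)
    n_bands = len(subband_gains)
    if n_bands == 0 or n_bins % n_bands != 0:
        raise ValueError("bin count must be a positive multiple of band count")
    width = n_bins // n_bands
    return [
        coef * subband_gains[k // width]
        for k, coef in enumerate(wideband_spectrum)
    ]


if __name__ == "__main__":
    print(shape_copied_spectrum([1.0, 2.0, 3.0, 4.0], [2.0, 0.5]))
```

A real decoder could instead normalize each copied subband to a target energy before applying the gain; which of the two the source intends is not clear from the text.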
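The method provided by this embodiment adds adaptive post-processing to the piecewise linear mapping band extension algorithm based on statistical characteristics. It makes effective use of the classification information obtained in that algorithm and applies class-specific adaptive post-processing to the extended-band parameter information the algorithm produces, so that the extended-band parameter information is better targeted, inter-frame transitions are smoother, and the extended signal provides a better listening experience.
Embodiment 4
This embodiment of the present invention further provides an apparatus for band extension. As shown in FIG. 4, the apparatus includes:
a vector obtaining unit 401, configured to obtain a feature vector of a low-band signal;
a classification unit 402, configured to classify the low-band signal according to the feature vector and a preset set of statistical-characteristic classification feature vectors, to obtain a statistical characteristic classification result;
an estimation unit 403, configured to obtain estimated high-band parameters according to the statistical characteristic classification result, the feature vector, and a preset statistical-characteristic classification state transition matrix;
an adjustment unit 404, configured to adjust the estimated high-band parameters according to the statistical characteristic classification result and a preset post-processing smoothing factor set, to obtain adjusted high-band parameters; and
a signal reconstruction unit 405, configured to reconstruct a high-band signal according to the adjusted high-band parameters.
Specifically, as shown in FIG. 5, the adjustment unit 404 includes:
a post-processing factor query unit 501, configured to query the preset post-processing smoothing factor set for the post-processing factor parameters corresponding to the statistical characteristic classification result; and
an adjustment subunit 502, configured to adjust the estimated high-band parameters according to the post-processing factor parameters corresponding to the statistical characteristic classification result, to obtain the adjusted high-band parameters.
Specifically, as shown in FIG. 6, the estimation unit 403 includes:
a matrix query unit 601, configured to query the preset statistical-characteristic classification state transition matrix for the state transition matrix corresponding to the classification result; and
an estimation subunit 602, configured to obtain the estimated high-band parameters according to the state transition matrix and the feature vector.
Specifically, as shown in FIG. 7, the adjustment unit 404 includes:
a first adjustment unit 701, configured to adjust the time-domain envelope parameters in the estimated high-band parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic class;
a second adjustment unit 702, configured to preliminarily adjust the frequency-domain envelope parameters in the estimated high-band parameters according to the inter-frame smoothing factor corresponding to the statistical characteristic class; and
a third adjustment unit 703, configured to readjust the preliminarily adjusted high-band frequency-domain envelope parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic class, to obtain the adjusted high-band parameters.
The functions of the functional modules of the band extension apparatus in this embodiment can be implemented according to the methods in the foregoing method embodiments. For the specific implementation, refer to the related descriptions of those embodiments; details are not repeated here.
In summary, the apparatus provided by this embodiment adds adaptive post-processing to the piecewise linear mapping band extension algorithm based on statistical characteristics. It makes effective use of the classification information obtained in that algorithm and applies class-specific adaptive post-processing to the extended-band parameter information the algorithm produces, so that the extended-band parameter information is better targeted, inter-frame transitions are smoother, and the extended signal provides a better listening experience.
A person of ordinary skill in the art can understand that all or some of the steps of the methods in the foregoing embodiments may be completed by a program instructing related hardware. The program may be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The method and apparatus for band extension provided by the embodiments of the present invention are described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the embodiments is intended only to help understand the method and its core idea. A person of ordinary skill in the art may, based on the idea of the present invention, make changes to the specific implementations and the application scope. In conclusion, the content of this specification should not be construed as a limitation on the present invention.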

Claims

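1. A method for band extension, comprising:
obtaining a feature vector of a low-band signal;
classifying the low-band signal according to the feature vector and a preset set of statistical-characteristic classification feature vectors, to obtain a statistical characteristic classification result;
obtaining estimated high-band parameters according to the statistical characteristic classification result, the feature vector, and a preset statistical-characteristic classification state transition matrix;
adjusting the estimated high-band parameters according to the statistical characteristic classification result and a preset post-processing smoothing factor set, to obtain adjusted high-band parameters; and
reconstructing a high-band signal according to the adjusted high-band parameters.
2. The method according to claim 1, wherein adjusting the estimated high-band parameters according to the statistical characteristic classification result and the preset post-processing smoothing factor set comprises:
querying the preset post-processing smoothing factor set for the post-processing factor parameters corresponding to the statistical characteristic classification result; and
adjusting the estimated high-band parameters according to the post-processing factor parameters corresponding to the statistical characteristic classification result, to obtain the adjusted high-band parameters.
3. The method according to claim 1 or 2, wherein obtaining the estimated high-band parameters according to the statistical characteristic classification result, the feature vector, and the preset statistical-characteristic classification state transition matrix comprises:
querying the preset statistical-characteristic classification state transition matrix for the state transition matrix corresponding to the classification result; and
obtaining the estimated high-band parameters according to the state transition matrix and the feature vector.
4. The method according to claim 1 or 2, wherein
the feature vector comprises a time-domain envelope and linear prediction coefficients,
the time-domain envelope represents the energy of each subframe of the signal in the time domain, and the linear prediction coefficients represent the positions and amplitudes of the formants of the signal.
5. The method according to claim 1 or 2, wherein
the high-band parameters comprise a time-domain envelope and a frequency-domain envelope;
the post-processing factor parameters comprise an intra-frame smoothing factor; and
adjusting the estimated high-band parameters to obtain the adjusted high-band parameters comprises: adjusting the time-domain envelope parameters and the frequency-domain envelope parameters in the estimated high-band parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic class, to obtain the adjusted high-band parameters.
6. The method according to claim 1 or 2, wherein
the post-processing factor parameters comprise an intra-frame smoothing factor and an inter-frame smoothing factor.
7. The method according to claim 6, wherein
adjusting the estimated high-band parameters comprises:
adjusting the time-domain envelope parameters in the estimated high-band parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic class;
preliminarily adjusting the frequency-domain envelope parameters in the estimated high-band parameters according to the inter-frame smoothing factor corresponding to the statistical characteristic class; and
readjusting the preliminarily adjusted high-band frequency-domain envelope parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic class, to obtain the adjusted high-band parameters.
8. An apparatus for band extension, comprising:
a vector obtaining unit, configured to obtain a feature vector of a low-band signal;
a classification unit, configured to classify the low-band signal according to the feature vector and a preset set of statistical-characteristic classification feature vectors, to obtain a statistical characteristic classification result;
an estimation unit, configured to obtain estimated high-band parameters according to the statistical characteristic classification result, the feature vector, and a preset statistical-characteristic classification state transition matrix;
an adjustment unit, configured to adjust the estimated high-band parameters according to the statistical characteristic classification result and a preset post-processing smoothing factor set, to obtain adjusted high-band parameters; and
a signal reconstruction unit, configured to reconstruct a high-band signal according to the adjusted high-band parameters.
9. The apparatus according to claim 8, wherein the adjustment unit comprises:
a post-processing factor query unit, configured to query the preset post-processing smoothing factor set for the post-processing factor parameters corresponding to the statistical characteristic classification result; and
an adjustment subunit, configured to adjust the estimated high-band parameters according to the post-processing factor parameters corresponding to the statistical characteristic classification result, to obtain the adjusted high-band parameters.
10. The apparatus according to claim 8 or 9, wherein the estimation unit comprises:
a matrix query unit, configured to query the preset statistical-characteristic classification state transition matrix for the state transition matrix corresponding to the classification result; and
an estimation subunit, configured to obtain the estimated high-band parameters according to the state transition matrix and the feature vector.
11. The apparatus according to claim 10, wherein the adjustment unit comprises:
a first adjustment unit, configured to adjust the time-domain envelope parameters in the estimated high-band parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic class;
a second adjustment unit, configured to preliminarily adjust the frequency-domain envelope parameters in the estimated high-band parameters according to the inter-frame smoothing factor corresponding to the statistical characteristic class; and
a third adjustment unit, configured to readjust the preliminarily adjusted high-band frequency-domain envelope parameters according to the intra-frame smoothing factor corresponding to the statistical characteristic class, to obtain the adjusted high-band parameters.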
PCT/CN2011/075079 2010-07-16 2011-06-01 Method and apparatus for band extension WO2011144130A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2010102330332A CN102339607A (zh) 2010-07-16 2010-07-16 Method and apparatus for band extension
CN201010233033.2 2010-07-16

Publications (1)

Publication Number Publication Date
WO2011144130A1 true WO2011144130A1 (zh) 2011-11-24

Family

ID=44991196

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/075079 WO2011144130A1 (zh) 2010-07-16 2011-06-01 Method and apparatus for band extension

Country Status (2)

Country Link
CN (1) CN102339607A (zh)
WO (1) WO2011144130A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104517610B (zh) * 2013-09-26 2018-03-06 华为技术有限公司 频带扩展的方法及装置
WO2018070522A1 (ja) 2016-10-14 2018-04-19 公立大学法人大阪府立大学 嚥下診断装置およびプログラム

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101304261A (zh) * 2007-05-12 2008-11-12 华为技术有限公司 一种频带扩展的方法及装置
CN101620854A (zh) * 2008-06-30 2010-01-06 华为技术有限公司 频带扩展的方法、***和设备
CN101770777A (zh) * 2008-12-31 2010-07-07 华为技术有限公司 一种线性预测编码频带扩展方法、装置和编解码***

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7174135B2 (en) * 2001-06-28 2007-02-06 Koninklijke Philips Electronics N. V. Wideband signal transmission system
WO2006062202A1 (ja) * 2004-12-10 2006-06-15 Matsushita Electric Industrial Co., Ltd. 広帯域符号化装置、広帯域lsp予測装置、帯域スケーラブル符号化装置及び広帯域符号化方法
JP4876574B2 (ja) * 2005-12-26 2012-02-15 ソニー株式会社 信号符号化装置及び方法、信号復号装置及び方法、並びにプログラム及び記録媒体
US8041578B2 (en) * 2006-10-18 2011-10-18 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Encoding an information signal
US7912729B2 (en) * 2007-02-23 2011-03-22 Qnx Software Systems Co. High-frequency bandwidth extension in the time domain
CN101751926B (zh) * 2008-12-10 2012-07-04 华为技术有限公司 信号编码、解码方法及装置、编解码***

Also Published As

Publication number Publication date
CN102339607A (zh) 2012-02-01

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11783047

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11783047

Country of ref document: EP

Kind code of ref document: A1