JP4905962B2 - Method and apparatus for extracting HLAC feature from conversion value of one-dimensional signal - Google Patents

Publication number
JP4905962B2
Authority
JP
Japan
Prior art keywords
hlac
feature
dimensional
signal
gray
Prior art date
Legal status
Expired - Fee Related
Application number
JP2007020169A
Other languages
Japanese (ja)
Other versions
JP2008185845A (en)
Inventor
晃 佐宗
多喜夫 栗田
展之 大津
Current Assignee
National Institute of Advanced Industrial Science and Technology AIST
Original Assignee
National Institute of Advanced Industrial Science and Technology AIST
Priority date
Filing date
Publication date

Description

The present invention relates to an HLAC feature extraction method and apparatus that are effective for processing such as abnormality detection and the recognition, retrieval, or counting of specific signals, operating on transformed values of various one-dimensional signals, including acoustic signals such as speech, music, and environmental sounds, as well as electrocardiogram and seismic waveforms.

Prior art has already been proposed that extracts features from moving images using higher-order local autocorrelation (HLAC) in order to detect abnormal motion and to track moving objects in real time (see Patent Documents 1 to 3).
A method has also been proposed that converts a speech signal into the time-frequency domain by a short-time Fourier transform and extracts two-dimensional Gray HLAC features from it (see Non-Patent Document 1).

Patent Document 1: Japanese Patent Laid-Open No. 2005-092346
Patent Document 2: JP 2006-079272 A
Patent Document 3: JP 2006-163452 A
Non-Patent Document 1: Shunsuke Kato, Tetsuya Takiguchi, Yasuo Ariki, "Phoneme recognition by higher-order local autocorrelation features using Fisher weight maps," Acoustical Society of Japan 2005 Autumn Meeting, 1-P-10, pp. 171-172, September 2005.

All prior patents using higher-order local autocorrelation (HLAC) aim at extracting features from images (two-dimensional) or moving images (three-dimensional). As prior research applying HLAC to a one-dimensional signal, an example has been reported in which two-dimensional Gray HLAC features are computed from the time-frequency representation of a speech signal and used for phoneme recognition. In speech recognition, the shape of the distribution of short-time frequency components plays an important role. Because the prior research computes two-dimensional Gray HLAC features from the time-frequency representation of speech, the HLAC features are shift-invariant along the frequency axis as well as the time axis. As a result, speech signals whose frequency components are distributed with the same shape in different bands cannot be distinguished.

In view of the above problem, an object of the present invention is to provide an HLAC feature extraction method and apparatus that obtain HLAC features from transformed values of a one-dimensional signal while taking position information on the frequency axis into account.

In the method of the present invention for extracting HLAC features from transformed values of a one-dimensional signal, the one-dimensional signal is converted into a vector time series, and the one-dimensional Gray HLAC is applied to the time series of each element of the vector. The HLAC features obtained from all elements are then combined and output as a single feature, or the combined feature is subjected to principal component analysis and a dimension-reduced feature is output. For example, when a short-time Fourier transform or a wavelet transform is used as the conversion, the one-dimensional signal is divided into a plurality of frequency bands. Applying the one-dimensional Gray HLAC independently to the transformed values of each band yields HLAC features that take position information on the frequency axis into account. The present invention solves the problem by means of a method for extracting HLAC features from transformed values of a one-dimensional signal that executes the above procedure, and an apparatus for extracting HLAC features from transformed values of a one-dimensional signal that is provided with a control device functioning to execute that procedure.

Specifically:
(1) The method of the present invention for extracting HLAC features from transformed values of a one-dimensional signal comprises converting the one-dimensional signal into a vector time series, applying the one-dimensional Gray HLAC to the time series of each element of the converted vector, and combining the HLAC features obtained from all elements and outputting them as one feature.
(2) In the method described in (1) above, the combined feature is subjected to principal component analysis, and a dimension-reduced feature is output.

(3) In the method described in (1) or (2) above, the one-dimensional Gray HLAC is given by the following Equation 4.
[Equation 4]
Here, f(r) is the input signal, N is the order, a_0, a_1, ..., a_N are the displacement vectors, b_0, b_1, ..., b_N are the exponents, and M is the frame width. When applied to a one-dimensional signal, a_k and b_k are both scalars. Usually, a_0 = 0. (x_N denotes one element of the one-dimensional Gray HLAC feature.)

(4) In the method described in (1) or (2) above, the one-dimensional Gray HLAC is given by the following Equation 5.
[Equation 5]
Here, f(r) is the input signal, N is the order, a_0, a_1, ..., a_N are the displacement vectors, b_0, b_1, ..., b_N are the exponents, and M is the frame width. When applied to a one-dimensional signal, a_k and b_k are both scalars. D is the decimation rate. Usually, a_0 = 0. (x_N denotes one element of the one-dimensional Gray HLAC feature.)

(5) In the method described in (4) above, the decimation rate D is obtained by the following Equation 6.
[Equation 6]
Here, f_s is the sampling frequency of the input signal, and f_a is the bandwidth of the filter used to transform the input signal.
(6) An apparatus for extracting HLAC features from transformed values of a one-dimensional signal comprises a control device that executes the procedure described in (1) above.
(7) In the apparatus described in (6) above, the control device executes the procedures of (2) to (5) above.

The present invention is described in detail below. Unless otherwise noted, "signal" means a one-dimensional signal.
FIG. 1 is a block diagram of a feature extraction apparatus configured according to the procedure of the feature extraction method of the present invention.
The HLAC feature extraction apparatus 1 of the present invention comprises conversion means 2 that converts the one-dimensional input signal into a vector time-series signal, means 3 that apply the one-dimensional Gray HLAC to the time series of each element of the vector time-series signal produced by the conversion means 2, and combining means 4 that combines the HLAC features obtained from all elements by the plurality of one-dimensional Gray HLAC means 3 and outputs them as one feature.
Furthermore, the HLAC feature extraction apparatus 10 of the present invention adds to the apparatus 1 a principal component analysis means 5 that applies principal component analysis to the combined feature and outputs a dimension-reduced feature.

In the feature extraction method according to the present invention, the digital input signal is first converted into a vector time series by the conversion means. Various transforms can be used as the conversion means, such as the wavelet transform, a filter bank, a running fast Fourier transform (FFT), a Mel-frequency log filter bank (MelLogFB), or Mel-frequency cepstral coefficients (MFCC). The conversion means is not limited to these.

The one-dimensional Gray HLAC is computed for the time series of each element of the converted vector.
The one-dimensional Gray HLAC is computed as follows.
[Equation 7]
Here, f(r) is the input signal, N is the order, a_0, a_1, ..., a_N are the displacement vectors, b_0, b_1, ..., b_N are the exponents, and M is the frame width. When applied to a one-dimensional signal, a_k and b_k are both scalars. Usually, a_0 = 0. (x_N denotes one element of the one-dimensional Gray HLAC feature.)
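The equation image for Equation 7 is not reproduced in this text. As a minimal sketch, assuming the common product-sum form x = Σ_m Π_k f(m − a_k)^(b_k) summed over a frame of M samples, with displacements pointing to past samples, one element of the feature could be computed as follows. The function name `gray_hlac_1d` and the `start` argument are hypothetical and used only for illustration.

```python
from typing import Sequence


def gray_hlac_1d(f: Sequence[float], a: Sequence[int], b: Sequence[int],
                 M: int, start: int) -> float:
    """One HLAC element under the ASSUMED form
        x = sum_{m=start}^{start+M-1} prod_k f[m - a[k]] ** b[k]
    (the patent's own equation image is not available here)."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    if start - max(a) < 0 or start + M > len(f):
        raise ValueError("frame and displacements must stay inside the signal")
    total = 0.0
    for m in range(start, start + M):
        term = 1.0
        for a_k, b_k in zip(a, b):
            # Sample a_k steps in the past, raised to the exponent b_k.
            term *= f[m - a_k] ** b_k
        total += term
    return total


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
# Correlation of each sample with its predecessor over a 3-sample frame.
first_order = gray_hlac_1d(signal, a=(0, 1), b=(1, 1), M=3, start=2)
# Squared reference point over the same frame.
squared = gray_hlac_1d(signal, a=(0,), b=(2,), M=3, start=2)
```

With this toy signal, `first_order` sums 3·2 + 4·3 + 5·4 and `squared` sums 3² + 4² + 5².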

Speech signals and the like are usually quantized with 16 bits. When each sample of a signal with such a large dynamic range is raised to a power and the results are multiplied together as in Equation 7, the computation may overflow.
The simplest remedy is to normalize the input signal f(r) so that the maximum of its absolute value is 1.

If the absolute value is limited to 1 or less, raising a sample to a power can never produce a value greater than 1. Samples whose values are small to begin with may then underflow, but where such samples can be assumed to carry no important information, amplitude normalization may be sufficient.
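The following short sketch illustrates the two effects described above. It assumes a 32-bit signed accumulator as the overflow limit (the text does not name a word length for the computation), and the helper name `normalize_peak` is hypothetical.

```python
def normalize_peak(f):
    """Scale the signal so that the largest absolute sample value is 1."""
    peak = max(abs(v) for v in f)
    if peak == 0:
        return [0.0 for _ in f]
    return [v / peak for v in f]


INT32_MAX = 2**31 - 1          # assumed accumulator limit, for illustration
sample = 32767                 # largest positive 16-bit sample
third_order_term = sample ** 3 # one product with total exponent 3
exceeds_int32 = third_order_term > INT32_MAX

g = normalize_peak([32767, -16384, 12, 0])
largest = max(abs(v) for v in g)   # bounded by 1 after normalization
tiny_cubed = abs(g[2]) ** 3        # small samples shrink rapidly when powered
```

After normalization every product of powers is bounded by 1 in magnitude, while a sample that was already small (12 out of 32767 here) becomes very small once cubed, which is the underflow risk the text mentions.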

The methods of Equation 7, or of Equations 8 and 9, may overflow or underflow, and because HLAC coefficients of different orders differ greatly in magnitude, subsequent numerical operations may become unstable. To mitigate these problems, it is effective to compute the one-dimensional Gray HLAC feature with the following Equation 10.
[Equation 10]
The feature obtained by Equation 10 is called the one-dimensional Gray HLAC.

The mask patterns, which determine the relative positions of the samples used to compute the higher-order correlation, are specified by a_0, ..., a_N and b_0, ..., b_N. As an example, FIG. 2 shows the mask patterns up to the second order when the mask width is 4.
In this example there are 15 mask patterns. The leftmost entry of a mask pattern represents the reference point a_0 = 0; the nonzero entries to its right represent, in order, the sample positions a_1, a_2, ..., and their values represent the exponents b_1, b_2, .... For example, the first first-order mask pattern "2000" squares (b_0 = 2) the sample at the reference point (a_0 = 0). The mask pattern "1100" below it computes the correlation between the reference point (a_0 = 0, b_0 = 1) and the sample one step in the past (a_1 = 1, b_1 = 1). Among the second-order mask patterns, "1011" computes the correlation among the reference point (a_0 = 0, b_0 = 1), the sample two steps in the past (a_1 = 2, b_1 = 1), and the sample three steps in the past (a_2 = 3, b_2 = 1).

To obtain the HLAC over a longer time range, the mask width must be widened. However, as the number of samples in the mask width increases, the number of generated mask patterns grows explosively. For example, with a mask width of 80 samples there are 3321 mask patterns up to the second order, so the one-dimensional Gray HLAC feature using all mask patterns has 3321 dimensions.
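The pattern counts quoted above can be checked with a small enumeration. The sketch below assumes a counting rule inferred from the examples in the text rather than stated in it: a pattern is a string of exponents of length equal to the mask width, the reference-point exponent is at least 1, and the exponents sum to at most N + 1 for order N. The function name `mask_patterns` is hypothetical.

```python
from math import comb


def mask_patterns(width, max_order):
    """Yield exponent tuples (one entry per lag) under the assumed rule:
    first entry >= 1 and sum of entries <= max_order + 1."""
    def extend(prefix, remaining):
        if len(prefix) == width:
            yield tuple(prefix)
            return
        lowest = 1 if not prefix else 0   # reference point must be used
        for exponent in range(lowest, remaining + 1):
            yield from extend(prefix + [exponent], remaining - exponent)
    yield from extend([], max_order + 1)


patterns_w4 = list(mask_patterns(4, 2))
count_w4 = len(patterns_w4)                   # mask width 4, up to order 2
count_w80 = sum(1 for _ in mask_patterns(80, 2))
closed_form_w80 = comb(80 + 2, 2)             # same count in closed form
```

Under this rule the enumeration reproduces both figures in the text (15 patterns for width 4 and 3321 for width 80), and the patterns "2000", "1100", and "1011" all appear in `patterns_w4`.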

To approximate a feature with a very large number of dimensions by a lower-dimensional feature, principal component analysis is used, as shown in FIG. 3.
FIG. 3 is a block diagram of the procedure for obtaining the one-dimensional Gray HLAC feature directly from the input signal.
In the feature extraction method using a transform, as shown in FIG. 1, the input signal is transformed, a one-dimensional Gray HLAC feature is computed from each element of the resulting vector time series, the per-element features are combined, and principal component analysis is applied to the combined feature.

Another way to suppress the explosion in dimensionality when computing the HLAC over a longer time range is described below.
When, for example, a wavelet transform or a filter bank is used as the transform in FIG. 1, each element of the vector time series is a band-limited signal. When the time series of an element is limited to a bandwidth of Δf_a, the sampling theorem allows the sampling frequency to be reduced to 2Δf_a.
For example, if the sampling frequency of the input signal is 16 kHz and its effective band (0 to 8 kHz) is divided into four equal parts, the bandwidth Δf_a of the signal output on one channel is 2 kHz. This signal can therefore be sampled at a sampling frequency of 4 kHz.
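The arithmetic of this example can be written out as a short sketch. It follows the statement above that a channel of bandwidth Δf_a can be sampled at 2Δf_a; the helper name `max_decimation_factor` is hypothetical.

```python
def max_decimation_factor(fs_hz, bandwidth_hz):
    """Largest integer D for which fs/D is still at least 2 * bandwidth."""
    return int(fs_hz // (2 * bandwidth_hz))


fs = 16000                     # input sampling frequency in Hz
num_bands = 4                  # equal split of the 0 to fs/2 band
bandwidth = (fs / 2) / num_bands
D = max_decimation_factor(fs, bandwidth)
decimated_rate = fs / D
```

Here each channel has a 2 kHz bandwidth, so every fourth sample can be kept, which gives the 4 kHz rate stated in the text.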

For the same mask width, computing the HLAC from the decimated signal covers a longer time span than computing it from a signal at the sampling frequency of the input signal.
With a decimation rate of D, the one-dimensional Gray HLAC feature is obtained by the following Equation 11.
[Equation 11]
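Equation 11 itself is not reproduced in this text. The sketch below only illustrates the effect of the decimation rate, under the assumption that each displacement a_k is multiplied by D and that the plain product-sum form is used; it is not the stabilized form the text attributes to Equation 10. The function names are hypothetical.

```python
def gray_hlac_1d_decimated(f, a, b, M, start, D):
    """ASSUMED decimated form: x = sum_m prod_k f[m - D * a[k]] ** b[k]."""
    total = 0.0
    for m in range(start, start + M):
        term = 1.0
        for a_k, b_k in zip(a, b):
            # Displacements are counted in decimated samples, i.e. D apart.
            term *= f[m - D * a_k] ** b_k
        total += term
    return total


def mask_span_in_samples(mask_width, D):
    """Number of original samples covered by a mask of the given width."""
    return (mask_width - 1) * D + 1


ramp = [float(v) for v in range(10)]
x = gray_hlac_1d_decimated(ramp, a=(0, 1), b=(1, 1), M=1, start=9, D=3)
span_plain = mask_span_in_samples(4, 1)
span_decimated = mask_span_in_samples(4, 3)
```

With a mask width of 4, a decimation rate of 3 stretches the covered span from 4 to 10 original samples without adding any mask patterns.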

In particular, because the bandwidth Δf_a of the wavelet transform differs depending on the scale a, setting the decimation rate D to
[Equation 12]
makes it possible to compute the one-dimensional Gray HLAC feature while varying the mask width according to the scale. Here, f_s is the sampling frequency of the input signal. This enables more effective feature extraction from signals containing multiscale phenomena, such as speech.

The feature extraction methods of Equations 10 and 11 are not limited to the wavelet transform and can be combined with various other multirate transforms.
To make clear the relationship to the autocorrelation values, which are equivalent to the power spectrum used with the fast Fourier transform (FFT), the input signal f(m) in Equations 7, 9, 10, and 11 is extended to a periodic sequence of period M by setting
[Equation 13]
and g(m) is then used as the input signal.
The apparatus for extracting HLAC features from transformed values of a one-dimensional signal also has a control device that executes the above procedure. The control device has an I/O interface, a memory, a CPU, and the like, and can be implemented with a microcomputer, a personal computer, or the like.

As prior research applying HLAC to a one-dimensional signal, an example has been reported in which two-dimensional Gray HLAC features are computed from the time-frequency representation of a speech signal and used for phoneme recognition. In speech recognition, the shape of the distribution of short-time frequency components plays an important role. Because the prior research computes two-dimensional Gray HLAC features from the time-frequency representation of speech, the HLAC features are shift-invariant along the frequency axis as well as the time axis. As a result, speech signals whose frequency components are distributed with the same shape in different bands cannot be distinguished.

In the feature extraction method according to the present invention, as shown in FIG. 1, the signal is converted into a vector time series and a one-dimensional Gray HLAC feature is extracted for each element of the vector. For example, converting a one-dimensional signal into a vector time series with a wavelet transform or a filter bank makes it possible to obtain HLAC features that preserve position information on the frequency axis, that is, which kind of signal is contained in which frequency band. This property is important when analyzing or recognizing speech and other acoustic signals.
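A toy example can make this property concrete. The sketch below assumes the plain product-sum HLAC and uses two hand-made "channels" in place of a real filter bank; the names `hlac` and `combined_features` are hypothetical.

```python
def hlac(f, pattern):
    """Sum over the signal of products of powered past samples.
    pattern[k] is the exponent applied to the sample k steps in the past."""
    width = len(pattern)
    total = 0.0
    for m in range(width - 1, len(f)):
        term = 1.0
        for lag, power in enumerate(pattern):
            if power:
                term *= f[m - lag] ** power
        total += term
    return total


def combined_features(channels, patterns):
    """Apply every pattern to every channel and concatenate channel by channel."""
    return [hlac(ch, p) for ch in channels for p in patterns]


patterns = [(1, 0), (2, 0), (1, 1)]
burst = [0.0, 1.0, 2.0, 1.0, 0.0]
silent = [0.0] * 5

low_band = combined_features([burst, silent], patterns)   # burst in channel 0
high_band = combined_features([silent, burst], patterns)  # same burst in channel 1
```

The same burst yields the same three values in either channel, but because the per-channel features are concatenated rather than pooled, the two combined vectors differ and the band in which the burst occurred remains identifiable.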

The one-dimensional Gray HLAC feature obtained by Equation 10 or Equation 11 avoids problems such as overflow, underflow, and large differences in magnitude between HLAC coefficients of different orders, so the HLAC feature can be computed stably.

As shown by Equations 11 and 12, computing the one-dimensional Gray HLAC from a decimated signal makes it possible to compute the HLAC over a wide time range without widening the mask width. Because the mask width need not be widened, the dimensionality of the feature can be kept low, which has advantages such as making dimension reduction by principal component analysis or the like unnecessary.

In particular, combining the wavelet transform with the one-dimensional Gray HLAC computed from decimated signals as in Equations 11 and 12 enables more effective feature extraction from signals containing multiscale phenomena, such as speech.
In addition, an HLAC feature extraction apparatus provided with a control device that functions to execute the above procedure can perform the necessary computation in a short time.

Embodiments of the present invention are described in detail with reference to the drawings.

Below, an acoustic signal recognition experiment using the features of the present invention is described to demonstrate their effectiveness.
The acoustic signals used in the experiment are the impact sounds produced when a single impact is applied to 36 kinds of objects made of materials such as wood, metal, plastic, and ceramic.
The acoustic signals were digitized at a sampling frequency of 16 kHz with 16-bit quantization.

The following six features were used for recognition.
1. 1-D Gray HLAC + PCA
2. FFT + 1-D Gray HLAC + PCA
3. MelLogFB + 1-D Gray HLAC + PCA
4. MFCC + 1-D Gray HLAC + PCA
5. Wavelet transform + 1-D Gray HLAC + PCA
6. MFCC + Δ + ΔΔ (conventional feature)

Features 1 to 5 above are features of the present invention, and feature 6 is widely used in conventional speech recognition. For each material, 100 impact-sound samples were prepared. Half of them (50 samples) were used for the principal component analysis (PCA) of the features and for training the hidden Markov models (HMMs) used for recognition. Each HMM is a three-state left-to-right model with 32 distributions per state. The recognition experiment used the remaining 50 samples, so it is an open test.

Details of each feature are given below.
1. 1-D Gray HLAC + PCA
The one-dimensional Gray HLAC is computed directly from the input signal. Analysis frames are cut out with a frame width of M = 80 samples (5 ms) and a frame period of 16 samples (1 ms), and each frame is then extended to a periodic sequence using Equation 13. One-dimensional Gray HLAC coefficients up to order 2 are computed with Equation 10 using a mask width of 80 samples. The resulting feature vector has 3321 dimensions; principal component analysis reduces it to 39 dimensions.
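The framing and periodic extension used for this feature can be sketched as follows. Equation 13 is not reproduced in this text, so the sketch assumes the usual periodic extension g(m) = f(m mod M); the helper names `cut_frames` and `periodic_extension` are hypothetical.

```python
def cut_frames(signal, width, hop):
    """Cut overlapping analysis frames of `width` samples every `hop` samples."""
    return [signal[s:s + width]
            for s in range(0, len(signal) - width + 1, hop)]


def periodic_extension(frame):
    """Return g with g(m) = frame[m mod M] (assumed form of Equation 13)."""
    M = len(frame)

    def g(m):
        return frame[m % M]
    return g


fs = 16000
width, hop = 80, 16
frame_ms = 1000 * width / fs   # frame width in milliseconds
hop_ms = 1000 * hop / fs       # frame period in milliseconds

frames = cut_frames(list(range(160)), width, hop)
g = periodic_extension(frames[0])
```

With the periodic extension, a displacement that reaches before the start of a frame wraps around to its end, so a mask as wide as the frame (80 samples here) can be evaluated at every position in the frame.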

2. FFT + 1-D Gray HLAC + PCA

The FFT analysis frame width is 80 samples (5 ms) and the FFT analysis frame period is 16 samples (1 ms). After a window is applied, each frame is zero-padded and a 512-sample FFT is computed. The amplitudes of the 256 frequency bins within the effective band are obtained. From the amplitude time series of each frequency bin, the one-dimensional Gray HLAC is computed with Equation 11. The mask width is 4 samples, the highest order is N = 2, the frame width is M = 1, the frame period is 1 sample, and the decimation rate is D = 3. The resulting HLAC feature vector has 3840 dimensions, which principal component analysis reduces to 13 dimensions.

3. MelLogFB + 1-D Gray HLAC + PCA

The amplitudes of the 256 FFT frequency bins above are weighted and summed with 23 triangular weights whose center frequencies are equally spaced on the Mel-frequency axis. Taking the logarithm of each sum gives a 23-dimensional MelLogFB feature. For the time series of each dimension of this MelLogFB feature, the one-dimensional Gray HLAC feature is computed with Equation 11. The mask width is 4 samples, the highest order is N = 2, the frame width is M = 1, the frame period is 1 sample, and the decimation rate is D = 3. The resulting HLAC feature vector has 345 dimensions, which principal component analysis reduces to 39 dimensions.

4. MFCC + 1-D Gray HLAC + PCA

The 23-dimensional MelLogFB feature above is subjected to a discrete cosine transform (DCT) to compute a 13-dimensional MFCC. For the time series of each dimension of this MFCC feature, the one-dimensional Gray HLAC feature is computed with Equation 11. The mask width is 4 samples, the highest order is N = 2, the frame width is M = 1, the frame period is 1 sample, and the decimation rate is D = 3. The resulting HLAC feature vector has 195 dimensions, which principal component analysis reduces to 39 dimensions.
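The dimensionalities quoted for the FFT, MelLogFB, and MFCC features are consistent with one HLAC element per mask pattern per channel. The sketch below assumes the pattern-counting rule that reproduces the 15 patterns quoted for mask width 4 and order 2 (closed form C(W + N, N), an inference rather than a formula from the text); the variable names are hypothetical.

```python
from math import comb

mask_width = 4
max_order = 2
# Assumed closed form for the number of mask patterns per channel.
patterns_per_channel = comb(mask_width + max_order, max_order)

channels = {"FFT": 256, "MelLogFB": 23, "MFCC": 13}
feature_dims = {name: n * patterns_per_channel for name, n in channels.items()}
```

Multiplying the channel count of each transform by the per-channel pattern count gives the combined feature dimension before PCA.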

5. Wavelet transform + 1-D Gray HLAC + PCA

The input signal is wavelet-transformed using the real part of the Gabor wavelet. The shift parameter of the wavelet is set to match the sampling frequency of the input signal, and the ten channels with the scale parameters
[Equation]
were used.

For the wavelet coefficient time series of the ten channels, the one-dimensional Gray HLAC feature is computed with Equation 11. The mask width is 4 samples, the highest order is N = 2, the frame width is M = 80, and the frame period is 16. The resulting HLAC feature vector has 195 dimensions, which principal component analysis reduces to 39 dimensions.

The center frequency corresponding to each wavelet scale used in this experiment is
[Equation]
and the number of samples corresponding to a half period of this center frequency,
[Equation]
was used as the decimation rate.

Because the bandwidth of the wavelet used in this experiment is smaller than its center frequency, the condition of Equation 12 is satisfied.
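The equations for the center frequency and the decimation rate are not reproduced in this text, so the following is only a worked reading of the wording, under two assumptions: f_c(a) denotes the center frequency at scale a (a symbol introduced here), and Equation 12 is the sampling-theorem bound D ≤ f_s / (2Δf_a). A half period of f_c(a) lasts 1/(2 f_c(a)) seconds, which at sampling frequency f_s corresponds to f_s/(2 f_c(a)) samples.

```latex
D_a = \frac{f_s}{2 f_c(a)}, \qquad
\Delta f_a < f_c(a) \;\Longrightarrow\;
D_a = \frac{f_s}{2 f_c(a)} < \frac{f_s}{2\,\Delta f_a}.
```

Under these assumptions, a bandwidth smaller than the center frequency is exactly what keeps the chosen decimation rate within the bound.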

6. MFCC + Δ + ΔΔ (conventional feature)

This feature has been widely used for speech recognition and is computed for comparison. It is a 39-dimensional feature formed by adding first- and second-order derivative coefficients to the 13-dimensional MFCC computed for feature 4 (MFCC + 1-D Gray HLAC + PCA).

FIG. 4 shows experimental results obtained with the feature extraction method of the present invention.
In the experiment using 6. MFCC + Δ + ΔΔ, which has been widely used in speech recognition, the recognition rate was 98.2%. In contrast, among the proposed features, 3. MelLogFB + 1D Gray HLAC + PCA and 4. MFCC + 1D Gray HLAC + PCA achieved recognition rates of 99.19% and 98.37%, respectively, both better than the conventional method.
The device for extracting HLAC features from converted values of a one-dimensional signal has a control device that executes the above procedure. The control device includes an I/O interface, a memory, a CPU, and the like, and can be implemented with a microcomputer, a personal computer, or the like.

A block diagram of a feature extraction apparatus configured according to the procedure of the feature extraction method of the present invention.
Mask patterns up to the second order for a mask width of 4 in the feature extraction method of the present invention.
A block diagram of the procedure for obtaining one-dimensional Gray HLAC features directly from the input signal.
Experimental results obtained with the feature extraction method of the present invention.

Explanation of symbols

1, 10 ... HLAC feature extraction device
2 ... conversion means
3 ... one-dimensional Gray HLAC means
4 ... combining means
5 ... principal component analysis means

Claims (6)

1. An HLAC feature extraction method for obtaining one HLAC feature from converted values of a one-dimensional signal, comprising: step 1 of dividing a one-dimensional signal into a plurality of frequency bands and converting it into a vector time-series signal for each band; step 2 of applying one-dimensional Gray HLAC to the time series of each element of the vector time-series signal for each frequency band converted in step 1; step 3 of combining the HLAC features obtained from all the elements in step 2, while retaining the values of those HLAC features, and outputting them as one feature; and step 4 of performing principal component analysis on the combined feature output in step 3 and outputting a dimension-reduced feature.

2. The HLAC feature extraction method for obtaining one HLAC feature from converted values of a one-dimensional signal according to claim 1, wherein the one-dimensional Gray HLAC is given by Equation 1 below.
[Equation 1: not reproduced in this text]
Here, f(r) is the input signal, N is the order, a_0, a_1, ..., a_N are displacement vectors, b_0, b_1, ..., b_N are exponents, and M is the frame width. When applied to a one-dimensional signal, a_k and b_k are both scalars. Usually, a_0 = 0. (x_N denotes one element of the one-dimensional Gray HLAC feature.)

3. The HLAC feature extraction method for obtaining one HLAC feature from converted values of a one-dimensional signal according to claim 1, wherein the one-dimensional Gray HLAC is given by Equation 2 below.
[Equation 2: not reproduced in this text]
Here, f(r) is the input signal, N is the order, a_0, a_1, ..., a_N are displacement vectors, b_0, b_1, ..., b_N are exponents, and M is the frame width. When applied to a one-dimensional signal, a_k and b_k are both scalars. The decimation rate is D. Usually, a_0 = 0. (x_N denotes one element of the one-dimensional Gray HLAC feature.)

4. The HLAC feature extraction method for obtaining one HLAC feature from converted values of a one-dimensional signal according to claim 3, wherein the decimation rate D is obtained by Equation 3 below.
[Equation 3: not reproduced in this text]
Here, fs is the sampling frequency of the input signal, and fa is the bandwidth of the filter used to convert the input signal.

5. An HLAC feature extraction device for obtaining one HLAC feature from converted values of a one-dimensional signal, comprising: conversion means 2 for dividing a one-dimensional signal into a plurality of frequency bands and converting it into a vector time-series signal for each band; a plurality of one-dimensional Gray HLAC means 3 for applying one-dimensional Gray HLAC to the time series of each element of the vector time-series signal for each frequency band converted by the conversion means 2; combining means 4 for combining the HLAC features obtained from all the elements by the plurality of one-dimensional Gray HLAC means 3, while retaining the values of those HLAC features, and outputting them as one feature; and principal component analysis means 5 for performing principal component analysis on the combined feature output by the combining means 4 and outputting a dimension-reduced feature.

6. The HLAC feature extraction device for obtaining one HLAC feature from converted values of a one-dimensional signal according to claim 5, comprising the plurality of one-dimensional Gray HLAC means 3 that execute the procedure according to any one of claims 2 to 4.
JP2007020169A 2007-01-30 2007-01-30 Method and apparatus for extracting HLAC feature from conversion value of one-dimensional signal Expired - Fee Related JP4905962B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2007020169A JP4905962B2 (en) 2007-01-30 2007-01-30 Method and apparatus for extracting HLAC feature from conversion value of one-dimensional signal

Publications (2)

Publication Number Publication Date
JP2008185845A JP2008185845A (en) 2008-08-14
JP4905962B2 true JP4905962B2 (en) 2012-03-28

Family

ID=39728925

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2007020169A Expired - Fee Related JP4905962B2 (en) 2007-01-30 2007-01-30 Method and apparatus for extracting HLAC feature from conversion value of one-dimensional signal

Country Status (1)

Country Link
JP (1) JP4905962B2 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4840819B2 (en) * 2007-04-09 2011-12-21 独立行政法人産業技術総合研究所 HLAC feature quantity extraction method and feature quantity extraction device by binarization of one-dimensional signal
JP5131863B2 (en) * 2009-10-30 2013-01-30 独立行政法人産業技術総合研究所 HLAC feature extraction method, abnormality detection method and apparatus
JP4754651B2 (en) * 2009-12-22 2011-08-24 アレクセイ・ビノグラドフ Signal detection method, signal detection apparatus, and signal detection program
JP5598815B2 (en) * 2010-05-24 2014-10-01 独立行政法人産業技術総合研究所 Signal feature extraction apparatus and signal feature extraction method
JP5644934B2 (en) * 2013-12-09 2014-12-24 独立行政法人産業技術総合研究所 Signal feature extraction apparatus and signal feature extraction method
CN105632505B (en) * 2014-11-28 2019-12-20 北京天籁传音数字技术有限公司 Encoding and decoding method and device for Principal Component Analysis (PCA) mapping model

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5051746B2 (en) * 2006-11-01 2012-10-17 独立行政法人産業技術総合研究所 Feature extraction apparatus and method, and program

Also Published As

Publication number Publication date
JP2008185845A (en) 2008-08-14

Legal Events

Date Code Title Description
A621; Written request for application examination; Free format text: JAPANESE INTERMEDIATE CODE: A621; Effective date: 20090227
A977; Report on retrieval; Free format text: JAPANESE INTERMEDIATE CODE: A971007; Effective date: 20110107
A131; Notification of reasons for refusal; Free format text: JAPANESE INTERMEDIATE CODE: A131; Effective date: 20110208
RD02; Notification of acceptance of power of attorney; Free format text: JAPANESE INTERMEDIATE CODE: A7422; Effective date: 20110215
A521; Request for written amendment filed; Free format text: JAPANESE INTERMEDIATE CODE: A821; Effective date: 20110216
A521; Request for written amendment filed; Free format text: JAPANESE INTERMEDIATE CODE: A523; Effective date: 20110401
TRDD; Decision of grant or rejection written
A01; Written decision to grant a patent or to grant a registration (utility model); Free format text: JAPANESE INTERMEDIATE CODE: A01; Effective date: 20111220
A01; Written decision to grant a patent or to grant a registration (utility model); Free format text: JAPANESE INTERMEDIATE CODE: A01
A61; First payment of annual fees (during grant procedure); Free format text: JAPANESE INTERMEDIATE CODE: A61; Effective date: 20120105
FPAY; Renewal fee payment (event date is renewal date of database); Free format text: PAYMENT UNTIL: 20150120; Year of fee payment: 3
R150; Certificate of patent or registration of utility model; Ref document number: 4905962; Country of ref document: JP; Free format text: JAPANESE INTERMEDIATE CODE: R150
R250; Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
S533; Written request for registration of change of name; Free format text: JAPANESE INTERMEDIATE CODE: R313533
R350; Written notification of registration of transfer; Free format text: JAPANESE INTERMEDIATE CODE: R350
R250; Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
R250; Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
R250; Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
R250; Receipt of annual fees; Free format text: JAPANESE INTERMEDIATE CODE: R250
LAPS; Cancellation because of no payment of annual fees