CN103956165A - Method for improving audio classification accuracy through mixed component clustering Fisher scoring algorithm

Info

Publication number: CN103956165A
Application number: CN201410194236.3A
Authority: CN
Inventors: 王荣燕; 李海军
Original assignee: Dezhou University
Current assignee: Dezhou University
Priority date: 2014-05-09
Filing date: 2014-05-09
Publication date: 2014-07-30

Abstract

The invention discloses a method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm. The method comprises: combining the GMMs of all classes; merging Gaussian components into a single Gaussian; forming a CGMM; performing the Fisher transform on the CGMM; and computing Fisher scores to obtain equal-length features. The method combines the advantages of generative and discriminative models: it describes the discriminative features between classes while still distinguishing details well, and it achieves high classification accuracy even when the segments from which features are extracted are short. With this method, the classification accuracy over six sound classes can reach 77 percent.

Description

Method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm

Technical field

The invention belongs to the field of Fisher score applications, and in particular relates to a method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm.

Background art

The Fisher score is derived from a generative model: it attempts to extract more information from a single generative model than its output probability alone. The purpose of the Fisher score transform is to analyze how the score depends on the model, and which parts of the model matter in determining that score, thereby obtaining information about the internal representation of the data. From this viewpoint, it is natural to compare two data points along the directions in which the model parameters would be adjusted: the score functions of the two points are regarded as functions of the parameters, and their gradients are compared. If the two gradients are similar, the two data points adapt the model in the same way. In other words, given the current parameter setting, the points are similar because they require similar modifications of the parameters. This is the idea behind the Fisher score.

The traditional transform takes the mean and variance of the per-frame features within an audio segment as the new equal-length feature of that segment, and this has given good results in some classification problems. However, the existing method tends to overlook discriminative details of some classes, its results degrade when the audio segments are short, and its processing efficiency is low.
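For reference, the Fisher score described in this background is usually written as the gradient of the log-likelihood with respect to the model parameters. The notation below is assumed here for illustration and does not appear in the original text: X is a segment of T frames x_t, and θ denotes the parameters of the generative model. The second equality additionally assumes that the frames are treated as independent given θ.

```latex
U_X \;=\; \nabla_{\theta} \log p(X \mid \theta)
    \;=\; \sum_{t=1}^{T} \nabla_{\theta} \log p(x_t \mid \theta)
```

The length of U_X equals the number of parameters in θ and does not depend on T, which is why segments of different duration map to features of equal length.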

It is therefore necessary to devise a method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm.

Summary of the invention

The object of the present invention is to provide a method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm, intended to solve the problems that existing methods tend to overlook discriminative details of some classes, give worse results when the audio segments are short, and have low processing efficiency.

Essential technical solution of the method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm:

The invention is implemented as a method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm, comprising:

Step 1: combine the GMMs of all classes;

Step 2: merge Gaussian components into a single Gaussian;

Step 3: form the CGMM;

Step 4: perform the Fisher transform on the CGMM;

Step 5: compute Fisher scores to obtain equal-length features.

Secondary technical solutions of the method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm:

Further, in step 1, a GMM is trained for each class and the GMMs are combined: the Gaussian components of every model are arranged in sequence and assembled into a new model, and the weights of the components are reassigned so that they sum to 1. The resulting model is the UGMM, whose number of mixture components is the sum of the numbers of mixture components of all classes. The Gaussian components of the UGMM are then clustered;
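The combination in step 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the text does not say how the weights are reassigned, so the sketch simply divides every weight by the overall sum (which treats all classes as equally likely), and it represents each component as a hypothetical `(weight, mean, variance)` tuple with diagonal covariance. The function name `build_ugmm` and the example values are invented for this sketch.

```python
def build_ugmm(class_gmms):
    """Concatenate the components of several per-class GMMs into one UGMM.

    Each GMM is a list of (weight, mean, variance) tuples, where mean and
    variance are lists of floats (diagonal covariance is an assumption here).
    """
    # Arrange the components of every class model in sequence.
    components = [comp for gmm in class_gmms for comp in gmm]
    # Reassign the weights so that they sum to 1 (assumed: plain renormalization).
    total = sum(weight for weight, _, _ in components)
    return [(weight / total, mean, var) for weight, mean, var in components]


# Two toy class models with 2 and 1 components respectively.
cow = [(0.6, [0.0, 1.0], [1.0, 1.0]), (0.4, [2.0, 0.5], [0.5, 2.0])]
bell = [(1.0, [5.0, 5.0], [1.0, 1.0])]
ugmm = build_ugmm([cow, bell])
print(len(ugmm), sum(weight for weight, _, _ in ugmm))
```

The UGMM has as many components as the class models have in total, matching the description above.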

Further, in step 2, highly similar Gaussian components are grouped into one cluster, and the Gaussian components within a cluster are merged into a single Gaussian that serves as the representative component of that cluster;
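The text does not state how the components of a cluster are merged into one Gaussian. One common choice, used here purely as an assumption, is moment matching: the merged Gaussian keeps the total weight, the weighted mean, and the variance of the weighted mixture. The sketch again assumes diagonal covariances and the hypothetical `(weight, mean, variance)` tuple layout; `merge_gaussians` is an illustrative name.

```python
def merge_gaussians(cluster):
    """Merge the Gaussians of one cluster into a single representative.

    Moment matching (an assumed merge rule): the result has the same total
    weight, mean and per-dimension variance as the weighted mixture.
    """
    total_w = sum(w for w, _, _ in cluster)
    dim = len(cluster[0][1])
    mean = [sum(w * m[d] for w, m, _ in cluster) / total_w for d in range(dim)]
    # E[x^2] of the mixture minus the squared merged mean, per dimension.
    var = [
        sum(w * (v[d] + m[d] ** 2) for w, m, v in cluster) / total_w - mean[d] ** 2
        for d in range(dim)
    ]
    return total_w, mean, var


# Two equally weighted 1-D Gaussians centred at 0 and 2 with unit variance.
weight, mean, var = merge_gaussians([(0.5, [0.0], [1.0]), (0.5, [2.0], [1.0])])
print(weight, mean, var)
```

The merged variance is larger than either input variance because it also accounts for the spread between the two means.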

Further, in step 3, the representative Gaussians of all clusters are concatenated to form a new GMM; this model is the CGMM;

Further, in step 4, the features obtained by performing the Fisher transform on the CGMM not only have reduced dimensionality but also have part of the redundant information removed, so they better express the differences between classes. The similarity between Gaussian components depends on the distance measure used. To better measure this similarity, several distance measures commonly used in machine learning are adopted, including the Euclidean distance, the Mahalanobis distance, the Bhattacharyya distance and the K_L2 distance. Because distances between Gaussian mixture models need not be computed, the class divergence distance is not used;
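Two of the listed measures can be sketched as follows. The text gives no formulas, so these are the standard textbook forms, specialized to diagonal covariances as an assumption: the Euclidean distance is taken between the component means, and the Bhattacharyya distance is the usual closed form for two Gaussians. The function names are illustrative only.

```python
import math


def euclidean_distance(mean_a, mean_b):
    """Euclidean distance between two component means."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mean_a, mean_b)))


def bhattacharyya_distance(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two diagonal-covariance Gaussians."""
    dist = 0.0
    for ma, va, mb, vb in zip(mean_a, var_a, mean_b, var_b):
        v = 0.5 * (va + vb)                              # averaged variance
        dist += 0.125 * (ma - mb) ** 2 / v               # mean-separation term
        dist += 0.5 * math.log(v / math.sqrt(va * vb))   # variance-mismatch term
    return dist


print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
print(bhattacharyya_distance([0.0], [1.0], [2.0], [1.0]))
```

Unlike the Euclidean distance between means, the Bhattacharyya distance also grows when two components share a mean but differ in variance, which may matter when deciding whether two components are "highly similar".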

Further, in step 5, Fisher scores are computed for all samples based on the CGMM, yielding new equal-length features. The equal-length features are then used to train a multi-class support vector machine classifier.
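A minimal sketch of how a Fisher score turns a variable-length segment into an equal-length feature is given below. Several choices are assumptions of the sketch and not statements of the patented procedure: only the gradient with respect to the component means is used, covariances are diagonal, no Fisher-information normalization is applied, and the gradient is averaged over frames. The names `log_gauss` and `fisher_score_means` are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def fisher_score_means(frames, gmm):
    """Frame-averaged gradient of the log-likelihood w.r.t. the GMM means.

    gmm is a list of (weight, mean, variance) tuples; the result always has
    len(gmm) * dim entries, whatever the number of frames.
    """
    dim = len(gmm[0][1])
    score = [[0.0] * dim for _ in gmm]
    for x in frames:
        logs = [math.log(w) + log_gauss(x, m, v) for w, m, v in gmm]
        top = max(logs)                       # subtract max for numerical stability
        post = [math.exp(l - top) for l in logs]
        norm = sum(post)
        for k, (_, m, v) in enumerate(gmm):
            gamma = post[k] / norm            # posterior of component k for frame x
            for d in range(dim):
                score[k][d] += gamma * (x[d] - m[d]) / v[d]
    return [s / len(frames) for row in score for s in row]


cgmm = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [3.0, 3.0], [1.0, 1.0])]
short_segment = [[0.1, 0.2]] * 3
long_segment = [[2.9, 3.1]] * 10
print(len(fisher_score_means(short_segment, cgmm)),
      len(fisher_score_means(long_segment, cgmm)))
```

Both segments yield a vector of the same length, which is what allows a standard multi-class SVM to be trained on them.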

The method provided by the invention combines the GMMs of all classes, merges Gaussian components into a single Gaussian, forms the CGMM, performs the Fisher transform on the CGMM, and computes Fisher scores to obtain equal-length features. As a result, the method captures the discriminative details of the classes well, its results do not deteriorate when the audio segments are short, and processing efficiency is improved. The invention combines the advantages of generative and discriminative models: it describes the discriminative features between classes while still distinguishing details well, and in particular it still achieves high classification accuracy when the segments from which features are extracted are short.

The data set used in the experiments of the invention consists of about 1500 files downloaded from the audio retrieval website (http://www.findsounds.com) launched by the U.S. company Comparisonics and from other websites. It covers six semantic classes: cow mooing, bells, dog barking, horse neighing, frog croaking and laughter, all of which are non-speech sounds. Each file is an isolated audio segment, with lengths ranging from less than 1 s to 1 min. The downloaded audio formats include wav, mp3, au, aif and aiff; all files were converted to a uniform wav format with 8 kHz sampling rate, 16 bit, single channel, PCM encoding.

Experimental results show that an SVM classifier taking the statistical features of all frames within a segment as input achieves an average classification accuracy of 66.29% over the six sound classes. The mixed component clustering scoring method proposed in this application achieves an average accuracy of at least 77.11%, an improvement of 10.82 percentage points over that SVM classifier. The CGMM-SVM algorithm proposed in this application reaches an average accuracy of up to 82.04%, an improvement of 15.75 percentage points over that SVM classifier.

Brief description of the drawings

Fig. 1 is a flowchart of the method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm provided by an embodiment of the invention.

Detailed description of the embodiments

In order to make the objects, technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to embodiments. It should be understood that the specific embodiments described here only serve to explain the invention and are not intended to limit it.

The principle of application of the invention is further described below with reference to the drawing and specific embodiments.

As shown in Fig. 1, the invention is implemented as a method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm, comprising:

S101: combine the GMMs of all classes;

S102: merge Gaussian components into a single Gaussian;

S103: form the CGMM;

S104: perform the Fisher transform on the CGMM;

S105: compute Fisher scores to obtain equal-length features.

Further, in S101, a GMM is trained for each class and the GMMs are combined: the Gaussian components of every model are arranged in sequence and assembled into a new model, and the weights of the components are reassigned so that they sum to 1. The resulting model is the UGMM, whose number of mixture components is the sum of the numbers of mixture components of all classes. The Gaussian components of the UGMM are then clustered;

Further, in S102, highly similar Gaussian components are grouped into one cluster, and the Gaussian components within a cluster are merged into a single Gaussian that serves as the representative component of that cluster;
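The grouping of similar components in S102 can be illustrated with a very simple clustering scheme. The text does not name a clustering algorithm, so the sketch below substitutes plain leader clustering on the Euclidean distance between component means: each component joins the first cluster whose first member lies within a threshold, and otherwise starts a new cluster. The threshold value, the `(weight, mean, variance)` tuple layout and the name `leader_cluster` are all assumptions made for illustration.

```python
import math


def leader_cluster(components, threshold):
    """Group Gaussian components whose means lie close to a cluster leader.

    components: list of (weight, mean, variance) tuples.
    Returns a list of clusters, each a list of components.
    """
    clusters = []
    for comp in components:
        for cluster in clusters:
            leader_mean = cluster[0][1]       # first member acts as the leader
            if math.dist(comp[1], leader_mean) <= threshold:
                cluster.append(comp)
                break
        else:
            # No existing leader is close enough: start a new cluster.
            clusters.append([comp])
    return clusters


ugmm = [
    (0.3, [0.0, 0.0], [1.0, 1.0]),
    (0.3, [0.1, 0.0], [1.0, 1.0]),
    (0.4, [5.0, 5.0], [1.0, 1.0]),
]
clusters = leader_cluster(ugmm, threshold=1.0)
print([len(c) for c in clusters])
```

Any of the other distance measures mentioned in the description could be swapped in for `math.dist`; the number of resulting clusters fixes the number of components of the CGMM.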

Further, in S103, the representative Gaussians of all clusters are concatenated to form a new GMM; this model is the CGMM;

Further, in S104, the features obtained by performing the Fisher transform on the CGMM not only have reduced dimensionality but also have part of the redundant information removed, so they better express the differences between classes. The similarity between Gaussian components depends on the distance measure used. To better measure this similarity, several distance measures commonly used in machine learning are adopted, including the Euclidean distance, the Mahalanobis distance, the Bhattacharyya distance and the K_L2 distance. Because distances between Gaussian mixture models need not be computed, the class divergence distance is not used;

Further, in S105, Fisher scores are computed for all samples based on the CGMM, yielding new equal-length features. The equal-length features are then used to train a multi-class support vector machine classifier.

The method of the invention combines the GMMs of all classes, merges Gaussian components into a single Gaussian, forms the CGMM, performs the Fisher transform on the CGMM, and computes Fisher scores to obtain equal-length features. As a result, the method captures the discriminative details of the classes well, its results do not deteriorate when the audio segments are short, and processing efficiency is improved.

The foregoing describes only preferred embodiments of the invention and is not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. A method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm, characterized in that the method comprises:

Step 1: combining the GMMs of all classes;

Step 2: merging Gaussian components into a single Gaussian;

Step 3: forming the CGMM;

Step 4: performing the Fisher transform on the CGMM;

Step 5: computing Fisher scores to obtain equal-length features.

2. The method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm as claimed in claim 1, characterized in that, in step 1, a GMM is trained for each class and the GMMs are combined; the Gaussian components of every model are arranged in sequence and assembled into a new model, and the weights of the components are reassigned so that they sum to 1; the resulting model is the UGMM, whose number of mixture components is the sum of the numbers of mixture components of all classes; and the Gaussian components of the UGMM are clustered.

3. The method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm as claimed in claim 1, characterized in that, in step 2, highly similar Gaussian components are grouped into one cluster, and the Gaussian components within a cluster are merged into a single Gaussian that serves as the representative component of that cluster.

4. The method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm as claimed in claim 1, characterized in that, in step 3, the representative Gaussians of all clusters are concatenated to form a new GMM, and this model is the CGMM.

5. The method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm as claimed in claim 1, characterized in that, in step 4, the features obtained by performing the Fisher transform on the CGMM not only have reduced dimensionality but also have part of the redundant information removed, so that they better express the differences between classes; the similarity between Gaussian components depends on the distance measure used; and, to measure the similarity between Gaussian components, distance measures commonly used in machine learning are adopted, including the Euclidean distance, the Mahalanobis distance, the Bhattacharyya distance and the K_L2 distance.

6. The method for improving audio classification accuracy using a mixed component clustering Fisher scoring algorithm as claimed in claim 1, characterized in that, in step 5, Fisher scores are computed for all samples based on the CGMM to obtain new equal-length features, and the equal-length features are used to train a multi-class support vector machine classifier.