CN109408660A

CN109408660A - Method for automatic music classification based on audio features

Info

Publication number: CN109408660A
Application number: CN201811012539.3A
Authority: CN
Inventors: 熊飞; 张海荣; 郑雅玲; 产文涛
Original assignee: Anhui Sun Create Electronic Co Ltd
Current assignee: Anhui Sun Create Electronic Co Ltd
Priority date: 2018-08-31
Filing date: 2018-08-31
Publication date: 2019-03-01
Anticipated expiration: 2038-08-31
Also published as: CN109408660B

Abstract

The invention discloses a method for automatic music classification based on audio features, comprising: obtaining music of different categories to build a music training set and a music test set; performing audio feature extraction on the music in the training set and on the music in the test set, respectively, to obtain a music feature set for the training set and a music feature set for the test set; training a neural network with the training-set feature set as training samples to obtain a trained neural network; and testing the trained neural network with the test-set feature set as test samples and adjusting the neural network parameters. Starting from the audio features of music, the invention selects audio features that better express musical characteristics, which addresses the problem of relying on a single kind of audio feature, provides an effective data basis for automatic classification, and improves the accuracy of automatic music classification; neural network training is then carried out on these audio features.

Description

Method for automatic music classification based on audio features

Technical field

The present invention relates to the field of computer technology, and in particular to a method for automatic music classification based on audio features.

Background art

With the rapid development of multimedia and network technology, a large amount of music data circulates on the Internet, and the music data held in multimedia databases is growing explosively. The value of large-scale music data, however, depends closely on whether users can browse its content effectively. There is therefore an urgent need for effective methods of automatically managing and retrieving music data, and in recent years automatic music classification and retrieval have become active research topics.

Automatic music classification is a pattern-recognition process comprising two basic stages: audio feature extraction and automatic classification. The audio features extracted in the prior art are relatively limited: some approaches focus on timbre and others on rhythm, so they do not express the characteristics of music comprehensively. When faced with large-scale music data, the subsequent automatic classification therefore cannot classify music finely and accurately.

Summary of the invention

To overcome the above defects of the prior art, the present invention provides a method for automatic music classification based on audio features. Starting from the audio features of music, it selects audio features that better express musical characteristics, which addresses the problem of relying on a single kind of audio feature, provides an effective data basis for automatic classification, and improves the accuracy of automatic music classification; neural network training is then carried out on these audio features.

To achieve the above object, the present invention adopts the following technical solution:

A method for automatic music classification based on audio features, comprising the following steps:

S1: obtaining music of different categories to build a music collection, the music collection comprising a music training set used for training and a music test set used for testing, each of which contains music of the different categories;

S2: performing audio feature extraction on the music in the music training set and on the music in the music test set, respectively; obtaining the audio features of the music in the training set and forming the music feature set of the training set from them; and obtaining the audio features of the music in the test set and forming the music feature set of the test set from them;

the audio features comprise: Mel-frequency cepstral coefficients (MFCC), music energy, and music rhythm;

S3: using the music feature set of the training set as training samples to train a neural network, obtaining a trained neural network; and using the music feature set of the test set as test samples to test the trained neural network.

In step S1, the categories of the music include: pop, R&B, jazz, electronic, folk, rock, rap, and country.

In step S2, MATLAB is used to perform audio feature extraction on the music in the music training set and on the music in the music test set.

In step S2, the MFCC are extracted as follows: the music signal is subjected to pre-emphasis, framing, windowing, fast Fourier transform (FFT), Mel filtering, a logarithm operation, and discrete cosine transform (DCT), yielding 12-dimensional MFCC; a second-order difference is then applied to the 12-dimensional MFCC to obtain 24-dimensional MFCC. The value in each dimension is taken as one feature parameter, so the 24-dimensional MFCC yield 24 feature parameters.

In step S2, the music energy is extracted as follows: the music signal is subjected to pre-emphasis, framing, windowing, and FFT, and the music energy is obtained from the Fourier spectrum produced by the FFT. The music energy comprises 2 feature parameters: the maximum amplitude and the mean amplitude in the Fourier spectrum.

In step S2, the music rhythm is extracted as follows: the number of beats in each 5-second interval of the music is extracted, and the music rhythm is obtained from these counts. The music rhythm comprises 2 feature parameters: the maximum and the mean of the per-5-second beat counts.

In step S3, the neural network is trained with the BP (back-propagation) neural network algorithm: training samples are input, and the weights and thresholds (biases) of the BP neural network are adjusted repeatedly by back-propagation until the difference between the output vector and the expected vector is less than 10% of the expected vector. The output vector is the predicted class value of the music; the expected vector is the actual class value of the music.

In step S3, the trained BP neural network is tested to obtain its classification accuracy. If the classification accuracy is less than 85%, the neural network parameters are readjusted by modifying the number of layers, the number of iterations, the learning rate, and the target error of the BP neural network. The classification accuracy is obtained from the number of correctly classified pieces in the music test set and the total number of pieces in the music test set: classification accuracy = number of correctly classified pieces in the test set / total number of pieces in the test set.

The advantages of the present invention are:

(1) The audio features extracted by the invention comprise 28 feature parameters, which express the characteristics of music more comprehensively and thus allow music to be classified accurately.

(2) A second-order difference is applied when extracting the MFCC, which represents the variation of the music more clearly and makes the extraction of timbre features more accurate.

(3) The invention uses the BP neural network algorithm to train the automatic music classifier and, during testing, repeatedly adjusts the neural network parameters (the number of layers, the number of iterations, the learning rate, and the target error) until the classification accuracy is higher than 85%, thereby ensuring the quality of the automatic classification.

Brief description of the drawings

Fig. 1 is a flow chart of the method of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

As shown in Fig. 1, a method for automatic music classification based on audio features comprises the following steps:

S1: obtaining music of different categories according to the mainstream categorization of current music to form a music collection; using 80% of the music collection as the music training set for training and 20% as the music test set for testing, both sets containing music of the different categories.

The mainstream categories are: pop, R&B, jazz, electronic, folk, rock, rap, and country.
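The 80%/20% division of step S1 can be sketched as follows. This is a minimal illustration in Python rather than the MATLAB used in the embodiment. The per-category (stratified) split, the function name `stratified_split`, the `(path, genre)` track representation, and the fixed random seed are all assumptions made for the example; the text only requires that both sets contain music of every category.

```python
import random
from collections import defaultdict

# The eight categories named in step S1.
GENRES = ["pop", "R&B", "jazz", "electronic", "folk", "rock", "rap", "country"]


def stratified_split(tracks, train_fraction=0.8, seed=0):
    """Split (path, genre) pairs into training and test lists.

    Splitting inside each genre (an assumption, not stated in the text)
    guarantees that both sets contain every genre that has at least two tracks.
    """
    by_genre = defaultdict(list)
    for path, genre in tracks:
        by_genre[genre].append(path)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for genre, paths in by_genre.items():
        rng.shuffle(paths)
        cut = int(round(len(paths) * train_fraction))
        train += [(p, genre) for p in paths[:cut]]
        test += [(p, genre) for p in paths[cut:]]
    return train, test
```

With ten hypothetical tracks per category, this yields 64 training tracks and 16 test tracks, two of each category in the test set.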

S2: using MATLAB to perform audio feature extraction on the music in the music training set and on the music in the music test set, respectively; obtaining the audio features of the music in the training set and forming the music feature set of the training set from them; and obtaining the audio features of the music in the test set and forming the music feature set of the test set from them;

The audio features comprise: Mel-frequency cepstral coefficients (MFCC), music energy, and music rhythm.

The MFCC are extracted as follows: the music signal is subjected to pre-emphasis, framing, windowing, fast Fourier transform (FFT), Mel filtering, a logarithm operation, and discrete cosine transform (DCT), yielding 12-dimensional MFCC. To represent the variation of the music more clearly, a second-order difference is applied to the 12-dimensional MFCC to obtain 24-dimensional MFCC. The value in each dimension is taken as one feature parameter, so the 24-dimensional MFCC yield 24 feature parameters.
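The MFCC chain described above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the MATLAB implementation of the embodiment: the pre-emphasis factor 0.97, the Hamming window, the frame size and hop, the 20 Mel filters, and the use of coefficients c1 to c12 are conventional choices that the text does not specify, and a naive DFT stands in for the FFT. The text also does not say how the second-order difference turns 12 dimensions into 24; here the 12 static coefficients of each frame are concatenated with their 12 second-order differences over time, which is one plausible reading. How per-frame values are summarized into one vector per piece is likewise not specified and is left out.

```python
import cmath
import math


def pre_emphasis(signal, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1]; boosts high frequencies (alpha is assumed)."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]


def windowed_frames(signal, size, hop):
    """Split into overlapping frames and apply a Hamming window to each."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (size - 1)) for i in range(size)]
    return [[x * w for x, w in zip(signal[s:s + size], window)]
            for s in range(0, len(signal) - size + 1, hop)]


def power_spectrum(frame):
    """Naive DFT (stands in for the FFT); returns power for bins 0..N/2."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))) ** 2 / n
            for k in range(n // 2 + 1)]


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges_hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * f / sample_rate) for f in edges_hz]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc_frames(signal, sample_rate, size=64, hop=32, n_filters=20, n_coeffs=12):
    """12 cepstral coefficients per frame: DCT-II of the log Mel energies."""
    bank = mel_filterbank(n_filters, size, sample_rate)
    result = []
    for frame in windowed_frames(pre_emphasis(signal), size, hop):
        spec = power_spectrum(frame)
        # Small constant avoids log(0) when a filter captures no energy.
        log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10) for row in bank]
        result.append([sum(e * math.cos(math.pi * m * (j + 0.5) / n_filters)
                           for j, e in enumerate(log_e))
                       for m in range(1, n_coeffs + 1)])
    return result


def add_second_difference(coeffs):
    """Append c[t+1] - 2*c[t] + c[t-1] to each interior frame: 12 -> 24 values."""
    out = []
    for t in range(1, len(coeffs) - 1):
        d2 = [a - 2 * b + c for a, b, c in zip(coeffs[t + 1], coeffs[t], coeffs[t - 1])]
        out.append(coeffs[t] + d2)
    return out
```

Calling `add_second_difference(mfcc_frames(signal, sample_rate))` returns one 24-value list per interior frame. The first and last frames are dropped because the second-order difference needs a neighbour on each side.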

The music energy is extracted as follows: the music signal is subjected to pre-emphasis, framing, windowing, and FFT, and the music energy is obtained from the Fourier spectrum produced by the FFT. The music energy comprises 2 feature parameters: the maximum amplitude and the mean amplitude in the Fourier spectrum.
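The two energy parameters can be illustrated with the short Python sketch below. It assumes that the maximum and the mean are taken over the amplitudes of all frequency bins of all frames; the text does not state the pooling, and the frame size, hop, pre-emphasis factor, and Hamming window are also assumed values. A naive DFT is used in place of the FFT so that the example needs only the standard library.

```python
import cmath
import math


def spectrum_energy_features(signal, frame_size=64, hop=32, alpha=0.97):
    """Return (maximum amplitude, mean amplitude) of the framed Fourier spectrum."""
    if len(signal) < frame_size:
        raise ValueError("signal is shorter than one frame")

    # Pre-emphasis: y[n] = x[n] - alpha * x[n-1].
    emphasized = [signal[0]] + [signal[n] - alpha * signal[n - 1]
                                for n in range(1, len(signal))]
    # Hamming window, computed once and reused for every frame.
    window = [0.54 - 0.46 * math.cos(2 * math.pi * i / (frame_size - 1))
              for i in range(frame_size)]

    amplitudes = []
    for start in range(0, len(emphasized) - frame_size + 1, hop):
        frame = [x * w for x, w in zip(emphasized[start:start + frame_size], window)]
        # Naive DFT over the non-negative frequency bins (stands in for the FFT).
        for k in range(frame_size // 2 + 1):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / frame_size)
                        for i, x in enumerate(frame))
            amplitudes.append(abs(coeff))

    return max(amplitudes), sum(amplitudes) / len(amplitudes)
```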

The music rhythm is extracted as follows: the number of beats in each 5-second interval of the music is extracted, and the music rhythm is obtained from these counts. The music rhythm comprises 2 feature parameters: the maximum and the mean of the per-5-second beat counts.
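Given beat positions, the two rhythm parameters reduce to a windowed count, sketched below in Python. The beat detection itself is not described in the text, so the example assumes that a list of beat timestamps in seconds is already available from some beat tracker; the function name and the handling of a shorter final window (counted like a full one) are also assumptions.

```python
import math


def rhythm_features(beat_times, duration, window=5.0):
    """Return (max beats per window, mean beats per window).

    beat_times: beat timestamps in seconds (assumed to come from a beat tracker).
    duration:   length of the piece in seconds.
    """
    if duration <= 0:
        raise ValueError("duration must be positive")

    n_windows = math.ceil(duration / window)
    counts = [0] * n_windows
    for t in beat_times:
        if 0 <= t < duration:
            counts[int(t // window)] += 1  # index of the 5-second window

    return max(counts), sum(counts) / len(counts)
```

For a 10-second excerpt with 5 beats in the first 5 seconds and 7 in the next 5, the function returns `(7, 6.0)`.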

The audio features of one piece of music comprise 28 feature parameters: 24 feature parameters obtained from the 24-dimensional MFCC, 2 feature parameters representing music energy, and 2 feature parameters representing music rhythm.

S3: using the music feature set of the training set as training samples to train a neural network, obtaining a trained BP neural network; and using the music feature set of the test set as test samples to test the classification accuracy of the trained BP neural network. If the classification accuracy of the BP neural network is less than 85%, the neural network parameters are readjusted by modifying the number of layers, the number of iterations, the learning rate, and the target error of the BP neural network. If the classification accuracy of the BP neural network is greater than or equal to 92.5%, the test result is good, i.e. the trained BP neural network has good classification accuracy.

The neural network is trained with the BP (back-propagation) neural network algorithm: training samples are input, and the weights and thresholds (biases) of the BP neural network are adjusted repeatedly by back-propagation until the difference between the output vector and the expected vector is less than 10% of the expected vector. The output vector is the predicted class value of the music, i.e. the class value predicted by the BP neural network; the expected vector is the actual class value of the music, i.e. the correct class value from the music training set.
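The training rule above can be illustrated with a small back-propagation network in plain Python. This is a sketch under assumptions rather than the network of the embodiment: it uses one sigmoid hidden layer, a single linear output that regresses the numeric class value, per-sample gradient descent on the squared error, and toy two-dimensional inputs instead of real music features. The layer sizes, learning rate, and epoch limit are placeholder values. The stopping test follows the text: training stops once every output differs from its expected value by less than 10% of that value.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class BPNetwork:
    """One hidden sigmoid layer and one linear output (assumed architecture)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = sum(w * h for w, h in zip(self.w2, hidden)) + self.b2
        return hidden, out

    def train_step(self, x, target, lr):
        """One back-propagation update for the error E = 0.5 * (out - target)^2."""
        hidden, out = self.forward(x)
        err = out - target  # dE/d(out)
        for j, h in enumerate(hidden):
            grad_h = err * self.w2[j] * h * (1.0 - h)  # uses w2 before its update
            self.w2[j] -= lr * err * h
            self.b1[j] -= lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
        self.b2 -= lr * err


def mean_squared_error(net, samples):
    return sum((net.forward(x)[1] - t) ** 2 for x, t in samples) / len(samples)


def within_tolerance(net, samples, tol=0.10):
    """True when every output is within 10% of its expected value."""
    return all(abs(net.forward(x)[1] - t) < tol * abs(t) for x, t in samples)


def train(net, samples, lr=0.05, max_epochs=5000):
    """Train until the 10% criterion holds or max_epochs is reached."""
    for epoch in range(max_epochs):
        if within_tolerance(net, samples):
            return epoch
        for x, t in samples:
            net.train_step(x, t, lr)
    return max_epochs
```

A real run would pass the 28 feature parameters of each training piece as `x` and its class value (1 to 8) as the target. Scaling the features before training is usually advisable but is not discussed in the text.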

When the BP neural network is trained and tested, one feature parameter used as a label is added before the 28 feature parameters of the audio features, so that the labeled audio features comprise 29 feature parameters: 24 feature parameters obtained from the 24-dimensional MFCC, 2 feature parameters representing music energy, 2 feature parameters representing music rhythm, and 1 feature parameter used as the label. The label identifies the category of the music: the values 1, 2, 3, 4, 5, 6, 7, 8 denote pop, R&B, jazz, electronic, folk, rock, rap, and country, respectively.
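The 29-parameter layout can be made concrete with the following Python sketch. The numeric codes 1 to 8 and the order of the parts (label, then MFCC, energy, rhythm) follow the text; the names `GENRE_IDS` and `labeled_feature_vector` are hypothetical.

```python
# Category codes as given in the text: 1..8 in this order.
GENRE_IDS = {
    "pop": 1, "R&B": 2, "jazz": 3, "electronic": 4,
    "folk": 5, "rock": 6, "rap": 7, "country": 8,
}


def labeled_feature_vector(genre, mfcc24, energy2, rhythm2):
    """Return [label] + 24 MFCC + 2 energy + 2 rhythm parameters (29 values)."""
    if (len(mfcc24), len(energy2), len(rhythm2)) != (24, 2, 2):
        raise ValueError("expected 24 MFCC, 2 energy and 2 rhythm parameters")
    return [GENRE_IDS[genre]] + list(mfcc24) + list(energy2) + list(rhythm2)
```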

When the BP neural network is tested, the input is the music feature set of the music test set and the output is the category of each piece in the test set; the output likewise uses 1, 2, 3, 4, 5, 6, 7, 8 to denote pop, R&B, jazz, electronic, folk, rock, rap, and country, respectively. For example, if the label in the audio features of a song is 1, the output for that song after testing by the BP neural network should also be 1. Whether the trained BP neural network has classified a piece correctly is thus judged by whether the output matches the label: if they match, the classification is correct; otherwise it is incorrect.

The classification accuracy is obtained from the number of correctly classified pieces in the music test set and the total number of pieces in the music test set: classification accuracy = number of correctly classified pieces in the test set / total number of pieces in the test set.
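The accuracy formula and the 85% retuning threshold can be expressed in a few lines of Python. The sketch assumes that the network outputs have already been converted to integer class codes (1 to 8) so that they can be compared with the labels for equality; the function names are hypothetical.

```python
def classification_accuracy(predicted, actual):
    """Correctly classified pieces divided by all pieces in the test set."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equally long")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


def needs_retuning(accuracy, threshold=0.85):
    """True when the accuracy is below 85%, i.e. parameters should be readjusted."""
    return accuracy < threshold
```

For example, 9 correct outputs out of 10 test pieces give an accuracy of 0.9, which does not trigger retuning.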

Starting from the audio features of music, the present invention selects audio features that better express musical characteristics, which addresses the problem of relying on a single kind of audio feature, provides an effective data basis for automatic classification, and improves the accuracy of automatic music classification. Neural network training is carried out on the extracted audio features, which further improves the automatic classification and yields an automatic music classifier with higher classification accuracy. The invention obtains 24 feature parameters from the 24-dimensional MFCC; in the same way, higher-dimensional MFCC can be obtained by following the MFCC extraction procedure and applying further difference processing, yielding more feature parameters for the audio features.

The above are only preferred embodiments of the invention and are not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. A method for automatic music classification based on audio features, characterized by comprising the following steps:

S1: obtaining music of different categories to build a music collection, the music collection comprising a music training set used for training and a music test set used for testing, each of which contains music of the different categories;

S2: performing audio feature extraction on the music in the music training set and on the music in the music test set, respectively; obtaining the audio features of the music in the training set and forming the music feature set of the training set from them; and obtaining the audio features of the music in the test set and forming the music feature set of the test set from them;

S3: using the music feature set of the training set as training samples to train a neural network, obtaining a trained neural network; and using the music feature set of the test set as test samples to test the trained neural network.

2. The method for automatic music classification based on audio features according to claim 1, characterized in that, in step S1, the categories of the music include: pop, R&B, jazz, electronic, folk, rock, rap, and country.

3. The method for automatic music classification based on audio features according to claim 1, characterized in that, in step S2, MATLAB is used to perform audio feature extraction on the music in the music training set and on the music in the music test set.

4. The method for automatic music classification based on audio features according to claim 3, characterized in that, in step S2, the Mel-frequency cepstral coefficients (MFCC) are extracted as follows: the music signal is subjected to pre-emphasis, framing, windowing, fast Fourier transform (FFT), Mel filtering, a logarithm operation, and discrete cosine transform (DCT), yielding 12-dimensional MFCC; a second-order difference is applied to the 12-dimensional MFCC to obtain 24-dimensional MFCC; and the value in each dimension is taken as one feature parameter, so that the 24-dimensional MFCC yield 24 feature parameters.

5. The method for automatic music classification based on audio features according to claim 3, characterized in that, in step S2, the music energy is extracted as follows: the music signal is subjected to pre-emphasis, framing, windowing, and FFT, and the music energy is obtained from the Fourier spectrum produced by the FFT; the music energy comprises 2 feature parameters: the maximum amplitude and the mean amplitude in the Fourier spectrum.

6. The method for automatic music classification based on audio features according to claim 3, characterized in that, in step S2, the music rhythm is extracted as follows: the number of beats in each 5-second interval of the music is extracted, and the music rhythm is obtained from these counts; the music rhythm comprises 2 feature parameters: the maximum and the mean of the per-5-second beat counts.

7. The method for automatic music classification based on audio features according to claim 1, characterized in that, in step S3, the neural network is trained with the BP neural network algorithm: training samples are input, and the weights and thresholds of the neural network are adjusted repeatedly by back-propagation until the difference between the output vector and the expected vector is less than 10% of the expected vector; the output vector is the predicted class value of the music; the expected vector is the actual class value of the music.

8. The method for automatic music classification based on audio features according to claim 1, characterized in that, in step S3, the trained BP neural network is tested to obtain its classification accuracy; if the classification accuracy is less than 85%, the neural network parameters are readjusted by modifying the number of layers, the number of iterations, the learning rate, and the target error of the BP neural network; the classification accuracy is obtained from the number of correctly classified pieces in the music test set and the total number of pieces in the music test set: classification accuracy = number of correctly classified pieces in the test set / total number of pieces in the test set.