CN102375834B

CN102375834B - Audio file retrieval method and system, and audio file type recognition method and system

Info

Publication number: CN102375834B
Application number: CN201010256981.8A
Authority: CN
Inventors: 肖力豪
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2010-08-17
Filing date: 2010-08-17
Publication date: 2016-01-20
Anticipated expiration: 2030-08-17
Also published as: CN102375834A

Abstract

The invention provides an audio file retrieval method and system, and an audio file type recognition method and system. The audio file retrieval method comprises: determining in advance the Mel-frequency cepstral (MFCC) coefficients and/or the note-feature Pitch coefficients of each audio file in an audio file library; extracting the MFCC coefficients and/or Pitch coefficients of the audio file to be retrieved; matching them against the MFCC coefficients and/or Pitch coefficients of each audio file in the audio file library; and retrieving an audio file from the audio file library according to the matching result. With the invention, audio files can be retrieved by tune, or audio files belonging to a given type can be retrieved, or the efficiency of identifying which type an audio file belongs to can be improved.

Description

Audio file retrieval method and system, and audio file type recognition method and system

Technical field

The present invention relates to the field of audio file retrieval, and in particular to an audio file retrieval method and system and an audio file type recognition method and system.

Background art

The core task of audio file retrieval is to retrieve, from an audio file library, the audio files that meet the user's needs.

At present, audio file retrieval systems can only retrieve audio files by attributes supplied by the user, such as the title of the audio file or the artist name. This approach has the following shortcomings:

First, when the user does not know attributes such as the title or artist name of an audio file and can only hum its tune, existing audio retrieval systems cannot retrieve the corresponding audio file from the tune.

Second, when the user is not looking for one specific audio file but wants audio files that express a certain mood (artistic conception), existing audio retrieval systems likewise cannot retrieve the corresponding audio files.

In addition, identifying the mood type of every audio file manually would consume a great deal of manpower and is inefficient.

Summary of the invention

In view of this, the invention provides an audio file retrieval method and system and an audio file type recognition method and system, so that audio files can be retrieved by tune, or audio files belonging to a given type can be retrieved, or the efficiency of identifying which type an audio file belongs to can be improved.

An audio file type recognition method for labelling the type to which each audio file in an audio file library belongs, the method comprising:

storing, in a feature library, the MFCC coefficients and/or Pitch coefficients of the feature audio files corresponding to the various audio types;

extracting the MFCC coefficients and/or Pitch coefficients of the audio file to be identified; determining the sequence of operations that must be performed to convert the MFCC coefficients and/or Pitch coefficients of the audio file to be identified into the MFCC coefficients and/or Pitch coefficients of a feature audio file, and the minimum cost incurred by each operation type, wherein the operation type is any one of copy, replace, delete, insert and exchange; following the order of the operations in the sequence of operations, iterating in turn over the minimum cost incurred when the corresponding operation type is the last operation; obtaining from the iteration result the minimum cost of converting the MFCC coefficients and/or Pitch coefficients of the audio file to be identified into those of the feature audio file, wherein the smaller this minimum cost, the higher the matching degree between the audio file to be identified and the feature audio file; and identifying the audio type of the audio file to be identified from the matching degree and the audio type to which the feature audio file belongs.

An audio file type recognition system, comprising a feature library, an audio feature extraction module, a matching module and a type identification module;

the feature library stores the MFCC coefficients and/or Pitch coefficients of the feature audio files corresponding to the various audio types;

the audio feature extraction module extracts the MFCC coefficients and/or Pitch coefficients of the audio file to be identified;

the matching module determines the sequence of operations that must be performed to convert the MFCC coefficients and/or Pitch coefficients of the audio file to be identified into the MFCC coefficients and/or Pitch coefficients of a feature audio file, and the minimum cost incurred by each operation type, wherein the operation type is any one of copy, replace, delete, insert and exchange; following the order of the operations in the sequence of operations, it iterates in turn over the minimum cost incurred when the corresponding operation type is the last operation, and obtains from the iteration result the minimum cost of converting the MFCC coefficients and/or Pitch coefficients of the audio file to be identified into those of the feature audio file, wherein the smaller this minimum cost, the higher the matching degree between the audio file to be identified and the feature audio file;

the type identification module identifies the audio type of the audio file to be identified from the matching degree obtained by the matching module and the audio type to which the feature audio file belongs.

An audio file retrieval method, comprising:

extracting in advance the MFCC coefficients and/or Pitch coefficients of the audio files in an audio file library; determining the sequence of operations that must be performed to convert the MFCC coefficients and/or Pitch coefficients of an audio file in the audio file library into the MFCC coefficients and/or Pitch coefficients of a feature audio file, and the minimum cost incurred by each operation type, wherein the operation type is any one of copy, replace, delete, insert and exchange; following the order of the operations in the sequence of operations, iterating in turn over the minimum cost incurred when the corresponding operation type is the last operation; obtaining from the iteration result the minimum cost of converting the MFCC coefficients and/or Pitch coefficients of the audio file in the audio file library into those of the feature audio file, wherein the smaller this minimum cost, the higher the matching degree between the audio file in the audio file library and the feature audio file; and identifying and storing the audio type of each audio file in the audio file library from the matching degree and the audio type to which the feature audio file belongs;

receiving an audio type to be retrieved and, according to the stored audio types of the audio files in the audio file library, retrieving the audio files that belong to the audio type to be retrieved.

An audio file retrieval system, comprising a feature library, an audio file type identification module, an audio file type storage module and a retrieval module;

the audio file type identification module extracts in advance the MFCC coefficients and/or Pitch coefficients of the audio files in an audio file library; determines the sequence of operations that must be performed to convert the MFCC coefficients and/or Pitch coefficients of an audio file in the audio file library into the MFCC coefficients and/or Pitch coefficients of a feature audio file, and the minimum cost incurred by each operation type, wherein the operation type is any one of copy, replace, delete, insert and exchange; following the order of the operations in the sequence of operations, iterates in turn over the minimum cost incurred when the corresponding operation type is the last operation; obtains from the iteration result the minimum cost of converting the MFCC coefficients and/or Pitch coefficients of the audio file in the audio file library into those of the feature audio file, wherein the smaller this minimum cost, the higher the matching degree between the audio file in the audio file library and the feature audio file; and identifies the audio type of each audio file in the audio file library from the matching degree and the audio type to which the feature audio file belongs;

the audio file type storage module stores the audio types of the audio files in the audio file library according to the recognition result of the audio file type identification module;

the retrieval module receives an audio type to be retrieved and, according to the audio type of each audio file stored in the audio file type storage module, retrieves the audio files that belong to the audio type to be retrieved.

As can be seen from the above technical solution, the invention extracts the audio feature data of audio files, namely the MFCC coefficients and Pitch coefficients, and compares the audio feature data of the audio file to be retrieved with the audio feature data of the audio files in the audio library, so that the audio retrieval system can retrieve audio files by tune.

When the invention matches the audio feature data of each audio file in the audio library against the audio feature data of the feature audio files of each type in the feature library, and identifies from the matching result which type an audio file belongs to, no audio file needs to be identified manually, so the efficiency of identifying which type an audio file belongs to is improved.

The audio type recognition method provided by the invention can also be used to identify and store the audio type of each audio file in the audio file library, so that the corresponding audio files can be retrieved according to the audio type to be retrieved.

Brief description of the drawings

Fig. 1 is a flowchart of the audio file retrieval method provided by the invention.

Fig. 2 is a flowchart of the MFCC coefficient extraction method.

Fig. 3 is a flowchart of the Pitch coefficient extraction method.

Fig. 4 is a schematic diagram of the composition of the audio file retrieval system provided by the invention.

Fig. 5 is a flowchart of the audio file type recognition method provided by the invention.

Fig. 6 is a schematic diagram of the composition of the audio file type recognition system provided by the invention.

Fig. 7 is a flowchart of another audio file retrieval method provided by the invention.

Fig. 8 is a schematic diagram of the composition of another audio file retrieval system provided by the invention.

Embodiments

As shown in Fig. 1, the method comprises:

Step 101: determine in advance the MFCC coefficients and Pitch coefficients of each audio file in the audio file library.

Step 102: extract the MFCC coefficients and Pitch coefficients of the audio file to be retrieved.

Step 103: match the MFCC coefficients and Pitch coefficients of the audio file to be retrieved against the MFCC coefficients and Pitch coefficients of each audio file in the audio file library.

Step 104: retrieve an audio file from the audio file library according to the matching result.

Mel-frequency cepstral coefficients (MFCC) are based on the auditory characteristics of the human ear; they use a nonlinear frequency scale (the Mel scale) to model the human auditory system. Acoustic research shows that MFCC coefficients reflect audio features well.

The audio feature data on which the method of Fig. 1 bases retrieval comprises both the MFCC coefficients and the Pitch coefficients of an audio file. In practice, retrieval may also rely on the MFCC coefficients alone or on the Pitch coefficients alone.

Fig. 2 is a flowchart of the MFCC coefficient extraction method.

As shown in Fig. 2, the method comprises:

Step 201: apply pre-emphasis to the input signal of the audio file to be retrieved.

The purpose of pre-emphasis is to flatten the spectrum of the signal so that the spectrum can be computed with the same signal-to-noise ratio across the whole band from low to high frequencies, which facilitates spectral analysis or vocal-tract parameter analysis. Existing techniques may be used for pre-emphasis.

Step 202: apply windowing to the pre-emphasized signal.

An audio signal is a typical non-stationary signal. A window function (for example a Hamming window) is generally used to cut out a segment for analysis, and the segment that is cut out is regarded as short-term stationary.

Step 203: transform the windowed signal into a frequency-domain signal.

In this step, the windowed signal can be transformed into a frequency-domain signal by a fast Fourier transform (FFT) or a discrete Fourier transform (DFT).

Step 204: perform a scale conversion on the frequency-domain signal.

Step 205: filter the scale-converted frequency-domain signal.

Step 206: transform the filtered frequency-domain signal into a time-domain signal; this time-domain signal is the MFCC coefficients.

In this step, a discrete cosine transform (DCT) is applied to the filtered frequency-domain signal to obtain the time-domain signal, and this time-domain signal is the MFCC coefficients.
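Steps 201 to 206 can be sketched for a single frame as follows. This is a minimal illustration under stated assumptions, not the implementation described in the patent: the pre-emphasis factor `alpha`, the sample rate, the filter count and all function names are hypothetical, the scale conversion of step 204 is assumed to be a conversion to the Mel scale, the filtering of step 205 is assumed to be a triangular Mel filter bank, and the logarithm before the DCT is common MFCC practice rather than something the text states.

```python
import math

def pre_emphasis(samples, alpha=0.97):
    # Step 201: boost high frequencies (alpha is an assumed, commonly used value).
    return [samples[0]] + [samples[n] - alpha * samples[n - 1]
                           for n in range(1, len(samples))]

def hamming(frame):
    # Step 202: multiply the frame by a Hamming window.
    n = len(frame)
    return [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, s in enumerate(frame)]

def power_spectrum(frame):
    # Step 203: naive DFT, keeping bins 0..N/2 (an FFT would be used in practice).
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = -sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        spec.append((re * re + im * im) / n)
    return spec

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank_energies(spec, sample_rate, n_filters):
    # Steps 204-205 (assumed): triangular filters spaced evenly on the Mel scale,
    # followed by a logarithm of each filter's energy.
    n_bins = len(spec)
    n_fft = 2 * (n_bins - 1)
    top = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(top * i / (n_filters + 1)) * n_fft / sample_rate
             for i in range(n_filters + 2)]  # fractional bin positions
    energies = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        e = 0.0
        for k in range(n_bins):
            if lo < k <= mid:
                e += spec[k] * (k - lo) / (mid - lo)
            elif mid < k < hi:
                e += spec[k] * (hi - k) / (hi - mid)
        energies.append(math.log(e + 1e-10))  # small offset avoids log(0)
    return energies

def dct(values, n_coeffs):
    # Step 206: DCT-II of the filter-bank output.
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n)
                for i, v in enumerate(values)) for c in range(n_coeffs)]

def mfcc_frame(samples, sample_rate=8000, n_filters=12, n_coeffs=8):
    frame = hamming(pre_emphasis(samples))
    spec = power_spectrum(frame)
    return dct(mel_filterbank_energies(spec, sample_rate, n_filters), n_coeffs)
```

Calling `mfcc_frame` on successive frames of an audio signal would give one coefficient vector per frame; how those vectors are quantized into the sequences that are later matched is not specified in the text.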

Fig. 3 is a flowchart of the Pitch coefficient extraction method.

As shown in Fig. 3, the method comprises:

Step 301: apply pre-emphasis and windowing to the input audio time-domain signal, and transform the windowed time-domain signal into a frequency-domain signal.

Step 302: determine the dominant harmonic frequency of the frequency-domain signal.

This step can be implemented with existing techniques: compute the amplitude of the frequency-domain signal at each frequency (the larger the amplitude, the higher the energy), and extract the frequency value corresponding to the sampling point with the greatest energy in each frame; this value is the dominant harmonic frequency of that frame.

Step 303: map the dominant harmonic frequency to the fundamental frequency of each note in the octave table, and determine from the mapping result the pitch melody of the audio time-domain signal input in step 301.

Table 1 is the octave table:

Table one
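Steps 301 to 303 can be illustrated with the sketch below. It is written under assumptions rather than taken from the patent: since the octave table is not reproduced here, the mapping uses the standard equal-tempered scale with A4 = 440 Hz as a stand-in, pre-emphasis is omitted for brevity, and the function names and the frame length are hypothetical.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def dominant_frequency(frame, sample_rate):
    # Step 302: amplitude at each DFT bin; return the frequency of the
    # strongest bin (the DC bin is skipped).
    n = len(frame)
    best_k, best_amp = 1, -1.0
    for k in range(1, n // 2):
        re = sum(frame[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(frame[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        amp = math.hypot(re, im)
        if amp > best_amp:
            best_k, best_amp = k, amp
    return best_k * sample_rate / n

def nearest_note(freq_hz, a4=440.0):
    # Step 303 (assumed mapping): nearest equal-tempered note, named by
    # pitch class and octave, e.g. "A4".
    midi = round(69 + 12 * math.log2(freq_hz / a4))
    return NOTE_NAMES[midi % 12] + str(midi // 12 - 1)

def pitch_sequence(samples, sample_rate, frame_len):
    # Step 301 (windowing only) followed by steps 302 and 303, frame by frame.
    notes = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        n = len(frame)
        windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                    for i, s in enumerate(frame)]
        notes.append(nearest_note(dominant_frequency(windowed, sample_rate)))
    return notes
```

The resulting list of note names is one possible form for the Pitch coefficient sequence that the matching step compares; the frequency resolution is `sample_rate / frame_len`, so short frames can only distinguish widely spaced notes.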

Matching the MFCC coefficients and Pitch coefficients of the audio file to be retrieved against the MFCC coefficients and Pitch coefficients of each audio file in the audio file library, and retrieving an audio file from the audio file library according to the matching result, may comprise:

determining the minimum cost of converting the MFCC coefficients and Pitch coefficients of the audio file to be retrieved into the MFCC coefficients and Pitch coefficients of each audio file in the audio file library; comparing the minimum costs corresponding to the audio files in the audio file library; and taking the audio file whose minimum cost is the smallest as the retrieval result.

In brief, the invention determines the similarity of two sequences, that is, their matching degree, by calculating the minimum cost of transforming one sequence into the other: the smaller the minimum cost, the more similar the two sequences and the better they match.

Specifically, the invention can determine the sequence of operations that must be performed to convert the MFCC coefficients of the audio file to be retrieved into the MFCC coefficients of an audio file in the audio file library, or to convert the Pitch coefficients of the audio file to be retrieved into the Pitch coefficients of an audio file in the audio file library, and the minimum cost incurred by each operation type. Following the order of the operations in the sequence of operations, the minimum cost incurred by the corresponding operation type is iterated in turn, and the minimum cost of the whole conversion is obtained from the iteration result.

For example, suppose the MFCC coefficient or Pitch coefficient sequence of the audio file to be retrieved is denoted x[i], the MFCC coefficient or Pitch coefficient sequence of an audio file in the audio file library is denoted y[j], and the minimum cost of converting sequence x[i] into sequence y[j] is c[i, j], where i and j are natural numbers. Determining the minimum cost incurred by each operation type then comprises:

When the last operation in converting sequence x[i] into sequence y[j] is a copy, c[i, j] equals the sum of c[i-1, j-1] and the minimum cost of the copy operation.

When the last operation in converting sequence x[i] into sequence y[j] is a replace, c[i, j] equals the sum of c[i-1, j-1] and the minimum cost of the replace operation.

When the last operation in converting sequence x[i] into sequence y[j] is a delete, c[i, j] equals the sum of c[i-1, j] and the minimum cost of the delete operation.

When the last operation in converting sequence x[i] into sequence y[j] is an insert, c[i, j] equals the sum of c[i, j-1] and the minimum cost of the insert operation.

When the last operation in converting sequence x[i] into sequence y[j] is an exchange, c[i, j] equals the sum of c[i-2, j-2] and the minimum cost of the exchange operation.

The minimum cost of deleting a first string from sequence x[i] and inserting a second string of length s into sequence y[j] equals the sum of the minimum cost of deleting the first string and the minimum cost of s insert operations.
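The recurrence above can be sketched as a dynamic-programming table. The code below is an illustration under assumptions: the per-operation costs are hypothetical values (the text does not give any), the sequences are assumed to be already quantized into comparable symbols such as note names, and the names `min_conversion_cost` and `retrieve_best` are invented for this example.

```python
def min_conversion_cost(x, y, costs=None):
    """Minimum cost c[m][n] of converting sequence x into sequence y."""
    # Assumed per-operation costs; the source does not specify them.
    costs = costs or {"copy": 0, "replace": 2, "delete": 1,
                      "insert": 1, "exchange": 1}
    m, n = len(x), len(y)
    inf = float("inf")
    c = [[inf] * (n + 1) for _ in range(m + 1)]
    c[0][0] = 0
    for i in range(1, m + 1):
        c[i][0] = i * costs["delete"]   # delete every element of x
    for j in range(1, n + 1):
        c[0][j] = j * costs["insert"]   # insert every element of y
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            best = min(
                c[i - 1][j - 1] + costs["replace"],  # last operation: replace
                c[i - 1][j] + costs["delete"],       # last operation: delete
                c[i][j - 1] + costs["insert"],       # last operation: insert
            )
            if x[i - 1] == y[j - 1]:                 # last operation: copy
                best = min(best, c[i - 1][j - 1] + costs["copy"])
            if (i >= 2 and j >= 2 and x[i - 1] == y[j - 2]
                    and x[i - 2] == y[j - 1]):       # last operation: exchange
                best = min(best, c[i - 2][j - 2] + costs["exchange"])
            c[i][j] = best
    return c[m][n]

def retrieve_best(query, library, costs=None):
    """Name of the library entry whose sequence is cheapest to reach from the query."""
    return min(library,
               key=lambda name: min_conversion_cost(query, library[name], costs))
```

For instance, with these assumed costs, converting `"ab"` into `"ba"` costs 1 through a single exchange, and a query that differs from a library entry by one missing note costs one delete or insert. `retrieve_best` then corresponds to picking the audio file with the smallest minimum cost as the retrieval result.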

With the method shown in Fig. 1, the corresponding audio file can be retrieved from a tune hummed by the user. Based on the method of Fig. 1, the invention also provides a corresponding audio file retrieval system; see Fig. 4.

As shown in Fig. 4, the system comprises an audio file library 401, an audio feature extraction module 402, a matching module 403 and a retrieval module 404.

The audio file library 401 stores the MFCC coefficients and/or Pitch coefficients of each audio file.

The audio feature extraction module 402 extracts the MFCC coefficients and/or Pitch coefficients of the audio file to be retrieved.

The matching module 403 matches the MFCC coefficients and/or Pitch coefficients of the audio file to be retrieved against the MFCC coefficients and/or Pitch coefficients of each audio file in the audio file library.

The retrieval module 404 retrieves an audio file from the audio file library 401 according to the matching result of the matching module 403.

The invention also provides an audio file type recognition method and system; see Figs. 5 and 6.

As shown in Fig. 5, the method comprises:

Step 501: store, in a feature library, the MFCC coefficients and/or Pitch coefficients of the feature audio files corresponding to the various audio types.

This step is a preprocessing step completed before audio file type recognition is carried out. Unless the feature library needs to be updated, each type recognition run starts directly at step 502.

Step 502: extract the MFCC coefficients and/or Pitch coefficients of the audio file to be identified.

Step 503: match the MFCC coefficients and/or Pitch coefficients of the audio file to be identified against the MFCC coefficients and/or Pitch coefficients of each feature audio file.

Step 504: identify the audio type of the audio file to be identified from the matching degree and the audio type to which each feature audio file belongs.

In this step, for each type, the number of feature audio files of that type whose matching degree with the audio file to be identified is greater than a predetermined value can be determined; the type of the audio file to be identified is then identified as the audio types for which this number is greater than a predetermined value or ranks in the top N.
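The counting rule of step 504 can be sketched as below. This is a hedged illustration: the function name, its parameters and the tie-breaking by type name are assumptions, and the matching degrees are taken as given inputs because the text only says that a smaller minimum cost means a higher matching degree without fixing a formula.

```python
from collections import Counter

def identify_types(matches, degree_threshold, count_threshold=None, top_n=None):
    """matches: iterable of (audio_type, matching_degree), one per feature audio file."""
    # Count, per type, the feature audio files that match well enough.
    counts = Counter(t for t, degree in matches if degree > degree_threshold)
    if count_threshold is not None:
        # Variant 1: keep every type whose count exceeds a predetermined value.
        return sorted(t for t, k in counts.items() if k > count_threshold)
    # Variant 2: keep the top N types by count (ties broken by type name).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [t for t, _ in ranked[:top_n]]
```

With this sketch an audio file can receive more than one type label, which is consistent with the rule allowing either every type above the count threshold or the top N types.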

Matching the MFCC coefficients and/or Pitch coefficients of the audio file to be identified against the MFCC coefficients and/or Pitch coefficients of each feature audio file comprises: determining the minimum cost of converting the MFCC coefficients and/or Pitch coefficients of the audio file to be identified into the coefficients of the same kind (MFCC coefficients and/or Pitch coefficients) of the feature audio file, wherein the smaller this minimum cost, the higher the matching degree between the audio file to be identified and the feature audio file.

Specifically, the sequence of operations that must be performed to convert the MFCC coefficients and/or Pitch coefficients of the audio file to be identified into the MFCC coefficients and/or Pitch coefficients of the feature audio file can be determined, together with the minimum cost incurred by each operation type. Following the order of the operations in the sequence of operations, the minimum cost incurred by the corresponding operation type is iterated in turn, and the minimum cost of the whole conversion is obtained from the iteration result.

For how the minimum cost incurred by each operation type is determined, see the corresponding description of the method shown in Fig. 1.

As shown in Fig. 6, the system comprises a feature library 601, an audio feature extraction module 602, a matching module 603 and a type identification module 604.

The feature library 601 stores the MFCC coefficients and/or Pitch coefficients of the feature audio files corresponding to the various audio types.

The audio feature extraction module 602 extracts the MFCC coefficients and/or Pitch coefficients of the audio file to be identified.

The matching module 603 matches the MFCC coefficients and/or Pitch coefficients of the audio file to be identified against the MFCC coefficients and/or Pitch coefficients of each feature audio file.

The type identification module 604 identifies the audio type of the audio file to be identified from the matching degree obtained by the matching module 603 and the audio type to which each feature audio file belongs.

The type identification module 604 can determine, for each type, the number of feature audio files of that type whose matching degree with the audio file to be identified is greater than a predetermined value, and identify the type of the audio file to be identified as the audio types for which this number is greater than a predetermined value or ranks in the top N.

By applying the method or system shown in Figs. 5 and 6, the type to which an audio file belongs, for example which mood type it expresses (such as happiness, sorrow or excitement), can be identified automatically, which improves the efficiency of audio file type recognition.

With the scheme provided in Figs. 5 and 6, the type to which each audio file in the audio file library belongs can be labelled. Audio files belonging to a type can therefore be retrieved according to a type keyword entered by the user; see Figs. 7 and 8.

As shown in Fig. 7, the method comprises:

Step 701: store, in a feature library, the MFCC coefficients and/or Pitch coefficients of the feature audio files corresponding to the various audio types.

Step 702: extract in advance the MFCC coefficients and/or Pitch coefficients of the audio files in the audio file library, match them against the MFCC coefficients and/or Pitch coefficients of each feature audio file, and identify and store the audio type of each audio file in the audio file library from the matching degree and the audio type to which each feature audio file belongs.

Steps 701 and 702 are completed in advance, before retrieval. Each time audio files of a given type are retrieved, step 703 can be performed directly.

Step 703: receive an audio type to be retrieved and, according to the stored audio types of the audio files in the audio file library, retrieve the audio files that belong to the audio type to be retrieved.
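The split between the precomputed labelling (steps 701 and 702) and the lookup at query time (step 703) can be sketched as a simple index. The class and method names are hypothetical, and the labels passed to `store` are assumed to come from a type recognition step such as the one described for Fig. 5.

```python
from collections import defaultdict

class TypeIndex:
    """Maps each audio type to the audio files labelled with it."""

    def __init__(self):
        self._files_by_type = defaultdict(list)

    def store(self, file_name, audio_types):
        # Step 702 (storage part): record the recognised types of one file.
        for audio_type in audio_types:
            self._files_by_type[audio_type].append(file_name)

    def retrieve(self, audio_type):
        # Step 703: return the files that belong to the requested type.
        return list(self._files_by_type.get(audio_type, []))
```

Because the labels are stored ahead of time, a query by type does not need to touch the audio features at all; it is a dictionary lookup.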

As shown in Fig. 8, the system comprises a feature library 801, an audio file type identification module 802, an audio file type storage module 803 and a retrieval module 804.

The feature library 801 stores the MFCC coefficients and/or Pitch coefficients of the feature audio files corresponding to the various audio types.

The audio file type identification module 802 extracts in advance the MFCC coefficients and/or Pitch coefficients of the audio files in the audio file library, matches them against the MFCC coefficients and/or Pitch coefficients of each feature audio file, and identifies the audio type of each audio file in the audio file library from the matching degree and the audio type to which each feature audio file belongs.

The audio file type storage module 803 stores the audio types of the audio files in the audio file library according to the recognition result of the audio file type identification module.

The retrieval module 804 receives an audio type to be retrieved and, according to the audio type of each audio file stored in the audio file type storage module, retrieves the audio files that belong to the audio type to be retrieved.

When the information of audio files is stored, three groups of files can be used for each audio file: feature files, a property file and a semantic file. The feature files store the audio feature data of the audio file, such as the MFCC coefficients and Pitch coefficients; each kind of audio feature data can be stored as its own feature file, for example the MFCC coefficients and the Pitch coefficients are stored separately as an MFCC coefficient file and a Pitch coefficient file. The property file stores the basic attributes of the audio file, such as its size, the file name with extension, the file name without extension, the singer name, the album name, semantic annotation phrases and the lyrics. The semantic annotation words indicate the audio type identified by the type recognition method provided by the invention, for example happy, sad or excited. The semantic file stores a group of semantic annotation words that describe the audio file semantically.
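One way to picture the three groups of files is as three record types. The sketch below is only an illustration: the class names, field names and the use of Python dataclasses are assumptions, and the text does not specify any on-disk format.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class FeatureFiles:
    mfcc: list    # contents of the MFCC coefficient file
    pitch: list   # contents of the Pitch coefficient file

@dataclass
class PropertyFile:
    size_bytes: int
    file_name: str    # file name with extension
    base_name: str    # file name without extension
    singer: str = ""
    album: str = ""
    semantic_tags: list = field(default_factory=list)  # e.g. "happy", "sad"
    lyrics: str = ""

@dataclass
class AudioRecord:
    features: FeatureFiles
    properties: PropertyFile
    semantic_words: list = field(default_factory=list)  # semantic file contents

record = AudioRecord(
    features=FeatureFiles(mfcc=[1.0, 0.5], pitch=["A4", "C4"]),
    properties=PropertyFile(1024, "song.mp3", "song", semantic_tags=["happy"]),
    semantic_words=["happy"],
)
as_dict = asdict(record)  # nested dictionaries, ready to be written out
```

Keeping the feature data separate from the attributes means that a type-based or attribute-based query can read the property and semantic records without loading the coefficient sequences.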

The above are only preferred embodiments of the invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims

1. An audio file type recognition method, characterized in that the method is for labelling the type to which each audio file in an audio file library belongs, and the method comprises:

storing, in a feature library, the MFCC coefficients and/or Pitch coefficients of the feature audio files corresponding to the various audio types, wherein the audio types comprise artistic conception (mood) types;

extracting the MFCC coefficients and/or Pitch coefficients of the audio file to be identified; determining the sequence of operations that must be performed to convert the MFCC coefficients and/or Pitch coefficients of the audio file to be identified into the MFCC coefficients and/or Pitch coefficients of a feature audio file, and the minimum cost incurred by each operation type, wherein the operation type is any one of copy, replace, delete, insert and exchange; following the order of the operations in the sequence of operations, iterating in turn over the minimum cost incurred when the corresponding operation type is the last operation; obtaining from the iteration result the minimum cost of converting the MFCC coefficients and/or Pitch coefficients of the audio file to be identified into those of the feature audio file, wherein the smaller this minimum cost, the higher the matching degree between the audio file to be identified and the feature audio file; and identifying the audio type of the audio file to be identified from the matching degree and the audio type to which the feature audio file belongs, wherein the audio file to be identified comprises an audio file in the audio file library.

2. The method according to claim 1, characterized in that identifying the audio type of the audio file to be identified from the matching degree and the audio type to which the feature audio file belongs comprises:

determining, for each type, the number of feature audio files of that type whose matching degree with the audio file to be identified is greater than a predetermined value; and

identifying the type of the audio file to be identified as the audio types for which this number is greater than a predetermined value or ranks in the top N.

3. An audio file type recognition system, characterized in that the system is for labelling the type to which each audio file in an audio file library belongs, and the system comprises a feature library, an audio feature extraction module, a matching module and a type identification module;

the feature library stores the MFCC coefficients and/or Pitch coefficients of the feature audio files corresponding to the various audio types, wherein the audio types comprise artistic conception (mood) types;

the type identification module identifies the audio type of the audio file to be identified from the matching degree obtained by the matching module and the audio type to which the feature audio file belongs;

wherein the audio file to be identified comprises an audio file in the audio file library.

4. The system according to claim 3, characterized in that

the type identification module determines, for each type, the number of feature audio files of that type whose matching degree with the audio file to be identified is greater than a predetermined value, and identifies the type of the audio file to be identified as the audio types for which this number is greater than a predetermined value or ranks in the top N.

5. an audio file search method, is characterized in that, the method comprises:

6. an audio file searching system, is characterized in that, this system comprises feature database, audio file type identification module, audio file type memory module and retrieval module;