CN102375834B - Audio file retrieval method and system, and audio file type recognition method and system - Google Patents


Info

Publication number
CN102375834B
Authority
CN
China
Prior art keywords
audio file
audio
file
coefficient
frequency
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201010256981.8A
Other languages
Chinese (zh)
Other versions
CN102375834A (en)
Inventor
肖力豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201010256981.8A
Publication of CN102375834A
Application granted
Publication of CN102375834B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an audio file retrieval method and system and an audio file type recognition method and system. The audio file retrieval method comprises: determining in advance the Mel-frequency cepstral coefficients (MFCC) and/or the musical-note feature (Pitch) coefficients of each audio file in an audio file library; extracting the MFCC and/or Pitch coefficients of the audio file to be retrieved; matching them against the MFCC and/or Pitch coefficients of each audio file in the audio file library; and retrieving an audio file from the audio file library according to the matching result. With the invention, audio files can be retrieved by tune, audio files of a given type can be retrieved, or the efficiency of identifying which type an audio file belongs to can be improved.

Description

Audio file retrieval method and system, and audio file type recognition method and system
Technical field
The present invention relates to the field of audio file retrieval, and in particular to an audio file retrieval method and system and an audio file type recognition method and system.
Background art
The core task of audio file retrieval is to retrieve, from an audio file library, the audio files that meet the user's needs.
At present, an audio file retrieval system can retrieve audio files only by attributes supplied by the user, such as the audio file title or the artist name. This approach has the following shortcomings:
First, when the user does not know attributes such as the title or artist name of an audio file and can only hum its tune, existing audio retrieval systems cannot retrieve the corresponding audio file from the tune.
Second, when the user is not looking for one specific audio file but for audio files that express a certain mood (artistic conception), existing audio retrieval systems likewise cannot retrieve the corresponding audio files.
In addition, identifying the mood type of every audio file by hand consumes a great deal of manpower and is inefficient.
Summary of the invention
In view of this, the invention provides an audio file retrieval method and system and an audio file type recognition method and system, so that audio files can be retrieved by tune, audio files of a given type can be retrieved, or the efficiency of identifying which type an audio file belongs to can be improved.
An audio file type recognition method for labelling the type to which each audio file in an audio file library belongs, the method comprising:
storing, in a feature database, the MFCC and/or Pitch coefficients of the feature audio files corresponding to each audio type;
extracting the MFCC and/or Pitch coefficients of the audio file to be identified; determining the sequence of operations that must be performed to convert the MFCC and/or Pitch coefficients of the audio file to be identified into the MFCC and/or Pitch coefficients of a feature audio file, and the minimum cost incurred by each operation type, where the operation type is any one of copy, replace, delete, insert, and swap; following the order of the operations in the sequence, iterating in turn the minimum cost incurred when the corresponding operation type is the last operation; obtaining from the iteration result the minimum cost of converting the MFCC and/or Pitch coefficients of the audio file to be identified into those of the feature audio file, where the smaller this minimum cost, the higher the matching degree between the audio file to be identified and that feature audio file; and identifying the audio type of the audio file to be identified according to the matching degree and the audio type to which the feature audio file belongs.
An audio file type recognition system comprising a feature database, an audio feature extraction module, a matching module, and a type identification module;
the feature database stores the MFCC and/or Pitch coefficients of the feature audio files corresponding to each audio type;
the audio feature extraction module extracts the MFCC and/or Pitch coefficients of the audio file to be identified;
the matching module determines the sequence of operations that must be performed to convert the MFCC and/or Pitch coefficients of the audio file to be identified into the MFCC and/or Pitch coefficients of a feature audio file, and the minimum cost incurred by each operation type, where the operation type is any one of copy, replace, delete, insert, and swap; following the order of the operations in the sequence, it iterates in turn the minimum cost incurred when the corresponding operation type is the last operation, and obtains from the iteration result the minimum cost of converting the MFCC and/or Pitch coefficients of the audio file to be identified into those of the feature audio file, where the smaller this minimum cost, the higher the matching degree between the audio file to be identified and that feature audio file;
the type identification module identifies the audio type of the audio file to be identified according to the matching degree obtained by the matching module and the audio type to which the feature audio file belongs.
An audio file retrieval method, comprising:
storing, in a feature database, the MFCC and/or Pitch coefficients of the feature audio files corresponding to each audio type;
extracting in advance the MFCC and/or Pitch coefficients of the audio files in an audio file library; determining the sequence of operations that must be performed to convert the MFCC and/or Pitch coefficients of an audio file in the library into the MFCC and/or Pitch coefficients of a feature audio file, and the minimum cost incurred by each operation type, where the operation type is any one of copy, replace, delete, insert, and swap; following the order of the operations in the sequence, iterating in turn the minimum cost incurred when the corresponding operation type is the last operation; obtaining from the iteration result the minimum cost of converting the MFCC and/or Pitch coefficients of the audio file in the library into those of the feature audio file, where the smaller this minimum cost, the higher the matching degree between the audio file in the library and that feature audio file; and identifying and storing the audio type of each audio file in the library according to the matching degree and the audio type to which the feature audio file belongs;
receiving an audio type to be retrieved, and retrieving, according to the stored audio types of the audio files in the library, the audio files that belong to the audio type to be retrieved.
An audio file retrieval system comprising a feature database, an audio file type identification module, an audio file type storage module, and a retrieval module;
the feature database stores the MFCC and/or Pitch coefficients of the feature audio files corresponding to each audio type;
the audio file type identification module extracts in advance the MFCC and/or Pitch coefficients of the audio files in the audio file library; determines the sequence of operations that must be performed to convert the MFCC and/or Pitch coefficients of an audio file in the library into the MFCC and/or Pitch coefficients of a feature audio file, and the minimum cost incurred by each operation type, where the operation type is any one of copy, replace, delete, insert, and swap; following the order of the operations in the sequence, iterates in turn the minimum cost incurred when the corresponding operation type is the last operation; obtains from the iteration result the minimum cost of converting the MFCC and/or Pitch coefficients of the audio file in the library into those of the feature audio file, where the smaller this minimum cost, the higher the matching degree between the audio file in the library and that feature audio file; and identifies the audio type of each audio file in the library according to the matching degree and the audio type to which the feature audio file belongs;
the audio file type storage module stores the audio type of each audio file in the library according to the recognition result of the audio file type identification module;
the retrieval module receives an audio type to be retrieved and, according to the audio type of each audio file stored in the audio file type storage module, retrieves the audio files that belong to the audio type to be retrieved.
As can be seen from the above technical solutions, the invention extracts the audio feature data of an audio file, namely its MFCC and Pitch coefficients, and compares the audio feature data of the audio file to be retrieved with that of the audio files in the library, so that the audio retrieval system can retrieve audio files by tune.
When the invention matches the audio feature data of each audio file in the library against the audio feature data of the feature audio files of each type in the feature database and identifies, from the matching result, which type an audio file belongs to, no manual identification of each audio file is needed, so the efficiency of identifying the type of an audio file is improved.
The audio type recognition method provided by the invention can also be used to identify and store the audio type of each audio file in the audio file library, so that the corresponding audio files can be retrieved by the audio type to be retrieved.
Brief description of the drawings
Fig. 1 is a flowchart of an audio file retrieval method provided by the invention.
Fig. 2 is a flowchart of the MFCC coefficient extraction method.
Fig. 3 is a flowchart of the Pitch coefficient extraction method.
Fig. 4 is a schematic diagram of the composition of an audio file retrieval system provided by the invention.
Fig. 5 is a flowchart of the audio file type recognition method provided by the invention.
Fig. 6 is a schematic diagram of the composition of the audio file type recognition system provided by the invention.
Fig. 7 is a flowchart of a further audio file retrieval method provided by the invention.
Fig. 8 is a schematic diagram of the composition of a further audio file retrieval system provided by the invention.
Detailed description
Fig. 1 is a flowchart of an audio file retrieval method provided by the invention.
As shown in Fig. 1, the method comprises:
Step 101: determine in advance the MFCC and Pitch coefficients of each audio file in the audio file library.
Step 102: extract the MFCC and Pitch coefficients of the audio file to be retrieved.
Step 103: match the MFCC and Pitch coefficients of the audio file to be retrieved against the MFCC and Pitch coefficients of each audio file in the audio file library.
Step 104: retrieve an audio file from the audio file library according to the matching result.
Mel-frequency cepstral coefficients (MFCC) are based on the auditory characteristics of the human ear; they use a nonlinear frequency scale (the Mel frequency scale) to model the human auditory system. Acoustic research shows that MFCC coefficients reflect audio characteristics well.
The audio feature data on which the method of Fig. 1 bases its retrieval comprises both the MFCC and the Pitch coefficients of the audio file; in practice, retrieval may also be based on the MFCC coefficients alone or on the Pitch coefficients alone.
Fig. 2 is a flowchart of the MFCC coefficient extraction method.
As shown in Fig. 2, the method comprises:
Step 201: apply pre-emphasis to the input signal of the audio file to be retrieved.
The purpose of pre-emphasis is to flatten the spectrum of the signal so that the spectrum can be computed with the same signal-to-noise ratio across the whole band from low to high frequencies, which facilitates spectral analysis or channel parameter analysis. Pre-emphasis can be carried out with existing techniques.
Step 202: apply windowing to the pre-emphasized signal.
A sound signal is a typical non-stationary signal. A window function (for example a Hamming window) is generally used to cut out a segment for analysis, and the segment cut out is regarded as short-term stationary.
Step 203: transform the windowed signal into a frequency-domain signal.
In this step, the windowed signal can be transformed into a frequency-domain signal by a fast Fourier transform (FFT) or a discrete Fourier transform (DFT).
Step 204: perform a scale conversion on the frequency-domain signal.
Step 205: filter the scale-converted frequency-domain signal.
Step 206: transform the filtered frequency-domain signal into a time-domain signal; this time-domain signal is the MFCC coefficients.
In this step, a discrete cosine transform (DCT) is applied to the filtered frequency-domain signal to obtain the time-domain signal, which is the MFCC coefficients.
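The flow of steps 201 to 206 can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the pre-emphasis factor 0.97, the Mel formula 2595·log10(1 + f/700), the triangular filter bank, the logarithm taken before the DCT, and all function names are conventional choices that the text above does not specify. A naive DFT is used in place of an FFT to keep the sketch dependency-free.

```python
import cmath
import math

def pre_emphasis(signal, alpha=0.97):
    # Step 201: y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value).
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]

def hamming(frame):
    # Step 202: taper the frame with a Hamming window.
    n = len(frame)
    return [frame[k] * (0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1))) for k in range(n)]

def power_spectrum(frame):
    # Step 203: naive DFT; keep bins 0..N/2 of the power spectrum.
    n = len(frame)
    spec = []
    for k in range(n // 2 + 1):
        s = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(s) ** 2 / n)
    return spec

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank(num_filters, n_fft, sample_rate):
    # Steps 204-205 (assumed reading): triangular filters equally spaced on the Mel scale.
    top = hz_to_mel(sample_rate / 2)
    points = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [int((n_fft + 1) * p / sample_rate) for p in points]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def dct(values, num_coeffs):
    # Step 206: DCT-II of the log filter-bank energies.
    n = len(values)
    return [sum(v * math.cos(math.pi * c * (i + 0.5) / n) for i, v in enumerate(values))
            for c in range(num_coeffs)]

def mfcc_frame(frame, sample_rate, num_filters=20, num_coeffs=12):
    # Runs steps 201-206 on a single frame and returns num_coeffs coefficients.
    spec = power_spectrum(hamming(pre_emphasis(frame)))
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # The small floor avoids log(0) for filters that capture no energy.
    energies = [math.log(max(sum(w * p for w, p in zip(row, spec)), 1e-10)) for row in bank]
    return dct(energies, num_coeffs)
```

Calling `mfcc_frame` on successive frames of a file would yield the coefficient sequence that the matching step later compares.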
Fig. 3 is a flowchart of the Pitch coefficient extraction method.
As shown in Fig. 3, the method comprises:
Step 301: apply pre-emphasis and windowing to the input audio time-domain signal, and transform the windowed time-domain signal into a frequency-domain signal.
Step 302: determine the main harmonic frequency of the frequency-domain signal.
This step can be implemented with existing techniques: compute the amplitude of the frequency-domain signal at each frequency point (the larger the amplitude, the higher the energy), and extract, for each frame, the frequency value of the point with the greatest energy; this value is the main harmonic frequency of the frame.
Step 303: map the main harmonic frequency to the fundamental frequency of each note in the octave table, and determine from the mapping result the pitch melody of the audio time-domain signal input in step 301.
Table 1 is the octave table:
Table 1
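Steps 302 and 303 can be illustrated with the following sketch. It is a simplification under stated assumptions: pre-emphasis and windowing from step 301 are omitted for brevity, the octave table is replaced by twelve-tone equal temperament with A4 = 440 Hz (the patent's own table is not reproduced here), and the function names and the frame length are hypothetical.

```python
import cmath
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def dominant_frequency(frame, sample_rate):
    # Step 302: the frequency bin with the largest amplitude in the frame.
    n = len(frame)
    best_bin, best_amp = 1, -1.0
    for k in range(1, n // 2):  # skip the DC bin
        amp = abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        if amp > best_amp:
            best_bin, best_amp = k, amp
    return best_bin * sample_rate / n

def nearest_note(freq, a4=440.0):
    # Step 303: closest equal-tempered note (assumed tuning, not the patent's table).
    midi = round(69 + 12 * math.log2(freq / a4))
    return NOTE_NAMES[midi % 12] + str(midi // 12 - 1)

def pitch_sequence(signal, sample_rate, frame_len=400):
    # One note per non-overlapping frame gives the pitch melody of the signal.
    notes = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        notes.append(nearest_note(dominant_frequency(frame, sample_rate)))
    return notes
```

The resulting note sequence is the kind of symbolic sequence that the cost-based matching described in the text operates on.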
Matching the MFCC and Pitch coefficients of the audio file to be retrieved against the MFCC and Pitch coefficients of each audio file in the audio file library, and retrieving an audio file from the library according to the matching result, may comprise:
determining the minimum cost of converting the MFCC and Pitch coefficients of the audio file to be retrieved into the MFCC and Pitch coefficients of an audio file in the library; comparing the minimum costs corresponding to the audio files in the library; and taking the audio file whose minimum cost is the smallest as the retrieval result.
In brief, the invention determines the similarity of two sequences, that is, their matching degree, by computing the minimum cost of converting one sequence into the other: the smaller the minimum cost, the more similar the two sequences are and the better they match.
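The selection rule above (take the library file whose conversion cost is smallest) can be sketched as follows. The cost function here is a plain unit-cost edit distance used as a stand-in; the names `edit_cost` and `retrieve` and the dictionary layout of the library are assumptions made for illustration.

```python
def edit_cost(x, y):
    # Unit-cost edit distance (insert, delete, replace); a stand-in cost model.
    prev = list(range(len(y) + 1))
    for i, xi in enumerate(x, start=1):
        cur = [i]
        for j, yj in enumerate(y, start=1):
            cur.append(min(prev[j] + 1,                 # delete xi
                           cur[j - 1] + 1,              # insert yj
                           prev[j - 1] + (xi != yj)))   # copy (0) or replace (1)
        prev = cur
    return prev[-1]

def retrieve(query, library):
    # library maps a file name to its stored coefficient sequence.
    costs = {name: edit_cost(query, seq) for name, seq in library.items()}
    best = min(costs, key=costs.get)  # smallest cost = best match
    return best, costs
```

A hummed query that differs from a stored melody by one note would then rank that melody ahead of unrelated ones.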
Specifically, the invention can determine the sequence of operations that must be performed to convert the MFCC coefficients of the audio file to be retrieved into the MFCC coefficients of an audio file in the library, or to convert the Pitch coefficients of the audio file to be retrieved into the Pitch coefficients of an audio file in the library, together with the minimum cost incurred by each operation type. Following the order of the operations in the sequence, it iterates in turn the minimum cost incurred by the corresponding operation type, and obtains from the iteration result the minimum cost of converting the MFCC (or Pitch) coefficients of the audio file to be retrieved into the MFCC (or Pitch) coefficients of the audio file in the library.
For example, let the MFCC or Pitch coefficients of the audio file to be retrieved be represented by the sequence x[i], let the MFCC or Pitch coefficients of an audio file in the library be represented by the sequence y[j], and let c[i, j] be the minimum cost of converting x[i] into y[j], where i and j are natural numbers. The minimum cost incurred by each operation type is then determined as follows:
When the last operation in converting x[i] into y[j] is a copy, c[i, j] equals c[i-1, j-1] plus the minimum cost of the copy operation.
When the last operation in converting x[i] into y[j] is a replace, c[i, j] equals c[i-1, j-1] plus the minimum cost of the replace operation.
When the last operation in converting x[i] into y[j] is a delete, c[i, j] equals c[i-1, j] plus the minimum cost of the delete operation.
When the last operation in converting x[i] into y[j] is an insert, c[i, j] equals c[i, j-1] plus the minimum cost of the insert operation.
When the last operation in converting x[i] into y[j] is a swap, c[i, j] equals c[i-2, j-2] plus the minimum cost of the swap operation.
The minimum cost of deleting a first string from sequence x[i] and inserting a second string of length s into sequence y[j] equals the minimum cost of deleting the first string plus the minimum cost of s insert operations.
With the method of Fig. 1, the audio file corresponding to a tune hummed by the user can be retrieved. Based on the method of Fig. 1, the invention also provides a corresponding audio file retrieval system; see Fig. 4.
Fig. 4 is a schematic diagram of the composition of an audio file retrieval system provided by the invention.
As shown in Fig. 4, the system comprises an audio file library 401, an audio feature extraction module 402, a matching module 403, and a retrieval module 404.
The audio file library 401 stores the MFCC and/or Pitch coefficients of each audio file.
The audio feature extraction module 402 extracts the MFCC and/or Pitch coefficients of the audio file to be retrieved.
The matching module 403 matches the MFCC and/or Pitch coefficients of the audio file to be retrieved against the MFCC and/or Pitch coefficients of each audio file in the audio file library.
The retrieval module 404 retrieves an audio file from the audio file library 401 according to the matching result of the matching module 403.
The invention also provides an audio file type recognition method and system; see Figs. 5 and 6.
Fig. 5 is a flowchart of the audio file type recognition method provided by the invention.
As shown in Fig. 5, the method comprises:
Step 501: store, in a feature database, the MFCC and/or Pitch coefficients of the feature audio files corresponding to each audio type.
This step is a pre-processing step completed before audio file type identification is carried out. Unless the feature database needs to be updated, each identification starts directly at step 502.
Step 502: extract the MFCC and/or Pitch coefficients of the audio file to be identified.
Step 503: match the MFCC and/or Pitch coefficients of the audio file to be identified against the MFCC and/or Pitch coefficients of each feature audio file.
Step 504: identify the audio type of the audio file to be identified according to the matching degree and the audio type to which each feature audio file belongs.
In this step, for each type, the number of feature audio files of that type whose matching degree with the audio file to be identified exceeds a predetermined value can be determined; the audio file to be identified is then identified as belonging to the audio types for which this number exceeds a predetermined value or ranks in the top N.
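The counting rule of step 504 can be sketched as follows. The mapping from minimum cost to matching degree (`1 / (1 + cost)`) is an assumption chosen only so that a smaller cost gives a higher degree; the function names, parameter names, and data layout are likewise hypothetical.

```python
from collections import Counter

def matching_degree(cost):
    # Assumed mapping: a smaller minimum cost gives a degree closer to 1.
    return 1.0 / (1.0 + cost)

def identify_types(costs_by_feature, type_of_feature, degree_threshold,
                   count_threshold=0, top_n=None):
    # costs_by_feature: feature-file name -> minimum cost against the file to identify.
    # type_of_feature:  feature-file name -> audio type of that feature file.
    counts = Counter()
    for name, cost in costs_by_feature.items():
        if matching_degree(cost) > degree_threshold:
            counts[type_of_feature[name]] += 1
    if top_n is not None:
        # Variant 1: the N types with the most well-matching feature files.
        return [t for t, _ in counts.most_common(top_n)]
    # Variant 2: every type whose count exceeds the predetermined value.
    return sorted(t for t, k in counts.items() if k > count_threshold)
```

Both variants named in the text are covered: pass `top_n` for the ranking rule, or `count_threshold` for the fixed-count rule.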
Matching the MFCC and/or Pitch coefficients of the audio file to be identified against the MFCC and/or Pitch coefficients of each feature audio file comprises: determining the minimum cost of converting the MFCC and/or Pitch coefficients of the audio file to be identified into the coefficients of the same kind (MFCC and/or Pitch) of the feature audio file, where the smaller this minimum cost, the higher the matching degree between the audio file to be identified and that feature audio file.
Specifically, the sequence of operations that must be performed to convert the MFCC and/or Pitch coefficients of the audio file to be identified into the MFCC and/or Pitch coefficients of the feature audio file can be determined, together with the minimum cost incurred by each operation type. Following the order of the operations in the sequence, the minimum cost incurred by the corresponding operation type is iterated in turn, and the minimum cost of converting the MFCC and/or Pitch coefficients of the audio file to be identified into those of the feature audio file is obtained from the iteration result.
For how the minimum cost incurred by each operation type is determined, see the description of the method shown in Fig. 1.
Fig. 6 is a schematic diagram of the composition of the audio file type recognition system provided by the invention.
As shown in Fig. 6, the system comprises a feature database 601, an audio feature extraction module 602, a matching module 603, and a type identification module 604.
The feature database 601 stores the MFCC and/or Pitch coefficients of the feature audio files corresponding to each audio type.
The audio feature extraction module 602 extracts the MFCC and/or Pitch coefficients of the audio file to be identified.
The matching module 603 matches the MFCC and/or Pitch coefficients of the audio file to be identified against the MFCC and/or Pitch coefficients of each feature audio file.
The type identification module 604 identifies the audio type of the audio file to be identified according to the matching degree obtained by the matching module 603 and the audio type to which each feature audio file belongs.
The type identification module 604 can determine, for each type, the number of feature audio files of that type whose matching degree with the audio file to be identified exceeds a predetermined value, and identify the audio file to be identified as belonging to the audio types for which this number exceeds a predetermined value or ranks in the top N.
By applying the method of Fig. 5 or the system of Fig. 6, the type of an audio file, for example which mood it expresses (such as happiness, sorrow, or excitement), can be identified automatically, which improves the efficiency of audio file type identification.
With the scheme of Figs. 5 and 6, the type of each audio file in the audio file library can be labelled, so audio files of a given type can be retrieved according to a type keyword entered by the user; see Figs. 7 and 8.
Fig. 7 is a flowchart of a further audio file retrieval method provided by the invention.
As shown in Fig. 7, the method comprises:
Step 701: store, in a feature database, the MFCC and/or Pitch coefficients of the feature audio files corresponding to each audio type.
Step 702: extract in advance the MFCC and/or Pitch coefficients of the audio files in the audio file library, match them against the MFCC and/or Pitch coefficients of each feature audio file, and identify and store the audio type of each audio file in the library according to the matching degree and the audio type to which each feature audio file belongs.
Steps 701 and 702 are completed in advance, before any retrieval; each time audio files of a given type are to be retrieved, step 703 can be performed directly.
Step 703: receive an audio type to be retrieved and, according to the stored audio types of the audio files in the library, retrieve the audio files that belong to the audio type to be retrieved.
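Because steps 701 and 702 run ahead of time, step 703 reduces to a lookup. A minimal sketch, assuming the identified types are held in a dictionary and using hypothetical function names:

```python
from collections import defaultdict

def build_type_index(identified_types):
    # identified_types: audio file name -> list of audio types found in steps 701-702.
    index = defaultdict(list)
    for name, types in identified_types.items():
        for t in types:
            index[t].append(name)
    return index

def retrieve_by_type(index, wanted_type):
    # Step 703: a lookup only; no feature extraction or matching at query time.
    return sorted(index.get(wanted_type, []))
```

The design choice illustrated here is that the expensive matching is paid once per library file, not once per query.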
Fig. 8 is audio file searching system provided by the invention composition schematic diagram.
As described in Figure 8, this system comprises feature database 801, audio file type identification module 802, audio file type memory module 803 and retrieval module 804.
Feature database 801, stores cepstrum frequency MFCC coefficient and/or the musical note feature Pitch coefficient of various audio types characteristic of correspondence audio file.
Audio file type identification module 802, extract cepstrum frequency MFCC coefficient and/or the musical note feature Pitch coefficient of the audio file in audio file library in advance, the cepstrum frequency MFCC coefficient of the cepstrum frequency MFCC coefficient of the audio file in audio file library and/or musical note feature Pitch coefficient and each distinctive tone frequency file and/or musical note feature Pitch coefficient are mated, the audio types of the audio types identification audio file library sound intermediate frequency file belonging to matching degree and distinctive tone frequency file.
Audio file type memory module 803, according to the audio types of the recognition result storing audio files storehouse sound intermediate frequency file of described audio file type identification module.
Retrieval module 804, receives audio types to be retrieved, according to the audio types of each audio file stored in described audio file type memory module, retrieves the audio file belonging to described audio types to be retrieved.
When storing information about audio files, three groups of files can be used for each audio file: feature files, an attribute file, and a semantic file. The feature files store the audio feature data of the audio file, such as the MFCC coefficients and the Pitch coefficients. Each kind of audio feature data can be stored as its own feature file; for example, the MFCC coefficients and the Pitch coefficients are stored separately as an MFCC coefficient file and a Pitch coefficient file. The attribute file stores the basic attributes of the audio file, such as its size, the file name with extension, the file name without extension, the singer name, the album name, the semantic annotation phrases, and the lyrics. The semantic annotation words indicate the audio type identified by the type identification method provided by the present invention, for example happy, sad, or excited. The semantic file stores a group of semantic annotation words that semantically describe the audio file.
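The three groups of files could be laid out as in the sketch below. The source does not specify a file format or naming scheme, so the JSON encoding, the suffixes `.mfcc.json`, `.pitch.json`, `.attr.json`, and `.sem.json`, and the function names are all assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def store_audio_info(root: Path, stem: str,
                     mfcc: List[List[float]], pitch: List[float],
                     attributes: Dict[str, Any],
                     semantic_words: List[str]) -> List[Path]:
    """Write the feature files, attribute file, and semantic file for one audio file."""
    root.mkdir(parents=True, exist_ok=True)
    payloads = {
        f"{stem}.mfcc.json": mfcc,            # feature file: MFCC coefficients
        f"{stem}.pitch.json": pitch,          # feature file: Pitch coefficients
        f"{stem}.attr.json": attributes,      # attribute file: size, names, lyrics...
        f"{stem}.sem.json": semantic_words,   # semantic file: annotation words
    }
    written = []
    for name, payload in payloads.items():
        path = root / name
        path.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")
        written.append(path)
    return written


def load_semantic_words(root: Path, stem: str) -> List[str]:
    """Read back the semantic annotation words stored for one audio file."""
    return json.loads((root / f"{stem}.sem.json").read_text(encoding="utf-8"))
```

Keeping each kind of feature data in its own file means a matcher that only needs Pitch coefficients can skip loading the usually larger MFCC data.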
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (6)

1. An audio file type identification method, characterized in that the method is used to label the type to which each audio file in an audio file library belongs, the method comprising:
storing, in a feature database, the MFCC coefficients and/or Pitch coefficients of the characteristic audio files corresponding to various audio types, wherein the audio types include artistic conception types;
extracting the MFCC coefficients and/or Pitch coefficients of an audio file to be identified; determining the sequence of operations that needs to be performed to convert the MFCC coefficients and/or Pitch coefficients of the audio file to be identified into the MFCC coefficients and/or Pitch coefficients of a characteristic audio file, and the minimal overhead produced by each operation type, wherein the operation type is any one of copy, replace, delete, insert, and exchange; according to the order of the operations in the operation sequence, successively iterating the minimal overhead produced when the corresponding operation type is the last operation; obtaining, from the iteration result, the minimal overhead of converting the MFCC coefficients and/or Pitch coefficients of the audio file to be identified into the MFCC coefficients and/or Pitch coefficients of the characteristic audio file, wherein the smaller the value of this minimal overhead, the higher the matching degree between the audio file to be identified and the characteristic audio file; and identifying the audio type of the audio file to be identified according to the matching degree and the audio type to which the characteristic audio file belongs, wherein the audio file to be identified includes an audio file in the audio file library.
2. The method according to claim 1, characterized in that identifying the audio type of the audio file to be identified according to the matching degree and the audio type to which the characteristic audio file belongs comprises:
determining, among the characteristic audio files of each type, the number of characteristic audio files whose matching degree with the audio file to be identified is greater than a predetermined value; and
identifying the type of the audio file to be identified as the audio type for which that number of characteristic audio files is greater than a predetermined value or ranks in the top N.
3. An audio file type identification system, characterized in that the system is used to label the type to which each audio file in an audio file library belongs, the system comprising a feature database, an audio feature extraction module, a matching module, and a type identification module;
the feature database stores the MFCC coefficients and/or Pitch coefficients of the characteristic audio files corresponding to various audio types, wherein the audio types include artistic conception types;
the audio feature extraction module extracts the MFCC coefficients and/or Pitch coefficients of an audio file to be identified;
the matching module determines the sequence of operations that needs to be performed to convert the MFCC coefficients and/or Pitch coefficients of the audio file to be identified into the MFCC coefficients and/or Pitch coefficients of a characteristic audio file, and the minimal overhead produced by each operation type, wherein the operation type is any one of copy, replace, delete, insert, and exchange; according to the order of the operations in the operation sequence, successively iterates the minimal overhead produced when the corresponding operation type is the last operation; and obtains, from the iteration result, the minimal overhead of converting the MFCC coefficients and/or Pitch coefficients of the audio file to be identified into the MFCC coefficients and/or Pitch coefficients of the characteristic audio file, wherein the smaller the value of this minimal overhead, the higher the matching degree between the audio file to be identified and the characteristic audio file;
the type identification module identifies the audio type of the audio file to be identified according to the matching degree obtained by the matching module and the audio type to which the characteristic audio file belongs;
wherein the audio file to be identified includes an audio file in the audio file library.
4. The system according to claim 3, characterized in that
the type identification module determines, among the characteristic audio files of each type, the number of characteristic audio files whose matching degree with the audio file to be identified is greater than a predetermined value, and identifies the type of the audio file to be identified as the audio type for which that number of characteristic audio files is greater than a predetermined value or ranks in the top N.
5. An audio file retrieval method, characterized in that the method comprises:
storing, in a feature database, the MFCC coefficients and/or Pitch coefficients of the characteristic audio files corresponding to various audio types, wherein the audio types include artistic conception types;
extracting in advance the MFCC coefficients and/or Pitch coefficients of an audio file in an audio file library; determining the sequence of operations that needs to be performed to convert the MFCC coefficients and/or Pitch coefficients of the audio file in the audio file library into the MFCC coefficients and/or Pitch coefficients of a characteristic audio file, and the minimal overhead produced by each operation type, wherein the operation type is any one of copy, replace, delete, insert, and exchange; according to the order of the operations in the operation sequence, successively iterating the minimal overhead produced when the corresponding operation type is the last operation; obtaining, from the iteration result, the minimal overhead of converting the MFCC coefficients and/or Pitch coefficients of the audio file in the audio file library into the MFCC coefficients and/or Pitch coefficients of the characteristic audio file, wherein the smaller the value of this minimal overhead, the higher the matching degree between the audio file in the audio file library and the characteristic audio file; and identifying and storing the audio type of the audio file in the audio file library according to the matching degree and the audio type to which the characteristic audio file belongs;
receiving an audio type to be retrieved and, according to the stored audio types of the audio files in the audio file library, retrieving the audio files that belong to the audio type to be retrieved.
6. An audio file retrieval system, characterized in that the system comprises a feature database, an audio file type identification module, an audio file type storage module, and a retrieval module;
the feature database stores the MFCC coefficients and/or Pitch coefficients of the characteristic audio files corresponding to various audio types, wherein the audio types include artistic conception types;
the audio file type identification module extracts in advance the MFCC coefficients and/or Pitch coefficients of an audio file in an audio file library; determines the sequence of operations that needs to be performed to convert the MFCC coefficients and/or Pitch coefficients of the audio file in the audio file library into the MFCC coefficients and/or Pitch coefficients of a characteristic audio file, and the minimal overhead produced by each operation type, wherein the operation type is any one of copy, replace, delete, insert, and exchange; according to the order of the operations in the operation sequence, successively iterates the minimal overhead produced when the corresponding operation type is the last operation; obtains, from the iteration result, the minimal overhead of converting the MFCC coefficients and/or Pitch coefficients of the audio file in the audio file library into the MFCC coefficients and/or Pitch coefficients of the characteristic audio file, wherein the smaller the value of this minimal overhead, the higher the matching degree between the audio file in the audio file library and the characteristic audio file; and identifies the audio type of the audio file in the audio file library according to the matching degree and the audio type to which the characteristic audio file belongs;
the audio file type storage module stores the audio type of each audio file in the audio file library according to the identification result of the audio file type identification module;
the retrieval module receives an audio type to be retrieved and, according to the audio type of each audio file stored in the audio file type storage module, retrieves the audio files that belong to the audio type to be retrieved.
CN201010256981.8A 2010-08-17 2010-08-17 Audio file search method, system and audio file type recognition methods, system Active CN102375834B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010256981.8A CN102375834B (en) 2010-08-17 2010-08-17 Audio file search method, system and audio file type recognition methods, system

Publications (2)

Publication Number Publication Date
CN102375834A CN102375834A (en) 2012-03-14
CN102375834B CN102375834B (en) 2016-01-20

Family

ID=45794457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010256981.8A Active CN102375834B (en) 2010-08-17 2010-08-17 Audio file search method, system and audio file type recognition methods, system

Country Status (1)

Country Link
CN (1) CN102375834B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant