CN102623007B - Audio characteristic classification method based on variable duration - Google Patents


Info

Publication number
CN102623007B
Authority
CN
China
Prior art keywords
rightarrow
vector
short
time characteristic
training sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201110033410.2A
Other languages
Chinese (zh)
Other versions
CN102623007A (en)
Inventor
卢敏
窦维蓓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN201110033410.2A
Publication of CN102623007A
Application granted
Publication of CN102623007B
Expired - Fee Related
Anticipated expiration


Abstract

The invention discloses an audio feature classification method based on variable duration, in the technical field of multimedia signal processing and pattern recognition. The method comprises the following steps: taking a labeled audio sequence of known type as a training sequence; extracting short-time features of the audio signal in the training sequence to form a short-time feature vector; calculating statistical parameters of each short-time feature over a set duration to obtain the statistical feature vector corresponding to the short-time feature vector; calculating a group of such statistical feature vectors, one per set duration, and forming the long-time feature vector of the training sequence from this group; training a classifier with the long-time feature vector of the training sequence; extracting the short-time features of the i-th frame of the audio signal in a test sequence and calculating the input long-time feature vector of the i-th frame of the test sequence; and feeding the input long-time feature vector of the i-th frame into the trained classifier to obtain the classification type. The method avoids the delay caused by long-time feature extraction and enables real-time classification of audio features.

Description

Audio feature classification method based on variable duration
Technical field
The invention belongs to the technical field of multimedia signal processing and pattern recognition, and in particular relates to an audio feature classification method based on variable duration.
Background
With the development of communication technology, digital audio processing is widely used in fields such as mobile communication, the Internet, broadcasting, and personal electronics. Audio coding, for example, has gradually expanded from traditional speech coding centered on narrowband speech to multimedia audio coding with wider bandwidth and higher quality, and the rise of 3G and LTE has placed further demands on the new generation of audio codecs in terms of channel adaptability, transmission reliability, and coding quality. Whether in audio coding or in sound effect editing, the diversity of audio signals means that different types of signal may call for different processing techniques. ITU-T G.718 and G.729.1, for instance, divide audio signals into speech and music coding modes, and the later G.718-SWB adds a coding mode for audio signals containing sinusoidal components. In some applications, therefore, the audio signal must first be classified simply and efficiently to determine its type.
Classification relies on short-time and long-time features extracted from the audio signal. Because audio signals are stationary only over short intervals, long-time features are generally more stable and more discriminative than short-time features, but they incur a large detection delay, which limits their use in real-time classification systems. In addition, different features may exhibit different stationary periods, so computing the long-time features of all of them over one fixed duration may not be optimal.
Summary of the invention
The object of the invention is to address the problem that commonly used audio feature classification methods rely mainly on extracting long-time features, which impairs real-time performance. The invention proposes an audio feature classification method based on variable duration: a variable-duration long-time feature, formed from the same statistical parameters of the same short-time features computed over different durations, is used to train a classifier, and the trained classifier is then used to classify audio features.
The technical solution of the invention is an audio feature classification method based on variable duration, characterized in that the method comprises the following steps:
Step 1: take an audio sequence whose type has been determined and labeled as the training sequence;
Step 2: extract the short-time features $F_1, F_2, \ldots, F_K$ of the audio signal in the training sequence to form the short-time feature vector $\vec{V}_S = [F_1, F_2, \ldots, F_K]^T$, where K is the number of components of the short-time feature vector;
Step 3: for each short-time feature $F_k$, calculate the statistical parameters of that feature over the current frame and the preceding (n-1) frames within a set duration, where n is the total number of frames in the set duration. Each short-time feature $F_k$ corresponds to a statistical feature vector $\vec{L}_k(n)$ formed from the statistical parameters of that feature, so the short-time feature vector $\vec{V}_S$ corresponds to a statistical feature vector $\vec{V}_L(n)$, where
$$\vec{V}_L(n) = \begin{bmatrix} \vec{L}_1(n) \\ \vec{L}_2(n) \\ \vdots \\ \vec{L}_K(n) \end{bmatrix}, \quad 1 \le k \le K;$$
Step 4: choose P values $N_1, N_2, \ldots, N_P$ satisfying $N_1 < N_2 < \ldots < N_P$; set n equal to $N_1, N_2, \ldots, N_P$ in turn and, following Step 3, calculate the group of statistical feature vectors $\vec{V}_L(N_1), \vec{V}_L(N_2), \ldots, \vec{V}_L(N_P)$ corresponding to the short-time feature vector $\vec{V}_S$; this group of statistical feature vectors forms the long-time feature vector of the training sequence:
$$\vec{V}_F = [\vec{V}_L^T(N_1), \vec{V}_L^T(N_2), \ldots, \vec{V}_L^T(N_P)]^T;$$
Step 5: train a classifier using the long-time feature vector $\vec{V}_F$ of the training sequence;
Step 6: extract the short-time features of the audio signal in the test sequence and, following Steps 2 and 3, calculate the statistical feature vector $\vec{V}_L(i)$ of the i-th frame of the test sequence and the statistical feature vectors $\vec{V}_L(N_1), \vec{V}_L(N_2), \ldots, \vec{V}_L(N_P)$ of the test sequence;
Step 7: from the statistical feature vector $\vec{V}_L(i)$ of the i-th frame of the test sequence and the vectors $\vec{V}_L(N_1), \vec{V}_L(N_2), \ldots, \vec{V}_L(N_P)$ of the test sequence, calculate the input long-time feature vector $\vec{V}_{IN}(i)$ of the i-th frame of the test sequence;
Step 8: feed the input long-time feature vector $\vec{V}_{IN}(i)$ of the i-th frame into the classifier trained in Step 5; its output is the classification type of the i-th frame.
The short-time features include log energy, zero-crossing rate, and uniform sub-band energy distribution.
The statistical parameters of the short-time feature over the current frame and the preceding (n-1) frames include one or more of the maximum $\mathrm{MaxF}_k(n)$, the minimum $\mathrm{MinF}_k(n)$, the arithmetic mean $\mathrm{AvgF}_k(n)$, or the variance $\mathrm{VarF}_k(n)$ of the short-time feature over those frames.
Training the classifier with the long-time feature vector $\vec{V}_F$ of the training sequence may specifically consist of training a single classifier with $\vec{V}_F$.
Training the classifier with the long-time feature vector $\vec{V}_F$ of the training sequence may instead use forward feature selection: effective features are selected from $\vec{V}_F$ to form an effective long-time feature vector, and the effective long-time feature vector is used to train a single classifier.
Training the classifier with the long-time feature vector $\vec{V}_F$ of the training sequence may also consist of training a separate single classifier of the same type on each component vector $\vec{V}_L(N_1), \vec{V}_L(N_2), \ldots, \vec{V}_L(N_P)$ of $\vec{V}_F$, and then combining these classifiers in parallel into a classifier group.
The input long-time feature vector $\vec{V}_{IN}(i)$ of the i-th frame of the test sequence is specifically calculated using the formula
$$\vec{V}_{IN}(i) = \begin{cases} [\vec{V}_L^T(i), \ldots, \vec{V}_L^T(i)]^T, & i < N_1 \\ [\vec{V}_L^T(N_1), \ldots, \vec{V}_L^T(N_q), \vec{V}_L^T(i), \ldots, \vec{V}_L^T(i)]^T, & N_1 < \ldots < N_q \le i < N_{q+1} < \ldots < N_P \\ \vec{V}_F, & i \ge N_P \end{cases}$$
where $q = 1, 2, \ldots, P-1$, and the vector $[\vec{V}_L^T(N_1), \ldots, \vec{V}_L^T(N_q), \vec{V}_L^T(i), \ldots, \vec{V}_L^T(i)]^T$ contains q terms of the form $\vec{V}_L^T(N_j)$ and P-q copies of $\vec{V}_L^T(i)$.
The single classifier is an independent-feature classifier based on the normal distribution.
The invention trains a classifier with a variable-duration long-time feature formed from the same statistical parameters of the same short-time features computed over different durations, and uses the trained classifier to classify audio features. This avoids the delay caused by extracting long-time features and achieves real-time classification of audio features.
Brief description of the drawings
Fig. 1 is a flowchart of the audio feature classification method based on variable duration;
Fig. 2 is a schematic diagram of training a single classifier with the long-time feature vector of the training sequence;
Fig. 3 is a schematic diagram of training a single classifier with the effective long-time feature vector formed from the effective features of the long-time feature vector of the training sequence;
Fig. 4 is a schematic diagram of a classifier group formed by training a single classifier of the same type on each component vector of the long-time feature vector of the training sequence and combining them in parallel;
Fig. 5 is the information table of the training sample library;
Fig. 6 is the information table of the test sample library;
Fig. 7 is the classifier performance comparison table.
Detailed description
The preferred embodiment is described in detail below with reference to the accompanying drawings. It should be emphasized that the following description is merely exemplary and is not intended to limit the scope of the invention or its applications.
The invention is described using speech/music classification at a 32 kHz sampling rate as an example. The invention applies equally to the classification of other types of audio signal.
Fig. 1 is a flowchart of the audio feature classification method based on variable duration. As shown in Fig. 1, the method comprises the following steps:
Step 1: take an audio sequence whose type has been determined and labeled as the training sequence.
Step 2: extract the short-time features $F_1, F_2, \ldots, F_K$ of the audio signal in the training sequence to form the short-time feature vector $\vec{V}_S = [F_1, F_2, \ldots, F_K]^T$, where K is the number of components of the short-time feature vector.
In this embodiment the audio signal is divided into frames of 40 ms each, and the short-time features calculated are log energy, zero-crossing rate, and uniform sub-band energy distribution. In the invention, the short-time features include but are not limited to these three.
Let the samples of the i-th frame of the audio signal be $x(n)$, $n = (i-1)L, (i-1)L+1, \ldots, iL-1$, where L is the frame length. The short-time features are calculated as follows:
A. Log energy
$$E_1(i) = \sum_{n=(i-1)L}^{iL-1} x^2(n)$$
$$E_2(i) = \max(\log[E_1(i)], -10)$$
B. Zero-crossing rate
$$ZCR(i) = \sum_{n=(i-1)L}^{iL-1} \frac{\mathrm{sign}(x(n) - x(n-1)) + 1}{2}$$
where $\mathrm{sign}(x)$ is the sign function:
$$\mathrm{sign}(x) = \begin{cases} 1, & x > 0 \\ 0, & x = 0 \\ -1, & x < 0 \end{cases}$$
C. Uniform sub-band energy distribution
$$SubE(i,k) = \sum_{m=(k-1)L/2K}^{kL/2K-1} X(i,m), \quad k = 1, 2, \ldots, K$$
where $X(i,m)$ is the magnitude spectrum of the i-th frame of the audio signal after the FFT:
$$X(i,m) = \left| \sum_{k=1}^{L} x((i-1)L + k - 1) \cdot \exp\left[-j \cdot \frac{2\pi}{L}(m-1)(k-1)\right] \right|, \quad m = 1, 2, \ldots, L$$
By the properties of the FFT of a real sequence, $X(i,m)$ is even-symmetric about $m = L/2+1$, so only the first (L/2+1) values need to be retained. In the sub-band formula, K is the number of uniform sub-bands; K = 16 in this embodiment.
When audio features are extracted in this embodiment, the short-time feature vector of the i-th frame is
$$\vec{V}_S(i) = \begin{bmatrix} E_2(i) \\ ZCR(i) \\ SubE(i,1) \\ \vdots \\ SubE(i,16) \end{bmatrix}$$
Its dimension is 18. $E_2(i), ZCR(i), SubE(i,1), \ldots, SubE(i,16)$ are the components $F_1, F_2, \ldots, F_{18}$ of the short-time feature vector of the i-th frame.
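The three short-time features above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the inventors' implementation: the function names, the `prev_sample` argument (the sample preceding the frame), the use of the natural logarithm, zero-based spectrum bins, and the naive DFT are all choices made here for a dependency-free example. The zero-crossing term is transcribed from the formula as printed.

```python
import cmath
import math


def magnitude_spectrum(x, num_bins):
    """Magnitudes of the first num_bins DFT coefficients (naive O(L^2) DFT)."""
    L = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * cmath.pi * m * n / L) for n in range(L)))
            for m in range(num_bins)]


def short_time_features(frame, prev_sample=0.0, num_bands=16):
    """Return [E2, ZCR, SubE_1, ..., SubE_num_bands] for one frame."""
    L = len(frame)

    # A. Log energy: E1 is the sum of squares, E2 = max(log E1, -10).
    #    Natural log is an assumption; the text only says "log".
    e1 = sum(s * s for s in frame)
    e2 = max(math.log(e1), -10.0) if e1 > 0 else -10.0

    # B. Zero-crossing rate, transcribed from the formula as printed:
    #    sum over the frame of [sign(x(n) - x(n-1)) + 1] / 2.
    def sign(v):
        return (v > 0) - (v < 0)

    previous = [prev_sample] + list(frame[:-1])
    zcr = sum((sign(cur - prev) + 1) / 2 for cur, prev in zip(frame, previous))

    # C. Uniform sub-band energies over the first L/2 spectrum bins
    #    (zero-based bins here; the text indexes bins from 1).
    band_width = L // (2 * num_bands)
    spectrum = magnitude_spectrum(frame, band_width * num_bands)
    sub_e = [sum(spectrum[k * band_width:(k + 1) * band_width])
             for k in range(num_bands)]

    return [e2, zcr] + sub_e
```

With 16 sub-bands the returned list has 18 entries, matching the dimension stated above. In practice an FFT routine would replace the naive DFT for 40 ms frames.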
Step 3: for each short-time feature $F_k$, calculate the statistical parameters of that feature over the current frame and the preceding (n-1) frames within a set duration, where n is the total number of frames in the set duration. Each short-time feature $F_k$ corresponds to a statistical feature vector $\vec{L}_k(n)$ formed from the statistical parameters of that feature, so the short-time feature vector $\vec{V}_S$ corresponds to a statistical feature vector $\vec{V}_L(n)$, where
$$\vec{V}_L(n) = \begin{bmatrix} \vec{L}_1(n) \\ \vec{L}_2(n) \\ \vdots \\ \vec{L}_K(n) \end{bmatrix}, \quad 1 \le k \le K.$$
The statistical parameters of the short-time feature over the current frame and the preceding (n-1) frames include one or more of the maximum $\mathrm{MaxF}_k(n)$, the minimum $\mathrm{MinF}_k(n)$, the arithmetic mean $\mathrm{AvgF}_k(n)$, or the variance $\mathrm{VarF}_k(n)$. In this embodiment the maximum and the variance are chosen as the statistical parameters, so each short-time feature $F_k$ corresponds to the statistical feature vector $\vec{L}_k(n) = [\mathrm{MaxF}_k(n), \mathrm{VarF}_k(n)]^T$. Step 2 of this embodiment yields 18 short-time features, and the statistical feature vector of each short-time feature has 2 components, so the statistical feature vector $\vec{V}_L(n)$ corresponding to the short-time feature vector $\vec{V}_S$ has 36 dimensions.
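A minimal sketch of Step 3 as used in this embodiment (maximum and variance per feature) might look as follows. The function name, the choice of population variance, and the behaviour when fewer than n frames are available are assumptions made for illustration; the text does not specify them.

```python
from statistics import pvariance


def statistical_vector(history, n):
    """Statistical feature vector over the current frame and the n-1 before it.

    history is a list of short-time feature vectors, oldest first, ending
    with the current frame. If fewer than n frames exist, all of them are
    used (an assumption of this sketch).
    """
    window = history[-n:]
    stats = []
    for k in range(len(window[0])):
        values = [frame[k] for frame in window]
        # Per-feature pair [max, variance]; population variance is assumed.
        stats.extend([max(values), pvariance(values)])
    return stats
```

For 18 short-time features the result has 36 entries, ordered as the max/variance pair of the first feature, then the pair of the second feature, and so on.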
Step 4: choose P values $N_1, N_2, \ldots, N_P$ satisfying $N_1 < N_2 < \ldots < N_P$; set n equal to $N_1, N_2, \ldots, N_P$ in turn and, following Step 3, calculate the group of statistical feature vectors $\vec{V}_L(N_1), \vec{V}_L(N_2), \ldots, \vec{V}_L(N_P)$ corresponding to the short-time feature vector $\vec{V}_S$; this group of statistical feature vectors forms the long-time feature vector of the training sequence
$$\vec{V}_F = [\vec{V}_L^T(N_1), \vec{V}_L^T(N_2), \ldots, \vec{V}_L^T(N_P)]^T.$$
In this embodiment P = 3, $N_1 = 5$, $N_2 = 15$, and $N_3 = 25$, which gives a group of 3 statistical feature vectors $\vec{V}_L(5)$, $\vec{V}_L(15)$, $\vec{V}_L(25)$ for the short-time feature vector of the i-th frame, each of 36 dimensions. This group then forms the long-time feature vector of the training sequence, $\vec{V}_F = [\vec{V}_L^T(5), \vec{V}_L^T(15), \vec{V}_L^T(25)]^T$, which has 108 dimensions.
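Step 4 amounts to concatenating the Step 3 statistics computed over each of the chosen durations. The sketch below assumes max and variance as the statistics (as in this embodiment) and uses hypothetical names (`DURATIONS`, `window_stats`, `long_time_vector`); population variance is again an assumption.

```python
from statistics import pvariance

DURATIONS = (5, 15, 25)  # N_1 < N_2 < N_3 of this embodiment, in frames


def window_stats(history, n):
    """[max, variance] of every feature over the last n frames of history."""
    window = history[-n:]
    stats = []
    for column in zip(*window):  # one column per short-time feature
        stats.extend([max(column), pvariance(column)])
    return stats


def long_time_vector(history, durations=DURATIONS):
    """Concatenate the statistics for each duration, shortest first."""
    vec = []
    for n in durations:
        vec.extend(window_stats(history, n))
    return vec
```

With 18 short-time features and three durations, the result has 3 x 36 = 108 entries, matching the dimension given above.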
Step 5: train a classifier using the long-time feature vector $\vec{V}_F$ of the training sequence.
Once the long-time feature vector $\vec{V}_F$ of the training sequence has been obtained, known techniques can be used to train the classifier with it.
Fig. 2 is a schematic diagram of training a single classifier with the long-time feature vector of the training sequence. As shown in Fig. 2, the long-time feature vector $\vec{V}_F$ of the training sequence can be used directly to train a single classifier.
Fig. 3 is a schematic diagram of training a single classifier with the effective long-time feature vector formed from the effective features of the long-time feature vector of the training sequence. As shown in Fig. 3, forward feature selection can also be used: effective features are selected from the long-time feature vector $\vec{V}_F$ of the training sequence to form an effective long-time feature vector, and the effective long-time feature vector is used to train a single classifier.
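The text does not say which criterion drives the forward feature selection, so the following is only a generic greedy sketch. The `score` callback is hypothetical: it could, for example, return the validation accuracy of a classifier trained on the candidate feature indices.

```python
def forward_select(num_features, target_count, score):
    """Greedy sequential forward selection.

    score(indices) returns a number, higher being better. Starting from an
    empty set, the single feature that most improves the score is added
    until target_count features have been chosen.
    """
    selected = []
    remaining = list(range(num_features))
    while remaining and len(selected) < target_count:
        # Try each unused feature together with those already selected.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

For the embodiment described here, the call would be along the lines of `forward_select(108, 36, score)`, where `score` is whatever evaluation the implementer chooses.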
Fig. 4 is a schematic diagram of a classifier group formed by training a single classifier of the same type on each component vector of the long-time feature vector of the training sequence and combining them in parallel. As shown in Fig. 4, the component vectors $\vec{V}_L(N_1), \vec{V}_L(N_2), \ldots, \vec{V}_L(N_P)$ of the long-time feature vector $\vec{V}_F$ of the training sequence can also each be used to train a separate single classifier of the same type, and these classifiers are then combined in parallel into a classifier group.
In this embodiment the single classifier is an independent-feature classifier based on the normal distribution; the invention applies equally to other classifiers. The classifiers are trained using the methods shown in Fig. 3 and Fig. 4. Using forward feature selection, 36 effective features are selected from the 108 features of the long-time feature vector $\vec{V}_F$ of the training sequence to form the effective long-time feature vector, which is used to train a single classifier. In addition, classifiers of the same type are trained independently with $\vec{V}_L(N_1)$, $\vec{V}_L(N_2)$, and $\vec{V}_L(N_3)$ respectively as the classification feature vectors.
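One plausible reading of "independent-feature classifier based on the normal distribution" is a Gaussian naive Bayes classifier, in which each feature is modelled per class by its own normal density. The sketch below follows that reading; the class name, the variance floor `min_var`, and the use of class frequencies as priors are assumptions, not details from the text.

```python
import math
from collections import defaultdict


class GaussianIndependentClassifier:
    """Per-class, per-feature normal densities, combined as if independent."""

    def fit(self, vectors, labels, min_var=1e-6):
        grouped = defaultdict(list)
        for vec, label in zip(vectors, labels):
            grouped[label].append(vec)
        total = len(vectors)
        self.params = {}
        for label, rows in grouped.items():
            count = len(rows)
            means = [sum(col) / count for col in zip(*rows)]
            # Floor the variance so constant features do not divide by zero.
            variances = [max(sum((v - m) ** 2 for v in col) / count, min_var)
                         for col, m in zip(zip(*rows), means)]
            self.params[label] = (math.log(count / total), means, variances)
        return self

    def predict(self, vec):
        def log_posterior(label):
            log_prior, means, variances = self.params[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
                for x, m, var in zip(vec, means, variances))
        # Return the class with the highest (unnormalized) log posterior.
        return max(self.params, key=log_posterior)
```

The same class could be trained once on the selected 36 features and once on each 36-dimensional component vector; how the parallel outputs are combined is not specified in the text and is left open here.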
Step 6: extract the short-time features of the audio signal in the test sequence and, following Steps 2 and 3, calculate the statistical feature vector $\vec{V}_L(i)$ of the i-th frame of the test sequence and the statistical feature vectors $\vec{V}_L(N_1), \vec{V}_L(N_2), \ldots, \vec{V}_L(N_P)$ of the test sequence.
Step 7: from the statistical feature vector $\vec{V}_L(i)$ of the i-th frame of the test sequence and the vectors $\vec{V}_L(N_1), \vec{V}_L(N_2), \ldots, \vec{V}_L(N_P)$ of the test sequence, calculate the input long-time feature vector $\vec{V}_{IN}(i)$ of the i-th frame of the test sequence.
The input long-time feature vector $\vec{V}_{IN}(i)$ of the i-th frame of the test sequence is specifically calculated using the formula
$$\vec{V}_{IN}(i) = \begin{cases} [\vec{V}_L^T(i), \ldots, \vec{V}_L^T(i)]^T, & i < N_1 \\ [\vec{V}_L^T(N_1), \ldots, \vec{V}_L^T(N_q), \vec{V}_L^T(i), \ldots, \vec{V}_L^T(i)]^T, & N_1 < \ldots < N_q \le i < N_{q+1} < \ldots < N_P \\ \vec{V}_F, & i \ge N_P \end{cases}$$
where $q = 1, 2, \ldots, P-1$, and the vector $[\vec{V}_L^T(N_1), \ldots, \vec{V}_L^T(N_q), \vec{V}_L^T(i), \ldots, \vec{V}_L^T(i)]^T$ contains q terms of the form $\vec{V}_L^T(N_j)$ and P-q copies of $\vec{V}_L^T(i)$.
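The piecewise rule for the input long-time feature vector can be read as: for every duration that the sequence has not yet reached, substitute the statistics over all i frames seen so far. Under that reading, each slot simply uses a window of min(N_j, i) frames. The sketch below assumes max and variance as the statistics and population variance; all names are illustrative.

```python
from statistics import pvariance


def window_stats(history, n):
    """[max, variance] of every feature over the last n frames of history."""
    stats = []
    for column in zip(*history[-n:]):
        stats.extend([max(column), pvariance(column)])
    return stats


def input_long_time_vector(history, durations=(5, 15, 25)):
    """Input vector for frame i, where i = len(history) frames are available.

    Slots whose duration N_j exceeds i fall back to the statistics over all
    i frames, which reproduces the three cases of the piecewise formula.
    """
    i = len(history)
    vec = []
    for n in durations:
        vec.extend(window_stats(history, min(n, i)))
    return vec
```

Because the vector always has the full length (108 in this embodiment), the classifier can be queried from the first frame onward instead of waiting for the longest window to fill.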
Step 8: feed the input long-time feature vector $\vec{V}_{IN}(i)$ of the i-th frame into the classifier trained in Step 5; its output is the classification type of the i-th frame.
The training sample library and the test sample library in this embodiment each consist of speech sequences and music sequences, and the two libraries are independent of each other. Fig. 5 is the information table of the training sample library, and Fig. 6 is the information table of the test sample library. The classifiers were tested on the test sample library described above, and the comparison of classifier performance is shown in Fig. 7. The comparison in Fig. 7 shows that the longer the duration of the long-time feature, the higher the classification accuracy, but also the longer the delay in detecting a type change. By contrast, the classifier trained according to the invention performs better on both classification accuracy and timeliness in detecting a change of audio type, and is better suited to real-time music/speech classification systems.
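For a sense of scale, the frame length and the window spans of this embodiment can be converted to samples and time. The figures below follow directly from the 32 kHz sampling rate, the 40 ms frame, and the durations 5, 15, and 25 frames; treating the window span as a rough indicator of how long a fixed-duration feature needs to reflect a type change is an assumption made here, not a measurement from Fig. 7.

```latex
\begin{aligned}
L   &= 32000\ \text{samples/s} \times 0.040\ \text{s} = 1280\ \text{samples per frame} \\
T_1 &= N_1 \times 40\ \text{ms} = 5 \times 40\ \text{ms} = 200\ \text{ms} \\
T_2 &= N_2 \times 40\ \text{ms} = 15 \times 40\ \text{ms} = 600\ \text{ms} \\
T_3 &= N_3 \times 40\ \text{ms} = 25 \times 40\ \text{ms} = 1000\ \text{ms}
\end{aligned}
```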
The above is only a preferred embodiment of the invention, but the scope of protection of the invention is not limited to it. Any variation or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of protection of the claims.

Claims (7)

1. An audio feature classification method based on variable duration, characterized in that the method comprises the following steps:
Step 1: take an audio sequence whose type has been determined and labeled as the training sequence;
Step 2: extract the short-time features $F_1, F_2, \ldots, F_K$ of the audio signal in the training sequence to form the short-time feature vector $\vec{V}_S = [F_1, F_2, \ldots, F_K]^T$, where K is the number of components of the short-time feature vector;
Step 3: for each short-time feature $F_k$, calculate the statistical parameters of that feature over the current frame and the preceding (n-1) frames within a set duration, where n is the total number of frames in the set duration; each short-time feature $F_k$ corresponds to a statistical feature vector $\vec{L}_k(n)$ formed from the statistical parameters of that feature, so the short-time feature vector $\vec{V}_S$ corresponds to a statistical feature vector $\vec{V}_L(n)$, where
$$\vec{V}_L(n) = \begin{bmatrix} \vec{L}_1(n) \\ \vec{L}_2(n) \\ \vdots \\ \vec{L}_K(n) \end{bmatrix}, \quad 1 \le k \le K;$$
Step 4: choose P values $N_1, N_2, \ldots, N_P$ satisfying $N_1 < N_2 < \ldots < N_P$; set n equal to $N_1, N_2, \ldots, N_P$ in turn and, following Step 3, calculate the group of statistical feature vectors $\vec{V}_L(N_1), \vec{V}_L(N_2), \ldots, \vec{V}_L(N_P)$ corresponding to the short-time feature vector $\vec{V}_S$; this group of statistical feature vectors forms the long-time feature vector of the training sequence
$$\vec{V}_F = [\vec{V}_L^T(N_1), \vec{V}_L^T(N_2), \ldots, \vec{V}_L^T(N_P)]^T;$$
Step 5: train a classifier using the long-time feature vector $\vec{V}_F$ of the training sequence;
Step 6: extract the short-time features of the audio signal in the test sequence and, following Steps 2 and 3, calculate the statistical feature vector $\vec{V}_L(i)$ of the i-th frame of the test sequence and the statistical feature vectors $\vec{V}_L(N_1), \vec{V}_L(N_2), \ldots, \vec{V}_L(N_P)$ of the test sequence;
Step 7: from the statistical feature vector $\vec{V}_L(i)$ of the i-th frame of the test sequence and the vectors $\vec{V}_L(N_1), \vec{V}_L(N_2), \ldots, \vec{V}_L(N_P)$ of the test sequence, calculate the input long-time feature vector $\vec{V}_{IN}(i)$ of the i-th frame of the test sequence;
the input long-time feature vector $\vec{V}_{IN}(i)$ of the i-th frame of the test sequence is specifically calculated using the formula
$$\vec{V}_{IN}(i) = \begin{cases} [\vec{V}_L^T(i), \ldots, \vec{V}_L^T(i)]^T, & i < N_1 \\ [\vec{V}_L^T(N_1), \ldots, \vec{V}_L^T(N_q), \vec{V}_L^T(i), \ldots, \vec{V}_L^T(i)]^T, & N_1 < \ldots < N_q \le i < N_{q+1} < \ldots < N_P \\ \vec{V}_F, & i \ge N_P \end{cases}$$
where $q = 1, 2, \ldots, P-1$, and the vector $[\vec{V}_L^T(N_1), \ldots, \vec{V}_L^T(N_q), \vec{V}_L^T(i), \ldots, \vec{V}_L^T(i)]^T$ contains q terms of the form $\vec{V}_L^T(N_j)$ and P-q copies of $\vec{V}_L^T(i)$;
Step 8: feed the input long-time feature vector $\vec{V}_{IN}(i)$ of the i-th frame into the classifier trained in Step 5; its output is the classification type of the i-th frame.
2. The audio feature classification method based on variable duration according to claim 1, characterized in that the short-time features include log energy, zero-crossing rate, and uniform sub-band energy distribution.
3. The audio feature classification method based on variable duration according to claim 1, characterized in that the statistical parameters of the short-time feature over the current frame and the preceding (n-1) frames include one or more of the maximum $\mathrm{MaxF}_k(n)$, the minimum $\mathrm{MinF}_k(n)$, the arithmetic mean $\mathrm{AvgF}_k(n)$, or the variance $\mathrm{VarF}_k(n)$ of the short-time feature over those frames.
4. The audio feature classification method based on variable duration according to claim 1, characterized in that training the classifier with the long-time feature vector $\vec{V}_F$ of the training sequence specifically consists of training a single classifier with $\vec{V}_F$.
5. The audio feature classification method based on variable duration according to claim 1, characterized in that training the classifier with the long-time feature vector $\vec{V}_F$ of the training sequence specifically uses forward feature selection to select effective features from $\vec{V}_F$ to form an effective long-time feature vector, and uses the effective long-time feature vector to train a single classifier.
6. The audio feature classification method based on variable duration according to claim 1, characterized in that training the classifier with the long-time feature vector $\vec{V}_F$ of the training sequence specifically consists of training a separate single classifier of the same type on each component vector $\vec{V}_L(N_1), \vec{V}_L(N_2), \ldots, \vec{V}_L(N_P)$ of $\vec{V}_F$ and combining these classifiers in parallel into a classifier group.
7. The audio feature classification method based on variable duration according to any one of claims 4 to 6, characterized in that the single classifier is an independent-feature classifier based on the normal distribution.
CN201110033410.2A 2011-01-30 2011-01-30 Audio characteristic classification method based on variable duration Expired - Fee Related CN102623007B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110033410.2A CN102623007B (en) 2011-01-30 2011-01-30 Audio characteristic classification method based on variable duration

Publications (2)

Publication Number Publication Date
CN102623007A (en) 2012-08-01
CN102623007B (en) 2014-01-01

Family

ID=46562887

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110033410.2A Expired - Fee Related CN102623007B (en) 2011-01-30 2011-01-30 Audio characteristic classification method based on variable duration

Country Status (1)

Country Link
CN (1) CN102623007B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968986B (en) * 2012-11-07 2015-01-28 华南理工大学 Overlapped voice and single voice distinguishing method based on long time characteristics and short time characteristics
CN106328152B (en) * 2015-06-30 2020-01-31 芋头科技(杭州)有限公司 automatic indoor noise pollution identification and monitoring system
CN105654944B (en) * 2015-12-30 2019-11-01 中国科学院自动化研究所 It is a kind of merged in short-term with it is long when feature modeling ambient sound recognition methods and device
WO2018199997A1 (en) * 2017-04-28 2018-11-01 Hewlett-Packard Development Company, L.P. Audio classifcation with machine learning model using audio duration
CN108305616B (en) * 2018-01-16 2021-03-16 国家计算机网络与信息安全管理中心 Audio scene recognition method and device based on long-time and short-time feature extraction
CN113780180B (en) * 2021-09-13 2024-06-25 俞加利 Audio long-term fingerprint extraction and matching method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101067930A (en) * 2007-06-07 2007-11-07 深圳先进技术研究院 Intelligent audio frequency identifying system and identifying method
CN101236742A (en) * 2008-03-03 2008-08-06 中兴通讯股份有限公司 Music/ non-music real-time detection method and device
CN101364408A (en) * 2008-10-07 2009-02-11 西安成峰科技有限公司 Sound image combined monitoring method and system
CN101398825A (en) * 2007-09-29 2009-04-01 三星电子株式会社 Rapid music assorting and searching method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Cyril Joder et al. Temporal Integration for Audio Classification With Application to Musical Instrument Classification. IEEE Transactions on Audio, Speech, and Language Processing. 2009, vol. 17, no. 1, pp. 174-186.
Temporal Integration for Audio Classification With Application to Musical Instrument Classification; Cyril Joder et al.; IEEE Transactions on Audio, Speech, and Language Processing; 20090131; vol. 17, no. 1; 174-186 *

Also Published As

Publication number Publication date
CN102623007A (en) 2012-08-01

Similar Documents

Publication Publication Date Title
CN102623007B (en) Audio characteristic classification method based on variable duration
CN101159834B (en) Method and system for detecting repeatable video and audio program fragment
CN108597498A (en) Multi-microphone voice acquisition method and device
CN110827837A (en) Whale activity audio classification method based on deep learning
CN102446504B (en) Voice/Music identifying method and equipment
CN100580693C (en) Advertisement detecting and recognizing method and system
CN104143324B (en) A kind of musical tone recognition method
CN109767776B (en) Deception voice detection method based on dense neural network
CN101599271A (en) A kind of recognition methods of digital music emotion
CN105741835A (en) Audio information processing method and terminal
CN103854646A (en) Method for classifying digital audio automatically
CN103985381A (en) Voice frequency indexing method based on parameter fusion optimized decision
CN112133277B (en) Sample generation method and device
CN111128211B (en) Voice separation method and device
Lu et al. Self-supervised audio spatialization with correspondence classifier
CN102708861A (en) Poor speech recognition method based on support vector machine
CN106098079A (en) Method and device for extracting audio signal
CN102723079A (en) Music and chord automatic identification method based on sparse representation
CN108615536A (en) Time-frequency combination feature musical instrument assessment of acoustics system and method based on microphone array
CN104123949B (en) card frame detection method and device
Taenzer et al. Investigating CNN-based Instrument Family Recognition for Western Classical Music Recordings.
Shifas et al. A non-causal FFTNet architecture for speech enhancement
CN102214219B (en) Audio/video content retrieval system and method
CN105721090B (en) A kind of detection and recognition methods of illegal f-m broadcast station
Valero et al. Narrow-band autocorrelation function features for the automatic recognition of acoustic environments

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140101

Termination date: 20180130