CN105895110A - Method and device for classifying audio files - Google Patents
- Publication number
- CN105895110A
- Authority
- CN
- China
- Prior art keywords
- audio file
- sound spectrograph
- target audio
- section
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/683—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Library & Information Science (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Signal Processing (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiments of the invention disclose a method and a device for classifying audio files. Music is classified in advance, and a spectrogram of each category of music is obtained. The method comprises: obtaining a spectrogram of a target audio file to be classified; and determining the category of the target audio file according to the similarity between the spectrogram of the target audio file and the spectrogram of each category of music. The method thereby classifies audio files by means of spectrograms.
Description
Technical field
The present invention relates to the field of audio technology, and in particular to a method and a device for classifying audio files.
Background technology
In the era of Internet multimedia, people's demand for music has become increasingly diverse. Music classification helps people label music, for example by tagging different musical genres with different emotions, which makes it easier for users to find music that matches their interests.
Traditional music classification methods extract features from the audio and then classify them with a classifier. Audio features include: time-domain features, comprising short-time average energy, linear prediction coefficients, zero-crossing rate and derived features; frequency-domain features, comprising Mel coefficients, LPC cepstral coefficients and entropy features; and time-frequency features, comprising wavelet coefficients. In this process, extracting and selecting effective audio features is a relatively complex task.
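To make the hand-crafted features mentioned above more concrete, the following is a minimal sketch of two of the time-domain features (short-time average energy and zero-crossing rate). It is not taken from the patent: the function names, the frame length, the hop size and the synthetic 440 Hz test tone are all assumptions chosen for illustration.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames; a trailing partial frame is dropped."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    signs = np.sign(frames)
    signs[signs == 0] = 1  # treat exact zeros as positive
    return np.mean(signs[:, 1:] != signs[:, :-1], axis=1)

# Hypothetical input: one second of a 440 Hz sine sampled at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

frames = frame_signal(tone, frame_len=400, hop=200)
energy = short_time_energy(frames)
zcr = zero_crossing_rate(frames)
print(frames.shape, energy.mean(), zcr.mean())
```

Each feature of this kind has to be designed, computed and selected separately, which is the effort the spectrogram-based approach aims to avoid.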
Summary of the invention
The purpose of the embodiments of the present invention is to provide a method and a device for classifying audio files, so that audio files can be classified by means of spectrograms.
To achieve the above purpose, an embodiment of the present invention discloses a method for classifying audio files. Music is classified in advance, and a spectrogram of each category of music is obtained. The method includes:
obtaining, for a target audio file to be classified, a spectrogram of the target audio file;
determining the category of the target audio file according to the similarity between the spectrogram of the target audio file and the spectrogram of each category of music.
Preferably, obtaining the spectrogram of the target audio file to be classified includes:
dividing the target audio file to be classified into segments;
obtaining the spectrogram of each audio segment separately.
Preferably, determining the category of the target audio file according to the similarity between the spectrogram of the target audio file and the spectrogram of each category of music includes:
using a neural network to determine the category of each audio segment according to the similarity between the spectrogram of that audio segment and the spectrogram of each category of music;
determining the category of the target audio file according to the categories of all the audio segments.
Preferably, obtaining the spectrogram of each audio segment separately includes:
for each audio segment, performing a Fourier transform on each audio frame of the audio segment to obtain the spectrum values of that audio frame;
generating the spectrogram of the audio segment from the spectrum values of its audio frames.
Preferably, the neural network is a convolutional neural network.
To achieve the above purpose, an embodiment of the present invention also discloses a device for classifying audio files. Music is classified in advance, and a spectrogram of each category of music is obtained. The device includes:
an obtaining module, configured to obtain, for a target audio file to be classified, a spectrogram of the target audio file;
a determining module, configured to determine the category of the target audio file according to the similarity between the spectrogram of the target audio file and the spectrogram of each category of music.
Preferably, the obtaining module includes:
a segmentation submodule, configured to divide the target audio file to be classified into segments;
an obtaining submodule, configured to obtain the spectrogram of each audio segment separately.
Preferably, the determining module is specifically configured to:
use a neural network to determine the category of each audio segment according to the similarity between the spectrogram of that audio segment and the spectrogram of each category of music;
determine the category of the target audio file according to the categories of all the audio segments.
Preferably, the obtaining submodule is specifically configured to:
for each audio segment, perform a Fourier transform on each audio frame of the audio segment to obtain the spectrum values of that audio frame;
generate the spectrogram of the audio segment from the spectrum values of its audio frames.
Preferably, the neural network is a convolutional neural network.
As can be seen from the above technical solutions, the embodiments of the present invention provide a method and a device for classifying audio files, in which music is classified in advance and a spectrogram of each category of music is obtained; for a target audio file to be classified, a spectrogram of the target audio file is obtained; and the category of the target audio file is determined according to the similarity between the spectrogram of the target audio file and the spectrogram of each category of music. Thus, by using this similarity to determine the category of the target audio file, audio files are classified by means of spectrograms.
Of course, a product or method implementing the present invention does not necessarily need to achieve all of the above advantages at the same time.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for classifying audio files provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a device for classifying audio files provided by an embodiment of the present invention;
Fig. 3 is a spectrogram of Jazz provided by an embodiment of the present invention;
Fig. 4 is a spectrogram of Blue provided by an embodiment of the present invention;
Fig. 5 is a spectrogram of Metal provided by an embodiment of the present invention;
Fig. 6 is a spectrogram of Pop provided by an embodiment of the present invention;
Fig. 7 is a spectrogram of Hip-pop provided by an embodiment of the present invention.
Detailed description of the invention
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The method for classifying audio files provided by an embodiment of the present invention is first described in detail below.
Referring to Fig. 1, which is a schematic flowchart of a method for classifying audio files provided by an embodiment of the present invention: music is classified in advance, and a spectrogram of each category of music is obtained. The method may include the following steps:
S101: for a target audio file to be classified, obtain a spectrogram of the target audio file.
Specifically, the audio file may be a music file. Music is classified in advance, for example into Jazz, Blue, Metal, Pop, Hip-pop and so on, and a spectrogram of each category of music is obtained at the same time. The spectrograms of these categories are shown in Fig. 3, Fig. 4, Fig. 5, Fig. 6 and Fig. 7, respectively. The target audio file to be classified may be divided into segments. For example, a 60 s piece of music may be divided from the beginning into one segment every 5 s, giving 12 music segments.
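The segmentation step can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `split_into_segments`, the 8 kHz sample rate and the choice to drop a trailing partial segment are not specified in the source.

```python
def split_into_segments(samples, sample_rate, segment_seconds=5.0):
    """Cut a sequence of samples into consecutive fixed-length segments.

    Assumption: a trailing remainder shorter than one segment is dropped.
    """
    seg_len = int(round(segment_seconds * sample_rate))
    n_segments = len(samples) // seg_len
    return [samples[i * seg_len : (i + 1) * seg_len] for i in range(n_segments)]

# Hypothetical 60 s recording at an assumed sample rate of 8 kHz (silence as placeholder data).
sample_rate = 8000
song = [0.0] * (60 * sample_rate)

segments = split_into_segments(song, sample_rate)
print(len(segments))  # 12
```

With a 60 s input and 5 s segments, this yields the 12 segments used in the example in the text.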
Specifically, for one of the audio segments, the sound signal in the segment may be windowed and divided into frames according to a given window length and window shift. A fast Fourier transform is applied to the audio samples of each frame to obtain the spectrum values of the audio segment. These spectrum values may be normalized and converted to values between 0 and 255 to generate the spectrogram of the segment. Every audio segment is processed in this way to obtain its spectrogram, and the spectrograms of all segments together form the spectrogram of the target audio file. Windowing, framing and the fast Fourier transform of sound signals are prior art and are not described further here.
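A minimal sketch of the per-segment spectrogram computation (windowing, framing, FFT, normalization to 0 to 255) might look like the following. The source does not give the window type, window length, window shift or normalization formula, so the Hann window, the 512/256 sample values, the log compression and the min-max scaling used here are all assumptions, as is the function name `segment_spectrogram`.

```python
import numpy as np

def segment_spectrogram(segment, frame_len=512, hop=256):
    """Return a uint8 image (frequency bins x frames) for one audio segment."""
    segment = np.asarray(segment, dtype=np.float64)
    if len(segment) < frame_len:
        raise ValueError("segment is shorter than one frame")
    window = np.hanning(frame_len)                      # assumed window type
    n_frames = 1 + (len(segment) - frame_len) // hop
    frames = np.stack([segment[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))     # FFT magnitude of each frame
    log_mag = np.log1p(magnitude)                       # assumed log compression
    lo, hi = log_mag.min(), log_mag.max()
    scaled = (log_mag - lo) / (hi - lo) if hi > lo else np.zeros_like(log_mag)
    # Min-max scale to 0..255, then transpose so rows are frequency bins.
    return np.round(255 * scaled).astype(np.uint8).T

# Hypothetical 5 s segment: a 1 kHz tone sampled at 8 kHz.
sr = 8000
t = np.arange(5 * sr) / sr
tone = np.sin(2 * np.pi * 1000 * t)
image = segment_spectrogram(tone)
print(image.shape, image.dtype)
```

Treating the result as an 8-bit grayscale image is what allows an image classifier to be applied to it in the next step.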
S102: determine the category of the target audio file according to the similarity between the spectrogram of the target audio file and the spectrogram of each category of music.
Specifically, the spectrogram textures of music of the same type are similar, and the human eye can, to a certain extent, tell different music categories apart by texture. In practical applications, a neural network may be used to determine the category of each audio segment according to the similarity between the spectrogram of that segment and the spectrogram of each category of music; the category of the target audio file is then determined from the categories of all the audio segments. In practical applications, this neural network may be a convolutional neural network (CNN).
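The source does not describe the network architecture or its training, so the following is only a toy sketch of what a per-segment CNN forward pass could look like: one convolution layer, ReLU, global average pooling and a softmax over the five example categories. The weights are random and untrained, so the predicted label is meaningless; the layer sizes, the names `conv2d_valid` and `classify_segment`, and the input size are all assumptions.

```python
import numpy as np

GENRES = ["Jazz", "Blue", "Metal", "Pop", "Hip-pop"]

def conv2d_valid(image, kernels):
    """Valid cross-correlation of a 2-D image with a bank of (k, kh, kw) kernels."""
    _, kh, kw = kernels.shape
    # windows has shape (H - kh + 1, W - kw + 1, kh, kw)
    windows = np.lib.stride_tricks.sliding_window_view(image, (kh, kw))
    return np.einsum("hwij,kij->khw", windows, kernels)

def classify_segment(spectrogram, params):
    """Forward pass: conv -> ReLU -> global average pooling -> dense -> softmax."""
    x = spectrogram.astype(np.float64) / 255.0
    conv = conv2d_valid(x, params["kernels"]) + params["conv_bias"][:, None, None]
    feature_maps = np.maximum(conv, 0.0)
    pooled = feature_maps.mean(axis=(1, 2))            # one value per kernel
    logits = params["dense"] @ pooled + params["dense_bias"]
    exp = np.exp(logits - logits.max())                # numerically stable softmax
    probs = exp / exp.sum()
    return GENRES[int(np.argmax(probs))], probs

# Untrained, randomly initialised parameters (placeholders for trained weights).
rng = np.random.default_rng(0)
params = {
    "kernels": rng.normal(scale=0.1, size=(8, 3, 3)),
    "conv_bias": np.zeros(8),
    "dense": rng.normal(scale=0.1, size=(len(GENRES), 8)),
    "dense_bias": np.zeros(len(GENRES)),
}

fake_spectrogram = rng.integers(0, 256, size=(257, 155), dtype=np.uint8)
label, probs = classify_segment(fake_spectrogram, params)
print(label, probs.round(3))
```

In a real system the parameters would be learned from the pre-classified spectrograms of each category, typically with a deep learning framework rather than hand-written NumPy.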
For example, a maximum-vote (majority voting) method may be used. For the 12 music segments of one song, the convolutional neural network determines that 9 segments are Jazz, 2 segments are Blue and 1 segment is Pop; the final classification result is therefore Jazz, so the category of this piece of music is determined to be jazz (Jazz). The convolutional neural network (CNN) is a deep-learning-based image classification and object detection algorithm; it is prior art and is not described further here.
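The voting step in this example can be sketched in a few lines. The function name `majority_vote` is hypothetical, and the source does not say how ties are resolved; this sketch simply returns the tied label that appears first in the input.

```python
from collections import Counter

def majority_vote(segment_labels):
    """Return the label assigned to the largest number of segments."""
    if not segment_labels:
        raise ValueError("no segment labels to vote on")
    # most_common(1) returns [(label, count)] for the most frequent label.
    return Counter(segment_labels).most_common(1)[0][0]

# The 12 per-segment results from the example: 9 Jazz, 2 Blue, 1 Pop.
labels = ["Jazz"] * 9 + ["Blue"] * 2 + ["Pop"]
result = majority_vote(labels)
print(result)  # Jazz
```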
Thus, the category of the target audio file is determined from the similarity between its spectrogram and the spectrogram of each category of music, without the complex process of audio feature extraction and selection, so that audio files are classified by means of spectrograms.
Referring to Fig. 2, which is a schematic structural diagram of a device for classifying audio files provided by an embodiment of the present invention and corresponds to the flow shown in Fig. 1: music is classified in advance, and a spectrogram of each category of music is obtained. The classification device may include an obtaining module 201 and a determining module 202.
The obtaining module 201 is configured to obtain, for a target audio file to be classified, a spectrogram of the target audio file.
Specifically, the obtaining module 201 may include a segmentation submodule and an obtaining submodule (not shown in the figure):
the segmentation submodule is configured to divide the target audio file to be classified into segments;
the obtaining submodule is configured to obtain the spectrogram of each audio segment separately.
Specifically, the obtaining submodule may be configured to:
for each audio segment, perform a Fourier transform on each audio frame of the segment to obtain the spectrum values of that audio frame, and generate the spectrogram of the segment from the spectrum values of its audio frames.
The determining module 202 is configured to determine the category of the target audio file according to the similarity between the spectrogram of the target audio file and the spectrogram of each category of music.
Specifically, the determining module 202 may be configured to:
use a neural network to determine the category of each audio segment according to the similarity between the spectrogram of that segment and the spectrogram of each category of music, and determine the category of the target audio file according to the categories of all the audio segments.
Specifically, the neural network may be a convolutional neural network.
Thus, the category of the target audio file is determined from the similarity between its spectrogram and the spectrogram of each category of music, without the complex process of audio feature extraction and selection, so that audio files are classified by means of spectrograms.
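One possible way to mirror this module structure in software is sketched below. It is an illustration only: the class names, the method names and the idea of injecting the segmenter, the spectrogram function and the per-segment classifier as callables are assumptions, and the stand-in callables in the usage part are toy placeholders rather than real signal processing or a real CNN.

```python
from collections import Counter
from typing import Any, Callable, List, Sequence

class ObtainingModule:
    """Counterpart of module 201: segmentation submodule plus obtaining submodule."""
    def __init__(self, segmenter: Callable[[Sequence], List[Sequence]],
                 spectrogram_fn: Callable[[Sequence], Any]):
        self.segmenter = segmenter            # segmentation submodule
        self.spectrogram_fn = spectrogram_fn  # obtaining submodule

    def obtain(self, samples: Sequence) -> List[Any]:
        return [self.spectrogram_fn(seg) for seg in self.segmenter(samples)]

class DeterminingModule:
    """Counterpart of module 202: classify each segment, then vote."""
    def __init__(self, segment_classifier: Callable[[Any], str]):
        self.segment_classifier = segment_classifier

    def determine(self, spectrograms: List[Any]) -> str:
        labels = [self.segment_classifier(s) for s in spectrograms]
        return Counter(labels).most_common(1)[0][0]

class AudioFileClassifier:
    """Wires the two modules together in the order of steps S101 and S102."""
    def __init__(self, obtaining: ObtainingModule, determining: DeterminingModule):
        self.obtaining = obtaining
        self.determining = determining

    def classify(self, samples: Sequence) -> str:
        return self.determining.determine(self.obtaining.obtain(samples))

# Toy stand-ins so the wiring can be exercised without real audio or a trained network.
device = AudioFileClassifier(
    ObtainingModule(
        segmenter=lambda xs: [xs[i:i + 4] for i in range(0, len(xs) - 3, 4)],
        spectrogram_fn=tuple,
    ),
    DeterminingModule(segment_classifier=lambda s: "Jazz" if sum(s) > 0 else "Pop"),
)
category = device.classify([1, 1, 1, 1, 1, 1, 1, 1, -1, -1, -1, -1])
print(category)  # Jazz
```

Keeping the modules as thin wrappers around injected callables means the segmentation, spectrogram and classification steps can be replaced independently.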
It should be noted that in this article, the relational terms of such as first and second or the like be used merely to by
One entity or operation separate with another entity or operating space, and not necessarily require or imply these
Relation or the order of any this reality is there is between entity or operation.And, term " includes ", " comprising "
Or its any other variant is intended to comprising of nonexcludability, so that include the mistake of a series of key element
Journey, method, article or equipment not only include those key elements, but also other including being not expressly set out
Key element, or also include the key element intrinsic for this process, method, article or equipment.Do not having
In the case of more restrictions, statement " including ... " key element limited, it is not excluded that including described wanting
Process, method, article or the equipment of element there is also other identical element.
Each embodiment in this specification all uses relevant mode to describe, phase homophase between each embodiment
As part see mutually, what each embodiment stressed is the difference with other embodiments.
For device embodiment, owing to it is substantially similar to embodiment of the method, so the comparison described
Simply, relevant part sees the part of embodiment of the method and illustrates.
One of ordinary skill in the art will appreciate that all or part of step realizing in said method embodiment
The program that can be by completes to instruct relevant hardware, and described program can be stored in computer-readable
Take in storage medium, the storage medium obtained designated herein, such as: ROM/RAM, magnetic disc, CD etc..
The foregoing is only presently preferred embodiments of the present invention, be not intended to limit protection scope of the present invention.
All any modification, equivalent substitution and improvement etc. made within the spirit and principles in the present invention, are all contained in
In protection scope of the present invention.
Claims (10)
1. A method for classifying audio files, characterized in that music is classified in advance and a spectrogram of each category of music is obtained; the method includes:
obtaining, for a target audio file to be classified, a spectrogram of the target audio file;
determining the category of the target audio file according to the similarity between the spectrogram of the target audio file and the spectrogram of each category of music.
2. The method according to claim 1, wherein obtaining the spectrogram of the target audio file to be classified includes:
dividing the target audio file to be classified into segments;
obtaining the spectrogram of each audio segment separately.
3. The method according to claim 2, wherein determining the category of the target audio file according to the similarity between the spectrogram of the target audio file and the spectrogram of each category of music includes:
using a neural network to determine the category of each audio segment according to the similarity between the spectrogram of that audio segment and the spectrogram of each category of music;
determining the category of the target audio file according to the categories of all the audio segments.
4. The method according to claim 2, wherein obtaining the spectrogram of each audio segment separately includes:
for each audio segment, performing a Fourier transform on each audio frame of the audio segment to obtain the spectrum values of that audio frame;
generating the spectrogram of the audio segment from the spectrum values of its audio frames.
5. The method according to claim 3, wherein the neural network is a convolutional neural network.
6. A device for classifying audio files, characterized in that music is classified in advance and a spectrogram of each category of music is obtained; the device includes:
an obtaining module, configured to obtain, for a target audio file to be classified, a spectrogram of the target audio file;
a determining module, configured to determine the category of the target audio file according to the similarity between the spectrogram of the target audio file and the spectrogram of each category of music.
7. The device according to claim 6, wherein the obtaining module includes:
a segmentation submodule, configured to divide the target audio file to be classified into segments;
an obtaining submodule, configured to obtain the spectrogram of each audio segment separately.
8. The device according to claim 7, wherein the determining module is specifically configured to:
use a neural network to determine the category of each audio segment according to the similarity between the spectrogram of that audio segment and the spectrogram of each category of music;
determine the category of the target audio file according to the categories of all the audio segments.
9. The device according to claim 7, wherein the obtaining submodule is specifically configured to:
for each audio segment, perform a Fourier transform on each audio frame of the audio segment to obtain the spectrum values of that audio frame;
generate the spectrogram of the audio segment from the spectrum values of its audio frames.
10. The device according to claim 8, wherein the neural network is a convolutional neural network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610512234.3A CN105895110A (en) | 2016-06-30 | 2016-06-30 | Method and device for classifying audio files |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610512234.3A CN105895110A (en) | 2016-06-30 | 2016-06-30 | Method and device for classifying audio files |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105895110A (en) | 2016-08-24 |
Family
ID=56718595
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201610512234.3A | Withdrawn | CN105895110A (en) | 2016-06-30 | 2016-06-30 | Method and device for classifying audio files |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105895110A (en) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106919710A (en) * | 2017-03-13 | 2017-07-04 | 东南大学 | A kind of dialect sorting technique based on convolutional neural networks |
CN106952649A (en) * | 2017-05-14 | 2017-07-14 | 北京工业大学 | Method for distinguishing speek person based on convolutional neural networks and spectrogram |
CN107358946A (en) * | 2017-06-08 | 2017-11-17 | 南京邮电大学 | Speech-emotion recognition method based on section convolution |
CN107393554A (en) * | 2017-06-20 | 2017-11-24 | 武汉大学 | In a kind of sound scene classification merge class between standard deviation feature extracting method |
CN107492383A (en) * | 2017-08-07 | 2017-12-19 | 上海六界信息技术有限公司 | Screening technique, device, equipment and the storage medium of live content |
CN108053836A (en) * | 2018-01-18 | 2018-05-18 | 成都嗨翻屋文化传播有限公司 | A kind of audio automation mask method based on deep learning |
CN108206027A (en) * | 2016-12-20 | 2018-06-26 | 北京酷我科技有限公司 | A kind of audio quality evaluation method and system |
CN108206029A (en) * | 2016-12-16 | 2018-06-26 | 北京酷我科技有限公司 | A kind of method and system for realizing the word for word lyrics |
CN108257614A (en) * | 2016-12-29 | 2018-07-06 | 北京酷我科技有限公司 | The method and its system of audio data mark |
CN108538311A (en) * | 2018-04-13 | 2018-09-14 | 腾讯音乐娱乐科技(深圳)有限公司 | Audio frequency classification method, device and computer readable storage medium |
CN108877783A (en) * | 2018-07-05 | 2018-11-23 | 腾讯音乐娱乐科技(深圳)有限公司 | The method and apparatus for determining the audio types of audio data |
CN108989882A (en) * | 2018-08-03 | 2018-12-11 | 百度在线网络技术(北京)有限公司 | Method and apparatus for exporting the snatch of music in video |
CN109087634A (en) * | 2018-10-30 | 2018-12-25 | 四川长虹电器股份有限公司 | A kind of sound quality setting method based on audio classification |
CN109522445A (en) * | 2018-11-15 | 2019-03-26 | 辽宁工程技术大学 | A kind of audio classification search method merging CNNs and phase algorithm |
CN110111810A (en) * | 2019-04-29 | 2019-08-09 | 华院数据技术(上海)有限公司 | Voice personality prediction technique based on convolutional neural networks |
CN110222227A (en) * | 2019-05-13 | 2019-09-10 | 西安交通大学 | A kind of Chinese folk song classification of countries method merging auditory perceptual feature and visual signature |
WO2020024396A1 (en) * | 2018-08-02 | 2020-02-06 | 平安科技(深圳)有限公司 | Music style recognition method and apparatus, computer device, and storage medium |
CN110930986A (en) * | 2019-12-06 | 2020-03-27 | 北京明略软件***有限公司 | Voice processing method and device, electronic equipment and storage medium |
CN111259189A (en) * | 2018-11-30 | 2020-06-09 | 马上消费金融股份有限公司 | Music classification method and device |
CN112420072A (en) * | 2021-01-25 | 2021-02-26 | 北京远鉴信息技术有限公司 | Method and device for generating spectrogram, electronic equipment and storage medium |
WO2022152831A1 (en) * | 2021-01-15 | 2022-07-21 | Continental Automotive Technologies GmbH | Adaptive device for reducing the noise of an fm radio signal |
CN114822589A (en) * | 2022-04-02 | 2022-07-29 | 中科猷声(苏州)科技有限公司 | Indoor acoustic parameter measuring method, model building method, device and electronic equipment |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101599271A (en) * | 2009-07-07 | 2009-12-09 | 华中科技大学 | A kind of recognition methods of digital music emotion |
CN104240719A (en) * | 2013-06-24 | 2014-12-24 | 浙江大华技术股份有限公司 | Feature extraction method and classification method for audios and related devices |
CN103854646A (en) * | 2014-03-27 | 2014-06-11 | 成都康赛信息技术有限公司 | Method for classifying digital audio automatically |
CN104318931A (en) * | 2014-09-30 | 2015-01-28 | 百度在线网络技术(北京)有限公司 | Emotional activity obtaining method and apparatus of audio file, and classification method and apparatus of audio file |
CN105047194A (en) * | 2015-07-28 | 2015-11-11 | 东南大学 | Self-learning spectrogram feature extraction method for speech emotion recognition |
Non-Patent Citations (1)
Title |
---|
迷之飞羽: "caffe学习笔记8 实例基于卷积神经网络的声音识别-薛开宇", 《豆丁网》 * |
Cited By (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108206029A (en) * | 2016-12-16 | 2018-06-26 | 北京酷我科技有限公司 | A kind of method and system for realizing the word for word lyrics |
CN108206027A (en) * | 2016-12-20 | 2018-06-26 | 北京酷我科技有限公司 | A kind of audio quality evaluation method and system |
CN108257614A (en) * | 2016-12-29 | 2018-07-06 | 北京酷我科技有限公司 | The method and its system of audio data mark |
CN106919710A (en) * | 2017-03-13 | 2017-07-04 | 东南大学 | A kind of dialect sorting technique based on convolutional neural networks |
CN106952649A (en) * | 2017-05-14 | 2017-07-14 | 北京工业大学 | Method for distinguishing speek person based on convolutional neural networks and spectrogram |
CN107358946A (en) * | 2017-06-08 | 2017-11-17 | 南京邮电大学 | Speech-emotion recognition method based on section convolution |
CN107358946B (en) * | 2017-06-08 | 2020-11-13 | 南京邮电大学 | Voice emotion recognition method based on slice convolution |
CN107393554A (en) * | 2017-06-20 | 2017-11-24 | 武汉大学 | In a kind of sound scene classification merge class between standard deviation feature extracting method |
CN107492383A (en) * | 2017-08-07 | 2017-12-19 | 上海六界信息技术有限公司 | Screening technique, device, equipment and the storage medium of live content |
CN107492383B (en) * | 2017-08-07 | 2022-01-11 | 上海六界信息技术有限公司 | Live content screening method, device, equipment and storage medium |
CN108053836A (en) * | 2018-01-18 | 2018-05-18 | 成都嗨翻屋文化传播有限公司 | A kind of audio automation mask method based on deep learning |
CN108053836B (en) * | 2018-01-18 | 2021-03-23 | 成都嗨翻屋科技有限公司 | Audio automatic labeling method based on deep learning |
CN108538311A (en) * | 2018-04-13 | 2018-09-14 | 腾讯音乐娱乐科技(深圳)有限公司 | Audio classification method, device and computer-readable storage medium |
CN108877783A (en) * | 2018-07-05 | 2018-11-23 | 腾讯音乐娱乐科技(深圳)有限公司 | Method and apparatus for determining the audio type of audio data |
WO2020024396A1 (en) * | 2018-08-02 | 2020-02-06 | 平安科技(深圳)有限公司 | Music style recognition method and apparatus, computer device, and storage medium |
CN108989882A (en) * | 2018-08-03 | 2018-12-11 | 百度在线网络技术(北京)有限公司 | Method and apparatus for outputting music clips in a video |
CN109087634A (en) * | 2018-10-30 | 2018-12-25 | 四川长虹电器股份有限公司 | Sound quality setting method based on audio classification |
CN109522445A (en) * | 2018-11-15 | 2019-03-26 | 辽宁工程技术大学 | Audio classification and retrieval method fusing CNNs and a phase algorithm |
CN111259189A (en) * | 2018-11-30 | 2020-06-09 | 马上消费金融股份有限公司 | Music classification method and device |
CN110111810A (en) * | 2019-04-29 | 2019-08-09 | 华院数据技术(上海)有限公司 | Voice personality prediction method based on convolutional neural networks |
CN110111810B (en) * | 2019-04-29 | 2020-12-18 | 华院数据技术(上海)有限公司 | Voice personality prediction method based on convolutional neural network |
CN110222227A (en) * | 2019-05-13 | 2019-09-10 | 西安交通大学 | Chinese folk song regional classification method fusing auditory perceptual features and visual features |
CN110930986A (en) * | 2019-12-06 | 2020-03-27 | 北京明略软件***有限公司 | Voice processing method and device, electronic equipment and storage medium |
CN110930986B (en) * | 2019-12-06 | 2022-05-17 | 北京明略软件***有限公司 | Voice processing method and device, electronic equipment and storage medium |
WO2022152831A1 (en) * | 2021-01-15 | 2022-07-21 | Continental Automotive Technologies GmbH | Adaptive device for reducing the noise of an fm radio signal |
FR3119056A1 (en) * | 2021-01-15 | 2022-07-22 | Continental Automotive | Adaptive device for noise reduction of an FM radio signal |
CN112420072A (en) * | 2021-01-25 | 2021-02-26 | 北京远鉴信息技术有限公司 | Method and device for generating spectrogram, electronic equipment and storage medium |
CN112420072B (en) * | 2021-01-25 | 2021-04-27 | 北京远鉴信息技术有限公司 | Method and device for generating spectrogram, electronic equipment and storage medium |
CN114822589A (en) * | 2022-04-02 | 2022-07-29 | 中科猷声(苏州)科技有限公司 | Indoor acoustic parameter measuring method, model building method, device and electronic equipment |
CN114822589B (en) * | 2022-04-02 | 2023-07-04 | 中科猷声(苏州)科技有限公司 | Indoor acoustic parameter determination method, model construction method, device and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105895110A (en) | Method and device for classifying audio files | |
US20070083365A1 (en) | Neural network classifier for separating audio sources from a monophonic audio signal | |
CN109767785A (en) | Ambient noise method for identifying and classifying based on convolutional neural networks | |
CN109065071B (en) | Song clustering method based on iterative k-means algorithm | |
Ludeña-Choez et al. | Feature extraction based on the high-pass filtering of audio signals for Acoustic Event Classification | |
Birajdar et al. | Speech and music classification using spectrogram based statistical descriptors and extreme learning machine | |
WO2019053544A1 (en) | Identification of audio components in an audio mix | |
CN113614828A (en) | Method and apparatus for fingerprinting audio signals via normalization | |
Kim et al. | A single predominant instrument recognition of polyphonic music using CNN-based timbre analysis | |
Nilufar et al. | Spectrogram based features selection using multiple kernel learning for speech/music discrimination | |
Ghosal et al. | Speech/music classification using empirical mode decomposition | |
Oh et al. | Spectrogram-channels u-net: a source separation model viewing each channel as the spectrogram of each source | |
Felipe et al. | Acoustic scene classification using spectrograms | |
Lekshmi et al. | Multiple predominant instruments recognition in polyphonic music using spectro/modgd-gram fusion | |
Song et al. | Automatic vocal segments detection in popular music | |
Al-Maathidi et al. | NNET based audio content classification and indexing system | |
Sharma et al. | Trends in audio texture analysis, synthesis, and applications | |
Jeong et al. | Dlr: Toward a deep learned rhythmic representation for music content analysis | |
Joshi et al. | Comparative study of MFCC and Mel spectrogram for Raga classification using CNN | |
Costa et al. | Sparse time-frequency representations for polyphonic audio based on combined efficient fan-chirp transforms | |
Sofianos et al. | H-Semantics: A hybrid approach to singing voice separation | |
Tran et al. | Separate sound into STFT frames to eliminate sound noise frames in sound classification | |
Agera et al. | Exploring textural features for automatic music genre classification | |
Won et al. | Adaptive multi-class audio classification in noisy in-vehicle environment | |
Thiruvengatanadhan | Music genre classification using MFCC and AANN |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
WW01 | Invention patent application withdrawn after publication | Application publication date: 2016-08-24 |