CN107369451A - Bird voice recognition method for assisting phenological study of bird breeding period - Google Patents
Bird voice recognition method for assisting phenological study of bird breeding period
- Publication number
- CN107369451A (application CN201710583313.8)
- Authority
- CN
- China
- Prior art keywords
- birds
- sound
- recording
- fragment
- breeding period
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G10L17/04—Training, enrolment or model building
- G10L17/16—Hidden Markov models [HMM]
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
A bird voice recognition method for assisting phenological study of the bird breeding period, characterized in that a field recording fragment containing several bird vocalizations is first read; a recognition algorithm then identifies the species of the bird that produced each vocalization in the recording, gives a confidence level for the identification, and records the actual time at which the sound occurred; finally, the recognition results are combined to calculate, over all recordings from an area, the number of birds of that species emitting song, that is, the number of birds that have entered the breeding period. Once, after some period of time, this number exceeds a preset threshold, the species is considered to have entered its breeding period in that area from that time; conversely, once the number has fallen by more than a threshold, the species is considered to have finished its breeding period.
Description
Technical field
The present invention relates to the technical field of bird voice recognition, and specifically to a bird voice recognition method for assisting phenological study of the bird breeding period.
Background art
In biology, bird vocalizations are divided into calls (bird call) and songs (bird song). Song refers to the vocalizations that birds produce during the breeding period. The song pattern of birds of the same species is very stable, whereas the songs of different species usually differ considerably. Song can therefore be used as a means of identifying bird species.
Phenology is the discipline that studies the relationship between animals and periodic changes in the environment. One of its branches studies the relationship between the bird breeding period and periodic environmental change. Because the breeding period of birds can be determined by identifying their sounds, bird voice recognition can be used to assist phenological study of the bird breeding period.
Summary of the invention
The technical problem to be solved by the invention is to provide a bird voice recognition method for assisting phenological study of the bird breeding period.
To solve this technical problem, the following technical solution is adopted: a bird voice recognition method for assisting phenological study of the bird breeding period, characterized in that a field recording fragment containing several bird vocalizations is first read; a recognition algorithm then identifies the species of the bird to which each vocalization fragment in the recording belongs, gives a confidence level for the identification, and records the time at which the vocalization occurs within the recording fragment; finally, combined with the time of the recording, the number of birds of that species emitting song, that is, the number of birds that have entered the breeding period, can be calculated. Once, after some period of time, this number exceeds a preset threshold, the species is considered to have entered its breeding period in that area from that time; conversely, once the number has fallen by more than a threshold, the species is considered to have finished its breeding period.
The specific steps of the recognition algorithm are: 1) source separation using semi-supervised non-negative matrix factorization; 2) passing the signal through a low-pass filter and then applying frequency compensation; 3) segmenting the sound: short-time energy is used to find the transition points from silence to vocalization, by first computing the short-time energy of the recording and then locating the sound fragments according to a threshold; 4) feature extraction: overlapping windows are first applied to the sound fragment, each window is called a frame, time-domain and frequency-domain features are extracted from the values in each window, most frequency-domain features being based on the short-time Fourier transform (STFT), and the time-domain and frequency-domain features are then combined into a single vector that serves as the feature vector of the frame; 5) dimensionality reduction and noise reduction, using PCA as the means of dimensionality reduction; 6) building a mathematical model for each bird vocalization using a hidden Markov model (HMM): the model is first initialized with segmental k-means and then trained with the forward-backward algorithm. Once the HMMs have been built, a new recording to be processed undergoes source separation, preprocessing, segmentation, feature extraction and PCA, and the resulting feature sequence is compared against each trained HMM. The Viterbi algorithm is used for decoding to obtain a confidence level, and the model with the highest confidence is chosen as the recognition result.
Brief description of the drawings
Fig. 1 is a schematic diagram of the technical route of the invention.
Detailed description
The present invention is described in further detail below with reference to the accompanying drawing.
A field recording fragment containing several bird vocalizations is first read. The recognition algorithm then identifies the species of the bird to which each vocalization fragment in the recording belongs, gives a confidence level for the identification, and records the time at which the vocalization occurs within the recording fragment. Finally, combined with the time of the recording, the number of birds of that species emitting song, that is, the number of birds that have entered the breeding period, can be calculated. Once, after some period of time, this number exceeds a preset threshold, the species is considered to have entered its breeding period in that area from that time; conversely, once the number has fallen by more than a threshold, the species is considered to have finished its breeding period.
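The threshold rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `breeding_period`, the per-period `counts` list, and the reading of "falls by more than a threshold" as a drop from the running peak are all hypothetical choices, not details given in the patent.

```python
from typing import List, Optional, Tuple


def breeding_period(
    counts: List[int], start_threshold: int, end_threshold: int
) -> Tuple[Optional[int], Optional[int]]:
    """Return (start_index, end_index) of the breeding period for one species.

    counts[i] is the number of singing birds of the species detected in
    recording period i (assumed to be supplied by the recognition stage).
    The period starts at the first index whose count exceeds start_threshold.
    It ends at the first later index whose count has dropped from the running
    peak by more than end_threshold (an assumed interpretation).
    """
    start: Optional[int] = None
    peak = 0
    for i, n in enumerate(counts):
        if start is None:
            if n > start_threshold:
                start, peak = i, n
        else:
            peak = max(peak, n)
            if peak - n > end_threshold:
                return start, i
    # Either the period never started, or it has not ended yet.
    return start, None
```

For example, with weekly counts `[0, 1, 2, 6, 9, 10, 8, 3, 1]` and both thresholds set to 5, the sketch reports a start at index 3 and an end at index 7.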
The specific steps of the recognition algorithm are:
1) Semi-supervised NMF
Semi-supervised NMF is used for source separation. The sound captured by a recorder is a mixture of several sounds, some of which overlap; source separation is the technique for separating these different sounds.
NMF stands for non-negative matrix factorization. It is currently the best-performing method for source separation. It decomposes a sound into a weighted combination of bases. A group of bases together with their corresponding weights can be taken as one separated source.
Semi-supervised NMF first trains on data of known categories to obtain the bases corresponding to each category. These bases, together with an additional set of initial vectors, are then used to run the NMF algorithm on the data to be processed. The pre-trained bases of the known categories and their weights give the separation result, which is passed on to subsequent processing.
Semi-supervised NMF gives good separation and also suppresses noise effectively. In some environments it outperforms other noise reduction methods. Traditional noise reduction requires knowledge of the properties of the noise, but the conditions under which noise arises are highly uncertain, so its properties cannot be described accurately in advance and traditional noise reduction performs poorly. The method based on semi-supervised NMF does not need to know the properties of the noise in advance, so its noise reduction is better.
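As a rough illustration of the semi-supervised NMF idea described above, the following NumPy sketch keeps a set of pre-trained bases fixed and lets a few extra "free" bases absorb everything else. The multiplicative update rule (Euclidean cost), the function name `semi_supervised_nmf`, and all parameter names are assumptions made for this example; the patent does not specify the update rule or the number of bases.

```python
import numpy as np


def semi_supervised_nmf(V, W_known, n_free=2, n_iter=200, eps=1e-9, seed=0):
    """Factor V ~= [W_known, W_free] @ H while holding W_known fixed.

    V       : (n_freq, n_frames) non-negative magnitude spectrogram
    W_known : (n_freq, k) bases learned beforehand from labelled bird song
    n_free  : number of extra bases that absorb other sounds and noise
    Returns the part of V explained by the known bases, plus W and H.
    """
    rng = np.random.default_rng(seed)
    k = W_known.shape[1]
    # Known bases first, randomly initialised free bases after them.
    W = np.hstack([W_known, rng.random((V.shape[0], n_free)) + eps])
    H = rng.random((k + n_free, V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative update of all weights.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Update only the free bases; the pre-trained ones stay untouched.
        ratio = (V @ H.T) / (W @ H @ H.T + eps)
        W[:, k:] *= ratio[:, k:]
    # Separated source: known bases weighted by their activations.
    target = W[:, :k] @ H[:k, :]
    return target, W, H
```

The returned `target` plays the role of the separated bird sound that later steps would process, while the free bases are discarded.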
2) Preprocessing
Preprocessing consists of two parts: the signal is first passed through a low-pass filter, and frequency compensation is then applied.
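The patent names neither the filter design nor the form of frequency compensation. The sketch below therefore makes two assumptions: a moving-average FIR filter stands in for the low-pass filter, and a first-order pre-emphasis filter, common in speech processing, stands in for frequency compensation. The function names and default parameters are hypothetical.

```python
from typing import List


def low_pass(x: List[float], width: int = 5) -> List[float]:
    """Causal moving-average FIR low-pass filter (zero history at the start)."""
    out: List[float] = []
    acc = 0.0
    for n, v in enumerate(x):
        acc += v
        if n >= width:
            acc -= x[n - width]  # drop the sample that left the window
        out.append(acc / width)
    return out


def pre_emphasis(x: List[float], alpha: float = 0.97) -> List[float]:
    """y[n] = x[n] - alpha * x[n-1]; raises high frequencies relative to low."""
    if not x:
        return []
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]


def preprocess(x: List[float]) -> List[float]:
    """Low-pass filtering followed by the assumed frequency compensation."""
    return pre_emphasis(low_pass(x))
```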
3) Segmentation
A recording is long and contains both silence and vocalizations. The silent parts must first be removed so that only the vocalizations remain, which requires segmenting the sound (segmentation). Short-time energy is used to find the transition points (end points) from silence to vocalization.
The short-time energy of the recording is computed first, and the sound fragments are then located according to a threshold.
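A minimal sketch of energy-based segmentation follows, assuming that short-time energy is the sum of squared samples in each frame and that a single fixed threshold is used. The frame length, hop size, threshold value and function names are illustrative assumptions; the patent does not state them.

```python
from typing import List, Sequence, Tuple


def short_time_energy(x: Sequence[float], frame_len: int, hop: int) -> List[float]:
    """Sum of squared samples in each frame of length frame_len."""
    return [
        sum(v * v for v in x[s:s + frame_len])
        for s in range(0, len(x) - frame_len + 1, hop)
    ]


def find_segments(
    x: Sequence[float], frame_len: int = 4, hop: int = 4, threshold: float = 1.0
) -> List[Tuple[int, int]]:
    """Return (start_sample, end_sample) spans whose frame energy exceeds threshold."""
    energy = short_time_energy(x, frame_len, hop)
    segments: List[Tuple[int, int]] = []
    start = None
    for i, e in enumerate(energy):
        if e > threshold and start is None:
            start = i * hop  # silence -> vocalization
        elif e <= threshold and start is not None:
            # vocalization -> silence: close at the end of the previous frame
            segments.append((start, (i - 1) * hop + frame_len))
            start = None
    if start is not None:
        segments.append((start, (len(energy) - 1) * hop + frame_len))
    return segments
```

In practice the threshold would be tuned to the recording conditions; the defaults here exist only so the example runs on short toy signals.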
4) Feature extraction
Features must be extracted from each vocalization segment obtained. Overlapping windows are first applied to the sound fragment; each window is called a frame, and time-domain and frequency-domain features are extracted from the values in each window. Most frequency-domain features are based on the short-time Fourier transform (STFT). The time-domain and frequency-domain features are then combined into a single vector, which serves as the feature vector of the frame.
Time-domain features: zero-crossing rate, short-time energy, entropy of energy.
Frequency-domain features: MFCC, spectral centroid, spectral spread, spectral entropy, spectral flux, spectral rolloff.
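To make the framing and feature-vector construction concrete, the sketch below computes three of the listed features per frame: zero-crossing rate, short-time energy, and spectral centroid. It is an illustration only. The function names are hypothetical, the remaining features (MFCC and the others) are omitted for brevity, and a naive DFT is used in place of an FFT so that the example needs only the standard library.

```python
import cmath
import math
from typing import List, Sequence


def frames(x: Sequence[float], frame_len: int, hop: int) -> List[Sequence[float]]:
    """Split x into frames; hop < frame_len gives overlapping windows."""
    return [x[s:s + frame_len] for s in range(0, len(x) - frame_len + 1, hop)]


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of the frame."""
    return sum(v * v for v in frame) / len(frame)


def spectral_centroid(frame: Sequence[float], sample_rate: float) -> float:
    """Magnitude-weighted mean frequency, computed with a naive DFT."""
    n = len(frame)
    num = den = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        mag = abs(coeff)
        num += (k * sample_rate / n) * mag
        den += mag
    return num / den if den > 0 else 0.0


def feature_vector(frame: Sequence[float], sample_rate: float) -> List[float]:
    """Concatenate time-domain and frequency-domain features for one frame."""
    return [
        zero_crossing_rate(frame),
        energy(frame),
        spectral_centroid(frame, sample_rate),
    ]
```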
5) PCA
Because the resulting feature vectors are high-dimensional, operating on them directly is computationally expensive, and they also contain some noise. The data must therefore be reduced in dimension, and PCA is used for this purpose.
PCA stands for principal component analysis. It is an effective dimensionality reduction technique that lowers the data dimension and the amount of computation. It can also remove much of the noise, thereby improving system performance.
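The PCA step can be sketched with NumPy's singular value decomposition. The function names `pca_fit` and `pca_transform` and the choice of SVD over an eigendecomposition of the covariance matrix are assumptions for this example; the patent only states that PCA is used.

```python
import numpy as np


def pca_fit(X, n_components):
    """Learn a PCA projection from X of shape (n_frames, n_features).

    Returns the feature mean and the top principal directions
    as an array of shape (n_components, n_features).
    """
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(X, mean, components):
    """Project feature vectors onto the learned principal directions."""
    return (X - mean) @ components.T
```

The mean and components learned on training data would be reused unchanged when projecting the frames of a new recording, so that training and recognition share the same reduced feature space.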
6) HMM
HMM stands for hidden Markov model, a well-known mathematical model for time-series modeling. Compared with other methods, HMMs offer higher recognition efficiency and better robustness.
One HMM is built for the song of each bird species. The model is first initialized using segmental k-means and then trained with the forward-backward algorithm.
After training, the Viterbi algorithm is used to decode the new feature vectors to be recognized, after they have been processed by PCA. The Viterbi algorithm yields a probability, and the species corresponding to the HMM with the highest probability is selected as the result.
The HMM stage outputs the bird species and the confidence level.
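The recognition step, scoring a sequence against each species' HMM with the Viterbi algorithm and keeping the best model, can be sketched as follows. To stay short, the example assumes discrete observation symbols (for instance, quantized feature vectors) and already-trained parameters; the patent works with continuous feature vectors and does not describe the emission model. The names `viterbi_log_score` and `classify` are hypothetical, and training (segmental k-means, forward-backward) is not shown.

```python
import math
from typing import Dict, Sequence, Tuple

Matrix = Sequence[Sequence[float]]
Model = Tuple[Sequence[float], Matrix, Matrix]  # (start, transition, emission)


def _log(p: float) -> float:
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi_log_score(
    obs: Sequence[int], start: Sequence[float], trans: Matrix, emit: Matrix
) -> float:
    """Log-probability of the single best state path through the HMM."""
    n_states = len(start)
    score = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n_states)]
    for o in obs[1:]:
        score = [
            max(score[p] + _log(trans[p][s]) for p in range(n_states))
            + _log(emit[s][o])
            for s in range(n_states)
        ]
    return max(score)


def classify(obs: Sequence[int], models: Dict[str, Model]) -> Tuple[str, float]:
    """Score obs against every species model; return the best species and score."""
    scores = {name: viterbi_log_score(obs, *m) for name, m in models.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]
```

The returned log score serves as the confidence level in this sketch; working in the log domain avoids numerical underflow on long sequences.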
The above embodiment merely illustrates the principles and effects of the present invention, together with some of the embodiments used. A person of ordinary skill in the art may make various modifications and improvements without departing from the inventive concept, and these all fall within the scope of protection of the present invention.
Claims (2)
1. A bird voice recognition method for assisting phenological study of the bird breeding period, characterized in that a field recording fragment containing several bird vocalizations is first read; a recognition algorithm then identifies the species of the bird to which each vocalization fragment in the recording belongs, gives a confidence level for the identification, and records the actual time at which the sound occurred; finally, the recognition results of the algorithm are combined to calculate, over all recordings from the area, the number of birds of that species emitting song, that is, the number of birds that have entered the breeding period; once, after some period of time, this number exceeds a preset threshold, the species is considered to have entered its breeding period in that area from that time; conversely, once the number has fallen by more than a threshold, the species is considered to have finished its breeding period.
2. The bird voice recognition method for assisting phenological study of the bird breeding period according to claim 1, characterized in that the specific steps of the recognition algorithm are: 1) source separation using semi-supervised non-negative matrix factorization; 2) passing the signal through a low-pass filter and then applying frequency compensation; 3) segmenting the sound: short-time energy is used to find the transition points from silence to vocalization, by first computing the short-time energy of the recording and then locating the sound fragments according to a threshold; 4) feature extraction: overlapping windows are first applied to the sound fragment, each window is called a frame, time-domain and frequency-domain features are extracted from the values in each window, most frequency-domain features being based on the short-time Fourier transform (STFT), and the time-domain and frequency-domain features are then combined into a single vector that serves as the feature vector of the frame; 5) dimensionality reduction and noise reduction, using PCA as the means of dimensionality reduction; 6) building a mathematical model for each bird vocalization using a hidden Markov model (HMM): the model is first initialized with segmental k-means and then trained with the forward-backward algorithm; once the HMMs have been built, a new recording to be processed undergoes source separation, preprocessing, segmentation, feature extraction and PCA, and the resulting feature sequence is compared against each trained HMM; the Viterbi algorithm is used for decoding to obtain a confidence level, and the model with the highest confidence is chosen as the recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710583313.8A CN107369451B (en) | 2017-07-18 | 2017-07-18 | Bird voice recognition method for assisting phenological study of bird breeding period |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710583313.8A CN107369451B (en) | 2017-07-18 | 2017-07-18 | Bird voice recognition method for assisting phenological study of bird breeding period |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107369451A (en) | 2017-11-21 |
CN107369451B (en) | 2020-12-22 |
Family
ID=60308665
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710583313.8A Active CN107369451B (en) | 2017-07-18 | 2017-07-18 | Bird voice recognition method for assisting phenological study of bird breeding period |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107369451B (en) |
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9058384B2 (en) * | 2012-04-05 | 2015-06-16 | Wisconsin Alumni Research Foundation | System and method for identification of highly-variable vocalizations |
US9177559B2 (en) * | 2012-04-24 | 2015-11-03 | Tom Stephenson | Method and apparatus for analyzing animal vocalizations, extracting identification characteristics, and using databases of these characteristics for identifying the species of vocalizing animals |
CN102708860A (en) * | 2012-06-27 | 2012-10-03 | 昆明信诺莱伯科技有限公司 | Method for establishing judgment standard for identifying bird type based on sound signal |
CN102930870A (en) * | 2012-09-27 | 2013-02-13 | 福州大学 | Bird voice recognition method using anti-noise power normalization cepstrum coefficients (APNCC) |
US20130185061A1 (en) * | 2012-10-04 | 2013-07-18 | Medical Privacy Solutions, Llc | Method and apparatus for masking speech in a private environment |
CN103117061A (en) * | 2013-02-05 | 2013-05-22 | 广东欧珀移动通信有限公司 | Method and device for identifying animals based on voice |
CN103489446A (en) * | 2013-10-10 | 2014-01-01 | 福州大学 | Twitter identification method based on self-adaption energy detection under complex environment |
CN103474072A (en) * | 2013-10-11 | 2013-12-25 | 福州大学 | Rapid anti-noise twitter identification method by utilizing textural features and random forest (RF) |
CN104658538A (en) * | 2013-11-18 | 2015-05-27 | 中国计量学院 | Mobile bird recognition method based on birdsong |
CN103985385A (en) * | 2014-05-30 | 2014-08-13 | 安庆师范学院 | Method for identifying Batrachia individual information based on spectral features |
CN104102923A (en) * | 2014-07-16 | 2014-10-15 | 西安建筑科技大学 | Nipponia nippon individual recognition method based on MFCC algorithm |
CN104882144A (en) * | 2015-05-06 | 2015-09-02 | 福州大学 | Animal voice identification method based on double sound spectrogram characteristics |
CN106504762A (en) * | 2016-11-04 | 2017-03-15 | 中南民族大学 | Bird community quantity survey system and method |
Non-Patent Citations (3)
Title |
---|
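刘兴永: "基于隐马尔科夫模型的钢琴音符识别算法研究" [Research on piano note recognition algorithms based on hidden Markov models], 《中国优秀硕士学位论文全文数据库信息科技辑》 [China Master's Theses Full-text Database, Information Science and Technology] * |
王恩泽: "基于鸣声的鸟类智能识别方法研究" [Research on intelligent bird recognition methods based on vocalizations], 《中国优秀硕士学位论文全文数据库信息科技辑》 [China Master's Theses Full-text Database, Information Science and Technology] * |
韩联宪等: "灰胸薮鹛繁殖行为初报" [Preliminary report on the breeding behavior of the grey-breasted liocichla], 《四川动物》 [Sichuan Journal of Zoology] * |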
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108898164A (en) * | 2018-06-11 | 2018-11-27 | 南京理工大学 | A kind of chirping of birds automatic identifying method based on Fusion Features |
CN110120224A (en) * | 2019-05-10 | 2019-08-13 | 平安科技(深圳)有限公司 | Construction method, device, computer equipment and the storage medium of bird sound identification model |
CN110120224B (en) * | 2019-05-10 | 2023-01-20 | 平安科技(深圳)有限公司 | Method and device for constructing bird sound recognition model, computer equipment and storage medium |
CN110335613A (en) * | 2019-05-28 | 2019-10-15 | 广东工业大学 | A kind of birds recognition methods using sound pick-up real-time detection |
CN110335613B (en) * | 2019-05-28 | 2021-07-09 | 广东工业大学 | Bird identification method adopting pickup for real-time detection |
CN113707158A (en) * | 2021-08-02 | 2021-11-26 | 南昌大学 | Power grid harmful bird seed singing recognition method based on VGGish migration learning network |
Also Published As
Publication number | Publication date |
---|---|
CN107369451B (en) | 2020-12-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||