CN107369451A - Bird sound recognition method for assisting phenological research on the bird breeding period - Google Patents

Info

Publication number
CN107369451A
Authority
CN
China
Prior art keywords
birds
sound
recording
fragment
breeding period
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710583313.8A
Other languages
Chinese (zh)
Other versions
CN107369451B (en)
Inventor
刘丰
李晟
申小莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEJING COMPUTING CENTER
Original Assignee
BEJING COMPUTING CENTER
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEJING COMPUTING CENTER
Priority to CN201710583313.8A
Publication of CN107369451A
Application granted
Publication of CN107369451B
Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification techniques
    • G10L17/26: Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
    • G10L17/04: Training, enrolment or model building
    • G10L17/16: Hidden Markov models [HMM]

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A bird sound recognition method for assisting phenological research on the bird breeding period, characterized in that field recording fragments, which contain some bird vocalizations, are first read; a recognition algorithm then identifies the species of bird to which each vocalization fragment in the recording belongs, gives a confidence level for the identification, and records the actual time at which the sound occurred; finally, the recognition results are combined to calculate the number of birds of that species singing across all recordings from the area, that is, the number of birds that have entered the breeding period. Once this number exceeds a preset threshold at some point in time, the species can be considered to have entered the breeding period in the area from that time; conversely, once the number decreases by more than a threshold, the species can be considered to have finished its breeding period.

Description

Bird sound recognition method for assisting phenological research on the bird breeding period
Technical Field
The present invention relates to the technical field of bird sound recognition, and specifically to a bird sound recognition method for assisting phenological research on the bird breeding period.
Background Art
In biology, bird vocalizations are divided into calls (bird call) and songs (bird song). A song is the vocalization that birds produce during the breeding period. The song pattern of birds of the same species is very stable, while the songs of different species usually differ considerably. Bird song can therefore be used as a means of identifying bird species.
Phenology is the discipline that studies the relationship between animals and periodic changes in the environment. One of its branches studies the relationship between the breeding period of birds and periodic environmental change. The breeding period of birds can be determined by recognizing their sounds. Bird sound recognition can therefore assist phenological research on the bird breeding period.
Summary of the Invention
The technical problem to be solved by the present invention is to provide a bird sound recognition method for assisting phenological research on the bird breeding period.
To solve the above technical problem, the following technical solution is adopted: a bird sound recognition method for assisting phenological research on the bird breeding period, characterized in that field recording fragments, which contain some bird vocalizations, are first read; the recognition algorithm then identifies the species of bird to which each vocalization fragment in the recording belongs, gives a confidence level for the identification, and records the time at which the vocalization occurs within the recording fragment; finally, combined with the recording time, the number of birds of that species that are singing, that is, the number of birds that have entered the breeding period, can be calculated. Once this number exceeds a preset threshold at some point in time, the species can be considered to have entered the breeding period in the area from that time; conversely, once the number decreases by more than a threshold, the species can be considered to have finished its breeding period.
The specific steps of the recognition algorithm are: 1) source separation is performed using semi-supervised non-negative matrix factorization (NMF); 2) the signal is passed through a low-pass filter, and frequency compensation is then applied; 3) the sound is segmented: short-time energy is used to find the transition points from silence to vocalization; the short-time energy of the recording is calculated first, and the sound fragments are then found according to a threshold; 4) feature extraction: overlapping windows are first applied to each sound fragment, each window forming one frame; time-domain and frequency-domain features are extracted from the values in each window, most of the frequency-domain features being based on the short-time Fourier transform (STFT); the time-domain and frequency-domain features are then combined into one vector, which serves as the feature vector of that frame; 5) dimensionality reduction and noise reduction: PCA is used as the means of dimensionality reduction; 6) a Hidden Markov Model (HMM) is used to build a mathematical model of each bird's vocalization: the model is first initialized using segmental k-means, and the HMM is then trained using the forward-backward algorithm. Once the HMMs have been built, each new recording to be processed undergoes source separation, preprocessing, segmentation, feature extraction and PCA, and the resulting feature sequence is compared against each trained HMM. Decoding is performed with the Viterbi algorithm to obtain a confidence level, and the model with the highest confidence is chosen as the recognition result.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the technical route of the present invention.
Detailed Description
The present invention is described in further detail below with reference to the accompanying drawing.
Field recording fragments, which contain some bird vocalizations, are first read. The recognition algorithm then identifies the species of bird to which each vocalization fragment in the recording belongs, gives a confidence level for the identification, and records the time at which the vocalization occurs within the recording fragment. Finally, combined with the recording time, the number of birds of that species that are singing, that is, the number of birds that have entered the breeding period, can be calculated. Once this number exceeds a preset threshold at some point in time, the species can be considered to have entered the breeding period in the area from that time; conversely, once the number decreases by more than a threshold, the species can be considered to have finished its breeding period.
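The threshold logic in the preceding paragraph can be sketched as follows. This is only an illustration under stated assumptions: the function name, the per-day aggregation of detections, and the reading of "decreases by more than a threshold" as a drop measured from the running peak are all hypothetical choices, not details given in the patent.

```python
def breeding_period(daily_counts, start_threshold, drop_threshold):
    """Infer (start_day, end_day) indices of a breeding period for one species.

    daily_counts: number of song detections per recording day (assumed input).
    start_threshold: count that must be exceeded to mark the start.
    drop_threshold: decrease from the running peak that marks the end
        (an assumed interpretation of the patent's wording).
    An index is None when the corresponding event is never observed.
    """
    start = end = None
    peak = 0
    for day, count in enumerate(daily_counts):
        if start is None:
            # Breeding period has not started yet: wait for the count to exceed the threshold.
            if count > start_threshold:
                start = day
                peak = count
        else:
            # Track the highest count seen so far and test the drop from it.
            peak = max(peak, count)
            if peak - count > drop_threshold:
                end = day
                break
    return start, end


counts = [0, 1, 2, 6, 9, 12, 11, 7, 3, 1]
print(breeding_period(counts, start_threshold=5, drop_threshold=6))
```

In practice the two thresholds would have to be chosen per species and per site; the patent only states that they are set in advance.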
The specific steps of the recognition algorithm are:
1) Semi-supervised NMF
Semi-supervised NMF is used for source separation. The sound captured by a recorder is a mixture of multiple sounds, some of which overlap. Source separation is the technique of separating these different sounds.
NMF stands for non-negative matrix factorization. It is currently the most effective method for source separation. It decomposes a sound into a weighted combination of different bases. A set of bases together with the corresponding weights constitutes one separated source.
Semi-supervised NMF means first training on data of a known, specific class to obtain the bases corresponding to that class, then adding another set of initial vectors to this set of bases and applying the NMF algorithm to the data to be processed. The pre-trained bases of the known class and their weights give the separated result, which is passed on to subsequent processing.
Semi-supervised NMF gives good separation and can also suppress noise effectively. In some environments it outperforms other noise-reduction methods. Traditional noise-reduction methods require prior knowledge of the properties of the noise, but the conditions under which noise arises are highly uncertain, so those properties cannot be described accurately in advance and traditional noise reduction performs poorly. A method based on semi-supervised NMF does not need to know the properties of the noise in advance, and its noise-reduction performance is therefore better.
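A minimal sketch of this idea, assuming a non-negative magnitude spectrogram stored as a list of rows (frequency by time) and the standard multiplicative update rules for the Euclidean cost. The function names, the iteration count and the use of plain Python lists are illustrative assumptions; the patent does not specify the cost function or the update rule.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def semi_supervised_nmf(v, w_known, n_free, n_iter=200, eps=1e-9, seed=0):
    """Approximate v by [w_known | w_free] @ h, keeping w_known fixed.

    v: non-negative spectrogram (freq x time).
    w_known: bases trained beforehand on a known class (freq x n_known).
    n_free: number of extra bases that absorb everything else (assumed parameter).
    Returns the part of v explained by the known bases, plus w_free and h.
    """
    rng = random.Random(seed)
    n_freq, n_time = len(v), len(v[0])
    n_known = len(w_known[0])
    w_free = [[rng.random() + eps for _ in range(n_free)] for _ in range(n_freq)]
    h = [[rng.random() + eps for _ in range(n_time)] for _ in range(n_known + n_free)]
    for _ in range(n_iter):
        w = [wk + wf for wk, wf in zip(w_known, w_free)]
        wt = transpose(w)
        # Weight update: H <- H * (W^T V) / (W^T W H)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_time)]
             for i in range(len(h))]
        # Basis update, applied to the free columns only: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(matmul(w, h), ht)
        w_free = [[w_free[i][k] * num[i][n_known + k] / (den[i][n_known + k] + eps)
                   for k in range(n_free)] for i in range(n_freq)]
    # The separated target is rebuilt from the pre-trained bases and their weights.
    target = matmul(w_known, h[:n_known])
    return target, w_free, h
```

Only the free bases and the weights are updated, so whatever the pre-trained bases cannot explain (other sources, noise) ends up in the free part and is discarded when the target is rebuilt.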
2) Preprocessing
Preprocessing consists of two steps. The signal is first passed through a low-pass filter, and frequency compensation is then applied.
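The two preprocessing steps might look like the sketch below. The patent does not say which low-pass filter or which form of frequency compensation is used, so a first-order recursive low-pass and a conventional pre-emphasis filter are assumed here purely for illustration, as are the coefficient values.

```python
def low_pass(signal, alpha=0.5):
    """First-order recursive low-pass: y[n] = alpha * x[n] + (1 - alpha) * y[n-1].

    alpha is an assumed smoothing coefficient in (0, 1].
    """
    out, prev = [], 0.0
    for x in signal:
        prev = alpha * x + (1.0 - alpha) * prev
        out.append(prev)
    return out


def pre_emphasis(signal, coeff=0.97):
    """Pre-emphasis y[n] = x[n] - coeff * x[n-1], assumed here as the frequency compensation."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - coeff * signal[n - 1]
                          for n in range(1, len(signal))]


def preprocess(signal):
    """Apply the low-pass filter first, then the compensation, in the order the text gives."""
    return pre_emphasis(low_pass(signal))
```

A real deployment would more likely use a designed filter with a cutoff chosen for the target species' frequency range; the structure of the step stays the same.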
3) Segmentation
A recording is long and contains both silence and vocalizations. The silent parts must first be removed so that only the vocalizations remain, which requires segmenting the sound (segmentation). Short-time energy is used to find the transition points (end points) from silence to vocalization.
The short-time energy of the recording is calculated first, and the sound fragments are then found according to a threshold.
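As an illustration of this step, the sketch below uses the common definition of short-time energy, the sum of squared samples within each frame, and marks runs of frames whose energy exceeds a threshold. The frame length, hop size and threshold are assumed parameters; the patent's own formula is not reproduced in this text.

```python
def short_time_energy(signal, frame_len, hop):
    """Energy of each frame: sum of squared samples in the frame."""
    return [sum(x * x for x in signal[s:s + frame_len])
            for s in range(0, len(signal) - frame_len + 1, hop)]


def find_segments(energy, threshold):
    """Return (start_frame, end_frame) pairs, end exclusive, where energy > threshold."""
    segments, start = [], None
    for i, e in enumerate(energy):
        if e > threshold and start is None:
            start = i                      # silence -> vocalization transition
        elif e <= threshold and start is not None:
            segments.append((start, i))    # vocalization -> silence transition
            start = None
    if start is not None:
        segments.append((start, len(energy)))
    return segments


signal = [0.0] * 4 + [1.0] * 4 + [0.0] * 4
energy = short_time_energy(signal, frame_len=2, hop=2)
print(find_segments(energy, threshold=1.0))
```

Frame indices can be converted back to sample positions by multiplying by the hop size.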
4) Feature extraction
Features must be extracted from each vocalization fragment obtained. Overlapping windows are first applied to the sound fragment; each window is called a frame, and time-domain and frequency-domain features are extracted from the values in each window. Most of the frequency-domain features are based on the short-time Fourier transform (STFT). The time-domain and frequency-domain features are then combined into one vector, which serves as the feature vector of that frame.
Time-domain features: zero crossing rate, short-time energy, entropy of energy.
Frequency-domain features: MFCC, spectral centroid, spectral spread, spectral entropy, spectral flux, spectral rolloff.
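A reduced sketch of per-frame feature extraction, covering only three of the features listed above (zero crossing rate, short-time energy, spectral centroid). The Hann window, the direct DFT and all function names are assumptions made to keep the example dependency-free; MFCC and the remaining spectral features are omitted.

```python
import cmath
import math


def split_frames(signal, frame_len, hop):
    """Cut the signal into overlapping frames (overlap = frame_len - hop)."""
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def magnitude_spectrum(frame):
    """Magnitude of the DFT of a Hann-windowed frame, bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed)))
            for k in range(n // 2 + 1)]


def spectral_centroid(mag, sample_rate, frame_len):
    """Magnitude-weighted mean frequency in Hz."""
    total = sum(mag)
    if total == 0:
        return 0.0
    return sum(k * sample_rate / frame_len * m for k, m in enumerate(mag)) / total


def frame_features(frame, sample_rate):
    """Concatenate time-domain and frequency-domain features into one vector."""
    mag = magnitude_spectrum(frame)
    return [zero_crossing_rate(frame),
            sum(x * x for x in frame),
            spectral_centroid(mag, sample_rate, len(frame))]
```

Applying `frame_features` to every frame returned by `split_frames` yields the feature sequence that the later steps consume.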
5) PCA
The feature vectors obtained are of high dimension, so computing on them directly is expensive, and they also contain some noise. The data therefore needs dimensionality reduction, and PCA is used here for that purpose.
PCA stands for principal component analysis. It is an effective means of dimensionality reduction that lowers the data dimension and the amount of computation. It can also remove much of the noise, thereby improving system performance.
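The following sketch shows one way to compute a PCA projection without external libraries, using power iteration with deflation on the covariance matrix. This numerical method, the function name and the parameters are assumptions for illustration; the patent names PCA but not how it is computed.

```python
import math
import random


def pca(data, n_components, n_iter=200, seed=0):
    """Project rows of data onto their leading principal components.

    Returns (projected, components, mean). Uses power iteration with
    deflation, which is adequate for a small number of components.
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components = []
    for _ in range(n_components):
        vec = [rng.random() + 0.1 for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in vec))
        vec = [x / norm for x in vec]
        for _ in range(n_iter):
            nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0:
                break  # no variance left in the deflated matrix
            vec = [x / norm for x in nxt]
        eigval = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(d))
                     for i in range(d))
        components.append(vec)
        # Deflation: remove the variance explained by this component.
        cov = [[cov[i][j] - eigval * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(r[j] * c[j] for j in range(d)) for c in components]
                 for r in centered]
    return projected, components, mean
```

The mean and components learned on training data would be reused unchanged when projecting the features of a new recording.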
6) HMM
HMM stands for Hidden Markov Model, a well-known mathematical model for time-series modeling. Compared with other methods, HMMs offer higher recognition efficiency and better robustness.
One HMM is built for the song of each bird species. The model is first initialized using segmental k-means, and the HMM is then trained using the forward-backward algorithm.
After training, the new feature vectors to be recognized, once processed by PCA, are decoded with the Viterbi algorithm. The Viterbi algorithm yields a probability, and the species corresponding to the HMM with the highest probability can be selected as the result.
The HMM stage outputs the bird species and the confidence level.
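The recognition step can be illustrated with a simplified sketch that scores one observation sequence against several already-trained HMMs and picks the best. To stay short it assumes discrete observation symbols (for example, feature vectors quantized beforehand), whereas the patent works with continuous feature vectors; training by segmental k-means and forward-backward is not shown. All names and model parameters are hypothetical.

```python
import math


def _log(p):
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi_log_prob(obs, start, trans, emit):
    """Log-probability of the most likely state path for a discrete HMM.

    obs: sequence of observation symbol indices.
    start[s]: initial probability of state s.
    trans[p][s]: probability of moving from state p to state s.
    emit[s][o]: probability that state s emits symbol o.
    """
    n_states = len(start)
    delta = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n_states)]
    for o in obs[1:]:
        delta = [max(delta[p] + _log(trans[p][s]) for p in range(n_states))
                 + _log(emit[s][o])
                 for s in range(n_states)]
    return max(delta)


def classify(obs, models):
    """Return (species, score) for the model with the highest Viterbi score."""
    scores = {name: viterbi_log_prob(obs, *params) for name, params in models.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Two toy species models: (start, trans, emit); values are made up for illustration.
models = {
    "species_a": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.9, 0.1], [0.8, 0.2]]),
    "species_b": ([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.2, 0.8]]),
}
print(classify([0, 0, 0], models)[0])
```

The returned log-probability plays the role of the confidence level; working in the log domain avoids numerical underflow on long sequences.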
The above embodiments merely illustrate the principles and effects of the present invention and some of the embodiments used. A person of ordinary skill in the art may make a number of variations and improvements without departing from the inventive concept of the present invention, and these all fall within the scope of protection of the present invention.

Claims (2)

1. A bird sound recognition method for assisting phenological research on the bird breeding period, characterized in that field recording fragments, which contain some bird vocalizations, are first read; a recognition algorithm then identifies the species of bird to which each vocalization fragment in the recording belongs, gives a confidence level for the identification, and records the actual time at which the sound occurred; finally, the recognition results are combined to calculate the number of birds of that species singing across all recordings from the area, that is, the number of birds that have entered the breeding period; once this number exceeds a preset threshold at some point in time, the species can be considered to have entered the breeding period in the area from that time; conversely, once the number decreases by more than a threshold, the species can be considered to have finished its breeding period.
2. The bird sound recognition method for assisting phenological research on the bird breeding period according to claim 1, characterized in that the specific steps of the recognition algorithm are: 1) source separation is performed using semi-supervised non-negative matrix factorization (NMF); 2) the signal is passed through a low-pass filter, and frequency compensation is then applied; 3) the sound is segmented: short-time energy is used to find the transition points from silence to vocalization; the short-time energy of the recording is calculated first, and the sound fragments are then found according to a threshold; 4) feature extraction: overlapping windows are first applied to each sound fragment, each window forming one frame; time-domain and frequency-domain features are extracted from the values in each window, most of the frequency-domain features being based on the short-time Fourier transform (STFT); the time-domain and frequency-domain features are then combined into one vector, which serves as the feature vector of that frame; 5) dimensionality reduction and noise reduction: PCA is used as the means of dimensionality reduction; 6) a Hidden Markov Model (HMM) is used to build a mathematical model of each bird's vocalization: the model is first initialized using segmental k-means, and the HMM is then trained using the forward-backward algorithm; once the HMMs have been built, each new recording to be processed undergoes source separation, preprocessing, segmentation, feature extraction and PCA, and the resulting feature sequence is compared against each trained HMM; decoding is performed with the Viterbi algorithm to obtain a confidence level, and the model with the highest confidence is chosen as the recognition result.
CN201710583313.8A 2017-07-18 2017-07-18 Bird voice recognition method for assisting phenological study of bird breeding period Active CN107369451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710583313.8A CN107369451B (en) 2017-07-18 2017-07-18 Bird voice recognition method for assisting phenological study of bird breeding period

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710583313.8A CN107369451B (en) 2017-07-18 2017-07-18 Bird voice recognition method for assisting phenological study of bird breeding period

Publications (2)

Publication Number Publication Date
CN107369451A true CN107369451A (en) 2017-11-21
CN107369451B CN107369451B (en) 2020-12-22

Family

ID=60308665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710583313.8A Active CN107369451B (en) 2017-07-18 2017-07-18 Bird voice recognition method for assisting phenological study of bird breeding period

Country Status (1)

Country Link
CN (1) CN107369451B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708860A (en) * 2012-06-27 2012-10-03 昆明信诺莱伯科技有限公司 Method for establishing judgment standard for identifying bird type based on sound signal
CN102930870A (en) * 2012-09-27 2013-02-13 福州大学 Bird voice recognition method using anti-noise power normalization cepstrum coefficients (APNCC)
CN103117061A (en) * 2013-02-05 2013-05-22 广东欧珀移动通信有限公司 Method and device for identifying animals based on voice
US20130185061A1 (en) * 2012-10-04 2013-07-18 Medical Privacy Solutions, Llc Method and apparatus for masking speech in a private environment
CN103474072A (en) * 2013-10-11 2013-12-25 福州大学 Rapid anti-noise twitter identification method by utilizing textural features and random forest (RF)
CN103489446A (en) * 2013-10-10 2014-01-01 福州大学 Twitter identification method based on self-adaption energy detection under complex environment
CN103985385A (en) * 2014-05-30 2014-08-13 安庆师范学院 Method for identifying Batrachia individual information based on spectral features
CN104102923A (en) * 2014-07-16 2014-10-15 西安建筑科技大学 Nipponia nippon individual recognition method based on MFCC algorithm
CN104658538A (en) * 2013-11-18 2015-05-27 中国计量学院 Mobile bird recognition method based on birdsong
US9058384B2 (en) * 2012-04-05 2015-06-16 Wisconsin Alumni Research Foundation System and method for identification of highly-variable vocalizations
CN104882144A (en) * 2015-05-06 2015-09-02 福州大学 Animal voice identification method based on double sound spectrogram characteristics
US9177559B2 (en) * 2012-04-24 2015-11-03 Tom Stephenson Method and apparatus for analyzing animal vocalizations, extracting identification characteristics, and using databases of these characteristics for identifying the species of vocalizing animals
CN106504762A (en) * 2016-11-04 2017-03-15 中南民族大学 Bird community quantity survey system and method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
刘兴永: "基于隐马尔科夫模型的钢琴音符识别算法研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 *
王恩泽: "基于鸣声的鸟类智能识别方法研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 *
韩联宪等: "灰胸薮鹛繁殖行为初报", 《四川动物》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108898164A (en) * 2018-06-11 2018-11-27 南京理工大学 A kind of chirping of birds automatic identifying method based on Fusion Features
CN110120224A (en) * 2019-05-10 2019-08-13 平安科技(深圳)有限公司 Construction method, device, computer equipment and the storage medium of bird sound identification model
CN110120224B (en) * 2019-05-10 2023-01-20 平安科技(深圳)有限公司 Method and device for constructing bird sound recognition model, computer equipment and storage medium
CN110335613A (en) * 2019-05-28 2019-10-15 广东工业大学 A kind of birds recognition methods using sound pick-up real-time detection
CN110335613B (en) * 2019-05-28 2021-07-09 广东工业大学 Bird identification method adopting pickup for real-time detection
CN113707158A (en) * 2021-08-02 2021-11-26 南昌大学 Power grid harmful bird seed singing recognition method based on VGGish migration learning network

Also Published As

Publication number Publication date
CN107369451B (en) 2020-12-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant