CN107369451A

CN107369451A - Bird sound recognition method for assisting phenological research on the avian breeding period

Info

Publication number: CN107369451A
Application number: CN201710583313.8A
Authority: CN
Inventors: 刘丰; 李晟; 申小莉
Original assignee: BEIJING COMPUTING CENTER
Current assignee: BEIJING COMPUTING CENTER
Priority date: 2017-07-18
Filing date: 2017-07-18
Publication date: 2017-11-21
Anticipated expiration: 2037-07-18
Also published as: CN107369451B

Abstract

A bird sound recognition method for assisting phenological research on the avian breeding period, characterized in that: a field recording is first read, the recording containing a number of bird vocalization segments; a recognition algorithm then identifies the species of the bird that produced each vocalization segment, gives a confidence level for the identification, and records the actual time at which the sound occurred; finally, the recognition results are combined to calculate, over all recordings from the area, the number of birds of the species that are singing, that is, the number of birds that have entered the breeding period. When, after some time, this number exceeds a preset threshold, the species in the area is considered to have begun entering the breeding period from that time; conversely, when the number falls by more than a threshold, the species is considered to have ended its breeding period.

Description

Bird sound recognition method for assisting phenological research on the avian breeding period

Technical field

The present invention relates to the field of bird sound recognition, and specifically to a bird sound recognition method for assisting phenological research on the avian breeding period.

Background art

In biology, bird vocalizations are divided into calls (bird call) and songs (bird song). A song is the vocalization that birds produce during the breeding period. The song pattern of birds of the same species is highly stable, while the songs of different species usually differ markedly. Bird song can therefore be used as a means of identifying bird species.

Phenology is the discipline that studies the relationship between animals and periodic changes in the environment. One of its branches studies the relationship between the breeding period of birds and periodic environmental change. Since the breeding period of birds can be determined by recognizing their sounds, bird sound recognition can be used to assist phenological research on the avian breeding period.

Summary of the invention

The technical problem to be solved by the invention is to provide a bird sound recognition method for assisting phenological research on the avian breeding period.

To solve the above technical problem, the following technical solution is adopted: a bird sound recognition method for assisting phenological research on the avian breeding period, characterized in that a field recording is first read, the recording containing a number of bird vocalization segments; a recognition algorithm then identifies the species of the bird to which each vocalization segment belongs, gives a confidence level for the identification, and records the time at which the vocalization occurs within the recording; finally, combined with the time of the recording, the number of birds of the species that are singing, that is, the number of birds that have entered the breeding period, can be calculated. When, after some time, this number exceeds a preset threshold, the species in the area is considered to have begun entering the breeding period from that time; conversely, when the number falls by more than a threshold, the species is considered to have ended its breeding period.

The specific steps of the recognition algorithm are: 1) semi-supervised non-negative matrix factorization is used for source separation; 2) the signal is passed through a low-pass filter and frequency compensation is then applied; 3) the sound is segmented: short-time energy is used to find the transition points from silence to vocalization, by first computing the short-time energy of the recording and then locating the sound segments according to a threshold; 4) feature extraction: overlapping windows are applied to each sound segment, each window being called a frame; time-domain and frequency-domain features are extracted from the values in each window, most of the frequency-domain features being based on the short-time Fourier transform (STFT); the time-domain and frequency-domain features are then combined into one vector, which serves as the feature vector of the frame; 5) dimensionality reduction and noise reduction: PCA is used as the means of dimensionality reduction; 6) a hidden Markov model (HMM) is used to build a mathematical model for the vocalization of each bird species: the model is first initialized with segmental k-means, and the HMM is then trained with the forward-backward algorithm. Once the HMMs have been built, each new recording to be processed undergoes source separation, pre-processing, segmentation, feature extraction and PCA, and the resulting feature sequence is compared against each trained HMM. The Viterbi algorithm is used for decoding, which yields a confidence level, and the model with the highest confidence is chosen as the recognition result.

Brief description of the drawings

Fig. 1 is a schematic diagram of the technical route of the invention.

Detailed description

The present invention is described in further detail below in conjunction with the accompanying drawings.

A field recording is first read; the recording contains a number of bird vocalization segments. The recognition algorithm then identifies the species of the bird to which each vocalization segment belongs, gives a confidence level for the identification, and records the time at which the vocalization occurs within the recording. Finally, combined with the time of the recording, the number of birds of the species that are singing, that is, the number of birds that have entered the breeding period, can be calculated. When, after some time, this number exceeds a preset threshold, the species in the area is considered to have begun entering the breeding period from that time; conversely, when the number falls by more than a threshold, the species is considered to have ended its breeding period.
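The threshold logic described above can be sketched as follows. This is only an illustration under stated assumptions: the detection tuple layout, the `min_confidence` cut-off, the use of one count per day, and the reading of "falls by more than a threshold" as a drop from the running peak are all choices made for the sketch, not details given in the source. The number of confident song detections is used as a stand-in for the number of singing birds.

```python
from collections import Counter

def daily_song_counts(detections, species, min_confidence=0.5):
    """Count confident song detections of one species per day.

    detections: iterable of (species, confidence, day) tuples; `day` is any
    sortable key such as an ISO date string (hypothetical layout).
    """
    counts = Counter()
    for name, confidence, day in detections:
        if name == species and confidence >= min_confidence:
            counts[day] += 1
    return dict(sorted(counts.items()))

def breeding_period(counts, onset_threshold, drop_threshold):
    """Return (start_day, end_day); either is None if not observed.

    Start: first day whose count exceeds onset_threshold.
    End: first later day whose count lies more than drop_threshold
    below the peak seen since the start (assumed interpretation).
    """
    start = end = None
    peak = 0
    for day, n in counts.items():
        if start is None:
            if n > onset_threshold:
                start, peak = day, n
        else:
            peak = max(peak, n)
            if peak - n > drop_threshold:
                end = day
                break
    return start, end
```

Both thresholds would be set in advance per species and area, as the description states; the sketch leaves their values to the caller.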

The specific steps of the recognition algorithm are:

1) Semi-supervised NMF

Semi-supervised NMF is used for source separation. The sound captured by a recorder is a mixture of several sounds, some of which overlap; source separation is the technique for separating these different sounds from one another.

NMF stands for non-negative matrix factorization. It is regarded here as the most effective method currently available for source separation. It decomposes a sound into a weighted combination of different bases; a group of bases together with the corresponding weights can be taken as one separated source.

Semi-supervised NMF means that training is first carried out on data of a known, specific category to obtain the bases corresponding to that category. This group of bases, together with another group of initial vectors, is then used to run the NMF algorithm on the data to be processed. The pre-trained bases of the known category and their weights form the separated result, which is passed on for subsequent processing.

Semi-supervised NMF gives good separation and also suppresses noise effectively. In some environments it outperforms other noise-reduction methods. Conventional noise-reduction techniques require some knowledge of the properties of the noise, but the conditions under which noise arises are highly uncertain, so its properties cannot be described accurately in advance and conventional techniques perform poorly. The method based on semi-supervised NMF does not need to know the properties of the noise beforehand, and therefore achieves better noise reduction in such cases.
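A minimal sketch of the semi-supervised factorization is given below, assuming a non-negative magnitude spectrogram `V` as input and the standard multiplicative updates for the Euclidean cost. The source does not state the cost function, the update rule, or the number of free bases, so the function name and the parameters `n_free`, `n_iter` and `seed` are illustrative assumptions.

```python
import numpy as np

def semi_supervised_nmf(V, W_known, n_free=2, n_iter=200, seed=0, eps=1e-9):
    """Factor V ~ [W_known | W_free] @ H while keeping W_known fixed.

    V:       non-negative spectrogram, shape (freq_bins, frames).
    W_known: bases trained beforehand on labelled calls, shape (freq_bins, k).
    Returns the part of V explained by the known bases, plus W and H.
    """
    rng = np.random.default_rng(seed)
    k = W_known.shape[1]
    # Known bases first, then randomly initialised free bases.
    W = np.hstack([W_known, rng.random((V.shape[0], n_free)) + eps])
    H = rng.random((k + n_free, V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative update of all weights.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        # Multiplicative update of the bases; only the free ones are kept.
        W_new = W * ((V @ H.T) / (W @ H @ H.T + eps))
        W[:, k:] = W_new[:, k:]
    target = W[:, :k] @ H[:k, :]   # separated result for the known class
    return target, W, H
```

The free bases absorb whatever the known bases cannot explain (other sources and noise), which is why no prior description of the noise is needed in this scheme.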

2) Pre-processing

Pre-processing consists of two parts: the signal is first passed through a low-pass filter, and frequency compensation is then applied.
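The two pre-processing steps might look like the sketch below. The source specifies neither the filter design nor what "frequency compensation" means concretely; here a windowed-sinc FIR low-pass filter is used, and frequency compensation is assumed to be a first-order pre-emphasis filter, which is a common choice in audio front ends. The cutoff, tap count and `alpha` are hypothetical values.

```python
import numpy as np

def low_pass(signal, sample_rate, cutoff_hz, n_taps=101):
    """Windowed-sinc FIR low-pass filter with a Hamming window."""
    fc = cutoff_hz / sample_rate                 # cutoff in cycles per sample
    n = np.arange(n_taps) - (n_taps - 1) / 2     # tap positions centred on 0
    taps = 2 * fc * np.sinc(2 * fc * n) * np.hamming(n_taps)
    taps /= taps.sum()                           # unity gain at 0 Hz
    return np.convolve(signal, taps, mode="same")

def pre_emphasis(signal, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1] (assumed form of frequency compensation)."""
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])
```

A call such as `pre_emphasis(low_pass(x, 16000, 8000 * 0.9))` would apply both steps in the order given in the description; the numeric arguments are placeholders.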

3) Segmentation

A recording is long and contains both silence and vocalizations. The silent parts must first be removed so that only the vocalizations remain, which requires segmenting the sound (segmentation). Short-time energy is used to find the transition points (end points) between silence and vocalization.

The short-time energy of the recording is computed first, and the sound segments are then located according to a threshold.
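The segmentation step can be illustrated with the sketch below, which computes the energy of each frame as the sum of squared samples and then reports runs of frames above a threshold. The frame length, hop size and threshold value are not given in the source and are assumptions here.

```python
import numpy as np

def short_time_energy(signal, frame_len=400, hop=200):
    """Sum of squared samples in each (possibly overlapping) frame."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.array([
        np.sum(signal[i * hop : i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])

def find_segments(energy, threshold):
    """Return (start_frame, end_frame) pairs, end exclusive, where energy > threshold."""
    active = energy > threshold
    # Pad with False so that segments touching either end are closed.
    padded = np.concatenate(([False], active, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[0::2].tolist(), edges[1::2].tolist()))
```

Frame indices convert back to sample positions as `start_frame * hop` and `(end_frame - 1) * hop + frame_len`.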

4) Feature extraction

Features must be extracted from each vocalization segment obtained. Overlapping windows are first applied to the sound segment, each window being called a frame, and time-domain and frequency-domain features are extracted from the values in each window. Most of the frequency-domain features are based on the short-time Fourier transform (STFT). The time-domain and frequency-domain features are then combined into one vector, which serves as the feature vector of the frame.

Time-domain features: zero-crossing rate, short-time energy, entropy of energy.

Frequency-domain features: MFCC, spectral centroid, spectral spread, spectral entropy, spectral flux, spectral rolloff.
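A sketch of the framing and of a few of the listed features follows. It covers only zero-crossing rate, short-time energy, spectral centroid and spectral spread; MFCC, the entropies, flux and rolloff are omitted for brevity. The frame length, hop size, Hann window and the exact feature definitions are assumptions, since the source does not give them.

```python
import numpy as np

def frame_signal(signal, frame_len=512, hop=256):
    """Split a 1-D signal (at least frame_len long) into overlapping frames."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]

def frame_features(frames, sample_rate):
    """One feature vector per frame: [zcr, energy, centroid, spread]."""
    # Time-domain features.
    signs = np.signbit(frames)
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    energy = np.mean(frames ** 2, axis=1)
    # Frequency-domain features from the magnitude spectrum of each frame.
    window = np.hanning(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    total = mag.sum(axis=1) + 1e-12
    centroid = (mag * freqs).sum(axis=1) / total
    spread = np.sqrt((mag * (freqs - centroid[:, None]) ** 2).sum(axis=1) / total)
    # Concatenate time-domain and frequency-domain features per frame.
    return np.column_stack([zcr, energy, centroid, spread])
```

Each row of the returned matrix is the feature vector of one frame, so a vocalization segment becomes a sequence of vectors suitable for the later steps.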

5) PCA

The feature vectors obtained have a high dimension, so computing on them directly is expensive, and they also contain some noise. The data therefore need to be reduced in dimension, and PCA is used here for that purpose.

PCA stands for principal component analysis. It is an effective means of dimensionality reduction that lowers the dimension of the data and the amount of computation, and it also removes much of the noise, thereby improving system performance.
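The dimensionality reduction can be sketched with a singular value decomposition of the mean-centred feature matrix. The number of retained components is not stated in the source and is left as a parameter; the function names are illustrative.

```python
import numpy as np

def pca_fit(X, n_components):
    """Learn the mean and the top principal directions of X (frames x features)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project feature vectors onto the learned principal directions."""
    return (X - mean) @ components.T
```

The mean and components would be fitted once on the training features and then reused unchanged for every new recording, so that training and recognition share the same reduced space.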

6) HMM

HMM stands for hidden Markov model. It is a well-known mathematical model for time-series modelling. Compared with other methods, HMM offers higher recognition efficiency and better robustness.

One HMM is built for the song of each bird species. The model is first initialized with segmental k-means, and the HMM is then trained with the forward-backward algorithm.

After training, the new feature vectors to be identified, once processed by PCA, are decoded as a sequence with the Viterbi algorithm. The Viterbi algorithm yields a probability for each HMM, and the species corresponding to the HMM with the highest probability can be selected as the result.

The HMM stage outputs the bird species and the confidence level.
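The recognition step can be sketched as below: each trained model scores the feature sequence with the Viterbi algorithm in the log domain, and the best-scoring species is returned. Training (segmental k-means initialization and forward-backward re-estimation) is not shown. The source does not specify the emission distribution; diagonal-covariance Gaussian emissions and the parameter layout `(log_pi, log_A, means, variances)` are assumptions of this sketch.

```python
import numpy as np

def log_gaussian(X, means, variances):
    """Log density of every frame under every state; result shape (T, N)."""
    diff = X[:, None, :] - means[None, :, :]
    return -0.5 * np.sum(diff ** 2 / variances + np.log(2 * np.pi * variances), axis=2)

def viterbi_log_score(X, log_pi, log_A, means, variances):
    """Log probability of the single best state path for sequence X.

    log_pi: (N,) initial log probabilities; log_A[i, j]: log P(j | i).
    """
    log_b = log_gaussian(X, means, variances)
    delta = log_pi + log_b[0]
    for t in range(1, len(X)):
        # Best predecessor for each state j, then emit frame t.
        delta = np.max(delta[:, None] + log_A, axis=0) + log_b[t]
    return float(np.max(delta))

def classify(X, models):
    """models: dict mapping species name to its (log_pi, log_A, means, variances)."""
    scores = {name: viterbi_log_score(X, *params) for name, params in models.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

The returned score of the winning model plays the role of the confidence level mentioned above; how it would be normalized for comparison across recordings of different lengths is left open here.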

The embodiments described above merely illustrate the principles and effects of the present invention and some of the implementations used. A person of ordinary skill in the art may make various modifications and improvements without departing from the inventive concept, and all of these fall within the scope of protection of the present invention.

Claims

1. A bird sound recognition method for assisting phenological research on the avian breeding period, characterized in that: a field recording is first read, the recording containing a number of bird vocalization segments; a recognition algorithm then identifies the species of the bird to which each vocalization segment in the recording belongs, gives a confidence level for the identification, and records the actual time at which the sound occurred; finally, the recognition results of the algorithm are combined to calculate, over all recordings from the area, the number of birds of the species that are singing, that is, the number of birds that have entered the breeding period; when, after some time, this number exceeds a preset threshold, the species in the area is considered to have begun entering the breeding period from that time; conversely, when the number falls by more than a threshold, the species is considered to have ended its breeding period.

2. The bird sound recognition method for assisting phenological research on the avian breeding period according to claim 1, characterized in that the specific steps of the recognition algorithm are: 1) semi-supervised non-negative matrix factorization is used for source separation; 2) the signal is passed through a low-pass filter and frequency compensation is then applied; 3) the sound is segmented: short-time energy is used to find the transition points from silence to vocalization, by first computing the short-time energy of the recording and then locating the sound segments according to a threshold; 4) feature extraction: overlapping windows are applied to each sound segment, each window being called a frame; time-domain and frequency-domain features are extracted from the values in each window, most of the frequency-domain features being based on the short-time Fourier transform (STFT); the time-domain and frequency-domain features are then combined into one vector, which serves as the feature vector of the frame; 5) dimensionality reduction and noise reduction: PCA is used as the means of dimensionality reduction; 6) a hidden Markov model (HMM) is used to build a mathematical model for the vocalization of each bird species: the model is first initialized with segmental k-means, and the HMM is then trained with the forward-backward algorithm; once the HMMs have been built, each new recording to be processed undergoes source separation, pre-processing, segmentation, feature extraction and PCA, and the resulting feature sequence is compared against each trained HMM; the Viterbi algorithm is used for decoding, which yields a confidence level, and the model with the highest confidence is chosen as the recognition result.