CN115910077A - Insect identification method based on deep learning of sound - Google Patents
- Publication number
- CN115910077A (application number CN202211425420.5A)
- Authority
- CN
- China
- Prior art keywords
- sound
- insect
- signal
- sample
- insects
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A50/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
- Y02A50/30—Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change
Abstract
The invention discloses an insect identification method based on deep learning of sound, which comprises the following steps: S1, carrying out induction capture, storage and sound collection on insects to obtain an initial insect sound sample; S2, preprocessing the collected initial insect sound sample to obtain the required insect sound sample; S3, extracting features from the insect sound sample through MFCC to obtain a feature sample; and S4, performing sound classification on the feature samples by using a Gaussian mixture model. The invention has the beneficial effects that: a signal parameterization method and an advanced pattern recognition technology realize the identification of insect sounds, with MFCC as the sound feature and GMM as the classifier; the average identification rate obtained when identifying the sounds of several types of insects is 98.95%, the time required to identify a sound sample of about 1 s is about 300 ms, and good performance is shown in terms of identification accuracy and identification time.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an insect identification method based on deep learning of sound.
Background
The crop diseases and insect pests have the characteristics of multiple types, large influence and frequent outbreak and disaster. During the growth process of crops, the crops are often damaged by different kinds of insects in the field, the quality and the yield of the crops are seriously influenced, and huge economic losses are caused. Therefore, timely and accurate identification of crop insects is important, people often use chemical pesticides to control pests according to self experience in actual production, so that the problems of pesticide residue, environmental pollution and the like are serious, the crop insects are effectively and accurately identified, effective control measures are taken timely, economic loss caused by pests can be reduced, and stable and high yield of crops is guaranteed.
At present, agricultural insect identification and diagnosis work in China mainly depends on manual identification or insect expert identification, the method requires very rich experience or solid insect classification professional knowledge, and the problems of time and labor consumption, low efficiency, high misjudgment rate and the like exist, so that the requirement of modern agricultural development is difficult to meet. In recent years, the convolutional neural network in deep learning has made great progress in some fields such as image classification, voice recognition and pedestrian detection, however, the application of the related technology in the field of agricultural insect recognition and monitoring is deficient. The crop insect automatic identification technology based on deep learning has the advantages of accurate identification, high intelligent degree and the like, on one hand, insect forecasting personnel get rid of mechanical and tedious insect identification and statistics work, the defect of manual detection is reduced, human resources and time cost are saved, and the insect identification rate is improved; on the other hand, the method is also beneficial to protecting the ecological environment, ensures the food safety, enhances the sustainable development of agriculture, provides timely and effective information for the prevention and treatment of insects, and has great practical value.
However, in a practical complicated farmland environment during a specific use operation, the pest image is interfered by the background, the identification performance of the pest image is limited, and the sound has an important significance in insect species identification as one of the expression characteristics of the insect. The sound detection comprises the steps of converting collected sounding signals into electric signals, amplifying the electric signals, purifying the electric signals by an electronic filter, extracting the sound emitted by insects, and determining the types and the quantity of the insects according to the sound frequency and the characteristic value of signal pulses;
However, in the prior art, although there has been much research in the field of human speech recognition, automatic acoustic species identification is still considered a marginal field of pattern recognition and the research literature is relatively sparse; an insect identification method based on deep learning of sound is therefore urgently needed.
Disclosure of Invention
Aiming at the problems in the related art, the invention provides an insect identification method based on deep learning of sound, so as to overcome the technical problems in the prior related art.
Therefore, the invention adopts the following specific technical scheme:
an insect identification method based on deep learning of sound, comprising the following steps:
s1, carrying out induction capture, storage and sound collection on insects to obtain an initial insect sound sample;
s2, preprocessing the collected initial insect sound sample, and obtaining a required insect sound sample;
S3, extracting the features from the insect sound sample through MFCC to obtain a feature sample;
and S4, performing sound classification on the characteristic samples by using a Gaussian mixture model.
Further, the step of carrying out induction capture, storage and sound collection on insects to obtain an initial insect sound sample comprises the following steps:
s11, presetting a catching cage, and inducing insects to enter the catching cage by using an inducer in the catching cage;
s12, trapping insects into the catching cage, and collecting the sounding signals of the insects by using a sensor;
and S13, storing the sounding signal to a database.
Further, transmitting the sounding signal to a recording device and storing it to a database includes the steps of:
s131, deleting the incomplete and damaged sounding signals, and storing the screened sounding signals into a database;
and S132, dividing the data in the database into a training set and a test set, wherein the length of a sample in the training set is far longer than that of a sample in the test set.
Further, the preprocessing of the collected insect sound samples comprises the following steps:
S21, denoising the collected insect sound samples to obtain samples to be processed;
S22, cutting each sample to be processed into a plurality of sound sections;
S23, detecting the sound segments and silent segments within the sound sections, and carrying out preprocessing detection on each sound segment;
and S24, taking the sound segments after preprocessing detection as the required insect sound samples.
Further, the preprocessing detection comprises calculation of Mel frequency cepstral coefficients;
normalization: dividing each sample to be processed by the amplitude peak of its sound segment;
pre-emphasis: boosting the high-frequency components with a pre-emphasis factor and a pre-emphasis filter while keeping the low-frequency components at their original level, flattening the spectrum of the signal for spectrum analysis and vocal tract parameter analysis;
framing and windowing: overlapping and framing the signal and the Hamming window to change the signal into a short-time stable signal;
fourier transform: performing fast Fourier transform on the frame signal, and converting a time domain signal into a frequency domain signal;
mel filter bank: filtering the frequency spectrum coefficient obtained by Fourier transform by using a triangular filter to obtain a group of coefficients, wherein the span of the triangular filter is evenly distributed on the Mel axis;
log power spectrum: taking logarithm of the output of each filter to obtain a corresponding logarithm power spectrum;
discrete cosine transform: transforming the logarithmic power spectrum to a time domain by utilizing discrete cosine transform, wherein the amplitude of the obtained spectrum is the original Mel frequency cepstrum coefficient to obtain a static signal;
first order difference: computing the first-order difference of the static Mel frequency cepstrum coefficients to obtain a dynamic signal;
merging: combining the static signal and the dynamic signal to be used as an effective sounding sample signal;
segmenting: segmenting the data of the test set, evenly dividing longer sounding samples into short sounding samples.
Further, the pre-emphasis factor is calculated as follows:
α=exp(-2πFΔt)
in the formula, Δt is the sampling period of the sounding signal, F is the frequency, and exp denotes the exponential function;
the calculation formula of the pre-emphasis filter is as follows:
H(z) = 1 - αz^(-1)
where H(z) is the transfer function of the pre-emphasis filter, z is the z-transform variable, and α is the pre-emphasis factor.
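As an illustrative sketch of this step (the function name and the typical value α ≈ 0.97 are assumptions, not values fixed by the patent), the pre-emphasis filter H(z) = 1 - αz^(-1) can be applied in the time domain as:

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    # y[n] = x[n] - alpha * x[n-1]: boosts high-frequency components
    # while low-frequency components stay near their original level.
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

# A slowly varying signal is attenuated; a sudden jump passes through.
y_pre = pre_emphasis([1.0, 1.0, 1.0, 5.0])
```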
Further, extracting the features from the required insect sound sample through MFCC comprises the following steps:
S31, performing spectrum analysis and vocal tract parameter analysis on the required insect sound samples;
s32, dividing the sounding signal into short periods, calling each short period as a frame, and cutting a sounding signal waveform containing N samples of the frame length from the sounding signal;
s33, multiplying the time window function by the initial sounding signal;
s34, taking the frame length N =256, performing FFT (fast Fourier transform) on each frame, and performing modular squaring on a frequency spectrum to obtain a discrete power spectrum;
s35, mapping the discrete power spectrum to a Mel frequency scale, and filtering by using M Mel band-pass filters to obtain a group of coefficients;
s36, taking the logarithm of the output of each Mel band-pass filter to obtain a corresponding Mel logarithmic power spectrum;
and S37, performing discrete cosine transform on the Mel logarithmic power spectrum to obtain the amplitude of the spectrum of the sounding sample.
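Steps S31 to S37 can be sketched end to end as follows. This is a minimal illustration under assumptions not stated in the patent (26 mel filters, 13 cepstral coefficients, an 8 kHz sampling rate for the toy input, and illustrative helper names); it is not the patent's exact implementation:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sr, n_filters=26, n_ceps=13):
    n = len(frame)
    windowed = frame * np.hamming(n)                 # windowing (S33)
    power = np.abs(np.fft.rfft(windowed)) ** 2 / n   # discrete power spectrum (S34)
    # M triangular filters spaced evenly on the mel axis (S35)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, len(power)))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    log_energy = np.log(fbank @ power + 1e-10)       # log power spectrum (S36)
    # DCT back to the cepstral domain (S37)
    c = np.arange(n_ceps)[:, None]
    m = np.arange(n_filters)[None, :]
    return np.cos(np.pi * c * (2 * m + 1) / (2 * n_filters)) @ log_energy

sr, N = 8000, 256                                    # frame length N = 256 (S34)
t = np.arange(N) / sr
coeffs = mfcc_frame(np.sin(2 * np.pi * 440.0 * t), sr)
```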
Further, the gaussian mixture model is constructed as follows:
p(x|λ) = Σ_{i=1}^{M} w_i · p_i(x)
in the formula, x is a D-dimensional random vector, p_i(x) (i = 1, 2, ..., M) is the density of each component, each component being a D-variate Gaussian function, w_i is the mixture weight, and λ denotes the model parameters;
the density calculation formula of each component is as follows:
p_i(x) = exp(-(1/2) (x - μ_i)ᵀ Σ_i^(-1) (x - μ_i)) / ((2π)^(D/2) |Σ_i|^(1/2))
in the formula, μ_i is the mean vector, Σ_i is the covariance matrix, the mixture weights satisfy the relation Σ_{i=1}^{M} w_i = 1, D is the dimension, and exp denotes the exponential function;
and the density of the Gaussian mixture model is parameterized by mean vectors, covariance matrixes and mixture weights of all components.
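A minimal numeric sketch of the mixture density above, restricted to diagonal covariance matrices for brevity (the patent does not fix the covariance form, and all names and values here are illustrative):

```python
import numpy as np

def gmm_density(x, weights, means, variances):
    # p(x|lambda) = sum_i w_i * p_i(x), with diagonal covariances:
    # p_i(x) = exp(-0.5 * sum((x - mu_i)^2 / var_i)) / sqrt((2*pi)^D * prod(var_i))
    x = np.asarray(x, dtype=float)
    D = x.size
    total = 0.0
    for w, mu, var in zip(weights, means, variances):
        var = np.asarray(var, dtype=float)
        norm = 1.0 / np.sqrt((2.0 * np.pi) ** D * np.prod(var))
        total += w * norm * np.exp(-0.5 * np.sum((x - mu) ** 2 / var))
    return total

weights = [0.5, 0.5]                 # mixture weights sum to 1
means = [np.zeros(2), np.ones(2)]    # mean vectors of the two components
variances = [np.ones(2), np.ones(2)]
p = gmm_density(np.zeros(2), weights, means, variances)
```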
Further, performing sound classification on the features in the required insect sound sample through the Gaussian mixture model comprises the following steps:
S41, starting from the preset initial model parameter λ, estimating a new model parameter λ̄ such that p(X|λ̄) ≥ p(X|λ), while ensuring that the mixture weights still sum to 1;
S42, estimating new mode parameters serving as an initial mode in the next iteration process, and repeating the step S41 until a convergence threshold value is reached;
S43, training N GMM classifiers on the sounds of the N types of insects, and, for a given observation, finding the class model with the greatest similarity;
S44, setting the prior probabilities of all classes to be equal, simplifying the classification rule, and carrying out insect sound identification by using the independence among observations and computing log-likelihoods.
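The classification rule of steps S43 and S44 — score each class by the summed log-density of the observations, assuming independent observations and equal class priors — can be sketched as follows (the toy single-Gaussian stand-ins for trained GMMs and the class names are purely illustrative):

```python
import numpy as np

def classify(frames, models):
    # Summed log-density per class, assuming independent observations
    # and equal class priors; the highest-scoring class wins.
    scores = {name: sum(np.log(p(x)) for x in frames)
              for name, p in models.items()}
    return max(scores, key=scores.get)

def gauss(mu):
    # A toy single-Gaussian stand-in for a trained GMM density.
    return lambda x: np.exp(-(x - mu) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

models = {"cricket": gauss(0.0), "cicada": gauss(5.0)}
label = classify([0.1, -0.2, 0.3], models)   # observations near 0
```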
Further, a re-estimation algorithm is adopted in the iterative process to ensure that the model likelihood value is monotonically non-decreasing.
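The re-estimation loop of steps S41 and S42 is the EM algorithm for GMMs, whose likelihood never decreases from one iteration to the next. A 1-D sketch (the data, initial values and variance floor are assumptions for illustration):

```python
import numpy as np

def loglik(x, w, mu, var):
    # Per-component densities, shape (M, len(x)), and the total log-likelihood.
    dens = np.array([wi / np.sqrt(2 * np.pi * vi)
                     * np.exp(-(x - mi) ** 2 / (2 * vi))
                     for wi, mi, vi in zip(w, mu, var)])
    return np.log(dens.sum(axis=0)).sum(), dens

def em_step(x, w, mu, var):
    # One re-estimation step: the new parameters never decrease p(X|lambda).
    _, dens = loglik(x, w, mu, var)
    resp = dens / dens.sum(axis=0)                   # responsibilities (E-step)
    nk = resp.sum(axis=1)
    w_new = nk / len(x)                              # weights still sum to 1
    mu_new = (resp @ x) / nk
    var_new = (resp * (x[None, :] - mu_new[:, None]) ** 2).sum(axis=1) / nk
    return w_new, mu_new, np.maximum(var_new, 1e-3)  # variance floor, an assumption

x = np.array([-2.1, -1.9, -2.0, 2.0, 2.1, 1.9])      # two obvious clusters
w, mu, var = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
lls = []
for _ in range(3):
    ll, _ = loglik(x, w, mu, var)
    lls.append(ll)
    w, mu, var = em_step(x, w, mu, var)
```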
The beneficial effects of the invention are as follows:
1. The signal parameterization method and advanced pattern recognition technology realize the automatic identification of insect sounds. The method uses MFCC as the sound feature and GMM as the classifier, achieving an average identification rate of 98.95% when identifying the sounds of several types of insects; the time required to identify a sound sample of about 1 s is about 300 ms, showing good performance in terms of identification accuracy and identification time.
2. According to the insect pest identification and early warning method provided by the invention, insects are first induced and captured; sound features are extracted through MFCC, and the features in the required insect sound sample are classified through the Gaussian mixture model to identify the insect species. The identification is more accurate, comprehensive and reliable, so that the pest grade is accurately determined and the corresponding pest early warning is carried out, reducing the labor intensity of workers, increasing the identification accuracy of pests, obviously improving the pest prevention effect, and making pest early warning more intelligent and reliable.
3. The crop insect automatic identification technology based on deep learning has the advantages of accurate identification, high intelligent degree and the like, on one hand, insect testers get rid of mechanical and tedious insect identification and statistics work, the defect of manual detection is reduced, human resources and time cost are saved, and the insect identification rate is improved; on the other hand, the method is also beneficial to protecting the ecological environment, ensures the food safety, enhances the sustainable development of agriculture, provides timely and effective information for the prevention and treatment of insects, and has great practical value.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
Fig. 1 is a flowchart of an insect identification method based on deep learning of sound according to an embodiment of the present invention.
Detailed Description
For further explanation of the various embodiments, the drawings, which form a part of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments and, together with the description, serve to explain their principles of operation, enabling those of ordinary skill in the art to understand the various embodiments and the advantages of the invention. The figures are not to scale, and like reference numerals generally refer to like elements.
According to the first embodiment of the invention, an insect identification method based on deep learning of sound is provided.
The invention will be further described with reference to the accompanying drawings and specific embodiments. As shown in fig. 1, according to an embodiment of the invention, the insect identification method comprises the following steps:
s1, carrying out induction capture, storage and sound collection on insects to obtain an initial insect sound sample;
In one embodiment, carrying out induction capture, storage and sound collection on insects comprises the following steps:
s11, presetting a catching cage, and inducing insects to enter the catching cage by using an inducer in the catching cage;
Specifically, the catching cage can be arranged at field edges, on both sides of a road, in front of and behind houses, and the like. The inducer in the catching cage contains a lure core holding sex attractants for some 20 insect species, such as cotton bollworm, Asian corn borer, striped rice borer, armyworm, pink stem borer, black cutworm, peach fruit borer, diamondback moth, pink bollworm, pear fruit borer, apple leaf roller, rice leaf roller and Chilo suppressalis. The lure core is prepared by fully mixing a synthetic attractant with a suitable carrier (polyethylene, a rubber tube, a rubber head and the like), and each lure core generally contains 20-500 micrograms of sex attractant.
S12, trapping insects into a catching cage, and collecting the sounding signals of the insects by using a sensor;
and S13, transmitting the sounding signal to a recording device and storing the sounding signal in a database.
In a specific application, before the insect species is identified by its sound, the sound signal of the insect is collected, so the following sound collection step is further included before step S1: insects whose sounding signals are to be collected are trapped in a sound insulation box that reduces noise interference, and the insects are made to sound under stress without physical damage; a set distance, generally about 0.5 cm, is kept between the insect and a sensor arranged in the sound insulation box. During recording, the sensor collects the insect's sounding signal and transmits it to the recording device for storage. The recording equipment can be an Edirol R-4 digital recorder used in cooperation with an SM4001 sensor; the sampling frequency during recording is 96 kHz, the resolution is 16 bits, a single channel is used, and the volume is set to maximum.
In one embodiment, transmitting the sounding signal to a recording device and storing to a database comprises the steps of:
s131, deleting the incomplete and damaged sounding signals, and storing the screened sounding signals into a database;
In the specific application, for sounding-signal screening, the different sounding signals generated by the crawling and biting of stored-grain pests such as the rice weevil are collected, and a chaos algorithm is combined with a support vector machine, so that defective and damaged pest sounding signals can be deleted with good results; the sounding signals of different insects are separated by a blind source separation technique, and the incomplete and damaged sounding signals are deleted according to the spectrogram characteristics of the sounding signals of different insects.
And S132, dividing the data in the database into a training set and a test set, wherein the length of a sample in the training set is far longer than that of a sample in the test set.
In a specific application, in order to give the GMM classifier sufficient data for training, the samples with longer sound segments are used for training, and the samples with short segments are used for testing to improve recognition speed. For the recognition process, a sound segment of 1.2 s containing an active signal is enough to extract useful parameters, so the data in the library are divided into two data sets: a training set and a test set, the samples in the training set being much longer than those in the test set.
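The test-set preparation described above — evenly dividing a long recording into short clips of about 1.2 s — can be sketched as follows (the sampling rate and function name are illustrative assumptions):

```python
import numpy as np

def split_into_clips(signal, sr, clip_seconds=1.2):
    # Evenly divide a long sounding sample into fixed-length short clips;
    # a trailing remainder shorter than one clip is dropped.
    clip_len = int(sr * clip_seconds)
    n_clips = len(signal) // clip_len
    return [signal[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]

sr = 8000                                        # assumed sampling rate
clips = split_into_clips(np.zeros(sr * 6), sr)   # a 6 s recording
```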
S2, preprocessing the collected initial insect sound sample, and obtaining a required insect sound sample;
In one embodiment, preprocessing the collected insect sound samples comprises the following steps:
S21, denoising the collected insect sound samples to obtain samples to be processed;
In a specific application, sounds caused by natural phenomena such as wind, rain, thunder and lightning, as well as sounds generated by similar insects, may be present; in a natural environment these background noises are removed before the algorithm is applied. In addition, the sounds generated by the same insect species in different states — including sex, age, courtship, competition and alarm — are also distinguished, so that the automatic insect identification system can handle these different states. The sound files used in the invention are noise-free sound segments cut from the recorded signal; in operation, before identification, the insect sound is separated and detected from the sounding signal mixed with background noise, ensuring stability in the identification process.
S22, cutting each sample to be processed into a plurality of sound sections;
S23, detecting the sound segments (non-zero regions of the sound signal, containing pulses) and the silent segments (regions where the sound signal is zero, without pulses) within the sound sections, and carrying out preprocessing detection on each sound segment;
and S24, taking the sound segments after preprocessing detection as the required insect sound samples.
In one embodiment, the pre-processing detection includes calculation of Mel-frequency cepstral coefficients;
normalization: dividing each sample to be processed by the amplitude peak of its sound segment;
the normalization formula is as follows:
x̂(i) = x(i) / max |x(n)|, n = 1, ..., N
wherein x(i) is the original signal, x̂(i) is the normalized signal, N is the signal length, and i is the sample index;
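A direct rendering of the peak normalization above (the function name is illustrative):

```python
import numpy as np

def normalize(x):
    # x_hat(i) = x(i) / max |x(n)|: the peak amplitude becomes exactly 1.
    x = np.asarray(x, dtype=float)
    return x / np.abs(x).max()

y_norm = normalize([0.5, -2.0, 1.0])
```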
pre-emphasis: boosting the high-frequency components with a pre-emphasis factor and a pre-emphasis filter while keeping the low-frequency components at their original level, flattening the spectrum of the signal for spectrum analysis and vocal tract parameter analysis;
framing and windowing: overlapping and framing the signal and the Hamming window to change the signal into a short-time stable signal;
fourier transform: performing fast Fourier transform on the frame signal, and converting a time domain signal into a frequency domain signal;
mel filter bank: filtering the frequency spectrum coefficient obtained by Fourier transform by using a triangular filter to obtain a group of coefficients, wherein the span of the triangular filter is evenly distributed on the Mel axis;
log power spectrum: taking logarithm of the output of each filter to obtain a corresponding logarithm power spectrum;
discrete cosine transform: transforming the logarithmic power spectrum to a time domain by utilizing discrete cosine transform, wherein the amplitude of the obtained spectrum is the original Mel frequency cepstrum coefficient to obtain a static signal;
first order difference: obtaining a dynamic signal;
merging: combining the static signal and the dynamic signal to be used as an effective sounding sample signal;
segmenting: and carrying out segmentation operation on the data set of the test set, and uniformly dividing the long sounding sample into short sounding samples.
In one embodiment, the pre-emphasis factor is calculated as follows:
α=exp(-2πFΔt)
in the formula, delta t is the sampling period of the sounding signal, F is the frequency, and exp is an exponential curve;
the calculation formula of the pre-emphasis filter is as follows:
H(z) = 1 - αz^(-1)
where H(z) is the transfer function of the pre-emphasis filter, z is the z-transform variable, and α is the pre-emphasis factor.
S3, extracting features in the insect sound sample through MFCC to obtain a feature sample;
In one embodiment, extracting the features from the required insect sound sample through MFCC (Mel frequency cepstral coefficients) comprises the following steps:
S31, performing spectrum analysis and vocal tract parameter analysis on the required insect sound samples;
s32, dividing the sounding signal into short periods, calling each short period as a frame, and cutting a sounding signal waveform containing N samples of the frame length from the sounding signal;
s33, multiplying the time window function by the initial sounding signal;
s34, taking the frame length N =256, performing FFT (fast Fourier transform) on each frame, and performing modular squaring on a frequency spectrum to obtain a discrete power spectrum;
s35, mapping the discrete power spectrum to a Mel frequency scale, and filtering by using M Mel band-pass filters to obtain a group of coefficients;
s36, taking the logarithm of the output of each Mel band-pass filter to obtain a corresponding Mel logarithmic power spectrum;
s37, discrete cosine change is conducted on the Mel logarithm power spectrum, and the amplitude of the spectrum in the sound sample is obtained.
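Steps S32 to S34 can be sketched as follows; the hop size of 128 samples (50% overlap) is an assumed value, while the frame length N=256 comes from the text.

```python
import numpy as np

def frame_signal(signal, frame_len=256, hop=128):
    """Step S32: cut the signal into overlapping frames of N samples each."""
    n = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.stack([signal[i * hop : i * hop + frame_len] for i in range(n)])

def power_spectrum(frames):
    """Steps S33-S34: apply a time window to each frame, FFT it,
    and take the squared magnitude to get the discrete power spectrum."""
    windowed = frames * np.hamming(frames.shape[1])
    return np.abs(np.fft.rfft(windowed, axis=1)) ** 2
```

A real-input FFT of a 256-sample frame yields 129 frequency bins, which are then passed through the M Mel band-pass filters of step S35.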
In specific applications, the standard MFCC reflects only the static characteristics of the speech parameters, whereas the first-order difference MFCC (ΔMFCC) is a dynamic parameter that reflects their dynamic characteristics and has better robustness; on the basis of the first-order difference MFCC, the second-order difference MFCC can be further calculated. Automatic identification of insect sounds is realized by a signal parameterization method and an advanced pattern recognition technique, with MFCC as the sound feature and GMM as the classifier. The method achieves an average recognition rate of 98.95% when identifying the sounds of multiple insect species, and identifying a sound sample of about 1 s takes about 300 ms, showing good performance in both recognition accuracy and recognition time.
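The first- and second-order difference coefficients discussed above can be computed with a standard regression-style delta; the window width of 2 frames is an assumed default, not specified by the patent.

```python
import numpy as np

def delta(coeffs, width=2):
    """First-order difference (ΔMFCC) computed over ±width neighbouring frames."""
    T = len(coeffs)
    padded = np.pad(coeffs, ((width, width), (0, 0)), mode='edge')
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(coeffs, dtype=float)
    for n in range(1, width + 1):
        out += n * (padded[width + n : width + n + T] - padded[width - n : width - n + T])
    return out / denom

# The second-order difference (ΔΔMFCC) is simply delta applied twice.
```

Concatenating the static MFCC matrix with its delta (and optionally delta-delta) gives the merged static-plus-dynamic feature described in the merging step.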
And S4, performing sound classification on the characteristic samples by using a Gaussian mixture model.
In one embodiment, the Gaussian mixture model is constructed according to the following formula:
p(x|λ) = Σ_{i=1}^{M} p_i · b_i(x)
where x is a D-dimensional random vector, b_i(x) is the density of the i-th component, each component being a D-variate Gaussian function, i takes the values 1, 2, …, M, p_i is the mixture weight, and λ denotes the model parameters.
The density of each component is calculated as follows:
b_i(x) = (1 / ((2π)^(D/2) |Σ_i|^(1/2))) · exp(−(1/2)(x − μ_i)ᵀ Σ_i^(−1)(x − μ_i))
where μ_i is the mean vector, Σ_i is the covariance matrix, the mixture weights satisfy the relation Σ_{i=1}^{M} p_i = 1, D is the dimension of the Gaussian variable, and exp is the exponential function;
and the density of the Gaussian mixture model is parameterized by the mean vectors, covariance matrices and mixture weights of all the components.
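A direct NumPy evaluation of the mixture density above; the parameter values in the test are illustrative, not taken from the patent.

```python
import numpy as np

def component_density(x, mean, cov):
    """b_i(x): D-variate Gaussian density with mean vector and covariance matrix."""
    d = len(mean)
    diff = x - mean
    norm = 1.0 / ((2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(cov)))
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff)

def gmm_density(x, weights, means, covs):
    """p(x|λ) = Σ_i p_i · b_i(x), with the mixture weights summing to 1."""
    return sum(w * component_density(x, m, c) for w, m, c in zip(weights, means, covs))
```

With a single unit-variance component centred at the query point, the density reduces to the one-dimensional Gaussian peak 1/√(2π).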
In one embodiment, classifying the sounds of the features in the required insect sound samples with the Gaussian mixture model further comprises the following steps:
S41, starting from a preset initial model parameter λ, estimating a new model parameter λ̄ such that p(X|λ̄) ≥ p(X|λ), while ensuring the mixture weights satisfy Σ_{i=1}^{M} p_i = 1;
S42, taking the newly estimated model parameters as the initial model for the next iteration, and repeating step S41 until the convergence threshold is reached;
S43, performing sound training on N classes of insects with N GMM classifiers, and finding the class model with the greatest similarity for a given observation object;
S44, assuming the prior similarity of all classes to be equal, simplifying the classification rule, and performing insect sound identification by exploiting the independence between observations and computing log-likelihoods.
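Steps S43 and S44 amount to scoring the frames of a sound sample under each of the N class GMMs and taking the argmax; with equal class priors the Bayes rule reduces to a maximum-likelihood decision. A sketch under those assumptions:

```python
import numpy as np

def gmm_log_density(x, weights, means, covs):
    """log p(x|λ) for a full-covariance GMM (a small floor avoids log(0))."""
    total = 0.0
    d = len(x)
    for w, m, c in zip(weights, means, covs):
        diff = x - m
        norm = 1.0 / ((2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(c)))
        total += w * norm * np.exp(-0.5 * diff @ np.linalg.inv(c) @ diff)
    return np.log(total + 1e-300)

def classify(frames, class_models):
    """Score each class GMM by the summed per-frame log-likelihood
    (frames treated as independent observations) and return the argmax class."""
    scores = [sum(gmm_log_density(x, *model) for x in frames) for model in class_models]
    return int(np.argmax(scores))
```

Each entry of `class_models` is a `(weights, means, covs)` triple for one insect class; the model parameters used below are illustrative.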
In one embodiment, a re-estimation algorithm is employed in the iterative process to ensure a monotonic (non-decreasing) increase of the model likelihood value.
The re-estimation algorithm steps are as follows:
The average mixture weight is calculated by the formula:
p̄_i = (1/T) Σ_{t=1}^{T} p(i|x_t, λ)
The posterior probability of the i-th sound class is calculated by the formula:
p(i|x_t, λ) = p_i b_i(x_t) / Σ_{k=1}^{M} p_k b_k(x_t)
where λ is the model parameter, p_i is the mixture weight, b_i(x_t) is the density of the i-th component, p_k and b_k(x_t) are the weight and density of the k-th component, and i and k index the components evaluated over the frames x_t of the sound sample.
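One re-estimation pass of the formulas above can be sketched as follows; for brevity the sketch assumes diagonal covariances, which the patent does not mandate.

```python
import numpy as np

def em_step(frames, weights, means, variances):
    """One EM re-estimation pass for a diagonal-covariance GMM:
    the E-step computes the posteriors p(i|x_t, λ) = p_i·b_i(x_t) / Σ_k p_k·b_k(x_t),
    and the new mixture weight is their average over the T frames."""
    X = np.asarray(frames, dtype=float)            # shape (T, D)
    dens = np.stack([                              # b_i(x_t) for each component i
        np.exp(-0.5 * np.sum((X - m) ** 2 / v, axis=1)) / np.sqrt(np.prod(2.0 * np.pi * v))
        for m, v in zip(means, variances)
    ], axis=1)                                     # shape (T, M)
    num = dens * np.asarray(weights)               # p_i · b_i(x_t)
    post = num / num.sum(axis=1, keepdims=True)    # posterior p(i|x_t, λ)
    new_weights = post.mean(axis=0)                # average mixture weight p̄_i
    new_means = (post.T @ X) / post.sum(axis=0)[:, None]
    return new_weights, new_means
```

Because the new weights are averages of probabilities that sum to one per frame, they automatically satisfy the constraint Σ p_i = 1 after every iteration.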
In addition, in specific applications, in order to determine the pest grade and issue the corresponding pest warning, a Gradient Boosting Regression Tree (GBRT) model is also established. The disaster early-warning center combines the pest index and multidimensional environmental characteristics into a feature group and inputs it into the GBRT model, which computes a regression value from that feature group. The pest grade is determined by the size of the regression value: the larger the regression value, the higher the pest grade. The user can set a corresponding warning value according to actual needs; the disaster early-warning center compares the calculated regression value with the warning value, and if the regression value is larger than the warning value, the center issues an early warning and displays the pest grade; if the regression value is smaller than the warning value, the center only displays the pest grade without issuing a warning.
The disaster early-warning center can also issue different warnings for different pest grades. Assuming the pest grades are divided into three levels, the user can set a first threshold and a second threshold according to actual needs to classify the grades, where the first threshold is less than the second threshold. When the regression value is smaller than the first threshold, the pest grade is low and the probability of a pest outbreak is small; when the regression value lies between the first and second thresholds, the pest grade is medium; when the regression value is larger than the second threshold, the pest grade is high and the probability of a pest outbreak is large. If the disaster early-warning center issues pest warnings by sound, different warnings can be issued at different sound frequencies: the higher the pest grade, the higher the frequency of the warning sound, so the user can determine the pest grade directly from the sound and quickly learn the condition of the farmland.
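The grading and warning logic above reduces to two simple comparisons; this sketch uses illustrative threshold names (the patent leaves the actual values to the user).

```python
def pest_grade(regression_value, first_threshold, second_threshold):
    """Map the GBRT regression value to a pest grade;
    thresholds are user-set, with first_threshold < second_threshold."""
    if regression_value < first_threshold:
        return "low"
    if regression_value <= second_threshold:
        return "medium"
    return "high"

def should_warn(regression_value, warning_value):
    """The centre issues an early warning only when the regression value
    exceeds the user-set warning value; otherwise it only displays the grade."""
    return regression_value > warning_value
```

The regression value itself would come from a trained GBRT model fed with the pest-index and environmental feature group; that training step is outside this sketch.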
In summary, with the above technical solutions, the signal parameterization method and advanced pattern recognition technique of the present invention achieve automatic identification of insect sounds. The proposed automatic identification method uses MFCC as the sound feature and GMM as the classifier; it obtains an average recognition rate of 98.95% when identifying multiple types of insect sounds, and the time required to identify a sound sample of about 1 s is about 300 ms, showing good performance in both recognition accuracy and recognition time. According to the pest identification and early-warning method provided by the invention, insects are first induced and captured, sound features are extracted by MFCC, the feature sounds in the required insect sound samples are classified by a Gaussian mixture model, and the insect species is identified; the identification is more accurate, comprehensive and reliable, so that the pest grade is accurately determined and the corresponding pest warning is issued. This reduces the labor intensity of workers, increases the identification accuracy of pests, markedly improves pest prevention, and makes pest early warning more intelligent and reliable. The crop insect automatic identification technology based on deep learning has the advantages of accurate identification and a high degree of intelligence: on the one hand, it frees insect testers from mechanical and tedious insect identification and statistics work, reduces the shortcomings of manual detection, saves human resources and time cost, and improves the insect identification rate; on the other hand, it also helps to protect the ecological environment, ensures food safety, enhances the sustainable development of agriculture, provides timely and effective information for the prevention and treatment of insect pests, and has great practical value.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (9)
1. An insect recognition method based on deep learning of sound, which is characterized by comprising the following steps:
s1, carrying out induction capture, storage and sound collection on insects to obtain an initial insect sound sample;
s2, preprocessing the collected initial insect sound sample, and obtaining a required insect sound sample;
S3, extracting the features in the insect sound samples through MFCC to obtain feature samples;
s4, performing sound classification on the feature samples by using a Gaussian mixture model;
the preprocessing of the collected insect sounds comprises the following steps:
S21, denoising the collected insect sound samples to obtain samples to be processed;
S22, cutting each sample to be processed into a plurality of sound segments;
S23, detecting the voiced and silent sections in the sound segments, and performing preprocessing detection on each sound segment;
and S24, taking the sound segments after the preprocessing detection as the required insect sound samples.
2. The insect recognition method based on deep learning of sound according to claim 1, wherein the induction capturing, storing and sound collecting of the insects to obtain the initial insect sound samples comprises the following steps:
S11, presetting a catching cage, and inducing insects to enter the catching cage by using an attractant placed in the cage;
S12, trapping the insects in the catching cage, and collecting their sound signals with a sensor;
and S13, storing the sounding signal to a database.
3. The method according to claim 2, wherein transmitting the sound signal to a recording device and storing the sound signal in a database comprises:
S131, deleting incomplete and damaged sound signals, and storing the screened sound signals in the database;
and S132, dividing the data in the database into a training set and a test set, wherein the length of the samples in the test set is far greater than that of the samples in the training set.
4. The insect recognition method based on deep learning of sound according to claim 1, wherein the preprocessing detection comprises the calculation of Mel-frequency cepstral coefficients:
normalization: dividing each sample to be processed by the amplitude peak of its sound segment;
pre-emphasis: boosting the high-frequency components with a pre-emphasis factor and a pre-emphasis filter while keeping the low-frequency components at their original level, flattening the spectrum of the signal for spectral analysis and vocal tract parameter analysis;
framing and windowing: applying overlapping frames and a Hamming window to the signal to turn it into a short-time stationary signal;
Fourier transform: performing a fast Fourier transform on each frame signal to convert the time-domain signal into a frequency-domain signal;
mel filter bank: filtering the spectral coefficients obtained by the Fourier transform with triangular filters whose spans are evenly distributed on the Mel axis, to obtain a group of coefficients;
log power spectrum: taking the logarithm of the output of each filter to obtain the corresponding logarithmic power spectrum;
discrete cosine transform: transforming the logarithmic power spectrum back to the time domain by the discrete cosine transform, the amplitudes of the resulting spectrum being the original Mel-frequency cepstral coefficients, which form a static signal;
first-order difference: computing the first-order difference of the static coefficients to obtain a dynamic signal;
merging: combining the static signal and the dynamic signal to serve as an effective sound sample signal;
segmenting: segmenting the data of the test set so that longer sound samples are evenly divided into short sound samples.
5. The method of claim 4, wherein the pre-emphasis factor is calculated as follows:
α=exp(-2πFΔt)
in the formula, Δt is the sampling period of the sound signal, F is the frequency, and exp is the exponential function;
the calculation formula of the pre-emphasis filter is as follows:
H(z)=1-αz^(-1)
where H(z) is the transfer function of the pre-emphasis filter, z is the z-transform variable, and α is the pre-emphasis factor.
6. The insect recognition method based on deep learning of sound according to claim 1, wherein extracting the features in the required insect sound samples by MFCC comprises the following steps:
S31, performing spectral analysis and vocal tract parameter analysis on the required insect sound samples;
S32, dividing the sound signal into short time segments, each called a frame, and cutting from the sound signal a waveform containing N samples of the frame length;
S33, multiplying the initial sound signal by a time window function;
S34, taking the frame length N=256, performing an FFT (fast Fourier transform) on each frame, and taking the squared magnitude of the spectrum to obtain the discrete power spectrum;
S35, mapping the discrete power spectrum onto the Mel frequency scale and filtering it with M Mel band-pass filters to obtain a group of coefficients;
S36, taking the logarithm of the output of each Mel band-pass filter to obtain the corresponding Mel logarithmic power spectrum;
S37, applying the discrete cosine transform to the Mel logarithmic power spectrum to obtain the amplitudes of the spectrum in the sound sample.
7. The insect recognition method based on deep learning of sound according to claim 1, wherein the Gaussian mixture model is constructed according to the following formula:
p(x|λ) = Σ_{i=1}^{M} p_i · b_i(x)
where x is a D-dimensional random vector, b_i(x) is the density of the i-th component, each component being a D-variate Gaussian function, i takes the values 1, 2, …, M, p_i is the mixture weight, and λ denotes the model parameters;
the density of each component is calculated as follows:
b_i(x) = (1 / ((2π)^(D/2) |Σ_i|^(1/2))) · exp(−(1/2)(x − μ_i)ᵀ Σ_i^(−1)(x − μ_i))
where μ_i is the mean vector, Σ_i is the covariance matrix, the mixture weights satisfy the relation Σ_{i=1}^{M} p_i = 1, D is the dimension of the Gaussian variable, and exp is the exponential function;
and the density of the Gaussian mixture model is parameterized by the mean vectors, covariance matrices and mixture weights of all the components.
8. The insect recognition method based on deep learning of sound according to claim 7, wherein classifying the sounds of the features in the required insect sound samples with the Gaussian mixture model further comprises the following steps:
S41, starting from a preset initial model parameter λ, estimating a new model parameter λ̄ such that p(X|λ̄) ≥ p(X|λ), while ensuring the mixture weights satisfy Σ_{i=1}^{M} p_i = 1;
S42, taking the newly estimated model parameters as the initial model for the next iteration, and repeating step S41 until the convergence threshold is reached;
S43, performing sound training on N classes of insects with N GMM classifiers, and finding the class model with the greatest similarity for a given observation object;
S44, assuming the prior similarity of all classes to be equal, simplifying the classification rule, and performing insect sound identification by exploiting the independence between observations and computing log-likelihoods.
9. The method according to claim 8, wherein a re-estimation algorithm is employed in the iterative process to ensure a monotonic (non-decreasing) increase of the model likelihood value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211425420.5A CN115910077A (en) | 2022-11-15 | 2022-11-15 | Insect identification method based on deep learning of sound |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115910077A true CN115910077A (en) | 2023-04-04 |
Family
ID=86485084
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211425420.5A Pending CN115910077A (en) | 2022-11-15 | 2022-11-15 | Insect identification method based on deep learning of sound |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115910077A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101976564A (en) * | 2010-10-15 | 2011-02-16 | 中国林业科学研究院森林生态环境与保护研究所 | Method for identifying insect voice |
CN107369444A (en) * | 2016-05-11 | 2017-11-21 | 中国科学院声学研究所 | A kind of underwater manoeuvre Small object recognition methods based on MFCC and artificial neural network |
CN109726700A (en) * | 2019-01-07 | 2019-05-07 | 武汉南博网络科技有限公司 | A kind of identifying pest method for early warning and device based on multiple features |
KR20190087363A (en) * | 2019-07-15 | 2019-07-24 | 인하대학교 산학협력단 | System and method for hidden markov model based uav sound recognition using mfcc technique in practical noisy environments |
Non-Patent Citations (2)
Title |
---|
WU MEIMEI: "Machine Learning Algorithms and Their Applications", 31 May 2020, Beijing: China Machine Press, pages: 74-78 *
ZHU LEQING et al.: "Automatic identification of insect sounds based on MFCC and GMM", Acta Entomologica Sinica, vol. 55, no. 4, 20 April 2012 (2012-04-20), pages 466-471 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chesmore et al. | Automated identification of field-recorded songs of four British grasshoppers using bioacoustic signal recognition | |
US5956463A (en) | Audio monitoring system for assessing wildlife biodiversity | |
Potamitis et al. | On automatic bioacoustic detection of pests: the cases of Rhynchophorus ferrugineus and Sitophilus oryzae | |
Le-Qing | Insect sound recognition based on mfcc and pnn | |
Evans et al. | Monitoring grassland birds in nocturnal migration | |
CN106847293A (en) | Facility cultivation sheep stress behavior acoustical signal monitoring method | |
CN109243470A (en) | Broiler chicken cough monitoring method based on Audiotechnica | |
Kvsn et al. | Bioacoustics data analysis–A taxonomy, survey and open challenges | |
CN101976564A (en) | Method for identifying insect voice | |
de Souza et al. | Classification of data streams applied to insect recognition: Initial results | |
Himawan et al. | Deep Learning Techniques for Koala Activity Detection. | |
CN112331231B (en) | Broiler feed intake detection system based on audio technology | |
CN110265041A (en) | A kind of method and system for the song behavior collected, analyze pig | |
CN115048984A (en) | Sow oestrus recognition method based on deep learning | |
Boulmaiz et al. | The use of WSN (wireless sensor network) in the surveillance of endangered bird species | |
CN117727308B (en) | Mixed bird song recognition method based on deep migration learning | |
CN113936667A (en) | Bird song recognition model training method, recognition method and storage medium | |
CN111540368A (en) | Stable bird sound extraction method and device and computer readable storage medium | |
Zhang et al. | A novel insect sound recognition algorithm based on MFCC and CNN | |
Xie et al. | Detecting frog calling activity based on acoustic event detection and multi-label learning | |
Sharma et al. | Bioacoustics Monitoring of Wildlife using Artificial Intelligence: A Methodological Literature Review | |
Yazgaç et al. | Detection of sunn pests using sound signal processing methods | |
CN115910077A (en) | Insect identification method based on deep learning of sound | |
Cosentino et al. | Porpoise click classifier (PorCC): A high-accuracy classifier to study harbour porpoises (Phocoena phocoena) in the wild | |
CH719673A2 (en) | AI-BASED REAL-TIME ACOUSTIC WILDLIFE MONITORING SYSTEM |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||