CN115910077A - Insect identification method based on deep learning of sound - Google Patents
- Publication number
- CN115910077A (application number CN202211425420.5A)
- Authority
- CN
- China
- Prior art keywords
- sound
- insect
- signal
- sample
- insects
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A50/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
- Y02A50/30—Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change
Abstract
The invention discloses an insect identification method based on deep learning of sound, which comprises the following steps: S1, carrying out induction capture, storage and sound collection on insects to obtain an initial insect sound sample; S2, preprocessing the collected initial insect sound sample to obtain the required insect sound sample; S3, extracting features from the insect sound sample through MFCC to obtain a feature sample; and S4, performing sound classification on the feature samples by using a Gaussian mixture model. The invention has the beneficial effects that: a signal parameterization method and an advanced pattern recognition technology realize the identification of insect sounds, with MFCC as the sound feature and GMM as the classifier; the average identification rate obtained when identifying the sounds of several types of insects is 98.95%, the time required to identify a sound sample of about 1 s is about 300 ms, and good performance is shown in terms of identification accuracy and identification time.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an insect identification method based on deep learning of sound.
Background
The crop diseases and insect pests have the characteristics of multiple types, large influence and frequent outbreak and disaster. During the growth process of crops, the crops are often damaged by different kinds of insects in the field, the quality and the yield of the crops are seriously influenced, and huge economic losses are caused. Therefore, timely and accurate identification of crop insects is important, people often use chemical pesticides to control pests according to self experience in actual production, so that the problems of pesticide residue, environmental pollution and the like are serious, the crop insects are effectively and accurately identified, effective control measures are taken timely, economic loss caused by pests can be reduced, and stable and high yield of crops is guaranteed.
At present, agricultural insect identification and diagnosis work in China mainly depends on manual identification or insect expert identification, the method requires very rich experience or solid insect classification professional knowledge, and the problems of time and labor consumption, low efficiency, high misjudgment rate and the like exist, so that the requirement of modern agricultural development is difficult to meet. In recent years, the convolutional neural network in deep learning has made great progress in some fields such as image classification, voice recognition and pedestrian detection, however, the application of the related technology in the field of agricultural insect recognition and monitoring is deficient. The crop insect automatic identification technology based on deep learning has the advantages of accurate identification, high intelligent degree and the like, on one hand, insect forecasting personnel get rid of mechanical and tedious insect identification and statistics work, the defect of manual detection is reduced, human resources and time cost are saved, and the insect identification rate is improved; on the other hand, the method is also beneficial to protecting the ecological environment, ensures the food safety, enhances the sustainable development of agriculture, provides timely and effective information for the prevention and treatment of insects, and has great practical value.
However, in a practical complicated farmland environment during a specific use operation, the pest image is interfered by the background, the identification performance of the pest image is limited, and the sound has an important significance in insect species identification as one of the expression characteristics of the insect. The sound detection comprises the steps of converting collected sounding signals into electric signals, amplifying the electric signals, purifying the electric signals by an electronic filter, extracting the sound emitted by insects, and determining the types and the quantity of the insects according to the sound frequency and the characteristic value of signal pulses;
However, in the prior art, although there has been much research in the field of human speech recognition, automatic acoustic species identification is still considered a marginal field of pattern recognition and the research literature is relatively sparse; an insect identification method based on deep learning of sound is therefore urgently needed.
Disclosure of Invention
Aiming at the problems in the related art, the invention provides an insect identification method based on deep learning of sound, so as to overcome the technical problems in the prior related art.
Therefore, the invention adopts the following specific technical scheme:
an insect identification method based on deep learning of sound, comprising the following steps:
s1, carrying out induction capture, storage and sound collection on insects to obtain an initial insect sound sample;
s2, preprocessing the collected initial insect sound sample, and obtaining a required insect sound sample;
S3, extracting the features from the insect sound sample through MFCC to obtain a feature sample;
and S4, performing sound classification on the characteristic samples by using a Gaussian mixture model.
Further, the step of carrying out induction capture, storage and sound collection on insects to obtain an initial insect sound sample comprises the following steps:
s11, presetting a catching cage, and inducing insects to enter the catching cage by using an inducer in the catching cage;
s12, trapping insects into the catching cage, and collecting the sounding signals of the insects by using a sensor;
and S13, storing the sounding signal to a database.
Further, transmitting the sounding signal to a recording device and storing it to a database includes the steps of:
s131, deleting the incomplete and damaged sounding signals, and storing the screened sounding signals into a database;
and S132, dividing the data in the database into a training set and a test set, wherein the length of a sample in the training set is far longer than that of a sample in the test set.
Further, the preprocessing of the collected insect sound samples comprises the following steps:
S21, denoising the collected insect sound samples to obtain samples to be processed;
S22, cutting each sample to be processed into a plurality of sound sections;
S23, detecting the sound segments and silent segments within the sound sections, and carrying out preprocessing detection on each sound segment;
and S24, taking the sound segments after preprocessing detection as the required insect sound samples.
Further, the preprocessing detection comprises calculation of Mel frequency cepstral coefficients;
normalization: dividing each sample to be processed by the amplitude peak of its sound segment;
pre-emphasis: boosting the high-frequency components with a pre-emphasis factor and a pre-emphasis filter while keeping the low-frequency components at their original level, flattening the spectrum of the signal for spectrum analysis and vocal tract parameter analysis;
framing and windowing: overlapping and framing the signal and the Hamming window to change the signal into a short-time stable signal;
fourier transform: performing fast Fourier transform on the frame signal, and converting a time domain signal into a frequency domain signal;
mel filter bank: filtering the frequency spectrum coefficient obtained by Fourier transform by using a triangular filter to obtain a group of coefficients, wherein the span of the triangular filter is evenly distributed on the Mel axis;
log power spectrum: taking logarithm of the output of each filter to obtain a corresponding logarithm power spectrum;
discrete cosine transform: transforming the logarithmic power spectrum to a time domain by utilizing discrete cosine transform, wherein the amplitude of the obtained spectrum is the original Mel frequency cepstrum coefficient to obtain a static signal;
first order difference: computing the first-order difference of the static Mel frequency cepstrum coefficients to obtain a dynamic signal;
merging: combining the static signal and the dynamic signal to be used as an effective sounding sample signal;
segmenting: segmenting the data of the test set, evenly dividing longer sounding samples into short sounding samples.
Further, the pre-emphasis factor is calculated as follows:
α=exp(-2πFΔt)
in the formula, Δt is the sampling period of the sounding signal, F is the frequency, and exp denotes the exponential function;
the calculation formula of the pre-emphasis filter is as follows:
H(z) = 1 - αz^(-1)
where H(z) is the transfer function of the pre-emphasis filter, z is the z-transform variable, and α is the pre-emphasis factor.
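As an illustrative sketch of this step (the function name and the typical value α ≈ 0.97 are assumptions, not values fixed by the patent), the pre-emphasis filter H(z) = 1 - αz^(-1) can be applied in the time domain as:

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    # y[n] = x[n] - alpha * x[n-1]: boosts high-frequency components
    # while low-frequency components stay near their original level.
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

# A slowly varying signal is attenuated; a sudden jump passes through.
y_pre = pre_emphasis([1.0, 1.0, 1.0, 5.0])
```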
Further, extracting the features from the required insect sound sample through MFCC comprises the following steps:
S31, performing spectrum analysis and vocal tract parameter analysis on the required insect sound samples;
s32, dividing the sounding signal into short periods, calling each short period as a frame, and cutting a sounding signal waveform containing N samples of the frame length from the sounding signal;
s33, multiplying the time window function by the initial sounding signal;
s34, taking the frame length N =256, performing FFT (fast Fourier transform) on each frame, and performing modular squaring on a frequency spectrum to obtain a discrete power spectrum;
s35, mapping the discrete power spectrum to a Mel frequency scale, and filtering by using M Mel band-pass filters to obtain a group of coefficients;
s36, taking the logarithm of the output of each Mel band-pass filter to obtain a corresponding Mel logarithmic power spectrum;
and S37, performing discrete cosine transform on the Mel logarithmic power spectrum to obtain the amplitude of the spectrum of the sounding sample.
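Steps S31 to S37 can be sketched end to end as follows. This is a minimal illustration under assumptions not stated in the patent (26 mel filters, 13 cepstral coefficients, an 8 kHz sampling rate for the toy input, and illustrative helper names); it is not the patent's exact implementation:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc_frame(frame, sr, n_filters=26, n_ceps=13):
    n = len(frame)
    windowed = frame * np.hamming(n)                 # windowing (S33)
    power = np.abs(np.fft.rfft(windowed)) ** 2 / n   # discrete power spectrum (S34)
    # M triangular filters spaced evenly on the mel axis (S35)
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, len(power)))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    log_energy = np.log(fbank @ power + 1e-10)       # log power spectrum (S36)
    # DCT back to the cepstral domain (S37)
    c = np.arange(n_ceps)[:, None]
    m = np.arange(n_filters)[None, :]
    return np.cos(np.pi * c * (2 * m + 1) / (2 * n_filters)) @ log_energy

sr, N = 8000, 256                                    # frame length N = 256 (S34)
t = np.arange(N) / sr
coeffs = mfcc_frame(np.sin(2 * np.pi * 440.0 * t), sr)
```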
Further, the gaussian mixture model is constructed as follows:
p(x|λ) = Σ_{i=1}^{M} w_i · p_i(x)
in the formula, x is a D-dimensional random vector, p_i(x) (i = 1, 2, ..., M) is the density of each component, each component being a D-variate Gaussian function, w_i is the mixture weight, and λ denotes the model parameters;
the density calculation formula of each component is as follows:
p_i(x) = exp(-(1/2) (x - μ_i)ᵀ Σ_i^(-1) (x - μ_i)) / ((2π)^(D/2) |Σ_i|^(1/2))
in the formula, μ_i is the mean vector, Σ_i is the covariance matrix, the mixture weights satisfy the relation Σ_{i=1}^{M} w_i = 1, D is the dimension, and exp denotes the exponential function;
and the density of the Gaussian mixture model is parameterized by mean vectors, covariance matrixes and mixture weights of all components.
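A minimal numeric sketch of the mixture density above, restricted to diagonal covariance matrices for brevity (the patent does not fix the covariance form, and all names and values here are illustrative):

```python
import numpy as np

def gmm_density(x, weights, means, variances):
    # p(x|lambda) = sum_i w_i * p_i(x), with diagonal covariances:
    # p_i(x) = exp(-0.5 * sum((x - mu_i)^2 / var_i)) / sqrt((2*pi)^D * prod(var_i))
    x = np.asarray(x, dtype=float)
    D = x.size
    total = 0.0
    for w, mu, var in zip(weights, means, variances):
        var = np.asarray(var, dtype=float)
        norm = 1.0 / np.sqrt((2.0 * np.pi) ** D * np.prod(var))
        total += w * norm * np.exp(-0.5 * np.sum((x - mu) ** 2 / var))
    return total

weights = [0.5, 0.5]                 # mixture weights sum to 1
means = [np.zeros(2), np.ones(2)]    # mean vectors of the two components
variances = [np.ones(2), np.ones(2)]
p = gmm_density(np.zeros(2), weights, means, variances)
```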
Further, performing sound classification on the features in the required insect sound sample through the Gaussian mixture model comprises the following steps:
S41, starting from the preset initial model parameter λ, estimating a new model parameter λ̄ such that p(X|λ̄) ≥ p(X|λ), while ensuring that the mixture weights still sum to 1;
S42, estimating new mode parameters serving as an initial mode in the next iteration process, and repeating the step S41 until a convergence threshold value is reached;
S43, training N GMM classifiers on the sounds of the N types of insects, and, for a given observation, finding the class model with the greatest similarity;
S44, setting the prior probabilities of all classes to be equal, simplifying the classification rule, and carrying out insect sound identification by using the independence among observations and computing log-likelihoods.
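The classification rule of steps S43 and S44 — score each class by the summed log-density of the observations, assuming independent observations and equal class priors — can be sketched as follows (the toy single-Gaussian stand-ins for trained GMMs and the class names are purely illustrative):

```python
import numpy as np

def classify(frames, models):
    # Summed log-density per class, assuming independent observations
    # and equal class priors; the highest-scoring class wins.
    scores = {name: sum(np.log(p(x)) for x in frames)
              for name, p in models.items()}
    return max(scores, key=scores.get)

def gauss(mu):
    # A toy single-Gaussian stand-in for a trained GMM density.
    return lambda x: np.exp(-(x - mu) ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

models = {"cricket": gauss(0.0), "cicada": gauss(5.0)}
label = classify([0.1, -0.2, 0.3], models)   # observations near 0
```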
Further, a re-estimation algorithm is adopted in the iterative process to ensure that the model likelihood value is monotonically non-decreasing.
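The re-estimation loop of steps S41 and S42 is the EM algorithm for GMMs, whose likelihood never decreases from one iteration to the next. A 1-D sketch (the data, initial values and variance floor are assumptions for illustration):

```python
import numpy as np

def loglik(x, w, mu, var):
    # Per-component densities, shape (M, len(x)), and the total log-likelihood.
    dens = np.array([wi / np.sqrt(2 * np.pi * vi)
                     * np.exp(-(x - mi) ** 2 / (2 * vi))
                     for wi, mi, vi in zip(w, mu, var)])
    return np.log(dens.sum(axis=0)).sum(), dens

def em_step(x, w, mu, var):
    # One re-estimation step: the new parameters never decrease p(X|lambda).
    _, dens = loglik(x, w, mu, var)
    resp = dens / dens.sum(axis=0)                   # responsibilities (E-step)
    nk = resp.sum(axis=1)
    w_new = nk / len(x)                              # weights still sum to 1
    mu_new = (resp @ x) / nk
    var_new = (resp * (x[None, :] - mu_new[:, None]) ** 2).sum(axis=1) / nk
    return w_new, mu_new, np.maximum(var_new, 1e-3)  # variance floor, an assumption

x = np.array([-2.1, -1.9, -2.0, 2.0, 2.1, 1.9])      # two obvious clusters
w, mu, var = np.array([0.5, 0.5]), np.array([-1.0, 1.0]), np.array([1.0, 1.0])
lls = []
for _ in range(3):
    ll, _ = loglik(x, w, mu, var)
    lls.append(ll)
    w, mu, var = em_step(x, w, mu, var)
```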
The beneficial effects of the invention are as follows:
1. The signal parameterization method and advanced pattern recognition technology realize the automatic identification of insect sounds. The method uses MFCC as the sound feature and GMM as the classifier, achieving an average identification rate of 98.95% when identifying the sounds of several types of insects; the time required to identify a sound sample of about 1 s is about 300 ms, showing good performance in terms of identification accuracy and identification time.
2. According to the insect pest identification and early warning method provided by the invention, insects are first induced and captured; sound features are extracted through MFCC, and the features in the required insect sound sample are classified through the Gaussian mixture model to identify the insect species. The identification is more accurate, comprehensive and reliable, so that the pest grade is accurately determined and the corresponding pest early warning is carried out, reducing the labor intensity of workers, increasing the identification accuracy of pests, obviously improving the pest prevention effect, and making pest early warning more intelligent and reliable.
3. The crop insect automatic identification technology based on deep learning has the advantages of accurate identification, high intelligent degree and the like, on one hand, insect testers get rid of mechanical and tedious insect identification and statistics work, the defect of manual detection is reduced, human resources and time cost are saved, and the insect identification rate is improved; on the other hand, the method is also beneficial to protecting the ecological environment, ensures the food safety, enhances the sustainable development of agriculture, provides timely and effective information for the prevention and treatment of insects, and has great practical value.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
Fig. 1 is a flowchart of an insect identification method based on deep learning of sound according to an embodiment of the present invention.
Detailed Description
For further explanation of the various embodiments, the drawings, which form a part of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments and, together with the description, serve to explain their principles of operation, enabling those of ordinary skill in the art to understand the various embodiments and the advantages of the invention. The figures are not to scale, and like reference numerals generally refer to like elements.
According to the first embodiment of the invention, an insect identification method based on deep learning of sound is provided.
The invention will be further described with reference to the accompanying drawings and specific embodiments. As shown in fig. 1, according to an embodiment of the invention, the insect identification method comprises the following steps:
s1, carrying out induction capture, storage and sound collection on insects to obtain an initial insect sound sample;
In one embodiment, carrying out induction capture, storage and sound collection on insects comprises the following steps:
s11, presetting a catching cage, and inducing insects to enter the catching cage by using an inducer in the catching cage;
Specifically, the catching cage can be arranged at field edges, on both sides of a road, in front of and behind houses, and the like. The inducer in the catching cage contains a lure core holding sex attractants for some 20 insect species, such as cotton bollworm, Asian corn borer, striped rice borer, armyworm, pink stem borer, black cutworm, peach fruit borer, diamondback moth, pink bollworm, pear fruit borer, apple leaf roller, rice leaf roller and Chilo suppressalis. The lure core is prepared by fully mixing a synthetic attractant with a suitable carrier (polyethylene, a rubber tube, a rubber head and the like), and each lure core generally contains 20-500 micrograms of sex attractant.
S12, trapping insects into a catching cage, and collecting the sounding signals of the insects by using a sensor;
and S13, transmitting the sounding signal to a recording device and storing the sounding signal in a database.
In a specific application, before the insect species is identified by its sound, the sound signal of the insect is collected, so the following sound collection step is further included before step S1: insects whose sounding signals are to be collected are trapped in a sound insulation box that reduces noise interference, and the insects are made to sound under stress without physical damage; a set distance, generally about 0.5 cm, is kept between the insect and a sensor arranged in the sound insulation box. During recording, the sensor collects the insect's sounding signal and transmits it to the recording device for storage. The recording equipment can be an Edirol R-4 digital recorder used in cooperation with an SM4001 sensor; the sampling frequency during recording is 96 kHz, the resolution is 16 bits, a single channel is used, and the volume is set to maximum.
In one embodiment, transmitting the sounding signal to a recording device and storing to a database comprises the steps of:
s131, deleting the incomplete and damaged sounding signals, and storing the screened sounding signals into a database;
In the specific application, for sounding-signal screening, the different sounding signals generated by the crawling and biting of stored-grain pests such as the rice weevil are collected, and a chaos algorithm is combined with a support vector machine, so that defective and damaged pest sounding signals can be deleted with good results; the sounding signals of different insects are separated by a blind source separation technique, and the incomplete and damaged sounding signals are deleted according to the spectrogram characteristics of the sounding signals of different insects.
And S132, dividing the data in the database into a training set and a test set, wherein the length of a sample in the training set is far longer than that of a sample in the test set.
In a specific application, in order to give the GMM classifier sufficient data for training, the samples with longer sound segments are used for training, and the samples with short segments are used for testing to improve recognition speed. For the recognition process, a sound segment of 1.2 s containing an active signal is enough to extract useful parameters, so the data in the library are divided into two data sets: a training set and a test set, the samples in the training set being much longer than those in the test set.
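The test-set preparation described above — evenly dividing a long recording into short clips of about 1.2 s — can be sketched as follows (the sampling rate and function name are illustrative assumptions):

```python
import numpy as np

def split_into_clips(signal, sr, clip_seconds=1.2):
    # Evenly divide a long sounding sample into fixed-length short clips;
    # a trailing remainder shorter than one clip is dropped.
    clip_len = int(sr * clip_seconds)
    n_clips = len(signal) // clip_len
    return [signal[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]

sr = 8000                                        # assumed sampling rate
clips = split_into_clips(np.zeros(sr * 6), sr)   # a 6 s recording
```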
S2, preprocessing the collected initial insect sound sample, and obtaining a required insect sound sample;
In one embodiment, preprocessing the collected insect sound samples comprises the following steps:
S21, denoising the collected insect sound samples to obtain samples to be processed;
In a specific application, sounds caused by natural phenomena such as wind, rain, thunder and lightning, as well as sounds generated by similar insects, may be present; in a natural environment these background noises are removed before the algorithm is applied. In addition, the sounds generated by the same insect species in different states — including sex, age, courtship, competition and alarm — are also distinguished, so that the automatic insect identification system can handle these different states. The sound files used in the invention are noise-free sound segments cut from the recorded signal; in operation, before identification, the insect sound is separated and detected from the sounding signal mixed with background noise, ensuring stability in the identification process.
S22, cutting each sample to be processed into a plurality of sound sections;
S23, detecting the sound segments (non-zero regions of the sound signal, containing pulses) and the silent segments (regions where the sound signal is zero, without pulses) within the sound sections, and carrying out preprocessing detection on each sound segment;
and S24, taking the sound segments after preprocessing detection as the required insect sound samples.
In one embodiment, the pre-processing detection includes calculation of Mel-frequency cepstral coefficients;
normalization: dividing each sample to be processed by the amplitude peak of its sound segment;
the normalization formula is as follows:
x̂(i) = x(i) / max |x(n)|, n = 1, ..., N
wherein x(i) is the original signal, x̂(i) is the normalized signal, N is the signal length, and i is the sample index;
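A direct rendering of the peak normalization above (the function name is illustrative):

```python
import numpy as np

def normalize(x):
    # x_hat(i) = x(i) / max |x(n)|: the peak amplitude becomes exactly 1.
    x = np.asarray(x, dtype=float)
    return x / np.abs(x).max()

y_norm = normalize([0.5, -2.0, 1.0])
```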
pre-emphasis: boosting the high-frequency components with a pre-emphasis factor and a pre-emphasis filter while keeping the low-frequency components at their original level, flattening the spectrum of the signal for spectrum analysis and vocal tract parameter analysis;
framing and windowing: overlapping and framing the signal and the Hamming window to change the signal into a short-time stable signal;
fourier transform: performing fast Fourier transform on the frame signal, and converting a time domain signal into a frequency domain signal;
mel filter bank: filtering the frequency spectrum coefficient obtained by Fourier transform by using a triangular filter to obtain a group of coefficients, wherein the span of the triangular filter is evenly distributed on the Mel axis;
log power spectrum: taking logarithm of the output of each filter to obtain a corresponding logarithm power spectrum;
discrete cosine transform: transforming the logarithmic power spectrum to a time domain by utilizing discrete cosine transform, wherein the amplitude of the obtained spectrum is the original Mel frequency cepstrum coefficient to obtain a static signal;
first order difference: obtaining a dynamic signal;
merging: combining the static signal and the dynamic signal to be used as an effective sounding sample signal;
segmenting: and carrying out segmentation operation on the data set of the test set, and uniformly dividing the long sounding sample into short sounding samples.
In one embodiment, the pre-emphasis factor is calculated as follows:
α=exp(-2πFΔt)
in the formula, delta t is the sampling period of the sounding signal, F is the frequency, and exp is an exponential curve;
the calculation formula of the pre-emphasis filter is as follows:
H(z) = 1 - αz^(-1)
where H(z) is the transfer function of the pre-emphasis filter, z is the z-transform variable, and α is the pre-emphasis factor.
S3, extracting features in the insect sound sample through MFCC to obtain a feature sample;
In one embodiment, extracting the features from the required insect sound sample through MFCC (Mel frequency cepstral coefficients) comprises the following steps:
S31, performing spectrum analysis and vocal tract parameter analysis on the required insect sound samples;
s32, dividing the sounding signal into short periods, calling each short period as a frame, and cutting a sounding signal waveform containing N samples of the frame length from the sounding signal;
s33, multiplying the time window function by the initial sounding signal;
s34, taking the frame length N =256, performing FFT (fast Fourier transform) on each frame, and performing modular squaring on a frequency spectrum to obtain a discrete power spectrum;
s35, mapping the discrete power spectrum to a Mel frequency scale, and filtering by using M Mel band-pass filters to obtain a group of coefficients;
s36, taking the logarithm of the output of each Mel band-pass filter to obtain a corresponding Mel logarithmic power spectrum;
s37, discrete cosine change is conducted on the Mel logarithm power spectrum, and the amplitude of the spectrum in the sound sample is obtained.
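Steps S32 to S34 can be sketched as follows; the hop size of 128 samples (50% overlap) is an assumed value, while the frame length N=256 comes from the text.

```python
import numpy as np

def frame_signal(signal, frame_len=256, hop=128):
    """Step S32: cut the signal into overlapping frames of N samples each."""
    n = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.stack([signal[i * hop : i * hop + frame_len] for i in range(n)])

def power_spectrum(frames):
    """Steps S33-S34: apply a time window to each frame, FFT it,
    and take the squared magnitude to get the discrete power spectrum."""
    windowed = frames * np.hamming(frames.shape[1])
    return np.abs(np.fft.rfft(windowed, axis=1)) ** 2
```

A real-input FFT of a 256-sample frame yields 129 frequency bins, which are then passed through the M Mel band-pass filters of step S35.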
In specific applications, the standard MFCC reflects only the static characteristics of the speech parameters, whereas the first-order difference MFCC (ΔMFCC) is a dynamic parameter that reflects their dynamic characteristics and has better robustness; on the basis of the first-order difference MFCC, the second-order difference MFCC can be further calculated. Automatic identification of insect sounds is realized by a signal parameterization method and an advanced pattern recognition technique, with MFCC as the sound feature and GMM as the classifier. The method achieves an average recognition rate of 98.95% when identifying the sounds of multiple insect species, and identifying a sound sample of about 1 s takes about 300 ms, showing good performance in both recognition accuracy and recognition time.
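The first- and second-order difference coefficients discussed above can be computed with a standard regression-style delta; the window width of 2 frames is an assumed default, not specified by the patent.

```python
import numpy as np

def delta(coeffs, width=2):
    """First-order difference (ΔMFCC) computed over ±width neighbouring frames."""
    T = len(coeffs)
    padded = np.pad(coeffs, ((width, width), (0, 0)), mode='edge')
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    out = np.zeros_like(coeffs, dtype=float)
    for n in range(1, width + 1):
        out += n * (padded[width + n : width + n + T] - padded[width - n : width - n + T])
    return out / denom

# The second-order difference (ΔΔMFCC) is simply delta applied twice.
```

Concatenating the static MFCC matrix with its delta (and optionally delta-delta) gives the merged static-plus-dynamic feature described in the merging step.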
And S4, performing sound classification on the characteristic samples by using a Gaussian mixture model.
In one embodiment, the Gaussian mixture model is constructed according to the following formula:
p(x|λ) = Σ_{i=1}^{M} p_i · b_i(x)
where x is a D-dimensional random vector, b_i(x) is the density of the i-th component, each component being a D-variate Gaussian function, i takes the values 1, 2, …, M, p_i is the mixture weight, and λ denotes the model parameters.
The density of each component is calculated as follows:
b_i(x) = (1 / ((2π)^(D/2) |Σ_i|^(1/2))) · exp(−(1/2)(x − μ_i)ᵀ Σ_i^(−1)(x − μ_i))
where μ_i is the mean vector, Σ_i is the covariance matrix, the mixture weights satisfy the relation Σ_{i=1}^{M} p_i = 1, D is the dimension of the Gaussian variable, and exp is the exponential function;
and the density of the Gaussian mixture model is parameterized by the mean vectors, covariance matrices and mixture weights of all the components.
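A direct NumPy evaluation of the mixture density above; the parameter values in the test are illustrative, not taken from the patent.

```python
import numpy as np

def component_density(x, mean, cov):
    """b_i(x): D-variate Gaussian density with mean vector and covariance matrix."""
    d = len(mean)
    diff = x - mean
    norm = 1.0 / ((2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(cov)))
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff)

def gmm_density(x, weights, means, covs):
    """p(x|λ) = Σ_i p_i · b_i(x), with the mixture weights summing to 1."""
    return sum(w * component_density(x, m, c) for w, m, c in zip(weights, means, covs))
```

With a single unit-variance component centred at the query point, the density reduces to the one-dimensional Gaussian peak 1/√(2π).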
In one embodiment, classifying the sounds of the features in the required insect sound samples with the Gaussian mixture model further comprises the following steps:
S41, starting from a preset initial model parameter λ, estimating a new model parameter λ̄ such that p(X|λ̄) ≥ p(X|λ), while ensuring the mixture weights satisfy Σ_{i=1}^{M} p_i = 1;
S42, taking the newly estimated model parameters as the initial model for the next iteration, and repeating step S41 until the convergence threshold is reached;
S43, performing sound training on N classes of insects with N GMM classifiers, and finding the class model with the greatest similarity for a given observation object;
S44, assuming the prior similarity of all classes to be equal, simplifying the classification rule, and performing insect sound identification by exploiting the independence between observations and computing log-likelihoods.
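Steps S43 and S44 amount to scoring the frames of a sound sample under each of the N class GMMs and taking the argmax; with equal class priors the Bayes rule reduces to a maximum-likelihood decision. A sketch under those assumptions:

```python
import numpy as np

def gmm_log_density(x, weights, means, covs):
    """log p(x|λ) for a full-covariance GMM (a small floor avoids log(0))."""
    total = 0.0
    d = len(x)
    for w, m, c in zip(weights, means, covs):
        diff = x - m
        norm = 1.0 / ((2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(c)))
        total += w * norm * np.exp(-0.5 * diff @ np.linalg.inv(c) @ diff)
    return np.log(total + 1e-300)

def classify(frames, class_models):
    """Score each class GMM by the summed per-frame log-likelihood
    (frames treated as independent observations) and return the argmax class."""
    scores = [sum(gmm_log_density(x, *model) for x in frames) for model in class_models]
    return int(np.argmax(scores))
```

Each entry of `class_models` is a `(weights, means, covs)` triple for one insect class; the model parameters used below are illustrative.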
In one embodiment, a re-estimation algorithm is employed in the iterative process to ensure a monotonic (non-decreasing) increase of the model likelihood value.
The re-estimation algorithm steps are as follows:
The average mixture weight is calculated by the formula:
p̄_i = (1/T) Σ_{t=1}^{T} p(i|x_t, λ)
The posterior probability of the i-th sound class is calculated by the formula:
p(i|x_t, λ) = p_i b_i(x_t) / Σ_{k=1}^{M} p_k b_k(x_t)
where λ is the model parameter, p_i is the mixture weight, b_i(x_t) is the density of the i-th component, p_k and b_k(x_t) are the weight and density of the k-th component, and i and k index the components evaluated over the frames x_t of the sound sample.
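One re-estimation pass of the formulas above can be sketched as follows; for brevity the sketch assumes diagonal covariances, which the patent does not mandate.

```python
import numpy as np

def em_step(frames, weights, means, variances):
    """One EM re-estimation pass for a diagonal-covariance GMM:
    the E-step computes the posteriors p(i|x_t, λ) = p_i·b_i(x_t) / Σ_k p_k·b_k(x_t),
    and the new mixture weight is their average over the T frames."""
    X = np.asarray(frames, dtype=float)            # shape (T, D)
    dens = np.stack([                              # b_i(x_t) for each component i
        np.exp(-0.5 * np.sum((X - m) ** 2 / v, axis=1)) / np.sqrt(np.prod(2.0 * np.pi * v))
        for m, v in zip(means, variances)
    ], axis=1)                                     # shape (T, M)
    num = dens * np.asarray(weights)               # p_i · b_i(x_t)
    post = num / num.sum(axis=1, keepdims=True)    # posterior p(i|x_t, λ)
    new_weights = post.mean(axis=0)                # average mixture weight p̄_i
    new_means = (post.T @ X) / post.sum(axis=0)[:, None]
    return new_weights, new_means
```

Because the new weights are averages of probabilities that sum to one per frame, they automatically satisfy the constraint Σ p_i = 1 after every iteration.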
In addition, in specific applications, in order to determine the pest grade and issue the corresponding pest warning, a Gradient Boosting Regression Tree (GBRT) model is also established. The disaster early-warning center combines the pest index and multidimensional environmental characteristics into a feature group and inputs it into the GBRT model, which computes a regression value from that feature group. The pest grade is determined by the size of the regression value: the larger the regression value, the higher the pest grade. The user can set a corresponding warning value according to actual needs; the disaster early-warning center compares the calculated regression value with the warning value, and if the regression value is larger than the warning value, the center issues an early warning and displays the pest grade; if the regression value is smaller than the warning value, the center only displays the pest grade without issuing a warning.
The disaster early-warning center can also issue different warnings for different pest grades. Assuming the pest grades are divided into three levels, the user can set a first threshold and a second threshold according to actual needs to classify the grades, where the first threshold is less than the second threshold. When the regression value is smaller than the first threshold, the pest grade is low and the probability of a pest outbreak is small; when the regression value lies between the first and second thresholds, the pest grade is medium; when the regression value is larger than the second threshold, the pest grade is high and the probability of a pest outbreak is large. If the disaster early-warning center issues pest warnings by sound, different warnings can be issued at different sound frequencies: the higher the pest grade, the higher the frequency of the warning sound, so the user can determine the pest grade directly from the sound and quickly learn the condition of the farmland.
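The grading and warning logic above reduces to two simple comparisons; this sketch uses illustrative threshold names (the patent leaves the actual values to the user).

```python
def pest_grade(regression_value, first_threshold, second_threshold):
    """Map the GBRT regression value to a pest grade;
    thresholds are user-set, with first_threshold < second_threshold."""
    if regression_value < first_threshold:
        return "low"
    if regression_value <= second_threshold:
        return "medium"
    return "high"

def should_warn(regression_value, warning_value):
    """The centre issues an early warning only when the regression value
    exceeds the user-set warning value; otherwise it only displays the grade."""
    return regression_value > warning_value
```

The regression value itself would come from a trained GBRT model fed with the pest-index and environmental feature group; that training step is outside this sketch.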
In summary, with the above technical solutions, the signal parameterization method and advanced pattern recognition technique of the present invention achieve automatic identification of insect sounds. The proposed automatic identification method uses MFCC as the sound feature and GMM as the classifier; it obtains an average recognition rate of 98.95% when identifying multiple types of insect sounds, and the time required to identify a sound sample of about 1 s is about 300 ms, showing good performance in both recognition accuracy and recognition time. According to the pest identification and early-warning method provided by the invention, insects are first induced and captured, sound features are extracted by MFCC, the feature sounds in the required insect sound samples are classified by a Gaussian mixture model, and the insect species is identified; the identification is more accurate, comprehensive and reliable, so that the pest grade is accurately determined and the corresponding pest warning is issued. This reduces the labor intensity of workers, increases the identification accuracy of pests, markedly improves pest prevention, and makes pest early warning more intelligent and reliable. The crop insect automatic identification technology based on deep learning has the advantages of accurate identification and a high degree of intelligence: on the one hand, it frees insect testers from mechanical and tedious insect identification and statistics work, reduces the shortcomings of manual detection, saves human resources and time cost, and improves the insect identification rate; on the other hand, it also helps to protect the ecological environment, ensures food safety, enhances the sustainable development of agriculture, provides timely and effective information for the prevention and treatment of insect pests, and has great practical value.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.
Claims (9)
1. An insect recognition method based on deep learning of sound, which is characterized by comprising the following steps:
s1, carrying out induction capture, storage and sound collection on insects to obtain an initial insect sound sample;
s2, preprocessing the collected initial insect sound sample, and obtaining a required insect sound sample;
S3, extracting the features in the insect sound samples through MFCC to obtain feature samples;
s4, performing sound classification on the feature samples by using a Gaussian mixture model;
the preprocessing of the collected insect sounds comprises the following steps:
S21, denoising the collected insect sound samples to obtain samples to be processed;
S22, cutting each sample to be processed into a plurality of sound segments;
S23, detecting the voiced and silent sections in the sound segments, and performing preprocessing detection on each sound segment;
and S24, taking the sound segments after the preprocessing detection as the required insect sound samples.
2. The insect recognition method based on deep learning of sound according to claim 1, wherein the induction capturing, storing and sound collecting of the insects to obtain the initial insect sound samples comprises the following steps:
S11, presetting a catching cage, and inducing insects to enter the catching cage by using an attractant placed in the cage;
S12, trapping the insects in the catching cage, and collecting their sound signals with a sensor;
and S13, storing the sounding signal to a database.
3. The method according to claim 2, wherein transmitting the sound signal to a recording device and storing the sound signal in a database comprises:
S131, deleting incomplete and damaged sound signals, and storing the screened sound signals in the database;
and S132, dividing the data in the database into a training set and a test set, wherein the length of the samples in the test set is far greater than that of the samples in the training set.
4. The insect recognition method based on deep learning of sound according to claim 1, wherein the preprocessing detection comprises the calculation of Mel-frequency cepstral coefficients:
normalization: dividing each sample to be processed by the amplitude peak of its sound segment;
pre-emphasis: boosting the high-frequency components with a pre-emphasis factor and a pre-emphasis filter while keeping the low-frequency components at their original level, flattening the spectrum of the signal for spectral analysis and vocal tract parameter analysis;
framing and windowing: applying overlapping frames and a Hamming window to the signal to turn it into a short-time stationary signal;
Fourier transform: performing a fast Fourier transform on each frame signal to convert the time-domain signal into a frequency-domain signal;
mel filter bank: filtering the spectral coefficients obtained by the Fourier transform with triangular filters whose spans are evenly distributed on the Mel axis, to obtain a group of coefficients;
log power spectrum: taking the logarithm of the output of each filter to obtain the corresponding logarithmic power spectrum;
discrete cosine transform: transforming the logarithmic power spectrum back to the time domain by the discrete cosine transform, the amplitudes of the resulting spectrum being the original Mel-frequency cepstral coefficients, which form a static signal;
first-order difference: computing the first-order difference of the static coefficients to obtain a dynamic signal;
merging: combining the static signal and the dynamic signal to serve as an effective sound sample signal;
segmenting: segmenting the data of the test set so that longer sound samples are evenly divided into short sound samples.
5. The method of claim 4, wherein the pre-emphasis factor is calculated as follows:
α=exp(-2πFΔt)
in the formula, Δt is the sampling period of the sound signal, F is the frequency, and exp is the exponential function;
the calculation formula of the pre-emphasis filter is as follows:
H(z)=1-αz^(-1)
where H(z) is the transfer function of the pre-emphasis filter, z is the z-transform variable, and α is the pre-emphasis factor.
6. The insect recognition method based on deep learning of sound according to claim 1, wherein extracting the features in the required insect sound samples by MFCC comprises the following steps:
S31, performing spectral analysis and vocal tract parameter analysis on the required insect sound samples;
S32, dividing the sound signal into short time segments, each called a frame, and cutting from the sound signal a waveform containing N samples of the frame length;
S33, multiplying the initial sound signal by a time window function;
S34, taking the frame length N=256, performing an FFT (fast Fourier transform) on each frame, and taking the squared magnitude of the spectrum to obtain the discrete power spectrum;
S35, mapping the discrete power spectrum onto the Mel frequency scale and filtering it with M Mel band-pass filters to obtain a group of coefficients;
S36, taking the logarithm of the output of each Mel band-pass filter to obtain the corresponding Mel logarithmic power spectrum;
S37, applying the discrete cosine transform to the Mel logarithmic power spectrum to obtain the amplitudes of the spectrum in the sound sample.
7. The insect recognition method based on deep learning of sound according to claim 1, wherein the Gaussian mixture model is constructed according to the following formula:
p(x|λ) = Σ_{i=1}^{M} p_i · b_i(x)
where x is a D-dimensional random vector, b_i(x) is the density of the i-th component, each component being a D-variate Gaussian function, i takes the values 1, 2, …, M, p_i is the mixture weight, and λ denotes the model parameters;
the density of each component is calculated as follows:
b_i(x) = (1 / ((2π)^(D/2) |Σ_i|^(1/2))) · exp(−(1/2)(x − μ_i)ᵀ Σ_i^(−1)(x − μ_i))
where μ_i is the mean vector, Σ_i is the covariance matrix, the mixture weights satisfy the relation Σ_{i=1}^{M} p_i = 1, D is the dimension of the Gaussian variable, and exp is the exponential function;
and the density of the Gaussian mixture model is parameterized by the mean vectors, covariance matrices and mixture weights of all the components.
8. The insect recognition method based on deep learning of sound according to claim 7, wherein classifying the sounds of the features in the required insect sound samples with the Gaussian mixture model further comprises the following steps:
S41, starting from a preset initial model parameter λ, estimating a new model parameter λ̄ such that p(X|λ̄) ≥ p(X|λ), while ensuring the mixture weights satisfy Σ_{i=1}^{M} p_i = 1;
S42, taking the newly estimated model parameters as the initial model for the next iteration, and repeating step S41 until the convergence threshold is reached;
S43, performing sound training on N classes of insects with N GMM classifiers, and finding the class model with the greatest similarity for a given observation object;
S44, assuming the prior similarity of all classes to be equal, simplifying the classification rule, and performing insect sound identification by exploiting the independence between observations and computing log-likelihoods.
9. The method according to claim 8, wherein a re-estimation algorithm is employed in the iterative process to ensure a monotonic (non-decreasing) increase of the model likelihood value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211425420.5A CN115910077A (en) | 2022-11-15 | 2022-11-15 | Insect identification method based on deep learning of sound |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115910077A true CN115910077A (en) | 2023-04-04 |
Family
ID=86485084
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211425420.5A Pending CN115910077A (en) | 2022-11-15 | 2022-11-15 | Insect identification method based on deep learning of sound |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115910077A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101976564A (en) * | 2010-10-15 | 2011-02-16 | 中国林业科学研究院森林生态环境与保护研究所 | Method for identifying insect voice |
CN107369444A (en) * | 2016-05-11 | 2017-11-21 | 中国科学院声学研究所 | A kind of underwater manoeuvre Small object recognition methods based on MFCC and artificial neural network |
CN109726700A (en) * | 2019-01-07 | 2019-05-07 | 武汉南博网络科技有限公司 | A kind of identifying pest method for early warning and device based on multiple features |
KR20190087363A (en) * | 2019-07-15 | 2019-07-24 | 인하대학교 산학협력단 | System and method for hidden markov model based uav sound recognition using mfcc technique in practical noisy environments |
Non-Patent Citations (2)
Title |
---|
WU MEIMEI: "Machine Learning Algorithms and Their Applications", 31 May 2020, Beijing: China Machine Press, pages: 74-78 *
ZHU LEQING et al.: "Automatic identification of insect sounds based on MFCC and GMM", Acta Entomologica Sinica, vol. 55, no. 4, 20 April 2012 (2012-04-20), pages 466-471 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Chesmore et al. | Automated identification of field-recorded songs of four British grasshoppers using bioacoustic signal recognition | |
US5956463A (en) | Audio monitoring system for assessing wildlife biodiversity | |
Potamitis et al. | On automatic bioacoustic detection of pests: the cases of Rhynchophorus ferrugineus and Sitophilus oryzae | |
Le-Qing | Insect sound recognition based on mfcc and pnn | |
Evans et al. | Monitoring grassland birds in nocturnal migration | |
CN106847293A (en) | Facility cultivation sheep stress behavior acoustical signal monitoring method | |
CN109243470A (en) | Broiler chicken cough monitoring method based on Audiotechnica | |
Kvsn et al. | Bioacoustics data analysis–A taxonomy, survey and open challenges | |
CN101976564A (en) | Method for identifying insect voice | |
de Souza et al. | Classification of data streams applied to insect recognition: Initial results | |
Himawan et al. | Deep Learning Techniques for Koala Activity Detection. | |
CN112331231B (en) | Broiler feed intake detection system based on audio technology | |
CN110265041A (en) | A kind of method and system for the song behavior collected, analyze pig | |
CN115048984A (en) | Sow oestrus recognition method based on deep learning | |
Boulmaiz et al. | The use of WSN (wireless sensor network) in the surveillance of endangered bird species | |
CN117727308B (en) | Mixed bird song recognition method based on deep migration learning | |
CN113936667A (en) | Bird song recognition model training method, recognition method and storage medium | |
CN111540368A (en) | Stable bird sound extraction method and device and computer readable storage medium | |
Zhang et al. | A novel insect sound recognition algorithm based on MFCC and CNN | |
Xie et al. | Detecting frog calling activity based on acoustic event detection and multi-label learning | |
Sharma et al. | Bioacoustics Monitoring of Wildlife using Artificial Intelligence: A Methodological Literature Review | |
Yazgaç et al. | Detection of sunn pests using sound signal processing methods | |
CN115910077A (en) | Insect identification method based on deep learning of sound | |
Cosentino et al. | Porpoise click classifier (PorCC): A high-accuracy classifier to study harbour porpoises (Phocoena phocoena) in the wild | |
CH719673A2 (en) | AI-BASED REAL-TIME ACOUSTIC WILDLIFE MONITORING SYSTEM |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||