CN115050014A

CN115050014A - Small sample tomato disease identification system and method based on image text learning

Info

Publication number: CN115050014A
Application number: CN202210673683.1A
Authority: CN
Inventors: 王春山; 周冀; 吴育瑶; 冯徐广; 张文浩; 孙威; 孙社栋; 姚惠; 王彤; 王前; 杜鹏飞; 李久熙
Original assignee: Hebei Agricultural University
Current assignee: Hebei Agricultural University
Priority date: 2022-06-15
Filing date: 2022-06-15
Publication date: 2022-09-13

Abstract

The invention discloses a small sample tomato disease identification system and method based on image text learning. The system comprises an image classification module, a text classification module and a joint classification module. The image classification module obtains a first prediction probability of the tomato disease type based on a tomato image; the text classification module obtains a second prediction probability of the tomato disease type based on tomato text information; and the joint classification module jointly outputs the first prediction probability and the second prediction probability to obtain the disease category. With this technical scheme, tomato diseases can be accurately identified without a large number of disease images.

Description

Small sample tomato disease identification system and method based on image text learning

Technical Field

The invention belongs to the technical field of image model recognition, and particularly relates to a small sample tomato disease recognition system and method based on image text learning.

Background

During tomato production, diseases occur frequently because of pathogens such as bacteria, fungi and viruses. These diseases seriously affect the quality and yield of vegetables and cause very serious economic losses each year. Traditional tomato disease diagnosis relies mainly on agricultural experts or technicians identifying and evaluating the disease from experience; it is time-consuming, labor-intensive and inefficient, and it struggles to meet the real-time and accuracy requirements of rapid disease prevention and control. Over the past few years, the development of deep convolutional networks has driven rapid progress in general visual recognition on large-scale benchmark data sets such as ImageNet. These CNN-based models and algorithms have proven useful for disease identification problems. More and more research has therefore focused on vegetable disease detection and classification, with some success.

Tomato leaf disease identification models based on deep learning generally need large-scale sets of diseased-leaf images as training data, and building data sets that cover the different disease stages is time-consuming and labor-intensive. In complex environments, tomato disease images also often contain varied backgrounds such as other plants, soil, mulching film and water pipes. In existing tomato disease data sets, therefore, on the one hand the number of disease images is not enough to train a recognition model with a large number of parameters; on the other hand, the backgrounds of the disease images are uniform, so a model trained directly on such a data set shows large errors when tested in a real environment.

Disclosure of Invention

The invention aims to provide a small sample tomato disease identification system based on image text learning, so as to solve the problems in the prior art.

In order to achieve the above object, the present invention provides a small sample tomato disease recognition system based on image text learning, comprising: an image classification module, a text classification module and a joint classification module;

the image classification module is used for obtaining a first prediction probability of tomato disease types based on a tomato image;

the text classification module is used for obtaining a second prediction probability of the tomato disease types based on the tomato text information;

and the joint classification module is used for performing joint output on the first prediction probability and the second prediction probability to obtain the disease category.

Preferably, the image classification module comprises: a first feature extraction network and a first probability calculation unit;

the first feature extraction network is used for extracting features of the tomato image to obtain a first extraction result; wherein the first extraction result comprises: tomato characteristic images and tomato characteristic image labels;

the first probability calculation unit calculates a first prediction probability based on the first extraction result.

Preferably, the text classification module comprises: a second feature extraction network and a second probability calculation unit;

the second feature extraction network is used for extracting features of the tomato text information to obtain a text information extraction result; wherein the tomato text information comprises: position information of lesion growth, information on the front and back sides of the leaf, and information on the disease characteristics themselves; and wherein the text information extraction result comprises: tomato feature text and tomato feature text labels;

the second probability calculation unit is configured to calculate a second prediction probability based on the text information extraction result.

Preferably, the second feature extraction network comprises a context network and a current text network;

the context network performs feature extraction on context information through a bidirectional recurrent neural network to obtain a second extraction result; wherein the context information comprises position information of lesion growth and information on the front and back sides of the leaf;

the current text network performs feature extraction on the current text information through a convolutional neural network to obtain a third extraction result; wherein the current text information is the self information of the disease characteristics.

Preferably, the joint classification module comprises: a combined output unit and a disease identification unit;

the combined output unit is used for performing combined output on the tomato image and the tomato text to obtain a third prediction probability;

and the disease identification unit identifies the tomato diseases based on the third prediction probability to obtain disease types.

On the other hand, in order to achieve the technical purpose, the invention provides a small sample tomato disease identification method based on image text learning, which comprises the following steps:

obtaining a first prediction probability of tomato disease types based on a tomato image;

obtaining a second prediction probability of the tomato disease type based on the tomato text information;

and jointly outputting the first prediction probability and the second prediction probability to obtain the disease category.

Preferably, the process of obtaining a first predicted probability of a tomato disease species comprises:

performing feature extraction on the tomato image to obtain a first extraction result; wherein the first extraction result comprises: tomato characteristic images and tomato characteristic image labels; a first prediction probability is calculated based on the first extraction result.

Preferably, the process of obtaining the second predicted probability of the tomato disease species comprises:

performing feature extraction on the tomato text information to obtain a text information extraction result; calculating a second prediction probability based on the text information extraction result; wherein the tomato text information comprises: position information of lesion growth, information on the front and back sides of the leaf, and information on the disease characteristics themselves; and wherein the text information extraction result comprises: tomato feature text and tomato feature text labels.

Preferably, the process of extracting the features of the tomato text information comprises the following steps:

performing feature extraction on the context information through a bidirectional recurrent neural network to obtain a second extraction result; wherein the context information comprises position information of lesion growth and information on the front and back sides of the leaf;

performing feature extraction on the current text information through a convolutional neural network to obtain a third extraction result; wherein the current text information is the self information of the disease characteristics;

wherein the text information extraction result comprises: position information of lesion growth, information on the front and back sides of the leaf, and information on the disease characteristics themselves.

Preferably, the process of obtaining disease categories comprises:

jointly outputting the tomato image and the tomato text to obtain a third prediction probability; and identifying the tomato diseases based on the third prediction probability to obtain the disease types.

The technical effects of the invention are as follows. A first prediction probability of the tomato disease type is obtained from a tomato image and a second prediction probability is obtained from tomato text, so that disease types are classified by an image channel and a text channel working in parallel. Disease description information is added in text form on top of the image data set, forming disease image-text pairs that serve as the data set of the disease identification model. By describing disease symptoms in the text channel, the invention further strengthens the feature expression of the disease and weakens the influence of the image background on the identification process. Through the joint classification module, an image-text pair can be input into the model, and the model finally outputs the tomato disease category.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application. In the drawings:

FIG. 1 is a schematic diagram of a small sample tomato disease identification network structure in an embodiment of the invention;

FIG. 2 is a flow chart of a small sample tomato disease identification method in an embodiment of the invention.

Detailed Description

It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.

It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.

Example one

As shown in FIG. 1, the present embodiment provides a small sample tomato disease recognition system based on image text learning, comprising: an image classification module, a text classification module and a joint classification module;

the image classification module is used for obtaining a first prediction probability of the tomato disease type based on the tomato image;

the text classification module is used for obtaining a second prediction probability of the tomato disease type based on the tomato text information;

and the joint classification module is used for carrying out joint output on the first prediction probability and the second prediction probability to obtain the disease category.
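The relationship between the three modules can be sketched in a few lines of Python. This is only an illustrative skeleton under stated assumptions: the function name `identify_disease`, the callable parameters and the placeholder class names are hypothetical and do not come from the embodiment, which does not prescribe any programming interface.

```python
from typing import Callable, List, Sequence

Probabilities = Sequence[float]


def identify_disease(
    image: object,
    text: object,
    image_classifier: Callable[[object], Probabilities],  # image classification module
    text_classifier: Callable[[object], Probabilities],   # text classification module
    joint_output: Callable[[Probabilities, Probabilities], Probabilities],
    class_names: List[str],
) -> str:
    """Run the two branches in parallel and let the joint module decide."""
    p_first = image_classifier(image)    # first prediction probability
    p_second = text_classifier(text)     # second prediction probability
    p_joint = joint_output(p_first, p_second)
    # The disease category is the class with the largest joint value.
    best = max(range(len(p_joint)), key=lambda k: p_joint[k])
    return class_names[best]
```

Any pair of classifiers that return one value per disease class can be plugged in, which mirrors the way the image channel and the text channel remain independent until the joint classification module.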

In some embodiments, the image classification module comprises: a first feature extraction network and a first probability calculation unit; the first feature extraction network is used for extracting features of the tomato image to obtain a first extraction result; wherein the first extraction result comprises: tomato characteristic images and tomato characteristic image labels; a first probability calculation unit that calculates a first prediction probability based on the first extraction result.

In this embodiment, the image classification module is specifically as follows. Image feature extraction networks with different structures show different strengths on different recognition tasks, and most of them are composed of convolutional layers, pooling layers and fully connected layers. Weighing network performance against model size, this network structure uses ResNet18 as the feature extraction network Img-Net of the image branch.

Given an image I and an image label L as the input of Img-Net, the output of the image branch $[P_{I1}, P_{I2}, P_{I3}, P_{I4}, P_{I5}]_{img}$ is obtained after feature extraction, as shown in formula (1).

$$[P_{I1}, P_{I2}, P_{I3}, P_{I4}, P_{I5}]_{img} = S(W(I), L) \tag{1}$$

where $W(\cdot)$ denotes the extraction result of the feature extraction network Img-Net, $S(\cdot)$ is the Softmax function, and $P$ denotes the prediction probability of each disease type.
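As a small illustration of the Softmax step S(·) in formula (1), the following sketch turns five raw scores into five class probabilities. It assumes the scores are the final-layer outputs of Img-Net for one image; the numeric values are hypothetical and are not results from the embodiment.

```python
import math


def softmax(scores):
    """Map raw class scores to probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for the five disease classes of one image.
scores = [2.0, 0.5, 0.1, -1.0, 0.3]
p_img = softmax(scores)  # plays the role of [P_I1, ..., P_I5]_img
```

The largest raw score keeps the largest probability, so the predicted class of the image branch is unchanged by the normalization.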

In some embodiments, the text classification module comprises: a second feature extraction network and a second probability calculation unit; the second feature extraction network is used for extracting features of the tomato text information to obtain a text information extraction result; wherein the tomato text information comprises: position information of lesion growth, information on the front and back sides of the leaf, and information on the disease characteristics themselves; and the text information extraction result comprises: tomato feature text and tomato feature text labels; and the second probability calculation unit is used for calculating a second prediction probability based on the text information extraction result.

In some embodiments, the second feature extraction network comprises a context network and a current text network; the context network performs feature extraction on the context information through a bidirectional recurrent neural network to obtain a second extraction result; the context information comprises position information of lesion growth and information on the front and back sides of the leaf; the current text network performs feature extraction on the current text information through a convolutional neural network to obtain a third extraction result; wherein the current text information is the information on the disease characteristics themselves.

In this embodiment, the text classification module is specifically as follows. Unlike image feature extraction networks, text feature extraction networks are mostly composed of recurrent neural network layers, in order to better extract the context information between texts. In a vegetable disease description text, however, the feature extraction network must extract not only the context information between texts (such as the lesion growth position and the front and back sides of the leaf); the information on the disease characteristics themselves is also very important. Compared with a recurrent layer, a convolutional layer is better at extracting such specific features. This network structure therefore uses TextRCNN as the feature extraction network Text-Net of the text branch.

Given preprocessed text $C(T_i)$ and the text label L as the input of Text-Net, the text $C(T_i)$ is first passed through a bidirectional recurrent neural network (LSTM) to obtain the context features $C_l(T_i)$ and $C_r(T_i)$ of the text, calculated as shown in formulas (2) and (3).

$$C_l(T_i) = f\left(W^{(l)} C_l(T_{i-1}) + W^{(sl)} e(T_{i-1})\right) \tag{2}$$

$$C_r(T_i) = f\left(W^{(r)} C_r(T_{i+1}) + W^{(sr)} e(T_{i+1})\right) \tag{3}$$

where $f(\cdot)$ is the Tanh activation function; $T_i$ is the current vectorized text; $T_{i-1}$ and $T_{i+1}$ are the preceding and following vectorized text, respectively; $W^{(l)}$ and $W^{(r)}$ are the matrices that transform one hidden layer of the left and right recurrent neural networks into the next hidden layer; $W^{(sl)}$ and $W^{(sr)}$ are the matrices that combine the semantics of the current vectorized text with those of its left and right neighbors; $C_l$ and $C_r$ are the left and right context of the current vectorized text; and $e(T_{i-1})$ and $e(T_{i+1})$ are the word embedding vectors to the left and right of the current vectorized text.
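The recurrences in formulas (2) and (3) can be traced with a short pure-Python sketch. It follows the two formulas literally, as a plain Tanh recurrence, and does not model the gating of an LSTM cell. The helper names, the list-of-lists matrix representation and the boundary context passed as `initial` are assumptions made for illustration only.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def step(w_hidden, w_embed, neighbor_context, neighbor_embedding):
    """One application of f(W C(T_neighbor) + W^s e(T_neighbor)) with f = tanh."""
    a = matvec(w_hidden, neighbor_context)
    b = matvec(w_embed, neighbor_embedding)
    return [math.tanh(x + y) for x, y in zip(a, b)]


def left_contexts(embeddings, w_l, w_sl, initial):
    """Formula (2): scan left to right, using position i-1 to build C_l(T_i)."""
    contexts = [initial]  # assumed boundary context for the first position
    for i in range(1, len(embeddings)):
        contexts.append(step(w_l, w_sl, contexts[i - 1], embeddings[i - 1]))
    return contexts


def right_contexts(embeddings, w_r, w_sr, initial):
    """Formula (3): scan right to left, using position i+1 to build C_r(T_i)."""
    n = len(embeddings)
    contexts = [None] * n
    contexts[n - 1] = initial  # assumed boundary context for the last position
    for i in range(n - 2, -1, -1):
        contexts[i] = step(w_r, w_sr, contexts[i + 1], embeddings[i + 1])
    return contexts
```

Each position therefore receives one vector summarizing everything to its left and one summarizing everything to its right, which is what allows the text branch to see the lesion position and leaf-side information surrounding a disease-feature word.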

The obtained context features are spliced with the current text features and used as the input of a specific feature extractor. The specific feature extractor uses a maximum pooling layer (MaxPool) to extract the most salient text features in the vectorized text, and the output of the text branch $[P_{T1}, P_{T2}, P_{T3}, P_{T4}, P_{T5}]_{text}$ is finally obtained, as shown in formula (4).

$$[P_{T1}, P_{T2}, P_{T3}, P_{T4}, P_{T5}]_{text} = S\left(fc\left(M\left(C_l(T_i) + C(T_i) + C_r(T_i)\right)\right), L\right) \tag{4}$$

where $S(\cdot)$ is the Softmax function, $fc(\cdot)$ is a fully connected layer, $M(\cdot)$ is the maximum pooling layer (MaxPool), the $+$ inside $M(\cdot)$ denotes the splicing described above, and $P$ denotes the prediction probability of each disease type.
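A minimal sketch of formula (4) is given below: splice the left context, the current text feature and the right context at each position, take the maximum over positions, apply a fully connected layer and normalize with Softmax. The function names, the weight layout and the tiny dimensions are hypothetical; only the order of operations is taken from the formula.

```python
import math


def softmax(scores):
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def max_pool_over_positions(features):
    """M(.): keep, for every feature dimension, the largest value over all positions."""
    return [max(column) for column in zip(*features)]


def fully_connected(weights, bias, vector):
    """fc(.): one output per row of weights."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def text_branch(left, current, right, weights, bias):
    """Return the class probabilities of the text branch for one description."""
    # Splicing: list concatenation of [C_l(T_i), C(T_i), C_r(T_i)] per position.
    spliced = [l + c + r for l, c, r in zip(left, current, right)]
    pooled = max_pool_over_positions(spliced)
    return softmax(fully_connected(weights, bias, pooled))
```

Because the pooling takes the maximum over positions, a strongly expressed disease feature anywhere in the description survives into the pooled vector regardless of where it appears in the text.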

In some embodiments, the joint classification module comprises: a joint output unit and a disease identification unit; the combined output unit is used for performing combined output on the tomato image and the tomato text to obtain a third prediction probability; and the disease identification unit is used for identifying the tomato diseases based on the third prediction probability to obtain the disease types.

In this embodiment, the joint classification module is specifically as follows. Img-Net and Text-Net extract the features of the two modalities in an image-text pair from different angles. After fusion, the features of the two modalities are combined, which further widens the gap between the high-probability and low-probability classification values and thereby increases the classification confidence of the joint classifier. The joint output is given by formula (5).

$$[P_{J1}, P_{J2}, P_{J3}, P_{J4}, P_{J5}]_{joint} = [P_{I1+T1}, P_{I2+T2}, P_{I3+T3}, P_{I4+T4}, P_{I5+T5}] \tag{5}$$
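The effect described for the joint output can be checked numerically. The sketch below assumes that the subscript I+T in formula (5) means a class-by-class sum of the image-branch and text-branch probabilities; that reading, the function name `joint_output` and the probability values are assumptions for illustration, not data from the embodiment.

```python
def joint_output(p_img, p_text):
    """Combine the two branches class by class (assumed element-wise sum)."""
    if len(p_img) != len(p_text):
        raise ValueError("both branches must cover the same disease classes")
    return [pi + pt for pi, pt in zip(p_img, p_text)]


# Hypothetical outputs of the two branches for one image-text pair.
p_img = [0.60, 0.20, 0.10, 0.05, 0.05]
p_text = [0.70, 0.10, 0.10, 0.05, 0.05]

p_joint = joint_output(p_img, p_text)
predicted = max(range(len(p_joint)), key=lambda k: p_joint[k])

# Gap between the top class and the runner-up, before and after fusion.
gap_img = p_img[0] - p_img[1]
gap_joint = p_joint[0] - p_joint[1]
```

When both branches favor the same class, the gap between that class and the runner-up is larger after fusion than in the image branch alone. Under this reading the joint values sum to 2 rather than 1, which does not change which class is largest.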

The loss function of the feature combination is shown in formula (6) [formula not reproduced in the source text], where T is the number of sample classes, the two label values are the same, and the remaining two terms are the predicted probability values of Text-Net and Img-Net, respectively.

In this embodiment, the learning rates of the image classification module and the text classification module are both 0.0001, Adam is used as the optimizer, and the batch size is 16. The model training process comprises the following steps:

Step one, data preprocessing. Preprocessing is normally needed to increase the diversity of the original data set. However, because the original data set was shot in real fields, and in view of the conditions of real application, no data augmentation is applied to the disease images; the original images are only uniformly resized to 224 x 224 pixels. The disease description text data are vectorized with a bag-of-words model, with a vectorization length of 20.

Step two, training the whole network for 50 epochs.

Step three, network testing, namely testing the network on the test set.
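The settings and the text vectorization of step one can be collected in a small sketch. The hyperparameter values are those stated above; everything else is an assumption: the class name `TrainingConfig`, whitespace tokenization, the use of index 0 for both padding and unknown words, and the reading of "vectorization length 20" as a fixed-length sequence of vocabulary indices. The sample sentences are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    learning_rate: float = 1e-4   # same for image and text branches
    optimizer: str = "Adam"
    batch_size: int = 16
    epochs: int = 50
    image_size: tuple = (224, 224)
    text_length: int = 20


def build_vocabulary(texts):
    """Assign an index to every distinct token; 0 is reserved for padding."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab) + 1)
    return vocab


def vectorize(text, vocab, length=20):
    """Map a description to a fixed-length list of vocabulary indices."""
    ids = [vocab.get(token, 0) for token in text.lower().split()][:length]
    return ids + [0] * (length - len(ids))  # pad short texts with zeros


config = TrainingConfig()
vocab = build_vocabulary(["brown spots on the back of the leaf"])
vector = vectorize("brown spots on the leaf", vocab, config.text_length)
```

Fixing the length at 20 lets descriptions of different lengths be stacked into batches of 16 without further processing.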

The beneficial effects of this embodiment:

In this embodiment, disease types are classified with an image channel and a text channel working in parallel. Disease description text is added on top of the disease image data to assist the diagnosis process, so that the disease identification model can achieve good results from a small number of image-text pairs. Different diseases appear differently on the leaf, and because disease features can be present on both the front and the back of the leaf, an image of the back of the diseased leaf is added to assist the diagnosis process.

Compared with existing disease recognition models, a vegetable disease recognition model based on image and text collaborative representation learning (ITC-Net) is constructed. Using the correlation and complementarity between the disease image features and the text description, the effects of training the image modality alone and training the text modality alone are both obtained on a small sample data set of vegetable diseases in complex environments, and the accuracy, precision, sensitivity and specificity on the test set are 99.48%, 98.90%, 98.78% and 99.66%, respectively. This work provides a feasible scheme for small sample disease identification based on image text collaborative representation learning in real agricultural scenarios.
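For reference, the four reported measures can be computed from a confusion matrix as sketched below. The embodiment does not state how the per-class values are aggregated, so the macro average used here is an assumption, as are the function names; rows are taken as true classes and columns as predicted classes.

```python
def per_class_metrics(confusion, k):
    """One-vs-rest metrics for class k (rows: true class, columns: predicted)."""
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp
    fp = sum(row[k] for row in confusion) - tp
    tn = total - tp - fn - fp
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


def macro_average(confusion):
    """Average each metric over all classes (assumed aggregation)."""
    per_class = [per_class_metrics(confusion, k) for k in range(len(confusion))]
    return {name: sum(m[name] for m in per_class) / len(per_class)
            for name in per_class[0]}
```

Sensitivity measures how many samples of a disease are found, while specificity measures how rarely other classes are mistaken for it, so the two together show whether errors are missed detections or false alarms.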

Example two

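As shown in FIG. 2, the embodiment provides a small sample tomato disease identification method based on image text learning, which includes the following steps:

obtaining a first prediction probability of the tomato disease type based on a tomato image;

obtaining a second prediction probability of the tomato disease type based on the tomato text information;

and jointly outputting the first prediction probability and the second prediction probability to obtain the disease category.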

In some embodiments, the process of obtaining a first predicted probability of a tomato disease species comprises: performing feature extraction on the tomato image to obtain a first extraction result; wherein the first extraction result comprises: tomato characteristic images and tomato characteristic image labels; a first prediction probability is calculated based on the first extraction result.

In some embodiments, the process of obtaining a second prediction probability of the tomato disease type comprises: performing feature extraction on the tomato text information to obtain a text information extraction result; calculating a second prediction probability based on the text information extraction result; wherein the tomato text information comprises: position information of lesion growth, information on the front and back sides of the leaf, and information on the disease characteristics themselves; and wherein the text information extraction result comprises: tomato feature text and tomato feature text labels.

In some embodiments, the process of feature extraction of the tomato text information comprises: performing feature extraction on the context information through a bidirectional recurrent neural network to obtain a second extraction result; wherein the context information comprises position information of lesion growth and information on the front and back sides of the leaf; performing feature extraction on the current text information through a convolutional neural network to obtain a third extraction result; wherein the current text information is the information on the disease characteristics themselves; wherein the text information extraction result comprises: position information of lesion growth, information on the front and back sides of the leaf, and information on the disease characteristics themselves.

In some embodiments, the process of obtaining disease categories comprises: the tomato image and the tomato text are jointly output to obtain a third prediction probability; and identifying the tomato diseases based on the third prediction probability to obtain the disease types.

The above description is only for the preferred embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A small sample tomato disease identification system based on image text learning, characterized by comprising: an image classification module, a text classification module and a joint classification module;

2. The small-sample tomato disease recognition system based on image text learning of claim 1,

the image classification module comprises: a first feature extraction network and a first probability calculation unit;

3. The small-sample tomato disease recognition system based on image text learning of claim 1,

the text classification module comprises: a second feature extraction network and a second probability calculation unit;

the second feature extraction network is used for extracting features of the tomato text information to obtain a text information extraction result; wherein the tomato text information comprises: position information of lesion growth, information on the front and back sides of the leaf, and information on the disease characteristics themselves; and wherein the text information extraction result comprises: tomato feature text and tomato feature text labels;

the second probability calculation unit is used for calculating a second prediction probability based on the text information extraction result.

4. The small-sample tomato disease recognition system based on image text learning of claim 3,

the second feature extraction network comprises a context network and a current text network;

5. The small-sample tomato disease recognition system based on image text learning of claim 1,

the joint classification module comprises: a combined output unit and a disease identification unit;

and the disease identification unit identifies the tomato diseases based on the third prediction probability to obtain disease categories.

6. A small sample tomato disease identification method based on image text learning is characterized by comprising the following steps:

7. The small-sample tomato disease recognition method based on image text learning according to claim 6,

the process of obtaining a first predicted probability of a tomato disease species comprises:

8. The small-sample tomato disease recognition method based on image text learning according to claim 6,

the process of obtaining the second prediction probability of the tomato disease category comprises the following steps:

9. The small-sample tomato disease recognition method based on image text learning according to claim 8,

the process of extracting the characteristics of the tomato text information comprises the following steps:

10. The small-sample tomato disease recognition method based on image text learning according to claim 6,

the process of obtaining disease categories comprises the following steps:

the tomato image and the tomato text are jointly output to obtain a third prediction probability; and identifying the tomato diseases based on the third prediction probability to obtain the disease types.