CN114913179A - Apple skin defect detection system based on transfer learning

Info

Publication number: CN114913179A
Application number: CN202210844392.4A
Authority: CN
Inventors: 张素红; 赖欣欣
Original assignee: Nantong Haiyang Food Co ltd
Current assignee: Nantong Haiyang Food Co ltd
Priority date: 2022-07-19
Filing date: 2022-07-19
Publication date: 2022-08-16
Anticipated expiration: 2042-07-19
Also published as: CN114913179B

Abstract

The invention relates to the technical field of artificial intelligence, and in particular to an apple skin defect detection system based on transfer learning. The system comprises: a detection network construction module, used for obtaining a depth map, an edge density map and a brown enhancement map of an apple and constructing a first detection network; a transfer learning module, used for selecting, in each channel and according to a first difference, a second difference and a third difference respectively, the convolutional layers other than the preset convolutional layers whose parameters are to be initialized, thereby obtaining a second detection network; and a detection module, used for obtaining the detection networks corresponding to the various apple varieties, inputting the depth maps, edge density maps and brown enhancement maps of those varieties, and outputting the defect types of the apple skins. In the embodiment, the skin features of the apples are extracted separately by multiple channels, so that the detection result is more accurate; meanwhile, transfer learning speeds up the convergence of the detection network corresponding to each apple variety, so that the skin defects of each variety can be detected accurately.

Description

Apple skin defect detection system based on transfer learning

Technical Field

The invention relates to the technical field of artificial intelligence, in particular to an apple skin defect detection system based on transfer learning.

Background

In recent years, as living standards have risen, the demand for fruit has grown, and with continuous improvement in production and planting technology the yield of apples has kept increasing. The quality of apples therefore needs to be evaluated. Evaluating quality from skin defects is a common and important approach, because the smoothness, color and integrity of the skin are what consumers perceive most directly, and rot or lesions inside the fruit are reflected to some extent by defects of the skin.

At present, apple skin defects are mostly detected manually, which is inefficient, highly subjective, and affected by many uncontrollable factors. With the development of artificial intelligence, intelligent detection of apple skin defects using artificial intelligence and image processing has become a trend; for example, skin defects are detected based on improved particle swarm optimization, and fruit rot is identified using hyperspectral fluorescence imaging. However, there are many apple varieties. When skin defects are detected with an image-based neural network, one trained network can identify only one variety. Training a network on large image sets of many varieties takes a long time and costs too much, and for varieties with only a small image set the network trains poorly and the detection effect is poor.

Disclosure of Invention

In order to solve the above technical problems, an object of the present invention is to provide an apple skin defect detection system based on transfer learning, which adopts the following technical scheme:

An embodiment of the invention provides an apple skin defect detection system based on transfer learning. The system comprises: a detection network construction module, used for processing the surface image of apples to obtain a depth map, an edge density map and a brown enhancement map each containing only one apple, and for constructing a first detection network corresponding to a first apple variety, the network comprising three channels, namely a depth channel, an edge channel and a color channel; the three channels are used for extracting the features of the depth map, the edge density map and the brown enhancement map of the apple respectively; the extracted features are processed, and a detection result is output through a Softmax classifier;

a transfer learning module, used for analyzing the depth maps, edge density maps and brown enhancement maps of a preset number of apples of the first variety with a principal component analysis algorithm to obtain a depth average map, a texture average map and a color average map; migrating the depth channel, the edge channel and the color channel of the first detection network to a detection network corresponding to a new apple variety, and selecting preset convolutional layers in the three channels for parameter sharing; obtaining a first difference, a second difference and a third difference from the depth, texture and color average maps of the new apple variety and the first apple variety respectively; and selecting, in each channel and according to the corresponding first, second or third difference, the convolutional layers other than the preset convolutional layers whose parameters are to be initialized, to obtain a second detection network;

and a detection module, used for obtaining the detection networks corresponding to the various apple varieties and detecting the defect types of the skins of those varieties through the corresponding detection networks.

Preferably, obtaining the depth map, edge density map and brown enhancement map each containing only one apple comprises: segmenting the surface image of the apples to obtain an image containing only one apple, and obtaining the depth map of that image; converting the image containing only the apple to grayscale and obtaining the edges of the apple; computing the edge density of each edge pixel using a window of preset size, and replacing the gray value of each edge pixel with its edge density to obtain the edge density map, where the edge density is the ratio of the number of edge pixels in the window to the total number of pixels in the window; and converting the image containing only the apple into HSV space to obtain an HSV image, labeling the near-brown pixels in the HSV image with a first preset value and the pixels of other colors with a second preset value, and obtaining the brown enhancement map from the H, S and V values of the pixels together with the labeled first and second preset values.

Preferably, the network structure of each of the depth channel, the edge channel and the color channel is a pre-trained VGG16 network architecture: that of the depth channel is pre-trained on the depth images of the KITTI data set; that of the edge channel is pre-trained on the Brodatz texture image library; and that of the color channel is pre-trained on the Fruits-360 fruit image data set.

Preferably, obtaining the depth average map, the texture average map and the color average map comprises: obtaining the average appearance maps of the depth maps, the edge density maps and the brown enhancement maps using a principal component analysis algorithm, these average appearance maps being the depth average map, the texture average map and the color average map respectively.

Preferably, obtaining the first difference comprises: locating the center point of the depth average map of the first apple variety, and computing the average of the second-order gradients of the pixels on the horizontal line through the center point and the average of the second-order gradients of the pixels on the vertical line through the center point, recorded as the horizontal second-order gradient value and the vertical second-order gradient value respectively; obtaining the horizontal and vertical second-order gradient values of the depth average map of the new apple variety in the same way; and obtaining the first difference from the ratio of the horizontal second-order gradient values and the ratio of the vertical second-order gradient values of the depth average maps of the first variety and the new variety.

Preferably, obtaining the second difference comprises: acquiring the gray-level co-occurrence matrices of the texture average maps of the first apple variety and the new apple variety, and obtaining from them the energy, contrast and entropy of each texture average map; and weighting and summing, together with the first preset value, the ratios of the energy, contrast and entropy between the texture average maps of the first variety and the new variety, to obtain the second difference.

Preferably, the third difference is given by a formula (not reproduced in this text) in which: c denotes the c-th color; N denotes the number of color types in the color average map; and the two remaining quantities denote the number of occurrences of pixels of the c-th color in the color average map of the first apple variety and in the color average map of the new apple variety, respectively.

Preferably, the selecting, in each channel and according to the obtained first, second and third differences, of the convolutional layers other than the preset convolutional layers for parameter initialization comprises: sharing the parameters of the preset convolutional layers in the VGG16 network architectures of the depth, edge and color channels of the first detection network with the convolutional layers at the same positions in the second detection network; enlarging the first, second and third differences and rounding them down, obtaining the ratio of each result to a third preset value, and taking that ratio, rounded down, as the number of convolutional layers whose parameters are to be initialized; initializing parameters layer by layer downward from the highest convolutional layer until the number of initialized convolutional layers equals the rounded ratio; if the enlarged and rounded-down difference is even, then after this layer-by-layer initialization the remaining convolutional layers other than the preset convolutional layers share the parameters of the corresponding convolutional layers of the corresponding channel in the first detection network; if the enlarged and rounded-down difference is odd, then after this layer-by-layer initialization half of the neurons in the convolutional layer immediately below the lowest initialized layer are also initialized, and the remaining convolutional layers other than the preset convolutional layers share the parameters of the corresponding convolutional layers of the corresponding channel in the first detection network.
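The layer selection rule in the preceding paragraph can be illustrated with a short sketch. It assumes 13 convolutional layers per VGG16 channel, preset layers 1 to 3, and a third preset value of 2; the value 2 and the function name `layer_plan` are assumptions made for illustration and are not stated in this passage.

```python
def layer_plan(degree, total_conv_layers=13, preset_layers=(1, 2, 3), third_preset=2):
    """Return {layer index: action} for one channel of the second detection network.

    degree       -- enlarged, rounded-down difference for this channel
    third_preset -- divisor turning the degree into a layer count (assumed to be 2)
    """
    # Non-preset layers, ordered from the highest convolutional layer downward.
    adjustable = [l for l in range(total_conv_layers, 0, -1) if l not in preset_layers]
    n_full = min(degree // third_preset, len(adjustable))

    # By default every layer shares the parameters of the first detection network.
    plan = {l: "shared" for l in range(1, total_conv_layers + 1)}
    for l in adjustable[:n_full]:
        plan[l] = "reinitialized"
    # Odd degree: half of the neurons in the next lower layer are also initialized.
    if degree % 2 == 1 and n_full < len(adjustable):
        plan[adjustable[n_full]] = "half reinitialized"
    return plan


if __name__ == "__main__":
    for layer, action in sorted(layer_plan(5).items(), reverse=True):
        print(layer, action)
```

With a degree of 5, for example, this sketch reinitializes layers 13 and 12, reinitializes half of layer 11, and leaves the rest shared.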

Preferably, the detection module is further used for: training the detection networks whose convolutional layer parameters have been initialized, using the depth maps, edge density maps and brown enhancement maps of the various apple varieties, to obtain the detection network corresponding to each variety.

The embodiment of the invention has at least the following beneficial effects. A detection network is constructed in which convolutional neural networks in three channels separately extract the features of the depth map, the edge density map and the brown enhancement map of the apple skin, and the defect type is identified from the fused features, which improves detection accuracy. Using transfer learning, the pre-trained VGG16 network architectures are migrated into the first detection network, and the VGG16 network architectures of the first detection network are migrated into the detection networks of other apple varieties. This accelerates the convergence of those networks, allows the skin defects of various apple varieties to be detected, and makes each detection network better suited to its own variety, further improving accuracy. Detecting apples with defective skins also safeguards apple quality, prevents bad apples from spreading mould to the rest of a batch, and gives growers guidance for cultivation, for example in preventing plant diseases and insect pests.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions and advantages of the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.

Fig. 1 is a block diagram of an apple skin defect detection system based on transfer learning.

Fig. 2 is a diagram of a detection network structure.

Detailed Description

To further illustrate the technical means and effects of the present invention for achieving the predetermined objects, the following detailed description of the embodiments, structures, features and effects of the present invention will be made with reference to the accompanying drawings and preferred embodiments. In the following description, different "one embodiment" or "another embodiment" refers to not necessarily the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

The following describes a specific scheme of the apple skin defect detection system based on transfer learning in detail with reference to the accompanying drawings.

The main application scenarios of the invention are as follows: the skin defects of the apples are detected through the image of the apple skins in a corresponding apple processing factory or in the apple planting industry, the types of the defects are determined at the same time, and then the quality of the apples is evaluated.

Examples

The present embodiment provides a system embodiment. Referring to fig. 1, a block diagram of an apple skin defect detection system based on transfer learning according to an embodiment of the present invention is shown, and the system includes the following modules:

the detection network construction module is used for processing the surface image of the apple to obtain a depth map, an edge density map and a brown enhancement map which only contain one apple; constructing a first detection network corresponding to a first kind of apples, wherein the network comprises three channels, namely a depth channel, an edge channel and a color channel; the network structure of the three channels is a pre-trained VGG16 network architecture and is used for extracting the features of a depth map, an edge density map and a brown enhancement map of an apple; and processing the extracted features, and outputting a detection result through a Softmax classifier.

First, mechanical devices such as a stopper are used to lay the apples on their sides on a transport track in a suitable posture, and an RGBD camera placed directly above the track photographs the side of the apples below, so that each image contains only a small part of the fruit stem and calyx; images with the stem at the top and the calyx at the bottom are convenient for subsequent image processing. A D65 standard light source is used for illumination, ensuring that there is no strong reflection on the apple surface; if the entire skin of the apple is to be inspected, the apple is rotated or rolled in a suitable manner. The software environment is the Python-based OpenCV computer vision library and the TensorFlow deep learning framework. The apples referred to in this embodiment are common apples with a smooth, intact and ripe surface, such as Red Fuji apples, Red Delicious apples ("snake fruit"), green apples and white apples. These varieties do not differ greatly in size or in biological composition, and the features used below are chosen with these shared properties in mind.

The RGB image obtained by the camera is the apple surface image. It contains several apples, and the image of each apple is segmented out. Because the apples lie above the transport track, depth maps each containing only one apple are easily obtained by threshold segmentation of the depth map. The segmented image containing only one apple is converted to grayscale, and a Sobel operator is applied to the region of interest (ROI) of the apple in the grayscale image to obtain the edges in the ROI. A square sliding window with a preset size of 1 mm x 1 mm is centered on each edge pixel, and the edge density of that pixel is computed as

$$\text{edge density} = \frac{n}{N}$$

where n is the number of edge pixels in the window and N is the total number of pixels in the window. The gray value of each edge pixel is then replaced by its edge density to obtain the edge density map. The edge density map reflects texture information better, because defective regions contain more texture edge points and more abrupt color changes and therefore differ greatly from normal skin texture.
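The edge density computation can be sketched as follows. This is a minimal illustration in NumPy that assumes the edges are already available as a binary mask (for example from thresholding a Sobel response) and that the 1 mm x 1 mm window has been converted to an odd number of pixels; the function name `edge_density_map` and the default window size are hypothetical.

```python
import numpy as np


def edge_density_map(edge_mask, window=5):
    """Replace each edge pixel by n / N, the share of edge pixels in its window.

    edge_mask -- 2-D array, non-zero where a pixel lies on an edge
    window    -- odd window side length in pixels (assumed pixel size of the
                 1 mm x 1 mm window; depends on the camera set-up)
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a centre pixel")
    mask = (np.asarray(edge_mask) > 0).astype(np.float64)
    half = window // 2
    padded = np.pad(mask, half, mode="constant")  # zeros outside the image
    # Integral image with a leading zero row and column for fast window sums.
    integral = np.pad(padded, ((1, 0), (1, 0)), mode="constant").cumsum(0).cumsum(1)
    h, w = mask.shape
    # n: number of edge pixels in the window centred on every pixel.
    n = (integral[window:window + h, window:window + w]
         - integral[:h, window:window + w]
         - integral[window:window + h, :w]
         + integral[:h, :w])
    density = n / float(window * window)  # N = window * window
    # Only edge pixels carry a density; all other pixels stay at 0.
    return np.where(mask > 0, density, 0.0)
```

Pixels outside the image are treated as non-edge pixels here, which lowers the density near the image border; other border policies are equally plausible.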

The image containing a single apple is converted into HSV space to obtain an HSV image. Each pixel of the HSV image has three values (H, S and V); these are extended to four values by adding a label value, because rotten and damaged parts of an apple oxidize and turn partly brown. Near-brown pixels are those whose hue value lies within a given range and whose lightness lies within a given range (the numeric ranges are not reproduced in this text). Near-brown pixels are labeled with a first preset value and pixels of other colors with a second preset value; preferably, in this embodiment the first preset value is 1 and the second preset value is 0. This yields the brown enhancement map of each apple, in which every pixel in the ROI of the apple has the value (H, S, V, label). Because rotten or damaged parts of the apple skin are close to brown, labeling near-brown pixels assists the extraction of defect features.
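A minimal sketch of building the four-value brown enhancement map is given below. The hue and lightness thresholds are hypothetical placeholders, since the numeric ranges are not available in this text; they assume an 8-bit OpenCV-style HSV image with H in [0, 179] and V in [0, 255].

```python
import numpy as np

# Hypothetical near-brown thresholds; NOT the values used in the embodiment.
BROWN_HUE_RANGE = (5, 25)
BROWN_VALUE_RANGE = (40, 200)


def brown_enhancement_map(hsv, roi_mask,
                          hue_range=BROWN_HUE_RANGE,
                          value_range=BROWN_VALUE_RANGE):
    """Return an (H, W, 4) array holding (H, S, V, label) for each ROI pixel.

    hsv      -- (H, W, 3) HSV image
    roi_mask -- (H, W) array, non-zero inside the apple region
    """
    hsv = np.asarray(hsv)
    hue, val = hsv[..., 0], hsv[..., 2]
    near_brown = ((hue >= hue_range[0]) & (hue <= hue_range[1]) &
                  (val >= value_range[0]) & (val <= value_range[1]))
    # First preset value 1 for near-brown pixels, second preset value 0 otherwise.
    label = np.where(near_brown, 1, 0).astype(hsv.dtype)
    out = np.concatenate([hsv, label[..., None]], axis=-1)
    out[~np.asarray(roi_mask).astype(bool)] = 0  # clear pixels outside the apple
    return out
```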

Further, a detection network for apple skin defects is constructed. The network has three channels, each with a VGG16 network architecture: a depth channel, an edge channel and a color channel, used for extracting the features of the depth map, the edge density map and the brown enhancement map respectively. The extracted features are fused by a Concatenate layer, the fused features are fed into a basic fully connected layer for processing, and finally a Softmax classifier computes the probability of each defect type and the defect type of the apple skin is output. The network structure is shown in Fig. 2. The three channels are in fact three VGG16 networks, all of which are pre-trained and then migrated, whereas the parameters of the basic fully connected layer in the first detection network must be learned by training.
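The fusion stage described above (concatenate, fully connected layer, Softmax) can be sketched in NumPy. The three feature vectors stand in for the outputs of the VGG16 channels, which are not modeled here; the single dense layer, its shapes and the function names are assumptions for illustration only.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D vector."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


def fuse_and_classify(depth_feat, edge_feat, color_feat, weights, bias):
    """Concatenate the three channel features and classify them.

    weights -- (num_classes, total_feature_length) matrix of the dense layer
    bias    -- (num_classes,) vector
    Returns the probability of each defect type.
    """
    fused = np.concatenate([depth_feat, edge_feat, color_feat])  # Concatenate layer
    logits = weights @ fused + bias                              # fully connected layer
    return softmax(logits)                                       # Softmax classifier


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = [rng.normal(size=4) for _ in range(3)]
    probs = fuse_and_classify(*feats, weights=rng.normal(size=(8, 12)), bias=np.zeros(8))
    print(probs.round(3), probs.sum())
```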

When apple skin defects are identified, the three convolutional neural networks are integrated, and the final result is determined by combining their outputs, giving better performance than a single convolutional neural network. The generalization error of the multi-channel convolutional neural network is smaller than the average generalization error of the individual single-channel networks, so the multi-channel network outperforms a single channel. The better each channel learns on its own and the greater the difference between channels, the smaller the error after fusion and the better the recognition effect. Most importantly, dividing the network into multiple channels allows the transfer learning parameters of each channel to be chosen according to the differences between the fruits being identified.

The pre-trained VGG16 network architecture of the depth channel is a VGG16 network architecture pre-trained by using a depth image in a KITTI data set; the pre-trained VGG16 network architecture of the edge channel is a VGG16 network architecture pre-trained by utilizing a Brodatz texture image library; the pre-trained VGG16 network architecture for color channels is a VGG16 network architecture pre-trained with the Fruits-360 fruit image dataset.

Finally, because the parameters of the basic fully connected layer and other such structures are initialized (in this embodiment, parameters mainly refers to the weights of the neurons), a large amount of apple image data is needed to train the detection network. An apple variety with abundant image data is therefore selected and recorded as the first apple variety. Preferably, in this embodiment Red Fuji apples are taken as the first variety, because they are numerous and enough image data can be obtained. A large number of depth maps, edge density maps and brown enhancement maps of the first variety are acquired and used to train the detection network with a cross-entropy loss function; after training, the detection network corresponding to the first variety is recorded as the first detection network. During training on the Red Fuji image data, every layer of the first detection network is fine-tuned with a small learning rate, which narrows the search space, prevents oscillation near the best fit and prevents convergence to a wrong point. The defect classification results output by the network are: fruit stem (1), calyx (2), mechanical scar (3), disease or insect damage (4), wrinkle (5), rot (6), cracked fruit (7) and other (8).
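As a small illustration of the training objective and the output classes, the sketch below computes the cross-entropy loss for one sample and decodes the predicted defect type. The class numbering follows the list above; the function names and the clipping constant are assumptions.

```python
import math

# Class numbers as listed in the embodiment ("calyx" is the fruit navel).
DEFECT_CLASSES = {
    1: "fruit stem",
    2: "calyx",
    3: "mechanical scar",
    4: "disease or insect damage",
    5: "wrinkle",
    6: "rot",
    7: "cracked fruit",
    8: "other",
}


def cross_entropy(probs, true_class):
    """Cross-entropy loss for one sample.

    probs      -- list of 8 Softmax probabilities in class-number order
    true_class -- ground-truth class number in 1..8
    """
    p = probs[true_class - 1]
    return -math.log(max(p, 1e-12))  # clip to avoid log(0)


def predicted_defect(probs):
    """Return the name of the most probable defect type."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return DEFECT_CLASSES[best + 1]
```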

The transfer learning module is used for analyzing the depth maps, the edge density maps and the brown enhancement maps of a preset number of the first kinds of apples by utilizing a principal component analysis algorithm to obtain a depth average map, a texture average map and a color average map; migrating the depth channel, the edge channel and the color channel corresponding to the first detection network to a detection network corresponding to a new apple variety, and selecting a preset convolution layer in the three channels for parameter sharing; respectively obtaining a first difference, a second difference and a third difference by utilizing the depth, texture and color average images of the new apple and the first apple; and respectively selecting parameters of the convolutional layers except the preset convolutional layer in the corresponding channel according to the obtained first, second and third differences to initialize, so as to obtain a second detection network.

First, the first detection network, trained on image data of the first apple variety (Red Fuji), serves as the transfer learning source for the detection networks of other apple varieties, which are called new apple varieties. The detection network for a new variety has the same structure as the first detection network. The main difference is that the convolutional layers of its three VGG16 network architectures do not simply share the parameters of the corresponding convolutional layers of the first detection network; in addition, the parameters of its basic fully connected layer must be initialized. The degree of transfer learning is determined by the image differences between the new variety and the first variety. Here the first variety is the source domain and the new variety is the target domain; the tasks of the two domains are the same, while the domains themselves differ somewhat.

Further, depth maps, edge density maps and brown enhancement maps of a preset number of apples of the first variety are obtained; preferably, the preset number is 100 in this embodiment. The 100 depth maps, edge density maps and brown enhancement maps are each analyzed with a principal component analysis (PCA) algorithm to obtain an average appearance map for each of the three kinds of feature map, namely the depth average map, the texture average map and the color average map. The procedure is the same as the known method for constructing an average human face. The average appearance maps of the first variety represent its image data and prepare for measuring the feature differences between a new variety and the first variety.
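One way to read the PCA step, by analogy with average-face construction, is sketched below: the images are flattened, their mean image is taken as the average map, and the principal components describe how individual apples deviate from it. Treating the mean image as the average map is an assumption of this sketch, as is the requirement that all images are already aligned and equally sized.

```python
import numpy as np


def average_appearance_map(images, n_components=5):
    """Return (mean image, principal components, singular values).

    images -- sequence of equally sized, pre-aligned arrays (e.g. 100 depth maps)
    """
    stack = np.stack([np.asarray(im, dtype=np.float64) for im in images])
    flat = stack.reshape(len(images), -1)     # one row per image
    mean = flat.mean(axis=0)                  # the average map, flattened
    centred = flat - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, s, vt = np.linalg.svd(centred, full_matrices=False)
    k = min(n_components, vt.shape[0])
    return mean.reshape(stack.shape[1:]), vt[:k], s[:k]
```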

Similarly, the three average appearance maps corresponding to the depth maps, edge density maps and brown enhancement maps of the new apple variety are obtained in the same way as for the first variety. The differences between these maps and the three average appearance maps of the first variety are then used as a measure of the difference between the source domain and the target domain.

Further, the first difference is obtained from the difference between the gradients of the depth average maps of the first apple variety and the new apple variety (the formula is not reproduced in this text).

The horizontal and vertical second-order gradient values of the depth average map of the first variety are obtained as follows: the center point of the depth average map is located, and the average of the second-order gradients of the pixels on the horizontal line through the center point and the average of the second-order gradients of the pixels on the vertical line through the center point are computed and recorded as the horizontal second-order gradient value and the vertical second-order gradient value respectively. The horizontal and vertical second-order gradient values of the depth average map of the new variety are obtained in the same way. The ratio of the horizontal second-order gradient values of the two maps, for example, reflects how similar the skin curvature of the two depth average maps is: the more similar they are, the closer the ratio is to 1. The vertical bars in the formula denote the absolute value, which avoids negative values.

The first-order gradient reflects whether the apple surface is flat; in practice the surface of a common apple is curved. The second-order gradient reflects whether the curvature of the surface is smooth. For example, the side of a Fuji apple is approximately round, so its second-order gradient values vary within a small range around 0, whereas the side of a Red Delicious apple ("snake fruit") has larger bumps and hollows, so its second-order gradient values vary over a larger range.
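The horizontal and vertical second-order gradient values, and the two ratios that feed the first difference, can be sketched as below. Second-order gradients are approximated by second finite differences, and the ratio is taken as first variety over new variety; both choices are assumptions. The final combination of the two ratios into the first difference is left out because its formula is not available here.

```python
import numpy as np


def second_order_gradient_values(depth_avg):
    """Mean second difference along the centre row and the centre column."""
    d = np.asarray(depth_avg, dtype=np.float64)
    cy, cx = d.shape[0] // 2, d.shape[1] // 2
    horizontal = np.diff(d[cy, :], n=2).mean()  # along the horizontal centre line
    vertical = np.diff(d[:, cx], n=2).mean()    # along the vertical centre line
    return horizontal, vertical


def gradient_ratios(first_avg, new_avg):
    """Absolute ratios of the horizontal and vertical values (first / new).

    The caller must ensure the values of the new variety are non-zero.
    """
    h1, v1 = second_order_gradient_values(first_avg)
    h2, v2 = second_order_gradient_values(new_avg)
    return abs(h1 / h2), abs(v1 / v2)
```

Ratios close to 1 indicate similar surface curvature in the two depth average maps.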

The second difference is calculated from the texture average map of the first apple variety and the texture average map of the new apple variety (the formula is not reproduced in this text).

First, the texture average maps of the first variety and the new variety are converted into grayscale images, the gray-level co-occurrence matrix of each grayscale image is obtained, and the difference between the two matrices is used to express the texture difference.

From each gray-level co-occurrence matrix, the energy, contrast and entropy are computed. The energy reflects the coarseness of the apple skin texture; the contrast reflects the depth of the texture grooves; and the entropy reflects the non-uniformity of the texture. The second difference is then obtained by weighting and summing terms built from the ratios of energy, contrast and entropy between the first variety and the new variety. The three weights are preset; in this embodiment they are chosen on the basis that the coarseness of the skin texture of different apple varieties strongly affects the identification of edge points. The energy ratio, for example, reflects how similar the coarseness of the two textures is: the more similar they are, the closer the ratio is to 1. The vertical bars in the formula denote the absolute value, which avoids negative values.
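The three texture descriptors can be computed from a gray-level co-occurrence matrix as in the following NumPy sketch. The number of gray levels, the one-pixel horizontal offset, the symmetric normalization, and the use of the angular second moment as the energy are all assumptions; the embodiment does not state these settings in this passage.

```python
import numpy as np


def glcm_features(gray, levels=8, offset=(0, 1)):
    """Return (energy, contrast, entropy) of a symmetric, normalized GLCM.

    gray   -- 2-D array of gray values in 0..255
    levels -- number of quantized gray levels
    offset -- (row, column) displacement between paired pixels
    """
    g = np.asarray(gray)
    q = (g.astype(np.int64) * levels) // 256  # quantize to 0..levels-1
    dr, dc = offset
    rows, cols = q.shape
    # a holds the reference pixels, b the neighbours displaced by the offset.
    a = q[max(0, -dr):rows - max(0, dr), max(0, -dc):cols - max(0, dc)]
    b = q[max(0, dr):rows - max(0, -dr), max(0, dc):cols - max(0, -dc)]
    glcm = np.zeros((levels, levels), dtype=np.float64)
    np.add.at(glcm, (a.ravel(), b.ravel()), 1.0)  # count co-occurrences
    glcm = glcm + glcm.T                          # make the matrix symmetric
    p = glcm / glcm.sum()
    i, j = np.indices(p.shape)
    energy = float((p ** 2).sum())                # angular second moment
    contrast = float((((i - j) ** 2) * p).sum())
    nz = p[p > 0]
    entropy = float(-(nz * np.log2(nz)).sum())
    return energy, contrast, entropy
```

Calling the function on the grayscale texture average maps of both varieties gives the six quantities whose ratios enter the second difference.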

For the color average maps of the first variety and the new variety, the difference is expressed through the difference between their color histograms. The skin colors of different apple varieties may differ; when the skin is defective the defective part is near brown, but how conspicuous that near-brown is may differ from variety to variety. The two color histograms are processed by first reducing the bit depth: in HSV color space, according to empirical values, the hue H is divided into 8 intervals, the saturation S into 4 intervals and the lightness V into 4 intervals, giving a sequence of n = 8 x 4 x 4 = 128 color value combinations. The number of occurrences H of the pixels of each color is then counted.
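The bit-depth reduction to 128 colors can be sketched as follows. The source divides the intervals according to empirical values that are not given, so this sketch assumes equal-width intervals and an 8-bit OpenCV-style HSV image (H below 180, S and V below 256); the function name is hypothetical.

```python
import numpy as np


def hsv_histogram_128(hsv, h_max=180, s_max=256, v_max=256):
    """Count pixels in each of the 8 x 4 x 4 = 128 quantized HSV colors.

    hsv -- array of shape (..., 3) with integer H, S, V values
    """
    px = np.asarray(hsv).reshape(-1, 3).astype(np.int64)
    h_bin = np.minimum(px[:, 0] * 8 // h_max, 7)  # 8 hue intervals
    s_bin = np.minimum(px[:, 1] * 4 // s_max, 3)  # 4 saturation intervals
    v_bin = np.minimum(px[:, 2] * 4 // v_max, 3)  # 4 lightness intervals
    index = (h_bin * 4 + s_bin) * 4 + v_bin       # color index c in 0..127
    return np.bincount(index, minlength=128)      # occurrences of each color
```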

The discrete color distribution function of the pixels of each color in the color average map of the first kind of apple is obtained, where c denotes the c-th color and the function is built from the number of occurrences of pixels of the c-th color. The discrete color distribution function of the pixels of each color in the color average map of the new kind of apple is obtained in the same way, from the number of occurrences of pixels of the c-th color in that map. The third difference is then obtained from these two distribution functions. The second term in the formula represents the degree of similarity of the two color average maps: the more similar the maps, the closer this term is to 1; otherwise it approaches 0.
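One way to realize a similarity term with this behavior is sketched below. The formula of the embodiment is not reproduced in this text, so the sketch substitutes histogram intersection of the normalized color counts as the similarity measure and takes the third difference as 1 minus that similarity. Both choices are assumptions, and the function name is hypothetical.

```python
import numpy as np

def third_difference(counts_first, counts_new):
    """ASSUMED form: 1 - histogram intersection of the two normalized
    color count sequences (one entry per quantized color)."""
    p = np.asarray(counts_first, dtype=np.float64)
    q = np.asarray(counts_new, dtype=np.float64)
    p = p / p.sum()   # discrete color distribution, first kind of apple
    q = q / q.sum()   # discrete color distribution, new kind of apple
    similarity = float(np.minimum(p, q).sum())  # 1 if identical, 0 if disjoint
    return 1.0 - similarity
```

Under this assumption, two color average maps with the same color proportions give a difference of 0, and maps with no colors in common give a difference of 1.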

The first, second and third differences are then scaled up. Preferably, in this embodiment, each difference is multiplied by 20 and rounded down, and any result greater than 20 is set to 20. This gives the reconstruction degree of the convolutional layers of the VGG16 network architecture in each of the three channels of the second detection network, which corresponds to the new kind of apple and is migrated from the first detection network.
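The scaling step can be written as a one-line function. The sketch below follows the multiply-by-20, round-down and cap-at-20 rule stated above; the function and parameter names are hypothetical.

```python
import math

def reconstruction_degree(difference, scale=20, cap=20):
    """Scale a difference, round it down, and limit the result to `cap`."""
    return min(math.floor(difference * scale), cap)
```

For example, a difference of 0.53 gives a reconstruction degree of 10, and any difference of 1 or more gives the maximum degree of 20.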

Finally, the convolutional layer parameters of the VGG16 network architectures migrated to the second detection network are fine-tuned according to the reconstruction degrees of the depth channel, the edge channel and the color channel, respectively. The specific operations are as follows:

A preset convolutional layer is set. Preferably, the preset convolutional layers in this embodiment are the first to third convolutional layers, and these three layers directly share the parameters of the corresponding convolutional layers in the first detection network.

For the convolutional layers of the VGG16 network architecture in the second detection network other than the preset convolutional layers, fine-tuning of the parameters is determined by the corresponding reconstruction degree. The ratio of the reconstruction degree to a third preset value is obtained, where the third preset value is 2. The ratio is rounded down; that is, the first, second or third difference, after scaling and rounding down, is divided by the third preset value and the quotient is rounded down. The index i takes the values 1, 2 and 3: the result for i = 1 is the number of convolutional layers of the depth channel's VGG16 network architecture in the second detection network whose parameters are initialized, counting downward from the highest convolutional layer; the result for i = 2 is the corresponding number for the edge channel; and the result for i = 3 is the corresponding number for the color channel. For example, when the number for the depth channel is 5, the parameters of the neurons of the 13th, 12th, 11th, 10th and 9th convolutional layers of the depth channel's VGG16 network architecture in the second detection network are initialized.

If the first, second or third difference, after scaling and rounding down, is even (that is, the reconstruction degree is even), then once the convolutional layers have been initialized from the top down according to the number obtained above, the remaining convolutional layers other than the preset convolutional layers share the parameters of the corresponding convolutional layers of the corresponding channel in the first detection network. For example, when five layers are to be initialized, after the parameters of the neurons of the 13th, 12th, 11th, 10th and 9th convolutional layers of the depth channel's VGG16 network architecture in the second detection network have been initialized, the 8th, 7th, 6th, 5th and 4th convolutional layers directly share the parameters of the 8th, 7th, 6th, 5th and 4th convolutional layers of the depth channel in the first detection network.

If the first, second or third difference, after scaling and rounding down, is odd (that is, the reconstruction degree is odd), then once the convolutional layers have been initialized from the top down, half of the neurons in the convolutional layer immediately below the lowest initialized layer are also initialized, and the remaining convolutional layers other than the preset convolutional layers share the parameters of the corresponding convolutional layers of the corresponding channel in the first detection network. For example, when the reconstruction degree of the depth channel is 11, the number of layers to initialize is 5: after the parameters of the neurons of the 13th, 12th, 11th, 10th and 9th convolutional layers of the depth channel's VGG16 network architecture in the second detection network have been initialized, the parameters of half of the neurons in the 8th convolutional layer are initialized, and the 7th, 6th, 5th and 4th convolutional layers directly share the parameters of the 7th, 6th, 5th and 4th convolutional layers of the depth channel in the first detection network.
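The even and odd cases above can be summarized as a per-layer plan for one channel. The sketch below is illustrative only: it assumes the 13 convolutional layers of VGG16 numbered 1 to 13 from bottom to top, layers 1 to 3 as the preset layers and a third preset value of 2, and the function name and action labels are hypothetical.

```python
def layer_plan(degree, total_layers=13, preset_layers=3, third_preset=2):
    """Map one channel's reconstruction degree to an action per conv layer.

    Actions: "share" (copy parameters from the first detection network),
    "reinitialize" (initialize all neurons of the layer), and
    "reinitialize half" (initialize half of the layer's neurons).
    """
    n_init = degree // third_preset      # layers initialized from the top down
    plan = {}
    for layer in range(1, total_layers + 1):
        if layer <= preset_layers:
            plan[layer] = "share"        # preset layers are always shared
        elif layer > total_layers - n_init:
            plan[layer] = "reinitialize"
        else:
            plan[layer] = "share"
    # Odd degree: also initialize half of the next layer down.
    half_layer = total_layers - n_init
    if degree % 2 == 1 and half_layer > preset_layers:
        plan[half_layer] = "reinitialize half"
    return plan
```

Calling `layer_plan(11)` marks layers 13 to 9 for initialization, layer 8 for half initialization and layers 7 to 1 as shared, which matches the example for a reconstruction degree of 11.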

When transfer learning is performed on the VGG16 network architectures of the depth, edge and color channels of the second detection network, the parameters of the high-level convolutional layers of the corresponding channels of the first detection network should not be shared if the skin of the new kind of apple differs too much from that of the first kind. Low-level convolutional layers capture broad, general-purpose features, whereas high-level convolutional layers capture fine, less general features. Therefore, when the skin difference is large, initialization should start from the highest convolutional layers so that the extracted features are more accurate.

The detection module is used for obtaining the detection networks corresponding to the various kinds of apples, inputting the depth map, edge density map and brown enhancement map corresponding to each kind of apple, and outputting the defect type of the apple skin.

First, after the second detection network has undergone transfer learning from the first detection network, that is, after the convolutional layer parameters of its three channels and the parameters of the basic fully connected layer have been initialized in the transfer learning module, it must be trained with image data of the new kind of apple, namely the depth maps, edge density maps and brown enhancement maps of that kind. In this way the initialized convolutional layers and the basic fully connected layer learn new knowledge, so that the defect types of the skins of the various kinds of apples can be detected accurately.

Thus, transfer learning is performed from the first detection network to obtain the detection networks corresponding to the various kinds of apples, and each of these networks is then trained with image data of its kind of apple; training ends when the cross-entropy loss function converges. The depth map, edge density map and brown enhancement map of each kind of apple are input into the corresponding detection network, which outputs the defect type of the apple skin.

It should be noted that the order of the above embodiments of the present invention is for description only and does not represent the relative merits of the embodiments. Specific embodiments of this specification have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. An apple skin defect detection system based on transfer learning is characterized by comprising: the detection network construction module is used for processing the surface image of the apple to obtain a depth map, an edge density map and a brown enhancement map which only contain one apple; constructing a first detection network corresponding to a first kind of apples, wherein the network comprises three channels, namely a depth channel, an edge channel and a color channel; the three channels are used for extracting the features of a depth map, an edge density map and a brown enhancement map of the apple; processing the extracted features, and outputting a detection result through a Softmax classifier;

2. The system of claim 1, wherein the obtaining of the depth map, the edge density map and the brown enhancement map including only one apple comprises: segmenting the surface image of the apple to obtain an image only containing the apple, and obtaining a depth map of the image; graying an image only containing the apple and obtaining the edge of the apple; calculating the edge density of edge pixels by using a window with a preset size, converting the gray value of the edge pixels into the edge density of the edge pixels to obtain an edge density graph, wherein the edge density is the ratio of the number of the edge pixels in the window to the total number of the pixels in the window; and converting the image only containing the apple into an HSV space to obtain an HSV image, labeling the pixel points which are near brown in the HSV image with a first preset value, labeling the pixel points of other colors with a second preset value, and obtaining a brown enhanced image according to H, S, V values of the pixel points and the labeled first and second preset values.

3. The apple skin defect detection system based on transfer learning of claim 1, wherein the depth channel, edge channel and color channel comprise: the network structure of each of the three channels is a pre-trained VGG16 network architecture; the pre-trained VGG16 network architecture of the depth channel is pre-trained using the depth images in the KITTI data set; the pre-trained VGG16 network architecture of the edge channel is pre-trained using the Brodatz texture image library; and the pre-trained VGG16 network architecture of the color channel is pre-trained using the Fruits-360 fruit image dataset.

4. The system of claim 1, wherein the obtaining of the depth average map, the texture average map, and the color average map comprises: obtaining average appearance maps of the depth map, the edge density map and the brown enhancement map by using a principal component analysis algorithm, the average appearance maps being, respectively, the depth average map, the texture average map and the color average map.

5. The apple skin defect detection system based on transfer learning of claim 1, wherein the first difference comprises: obtaining a central point of a depth average map corresponding to the first kind of apples, and simultaneously obtaining an average value of second-order gradients of pixel points on a horizontal straight line passing through the central point and an average value of second-order gradients of pixel points on a vertical straight line passing through the central point, and respectively recording the average values as a horizontal second-order gradient value and a vertical second-order gradient value; obtaining a horizontal second-order gradient value and a vertical second-order gradient value corresponding to the depth average map of the new apple variety; and obtaining a first difference according to the ratio of the horizontal second-order gradient values corresponding to the depth average maps of the first kind of apples and the new kind of apples and the ratio of the vertical second-order gradient values.

6. The apple skin defect detection system based on transfer learning of claim 1, wherein the second difference comprises: acquiring gray level co-occurrence matrixes of texture average images of the first kind of apples and the new kind of apples, and respectively acquiring energy, contrast and entropy corresponding to the texture average images of the first kind of apples and the new kind of apples according to the gray level co-occurrence matrixes; and carrying out weighted summation on the first preset value and the ratio of energy, contrast and entropy corresponding to the texture average map of the first kind of apples and the new kind of apples to obtain a second difference.

7. The system for detecting apple peel defects based on transfer learning of claim 1, wherein the third difference is:

wherein

represents a third difference; c represents the c-th color, and n represents the number of types of colors in the color average map;

the times of the appearance of the pixel points of the c color in the color average graph of the first apple type are represented,

8. The system of claim 1, wherein the selecting parameters of the convolutional layers in the corresponding channels for initialization according to the obtained first, second and third differences respectively, except for the preset convolutional layers, comprises: sharing the parameters of the preset convolutional layer in the VGG16 network architecture of the depth, edge and color channels in the first detection network with the convolutional layer of the same layer in the second detection network; obtaining the ratio of the first difference, the second difference and the third difference after expansion and rounding-down to a third preset value, and taking the result of rounding-down of the ratio as the number of convolution layers for parameter initialization; sequentially carrying out parameter initialization from the convolution layer at the highest layer according to the rounding result until the number of the initialized convolution layers is the rounding result; if the first, second and third differences are expanded and are even after being rounded downwards, after the initialization of the parameters from the highest convolutional layer to the lower layers is completed, the parameters of the corresponding convolutional layers of the corresponding channels in the first detection network are shared by the remaining convolutional layers except the preset convolutional layer; if the first, second and third differences are expanded and are odd numbers after rounding down, after the initialization of the parameters is completed from the highest convolutional layer to the next convolutional layer in sequence, initializing half of the neurons in the next convolutional layer of the lowest convolutional layer after the initialization of the parameters, and sharing the parameters of the corresponding convolutional layer of the corresponding channel in the first detection network with the remaining convolutional layers except the preset convolutional layer.

9. The system of claim 1, wherein the detection module further comprises: training the detection networks whose convolutional layer parameters have been initialized, using the depth maps, edge density maps and brown enhancement maps corresponding to the various kinds of apples, to obtain the detection networks corresponding to the various kinds of apples.