CN112381787A

CN112381787A - Steel plate surface defect classification method based on transfer learning

Info

Publication number: CN112381787A
Application number: CN202011264401.XA
Authority: CN
Inventors: 郑宗华; 方鑫
Original assignee: Fuzhou University
Current assignee: Fuzhou University
Priority date: 2020-11-12
Filing date: 2020-11-12
Publication date: 2021-02-19

Abstract

The invention relates to a method for classifying steel plate surface defects based on transfer learning. Typical image samples of steel plate surface defects are first obtained from the NEU database and preprocessed with data augmentation; a MobileNet neural network model pre-trained on the ImageNet data set (14 million images) is then used to build a classification network on the NEU steel plate surface defect samples, thereby realizing transfer learning; finally, the results obtained by the classification model are evaluated. The disclosed method identifies steel plate surface defects with high accuracy and high classification speed, and addresses the problems of poor generalization, time-consuming processing and insufficient training samples in traditional classification. It can be effectively applied to the classification and detection of steel plate surface defects.

Description

Steel plate surface defect classification method based on transfer learning

Technical Field

The invention relates to the field of computer deep learning, in particular to a steel plate surface defect classification method based on transfer learning.

Background

Surface defect detection is an important research topic in the field of machine vision. In recent years, with the rise of deep learning models represented by convolutional neural networks (CNNs), many defect detection methods based on deep learning have been widely applied in industrial scenarios. According to the requirements, defect detection can be divided into three levels: defect classification, defect localization and defect segmentation. Defect classification gives the class of an image, defect localization gives the specific position of a defect, and defect segmentation segments the defect and gives information such as its length, area and position.

Deep-learning-based surface defect detection methods are divided, according to the data labels, into fully supervised models, unsupervised models and other methods (semi-supervised and weakly supervised models). Fully supervised models are divided into methods based on representation learning and methods based on metric learning, according to the input image mode and the loss function. Representation learning can be further subdivided into classification networks, detection networks and segmentation networks according to the network structure.

In classification networks, owing to the powerful feature extraction capability of CNNs, CNN-based classification networks have become the most common models for surface defect classification. Existing surface defect classification networks usually adopt structures such as AlexNet, VGG, GoogLeNet, ResNet, DenseNet, SENet, ShuffleNet and MobileNet.

In the actual production of hot-rolled strip steel, various defects such as rolled-in scale, scratches, pitted surface, inclusions, patches and cracks easily appear on the surface of the steel plate because of various physical and chemical factors and the complexity of the hot-rolling process, causing large economic and reputational losses to manufacturers. High surface temperature, strong radiation, and the influence of water, iron scale and uneven illumination are among the main difficulties in machine-vision defect detection for hot-rolled strip steel. In addition, the defects take many forms, with large intra-class differences and high inter-class similarity, which further increases the difficulty of developing surface defect recognition algorithms for hot-rolled strip steel. In current defect classification tasks for hot-rolled strip steel, classification accuracy and classification speed constrain each other: methods that meet the accuracy requirement have difficulty meeting the real-time requirement. Therefore, research on real-time and accurate classification of the surface defects of hot-rolled strip steel is very important for the production and quality control of strip steel.

Disclosure of Invention

In view of the above, the present invention provides a method for classifying steel plate surface defects based on transfer learning, so as to solve the problems of poor generalization performance, time-consuming process and insufficient model data samples in the conventional classification.

The invention is realized by adopting the following scheme: a steel plate surface defect classification method based on transfer learning comprises the following steps:

step S1: acquiring a typical image sample of the surface defect of the steel plate from the NEU surface defect database data set, and preprocessing the sample;

step S2: building a classification network on the NEU steel plate surface defect samples by using a MobileNet neural network model pre-trained on the ImageNet data set, so as to realize transfer learning;

step S3: evaluating the performance of the classification model built in step S2, calculating the AP value, and drawing a precision-recall curve, an ROC curve and a confusion matrix to visualize the model evaluation; and feeding a defect picture into the neural network model obtained by transfer learning in step S2 to obtain a classification result.

Further, the step S1 of preprocessing the sample specifically includes the following steps:

step SA: dividing the 1800 pictures in the data set into 6 classes by category, namely crazing, inclusion, patches, pitted surface, rolled-in scale and scratches, wherein each class comprises 300 grayscale images with a resolution of 200x200 and the images in the data set are in .bmp format; randomly taking 72% of each class of the defect detection data set as the training set, 18% as the validation set, and the remaining 10% as the test set;

step SB: during training, 16 pictures are randomly drawn from the training set each time and input into the neural network; data augmentation is applied to the pictures, and image generators are defined for the training set and the test set, with the following specific settings:

the horizontal and vertical projection transformation is set to 0.2, the zoom range is set to 0.2, and horizontal and vertical flipping are enabled, generating plausible random transformations of the images so that the MobileNet neural network model can recognize more varied features and the generalization ability of the model increases.

Further, the building of the classification network in step S2 specifically includes the following steps:

step Sa: model building: based on the MobileNet_1.0_224 pre-trained model, freezing all layers of the pre-trained model, so that the weights of the base model are not updated during training;

step Sb: in order to make the model suitable for classifying 6 defect types, the last fully connected layer, i.e. the output layer, is removed from the MobileNet_1.0_224 pre-trained model, a global average pooling layer (GlobalAveragePooling2D layer) is used, a dense layer with a ReLU activation function is added after the pooling layer, and its output is sent to a softmax layer with 6 output probabilities to judge which of the 6 defect classes the input belongs to; the calculated probability of the class the input belongs to is close to 1, and the probabilities of the other classes are close to 0; the calculation is as follows:

$$a_j^L = \frac{e^{z_j^L}}{\sum_{k} e^{z_k^L}}$$

in the formula, $z_j^L$ represents the input of the j-th neuron of the L-th layer and $a_j^L$ the corresponding output; each exponentiated value is divided by the sum over all output neurons, ensuring that the outputs sum to 1, and the output vector gives the probability of each class;

step Sc: using categorical cross-entropy as the loss function and a stochastic gradient descent (SGD) optimizer, and setting a learning-rate update rule;

the calculation formula is as follows:

$$\text{decayed\_lr} = \text{lr} \times \text{decay\_rate}^{\,\text{global\_step} / \text{decay\_steps}}$$

in the formula, decayed_lr is the learning rate after decay, lr is the initial learning rate, decay_rate is the decay coefficient, global_step is the current iteration step, and decay_steps is the number of steps over which the learning rate decays once;

step Sd: model training: training the model initially for 5 epochs;

step Se: model fine-tuning: during fine-tuning, freezing the first 2/3 of the layers of the trained model, and applying an early-stopping method in the fine-tuning process.

Further, the specific content of the step Se is as follows:

during fine-tuning, the first 2/3 of the layers of the trained model are frozen; because the model has 87 layers in total, fine-tuning starts from the 58th layer; an early-stopping method is applied during fine-tuning, that is, results are obtained on the validation set as the epochs increase, and training stops if the validation accuracy does not improve within three epochs; the parameters from the last iteration are used as the final parameters of the model.
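The freezing boundary described above can be checked with a short sketch. The function name and the use of 0-based layer indices are assumptions made for illustration; the source only states that 2/3 of 87 layers are frozen and that fine-tuning starts at layer 58.

```python
def split_frozen_trainable(total_layers, numerator=2, denominator=3):
    """Return (number of frozen layers, indices of trainable layers).

    Assumption: layers are indexed from 0, so freezing the first
    n_frozen layers leaves indices n_frozen..total_layers-1 trainable.
    """
    n_frozen = total_layers * numerator // denominator
    trainable = list(range(n_frozen, total_layers))
    return n_frozen, trainable


n_frozen, trainable = split_frozen_trainable(87)
print(n_frozen, trainable[0], len(trainable))  # 58 58 29
```

With 87 layers this leaves 29 layers whose weights are updated during fine-tuning.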

Further, the specific content of the classification performance evaluation in step S3 is:

in order to test the effectiveness on the test set of the network model finally obtained in step S2, the TP, FP and FN values are calculated from the predicted defect class output by the network model for each input image; a precision-recall curve is drawn using the precision P of the model for a given class, Precision = TP/(TP + FP), and the recall R, Recall = TP/(TP + FN), with the recall R on the abscissa and the precision P on the ordinate; the area of the shaded region under the precision-recall curve is calculated as the average precision (AP), and finally the AP values of all classes are averaged to obtain the mean average precision, mAP = 1; wherein TP represents true positives, the number of samples that are actually positive and predicted to be positive; FP represents false positives, the number that are actually negative but predicted to be positive; FN represents false negatives, the number that are actually positive but predicted to be negative; TN represents true negatives, the number that are actually negative and predicted to be negative.

Compared with the prior art, the invention has the following beneficial effects:

The method identifies steel plate surface defects with high accuracy, high classification speed, low time cost and good robustness, and addresses the problems of poor generalization, time-consuming processing and insufficient training samples in traditional classification. It can be effectively applied to the classification and detection of steel plate surface defects.

The invention has high accuracy: the mAP value is 1, and all defect samples in the test are classified correctly, exceeding the classification accuracy of other conventional neural networks.

The high classification speed and low time cost result from the use of a transferred MobileNet model, which is more efficient than other conventional convolutional neural networks: it has few convolutional layers and a small number of parameters, so a classification result is obtained more quickly. Owing to transfer learning, a very good classification result is obtained after the model has run for only 8 epochs in total, so the time cost is low.

The robustness of the invention lies in the fact that the transfer learning method can also be applied to data sets of other surface defects.

Drawings

FIG. 1 is a flow chart of a method according to an embodiment of the present invention.

Fig. 2 is a sample diagram of 6 typical defect images in the NEU database according to an embodiment of the invention.

Fig. 3 is a diagram illustrating a transformation of a picture during data enhancement processing according to an embodiment of the present invention.

FIG. 4 is a comparison of the conventional convolution and depthwise separable convolution processes according to an embodiment of the present invention.

FIG. 5 is an image of the accuracy of the model training and tuning process according to an embodiment of the present invention.

FIG. 6 is a graph of a model PR curve according to an embodiment of the invention.

FIG. 7 is a graph of the model AUC for an embodiment of the present invention.

FIG. 8 is a diagram of a model confusion matrix according to an embodiment of the invention.

FIG. 9 is a model normalized confusion matrix according to an embodiment of the present invention.

Detailed Description

The invention is further explained below with reference to the drawings and the embodiments.

It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.

It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise; it should further be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components and/or combinations thereof.

As shown in fig. 1, the present embodiment provides a method for classifying surface defects of a steel plate based on transfer learning, which includes the following steps:

step S2: building a classification network on the NEU steel plate surface defect samples by using a MobileNet neural network model pre-trained on the ImageNet data set (14 million images), so as to realize transfer learning;

In this embodiment, the step S1 of preprocessing the sample specifically includes the following steps:

step SA: in order to ensure the objectivity of the test result, the 1800 pictures in the data set are divided into 6 classes by category, namely crazing, inclusion, patches, pitted surface, rolled-in scale and scratches, wherein each class comprises 300 grayscale images with a resolution of 200x200 and the images in the data set are in .bmp format; 72% of each class of the defect detection data set is randomly taken as the training set, 18% as the validation set, and the remaining 10% as the test set;

In this embodiment, the building of the classification network in step S2 specifically includes the following steps:

step Sa: model building: based on the MobileNet_1.0_224 pre-trained model, freezing all layers of the pre-trained model, so that the weights of the base model are not updated during training;

step Sb: in order to make the model suitable for classifying 6 defect types, the last fully connected layer, i.e. the output layer, is removed from the MobileNet_1.0_224 pre-trained model, a global average pooling layer (GlobalAveragePooling2D) is used, a dense layer with a ReLU activation function is added after the pooling layer, and its output is sent to a softmax layer with 6 output probabilities to judge which of the 6 defect classes the input belongs to; the calculated probability of the class the input belongs to is close to 1, and the probabilities of the other classes are close to 0; the calculation is as follows:

$$a_j^L = \frac{e^{z_j^L}}{\sum_{k} e^{z_k^L}}$$

in the formula, $z_j^L$ represents the input of the j-th neuron of layer L (usually the last layer) and $a_j^L$ the corresponding output; each exponentiated value is divided by the sum over all output neurons, ensuring that the outputs sum to 1, and the output vector gives the probability of each class;

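the learning rate decay is calculated as follows:

$$\text{decayed\_lr} = \text{lr} \times \text{decay\_rate}^{\,\text{global\_step} / \text{decay\_steps}}$$

where decayed_lr is the learning rate after decay, lr is the initial learning rate, decay_rate is the decay coefficient, global_step is the current iteration step, and decay_steps is the number of steps over which the learning rate decays once;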

step Sd: model training: training the model initially for 5 epochs;

In this embodiment, model training means feeding the picture data into the neural network model for classification training for the set number of epochs. In brief, a fully connected layer is attached after the pre-trained model used for feature extraction, and the network then classifies with softmax; the classification accuracy of the network improves after several epochs of training.

Fine-tuning means adjusting the model. The model can be selectively fine-tuned on input-output pairs from the target steel data set to adapt it to the target task. The usual approach is to freeze part of the model parameters so that they remain unchanged during training and to update only the remaining parameters; here, the first 2/3 of the layers of the model are frozen.

In this embodiment, the specific content of step Se is:

In this embodiment, the specific content of the classification performance evaluation in step S3 is:

Here, the quantities are defined for a given class so that the P, R and AP of the model are obtained separately for crazing, inclusion, patches, pitted surface, rolled-in scale and scratches.

Preferably, in this embodiment, an image of a steel plate surface defect is fed into the trained neural network model, and the model outputs its prediction for the image: the model determines which of the six defects is present, and the output value is the defect class the model assigns. Thus, when unlabeled original pictures are input, the model gives its classification result, and the original pictures can be classified.

Preferably, this embodiment adopts transfer learning based on the MobileNet convolutional neural network: the small-scale image data are preprocessed, including data augmentation; transfer learning is then performed from the model pre-trained on the large-scale data set, and the densely connected classification layers are modified. On this basis, the parameters of the convolutional base are fine-tuned on the small-scale data set to obtain the recognition and classification result.

Specifically: the convolutional neural network is a classification network, and its output prediction is the classification result. For example, when a crazing picture is input, the model judges which of the six defects it shows, and the output value is the defect class the model assigns. Thus, when unlabeled original pictures are input, the model gives its classification result, and the original pictures can be classified.
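As a minimal sketch of this last step, the class with the highest softmax probability can be mapped back to a defect abbreviation. The label indices follow the sample-distribution table of this embodiment (Cr = 0, In = 1, PS = 2, Pa = 3, RS = 4, Sc = 5); the function name and the example probabilities are hypothetical.

```python
# Label indices as listed in the sample-distribution table of this embodiment.
LABELS = {0: "Cr", 1: "In", 2: "PS", 3: "Pa", 4: "RS", 5: "Sc"}


def predict_class(probabilities):
    """Return (defect abbreviation, probability) for the most likely class."""
    if len(probabilities) != len(LABELS):
        raise ValueError("expected one probability per defect class")
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return LABELS[best], probabilities[best]


# Hypothetical softmax output for one image.
print(predict_class([0.01, 0.02, 0.90, 0.03, 0.02, 0.02]))  # ('PS', 0.9)
```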

Preferably, the implementation process of this embodiment is as follows:

Step one: typical image samples of steel plate surface defects are acquired from the NEU surface defect database data set, and the samples are preprocessed.

The NEU surface defect database is a defect detection data set released by Song Kechen et al. of Northeastern University. The data set contains six types of surface defects of hot-rolled steel strip, namely Crazing (Cr), Inclusion (In), Patches (Pa), Pitted Surface (PS), Rolled-in Scale (RS) and Scratches (Sc). Each type has 300 grayscale images with a resolution of 200x200, for a total of 1800 samples, and the images are provided in .bmp format. Fig. 2 shows example images of the six typical types of steel strip surface defects. It can be clearly observed that defects within the same class differ greatly in appearance; for example, scratches may be vertical, oblique or horizontal. Furthermore, different classes of defects resemble one another, such as crazing (Cr), pitted surface (PS) and rolled-in scale (RS). In addition, under the influence of changes in illumination and the production environment, the gray values of images of the same defect class differ to some extent. All of this makes identification more difficult.

Step one comprises the following two sub-steps:

S1.1: In order to ensure the objectivity of the test result, the 1800 pictures in the data set are divided into 6 classes by category; 72% of each class is randomly taken as the training set, 18% as the validation set and the remaining 10% as the test set. The 180 test samples are used to check whether the algorithm model correctly classifies the six classes of defects in the test set. The distribution of defect samples in the NEU surface defect database data set is shown in Table 1:

Table 1

Sample type	Training samples	Validation samples	Test samples	Label	Total
Cr	216	54	30	0	300
In	216	54	30	1	300
PS	216	54	30	2	300
Pa	216	54	30	3	300
RS	216	54	30	4	300
Sc	216	54	30	5	300
Total	1296	324	180	--	1800
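The per-class 72% / 18% / 10% split can be reproduced with a short sketch. The file names and the fixed random seed are hypothetical; only the proportions and class abbreviations come from the description.

```python
import random


def split_class(samples, train_frac=0.72, val_frac=0.18, seed=0):
    """Shuffle one class and split it into train / validation / test lists."""
    items = list(samples)
    random.Random(seed).shuffle(items)  # seed is an assumption, for repeatability
    n_train = round(len(items) * train_frac)
    n_val = round(len(items) * val_frac)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


classes = ["Cr", "In", "PS", "Pa", "RS", "Sc"]
# Hypothetical file names: 300 images per class.
splits = {c: split_class([f"{c}_{i}.bmp" for i in range(1, 301)]) for c in classes}
totals = [sum(len(splits[c][k]) for c in classes) for k in range(3)]
print(totals)  # [1296, 324, 180]
```

Splitting each class separately keeps the classes balanced in all three subsets, which is what the table shows.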

S1.2: During training, small batches of data are randomly drawn as input and data augmentation is applied to the pictures. Image generators are defined for the training set and the test set: the horizontal and vertical projection transformation is set to 0.2, the zoom range is set to 0.2, and horizontal and vertical flipping are enabled. This generates plausible random transformations of the images, so that the model can recognize more varied features and its generalization ability increases. The data augmentation is illustrated in FIG. 3.
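A minimal sketch of the flipping part of this augmentation, written in plain Python on a nested-list image. The 50% flip probability and the function name are assumptions; the projection and zoom transformations are deliberately omitted, and this is not the image generator used in the embodiment.

```python
import random


def random_flip(image, rng):
    """Randomly mirror an image (list of pixel rows) horizontally and/or vertically.

    Assumption: each flip is applied independently with probability 0.5.
    The input image is left unchanged; a new image is returned.
    """
    out = [row[:] for row in image]
    if rng.random() < 0.5:
        out = [row[::-1] for row in out]  # horizontal flip: mirror each row
    if rng.random() < 0.5:
        out = out[::-1]                   # vertical flip: reverse the row order
    return out


rng = random.Random(0)
image = [[1, 2, 3], [4, 5, 6]]
augmented = random_flip(image, rng)
print(augmented)
```

Flipping changes only the arrangement of pixels, so the defect class of the image is preserved.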

Step two: a MobileNet neural network model pre-trained on the ImageNet data set (14 million images) is used to build a classification network on the NEU steel plate surface defect samples, thereby realizing transfer learning. The design idea, parameter settings and optimization of the algorithm model are explained below:

The design idea of the algorithm model adopted in this embodiment is to use a MobileNet model pre-trained on the ImageNet data set (14 million images) to implement transfer learning.

The step of transfer learning comprises model building, model training and model fine tuning.

S2.1 model construction

MobileNet is a deep learning network proposed by Google. It is efficient and lightweight, and maintains high accuracy in tasks such as image classification and image recognition. MobileNet optimizes the conventional standard convolution by decomposing it into two parts, a depthwise convolution and a pointwise convolution (together, a depthwise separable convolution), which greatly reduces the number of parameters to be learned; the sparser representation also removes much redundant information. On this basis, the model has two hyper-parameters, a width factor and a resolution factor, which control the size of the model and the resolution of the input image, respectively. The left side of FIG. 4 shows the conventional convolution used in neural networks: a 3 × 3 convolution followed by a BN (batch normalization) layer and a ReLU (rectified linear unit) activation function. The right side shows the depthwise separable convolution proposed by MobileNet: the 3 × 3 conventional convolution is replaced with a depthwise convolution and a 1 × 1 pointwise convolution, each followed by BN and ReLU as in the conventional convolution.
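The parameter saving of the decomposition can be illustrated by counting weights. This sketch ignores bias terms and batch-normalization parameters, and the channel counts in the example are hypothetical.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights of a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights of a depthwise separable convolution (bias ignored)."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3 x 3 kernel, 32 input channels, 64 output channels.
print(standard_conv_params(3, 32, 64), separable_conv_params(3, 32, 64))  # 18432 2336
```

For this example layer the separable form needs roughly one eighth of the weights of the standard convolution.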

The pre-trained model used as the basis is MobileNet_1.0_224, in order to make use of the knowledge the network initially learned from the ImageNet data set. All layers of the base model are first frozen, and the weights of the base model are not updated during training.

To adapt the model to the classification of 6 defect types, the last fully connected layer (the output layer) is removed from the original model and a global average pooling layer (GlobalAveragePooling2D) is used. Global average pooling corresponds to average pooling: average pooling selects, on the extracted feature map, a region corresponding to the size of the filter and averages the non-zero values in that region. The feature information obtained in this way is sensitive to background information. "Global" means that the mean is taken over the whole feature map, which preserves global information and reduces the number of model parameters.
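A minimal sketch of global average pooling on a small feature map stored as nested lists indexed [height][width][channel]. It assumes the usual definition, a plain mean over all spatial positions of each channel; the example values are hypothetical.

```python
def global_average_pooling(feature_map):
    """Reduce an H x W x C feature map to a list of C channel means."""
    height = len(feature_map)
    width = len(feature_map[0])
    channels = len(feature_map[0][0])
    sums = [0.0] * channels
    for row in feature_map:
        for pixel in row:
            for c, value in enumerate(pixel):
                sums[c] += value
    return [s / (height * width) for s in sums]


# Hypothetical 2 x 2 feature map with 2 channels.
fmap = [[[1, 10], [2, 20]],
        [[3, 30], [4, 40]]]
print(global_average_pooling(fmap))  # [2.5, 25.0]
```

The layer has no weights of its own: it only turns each channel into a single number, which is why the pooling layer contributes zero parameters.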

A dense layer with a ReLU activation function is added after the pooling layer to introduce non-linearity into the network, compensating for the limited expressive power of linear models. ReLU is simple to compute, which improves computational efficiency. Its formula is:

$$F(x) = \max(0, x)$$

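The output is then sent to a softmax layer with 6 output probabilities to determine which of the 6 defect classes the input belongs to. The calculated probability of the class the input belongs to is close to 1, and the probabilities of the other classes are close to 0. Softmax is mainly applied to multi-class classification and is calculated as follows:

$$a_j^L = \frac{e^{z_j^L}}{\sum_{k} e^{z_k^L}}$$

In the formula, $z_j^L$ represents the input of the j-th neuron of layer L (usually the last layer), $a_j^L$ is the corresponding output, and the sum runs over all output neurons, so that the outputs sum to 1.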
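A minimal sketch of the softmax calculation in plain Python. Subtracting the maximum input before exponentiating is a common numerical-stability measure and does not change the result; the example inputs are hypothetical.

```python
import math


def softmax(z):
    """Map raw layer inputs z to probabilities that sum to 1."""
    shift = max(z)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical inputs to the 6 output neurons.
probs = softmax([8.0, 1.0, 0.5, 0.2, 0.1, 0.0])
print([round(p, 4) for p in probs])
print(probs.index(max(probs)))  # 0
```

A large gap between one input and the others drives the corresponding probability towards 1 and the rest towards 0, as the description states.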

The built model has 3,756,742 parameters in total, of which 3,734,854 are trainable and 21,888 are non-trainable. Table 2 gives information on the changes to the MobileNet model framework.

Table 2

Network layer	Output dimension	Number of parameters
Global average pooling layer	1024	0
Dense layer with activation function	512	524800
Softmax classification layer	6	3078
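The parameter counts in the table can be checked by hand: a dense layer with n_in inputs and n_out outputs has n_in × n_out weights plus n_out biases. The helper name below is hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


hidden = dense_params(1024, 512)  # dense layer after global average pooling
output = dense_params(512, 6)     # softmax classification layer
print(hidden, output, hidden + output)  # 524800 3078 527878
print(3756742 - (hidden + output))      # 3228864 parameters outside the new head
```

The two added layers therefore account for 527,878 of the 3,756,742 parameters reported for the whole model.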

The batch size of the data fed to the network for training is set to 16. In addition, the label map of the model is retrieved from the generator, and a validation generator is built. Categorical cross-entropy is used as the loss function with a stochastic gradient descent (SGD) optimizer. An initial learning rate is set and then gradually decayed as the number of iterations increases, so that the neighborhood of the minimum of the loss function is reached quickly while the oscillation caused by a large learning rate is avoided. The decay of the learning rate is determined by the number of iterations and the decay coefficient. The calculation formula is as follows:

$$\text{decayed\_lr} = \text{lr} \times \text{decay\_rate}^{\,\text{global\_step} / \text{decay\_steps}}$$

where decayed_lr is the learning rate after decay, lr is the initial learning rate, decay_rate is the decay coefficient, global_step is the current iteration step, and decay_steps is the number of steps over which the learning rate decays once.
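The decay rule can be sketched as a plain function. This assumes the exponential form lr × decay_rate^(global_step / decay_steps) implied by the parameter names in the description; the numeric values in the example are hypothetical and not taken from the embodiment.

```python
def decayed_learning_rate(lr, decay_rate, global_step, decay_steps):
    """Exponentially decayed learning rate (assumed form, see lead-in)."""
    return lr * decay_rate ** (global_step / decay_steps)


# Hypothetical settings: start at 0.1 and halve every 10 steps.
for step in (0, 10, 20):
    print(step, decayed_learning_rate(0.1, 0.5, step, 10))
```

With these settings the rate equals the initial value at step 0 and is multiplied by decay_rate once every decay_steps steps.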

S2.2 Model training

During model training, the model is initially trained for 5 epochs, and the validation accuracy improves from 0.6389 to 0.9969.

S2.3 Model fine-tuning

The fine-tuning process of the model follows. A typical transfer learning process is as follows: the network is first trained on the new data set through transfer learning; after a certain number of epochs, training continues with fine-tuning, and the learning rate is reduced at the same time. The reason is that if fine-tuning were used from the beginning, the network would not yet be adapted to the new data, and the large gradients produced during parameter updates could corrupt the good parameters obtained in pre-training and degrade performance. Because the data set is small, the first 2/3 of the layers of the model are frozen during fine-tuning; since the model has 87 layers in total, fine-tuning starts from the 58th layer.

An early-stopping method is used during fine-tuning. Early stopping is a widely used method: training is stopped when the performance of the model on the validation set begins to decline, which avoids the overfitting caused by continued training. The error of the model on the validation set is calculated once per epoch; training stops when a stopping criterion is met (for example, the weight update falls below a certain threshold, the predicted error rate falls below a certain threshold, or a certain number of iterations is reached), and the parameters from the last iteration are used as the final parameters of the model. In this experiment, training stops early if the accuracy does not improve for 3 epochs. Table 3 shows the classification loss and accuracy during fine-tuning; the validation accuracy of the model reaches 1 in the 6th epoch, and the algorithm ends in the 9th epoch.

Table 3

Epoch	Training accuracy	Training loss	Validation accuracy	Validation loss
6	0.9961	0.0214	1.0000	0.0014
7	0.9915	0.0286	0.9907	0.0015
8	0.9954	0.0172	0.9938	0.0037
9	0.9923	0.0236	0.9938	2.3573e-05
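The early-stopping rule with a patience of 3 epochs can be sketched in plain Python and applied to the validation accuracies listed in the table. The function name is hypothetical, and treating the 6th epoch as the first monitored epoch is an assumption.

```python
def early_stopping_epoch(val_accuracies, patience=3, first_epoch=1):
    """Return (epoch at which training stops, epoch with the best accuracy)."""
    best = float("-inf")
    best_epoch = None
    waited = 0
    for offset, acc in enumerate(val_accuracies):
        epoch = first_epoch + offset
        if acc > best:
            best, best_epoch, waited = acc, epoch, 0
        else:
            waited += 1
            if waited >= patience:  # no improvement for `patience` epochs
                return epoch, best_epoch
    return first_epoch + len(val_accuracies) - 1, best_epoch


# Validation accuracies of epochs 6 to 9 from the table.
stop_epoch, best_epoch = early_stopping_epoch(
    [1.0000, 0.9907, 0.9938, 0.9938], patience=3, first_epoch=6)
print(stop_epoch, best_epoch)  # 9 6
```

Under these assumptions the rule stops at the 9th epoch, three epochs after the best validation accuracy was reached in the 6th.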

FIG. 5 shows the training accuracy (and loss) and the validation accuracy (and loss) during model training and fine-tuning. It can be seen that the algorithm achieves its best result in the 6th epoch, that is, after a total of 5 epochs of prior training.

Step three: and evaluating the performance of the classification model built based on the step two.

S3.1, an experiment platform:

The experimental environment comprises: CPU: i7-6700; computing platform: GPU NVIDIA 960M; environment manager: Jupyter Notebook under Anaconda 4.8.3; programming language: Python 3.7.3; deep learning framework: TensorFlow 1.14.0.

S3.2 Performance evaluation index

In order to test the effectiveness of the model on the test set, one sample is fed in at a time; the AP value is calculated from the predictions for each input test image, and a precision-recall curve, an ROC curve and a confusion matrix are drawn to visualize the model evaluation.

In object detection, the average precision AP is calculated from the precision P and the recall R of the model for a given class, and the mean average precision mAP is then used as the standard for evaluating the performance of the model.

Precision: Precision = TP/(TP + FP);

Recall: Recall = TP/(TP + FN);

Accuracy: Accuracy = (TP + TN)/(TP + FP + TN + FN);

F value (F1-score): the harmonic mean of Precision and Recall, assuming that both are equally important: F1-score = (2 × Recall × Precision)/(Recall + Precision).

Wherein TP: true positives, actually positive and predicted positive; FP: false positives, actually negative but predicted positive; FN: false negatives, actually positive but predicted negative; TN: true negatives, actually negative and predicted negative.
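The four indices can be computed directly from the counts. The function name and the zero-division convention (returning 0.0) are assumptions; the second example uses hypothetical counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (precision, recall, accuracy, f1) from the four counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, accuracy, f1


# One class of the 180-image test set classified perfectly:
# 30 positives found, 150 images of other classes correctly rejected.
print(classification_metrics(tp=30, fp=0, fn=0, tn=150))  # (1.0, 1.0, 1.0, 1.0)

# Hypothetical imperfect case.
print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```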

For ease of observation, the AP values and precision-recall curves of the 6 classes are plotted in one graph. As shown in FIG. 6, the average precision AP is the area of the shaded portion in the figure, where the abscissa is the recall R and the ordinate is the precision P. Because the model classifies well, the AP values of all 6 classes are 1, the mAP is 1, and the precision-recall (PR) curve is a straight line. This indicates that the model classifies all test samples correctly; for further verification, the ROC curve and the confusion matrix can be output to check whether this conclusion holds.
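One common way to compute the area under a precision-recall curve is to rank the samples by score and average the precision at each correctly retrieved positive. The sketch below uses this non-interpolated definition as an assumption; the description does not state which AP variant was used, and the scores are hypothetical.

```python
def average_precision(scores, is_positive):
    """Non-interpolated AP for one class from per-sample scores and labels."""
    ranked = sorted(zip(scores, is_positive), key=lambda p: p[0], reverse=True)
    n_pos = sum(1 for _, pos in ranked if pos)
    if n_pos == 0:
        return 0.0
    hits = 0
    total = 0.0
    for rank, (_, pos) in enumerate(ranked, start=1):
        if pos:
            hits += 1
            total += hits / rank  # precision at this recall level
    return total / n_pos


def mean_average_precision(per_class_ap):
    """mAP is the mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap)


# Hypothetical scores: every positive is ranked above every negative.
print(average_precision([0.9, 0.8, 0.3, 0.1], [True, True, False, False]))  # 1.0
print(mean_average_precision([1.0] * 6))  # 1.0
```

When every sample of a class is ranked above all others, the precision stays at 1 for every recall level, which corresponds to the straight PR line described above.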

The ROC curve is shown in FIG. 7

The confusion matrix is used to calculate the classification accuracy and helps visualize the predicted labels of the test images. It can be seen that the model makes correct predictions for all test images. FIG. 8 shows the confusion matrix. The test images comprise six defect classes with thirty pictures each, and the true results for the 6 defects are Cr: 30, In: 30, Pa: 30, PS: 30, RS: 30 and Sc: 30. FIG. 9 shows the normalized confusion matrix; the accuracies for the 6 steel defects are Cr: 1, In: 1, Pa: 1, PS: 1, RS: 1 and Sc: 1.
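A minimal sketch of how a confusion matrix and its row-normalized form can be built from true and predicted labels. The perfect predictions below mirror the reported result (30 test images per class, all classified correctly); the function names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def normalize_rows(matrix):
    """Divide each row by its sum, giving per-class rates."""
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized


# 6 classes with 30 test images each, all predicted correctly.
y_true = [c for c in range(6) for _ in range(30)]
y_pred = list(y_true)
cm = confusion_matrix(y_true, y_pred, 6)
print([cm[i][i] for i in range(6)])                  # [30, 30, 30, 30, 30, 30]
print([normalize_rows(cm)[i][i] for i in range(6)])  # [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
```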

The above description is only a preferred embodiment of the present invention, and all equivalent changes and modifications made in accordance with the claims of the present invention should be covered by the present invention.

Claims

1. A steel plate surface defect classification method based on transfer learning is characterized in that: the method comprises the following steps:

2. The method for classifying surface defects of steel plates based on transfer learning according to claim 1, wherein the method comprises the following steps: the step S1 of preprocessing the sample specifically includes the following steps:

3. The method for classifying surface defects of steel plates based on transfer learning according to claim 1, wherein the method comprises the following steps: the building of the classification network in the step S2 specifically includes the following steps:

step Sb: in order to make the model suitable for classifying 6 defect types, the last fully connected layer, i.e. the output layer, is removed from the MobileNet_1.0_224 pre-trained model, a global average pooling layer is used, a dense layer with a ReLU activation function is added after the pooling layer, and its output is sent to a softmax layer with 6 output probabilities to judge which of the 6 defect classes the input belongs to; the calculated probability of the class the input belongs to is close to 1, and the probabilities of the other classes are close to 0; the calculation is as follows:

$$a_j^L = \frac{e^{z_j^L}}{\sum_{k} e^{z_k^L}}$$

in the formula, $z_j^L$ represents the input of the j-th neuron of the L-th layer and $a_j^L$ the corresponding output;

step Sc: using categorical cross-entropy as the loss function and a stochastic gradient descent optimizer, and setting a learning-rate update rule;

the calculation formula is as follows:

$$\text{decayed\_lr} = \text{lr} \times \text{decay\_rate}^{\,\text{global\_step} / \text{decay\_steps}}$$

step Sd: model training: training the model initially for 5 epochs;

4. The method for classifying the surface defects of the steel plate based on the transfer learning according to claim 3, wherein the method comprises the following steps: the specific content of the step Se is as follows:

5. The method for classifying surface defects of steel plates based on transfer learning according to claim 1, wherein the method comprises the following steps: the specific content of the classification performance evaluation in step S3 is: