CN117671354A - Industrial product surface defect detection method based on multi-scale woven fusion network - Google Patents


Info

Publication number
CN117671354A
Authority
CN
China
Prior art keywords
fusion
module
scale
feature
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311645352.8A
Other languages
Chinese (zh)
Inventor
李俊峰
刘双宁
刘存淩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Sci Tech University ZSTU
Original Assignee
Zhejiang Sci Tech University ZSTU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Sci Tech University ZSTU filed Critical Zhejiang Sci Tech University ZSTU
Priority to CN202311645352.8A priority Critical patent/CN117671354A/en
Publication of CN117671354A publication Critical patent/CN117671354A/en
Pending legal-status Critical Current

Classifications

    • G06V 10/764: Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
    • G06N 3/0455: Auto-encoder networks; encoder-decoder networks
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06V 10/774: Generating sets of training patterns; bootstrap methods, e.g. bagging or boosting
    • G06V 10/806: Fusion of extracted features at the sensor, preprocessing, feature extraction or classification level
    • G06V 10/82: Image or video recognition or understanding using neural networks
    • Y02P 90/30: Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Investigating Materials By The Use Of Optical Means Adapted For Particular Applications (AREA)

Abstract

The invention provides an industrial product surface defect detection method based on a multi-scale woven fusion network, which comprises the following steps: classifying the surface defects of the industrial product; constructing a backbone network unit of the multi-scale woven fusion network model; arranging a multi-branch dilated convolution module behind the backbone network unit; constructing a neck network unit of the multi-scale woven fusion network model with a context parallel fusion woven network, which has three parallel branches, each containing a feature fusion module that fuses feature maps extracted from different feature layers and carrying different scale and semantic information; training the multi-scale woven fusion network model; testing the trained model; and deploying the model to detect surface defects of industrial products. The invention improves the performance of the surface defect detection model in the defect detection task.

Description

Industrial product surface defect detection method based on multi-scale woven fusion network
Technical Field
The invention relates to an industrial product surface defect detection method, in particular to one based on a multi-scale woven fusion network, and belongs to the technical field of surface defect detection.
Background
In modern industrial production, product defect detection is a crucial link: it ensures product quality, improves production efficiency and reduces cost. Defect detection for products such as light guide plates, bearings and magnetic shoes typically faces several challenges, including small target size, low contrast, large intra-class variation and small inter-class variation.
Many defects in industrial production are tiny and weak, so the detection network must be highly perceptive and robust enough to cope with low contrast, small defects and other complex situations. Conventional defect detection methods built on generic target detection models often fail to meet these requirements, because such models typically assume higher contrast and larger target sizes and neglect sensitivity to fine features. Furthermore, because the morphology and characteristics of defects differ across types of industrial products, simply swapping in a new dataset may not cover all cases and lacks pertinence.
In summary, in order to obtain a better defect detection effect in the field of industrial product surface defect detection, an efficient and accurate industrial product surface defect detection method is needed to improve the performance and effect of defect detection.
Disclosure of Invention
Against this background, the invention aims to provide an industrial product surface defect detection method based on a multi-scale woven fusion network, so as to overcome the limitations of prior-art surface defect detection and improve the performance of the detection model in the defect detection task.
In order to achieve the above object, the present invention provides the following technical solutions:
a method for detecting surface defects of industrial products based on a multi-scale woven fusion network comprises the following steps:
classifying the surface defects of the industrial product, collecting a data set according to the defect classification, and dividing the data set into a training set, a verification set and a test set;
constructing a backbone network unit of the multi-scale woven fusion network model based on the backbone of YOLOv7;
arranging a multi-branch dilated convolution module behind the backbone network unit and connecting it with the neck network unit;
constructing a neck network unit of the multi-scale woven fusion network model with a context parallel fusion woven network, which has three parallel branches extracted from different feature layers and containing feature information of different scales, each parallel branch comprising a feature fusion module that fuses feature maps extracted from different feature layers and carrying different scale and semantic information;
training the multi-scale woven fusion network model on the training set and validation set, wherein the loss function comprises a classification loss, a localization loss and a confidence loss, and a loss ranking module introduced into the confidence loss ranks and mines loss values to filter out low-confidence prediction boxes;
testing the trained multi-scale woven fusion network model through a test set;
and deploying a multi-scale weaving fusion network model, and detecting surface defects of industrial products.
Preferably, the backbone network unit includes CBS modules, ELAN modules, MP modules and an SPPCSPC module: three CBS modules are connected in sequence to the first ELAN module and the first MP module, then to the second ELAN and MP modules, then to the third ELAN and MP modules, and finally to the fourth ELAN module and the SPPCSPC module.
Preferably, the multi-branch dilated convolution module is connected with the SPPCSPC module; it forms a multi-branch structure with convolution kernels of different sizes and introduces dilated (atrous) convolution layers, whose parameters are the different dilation rates.
Preferably, the output features of the multi-branch dilated convolution module are obtained as follows:
the input features pass through three dilated convolutions with different dilation rates (1, 2 and 3, all with 3×3 kernels), producing feature maps with different receptive fields;
the feature maps with different receptive fields are concatenated and fused;
a 1×1 convolution restores the number of channels;
the output features are obtained through a ReLU activation function.
Preferably, the context parallel fusion woven network has a look-ahead mechanism that fuses the current-layer output features of a branch with the future-layer output features of adjacent branches, thereby capturing useful feature corrections that may appear at future layers and enhancing the feature expression of the current layer.
Preferably, the output features of the feature fusion module are obtained as follows:
a plurality of input features from different layers are received and combined into a comprehensive feature through scale alignment and concatenation;
the comprehensive feature is processed by two parallel branches whose outputs are concatenated: the first branch applies convolution, batch normalization and an activation function, and the second applies convolution, batch normalization, an activation function and residual processing;
the concatenated features pass through convolution, batch normalization and an activation function to obtain the output features.
Preferably, the confidence loss value is obtained by the confidence loss function as follows:
during the training of each mini-batch, a balanced Focal Loss is applied to each cell to obtain the loss value of its detection result;
the three-dimensional cell structure is flattened, and the loss values of each image sample are collected into a separate loss vector;
the loss values of each image are sorted;
the top-B cells of each image sample are selected from the sorted loss vector, where B is determined by the LRM_ignore parameter of the loss ranking module;
the mean of the selected losses is computed for each image;
and the means are added together to obtain the confidence loss value.
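As a rough illustration, the ranking-and-selection step above might look like the following sketch. The function name, the descending sort, and the assumption that B equals a (1 − LRM_ignore) fraction of the cells are ours, not from the patent:

```python
def lrm_confidence_loss(per_image_losses, lrm_ignore=0.65):
    """Hypothetical sketch of the Loss Ranking Module: for each image,
    sort the per-cell confidence losses, keep only the top-B cells
    (assumed here to be a (1 - lrm_ignore) fraction, so low-confidence
    predictions are ignored), average them, and sum over the batch."""
    total = 0.0
    for losses in per_image_losses:            # one flattened vector per image
        ranked = sorted(losses, reverse=True)  # largest losses first
        b = max(1, round((1.0 - lrm_ignore) * len(ranked)))
        total += sum(ranked[:b]) / b           # mean of the selected cells
    return total
```

With lrm_ignore=0.65 as in the training parameters below, roughly a third of the cells with the largest losses would contribute to the confidence loss.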
Preferably, the industrial product is a light guide plate, and classifying the surface defects of the industrial product includes:
the surface defects of the light guide plate are classified into bright line defects, dark line defects, surface defects and white spot defects.
Preferably, the training parameters adopted in model training are: batch size: 16; momentum: 0.937; initial learning rate: 0.01; final learning rate: 0.001; LRM_ignore: 0.65; input image size: 448 x 448; epochs: 300.
Compared with the prior art, the invention has the following advantages:
according to the industrial product surface defect detection method based on the multi-scale woven fusion network, the EC-PFN multi-scale woven fusion network model with strong context sensing capability and parallel feature fusion characteristics is adopted, so that the feature fusion performance and effect are remarkably improved, more accurate defect detection is realized, the target loss in a target detection task is optimized, the influence of a negative sample is reduced by neglecting a prediction frame with low confidence, the performance of the model is improved, and the target detection precision and speed are improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of an industrial product surface defect detection method based on a multi-scale woven fusion network;
FIG. 2 is a schematic structural diagram of a multi-scale woven fusion network model EC-PFN of the present invention;
FIG. 3 is a detailed block diagram of a portion of the modules of the EC-PFN of the present invention;
FIG. 4 is a schematic view showing the effect of the RFB module of the present invention;
FIG. 5 is a schematic view of the structure of the RFB module of the present invention;
fig. 6 is a schematic diagram of a structure of FWN in the present invention;
FIG. 7 is a schematic diagram of a Unifusion layer of the feature fusion module of the present invention;
FIG. 8 is a schematic diagram of a fusion module according to the present invention;
FIG. 9 is a diagram showing a light guide plate image capturing device according to an embodiment of the present invention;
FIG. 10 is a training loss function diagram of an ablation experiment in accordance with an embodiment of the present invention;
FIG. 11 is a training mAP diagram of an ablation experiment in accordance with an embodiment of the invention;
FIG. 12 is a graph showing test results of a model according to an embodiment of the present invention;
FIG. 13 is a graph showing test results of the second embodiment of the present invention.
Detailed Description
The technical scheme of the invention is further specifically described below through specific embodiments and with reference to the accompanying drawings. It should be understood that the practice of the invention is not limited to the following examples, but is intended to be within the scope of the invention in any form and/or modification thereof.
In the present invention, unless otherwise specified, all parts and percentages are by weight, and the equipment, materials, etc. used are commercially available or are conventional in the art. The methods in the following examples are conventional in the art unless otherwise specified. The components and devices in the following examples are, unless otherwise indicated, all those components and devices known to those skilled in the art, and their structures and principles are known to those skilled in the art from technical manuals or by routine experimentation.
In the following detailed description of embodiments of the invention, reference is made to the accompanying drawings, in which, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the invention. However, one or more embodiments may be practiced by one of ordinary skill in the art without these specific details.
As shown in fig. 1, an embodiment of the invention discloses a method for detecting surface defects of an industrial product based on a multi-scale woven fusion network, which comprises the following steps:
classifying the surface defects of the industrial product, collecting a data set according to the defect classification, and dividing the data set into a training set, a verification set and a test set;
constructing a backbone network unit of the multi-scale woven fusion network model based on the backbone of YOLOv7;
arranging a multi-branch dilated convolution module behind the backbone network unit and connecting it with the neck network unit;
constructing a neck network unit of the multi-scale woven fusion network model with a context parallel fusion woven network, which has three parallel branches extracted from different feature layers and containing feature information of different scales, each parallel branch comprising a feature fusion module that fuses feature maps extracted from different feature layers and carrying different scale and semantic information;
training the multi-scale woven fusion network model on the training set and validation set, wherein the loss function comprises a classification loss, a localization loss and a confidence loss, and a loss ranking module introduced into the confidence loss ranks and mines loss values to filter out low-confidence prediction boxes;
testing the trained multi-scale woven fusion network model through a test set;
and deploying a multi-scale weaving fusion network model, and detecting surface defects of industrial products.
In order to address small target sizes, low contrast, large intra-class variation and small inter-class variation in industrial surface defect detection, overcome the limitations of the prior art, and improve the performance of the detection model, the invention discloses a novel multi-scale woven fusion network model (EC-PFN). The EC-PFN adopts a new context parallel fusion woven network architecture (Feature Weave Network, hereinafter FWN) to enhance the model's context awareness and parallel fusion capability, and a feature fusion module (Unifusion layer) to promote effective learning of multi-scale and multi-semantic information. A multi-branch dilated convolution module (RFB) is also introduced: it enlarges the receptive field, strengthens feature extraction, and improves performance on low-contrast and barely perceptible tiny defects. In addition, a Loss Ranking Module with a parameter LRM_ignore is added to the loss function of the EC-PFN to optimize the objectness loss in the target detection task; by ignoring low-confidence prediction boxes, the influence of negative samples is reduced and model performance is improved.
A large number of experiments were carried out on a light guide plate defect dataset acquired on an industrial site. The results show that the detection accuracy (mAP) of the EC-PFN is 98.9%, the detection speed reaches 92 FPS, and the computational cost (GFLOPs) is only 14.5, outperforming current mainstream surface defect detection models. The construction of the EC-PFN is described in detail below.
The EC-PFN is an enhanced, context-aware parallel feature fusion network aimed at improving the performance and effect of feature fusion; its network frame and some of its modules are shown in fig. 2 and fig. 3, respectively. To construct the EC-PFN, the invention takes the classical Backbone of the YOLOv7 target detection architecture as its basis: this Backbone provides strong feature extraction capability, allowing the EC-PFN to better cope with the challenges of industrial defect detection. In building the Backbone, 3×3 convolution kernels with stride 2 are used, and the image is downsampled in combination with repeated pooling operations; this reduces feature dimensionality while retaining positional information, cuts the parameter count, and thus simplifies the computational complexity of the network. The neck network adopts the new feature fusion architecture FWN, which introduces three parallel branches for feature fusion and effectively extracts and fuses target features.
The backbone network unit comprises CBS modules, ELAN modules, MP modules and an SPPCSPC module: three CBS modules are connected in sequence to the first ELAN module and the first MP module, then to the second ELAN and MP modules, then to the third ELAN and MP modules, and finally to the fourth ELAN module and the SPPCSPC module. Specifically, the detailed structure of the backbone network unit is shown in table 1. The CBS convolution module consists of a Conv layer, a BN layer and the SiLU activation function, where the Conv layer uses a 3×3 kernel with stride 2. The CBS and MP modules work together to downsample the image and generate six feature maps of different sizes, with resolutions of 224×224, 112×112, 56×56, 28×28, 14×14 and 7×7. The ELAN module is an efficient network architecture that, by controlling the shortest and longest gradient paths, lets the network learn more features and be more robust. It has two branches: the first performs a channel transformation through a 1×1 convolution; the second performs a channel transformation through a 1×1 convolution module, then extracts features through two 3×3 convolution modules; finally the four features are stacked and a further channel transformation produces the final feature extraction result. The first branch of the SPPCSPC module includes four max-pooling layers (MaxPool), which give it the ability to handle objects of different sizes and to better distinguish small from large objects.
The second branch performs an ordinary convolution; finally the two parts are merged, which reduces computation while achieving higher speed and accuracy.
TABLE 1 detailed structure of backbone network element
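The halving of resolution across the six stages can be checked with a few lines. This is a sketch under the assumption, implied by the 448×448 input size in the training parameters and the 224→…→7 sequence above, that each downsampling stage exactly halves the spatial size:

```python
def downsample_resolutions(input_size=448, num_levels=6):
    """Spatial sizes of the six feature maps produced by the stride-2
    CBS convolutions and the MP pooling stages, each halving the input."""
    sizes, s = [], input_size
    for _ in range(num_levels):
        s //= 2          # one stride-2 downsampling stage
        sizes.append(s)
    return sizes
```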
The multi-branch dilated convolution module RFB is connected with the SPPCSPC module. It forms a multi-branch structure with convolution kernels of different sizes and introduces dilated convolution layers, whose parameters are the different dilation rates. The effect of the RFB is shown schematically in fig. 4, and its structure in fig. 5. As can be seen, the RFB has two characteristics:
First, a multi-branch structure built from convolution kernels of different sizes, since kernels of different sizes behave differently. Small kernels are typically used to capture details and local features; with them, the RFB module can better capture fine-grained properties of the target such as texture and edges. Large kernels acquire wider context information; with them, the RFB module can better understand the global context around the target, improving detection accuracy.
Second, dilated convolution layers, also called atrous convolution layers, are introduced. Dilated convolution enlarges the receptive field by introducing a dilation rate, and can do so without increasing the number of convolution parameters, thereby extracting wider context information.
Since industrial defect datasets contain both large-area defects such as dirt and breakage and small target defects only 1 pixel wide, the output features of the multi-branch dilated convolution module RFB are obtained as follows in order to capture defects at different scales:
the input features pass through three dilated convolutions with dilation rates 1, 2 and 3, all with 3×3 kernels, producing feature maps with different receptive fields; for small target defects, dilated convolutions with different rates help enlarge the receptive field and reduce the information loss that small defects suffer during downsampling, while features from different receptive fields improve multi-scale feature expression, enrich context information and strengthen the features;
the feature maps with different receptive fields are concatenated and fused;
a 1×1 convolution restores the number of channels;
the output features are obtained through a ReLU activation function.
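The receptive-field gain from the three branches can be illustrated numerically. The formula below is the standard effective-kernel-size relation for dilated convolution, not taken from the patent:

```python
def effective_kernel(k, rate):
    """Effective extent of a dilated convolution kernel: a k x k kernel
    with dilation rate r covers k + (k - 1) * (r - 1) pixels per axis."""
    return k + (k - 1) * (rate - 1)

# The three parallel RFB branches: 3x3 kernels with dilation rates 1, 2, 3
branch_extents = [effective_kernel(3, r) for r in (1, 2, 3)]
```

So the three branches see 3-, 5- and 7-pixel-wide neighbourhoods while each uses only nine weights.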
In addition, a direct (skip) connection is added in the RFB module, linking input directly to output; this preserves the shallow semantic information of the defect, effectively prevents vanishing gradients, and lets gradients propagate to deeper layers.
FWN addresses the problems that conventional feature fusion networks cannot fully exploit multi-scale information, have limited awareness of context, and depend heavily on the feature fusion strategy. It is introduced into the model to achieve better performance and effect in the feature fusion stage; its feature fusion structure is shown in fig. 6.
The core idea of FWN is to fuse features using both global and local context. Global context is the context information of the entire input image or feature map; local context is that of a local region. FWN has three parallel branches for feature fusion, extracted from different feature layers and containing feature information of different scales; sampling layers adjust the sizes of the feature maps so they can be fused. FWN adopts the Unifusion layer feature fusion module, so that feature maps of different scales are fused better and the model learns more from diverse information. In addition, FWN has a look-ahead mechanism that fuses the current-layer output features of a branch with the future-layer output features of adjacent branches, capturing useful feature corrections that may appear at future layers and enhancing the feature expression of the current layer. This "bi-directional fusion" design increases the diversity and discriminability of features, improving the performance and generalization ability of the model.
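One hypothetical reading of this weave, expressed as index bookkeeping: the patent describes the look-ahead mechanism only in prose, so the branch/layer indexing, the choice of adjacent branches, and the function name below are all our assumptions:

```python
def fwn_fusion_inputs(num_branches=3, num_layers=4):
    """For each (branch b, layer l) fusion point, list the feature maps
    assumed to feed its Unifusion call: the branch's own layer-l output
    plus the layer-(l+1) ("future layer") outputs of adjacent branches."""
    inputs = {}
    for b in range(num_branches):
        for l in range(num_layers - 1):
            neighbors = [n for n in (b - 1, b + 1) if 0 <= n < num_branches]
            inputs[(b, l)] = [(b, l)] + [(n, l + 1) for n in neighbors]
    return inputs
```

Under this reading, the middle branch fuses with future-layer outputs of both neighbours, while the outer branches each have only one neighbour.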
The feature fusion module, the Unifusion layer, is a neural network layer that combines multiple features of different sources or types; it helps the model better capture the associations and interactions among multiple information sources, thereby improving performance. The structure of the Unifusion layer is shown in figs. 7 and 8. The Unifusion layer first samples the feature maps of different scales so that they share the same scale, then fuses them with a Concat operation and feeds the fused features into the fusion module.
The core goal of the Unifusion layer is to fuse multiple feature maps effectively, enhancing the model's representational capability and promoting effective learning of multi-scale, multi-semantic information. The output features of the feature fusion module are obtained as follows:
receiving a plurality of input features from different layers, forming a comprehensive feature through scale alignment and connection fusion, and performing up-sampling or down-sampling operation to ensure that the input features have consistent scales;
the comprehensive features are connected and fused after being processed by two parallel branches, convolution, batch normalization and activation function processing are carried out on a first parallel branch, convolution, batch normalization, activation function and residual error processing are carried out on a second parallel branch, and the residual error processing can keep the original information of the input features and enhance gradient propagation;
and carrying out convolution, batch normalization and activation function processing on the characteristics after connection and fusion to obtain output characteristics.
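The steps above can be sketched in PyTorch roughly as follows; the channel widths, the SiLU activation and the nearest-neighbour resizing are illustrative assumptions rather than the patent's exact design:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class UnifusionLayer(nn.Module):
    """Sketch of the Unifusion layer described above: inputs are resized
    to a common scale and concatenated, the combined feature passes
    through two parallel conv-BN-activation branches (the second with a
    residual connection), the branch outputs are concatenated, and a
    final conv-BN-activation produces the output."""
    def __init__(self, in_channels: int, out_channels: int):
        super().__init__()
        def cba(cin, cout):  # conv + batch norm + activation
            return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                                 nn.BatchNorm2d(cout), nn.SiLU())
        self.branch1 = cba(in_channels, out_channels)
        self.branch2 = cba(in_channels, in_channels)  # residual branch keeps width
        self.out = cba(out_channels + in_channels, out_channels)

    def forward(self, feats):
        # scale alignment: resize every input to the first feature's size
        size = feats[0].shape[-2:]
        aligned = [F.interpolate(f, size=size, mode="nearest")
                   if f.shape[-2:] != size else f for f in feats]
        x = torch.cat(aligned, dim=1)     # connection fusion
        b1 = self.branch1(x)
        b2 = self.branch2(x) + x          # residual keeps original information
        return self.out(torch.cat([b1, b2], dim=1))
```

Here `in_channels` must equal the sum of the channel counts of the input feature maps; the residual addition in the second branch preserves the fused input and strengthens gradient propagation.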
The feature fusion module, the Unifusion layer, uses a re-parameterized structure: during training the module uses more complex convolutional branches to better capture features, and during validation or inference it can switch to fewer branches to save computational resources. This helps the module process features efficiently at each stage.
The loss function of the EC-PFN consists of three parts: classification loss (cls_loss), localization loss (box_loss) and confidence loss (obj_loss), each serving a different aspect of the target detection task. The classification loss measures whether the anchor box is classified correctly with respect to its label; the localization loss is the error between the predicted box and the ground-truth box; the confidence loss measures the network's confidence. The loss function is shown in the formula:
loss = ω_cls · cls_loss + ω_box · box_loss + ω_obj · obj_loss
where the confidence and classification losses use the binary cross entropy loss function (BCEWithLogitsLoss), and the localization loss uses the CIoU loss function. The cross entropy loss function BCEWithLogitsLoss is defined as follows:

loss = -(1/n) Σ [ y_n · log σ(x_n) + (1 − y_n) · log(1 − σ(x_n)) ]

where n is the number of input samples, y_n the target value, x_n the predicted value of the network, and σ the sigmoid function. The formula yields the probability of belonging to a given target and category; the larger the probability value, the higher the reliability.
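The equivalence between BCEWithLogitsLoss and an explicit sigmoid followed by binary cross entropy can be checked numerically in PyTorch (the sample values are arbitrary):

```python
import torch
import torch.nn as nn

# BCEWithLogitsLoss fuses a sigmoid with binary cross entropy in one
# numerically stable operation:
#   loss = -(1/n) * sum [ y_n*log(sigmoid(x_n)) + (1-y_n)*log(1-sigmoid(x_n)) ]
logits = torch.tensor([1.2, -0.7, 2.5])    # network predictions x_n
targets = torch.tensor([1.0, 0.0, 1.0])    # target values y_n

fused = nn.BCEWithLogitsLoss()(logits, targets)
manual = nn.BCELoss()(torch.sigmoid(logits), targets)
assert torch.allclose(fused, manual, atol=1e-6)
```

The fused form is preferred in practice because applying the sigmoid inside the loss avoids overflow for large-magnitude logits.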
Considering problems in industrial data such as sample imbalance and insufficient handling of hard samples, the invention improves the loss function: drawing on hard-sample mining, it adjusts the confidence loss function to reduce negative-sample interference and improve model performance and training effect. The invention introduces a Loss Ranking Module (LRM) into the confidence loss function and defines the parameter LRM_ignore to optimize the objectness (confidence) loss in the target detection task, reducing the influence of negative samples by ignoring prediction boxes with low confidence. The improvement modifies the hard-sample mining loss function and processes the loss values through loss-rank mining, so as to filter out prediction boxes with lower confidence. In the adjusted confidence loss function, the top-B detection results are selected for each feature map, where the feature maps are those of different scales used to detect objects of different sizes. The following steps are performed for each feature map:
in the training process of each small batch, a balanced Focal Loss function is applied to each cell, so that a Loss value of a detection result is obtained;
flattening the three-dimensional cell structure and collecting the loss values of each image sample into separate loss vectors;
sorting the loss values of each image;
selecting the top-B proportion of cells of each image sample from the sorted loss vector, wherein B is determined by the LRM_ignore parameter of the loss ranking module;
calculating the mean value of each selected loss;
and adding the average values to obtain a confidence loss value.
Computing the loss from the top-B detection results of each feature map balances the calculation. This helps address the sample imbalance problem in target detection and focuses attention on the classification of difficult samples on each feature map.
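The ranking steps above can be sketched as follows, assuming the per-cell balanced focal losses have already been computed. Whether the kept fraction consists of the highest-loss cells, and the exact interpretation of LRM_ignore as that fraction, are assumptions for illustration:

```python
import torch

def lrm_confidence_loss(cell_losses: torch.Tensor,
                        lrm_ignore: float = 0.65) -> torch.Tensor:
    """Sketch of the loss-ranking step: per-cell confidence losses for
    one feature map (shape: batch x grid_h x grid_w) are flattened per
    image, sorted, the top-B fraction (B set by lrm_ignore) is kept,
    and the mean of the kept losses is the confidence loss value."""
    flat = cell_losses.flatten(start_dim=1)          # (batch, cells)
    sorted_losses, _ = flat.sort(dim=1, descending=True)
    keep = max(1, int(lrm_ignore * flat.shape[1]))   # top-B cells per image
    return sorted_losses[:, :keep].mean()
```

In a full detector this function would be applied once per detection scale, and the per-scale means summed to give the final confidence loss, as in the steps above.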
The application of the industrial product surface defect detection method based on the multi-scale woven fusion network in specific industrial product examples is further described below through two application examples.
Application example 1
A self-built light guide plate defect dataset is used, with data collected from a real industrial site. The light guide plate image acquisition device, shown in fig. 9, comprises a camera, a light source, a manipulator and a conveying device. Since high-resolution images occupy more computing resources and memory, this application example crops the images, yielding 4111 images at a resolution of 448×448. These images were classified and labeled by experienced technicians into four defect types: bright-line defects (B_line), dark-line defects (D_line), area defects (area) and white-point defects (white_point). The specific distribution is shown in table 2.
Table 2 Classification of the light guide plate defect dataset

Defect type  B_line  D_line  area  white_point
Number       1014    1181    875   1024
The light guide plate dataset is challenging: defect targets are small and contrast is low. Because of the complex optical structure of the light guide plate, the light guide points are unevenly distributed and vary in density, so the image background is full of complex textures. Meanwhile, dust images very similarly to white-point defects and is easily misjudged. In addition, the area of some white-point defects is very close to that of the light guide points, making them difficult to distinguish, and dark-line defects are hard to tell apart with the naked eye because of their low contrast against the background. The sizes of white-point, bright-line, dark-line and area defects differ markedly, so the detection model must perform well on multi-scale targets.
The dataset must be reasonably partitioned before it is fed into the network for training. In this application example, each defect class is split into training, validation and test sets at a ratio of 6:2:2, according to the number of samples in the dataset. Notably, the images in these sets do not overlap, which ensures a valid evaluation of model performance.
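A per-class 6:2:2 split with disjoint sets can be sketched with the standard library; the fixed seed is an assumption added for reproducibility:

```python
import random

def split_dataset(samples, ratios=(0.6, 0.2, 0.2), seed=0):
    """Split one defect class's samples into disjoint train/val/test
    sets at the given ratios; shuffling with a fixed seed keeps the
    split reproducible across runs."""
    rng = random.Random(seed)
    items = list(samples)
    rng.shuffle(items)
    n = len(items)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])
```

Applying this function to each defect class separately (as the application example describes) keeps the class proportions consistent across the three sets.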
The training parameters adopted for the EC-PFN are as follows: batch size: 16; momentum: 0.937; initial learning rate: 0.01; final learning rate: 0.001; LRM_ignore: 0.65; input image size: 448×448; epochs: 300.
Since the EC-PFN improves the YOLOv7 backbone+FPN structure in several places, ablation experiments were performed to verify the effect of each individual improvement and of their combination. The training loss curves and training mAP curves are shown in FIGS. 10 and 11, and the ablation results are shown in Table 3.
Table 3 ablation experimental results
As can be seen from table 3, adding the RFB module improves detection accuracy for the B_line and D_line defect classes but degrades it for the other two, indicating that enlarging the receptive field helps detect the linear defects of the light guide plate. Adding the FWN structure improves accuracy overall but noticeably increases the computational cost (GFLOPs). The model whose loss function is improved with the LRM also improves detection accuracy overall, with no increase in computation. Combining RFB, FWN and LRM raises mAP to 98.6%, which shows the combination is effective: it enlarges the receptive field during feature extraction and fuses semantic information of more scales into the model during feature fusion, improving the network's ability to detect defects of different scales, although GFLOPs rise noticeably. With the final improvement, not only is the computational cost (GFLOPs) reduced, but detection accuracy also improves slightly, reaching its highest value.
To verify the effectiveness of the EC-PFN, the present application compares it with one-stage and two-stage target detection methods. The one-stage methods are SSD, YOLOv5, YOLOv6, YOLOv7 and YOLOv7-tiny; the two-stage method is Faster R-CNN. The comparison results of the target detection methods are shown in table 4, and the test results of the models are shown in fig. 12.
TABLE 4 detection results of light guide plate defect dataset on different models
As can be seen from table 4, the EC-PFN outperforms the other target detection network models. In detection accuracy, the EC-PFN's mAP value is the highest (98.9%), an improvement of 14.9%, 15.3%, 0.8%, 1.5%, 1.1% and 2.3% over SSD, Faster R-CNN, YOLOv5, YOLOv6, YOLOv7 and YOLOv7-tiny respectively. In detection speed (FPS), YOLOv7-tiny is the fastest at 142. In computational cost (GFLOPs), YOLOv7-tiny is the lowest at 13.2, with the EC-PFN at 14.5. Compared with the SSD, Faster R-CNN, YOLOv5, YOLOv6, YOLOv7 and YOLOv7-tiny networks, the EC-PFN improves the detection accuracy (mAP) while keeping the computation and parameter counts small, and its detection speed also meets real-time requirements.
As can be seen from fig. 12, a total of 9 pictures were randomly selected across the four defect classes and tested on several models. For each defect, the bounding box, class and class score are marked. The models differ in their detection results on the light guide plate defect dataset, and the EC-PFN clearly achieves better detection performance and higher confidence scores than the other detection models.
Application example 2
A comparative experiment was performed using the NEU surface defect database (NEU-DET) published by Professor Song Kechen's team at Northeastern University (NEU). The database collects six typical surface defects of hot-rolled steel strip: rolled-in scale (RS), patches (Pa), crazing (Cr), pitted surface (PS), inclusion (In) and scratches (Sc). It contains 1800 grayscale images: 300 samples for each of the six typical surface defect types. In the NEU-DET dataset, defects within a class vary widely in appearance; for example, scratches may be horizontal, vertical or oblique. At the same time, defects of different classes can look similar, for example rolled-in scale, crazing and pitted surface. These problems are common in industry and make the defects difficult to distinguish during detection.
This application example compares the EC-PFN with several target detection methods, mainly YOLOv5, YOLOv6, YOLOv7, YOLOv7-tiny, A-ODSS and Multi-Scale YOLO-v5. The A-ODSS method is a layered method for detecting various atypical defects on steel surfaces; the Multi-Scale YOLO-v5 method improves YOLOv5 to raise detection performance. The comparison results of the target detection methods are shown in table 5, and the test results of the models are shown in fig. 13.
TABLE 5 detection results of NEU-DET on different models
As can be seen from Table 5, the EC-PFN defect detection network achieves the highest mAP value (79.1%), an improvement of 1.5%, 7.4%, 3.5%, 6.9%, 1.98% and 7.1% over YOLOv5, YOLOv6, YOLOv7, YOLOv7-tiny, A-ODSS and MS-YOLO-v5 respectively. Although the EC-PFN is not optimal on the crazing, inclusion, patches and pitted_surface classes individually, its overall accuracy is improved, and its performance is superior to the other target detection networks.
As can be seen from FIG. 13, a total of 11 pictures were randomly selected across the 6 defect classes and tested on several models. The models differ in their detection results on the NEU-DET dataset: YOLOv5 produces duplicate detections on inclusion defects, while YOLOv7 and YOLOv7-tiny fail to detect rolled-in_scale and scratch defects. The EC-PFN clearly achieves better detection performance than the other detection models.
The industrial product surface defect detection method based on the multi-scale woven fusion network performs best on the light guide plate dataset, reaching a detection accuracy of 98.9% while keeping the computational cost low, which facilitates lightweight deployment. Applied to the steel surface defect dataset, it still reaches a detection accuracy of 79.1%, improvements of 1.5%, 7.4%, 3.5%, 6.9%, 1.98% and 7.1% over the YOLOv5, YOLOv6, YOLOv7, YOLOv7-tiny, A-ODSS and MS-YOLO-v5 methods respectively. This illustrates the superiority and extensibility of the industrial product surface defect detection method based on the multi-scale woven fusion network in the application field of industrial product surface defect detection.
The principles and embodiments of the present invention have been described herein with reference to specific examples, the description of which is intended only to facilitate an understanding of the method of the present invention and its core ideas. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the invention can be made without departing from the principles of the invention and these modifications and adaptations are intended to be within the scope of the invention as defined in the following claims.

Claims (9)

1. A method for detecting surface defects of industrial products based on a multi-scale woven fusion network, characterized by comprising the following steps:
classifying the surface defects of the industrial product, collecting a data set according to the defect classification, and dividing the data set into a training set, a verification set and a test set;
constructing a backbone network unit of a multi-scale woven fusion network model based on the backbone network of YOLOv7;
arranging a multi-branch dilated convolution module behind the backbone network unit, and connecting the multi-branch dilated convolution module with a neck network unit;
a neck network unit of a multi-scale woven fusion network model is constructed by adopting a context parallel fusion woven network, wherein the context parallel fusion woven network is provided with three parallel branches which are extracted from different feature layers and contain different scale feature information, each parallel branch comprises a feature fusion module, and the feature fusion module is used for fusing feature graphs which are extracted from different feature layers and have different scale feature information and different semantic information;
model training is carried out on the multi-scale woven fusion network model through a training set and a verification set, a loss function adopted by model training comprises a classification loss function, a positioning loss function and a confidence loss function, a loss ranking module is introduced into the confidence loss function, and the loss ranking module processes loss values through loss-rank mining so as to filter out prediction boxes of low confidence from the model predictions;
testing the trained multi-scale woven fusion network model through a test set;
and deploying the multi-scale woven fusion network model, and detecting surface defects of industrial products.
2. The method for detecting the surface defects of the industrial product based on the multi-scale woven fusion network according to claim 1, wherein the method comprises the following steps of: the backbone network unit comprises a CBS module, an ELAN module, an MP module and an SPPCSPC module, wherein the three CBS modules are sequentially connected with a first ELAN module and a first MP module, then sequentially connected with a second ELAN module and a second MP module, a third ELAN module and a third MP module, and then connected with a fourth ELAN module and an SPPCSPC module.
3. The method for detecting the surface defects of the industrial product based on the multi-scale woven fusion network according to claim 2, characterized in that: the multi-branch dilated convolution module is connected with the SPPCSPC module, forms a multi-branch structure through convolution kernels of different sizes, and introduces dilated (atrous) convolution layers whose parameters are represented by different dilation rates.
4. The method for detecting the surface defects of the industrial product based on the multi-scale woven fusion network according to claim 3, characterized in that: the output characteristics of the multi-branch dilated convolution module are obtained by the following method:
the input features pass through dilated convolutions with three different dilation rates to form feature maps with different receptive fields, wherein the dilation rates of the three dilated convolutions are 1, 2 and 3 respectively and the convolution kernel size is 3×3;
carrying out connection fusion on the characteristic diagrams with different receptive fields;
recovering the number of channels by a common convolution with a convolution kernel size of 1 x 1;
the output characteristics are obtained by a ReLU activation function.
5. The method for detecting the surface defects of the industrial product based on the multi-scale woven fusion network according to claim 1, wherein the method comprises the following steps of: the context parallel fusion fabric network has a prospective mechanism for fusing current layer output features of a current branch with future layer output features of adjacent branches, thereby capturing useful feature corrections that may occur at the future layer and enhancing feature expression of the current layer.
6. The method for detecting the surface defects of the industrial product based on the multi-scale woven fusion network according to claim 1, wherein the method comprises the following steps of: the output characteristics of the characteristic fusion module are obtained by the following method:
receiving a plurality of input features from different layers, and forming a comprehensive feature through scale alignment and connection fusion;
the comprehensive features are connected and fused after being processed by two parallel branches, convolution, batch normalization and activation function processing are carried out on a first parallel branch, and convolution, batch normalization, activation function and residual error processing are carried out on a second parallel branch;
and carrying out convolution, batch normalization and activation function processing on the characteristics after connection and fusion to obtain output characteristics.
7. The method for detecting the surface defects of the industrial product based on the multi-scale woven fusion network according to claim 1, wherein the method comprises the following steps of: the method for obtaining the confidence loss value by the confidence loss function comprises the following steps:
in the training process of each small batch, a balanced Focal Loss function is applied to each cell, so that a Loss value of a detection result is obtained;
flattening the three-dimensional cell structure and collecting the loss values of each image sample into separate loss vectors;
sorting the loss values of each image;
selecting the top-B proportion of cells of each image sample from the sorted loss vector, wherein B is determined by the LRM_ignore parameter of the loss ranking module;
calculating the mean value of each selected loss;
and adding the average values to obtain a confidence loss value.
8. The method for detecting the surface defects of the industrial product based on the multi-scale woven fusion network according to claim 1, wherein the method comprises the following steps of: the industrial product is a light guide plate, and the classifying the surface defects of the industrial product comprises the following steps:
the surface defects of the light guide plate are classified into bright line defects, dark line defects, surface defects and white spot defects.
9. The method for detecting the surface defects of the industrial product based on the multi-scale woven fusion network according to claim 1, characterized in that: the training parameters adopted in the model training are as follows: batch size: 16; momentum: 0.937; initial learning rate: 0.01; final learning rate: 0.001; LRM_ignore: 0.65; input image size: 448×448; epochs: 300.
CN202311645352.8A 2023-12-04 2023-12-04 Industrial product surface defect detection method based on multi-scale woven fusion network Pending CN117671354A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311645352.8A CN117671354A (en) 2023-12-04 2023-12-04 Industrial product surface defect detection method based on multi-scale woven fusion network


Publications (1)

Publication Number Publication Date
CN117671354A true CN117671354A (en) 2024-03-08

Family

ID=90085858

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311645352.8A Pending CN117671354A (en) 2023-12-04 2023-12-04 Industrial product surface defect detection method based on multi-scale woven fusion network

Country Status (1)

Country Link
CN (1) CN117671354A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118154607A (en) * 2024-05-11 2024-06-07 湖南大学 Lightweight defect detection method based on mixed multiscale knowledge distillation



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination