CN116205832A - Metal surface defect detection method based on STM R-CNN - Google Patents

Metal surface defect detection method based on STM R-CNN

Info

Publication number
CN116205832A
CN116205832A
Authority
CN
China
Prior art keywords
cnn
metal surface
feature
stage
defect detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111430299.0A
Other languages
Chinese (zh)
Inventor
王卫
张新凯
于波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Institute of Computing Technology of CAS
Original Assignee
Shenyang Institute of Computing Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Institute of Computing Technology of CAS filed Critical Shenyang Institute of Computing Technology of CAS
Priority to CN202111430299.0A priority Critical patent/CN116205832A/en
Publication of CN116205832A publication Critical patent/CN116205832A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20092 Interactive image processing based on input by user
    • G06T 2207/20104 Interactive definition of region of interest [ROI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30108 Industrial image inspection
    • G06T 2207/30136 Metal
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y04 INFORMATION OR COMMUNICATION TECHNOLOGIES HAVING AN IMPACT ON OTHER TECHNOLOGY AREAS
    • Y04S SYSTEMS INTEGRATING TECHNOLOGIES RELATED TO POWER NETWORK OPERATION, COMMUNICATION OR INFORMATION TECHNOLOGIES FOR IMPROVING THE ELECTRICAL POWER GENERATION, TRANSMISSION, DISTRIBUTION, MANAGEMENT OR USAGE, i.e. SMART GRIDS
    • Y04S 10/00 Systems supporting electrical power generation, transmission or distribution
    • Y04S 10/50 Systems or methods supporting the power network operation or management, involving a certain degree of interaction with the load-side end user applications


Abstract

The invention provides a metal surface defect detection method based on STM R-CNN, which uses a Swin Transformer as the backbone feature extraction network and a Mix-FPN mixed feature pyramid as the feature extraction layer, designs a metal surface image detection algorithm within a cascaded region convolutional neural network framework, and applies the Transformer structure to the field of metal surface defect detection. First, a Swin Transformer replaces the conventional residual network structure as the backbone feature extraction network, strengthening the network's ability to extract deep semantic information hidden in the image. A Mix-FPN mixed feature pyramid network is then designed to fuse information from different feature layers through a feature pyramid, followed by a Multi-stage R-CNN cascade structure in which each stage focuses on detecting region proposals within a specific range via a different IoU threshold. Finally, soft non-maximum suppression (Soft-NMS) and FP16 mixed-precision training are used to optimize and improve model performance.

Description

Metal surface defect detection method based on STM R-CNN
Technical Field
The invention relates to application of a deep learning algorithm to traditional industrial metal defect detection, in particular to a metal surface defect detection method based on STM R-CNN.
Background
Industrial metal profiles are widely used in industrial and building engineering construction and are the main raw material of large-scale steel structures. As a basic industrial material, the production quality of metal profiles directly affects the safety, quality, and service life of engineering construction; in some high-precision industrial applications the requirements on material quality and appearance are extremely strict. However, because of limitations in the steel production process and environment, various defects are unavoidable. To meet the different standards that various industrial applications impose on steel, corresponding quality inspection has emerged; this work mainly addresses surface defect detection for hot-rolled strip steel. Industrial hot-rolled strip steel is coiled strip with a thickness of 0.1-2 cm and a width of 60-200 cm, widely used as a raw material in the manufacture of automobiles, ships, electrical equipment, and engineering construction. The integrity and flawlessness of the strip surface directly determine the service life and value of downstream products, so strip surface defect detection is an important link in hot-rolled strip production quality inspection.
Traditional steel surface inspection methods include manual inspection, magnetic particle testing, penetrant testing, eddy current testing, X-ray testing, ultrasonic testing, and machine vision inspection. Manual inspection is inefficient; ultrasonic testing is unsuitable for materials with complex surfaces; and traditional machine vision methods are difficult to implement, demand high equipment precision, have a high design threshold that is unfriendly to system designers, incur high maintenance costs, and are strongly affected by the environment.
With the rapid development of deep learning in recent years, the feature extraction capability of the convolution operator has greatly advanced image processing technology. As Moore's law progressed, the hardware limitations on digital image processing eased, and increasingly powerful GPUs applied to large-scale parallel computing removed the computing-power bottleneck that once constrained deep learning. This brought revolutionary progress to traditional digital image processing, and image-based object detection methods have been rapidly applied and developed.
Deep-learning-based object detection algorithms can currently be divided structurally into two-stage and one-stage algorithms, represented respectively by Faster R-CNN and by the YOLO series and SSD. These methods perform well in metal surface flaw detection, and their detection results can basically meet industrial application requirements; but as the technology iterates, more advanced methods are needed to further improve the quality and accuracy of flaw detection.
Disclosure of Invention
The prior art faces the following problems: defects on metal profile surfaces are numerous and similar in appearance, with large variation in shape and area; under small data sets the algorithms converge slowly and accurate detection is difficult to achieve. The invention provides a metal surface defect detection algorithm based on STM R-CNN to solve these problems.
The technical scheme adopted by the invention for achieving the purpose is as follows:
a metal surface defect detection method based on STM R-CNN comprises the following steps:
s1, acquiring metal surface image data, enhancing the data, classifying tags, and establishing a pairing data set with classified tags;
s2, establishing an STM R-CNN metal surface defect detection network comprising the following 4 network modules: the backbone feature extraction network module, which uses Transformer operation units to extract features of different dimensions from the input data; the Mix-FPN mixed feature extraction network module, which further mixes the feature maps of different dimensions to obtain enhanced features; the RPN network module, which iteratively trains on the enhanced features and outputs regions of interest and defect boundary prediction boxes; and the Multi-stage R-CNN multi-cascade detection network module, which iteratively trains on the regions of interest output by the RPN module in combination with soft non-maximum suppression, and outputs defect boundary prediction boxes and predicted classification labels stage by stage;
s3, acquiring metal surface image data in real time, inputting it into the established STM R-CNN metal surface defect detection network, automatically locating the defect boundary prediction box, and outputting the predicted classification label.
The label classification is manual classification of defects.
The backbone feature extraction network module extracts features of different dimensions from the input data as follows:
1) Dividing the H×W×3 original image into 4×4 image patches, flattening the patches into linear embeddings, and adding the pixel position of each patch within the image;
2) Feeding these embeddings through sequentially connected Transformer blocks to obtain 4 feature maps, stage1-stage4, of different dimensions.
The Mix-FPN mixed feature extraction network module applies a cross-layer cross data fusion method to the 4 feature maps stage1-stage4 and outputs 5 enhanced feature maps p1-p5.
The cross-layer cross data fusion method comprises the following steps:
a. adopting T4+T2, T3+T1, T4+T2+T3, and T3+T1+T4 feature fusion, followed by convolution operations, to output the feature information p1-p4;
b. performing a 3×3, stride = 2 convolution on stage4 to obtain the feature information p5;
where T1-T4 are obtained from the 4 feature maps stage1-stage4 output by the backbone feature extraction network module via 1×1 convolutions that transform the channels.
The T4+T2, T3+T1, T4+T2+T3, and T3+T1+T4 feature fusion comprises:
1) performing 4-fold nearest-neighbor up-sampling on T4, then an add operation with T2, to obtain a new fused map T2′;
2) performing 4-fold nearest-neighbor up-sampling on the new T2′, then an add operation with T1, to obtain a new fused map T1′;
3) performing 8-fold downsampling on the new T1′, then an add operation with T4, to obtain a new fused map T4′;
4) performing 2-fold downsampling on the new T2′, then an add operation with T3, to obtain a new fused map T3′;
5) applying a separate 3×3 convolution to each of the fused maps to obtain the final outputs p1-p4 at 4 scales.
The loss functions of the RPN network module and the Multi-stage R-CNN multi-cascade detection network module are a classification cross-entropy loss and a bounding-box regression loss.
The invention has the following beneficial effects and advantages:
1. the currently most advanced Transformer-based feature extraction backbone architecture is adopted as the basic feature extraction network, enhancing the extraction of long-range semantic information;
2. a Mix-FPN mixed feature pyramid network (Mixed Feature Pyramid Network, Mix-FPN) framework is designed; by mixing high- and low-level feature semantic information, the algorithm adapts better to large scale variation in detection targets;
3. a Multi-stage R-CNN multi-cascade structure is designed; a multi-threshold progressive-increase strategy across cascaded R-CNN detection stages improves detection accuracy;
4. soft non-maximum suppression and FP16 mixed-precision training achieve rapid convergence, improving detection accuracy and shortening training time.
Drawings
FIG. 1 is a flow chart of a metal surface defect detection algorithm of STM R-CNN of the present invention;
FIG. 2 is a schematic diagram of the Swin Transformer data processing of the present invention;
FIG. 3 is a diagram of the Swin Transformer backbone algorithm model employed in the present invention;
FIG. 4 is a diagram of a Mix-FPN hybrid feature pyramid extraction algorithm model of the present invention;
FIG. 5 is a block diagram of a regional suggestion network in accordance with the present invention;
FIG. 6 is a diagram of a Multi-stage R-CNN Multi-cascade detection network R-CNN algorithm according to the present invention.
Detailed Description
In order that the above objects, features and advantages of the invention will be readily understood, a more particular description of the invention is given with reference to the appended drawings. In the following description, numerous specific details are set forth to provide a thorough understanding of the present invention. The invention may be embodied in many other forms than described herein, and those skilled in the art can make similar modifications without departing from the spirit of the invention; the invention is therefore not limited to the specific embodiments disclosed below.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
As shown in fig. 1, the STM R-CNN-based metal surface defect detection algorithm proposed herein is mainly divided into four parts, whose specific functions are as follows:
(1) Swin Transformer backbone network: the Transformer is built around the self-attention operation unit and differs from the classical convolution operation unit of convolutional neural networks; it has an outstanding ability to capture long-range semantic information in natural language processing, and the trend is for it to replace convolution and unify the two fields of computer vision and natural language processing;
(2) Mix-FPN hybrid feature extraction network: the feature maps output by the backbone network are fused in a mixed manner to obtain five fused feature maps; high-level semantic information is fused with the bottom layers, inter-layer fusion rules are crossed, cross-mixing of feature information is realized to the greatest extent, and the representational capability of the feature maps is enhanced;
(3) Multi-stage R-CNN layer: the single detection threshold adopted in the traditional approach is limited to a single condition and cannot reconcile detection precision with a raised threshold, severely limiting further improvement of detection precision; the multi-stage approach, through multiple thresholds, effectively avoids the rigid coupling between threshold and detection precision, achieves a stronger fit, and further improves detection precision;
(4) Soft-NMS: the traditional non-maximum suppression (NMS) algorithm forces the scores of adjacent detection boxes to zero, so overlapping real objects may be suppressed, causing detection failures and a drop in average precision (AP).
Traditional NMS reset function:

$$s_i = \begin{cases} s_i, & \mathrm{IoU}(M, b_i) < N_t \\ 0, & \mathrm{IoU}(M, b_i) \ge N_t \end{cases}$$

where s_i is the detection-box score, IoU(M, b_i) is the intersection-over-union of the highest-scoring box M and detection box b_i, and N_t is the set overlap threshold. The soft non-maximum suppression (Soft-NMS) algorithm preserves overlapping detection boxes by reducing their scores instead of forcing them to zero.
Soft-NMS reset function:

$$s_i = \begin{cases} s_i, & \mathrm{IoU}(M, b_i) < N_t \\ s_i\,\bigl(1 - \mathrm{IoU}(M, b_i)\bigr), & \mathrm{IoU}(M, b_i) \ge N_t \end{cases}$$

when a detection box exceeds the overlap threshold, its score is reset with linear attenuation: boxes closer to M are attenuated more strongly, while more distant boxes are affected less.
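As an illustration of the linear-decay rule above, here is a minimal pure-Python sketch of Soft-NMS (the function and variable names are illustrative, not from the patent; boxes are given as [x1, y1, x2, y2]):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union

def soft_nms_linear(boxes, scores, nt=0.5, score_thresh=0.001):
    """Linear Soft-NMS: decay the scores of boxes overlapping the current
    maximum M by (1 - IoU) instead of forcing them to zero."""
    scores = list(scores)
    remaining = list(range(len(boxes)))
    keep = []
    while remaining:
        m = max(remaining, key=lambda i: scores[i])  # highest-scoring box M
        keep.append(m)
        remaining.remove(m)
        survivors = []
        for i in remaining:
            o = iou(boxes[m], boxes[i])
            if o >= nt:                      # overlap threshold N_t exceeded
                scores[i] *= (1.0 - o)       # linear score attenuation
            if scores[i] > score_thresh:     # drop only near-zero scores
                survivors.append(i)
        remaining = survivors
    return keep
```

Unlike hard NMS, a heavily overlapping box survives with a reduced score, so two genuinely overlapping defects can both be reported.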
As shown in fig. 2 and 3, the Swin Transformer backbone network data processing algorithm directly divides an image into batches of fixed size, obtains batch Embedding by Linear transformation, and performs operations such as feature extraction classification after serializing the image into transformers, similar to word Embedding (Linear Embedding) in natural language processing. The Swin transform backbone network firstly cuts the H×W×3pix picture into 4×4 picture blocks (patches), flattens the patches into linear dimensions, converts the linear dimensions into token emudding, and adds the position emudding on the basis of embedding tokens into the token emudding. It is input to a custom number of Transformer Encoder modules. (Each (H/4) × (W/4) ×3pix patch represents a token.)
The backbone has 4 stages in total, each outputting a feature map; the output feature map sizes are (C = 96):

stage1: (H/4) × (W/4) × C
stage2: (H/8) × (W/8) × 2C
stage3: (H/16) × (W/16) × 4C
stage4: (H/32) × (W/32) × 8C
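The stage-size progression can be checked with a small helper (a sketch assuming the standard Swin layout of a 4×4 patch stem followed by 2× patch merging per stage; the function name is illustrative):

```python
def swin_stage_shapes(H, W, C=96, num_stages=4):
    """Feature-map shapes of the backbone stages: the 4x4 patch stem gives
    stage1 at (H/4) x (W/4) x C; each subsequent patch-merging step halves
    the spatial resolution and doubles the channel count."""
    h, w, c = H // 4, W // 4, C
    shapes = []
    for _ in range(num_stages):
        shapes.append((h, w, c))
        h, w, c = h // 2, w // 2, c * 2
    return shapes
```

For a 224×224 input this yields 56×56×96, 28×28×192, 14×14×384, and 7×7×768 for stage1-stage4 with C = 96.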
As shown in fig. 4, the backbone outputs 4 feature maps stage1, stage2, stage3, and stage4, which are converted into T1-T4 through 1×1 convolutions that unify the channels to 256.
Starting from T4, 4-fold nearest-neighbor up-sampling is performed, followed by an add operation with T2, obtaining a new fused map T2′. The new T2′ is up-sampled 4-fold by nearest neighbor and added with T1, obtaining a new fused map T1′. The new T1′ is downsampled 8-fold and added with T4, obtaining a new fused map T4′. The new T2′ is downsampled 2-fold and added with T3, obtaining a new fused map T3′. A separate 3×3 convolution is applied to each of the fused maps to obtain the final outputs p1-p4 at 4 scales.
To provide a feature map with a very large receptive field for detecting large-scale features, stage4 is convolved 3×3 with stride = 2 to yield p5. The Mix-FPN module thus takes the 4 feature maps stage1-stage4 as input and outputs 5 feature maps with sizes:

p1: (H/4) × (W/4) × 256
p2: (H/8) × (W/8) × 256
p3: (H/16) × (W/16) × 256
p4: (H/32) × (W/32) × 256
p5: (H/64) × (W/64) × 256
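A shape-level sketch of the Mix-FPN input/output contract (a simplification that tracks only tensor sizes, not the fusion arithmetic; the names are illustrative):

```python
def mix_fpn_output_shapes(H, W, out_ch=256):
    """Output shapes of the Mix-FPN head: p1-p4 keep the spatial strides of
    stage1-stage4 (4, 8, 16, 32) with channels unified to 256 by 1x1 convs;
    p5 comes from one extra stride-2 3x3 conv on stage4 (overall stride 64)."""
    strides = [4, 8, 16, 32, 64]
    return [(H // s, W // s, out_ch) for s in strides]
```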
As shown in figs. 5 and 6, the feature maps from the feature extraction network enter the RPN network, where a 3×3 convolution is applied first, followed by separate 1×1 convolutions that generate the classification and bounding-box predictions. The classification branch performs binary prediction of foreground versus background, while the predicted bounding boxes serve as input to the Multi-stage R-CNN cascade network. The predictions are combined with the anchor generator (Anchors Generator) to produce region proposal boxes and labels for loss calculation.
The loss functions of the RPN and the Multi-stage R-CNN network are as follows:

$$L(\{p_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_i L_{cls}(p_i, p_i^*) + \lambda\,\frac{1}{N_{reg}}\sum_i p_i^*\,L_{reg}(t_i, t_i^*) \tag{3}$$

In formula (3), i is the index of a prediction box (anchor); p_i is the predicted probability that the i-th anchor is the true label; p_i^* is 1 for the corresponding positive sample and 0 for a negative sample, which ensures there is no bounding-box regression loss when the anchor is a negative sample; t_i is the predicted bounding-box regression value of the i-th anchor, and t_i^* is the corresponding i-th ground-truth box value, used to compute the offset between the anchor and the ground-truth box. N_cls is the mini-batch size, N_reg is the number of anchor locations, L_cls is the cross-entropy loss, and L_reg is the SmoothL1 loss.
As shown in equation (3), the loss function of the RPN network consists of a classification cross-entropy loss and a bounding-box regression loss.
(1) Classification cross-entropy loss (Cross Entropy Loss): the classifier in the RPN network divides candidate boxes into foreground and background, a binary classification problem whose prediction takes only the two values p and 1 − p:

$$L_{cls}(p_i, p_i^*) = -\bigl[p_i^* \log p_i + (1 - p_i^*)\log(1 - p_i)\bigr]$$

where p_i is the probability that the i-th anchor is predicted as the true label, and p_i^* is 1 for a positive sample and 0 for a negative sample;
(2) The multi-class form of the cross-entropy loss is used in the R-CNN module:

$$L = -\frac{1}{N}\sum_{j}\sum_{c=1}^{M} y_{jc}\,\log(p_{jc})$$

where M is the number of categories; y_jc is an indicator function (0 or 1) that takes 1 if the true class of sample j equals c and 0 otherwise; and p_jc is the predicted probability that sample j belongs to category c.
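The two cross-entropy forms can be written out directly as minimal per-sample sketches (function names are illustrative, not from the patent):

```python
import math

def binary_ce(p, y):
    """Binary cross-entropy for one RPN proposal: y in {0, 1} is the
    foreground label, p the predicted foreground probability."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def multiclass_ce(probs, label):
    """Multi-class cross-entropy for one R-CNN sample: probs is the predicted
    distribution over M classes, label the true class index; only the term
    with y_jc = 1 survives the inner sum."""
    return -math.log(probs[label])
```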
(3) Bounding-box regression loss:

$$L_{reg}(t_i, t_i^*) = \sum_{i} \mathrm{smooth}_{L1}\bigl(t_i - t_i^*\bigr) \tag{7}$$

$$\mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$

$$t_x = \frac{x - x_a}{w_a},\quad t_y = \frac{y - y_a}{h_a},\quad t_w = \log\frac{w}{w_a},\quad t_h = \log\frac{h}{h_a} \tag{6-4}$$

$$t_x^* = \frac{x^* - x_a}{w_a},\quad t_y^* = \frac{y^* - y_a}{h_a},\quad t_w^* = \log\frac{w^*}{w_a},\quad t_h^* = \log\frac{h^*}{h_a} \tag{6-5}$$

In formulas (6-4) and (6-5), x, y, w, h denote the center coordinates and size of the predicted box, x_a, y_a, w_a, h_a those of the anchor box, and x*, y*, w*, h* those of the ground-truth (label) box. The smooth-L1 regression loss of formula (7) computes the loss between the anchor prediction box and the ground-truth box.
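A sketch of the box encoding and SmoothL1 terms above, with boxes in center/size format (x, y, w, h) (helper names are illustrative):

```python
import math

def encode_deltas(box, anchor):
    """Encode a box against an anchor: t_x = (x - x_a)/w_a, t_y = (y - y_a)/h_a,
    t_w = log(w/w_a), t_h = log(h/h_a)."""
    x, y, w, h = box
    xa, ya, wa, ha = anchor
    return ((x - xa) / wa, (y - ya) / ha, math.log(w / wa), math.log(h / ha))

def smooth_l1(x):
    """SmoothL1: 0.5*x^2 for |x| < 1, |x| - 0.5 otherwise."""
    return 0.5 * x * x if abs(x) < 1.0 else abs(x) - 0.5

def bbox_reg_loss(pred_deltas, target_deltas):
    """Bounding-box regression loss: SmoothL1 summed over the 4 components."""
    return sum(smooth_l1(p - t) for p, t in zip(pred_deltas, target_deltas))
```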
As shown in fig. 6, the data set of the present invention adopts a hot rolled steel strip public data set, and has been subjected to label classification, the classification includes: six types of scale (RS), plaque (Pa), crack (Cr), pitted Surface (PS), inclusion (In) and scratch (Sc), 300 sheets each, totaling 1800 sheets.
The traditional R-CNN network struggles no matter how the threshold is set. If the threshold is set high, the predicted bounding box and the real bounding box must share a large context, making it difficult for the network to obtain positive sample data. If the threshold is lower, the network obtains more positive samples, but many of them are not true matches. It is therefore difficult for a single network model to choose one threshold well. To improve the detection capability of the model, a cascade architecture is constructed in which the threshold of each detector module is increased in turn (0.55, 0.65, 0.75). By resampling with the regression output of the previous stage, some extreme samples are removed as the IoU threshold rises, optimizing the deeper detectors and improving overall performance. The intersection-over-union (IoU) is the ratio of the intersection to the union of the predicted bounding box and the real bounding box:

$$\mathrm{IoU} = \frac{\mathrm{area}(B_{pred} \cap B_{gt})}{\mathrm{area}(B_{pred} \cup B_{gt})}$$
The bounding-box prediction obtained via the RPN network's IoU is sent to the first stage of the R-CNN network; if its IoU exceeds the set threshold, it is sent to the second stage, whose IoU threshold is higher; after further screening it is sent to the third stage. In both the RPN network and the Multi-stage R-CNN network, the Soft-NMS soft non-maximum suppression algorithm is used for metal surface defect detection, improving detection precision through a nearest-neighbor attenuation strategy. The Soft-NMS formula is:

$$g_y = \begin{cases} g_y, & \mathrm{IoU}(x, y) < u \\ g_y\,\bigl(1 - \mathrm{IoU}(x, y)\bigr), & \mathrm{IoU}(x, y) \ge u \end{cases}$$

where g_y is the detection-box score, IoU(x, y) is the intersection-over-union of the detection box and the real box, and u is the set non-maximum-suppression threshold.
The backbone feature maps are fed simultaneously to the RPN and the Multi-stage R-CNN network. The cascade receives box_pred_0 from the RPN network, and box_pred and cls_logits are refined by cascading several R-CNN modules, each with a different threshold. The final classification result is the average over the n R-CNN modules, and the predicted bbox is the output of the last R-CNN module.
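The progressive IoU gating of the cascade can be illustrated with a toy sketch (the real stages also re-regress boxes between stages, which is omitted here; the thresholds follow the (0.55, 0.65, 0.75) schedule above, and the function name is illustrative):

```python
def cascade_stage_reached(proposal_ious, thresholds=(0.55, 0.65, 0.75)):
    """For each proposal's IoU with its ground-truth box, return the index of
    the last cascade stage whose threshold it meets (-1 = rejected by stage 1)."""
    reached = []
    for v in proposal_ious:
        stage = -1
        for t in thresholds:
            if v >= t:
                stage += 1   # proposal passes this stage's IoU gate
            else:
                break        # gates are monotonically increasing, so stop
        reached.append(stage)
    return reached
```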
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the present invention.

Claims (7)

1. The metal surface defect detection method based on STM R-CNN is characterized by comprising the following steps of:
s1, acquiring metal surface image data, enhancing the data, classifying tags, and establishing a pairing data set with classified tags;
s2, establishing an STM R-CNN metal surface defect detection network comprising the following 4 network modules: the backbone feature extraction network module, which uses Transformer operation units to extract features of different dimensions from the input data; the Mix-FPN mixed feature extraction network module, which further mixes the feature maps of different dimensions to obtain enhanced features; the RPN network module, which iteratively trains on the enhanced features and outputs regions of interest and defect boundary prediction boxes; and the Multi-stage R-CNN multi-cascade detection network module, which iteratively trains on the regions of interest output by the RPN module in combination with soft non-maximum suppression, and outputs defect boundary prediction boxes and predicted classification labels stage by stage;
s3, acquiring metal surface image data in real time, inputting it into the established STM R-CNN metal surface defect detection network, automatically locating the defect boundary prediction box, and outputting the predicted classification label.
2. The STM R-CNN based metal surface defect detection method of claim 1, wherein the label classification is a manual classification of defects.
3. The STM R-CNN-based metal surface defect detection method of claim 1, wherein the backbone feature extraction network module extracts features of different dimensions from the input data by:
1) Dividing the H×W×3 original image into 4×4 image patches, flattening the patches into linear embeddings, and adding the pixel position of each patch within the image;
2) Feeding these embeddings through sequentially connected Transformer blocks to obtain 4 feature maps, stage1-stage4, of different dimensions.
4. The STM R-CNN-based metal surface defect detection method according to claim 1, wherein the Mix-FPN mixed feature extraction network module applies a cross-layer cross data fusion method to the 4 feature map inputs stage1-stage4 and outputs 5 enhanced feature maps p1-p5.
5. The STM R-CNN-based metal surface defect detection method of claim 4, wherein the cross-layer cross data fusion method comprises:
a. adopting T4+T2, T3+T1, T4+T2+T3, and T3+T1+T4 feature fusion, followed by convolution operations, to output the feature information p1-p4;
b. performing a 3×3, stride = 2 convolution on stage4 to obtain the feature information p5;
wherein T1-T4 are obtained from the 4 feature maps stage1-stage4 output by the backbone feature extraction network module via 1×1 convolutions that transform the channels.
6. The STM R-CNN-based metal surface defect detection method according to claim 1, wherein the T4+T2, T3+T1, T4+T2+T3 and T3+T1+T4 feature fusions comprise:
1) performing 4-fold nearest-neighbor upsampling on T4, then an add operation with T2, to obtain a new fused map T2′;
2) performing 4-fold nearest-neighbor upsampling on T3, then an add operation with T1, to obtain a new fused map T1′;
3) performing 8-fold random downsampling on the new T1′, then an add operation with T4, to obtain a new fused map T4′;
4) performing 2-fold random downsampling on the new T2′, then an add operation with T3, to obtain a new fused map T3′;
5) applying a 3×3 convolution to each of the four new fused maps to obtain the final outputs p1 to p4 at 4 scales.
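The fusion steps above can be sketched in NumPy. The map sizes (strides 4/8/16/32 of an assumed 128×128 input), the 256-channel width, and the reading of "random downsampling" as keeping one randomly chosen row/column per cell are all illustrative assumptions; the fusion combinations follow the T4+T2, T3+T1, T3+T1+T4 and T4+T2+T3 list named in claim 5:

```python
import numpy as np

def up_nn(x, f):
    """f-fold nearest-neighbor upsampling of an (H, W, C) feature map."""
    return x.repeat(f, axis=0).repeat(f, axis=1)

def down_rand(x, f, rng):
    """f-fold 'random' downsampling: keep one randomly picked row/column per f-cell."""
    H, W, _ = x.shape
    rows = np.arange(0, H, f) + rng.integers(0, f, H // f)
    cols = np.arange(0, W, f) + rng.integers(0, f, W // f)
    return x[rows][:, cols]

rng = np.random.default_rng(0)
C = 256  # assumed common channel width after the 1x1 alignment
# T1..T4 at strides 4/8/16/32 of an assumed 128x128 input
T1, T2, T3, T4 = (rng.standard_normal((s, s, C)) for s in (32, 16, 8, 4))

F2 = up_nn(T4, 4) + T2           # step 1: T4 + T2
F1 = up_nn(T3, 4) + T1           # step 2: T3 + T1
F4 = down_rand(F1, 8, rng) + T4  # step 3: T3 + T1 + T4
F3 = down_rand(F2, 2, rng) + T3  # step 4: T4 + T2 + T3
# step 5 would apply a 3x3 convolution to each of F1..F4 to yield p1..p4
print([f.shape for f in (F1, F2, F3, F4)])
```

Note how the scale factors make the spatial sizes line up: 4× up bridges two stride-2 stages, 8× down bridges three, and 2× down bridges one, so every add operation acts on maps of identical shape.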
7. The STM R-CNN-based metal surface defect detection method of claim 1, wherein the loss functions of the RPN network module and the Multi-stage R-CNN multi-cascade detection network module are a classification cross-entropy loss and a bounding-box regression loss.
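The two loss terms of claim 7 can be sketched as follows. The claim does not specify the regression loss form; smooth L1 (the usual Fast/Faster R-CNN choice) is used here as an assumption, and the logits and boxes are made-up example values:

```python
import numpy as np

def cross_entropy(logits, labels):
    """Classification cross-entropy over class logits; labels are class indices."""
    z = logits - logits.max(axis=1, keepdims=True)          # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 bounding-box regression loss (assumed form, as in Faster R-CNN)."""
    d = np.abs(pred - target)
    return np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta).mean()

logits = np.array([[2.0, 0.5], [0.1, 1.5]])                 # example class scores
labels = np.array([0, 1])                                    # example ground truth
boxes_p = np.array([[0.1, 0.2, 0.9, 0.8]])                   # example predicted box
boxes_t = np.array([[0.0, 0.25, 1.0, 0.8]])                  # example target box
total_loss = cross_entropy(logits, labels) + smooth_l1(boxes_p, boxes_t)
```

In a cascade detector each R-CNN stage would contribute its own pair of these terms, typically at an increasing IoU threshold per stage.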
CN202111430299.0A 2021-11-29 2021-11-29 Metal surface defect detection method based on STM R-CNN Pending CN116205832A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111430299.0A CN116205832A (en) 2021-11-29 2021-11-29 Metal surface defect detection method based on STM R-CNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111430299.0A CN116205832A (en) 2021-11-29 2021-11-29 Metal surface defect detection method based on STM R-CNN

Publications (1)

Publication Number Publication Date
CN116205832A true CN116205832A (en) 2023-06-02

Family

ID=86515997

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111430299.0A Pending CN116205832A (en) 2021-11-29 2021-11-29 Metal surface defect detection method based on STM R-CNN

Country Status (1)

Country Link
CN (1) CN116205832A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117541554A (en) * 2023-11-15 2024-02-09 江西理工大学 Surface defect detection method based on deep learning
CN117994257A (en) * 2024-04-07 2024-05-07 中国机械总院集团江苏分院有限公司 Fabric flaw analysis and detection system and analysis and detection method based on deep learning


Similar Documents

Publication Publication Date Title
Wan et al. Ceramic tile surface defect detection based on deep learning
CN109859163A (en) 2019-06-07 LCD defect detection method based on a feature pyramid convolutional neural network
CN116205832A (en) Metal surface defect detection method based on STM R-CNN
CN110135486B (en) Chopstick image classification method based on adaptive convolutional neural network
CN113947590A (en) Surface defect detection method based on multi-scale attention guidance and knowledge distillation
CN113160139B (en) Attention-based steel plate surface defect detection method of Faster R-CNN network
CN113628178B (en) Steel product surface defect detection method with balanced speed and precision
CN114066820A (en) 2022-02-18 Fabric defect detection method based on Swin-Transformer and NAS-FPN
CN110598698B (en) Natural scene text detection method and system based on adaptive regional suggestion network
CN115953408B (en) YOLOv 7-based lightning arrester surface defect detection method
CN116343053B (en) Automatic solid waste extraction method based on fusion of optical remote sensing image and SAR remote sensing image
CN105550712A (en) 2016-05-04 Aurora image classification method based on an optimized convolutional auto-encoding network
CN116883393B (en) Metal surface defect detection method based on anchor frame-free target detection algorithm
CN115861281A (en) Anchor-frame-free surface defect detection method based on multi-scale features
CN114463759A (en) Lightweight character detection method and device based on anchor-frame-free algorithm
CN116012310A (en) Cross-sea bridge pier surface crack detection method based on linear residual error attention
CN116523875A (en) Insulator defect detection method based on FPGA pretreatment and improved YOLOv5
CN114037684B (en) Defect detection method based on yolov and attention mechanism model
CN115631186A (en) Industrial element surface defect detection method based on double-branch neural network
Fan et al. Application of YOLOv5 neural network based on improved attention mechanism in recognition of Thangka image defects
CN115147347A (en) Method for detecting surface defects of malleable cast iron pipe fitting facing edge calculation
CN112837281B (en) Pin defect identification method, device and equipment based on cascade convolution neural network
CN114092467A (en) Scratch detection method and system based on lightweight convolutional neural network
CN110136098B (en) Cable sequence detection method based on deep learning
CN116342496A (en) Abnormal object detection method and system for intelligent inspection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination