CN114663407A - Coal gangue target detection method based on improved YOLOv5s model - Google Patents

Coal gangue target detection method based on improved YOLOv5s model Download PDF

Info

Publication number
CN114663407A
CN114663407A (application CN202210320475.3A)
Authority
CN
China
Prior art keywords
yolov5s
model
gangue
improved
coal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210320475.3A
Other languages
Chinese (zh)
Inventor
季亮
沈科
张袁浩
陈晓晶
周李兵
霍振龙
潘祥生
任书文
***
郝大彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tiandi Changzhou Automation Co Ltd
Changzhou Research Institute of China Coal Technology and Engineering Group Corp
Original Assignee
Tiandi Changzhou Automation Co Ltd
Changzhou Research Institute of China Coal Technology and Engineering Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tiandi Changzhou Automation Co Ltd, Changzhou Research Institute of China Coal Technology and Engineering Group Corp filed Critical Tiandi Changzhou Automation Co Ltd
Priority to CN202210320475.3A priority Critical patent/CN114663407A/en
Publication of CN114663407A publication Critical patent/CN114663407A/en
Withdrawn legal-status Critical Current

Classifications

    • G06T 7/0004 — Image analysis; inspection of images, e.g. flaw detection; industrial image inspection
    • B07C 5/02 — Sorting according to a characteristic or feature of the articles or material; measures preceding sorting, e.g. arranging articles in a stream, orientating
    • B07C 5/3422 — Sorting according to optical properties, e.g. colour, using video scanning devices, e.g. TV cameras
    • B07C 5/362 — Sorting apparatus characterised by the means used for distribution; separating or distributor mechanisms
    • G06F 18/23213 — Pattern recognition; non-hierarchical clustering using statistics or function optimisation, with a fixed number of clusters, e.g. K-means clustering
    • G06F 18/241 — Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N 3/045 — Neural networks; combinations of networks
    • G06N 3/08 — Neural networks; learning methods
    • G06T 2207/10016 — Image acquisition modality; video; image sequence
    • G06T 2207/20081 — Special algorithmic details; training, learning
    • G06T 2207/20084 — Special algorithmic details; artificial neural networks [ANN]
    • G06T 2207/30108 — Subject of image; industrial image inspection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Quality & Reliability (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a coal gangue target detection method based on an improved YOLOv5s model, which comprises the following steps: S1, acquiring real-time images of coal and gangue; S2, performing visual recognition processing on the acquired real-time image based on the improved YOLOv5s model, thereby identifying the coal and gangue in the real-time image and determining the coordinate information of the gangue; and S3, sorting the gangue out of the coal with a mechanical arm according to the gangue coordinates. On the basis of the YOLOv5s model, a self-correcting convolution network (SCConv) is embedded into the Backbone region of the YOLOv5s model, the 19 × 19 feature map branch of the Neck and Prediction regions of the YOLOv5s model is deleted, and the anchor boxes obtained by K-means clustering are linearly scaled; the resulting improved YOLOv5s model is applied to coal and gangue target detection and effectively improves both detection speed and detection precision.

Description

Coal gangue target detection method based on improved YOLOv5s model
Technical Field
The invention belongs to the technical field of coal and gangue identification, and particularly relates to a coal and gangue target detection method based on an improved YOLOv5s model.
Background
Coal mine production is often accompanied by a large amount of gangue, which degrades coal quality if not removed in time, so sorting the gangue out of the mined coal effectively improves coal quality. With the development of machine vision, the technology has been widely applied to coal and gangue identification, and by technical principle it falls into 2 categories: image processing algorithms and deep learning algorithms. Image processing algorithms extract features of the coal and gangue such as color, gray level, edge and contour by designing specific convolution filters, and then detect coal and gangue targets through image segmentation; in practical application, however, the parameters must be tuned manually for different scenes, so such algorithms have poor robustness and practicability. Deep learning algorithms have the advantages of a high recognition rate and strong robustness, and have been rapidly adopted for coal and gangue recognition. In coal and gangue target detection, Wangzhong et al. proposed a coal and gangue image classification method based on a deep learning network with a high recognition rate, but it does not accurately detect the position and size of coal and gangue targets. Wangpo et al. used a convolutional neural network to detect coal and gangue targets; however, the data set scenes contain only a single target under favorable illumination, so detection precision cannot be guaranteed in scenes with 6 or more targets, and the detection speed is slow, with a per-frame detection time of 50 ms.
Laiwehao et al. used a multispectral system to collect 3 wave bands and build a pseudo-RGB image data set, then applied an improved YOLOv4 model to coal and gangue target detection; however, the single-frame detection time is 4.18 s, so real-time detection of coal and gangue cannot be achieved.
Disclosure of Invention
The present invention is directed to solving at least one of the problems of the prior art.
Therefore, the invention provides a coal gangue target detection method based on an improved YOLOv5s model, and the coal gangue target detection method based on the improved YOLOv5s model has the advantages of high detection speed and high detection precision.
The gangue target detection method based on the improved YOLOv5s model comprises the following steps: s1, acquiring real-time images of coal and gangue; s2, carrying out visual identification processing on the acquired real-time image based on the improved YOLOv5S model, thereby identifying coal and gangue in the real-time image and determining coordinate information of the gangue; and S3, sorting the gangue from the coal by the mechanical arm according to the coordinates of the gangue.
According to an embodiment of the present invention, S2 includes: S21, collecting a plurality of image samples of coal and gangue, dividing the image samples into a training set, a verification set and a test set, and labeling them to produce the coal and gangue data sets; S22, improving the YOLOv5s model to obtain an improved YOLOv5s model; S23, training the improved YOLOv5s model with the training set; S24, testing the improved YOLOv5s model with the verification set, so as to check whether the training of the improved YOLOv5s model is accurate; and S25, detecting the image samples in the test set with the trained improved YOLOv5s model, and evaluating the detection precision and detection efficiency of the test set results.
According to an embodiment of the present invention, in S22, two SCConv structures are embedded in the Backbone region of the YOLOv5s model, one located between the first CBL module and the CSP1_1 module and the other located between the second CBL module and the first CSP1_3 module.
According to an embodiment of the present invention, in S22, the Neck region and the Prediction region of the YOLOv5S model are reduced, and the 19 × 19 feature map branches in the Neck region and the Prediction region are deleted.
According to an embodiment of the invention, in the training process of the improved YOLOv5s model, the sizes of the 6 groups of anchor boxes obtained after clustering by the K-means algorithm are (41, 63), (47, 94), (54, 69), (54, 51), (64, 84) and (64, 120), respectively.
According to one embodiment of the invention, the sizes of 6 groups of anchor frames obtained after the clustering by the K-means algorithm are scaled, and the scaling formula is as follows:
x′1 = Ax1 (1)
x′6 = Bx6 (2)
x′i = x′1 + (x′6 − x′1)(xi − x1)/(x6 − x1), i = 2, …, 5 (3)
y′i = yi·x′i/xi, i = 1, 2, …, 6 (4)
wherein: xi is the width of the ith anchor box (ordered by width from small to large), i = 1, 2, …, 6; x′i is the scaled anchor box width; A is the reduction factor of the anchor box; B is the magnification factor of the anchor box; yi is the height of the ith anchor box; y′i is the scaled anchor box height.
According to one embodiment of the present invention, the scaled anchor frame sizes are (20, 31), (39, 79), (62, 80), (62, 59), (96, 126), and (96, 180), respectively.
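The linear scaling can be sketched in a few lines. This is one plausible reading under stated assumptions: widths are interpolated linearly between the two scaled endpoints of equations (1) and (2), and heights are scaled to preserve each anchor's aspect ratio; the function name and the exact interpolation rule are illustrative, not taken from the patent.

```python
def scale_anchors(anchors, A=0.5, B=1.5):
    """Linear rescaling of width-sorted (w, h) anchor boxes: endpoint widths
    follow w'1 = A*w1 and w'6 = B*w6, intermediate widths are linearly
    interpolated, and each height is scaled to preserve the anchor's
    aspect ratio (assumed interpolation, not the patent's exact formula)."""
    anchors = sorted(anchors, key=lambda wh: wh[0])
    w1, w6 = anchors[0][0], anchors[-1][0]
    w1s, w6s = A * w1, B * w6                 # equations (1) and (2)
    scaled = []
    for w, h in anchors:
        ws = w1s + (w6s - w1s) * (w - w1) / (w6 - w1)  # assumed interpolation
        hs = h * ws / w                                # aspect-preserving height
        scaled.append((ws, hs))
    return scaled

anchors = [(41, 63), (47, 94), (54, 69), (54, 51), (64, 84), (64, 120)]
print(scale_anchors(anchors)[0], scale_anchors(anchors)[-1])  # (20.5, 31.5) (96.0, 180.0)
```

With A = 0.5 and B = 1.5 the extreme anchors map to (20.5, 31.5) and (96, 180), consistent with the reported scaled sizes (20, 31) and (96, 180) after truncation; the intermediate values depend on the exact interpolation used.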
According to one embodiment of the invention, the training platform adopted in S23 is an NVIDIA GeForce GTX 2080Ti, and the inference platform is a mining intrinsically safe edge computing device with 14 TOPS of computing power; the input image size is 608 × 608 with 3 channels. During training, the momentum coefficient is set to 0.937, the weight attenuation coefficient to 0.0005 and the learning rate to 0.01; a warm-up method is adopted for updating the learning rate, the batch size is 16, and the number of training iterations is 300.
According to an embodiment of the present invention, in S21, 526 image samples with a native resolution of 1280 × 960 are collected; each image sample contains more than 4 pieces of coal and gangue, including stacked and occluded pieces. The training set contains 373 image samples, the verification set 77 image samples, and the test set 76 image samples.
According to an embodiment of the invention, in step S21, the coal and gangue data set is preliminarily labeled with an auxiliary labeling tool and then visualized with the open-source tool LabelImg, completing the production of the coal and gangue data set.
The beneficial effects of the improved YOLOv5s model are as follows: on the basis of the YOLOv5s model, a self-correcting convolution network (SCConv) is embedded into the Backbone region of the YOLOv5s model, the 19 × 19 feature map branch of the Neck and Prediction regions is deleted, and the anchor boxes obtained by K-means clustering are linearly scaled; the resulting improved YOLOv5s model is applied to coal and gangue target detection and effectively improves both detection speed and detection accuracy.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a flow chart of a coal gangue target detection method based on an improved YOLOv5s model in the invention;
FIG. 2 is a diagram of the structure of the YOLOv5s model;
FIG. 3 is a view of the SCConv structure of the present invention;
FIG. 4 is a structural diagram of a Backbone of the improved YOLOv5s model of the present application;
FIG. 5 is a block diagram of the Neck and Prediction of the improved YOLOv5s model of the present application;
FIG. 6 is a graph of the results of the test using the YOLOv5s model;
fig. 7 is a graph of the results of the test using the improved YOLOv5s model of the present application.
FIG. 8 is a P-R plot of models on a gangue identification test set.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
In the description of the present invention, it is to be understood that the terms "central," "longitudinal," "lateral," "length," "width," "thickness," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," "clockwise," "counterclockwise," "axial," "radial," "circumferential," and the like are used in the orientations and positional relationships indicated in the drawings for convenience in describing the invention and to simplify the description, and are not intended to indicate or imply that the referenced device or element must have a particular orientation, be constructed and operated in a particular orientation, and are not to be considered limiting of the invention. Furthermore, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless otherwise specified.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood in specific cases to those skilled in the art.
The gangue target detection method based on the improved YOLOv5s model according to the embodiment of the invention is described in detail below with reference to the accompanying drawings.
As shown in fig. 1 to 8, the coal gangue target detection method based on the improved YOLOv5s model according to the embodiment of the invention includes: S0, configuring camera parameters and starting the camera; S1, collecting real-time images of coal and gangue through the camera; S2, performing visual recognition processing on the acquired real-time image based on the improved YOLOv5s model, thereby identifying the coal and gangue in the real-time image and determining the coordinate information of the gangue; S3, sorting the gangue out of the coal with the mechanical arm according to the gangue coordinates; and repeating steps S1 to S3 to perform cyclic detection and gangue sorting.
According to an embodiment of the present invention, S2 includes: S21, collecting a plurality of image samples of coal and gangue, dividing the image samples into a training set, a verification set and a test set, and labeling them to produce the coal and gangue data sets; S22, improving the YOLOv5s model to obtain an improved YOLOv5s model; S23, training the improved YOLOv5s model with the training set; S24, testing the improved YOLOv5s model with the verification set, so as to check whether the training of the improved YOLOv5s model is accurate; and S25, detecting the image samples in the test set with the trained improved YOLOv5s model, and evaluating the detection precision and detection efficiency of the test set results.
On this basis, in S22, two SCConv structures are embedded in the Backbone region of the YOLOv5s model, one located between the first CBL module and the CSP1_1 module, and the other located between the second CBL module and the first CSP1_3 module. That is, as shown in fig. 2, the YOLOv5s model realizes flexible configuration of model size and performance on the basis of the YOLOv4 model, and introduces recent network modules and training techniques such as Mosaic data enhancement, the DropBlock mechanism, the Hardswish activation function and the GIoU bounding box regression loss. The YOLOv5s model mainly comprises the Input, Backbone, Neck and Prediction regions, each composed of modules such as CBL (Conv + BN + Leaky_ReLU), CSP (CBL + Res unit + Concat + BN + Leaky_ReLU), Focus and SPP. The Backbone region of YOLOv5s is mainly formed by stacking multiple groups of residual modules, but the residual module cannot sufficiently fuse multi-scale feature information, so an SCConv structure is introduced. SCConv is a network component that achieves the effect of enlarging the receptive field by enhancing the internal communication of the feature map without changing the model architecture. As shown in fig. 3, C × H × W is the dimension of the input feature map X; X1 and X2 are the feature maps after splitting; K1–K4 are convolution kernels; F1–F4 are the corresponding processed feature maps; r is the average-pooling down-sampling multiple; Y1 and Y2 are the feature maps output by the first and second branches, respectively; and Y is the output feature map, with dimension C × H × W.
The SCConv structure is split into 2 branches along the channel dimension: the first branch uses down-sampling to enlarge the receptive field of the feature map, while the second branch performs a conventional convolution; the channel information of the 2 branches is then combined, which increases the feature extraction and expression capability of the model. As can be seen from fig. 2 in combination with fig. 4, the Backbone structure of the improved YOLOv5s model has two additional SCConv structures compared with that of the YOLOv5s model; embedding the SCConv structures in the Backbone region of the YOLOv5s model improves the feature extraction capability of the Backbone region without significantly increasing the complexity of the YOLOv5s model.
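The two-branch idea can be illustrated with a minimal NumPy sketch of the calibration branch for a single channel. This is a schematic under stated assumptions: the K2–K4 convolutions of the real SCConv are replaced by identity maps, so only the down-sample, up-sample and sigmoid-gating data flow is shown; it is not the patent's implementation.

```python
import numpy as np

def avg_pool(x, r):
    # non-overlapping r x r average pooling on a (H, W) map
    H, W = x.shape
    return x[:H // r * r, :W // r * r].reshape(H // r, r, W // r, r).mean(axis=(1, 3))

def upsample(x, r):
    # nearest-neighbour upsampling back to the original resolution
    return np.repeat(np.repeat(x, r, axis=0), r, axis=1)

def self_calibrate(x1, r=4):
    """Calibration branch of SCConv for one channel: down-sample by r to
    enlarge the receptive field, up-sample, and apply a sigmoid gate to
    re-weight the input feature map (K2-K4 convolutions omitted)."""
    t1 = avg_pool(x1, r)
    gate = 1.0 / (1.0 + np.exp(-(x1 + upsample(t1, r))))  # weights in (0, 1)
    return x1 * gate

x1 = np.random.default_rng(0).normal(size=(76, 76))
print(self_calibrate(x1).shape)  # (76, 76)
```

The down-sampled path sees an r-times larger spatial context, which is the receptive-field enlargement the text describes; the gate then modulates the full-resolution map without changing its size.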
As can be seen from fig. 5 in conjunction with fig. 2, in S22, the Neck region and the Prediction region of the YOLOv5s model are simplified, and the 19 × 19 feature map branch in the Neck and Prediction regions is deleted. The Neck region of the YOLOv5s model aggregates features through a multi-path structure to enhance the network's feature fusion capability. Because the coal blocks and gangue are small relative to the whole image, the large-target detection performed by the 3rd branch of the Neck region becomes redundant. To improve the detection speed of the model, the Neck and Prediction regions of the YOLOv5s model are appropriately simplified: the 19 × 19 feature map branch, which has the largest receptive field and is suited to detecting larger objects, is deleted, reducing model complexity and improving real-time detection performance.
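The three prediction grids of a standard YOLOv5 head follow directly from the input resolution; assuming the usual YOLO strides of 8, 16 and 32 (a convention, not stated in the patent), this shows why the 19 × 19 grid is the large-object branch:

```python
# Grid sizes of the three YOLOv5 prediction branches for a 608 x 608 input,
# assuming the standard YOLO strides of 8, 16 and 32.
input_size = 608
strides = [8, 16, 32]
grids = [input_size // s for s in strides]
print(grids)  # [76, 38, 19]
```

The stride-32 branch produces the 19 × 19 grid with the largest receptive field per cell, which is the branch deleted in the improved model.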
According to an embodiment of the invention, in the training process of the improved YOLOv5s model, the sizes of the 6 groups of anchor boxes obtained after clustering by the K-means algorithm are (41, 63), (47, 94), (54, 69), (54, 51), (64, 84) and (64, 120), respectively. Further, the sizes of the 6 groups of anchor frames obtained after the clustering by the K-means algorithm are scaled, and the scaling formula is as follows:
x′1 = Ax1 (1)
x′6 = Bx6 (2)
x′i = x′1 + (x′6 − x′1)(xi − x1)/(x6 − x1), i = 2, …, 5 (3)
y′i = yi·x′i/xi, i = 1, 2, …, 6 (4)
wherein: xi is the width of the ith anchor box (ordered by width from small to large), i = 1, 2, …, 6; x′i is the scaled anchor box width; A is the reduction factor of the anchor box; B is the magnification factor of the anchor box (the scaling factors A and B need to be determined from the selected data set to ensure that the scaled anchor boxes cover all marker box sizes in the data set); yi is the height of the ith anchor box; y′i is the scaled anchor box height. Further, the scaled anchor box sizes are (20, 31), (39, 79), (62, 80), (62, 59), (96, 126) and (96, 180), respectively.
That is, during training, the anchor box set is generated by K-means clustering of the target bounding boxes in the data set. Since the 19 × 19 feature map branch that predicts large targets is deleted from the Neck region, the number of clustered anchor boxes is reduced from 9 groups to 6 groups. The 6 groups of anchor box sizes obtained after K-means clustering are (41, 63), (47, 94), (54, 69), (54, 51), (64, 84) and (64, 120), respectively. However, the anchor box sizes generated by K-means clustering are relatively concentrated: a considerable proportion of the real object marker boxes differ greatly in size from the clustered anchor boxes, so the clustered anchor boxes cannot adequately cover the true sizes of most marker boxes in the data set, making model convergence slow and the optimum hard to reach. Therefore, the 6 groups of anchor boxes generated by K-means clustering are linearly scaled; in the scaling formula of this embodiment, A = 0.5 and B = 1.5.
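The clustering step above can be sketched with plain k-means on (w, h) box sizes. This is a stand-in under stated assumptions: Euclidean distance is used for simplicity, whereas YOLO implementations often cluster with an IoU-based distance; the function name and synthetic data are illustrative.

```python
import numpy as np

def kmeans_anchors(boxes, k=6, iters=100, seed=0):
    """Plain k-means on (w, h) box sizes, returned sorted by width
    (Euclidean distance assumed; illustrative stand-in for the patent's
    K-means anchor clustering)."""
    rng = np.random.default_rng(seed)
    boxes = np.asarray(boxes, dtype=float)
    centers = boxes[rng.choice(len(boxes), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(boxes[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([boxes[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers[np.argsort(centers[:, 0])]  # width-sorted, as in eq. (1)

# toy usage: synthetic (w, h) samples scattered around the 6 reported centres
rng = np.random.default_rng(1)
seeds = [(41, 63), (47, 94), (54, 69), (54, 51), (64, 84), (64, 120)]
boxes = np.vstack([c + rng.normal(scale=2.0, size=(50, 2)) for c in seeds])
print(kmeans_anchors(boxes, k=6).shape)  # (6, 2)
```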
According to one embodiment of the invention, the training platform adopted in S23 is an NVIDIA GeForce GTX 2080Ti, and the inference platform is a mining intrinsically safe edge computing device with 14 TOPS of computing power; the input image size is 608 × 608 with 3 channels. During training, the momentum coefficient is set to 0.937, the weight attenuation coefficient to 0.0005 and the learning rate to 0.01; a warm-up method is adopted for updating the learning rate, the batch size is 16, and the number of training iterations is 300.
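The training settings above can be collected into a YOLOv5-style hyperparameter dictionary; the key names are common-convention assumptions, not identifiers taken from the patent.

```python
# Training hyper-parameters as reported in this embodiment (key names are
# assumed YOLOv5 conventions, not taken from the patent).
hyp = {
    "img_size": 608,         # input resolution, 3 channels
    "momentum": 0.937,       # SGD momentum coefficient
    "weight_decay": 0.0005,  # weight attenuation coefficient
    "lr0": 0.01,             # initial learning rate, updated with warm-up
    "batch_size": 16,
    "epochs": 300,           # training iterations
}
print(hyp["img_size"], hyp["epochs"])  # 608 300
```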
According to an embodiment of the present invention, in S21, 526 image samples with a native resolution of 1280 × 960 are collected; each image sample contains more than 4 pieces of coal and gangue, including stacked and occluded pieces. The training set contains 373 image samples, the verification set 77 image samples, and the test set 76 image samples.
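The 373/77/76 split of the 526 samples can be reproduced with a short helper; the random shuffle, seed and file names are illustrative assumptions.

```python
import random

def split_dataset(files, n_train=373, n_val=77, seed=42):
    """Split the sample list into training/verification/test subsets
    (the shuffle and seed are assumptions for illustration)."""
    files = list(files)
    random.Random(seed).shuffle(files)
    train = files[:n_train]
    val = files[n_train:n_train + n_val]
    test = files[n_train + n_val:]
    return train, val, test

samples = [f"img_{i:04d}.jpg" for i in range(526)]  # hypothetical file names
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))  # 373 77 76
```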
According to an embodiment of the invention, in step S21, in order to reduce manual labeling cost, an auxiliary labeling tool is used to preliminarily label the coal and gangue data set, which is then visualized with the open-source tool LabelImg, completing the production of the coal and gangue data set.
To verify the detection effect, comparison experiments were performed based on the YOLOv5s model, with the results shown in Table 1 (FPS is frames per second; mAP is mean average precision). The model size of YOLOv5s is 6.74 MB, with an mAP of 87.5% on the test set and an FPS of 30.5 frames/s. The YOLOv5s-SCC model embeds the SCConv structure in the Backbone region as the main feature extraction network; with the model size increased by only 0.26 MB and the FPS reduced by 0.9 frame/s, its mAP is 0.7% higher than that of the YOLOv5s model, indicating that the SCConv structure can improve the detection accuracy of the model. The YOLOv5s-TA model deletes the 19 × 19 feature map branch in the Neck and Prediction regions; with the model size reduced by 1.69 MB and the FPS increased by 3.2 frames/s, its mAP drops by only 0.7% compared with the YOLOv5s model, showing that this change improves the detection speed of the model. The YOLOv5s-DS model linearly scales the anchor boxes generated by K-means clustering; with the model size reduced by 1.69 MB and the FPS increased by 3.1 frames/s, its mAP drops by only 0.1% compared with the YOLOv5s model, showing that this change improves detection speed while keeping detection precision essentially stable. Compared with the YOLOv5s model, the improved YOLOv5s model is 1.57 MB smaller, its FPS is 2.1 frames/s higher, and its mAP is 1.7% higher, showing that the improved YOLOv5s model of the present application improves both detection speed and detection precision.
TABLE 1 Comparison of test results of different improved YOLOv5s models

Model            | Model size/MB | FPS/(frame·s⁻¹) | mAP/%
YOLOv5s          | 6.74          | 30.5            | 87.5
YOLOv5s-SCC      | 7.0           | 29.6            | 88.2
YOLOv5s-TA       | 5.05          | 33.7            | 86.8
YOLOv5s-DS       | 5.05          | 33.6            | 87.4
Improved YOLOv5s | 5.17          | 32.6            | 89.2
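The headline deltas quoted in the text can be checked arithmetically from the Table 1 values:

```python
# Arithmetic check of the improvements quoted from Table 1
# (baseline YOLOv5s versus the improved YOLOv5s model).
baseline = {"size_mb": 6.74, "fps": 30.5, "map": 87.5}
improved = {"size_mb": 5.17, "fps": 32.6, "map": 89.2}
d_size = round(baseline["size_mb"] - improved["size_mb"], 2)  # MB saved
d_fps = round(improved["fps"] - baseline["fps"], 1)           # frames/s gained
d_map = round(improved["map"] - baseline["map"], 1)           # mAP points gained
print(d_size, d_fps, d_map)  # 1.57 2.1 1.7
```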
With reference to fig. 8, plotting precision P on the horizontal axis and recall R on the vertical axis gives the P-R curves of YOLOv5s and the 4 improved models; the area enclosed by each P-R curve and the coordinate axes is the average detection precision. As can be seen from fig. 8, the improved YOLOv5s model has the highest detection precision and the best performance.
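The area-under-the-curve reading of average precision can be sketched with the trapezoidal rule; the sample points below are illustrative, not the patent's measured curves.

```python
def average_precision(recall, precision):
    """Area under a P-R curve by the trapezoidal rule: the 'average
    detection precision' described in the text."""
    area = 0.0
    for i in range(1, len(recall)):
        area += (recall[i] - recall[i - 1]) * (precision[i] + precision[i - 1]) / 2
    return area

r = [0.0, 0.5, 1.0]  # illustrative recall points
p = [1.0, 1.0, 0.5]  # illustrative precision points
print(average_precision(r, p))  # 0.875
```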
As can be seen from fig. 6 and 7, the YOLOv5s model in fig. 6 fails to identify the coal on the left side of the image, while the improved YOLOv5s model in fig. 7 achieves higher identification accuracy.
In conclusion, the beneficial effects of the improved YOLOv5s model are as follows: on the basis of the YOLOv5s model, a self-correcting convolution network (SCConv) is embedded into the Backbone region of the YOLOv5s model, the 19 × 19 feature map branch of the Neck and Prediction regions is deleted, and the anchor boxes obtained by K-means clustering are linearly scaled; the resulting improved YOLOv5s model is applied to coal and gangue target detection and effectively improves detection speed and detection precision. Embedding SCConv in the Backbone region of the YOLOv5s model as the feature extraction network alleviates the model's insufficient multi-scale feature extraction; deleting the 19 × 19 feature map branch of the Neck and Prediction regions effectively reduces the model size; and linearly scaling the anchor boxes obtained by K-means clustering improves the detection precision of the model. Compared with the YOLOv5s model, the improved YOLOv5s model is 1.57 MB smaller with fewer parameters, its FPS is 2.1 frames/s higher, and its mAP is 1.7% higher, showing improvements in both detection speed and detection precision.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an illustrative embodiment," "an example," "a specific example," or "some examples" or the like mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (10)

1. A coal gangue target detection method based on an improved YOLOv5s model, characterized by comprising the following steps:
s1, acquiring real-time images of coal and gangue;
s2, carrying out visual identification processing on the acquired real-time image based on the improved YOLOv5S model, thereby identifying coal and gangue in the real-time image and determining coordinate information of the gangue;
and S3, sorting the gangue from the coal by the mechanical arm according to the coordinates of the gangue.
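Steps S1-S3 amount to a capture-detect-sort loop. The following is a minimal sketch of that loop; the callables `capture_frame`, `detect`, and `sort_gangue`, and the class index convention, are hypothetical placeholders, not the patent's implementation:

```python
def run_sorting_cycle(capture_frame, detect, sort_gangue, gangue_class=1):
    """One pass of the S1-S3 loop: grab an image, run the detector,
    and hand gangue box centres to the manipulator."""
    frame = capture_frame()                       # S1: acquire real-time image
    detections = detect(frame)                    # S2: list of (cls, x, y, w, h)
    gangue_coords = [(x, y) for cls, x, y, w, h in detections
                     if cls == gangue_class]      # keep gangue detections only
    for coord in gangue_coords:
        sort_gangue(coord)                        # S3: robot arm picks at (x, y)
    return gangue_coords
```

A usage example with stubs: passing a detector that returns one coal box (class 0) and one gangue box (class 1) yields only the gangue centre.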
2. The method for detecting the gangue targets based on the improved YOLOv5S model according to claim 1, wherein S2 comprises:
S21, collecting a plurality of image samples of coal and gangue, dividing the image samples into a training set, a verification set and a test set, and labeling them, thereby completing the coal and gangue data set;
s22, improving the YOLOv5S model to obtain an improved YOLOv5S model;
s23, training the improved YOLOv5S model by utilizing a training set;
S24, validating the improved YOLOv5s model by using the verification set, so as to check whether the improved YOLOv5s model has been trained accurately;
and S25, detecting the image samples in the test set by using the trained improved YOLOv5S model, and evaluating the detection precision and the detection efficiency of the detection result of the test set.
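The S21 split can be sketched as a seeded random partition. The 373/77/76 counts come from claim 9; the seed value is an arbitrary assumption:

```python
import random

def split_dataset(samples, n_train=373, n_val=77, n_test=76, seed=42):
    """Shuffle once with a fixed seed and partition the sample list into
    train / validation / test subsets of the claimed sizes."""
    pool = list(samples)
    assert len(pool) == n_train + n_val + n_test
    random.Random(seed).shuffle(pool)
    train = pool[:n_train]
    val = pool[n_train:n_train + n_val]
    test = pool[n_train + n_val:]
    return train, val, test

# 526 samples in total, as stated in claim 9.
train, val, test = split_dataset(range(526))
```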
3. The method for detecting the gangue targets based on the improved YOLOv5s model according to claim 2, wherein in S22, two SCConv structures are embedded in the Backbone region of the YOLOv5s model: one SCConv structure is located between the first CBL module and the CSP1_1 module, and the other SCConv structure is located between the second CBL module and the first CSP1_3 module.
4. The method for detecting the gangue targets based on the improved YOLOv5s model according to claim 3, wherein in S22, the Neck region and the Prediction region of the YOLOv5s model are simplified by deleting the 19 × 19 feature map branches in the Neck region and the Prediction region.
5. The method for detecting the gangue targets based on the improved YOLOv5s model according to claim 4, wherein in the training process of the improved YOLOv5s model, the sizes of the 6 groups of anchor boxes obtained after clustering by the K-means algorithm are (41, 63), (47, 94), (54, 69), (54, 51), (64, 84) and (64, 120), respectively.
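K-means anchor clustering of the kind referenced in claim 5 can be sketched with a plain Lloyd's iteration over (width, height) pairs. This is a toy illustration: real YOLO anchor clustering commonly uses 1 − IoU rather than Euclidean distance, and YOLOv5's autoanchor additionally applies a genetic refinement.

```python
import random

def kmeans_anchors(boxes, k=6, iters=50, seed=0):
    """Cluster (w, h) box sizes into k anchor shapes with Euclidean
    Lloyd's iterations (illustrative distance metric only)."""
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for w, h in boxes:
            # assign each box to its nearest current centre
            j = min(range(k),
                    key=lambda c: (w - centers[c][0]) ** 2 + (h - centers[c][1]) ** 2)
            groups[j].append((w, h))
        # recompute each centre as the mean of its group; keep empty
        # clusters at their previous position
        centers = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g))
            if g else centers[j]
            for j, g in enumerate(groups)
        ]
    return sorted(centers)
```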
6. The method for detecting the gangue targets based on the improved YOLOv5s model according to claim 5, wherein the 6 groups of anchor frame sizes obtained after clustering by the K-means algorithm are scaled, and the scaling formulas are as follows:

x′1 = A·x1 (1)

x′6 = B·x6 (2)

x′i = x′1 + (x′6 − x′1)·(xi − x1)/(x6 − x1), i = 2, ..., 5 (3)

y′i = yi·x′i/xi, i = 1, 2, ..., 6 (4)

wherein: xi is the width of the i-th anchor frame (the anchor frames are ordered by width from small to large), i = 1, 2, ..., 6; x′i is the scaled anchor frame width; A is the reduction factor of the anchor frame; B is the magnification factor of the anchor frame; yi is the height of the i-th anchor frame; y′i is the scaled anchor frame height.
7. The method for detecting the gangue target based on the improved YOLOv5s model as claimed in claim 6, wherein the scaled anchor frame sizes are (20, 31), (39, 79), (62, 80), (62, 59), (96, 126) and (96, 180), respectively.
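Under one reading of the scaling scheme, the smallest width multiplied by a reduction factor A, the largest by a magnification factor B, intermediate widths linearly interpolated between the scaled endpoints, and each height rescaled in the same proportion as its width, the scaled anchors can be reproduced numerically. A = 20/41 and B = 1.5 are inferred from the claimed values and are not stated in the patent:

```python
def scale_anchors(anchors, A=20 / 41, B=1.5):
    """Linearly rescale K-means anchors given as (w, h) pairs sorted by
    width ascending. A and B are assumed values inferred from claim 7."""
    ws = [w for w, _ in anchors]
    w1s, w6s = A * ws[0], B * ws[-1]          # scaled endpoint widths
    scaled = []
    for w, h in anchors:
        wi = w1s + (w6s - w1s) * (w - ws[0]) / (ws[-1] - ws[0])
        # floor with a tiny guard against floating-point noise
        scaled.append((int(wi + 1e-9), int(h * wi / w + 1e-9)))
    return scaled

anchors = [(41, 63), (47, 94), (54, 69), (54, 51), (64, 84), (64, 120)]
print(scale_anchors(anchors))
# Under these assumptions the result matches the claim-7 sizes except the
# first height (30 here vs 31 in the patent), a rounding-convention gap.
```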
8. The gangue target detection method based on the improved YOLOv5s model according to claim 2, wherein the training platform adopted in S23 is an NVIDIA GeForce GTX 2080Ti, the inference platform is a mining intrinsically safe edge computing device, and the mining intrinsically safe edge computing device has 14 TOPS of computing power; the input image size is 608 × 608 and the number of channels is 3; during training, the momentum coefficient is set to 0.937, the weight attenuation coefficient is set to 0.0005, the learning rate is set to 0.01, a warm-up method is adopted for updating the learning rate, the batch size is 16, and the number of training iterations is 300.
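The hyperparameters listed in claim 8 map naturally onto a YOLOv5-style training configuration. The sketch below records them as a plain dict; the key names are illustrative and are not YOLOv5's exact hyp.yaml keys:

```python
# Training settings from claim 8 (key names are illustrative).
train_cfg = {
    "img_size": 608,         # input resolution 608 x 608
    "channels": 3,           # RGB input
    "momentum": 0.937,       # SGD momentum coefficient
    "weight_decay": 0.0005,  # weight attenuation coefficient
    "lr0": 0.01,             # initial learning rate
    "warmup": True,          # learning rate updated with a warm-up schedule
    "batch_size": 16,
    "epochs": 300,           # number of training iterations
}

print(train_cfg)
```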
9. The method for detecting the coal gangue target based on the improved YOLOv5s model according to claim 8, wherein in S21, 526 image samples with an original resolution of 1280 × 960 are collected, each image sample comprises more than 4 pieces of coal and gangue and includes cases in which coal and gangue are stacked and occlude each other; the training set comprises 373 image samples, the verification set comprises 77 image samples, and the test set comprises 76 image samples.
10. The method for detecting the target of the coal gangue based on the improved YOLOv5s model according to claim 9, wherein in S21, the coal gangue data set is preliminarily labeled by using an auxiliary labeling tool and then visually checked and refined with the open-source tool LabelImg, thereby completing the production of the coal and gangue data set.
CN202210320475.3A 2022-03-29 2022-03-29 Coal gangue target detection method based on improved YOLOv5s model Withdrawn CN114663407A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210320475.3A CN114663407A (en) 2022-03-29 2022-03-29 Coal gangue target detection method based on improved YOLOv5s model

Publications (1)

Publication Number Publication Date
CN114663407A true CN114663407A (en) 2022-06-24

Family

ID=82033282


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113435542A (en) * 2021-07-22 2021-09-24 安徽理工大学 Coal and gangue real-time detection method based on deep learning
CN113553979A (en) * 2021-07-30 2021-10-26 国电汉川发电有限公司 Safety clothing detection method and system based on improved YOLO V5
CN114120093A (en) * 2021-12-01 2022-03-01 安徽理工大学 Coal gangue target detection method based on improved YOLOv5 algorithm


Non-Patent Citations (1)

Title
SHEN Ke et al.: "Coal gangue target detection based on improved YOLOv5s model", Industry and Mine Automation (《工矿自动化》), vol. 47, no. 11, 22 November 2021 (2021-11-22), pages 0 - 4 *

Similar Documents

Publication Publication Date Title
CN109118479B (en) Capsule network-based insulator defect identification and positioning device and method
CN113052210B (en) Rapid low-light target detection method based on convolutional neural network
Mayr et al. Weakly supervised segmentation of cracks on solar cells using normalized L p norm
CN113569667B (en) Inland ship target identification method and system based on lightweight neural network model
CN110580699A (en) Pathological image cell nucleus detection method based on improved fast RCNN algorithm
CN114445706A (en) Power transmission line target detection and identification method based on feature fusion
CN110322453A (en) 3D point cloud semantic segmentation method based on position attention and auxiliary network
CN103679187B (en) Image-recognizing method and system
CN112819748B (en) Training method and device for strip steel surface defect recognition model
CN111738114B (en) Vehicle target detection method based on anchor-free accurate sampling remote sensing image
CN115953408B (en) YOLOv 7-based lightning arrester surface defect detection method
CN114972312A (en) Improved insulator defect detection method based on YOLOv4-Tiny
CN113012153A (en) Aluminum profile flaw detection method
CN115457277A (en) Intelligent pavement disease identification and detection method and system
CN116342894A (en) GIS infrared feature recognition system and method based on improved YOLOv5
CN105354547A (en) Pedestrian detection method in combination of texture and color features
CN116597411A (en) Method and system for identifying traffic sign by unmanned vehicle in extreme weather
CN111598854A (en) Complex texture small defect segmentation method based on rich robust convolution characteristic model
CN111199255A (en) Small target detection network model and detection method based on dark net53 network
CN113591973A (en) Intelligent comparison method for appearance state changes of track slabs
CN111597939A (en) High-speed rail line nest defect detection method based on deep learning
CN110889418A (en) Gas contour identification method
CN114663407A (en) Coal gangue target detection method based on improved YOLOv5s model
CN110110665A (en) The detection method of hand region under a kind of driving environment
CN111860332B (en) Dual-channel electrokinetic diagram part detection method based on multi-threshold cascade detector

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220624