CN118095314A - Magnetizing tag detection method based on deep learning - Google Patents

Magnetizing tag detection method based on deep learning

Info

Publication number
CN118095314A
Authority
CN
China
Prior art keywords
deep learning
tag
label
training
magnetizing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410227357.7A
Other languages
Chinese (zh)
Inventor
李俊峰
陈谦
郝京波
杜阳
刘玉杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aetna Polytron Technologies Inc Beijing Airport New Material Branch
Advanced Technology and Materials Co Ltd
Original Assignee
Aetna Polytron Technologies Inc Beijing Airport New Material Branch
Advanced Technology and Materials Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aetna Polytron Technologies Inc Beijing Airport New Material Branch and Advanced Technology and Materials Co Ltd
Priority to CN202410227357.7A
Publication of CN118095314A
Legal status: Pending


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a deep-learning-based method for detecting magnetizing labels. The method automatically learns and extracts the surface features of the label, handles interference from dust, improves the accuracy and stability of magnetizing label identification, and meets the application requirements of magnetizing label identification. During training, L1 regularization is added before the batch normalization layer so that unimportant parameter layers can be identified and their weights driven close to zero, which reduces model complexity and the influence of noise and interference on the detection result. Using transfer learning, a plurality of deep learning models are obtained by training from a plurality of pre-trained models, and layers with small influence factors are deleted through sparse training, so that the models focus on learning important features and their prediction accuracy and generalization ability improve. A coordinate attention mechanism introduced between the neck network and the prediction network extracts the position information of the label, so that the label can be located accurately.

Description

Magnetizing tag detection method based on deep learning
Technical Field
The invention relates to the technical field of label detection and deep learning, in particular to a magnetizing label detection method based on deep learning.
Background
Tag detection is an automated technique for identifying, reading and verifying tag information. It is an indispensable link in modern industry and commerce and is mainly used to identify, verify and manage various tags. As technology develops, tags such as RFID tags, bar code tags and magnetizing tags are used ever more widely in logistics, manufacturing, retail, medical care and other fields to track articles, manage inventory and improve production efficiency.
Currently, tag detection mainly faces the following challenges: first, as the number of tags increases, the efficiency and accuracy of manual detection cannot meet the requirements of mass production; second, some tags are difficult to identify accurately for various reasons (e.g., contamination, breakage or blurring); in addition, some special types of tags pose their own difficulties: for RFID tags, the reading distance and direction affect the accuracy of detection, and magnetizing tags are small and easily collect dust, which reduces the accuracy and stability of identification.
To address these problems, existing tag inspection techniques fall into two main categories: manual detection and automatic detection. Manual detection relies on visual inspection of the tag by a person. It is simple and easy to implement, but it is inefficient, error-prone and costly, and it is difficult to scale to a large production environment. Automatic detection relies on various sensors and machine vision technologies and mainly comprises machine vision detection, radio frequency identification (RFID) technology, and bar code and two-dimensional code identification. Machine vision uses image processing and recognition algorithms to read and verify tag information automatically; it can greatly improve the accuracy and efficiency of detection, but it requires high-specification hardware and complex software algorithms. For RFID tags, an RFID reader reads the tag information directly without a line of sight; this works well within a certain range, but it is limited by reading distance and direction and has a higher cost. Bar code and two-dimensional code labels can be identified with dedicated scanning equipment or machine vision; this approach is widely used in many scenarios, but it requires the labels to be sharp and intact.
In summary, existing tag detection methods are not well suited to magnetizing tags. With the rapid development of industries such as the internet and intelligent manufacturing, the number of tags keeps growing and the requirements on the accuracy and efficiency of tag detection keep rising, so how to detect magnetizing tags rapidly and accurately at large scale is a problem that those skilled in the art need to solve.
Disclosure of Invention
In view of the above, the present invention provides a deep-learning-based magnetizing tag detection method for detecting magnetizing tags rapidly and accurately at large scale.
Therefore, the invention provides a magnetizing tag detection method based on deep learning, which comprises the following steps:
S11: collecting tag data to be detected, and preprocessing the tag data to be detected;
S12: inputting the processed label data to be detected into a pre-trained deep learning model to realize label detection;
The training process of the deep learning model is as follows:
S21: collecting tag image data, and labeling the tag image data;
S22: preprocessing the label image data;
S23: dividing the processed label image data into a training set and a test set, inputting the training set into a deep learning network for training to obtain a deep learning model, evaluating and tuning the deep learning model with the test set, and evaluating and optimizing the deep learning model with a DIoU loss function; before training, initializing with N pre-trained models by a transfer learning method, and obtaining N deep learning models corresponding to the pre-trained models through N rounds of training, wherein N is a positive integer; the deep learning network is constructed by introducing a coordinate attention mechanism between the neck network and the prediction network and adding L1 regularization before the batch normalization layer.
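The DIoU loss named in step S23 can be sketched as follows. This is a minimal illustration using the commonly published definition (1 minus IoU, plus the squared center distance divided by the squared diagonal of the smallest enclosing box); the function name `diou_loss` and the corner-format boxes are assumptions for illustration and are not taken from the patent text.

```python
def diou_loss(box_a, box_b):
    """DIoU loss for two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area and IoU.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Squared distance between the two box centers.
    rho2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 \
         + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2

    # Squared diagonal of the smallest box that encloses both boxes.
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 \
       + (max(ay2, by2) - min(ay1, by1)) ** 2

    return 1.0 - iou + rho2 / c2
```

Unlike a plain IoU loss, the center-distance term still gives a useful training signal when the predicted box and the ground-truth box do not overlap, which matters for small targets such as the arrows on a magnetizing label.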
In a possible implementation manner, in the method for detecting a magnetizing tag based on deep learning provided by the present invention, step S11 collects tag data to be detected, and pre-processes the tag data to be detected, and specifically includes the following steps:
S111: using OpenCV to acquire a video stream of the label image to be detected, opening the video stream with the cv2.VideoCapture() function, whose argument is the video stream source; continuously acquiring each frame of the video stream in a while loop until acquisition fails or is manually interrupted;
S112: adjusting the size and resolution of the label image to be detected, and performing noise reduction processing, image distortion removal processing, contrast enhancement processing and normalization processing on the label image to be detected.
In a possible implementation manner, in the method for detecting a magnetizing tag based on deep learning provided by the present invention, step S12 inputs processed tag data to be detected into a deep learning model trained in advance, so as to implement tag detection, and specifically includes the following steps:
S121: selecting a deep learning model according to the ambient light at acquisition, the number of arrows and the size of the arrows in the input image, performing image recognition using the .pt file generated by the selected deep learning model, and marking the type and the coordinates of the label in the output image;
S122: after the detection of the label is completed, the detected label information is stored, including the coordinates, the size, the category and the credibility of the label.
In a possible implementation manner, in the method for detecting a magnetizing tag based on deep learning provided by the present invention, step S21 collects tag image data, marks the tag image data, and specifically includes the following steps:
S211: collecting a plurality of tag images under different sizes, different angles and different illumination conditions from a magnetizing scene, wherein the tag images comprise arrows in two directions;
S212: labeling the collected label image data, including the type, the position and the size of the label.
In one possible implementation manner, in the method for detecting a magnetizing tag based on deep learning provided by the present invention, step S22, preprocessing the tag image data, specifically includes: adjusting the size and resolution of the tag image, and performing noise reduction processing, image distortion removal processing, contrast enhancement processing and normalization processing on the tag image.
In one possible implementation manner, in the above deep learning-based magnetizing tag detection method provided by the present invention, in step S23, a coordinate attention mechanism is introduced between the neck network and the prediction network, which specifically includes:
The features of channel c of the input image are aggregated along the height direction h into a single value $g^h$ and along the width direction w into a single value $g^w$ by a pair of 1x1 convolution operations $F_h$ and $F_w$, each scaled by a sigmoid activation function $\sigma$:
$g^h = \sigma(F_h(f^h))$ (1)
$g^w = \sigma(F_w(f^w))$ (2)
the output of the coordinate attention mechanism is:
$y_i = \sum_{j=1}^{L} \mathrm{Attention}(i, j)\, x_j$ (3)
Wherein $L$ represents the length of the input sequence, $x_j$ represents the feature vector corresponding to position $j$ in the input sequence, and $\mathrm{Attention}(i, j)$ represents the attention weight between position $i$ and position $j$ in the input sequence.
In a possible implementation manner, in the foregoing deep learning-based detection method for a magnetizing tag provided by the present invention, in step S23, L1 regularization is added before a batch normalization layer, which specifically includes:
the L1 regularization term is:
$L_1 = k \sum_{s=1}^{n} |w_s|$
Where $k$ represents the regularization strength, controlling the contribution of the regularization term to the total loss; $n$ represents the number of weight parameters; $|w_s|$ represents the absolute value of the weight parameter at position $s$;
the parameters of the batch normalization layer are calculated as follows:
$\hat{x} = \dfrac{z_{in} - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$
$z_{out} = \alpha \hat{x} + \beta$
Where $\hat{x}$ represents the normalized input of the batch normalization layer, $z_{in}$ represents the non-normalized input, $\mu_B$ represents the mean of the batch, $\sigma_B^2$ represents the variance of the batch, $\epsilon$ is a constant, $z_{out}$ represents the output of the batch normalization layer, $\alpha$ represents the learned scaling parameter, and $\beta$ represents the learned offset parameter;
after adding L1 regularization before the batch normalization layer, the total loss function expression is:
$L = \sum_{(x, y)} l(f(x, W), y) + \lambda \sum_{\gamma} g(\gamma)$
Wherein $\sum_{(x, y)} l(f(x, W), y)$ is the network loss function and $\lambda \sum_{\gamma} g(\gamma)$ is the regularization of the scale factors; $x$ represents the training input, $y$ represents the training result, and $W$ represents the training weights; $f(x, W)$ represents the output of the deep learning model, which is the predicted value the model computes for training input $x$ with training weights $W$; $\gamma$ denotes a scale factor, $g(\gamma)$ denotes the sparsity-inducing penalty on the scale factor $\gamma$, and $\lambda$ denotes the sparsity penalty parameter that determines the magnitude of the penalty term.
In a possible implementation manner, in the above-mentioned deep learning-based magnetizing tag detection method provided by the present invention, in step S23, the deep learning network is constructed by introducing a coordinate attention mechanism between the neck network and the prediction network and adding L1 regularization before the batch normalization layer, and further includes:
the downsampling uses depthwise separable convolution and the upsampling uses transposed convolution.
In a possible implementation manner, in the method for detecting a magnetizing tag based on deep learning provided by the present invention, step S23 further includes, before training, the following steps:
S231: judging whether a missing value exists in the label image data; if yes, executing step S232 and then executing step S233; if not, executing step S233;
S232: interpolating the missing values;
S233: judging whether repeated data exist in the label image data; if yes, execute step S234 and then execute step S235; if not, executing the step S235;
S234: deleting the duplicate data;
S235: judging whether abnormal values exist in the label image data or not; if yes, executing step S236 and then entering a training process; if not, entering a training process;
S236: replacing outliers.
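Steps S231 to S236 can be sketched as three small helpers. This is a minimal illustration under stated assumptions: the data are treated as plain numeric sequences or hashable records, missing values are marked with `None`, interpolation is linear, and outliers are detected with a median absolute deviation rule; none of these choices, nor the function names, come from the patent text.

```python
from statistics import median


def fill_missing(values):
    """S232 (assumed linear): interpolate None entries from the nearest known neighbours."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:            # gap at the start: copy the next known value
            filled[i] = values[right]
        elif right is None:         # gap at the end: copy the previous known value
            filled[i] = values[left]
        else:                       # interior gap: linear interpolation
            t = (i - left) / (right - left)
            filled[i] = values[left] + t * (values[right] - values[left])
    return filled


def drop_duplicates(records):
    """S234: delete repeated records, keeping the first occurrence and the order."""
    seen = set()
    unique = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


def replace_outliers(values, k=3.0):
    """S236 (assumed rule): replace values further than k MADs from the median by the median."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return list(values)
    return [med if abs(v - med) > k * mad else v for v in values]
```

Running the helpers in the order missing values, duplicates, outliers mirrors the order of the checks in steps S231, S233 and S235.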
The deep-learning-based magnetizing tag detection method provided by the invention automatically learns and extracts features of the tag surface and handles interference factors such as dust, so that the accuracy and stability of magnetizing tag identification are significantly improved, smaller targets are identified better, and the practical requirements of magnetizing tag identification are better met. Because the training process learns, extracts and identifies tag features automatically and achieves accurate tag classification and positioning, manual intervention is reduced, missed or false detections caused by human factors become less likely, labor cost falls, and the accuracy and efficiency of tag detection improve. During training, L1 regularization is added before the batch normalization layer, and weights below a threshold in the CBS layer are reset to 0, or the channels or convolution kernels of the CBS layer whose weights are below the threshold are removed; in this way the unimportant parameter layers are identified, model complexity is reduced, the generalization ability of the model is improved, and the influence of noise and interference on the detection result is reduced. Using transfer learning, a plurality of pre-trained models are used for initialization and a plurality of deep learning models are obtained through training; layers with small influence factors are deleted through sparse training, so that the models focus on learning the truly important features and their prediction accuracy and generalization ability improve.
By introducing a coordinate attention mechanism between the neck network and the prediction network, the position information of the tag can be extracted and the tag can be located accurately. In summary, the method identifies tags accurately and quickly; optimizing the architecture and parameters of the deep learning model significantly improves the accuracy and speed of tag detection and reduces production and management costs; and the deep learning model has good extensibility and reliability, since the capacity and scale of tag detection can be expanded by adding training data or adjusting the model structure to meet changing practical requirements.
Drawings
FIG. 1 is a schematic flow chart of a training process of a deep learning model in a deep learning-based magnetizing label detection method provided by the invention;
FIG. 2 is a schematic structural diagram of a YOLO target detection network in embodiment 1 of the present invention;
FIG. 3 is a flow chart of a method for detecting a label in embodiment 1 of the present invention.
Detailed Description
The following describes in detail a specific embodiment of a detection method of a magnetizing tag based on deep learning with reference to the accompanying drawings.
Before the detection method of the magnetizing tag based on the deep learning provided by the invention is used for detecting the tag data to be detected, a deep learning model needs to be trained in advance. The training process of the deep learning model used in the present invention will be described in detail. As shown in fig. 1, the training process specifically includes the following steps:
The first step: collect the label image data and label it.
Specifically, the label image data from which the training set and the test set are drawn can be collected as follows:
(1) Collect a plurality of tag images of different sizes, at different angles and under different illumination conditions from a magnetizing scene, wherein the tag images contain arrows in two directions, namely upward and downward, so that as many variations as possible are covered;
(2) Label the collected label image data, including the type, the position and the size of the label.
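One way to record the type, position and size of a label is the normalized text format commonly used with YOLO-style detectors: a class index followed by the box center and box size, each divided by the image dimensions. The sketch below assumes that format and a hypothetical class numbering (0 for an upward arrow, 1 for a downward arrow); the patent text does not specify the annotation format.

```python
def to_yolo_label(class_id, box, image_w, image_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) into a normalized label line."""
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_w   # position, normalized to 0..1
    y_center = (y_min + y_max) / 2 / image_h
    width = (x_max - x_min) / image_w          # size, normalized to 0..1
    height = (y_max - y_min) / image_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: an upward arrow (class 0) in a 640x480 image.
line = to_yolo_label(0, (100, 50, 300, 150), 640, 480)
```

Normalized coordinates stay valid when the image is resized during preprocessing, so the annotations do not have to be rewritten.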
The second step: preprocess the label image data.
Specifically, the preprocessing process mainly includes adjusting the size and resolution of the tag image, and performing noise reduction processing, image distortion removal processing, contrast enhancement processing and normalization processing on the tag image.
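Two of the preprocessing operations, contrast enhancement and normalization, can be sketched on a grayscale image stored as a list of rows. The linear contrast stretch and the scaling of pixel values to the range 0 to 1 are assumptions chosen for illustration; the patent text does not state which enhancement or normalization method is used, and resizing, noise reduction and distortion removal are omitted here.

```python
def stretch_contrast(image):
    """Linearly stretch pixel values so that they span the full 0..255 range."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:                       # flat image: nothing to stretch
        return [row[:] for row in image]
    scale = 255.0 / (hi - lo)
    return [[(p - lo) * scale for p in row] for row in image]


def normalize(image):
    """Scale 0..255 pixel values to the range 0..1."""
    return [[p / 255.0 for p in row] for row in image]


# Hypothetical 2x2 grayscale patch with low contrast.
patch = [[50, 100], [150, 200]]
prepared = normalize(stretch_contrast(patch))
```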
The third step: divide the processed label image data into a training set and a test set, input the training set into a deep learning network for training to obtain a deep learning model, evaluate and tune the deep learning model with the test set, and evaluate and optimize the deep learning model with a DIoU loss function; before training, initialize with N pre-trained models by a transfer learning method, and obtain N deep learning models corresponding to the pre-trained models through N rounds of training, wherein N is a positive integer; the deep learning network is constructed by introducing a coordinate attention mechanism between the neck network and the prediction network and adding L1 regularization before the batch normalization layer. As shown in fig. 1, after the deep learning model is obtained through the above training process, tag detection may be performed using the obtained deep learning model.
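The division into a training set and a test set can be sketched as a seeded random split. The 80/20 ratio, the fixed seed and the function name `split_dataset` are assumptions for illustration; the patent text does not state the split ratio.

```python
import random


def split_dataset(samples, test_fraction=0.2, seed=0):
    """Shuffle the samples reproducibly and return (training set, test set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical example with ten sample identifiers.
train_set, test_set = split_dataset(range(10))
```

Fixing the seed keeps the split identical across the N training runs, so the N resulting models are evaluated on the same test images.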
Specifically, the deep learning model may be a convolutional neural network model (such as a YOLO target detection network), a recurrent neural network model, or another neural network model, which is not limited herein.
Specifically, in the training process of the deep learning model used in the invention, the deep learning network is constructed by introducing a coordinate attention mechanism between the neck network and the prediction network, which can be realized as follows: the features of channel c of the input image are aggregated along the height direction h into a single value $g^h$ and along the width direction w into a single value $g^w$ by a pair of 1x1 convolution operations $F_h$ and $F_w$, each scaled by a sigmoid activation function $\sigma$:
$g^h = \sigma(F_h(f^h))$ (1)
$g^w = \sigma(F_w(f^w))$ (2)
the output of the coordinate attention mechanism is:
$y_i = \sum_{j=1}^{L} \mathrm{Attention}(i, j)\, x_j$ (3)
Wherein $L$ represents the length of the input sequence, $x_j$ represents the feature vector corresponding to position $j$ in the input sequence, and $\mathrm{Attention}(i, j)$ represents the attention weight between position $i$ and position $j$ in the input sequence.
Introducing a coordinate attention mechanism between the neck network and the prediction network helps the model extract better features of the target object and focus on the region of interest rather than the entire image. The mechanism adjusts the attention weight of each channel using the coordinate information of specific locations. Specifically, the coordinate information of each position is first extracted from the input feature map and then mapped by several 1x1 convolution layers to a tensor with the same number of channels. This tensor is used to calculate the attention weight of each channel. With the coordinate attention mechanism, the model adapts better to targets of different sizes and positions and performs better in the target detection task. Most existing attention mechanisms for lightweight networks are modular attention mechanisms that consider only inter-channel information and ignore position information, which is critical in many computer vision tasks such as object detection and image recognition. Although some attempts have been made to extract positional attention by convolution after reducing the number of channels, convolution captures only local relationships and cannot capture long-range dependencies. Introducing the coordinate attention mechanism between the neck network and the prediction network remedies this weakness of lightweight networks in handling position information.
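The gating in equations (1) and (2) can be illustrated on a single channel. This is a simplified sketch, not the network's actual layer: the feature map is a plain list of rows, the directional features are taken to be row and column averages, each 1x1 convolution is reduced to one scalar weight (`w_h`, `w_w`, both hypothetical), and the final step multiplies each value by its row gate and column gate, which is the commonly published form of coordinate attention and is assumed here.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def coordinate_attention(x, w_h=1.0, w_w=1.0):
    """Reweight one channel x[i][j] by a height-wise gate and a width-wise gate."""
    height, width = len(x), len(x[0])
    # Directional pooling: one descriptor per row and one per column.
    f_h = [sum(row) / width for row in x]
    f_w = [sum(x[i][j] for i in range(height)) / height for j in range(width)]
    # For a single channel, a 1x1 convolution reduces to a scalar multiply.
    g_h = [sigmoid(w_h * v) for v in f_h]     # equation (1)
    g_w = [sigmoid(w_w * v) for v in f_w]     # equation (2)
    # Each position is scaled by the gate of its row and the gate of its column.
    return [[x[i][j] * g_h[i] * g_w[j] for j in range(width)] for i in range(height)]


# Hypothetical 2x3 feature map with a strong response in the second row.
feature_map = [[0.0, 0.0, 0.0], [2.0, 4.0, 2.0]]
attended = coordinate_attention(feature_map)
```

Because the gates are computed per row and per column, the output keeps track of where along each axis a response occurred, which is the position information that channel-only attention discards.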
Specifically, in the training process of the deep learning model used in the invention, the deep learning network is constructed by adding L1 regularization before the batch normalization layer, which can be realized as follows: the L1 regularization term is:
$L_1 = k \sum_{s=1}^{n} |w_s|$
Where $k$ represents the regularization strength, controlling the contribution of the regularization term to the total loss; $n$ represents the number of weight parameters; $|w_s|$ represents the absolute value of the weight parameter at position $s$;
the parameters of the batch normalization layer are calculated as follows:
$\hat{x} = \dfrac{z_{in} - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$
$z_{out} = \alpha \hat{x} + \beta$
Where $\hat{x}$ represents the normalized input of the batch normalization layer, $z_{in}$ represents the non-normalized input, $\mu_B$ represents the mean of the batch, $\sigma_B^2$ represents the variance of the batch, $\epsilon$ is a constant, $z_{out}$ represents the output of the batch normalization layer, $\alpha$ represents the learned scaling parameter, and $\beta$ represents the learned offset parameter;
after adding L1 regularization before the batch normalization layer, the total loss function expression is:
$L = \sum_{(x, y)} l(f(x, W), y) + \lambda \sum_{\gamma} g(\gamma)$
Wherein $\sum_{(x, y)} l(f(x, W), y)$ is the network loss function and $\lambda \sum_{\gamma} g(\gamma)$ is the regularization of the scale factors; $x$ represents the training input, $y$ represents the training result, and $W$ represents the training weights; $f(x, W)$ represents the output of the deep learning model, which is the predicted value the model computes for training input $x$ with training weights $W$; $\gamma$ denotes a scale factor, $g(\gamma)$ denotes the sparsity-inducing penalty on the scale factor $\gamma$, and $\lambda$ denotes the sparsity penalty parameter that determines the magnitude of the penalty term.
L1 regularization, the penalty used in Lasso regression, adds a regularization term to the loss function. The regularization term is the sum of the absolute values of the model parameters, which biases the model toward smaller parameter values and thereby simplifies it. In the invention, L1 regularization is added before the batch normalization (BN) layer: during training, a penalty term proportional to the absolute values of the model parameters is added to the normal loss function. This identifies the unimportant parameter layers, whose weights are driven close to zero, and these layers are then deleted through sparse training, which reduces model complexity, improves the generalization ability of the model and reduces the influence of noise and interference on the detection result.
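The batch normalization computation and the penalized total loss can be sketched numerically. The sketch assumes the standard batch normalization form with a small constant `eps` added to the variance, and takes the sparsity penalty g to be the absolute value, which is the L1 choice; the function names and the example numbers are illustrative only.

```python
import math


def batch_norm(z_in, alpha, beta, eps=1e-5):
    """Normalize a batch of scalars, then apply the learned scale and offset."""
    mu = sum(z_in) / len(z_in)
    var = sum((z - mu) ** 2 for z in z_in) / len(z_in)
    return [alpha * (z - mu) / math.sqrt(var + eps) + beta for z in z_in]


def total_loss(network_loss, scale_factors, lam):
    """Network loss plus lam times the L1 penalty on the scale factors."""
    return network_loss + lam * sum(abs(g) for g in scale_factors)


# Hypothetical values: three channels, one of which is already near zero.
outputs = batch_norm([1.0, 2.0, 3.0], alpha=1.0, beta=0.0)
loss = total_loss(network_loss=1.0, scale_factors=[0.5, -0.25, 0.0], lam=0.1)
```

A scale factor of zero makes the output of its channel equal to the constant offset, independent of the input, which is why channels whose scale factors are driven toward zero can be treated as unimportant.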
Specifically, before the deep learning model is trained, a transfer learning method is adopted: N pre-trained models are used for initialization, and N deep learning models corresponding to the pre-trained models are obtained through N rounds of training, where the training is sparse training. Sparse training simplifies a model by deleting or suppressing certain neurons or layers, with the goal of preserving only those most important for a particular task. During training, the importance of each layer is determined by analyzing its activation or weight distribution. Layers that contribute little or nothing to the model can then be deleted or suppressed, so that the model focuses on learning the truly important features, becomes simpler, and improves in prediction accuracy and generalization ability.
In the invention, L1 regularization is added before the batch normalization layer and model pruning is realized through sparse training, which addresses the performance bottleneck of the deep learning model. Specifically, pruning reduces the memory footprint of the model, speeds up inference and lowers power consumption, thereby improving the prediction accuracy of the model. The most direct form of pruning reduces the number of operations involved in computation, which relieves the pressure on computation and memory at its root.
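Pruning after sparse training can be sketched as a threshold test on the learned scale factors. The sketch assumes a fixed threshold and represents a layer only by its list of scale factors; the threshold value and the function names are hypothetical.

```python
def zero_small_factors(scale_factors, threshold):
    """Reset scale factors whose magnitude is below the threshold to 0."""
    return [g if abs(g) >= threshold else 0.0 for g in scale_factors]


def kept_channels(scale_factors, threshold):
    """Return the indices of the channels that survive pruning."""
    return [i for i, g in enumerate(scale_factors) if abs(g) >= threshold]


# Hypothetical scale factors of a four-channel layer after sparse training.
gammas = [0.9, 0.01, -0.5, 0.02]
masked = zero_small_factors(gammas, threshold=0.05)
kept = kept_channels(gammas, threshold=0.05)
```

Zeroing keeps the layer shape unchanged, while removing the channels listed as pruned shrinks the layer and is what reduces memory use and the number of operations.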
It should be noted that sparse training yields a plurality of deep learning models, which can be selected for different detection situations (such as the ambient light at acquisition, the number of arrows and the size of the arrows). Differences in ambient light, in the number of arrows in the magnetizing label and in the size of those arrows all affect the accuracy and efficiency of label detection. With several deep learning models available, a suitable model can be selected during subsequent label detection according to the ambient light, number of arrows and arrow size of the input image, which significantly improves the accuracy and efficiency of label detection.
In constructing the deep learning network, downsampling reduces the feature map size and enlarges the receptive field, and upsampling increases the feature map size for accurate localization. Preferably, in the training process of the deep learning model used in the invention, downsampling adopts depthwise separable convolution in place of standard convolution, which reduces the number of parameters and the computational complexity; upsampling adopts transposed convolution, which, unlike conventional upsampling (e.g., bilinear interpolation), learns the upsampling process and can therefore recover features more accurately.
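The parameter saving of a depthwise separable convolution, and the output size of a transposed convolution, follow from simple arithmetic. The sketch below ignores bias terms and assumes square kernels and a dilation of 1; the channel counts and sizes in the example are hypothetical.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise convolution followed by a 1x1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


def transposed_conv_output_size(size_in, kernel, stride, padding=0, output_padding=0):
    """Spatial output size of a transposed convolution with dilation 1."""
    return (size_in - 1) * stride - 2 * padding + kernel + output_padding


# Hypothetical layer: 64 input channels, 128 output channels, 3x3 kernel.
standard = standard_conv_params(64, 128, 3)          # 73728 weights
separable = depthwise_separable_params(64, 128, 3)   # 8768 weights
# A 2x2 transposed convolution with stride 2 doubles a 20-pixel side to 40.
upsampled = transposed_conv_output_size(20, kernel=2, stride=2)
```

For this example the separable layer needs roughly one eighth of the weights of the standard layer.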
Specifically, before training the deep learning model used in the present invention, the method may further include the following steps: judging whether a missing value exists in the label image data; if the missing value exists, continuing to judge whether repeated data exist in the label image data after interpolation at the missing position, and if the missing value does not exist, directly judging whether the repeated data exist in the label image data; if the repeated data exist, continuously judging whether the abnormal value exists in the label image data after deleting the repeated data, and if the repeated data do not exist, directly judging whether the abnormal value exists in the label image data; if an outlier exists, the outlier is replaced, and if no outlier exists, the training process is entered.
After the deep learning model is obtained through the above training process, it can be used for label detection. The deep-learning-based magnetizing tag detection method provided by the invention, taking a PyQt5-based application framework as an example, integrates data import and result display into one operation interface and can detect pictures or videos in real time to meet practical engineering requirements. It specifically comprises the following steps:
The first step: collect the label data to be detected and preprocess it.
Specifically, the acquisition and preprocessing process may be implemented as follows:
(1) Use OpenCV to acquire a video stream of the label image to be detected: open the video stream with the cv2.VideoCapture() function, whose argument is the video stream source, and continuously acquire each frame of the video stream in a while loop until acquisition fails or is manually interrupted;
(2) Adjust the size and resolution of the label image to be detected, and perform noise reduction processing, image distortion removal processing, contrast enhancement processing and normalization processing on the label image to be detected.
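The acquisition loop in item (1) can be sketched as a generator. To keep the sketch free of dependencies, it accepts any object whose `read()` method returns an `(ok, frame)` pair, which is the interface of OpenCV's `cv2.VideoCapture`; the function name `read_frames` and the optional `max_frames` limit are assumptions for illustration.

```python
def read_frames(capture, max_frames=None):
    """Yield frames until a read fails or max_frames frames have been produced."""
    count = 0
    while max_frames is None or count < max_frames:
        ok, frame = capture.read()
        if not ok:          # acquisition failed or the stream ended
            break
        yield frame
        count += 1
```

With OpenCV installed, the capture object would be created as `cv2.VideoCapture(source)`, where `source` is a camera index or a file path, and released with `capture.release()` once the loop ends. A manual interruption can be handled by breaking out of the `for` loop that consumes the generator.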
And a second step of: and inputting the processed label data to be detected into a pre-trained deep learning model to realize label detection.
Specifically, the detection process of the tag data to be detected can be implemented by the following manner:
(1) Selecting a deep learning model according to the acquisition ambient light, the number of arrows and the size of the arrows of the input image, performing image recognition based on the pt file generated by the selected deep learning model, and marking the type and the coordinates of the label in the output image;
It should be noted that a plurality of deep learning models can be obtained through sparse training, and these models can be selected for different detection situations (such as acquisition ambient light, arrow number and arrow size). Differences in the acquisition ambient light, in the number of arrows in the magnetizing label and in the size of those arrows all influence the accuracy and efficiency of the label detection result. Selecting the proper deep learning model according to the acquisition ambient light, the number of arrows and the size of the arrows of the input image can therefore remarkably improve the accuracy and efficiency of label detection;
(2) After the detection of the label is completed, the detected label information is stored, including the coordinates, the size, the category and the credibility (confidence) of the label. These results may be used for further analysis and for applications such as object tracking, path planning and environmental modeling.
The following describes in detail the implementation of the magnetizing tag detection method based on deep learning according to the present invention by means of a specific embodiment.
Example 1: the deep learning network is a YOLO target detection network (the YOLOv7_row network).
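The two sub-steps above can be sketched as a lookup from detection conditions to a weight file, followed by serialization of each detection. Everything concrete here is hypothetical: the text gives no thresholds, bucket names or file names, so the brightness cut-off of 128, the arrow-count and arrow-size buckets, and the .pt file names exist only for illustration.

```python
import json

def condition_key(brightness, arrow_count, arrow_px):
    """Bucket the acquisition conditions (thresholds are illustrative assumptions)."""
    light = "bright" if brightness >= 128 else "dim"
    count = "many" if arrow_count > 2 else "few"
    size = "large" if arrow_px >= 32 else "small"
    return (light, count, size)

def select_weights(registry, brightness, arrow_count, arrow_px, default):
    """Return the weight file registered for these conditions, else the default."""
    return registry.get(condition_key(brightness, arrow_count, arrow_px), default)

def to_record(x1, y1, x2, y2, category, confidence):
    """Collect the stored fields: coordinates, size, category and credibility."""
    return {
        "coordinates": [x1, y1, x2, y2],
        "size": [x2 - x1, y2 - y1],
        "category": category,
        "confidence": confidence,
    }

def save_records(records):
    """Serialize detections to a JSON string for later analysis."""
    return json.dumps(records, indent=2)
```

Keeping the selection rule in a plain dictionary means a model trained for a new condition can be registered without changing the detection code.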
The specific structure of YOLOv7_row network used in embodiment 1 of the present invention is shown in fig. 2:
Backbone (Backbone) module: this portion of the YOLOv7_row network is responsible for feature extraction. A series of convolution layers (CONV) and CBS modules splits the feature map into two parts, and combining them through a cross-stage hierarchical structure enhances the learning capacity. The SPP (spatial pyramid pooling) layer aggregates global context by applying multiple filters. By adjusting the parameters and structure of these components, the performance of the network can be further optimized.
The CBS module consists of a convolution layer, a BN (Batch Normalization) layer and a SiLU (Sigmoid Linear Unit) layer, and performs feature extraction and conversion tasks in the neural network.
The ELAN module consists of a series of convolutions. The advantage of this configuration is that in each branch the number of input channels is consistent with the number of output channels; only the first two 1x1 convolutions change the channel number. ELAN-H changes the arrangement of the convolutions on the basis of ELAN.
The MP layer mainly consists of max pooling and CBS modules; MP1 and MP2 differ mainly in the ratio of channel numbers. On one side, max pooling performs spatial downsampling and is followed by a 1x1 convolution that compresses the channels; on the right side, a 1x1 convolution first compresses the channels and a 3x3 convolution with a stride of 2 then completes the downsampling. Finally, an UP sampling operation (UP) combines the results of the two branches to obtain a feature map whose number of channels equals the number of input channels but whose spatial resolution is halved.
REP is a model re-parameterization technique that merges multiple calculation modules into one in the inference stage, improving the efficiency and performance of the model.
A concatenation (CAT) is an operation that joins two or more tensors along a given dimension.
The coordinate attention mechanism (CA) can extract the position information of the tag, so that the position of the tag can be accurately positioned.
Neck (Neck) module: the Neck module, labeled "PANet" in the diagram of YOLOv7_row, is the part of the neural network architecture that bridges the Backbone module and the Head module. Its role is to optimize the feature maps for object detection. The PANet module uses a series of convolution layers (CONV) and CBS modules, together with up-sampling and concatenation (Concat) operations. The convolution layers process the feature maps from the backbone. The CBS modules split and merge feature maps to enhance the feature representation. Up-sampling increases the resolution of the feature map, recovering finer detail. The concatenation combines the up-sampled feature map with the higher-resolution feature map from the earlier layers to preserve rich detail at different scales. This structure strengthens the feature hierarchy needed to detect objects of various sizes, making it suitable for multi-scale detection.
Head (Head) module: the Head is the last part of the YOLOv7_row network; it uses 2D convolution layers (CONV2D) to predict the class probabilities, objectness scores and bounding box coordinates for object detection.
The magnetizing tag detection method based on the YOLO target detection network, as shown in fig. 3, may include the following steps:
(1) Image importation
Specifically, image data is read from a camera or other sensor. OpenCV is used to obtain the current video stream: the stream is opened through the cv2.VideoCapture() function, whose incoming parameter is the video stream source. Each frame of image in the video stream is acquired continuously in a while loop until acquisition fails or is manually interrupted.
(2) Image processing
After the image is input, a series of preprocessing steps is required so that object detection performs better. These steps may include resizing the image, adjusting the resolution, reducing noise, removing image distortion, enhancing contrast and normalization, and can be implemented with image processing algorithms and software tools.
(3) Image detection
Specifically, the pt file generated by the YOLOv7_row model is used for image recognition and detection. A suitable weight file is selected according to the input data, and the category and coordinates of the target are marked in the output picture.
(4) Image preservation
After the target detection is completed, the marked picture is displayed through the Qt interface, and the detected target information is stored. Such information may include the coordinates, size, class and credibility (confidence) of the target. These results may be used for further analysis and for applications such as object tracking, path planning and environmental modeling.
The above are the main steps of the magnetizing tag detection method based on the YOLO target detection network in embodiment 1 of the present invention. The advantages of this method are analyzed below by comparing the YOLOv7_row model of embodiment 1 of the present invention with the existing YOLOv7 model and the existing YOLOv7x model.
Table 1 average accuracy and recall of three models
It can be seen from Table 1 that the YOLOv7x_ROW model in embodiment 1 of the present invention achieves the best overall tradeoff between average accuracy and recall. Specifically, the Precision value of the YOLOv7x_ROW model reaches 0.906, exceeding the Precision values of the existing YOLOv7 model and YOLOv7x model: it is improved by 5% compared with the YOLOv7 model and by 3% compared with the YOLOv7x model. In addition, the Recall value of the YOLOv7x_ROW model is 0.793, which is also improved compared with the Recall values of the existing YOLOv7 model and YOLOv7x model. It can thus be seen that the overall accuracy of the magnetizing tag detection method based on the YOLO target detection network in embodiment 1 of the present invention is significantly improved.
Table 2 mAP@0.5 values and mAP@0.5:0.95 values for the two models
The mAP@0.5 value and mAP@0.5:0.95 value of the YOLOv7x_ROW model of embodiment 1 of the present invention were compared with those of the existing YOLOv7 model, and the results are shown in Table 2. mAP@0.5 denotes the average accuracy at an IoU value of 0.5, and mAP@0.5:0.95 denotes the average accuracy over IoU values from 0.5 to 0.95. As can be seen from Table 2, compared with the existing YOLOv7 model, the mAP@0.5 value of the YOLOv7x_ROW model in embodiment 1 of the invention is improved by 5.8% and the mAP@0.5:0.95 value is improved by 5.4%, which demonstrates the effectiveness and superior performance of the magnetizing label detection method based on the YOLO target detection network in embodiment 1 of the invention in accurate label detection.
In summary, in embodiment 1 of the present invention the average accuracy (Precision) of label detection is 90.6%, the recall (Recall) is 79.3%, mAP@0.5 is 86.9% and mAP@0.5:0.95 is 54.1%. The magnetizing label detection method based on the YOLO target detection network in embodiment 1 of the present invention can therefore detect magnetizing labels rapidly and accurately, improving production efficiency and product quality.
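For reference, the two metrics compared above are conventionally computed from the counts of true positives (TP), false positives (FP) and false negatives (FN). The sketch below uses these standard definitions; the text itself does not state how its values were computed, and the function name is illustrative.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

A detector that reports few boxes can reach high precision with low recall, and the reverse, which is why the comparison looks at both values together.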
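The IoU value behind both metrics is the overlap between a predicted box and a ground-truth box divided by the area of their union. A minimal sketch follows, assuming boxes in (x1, y1, x2, y2) corner format. The 0.05 step between thresholds follows the common COCO-style convention and is an assumption, since the text only gives the range 0.5 to 0.95.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def iou_thresholds():
    """Thresholds 0.50, 0.55, ..., 0.95 (assumed step of 0.05)."""
    return [round(0.5 + 0.05 * i, 2) for i in range(10)]

def is_match(pred, truth, threshold=0.5):
    """A prediction counts as correct at a threshold if its IoU reaches it."""
    return iou(pred, truth) >= threshold
```

A prediction that passes at 0.5 can still fail at 0.95, so the value averaged over the whole range is lower and rewards tighter localization.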
A coordinate attention mechanism and a DIOU loss function were each added on the basis of the existing YOLOv7 model to obtain a YOLOv7+CA model and a YOLOv7+DIOU model. Table 3 compares the YOLOv7 model, the YOLOv7+CA model, the YOLOv7+DIOU model and the YOLOv7x_ROW model in embodiment 1 of the present invention.
Table 3
Model | Average accuracy (Precision) | Recall | mAP@0.5 | mAP@0.5:0.95
YOLOv7 | 0.859 | 0.791 | 81.1% | 48.7%
YOLOv7+CA | 0.884 | 0.811 | 83.1% | 51.1%
YOLOv7+DIOU | 0.872 | 0.776 | 86.5% | 53%
YOLOv7x_ROW | 0.906 | 0.793 | 86.9% | 54.1%
As can be seen from Table 3, compared with the existing YOLOv7 model, adding the coordinate attention mechanism improves the average accuracy, recall rate, mAP@0.5 value and mAP@0.5:0.95 value of the YOLOv7+CA model: the model pays more attention to the position information of the target, so the position and size of the target are predicted more accurately. Compared with the existing YOLOv7 model, adopting the DIOU loss function improves the average accuracy, mAP@0.5 value and mAP@0.5:0.95 value of the YOLOv7+DIOU model, while its recall rate in Table 3 (0.776) is slightly lower than that of YOLOv7 (0.791); the model is more focused on target detection, and the detection effect and accuracy are improved. Combining the two improvements, the YOLOv7x_ROW model in embodiment 1 of the invention achieves the highest average accuracy, mAP@0.5 value and mAP@0.5:0.95 value in Table 3, with a recall rate (0.793) slightly above that of YOLOv7.
Table 4
Model | Picture pixel | Model size (MB)
YOLOv7 | 1024×1024 | 138.3
YOLOv7x_ROW | 1024×1024 | 135.6
Table 4 compares the sizes of the existing YOLOv7 model and the YOLOv7x_ROW model of embodiment 1 of the present invention. As can be seen from Table 4, for pictures of the same pixel size (1024×1024) the sizes of the trained models differ: 138.3 MB for YOLOv7 and 135.6 MB for YOLOv7x_ROW, a reduction of about 2%. By using L1 regularization and sparse training, the YOLOv7x_ROW model in embodiment 1 of the invention is compressed and pruned, and can rapidly and accurately detect the magnetizing label.
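The DIOU loss referred to above is commonly defined as 1 - IoU + d^2 / c^2, where d is the distance between the centres of the predicted and ground-truth boxes and c is the diagonal of the smallest box enclosing both. The sketch below implements this standard form for a single box pair in (x1, y1, x2, y2) format; the text does not give its own formula, so this is an assumption about which variant is meant.

```python
def diou_loss(pred, target):
    """Distance-IoU loss for two boxes given as (x1, y1, x2, y2)."""
    # Plain IoU term
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_t = (target[2] - target[0]) * (target[3] - target[1])
    union = area_p + area_t - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the box centres
    pcx, pcy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tcx, tcy = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    d2 = (pcx - tcx) ** 2 + (pcy - tcy) ** 2

    # Squared diagonal of the smallest box enclosing both boxes
    cw = max(pred[2], target[2]) - min(pred[0], target[0])
    ch = max(pred[3], target[3]) - min(pred[1], target[1])
    c2 = cw ** 2 + ch ** 2

    penalty = d2 / c2 if c2 > 0 else 0.0
    return 1.0 - iou + penalty
```

Unlike a plain IoU loss, the centre-distance term still changes when the boxes do not overlap at all, so it keeps pulling a poorly placed prediction towards the target.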
In conclusion, it can be proved that the magnetizing label detection method based on the YOLO target detection network in embodiment 1 of the invention can rapidly and accurately detect the magnetizing label, and improves the production efficiency and the product quality.
According to the deep-learning-based magnetizing tag detection method provided by the embodiment of the invention, the features of the tag surface are learned and extracted automatically, and interference factors such as dust are handled effectively, so that the identification accuracy and stability of the magnetizing tag are clearly improved, smaller targets are identified better, and the practical requirements of magnetizing tag identification are better met. Because the training process learns, extracts and identifies the tag features automatically and realizes accurate tag classification and positioning, manual intervention is reduced, missed detection or false detection caused by human factors becomes less likely, labor cost is reduced, and the accuracy and efficiency of tag detection are improved. In the deep learning training process, L1 regularization is added before the batch normalization layer, and the weights below the threshold in the CBS layer are reset to 0, or the channels or convolution kernels of the CBS layer whose weights are below the threshold are removed; in this way the unimportant parameter layers can be determined, the complexity of the model is reduced, the generalization capability of the model is improved, and the influence of noise and interference on the detection result is reduced.
By adopting a transfer learning method, a plurality of pre-training models are used for initialization and a plurality of deep learning models are obtained through training; the layers with smaller influence factors are deleted through sparse training, so that the models focus on learning the truly important features, which improves their prediction accuracy and generalization capability. By introducing a coordinate attention mechanism between the neck network and the prediction network, the position information of the tag can be extracted and the tag can be positioned accurately. In conclusion, the tag detection method based on deep learning can identify tags accurately and rapidly. Optimizing the architecture and parameters of the deep learning model remarkably improves the accuracy and speed of tag detection and reduces production and management costs. The deep learning model also has good expansibility and reliability: the capacity and scale of tag detection can be expanded by adding training data or adjusting the model structure, meeting continuously changing practical requirements.
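The sparse-training idea described above can be sketched on the scale factors of one batch normalization layer. This is an illustration under assumptions rather than the training code of the embodiment: the function names, the array layout (out_channels, in_channels, kH, kW) and the threshold value in the test are hypothetical, and the penalty term shown is the usual L1 form, lambda times the sum of absolute scale factors.

```python
import numpy as np

def l1_penalty(scale_factors, lam):
    """Sparsity term added to the loss: lam * sum of |gamma| over all BN layers."""
    return lam * sum(float(np.abs(g).sum()) for g in scale_factors)

def zero_small(gamma, threshold):
    """Option 1: reset scale factors below the threshold to 0."""
    return np.where(np.abs(gamma) < threshold, 0.0, gamma)

def prune_channels(conv_weight, gamma, beta, threshold):
    """Option 2: remove the output channels whose scale factor is below the threshold.

    conv_weight has shape (out_channels, in_channels, kH, kW); gamma and beta
    are the per-channel scale and offset of the following BN layer.
    """
    keep = np.abs(gamma) >= threshold
    return conv_weight[keep], gamma[keep], beta[keep]
```

When output channels are removed this way, the matching input channels of the next convolution must be removed as well so that the tensor shapes still agree.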
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (9)

1. The detection method of the magnetizing label based on deep learning is characterized by comprising the following steps:
S11: collecting tag data to be detected, and preprocessing the tag data to be detected;
S12: inputting the processed label data to be detected into a pre-trained deep learning model to realize label detection;
The training process of the deep learning model is as follows:
S21: collecting tag image data, and labeling the tag image data;
S22: preprocessing the label image data;
S23: dividing the processed label image data into a training set and a testing set, inputting the training set into a deep learning network for training to obtain a deep learning model, calculating and optimizing the deep learning model by using the testing set, and evaluating and optimizing the deep learning model by using a DIOU loss function; before training, initializing N pre-training models by adopting a transfer learning method, and obtaining N deep learning models corresponding to the pre-training models through N times of training, wherein N is a positive integer; the deep learning network is built based on introducing a coordinate attention mechanism between the neck network and the prediction network, and adding L1 regularization before the batch normalization layer.
2. The deep learning-based magnetizing tag detection method as claimed in claim 1, wherein the step S11 of collecting tag data to be detected, and preprocessing the tag data to be detected, specifically comprises the following steps:
S111: using OpenCV to acquire a video stream of the label image to be detected, the video stream being opened through the cv2.VideoCapture() function, whose incoming parameter is the video stream source; continuously acquiring each frame of image in the video stream by using a while loop until acquisition fails or is manually interrupted;
S112: adjusting the size and resolution of the label image to be detected, and performing noise reduction processing, image distortion removal processing, contrast enhancement processing and normalization processing on the label image to be detected.
3. The method for detecting a magnetizing tag based on deep learning as claimed in claim 1, wherein the step S12 is to input the processed tag data to be detected into a pre-trained deep learning model to realize tag detection, and specifically comprises the following steps:
S121: selecting a deep learning model according to the acquisition ambient light, the number of arrows and the size of the arrows of an input image, performing image recognition based on the pt file generated by the selected deep learning model, and marking the type and the coordinates of the label in the output image;
S122: after the detection of the label is completed, the detected label information is stored, including the coordinates, the size, the category and the credibility of the label.
4. The method for detecting a magnetizing tag based on deep learning as claimed in claim 1, wherein the step S21 of collecting tag image data and labeling the tag image data comprises the following steps:
S211: collecting a plurality of tag images under different sizes, different angles and different illumination conditions from a magnetizing scene, wherein the tag images comprise arrows in two directions;
S212: labeling the collected label image data, including the type, the position and the size of the label.
5. The deep learning-based magnetizing tag detection method of claim 1, wherein the step S22 of preprocessing tag image data specifically includes: adjusting the size and resolution of the tag image, and performing noise reduction processing, image distortion removal processing, contrast enhancement processing and normalization processing on the tag image.
6. The deep learning-based magnetizing tag detection method of claim 1, wherein in step S23, a coordinate attention mechanism is introduced between the neck network and the prediction network, specifically comprising:
Using a pair of 1x1 convolution operations $F_h$ and $F_w$, the features $f_h$ of channel c of the input image along the height h direction are aggregated into a single value $g_h$, and the features $f_w$ of channel c of the input image along the width w direction are aggregated into a single value $g_w$, each scaled by a sigmoid activation function $\sigma$:
$g_h = \sigma(F_h(f_h))$ (1)
$g_w = \sigma(F_w(f_w))$ (2)
the output of the coordinate attention mechanism is:
Wherein L represents the length of the input sequence, $x_j$ represents the feature vector corresponding to position j in the input sequence, and Attention(i, j) represents the attention weight between position i and position j in the input sequence.
7. The deep learning-based magnetizing tag detection method of claim 1, wherein adding L1 regularization before the batch normalization layer in step S23 specifically comprises:
the L1 regularization expression is:
Where k represents the regularization strength, which controls the contribution of the regularization term to the total loss; n represents the number of weight parameters; $|w_s|$ represents the absolute value of the weight parameter at position s;
the parameters of the batch normalization layer are calculated as follows:
Where $\hat{x}$ represents the normalized input, $z_{in}$ represents the non-normalized input, $\mu_B$ represents the mean of the batch, $\sigma_B^2$ represents the variance of the batch, co is a constant, $z_{out}$ represents the output of the batch normalization layer, $\alpha$ represents the learned scaling parameter, $\beta$ represents the learned offset parameter, and a further symbol represents the normalized input of the batch normalization layer;
after adding L1 regularization before the batch normalization layer, the total loss function expression is:
Wherein one term is the network loss function and the other is the regularization of the scale factors; x represents the training input, y represents the training result, and W represents the training weights; f(x, W) represents the output of the deep learning model, that is, the predicted value the model computes for training input x with training weights W; $\gamma$ denotes the scale factor, $g(\gamma)$ denotes the sparsity-inducing penalty on the scale factor $\gamma$, and $\lambda$ denotes the sparsity penalty parameter that determines the magnitude of the penalty term.
8. The deep learning-based magnetizing tag detection method of claim 1, wherein in step S23, the deep learning network is constructed based on introducing a coordinate attention mechanism between the neck network and the prediction network, and adding L1 regularization before the batch normalization layer, further comprising:
the downsampling uses depth separable convolution and the upsampling uses transposed convolution.
9. The deep learning-based magnetizing tag detection method of claim 1, further comprising the following steps, prior to training, in step S23:
S231: judging whether a missing value exists in the label image data; if yes, execute step S232 and then execute step S233; if not, executing step S233;
S232: interpolating at the missing position;
S233: judging whether repeated data exist in the label image data; if yes, execute step S234 and then execute step S235; if not, executing the step S235;
S234: deleting the duplicate data;
S235: judging whether abnormal values exist in the label image data or not; if yes, executing step S236 and then entering a training process; if not, entering a training process;
S236: replacing outliers.
CN202410227357.7A 2024-02-29 2024-02-29 Magnetizing tag detection method based on deep learning Pending CN118095314A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410227357.7A CN118095314A (en) 2024-02-29 2024-02-29 Magnetizing tag detection method based on deep learning

Publications (1)

Publication Number Publication Date
CN118095314A true CN118095314A (en) 2024-05-28

Family

ID=91141631

Country Status (1)

Country Link
CN (1) CN118095314A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination