CN117689731A - Lightweight new energy heavy-duty truck battery pack identification method based on improved YOLOv5 model

Lightweight new energy heavy-duty truck battery pack identification method based on improved YOLOv5 model

Info

Publication number
CN117689731A
Authority
CN
China
Prior art keywords
output
input
battery pack
module
cfnet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410149737.3A
Other languages
Chinese (zh)
Other versions
CN117689731B (en)
Inventor
郭佳豪
晋军
刘一霏
魏雨辰
谷霄月
邓雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shaanxi Dechuang Digital Industrial Intelligent Technology Co ltd
Original Assignee
Shaanxi Dechuang Digital Industrial Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shaanxi Dechuang Digital Industrial Intelligent Technology Co ltd
Priority to CN202410149737.3A
Publication of CN117689731A
Application granted
Publication of CN117689731B
Active legal-status Current
Anticipated expiration legal-status

Links

Landscapes

  • Image Analysis (AREA)

Abstract

The invention relates to a lightweight new energy heavy-duty truck battery pack identification method based on an improved YOLOv5 model, which comprises the following steps: collecting an image dataset of the battery pack; preprocessing the image dataset to construct a battery pack dataset, and dividing the battery pack dataset into a training set, a verification set and a test set in proportion; constructing an LFNet model; repeatedly adjusting parameters of and training the LFNet model with the training set to obtain an optimal new energy heavy-duty truck battery pack detection model; and predicting the test set with the optimal detection model to identify the specific position of the battery pack. Abstract and high-level features of the battery pack are extracted by deep learning, and a deep convolutional neural network is trained repeatedly to obtain a deep learning model for battery pack identification, so that the detection precision and working efficiency can be greatly improved in the actual battery-swapping process.

Description

Lightweight new energy heavy-duty truck battery pack identification method based on improved YOLOv5 model
Technical Field
The invention relates to the field of battery pack identification, in particular to a lightweight new energy heavy-duty truck battery pack identification method based on an improved YOLOv5 model.
Background
In the traffic field, the shift toward new energy and intelligent vehicles keeps strengthening, and heavy trucks are no exception. Because a new energy heavy truck is driven by electric energy, it offers higher energy utilization and lower operating costs than a conventional heavy truck, and it is gradually becoming an important choice in industries such as ports, mining areas and logistics.
A new energy heavy truck is usually powered by a battery pack mounted on the back of the truck body; when the electric energy of the on-board battery pack is consumed to a certain extent, it must be replenished to keep the truck running. At present, battery pack swapping is one of the main ways of replenishing the energy of a new energy heavy truck: a camera at the swapping station detects the depleted battery pack, and a swapping robot replaces it with a fully charged one. Because the battery pack of a new energy heavy truck has a large capacity and a large mass, the depleted pack must be detected before swapping, after which the swapping robot responds and performs the swap. The existing detection method relies on non-contact measurement technologies such as sensors; on the one hand, sensors are easily disturbed by many factors, which reduces detection precision and creates potential safety hazards; on the other hand, the detection equipment is complex and costly, and professionals must spend a great deal of time on operation and maintenance, so efficiency is low.
Disclosure of Invention
Aiming at the above problems, the invention provides a lightweight new energy heavy-duty truck battery pack identification method based on an improved YOLOv5 model. Abstract and high-level features of the battery pack are extracted by deep learning, and a deep convolutional neural network is trained repeatedly to obtain a deep learning model for battery pack identification, so that the detection precision and working efficiency can be greatly improved in the actual battery-swapping process.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
a light new energy heavy-duty battery pack identification method based on an improved YOLOv5 model comprises the following steps:
s1, acquiring an image data set of the battery pack;
s2, preprocessing the image dataset to construct a battery pack dataset, and dividing the battery pack dataset into a training set, a verification set and a test set in proportion;
s3, constructing an LFNet (lightweight rapid detection network) model;
s4, repeatedly adjusting parameters of and training the LFNet model with the training set to obtain an optimal new energy heavy-duty truck battery pack detection model;
s5, predicting the test set with the optimal new energy heavy-duty truck battery pack detection model, and identifying the specific position of the battery pack.
Preferably, S2 specifically includes:
s21, converting the image dataset in S1 into the YOLOv5 format to obtain, for each image, a txt file containing the upper-left and lower-right corner coordinates of the battery pack;
s22, creating three sibling folders under a root folder, and creating two subfolders under each of the three sibling folders, the two subfolders storing the images and the corresponding txt files respectively, so as to form the battery pack dataset;
s23, dividing all the images and txt files in the battery pack dataset at a ratio of 8:1:1 to generate the training set, the verification set and the test set, automatically creating a classes.txt file in the label folder under the folder where each training set is located, and writing the English word "pack" into the classes.txt file.
Preferably, in S3, the LFNet model includes a backbone feature extraction network and a neck feature processing network; the input of the LFNet model is the input of the backbone feature extraction network, and the output of the backbone feature extraction network is the input of the neck feature processing network.
Preferably, the backbone feature extraction network includes four feature extraction blocks and an SPPF (spatial pyramid pooling - fast) module. Among the four feature extraction blocks, the output of each feature extraction block serves as the input of the next one; the input of the backbone feature extraction network is the input of the first feature extraction block, the output of the last feature extraction block is the input of the SPPF module, and the output of the SPPF module is the input of the neck feature processing network.
Preferably, each feature extraction block consists of a convolution combination and a CFNet (cross-stage lightweight network)-1 module; the input of the feature extraction block passes through the convolution combination, the output of the convolution combination serves as the input of the CFNet-1 module, and the output of the CFNet-1 module is the output of the feature extraction block.
Preferably, the CFNet-1 module includes a convolution combination, multiple layers of CFBlock (lightweight block) and an SCABlock (spatial-channel dual-attention block), where the output of each CFBlock layer serves as the input of the next CFBlock layer; the input of the CFNet-1 module passes through the convolution combination and then serves as the input of the first CFBlock layer, the output of the last CFBlock layer is channel-concatenated with the input of the first CFBlock layer and then serves as the input of the SCABlock, and the output of the SCABlock is the output of the CFNet-1 module;
the CFBlock applies PConv (partial convolution) to extract features from part of its input channels; the extracted features pass sequentially through two 1×1 convolutions with different channel numbers, and the sum of the output of the second 1×1 convolution and the input of the CFBlock is the output of the CFBlock.
Preferably, the SCABlock includes a CABlock (channel attention block) and a SABlock (spatial attention block); the input of the SCABlock serves as the input of both the CABlock and the SABlock, and the sum of the output of the CABlock and the output of the SABlock is the output of the SCABlock;
the input of the CABlock passes through a convolution combination and then feeds both a global average pooling and a global max pooling; the global average pooling output and the global max pooling output are added and passed through a sigmoid function, and the output of the sigmoid function is weighted onto the input of the CABlock to obtain the output of the CABlock;
the input of the SABlock passes through two parallel convolution combinations; the product of one convolution combination's output after a reshape (size change) and the other convolution combination's output after a reshape and a transpose is fed into a softmax function, and the output of the softmax function is weighted onto the input of the SABlock to obtain the output of the SABlock.
Preferably, the input of the SPPF module passes sequentially through a convolution combination and three max-pooling layers; the output of the convolution combination and the output of each max-pooling layer are channel-concatenated and then passed through a convolution combination to obtain the output of the SPPF module.
Preferably, the neck feature processing network includes an FPN (feature pyramid network) structure and a PAN (path aggregation network) structure, each containing two CFNet-2 modules; compared with the CFNet-1 module, each CFBlock layer in the CFNet-2 module has no residual edge. The output of the SPPF module serves as the input of the FPN structure: it is channel-concatenated with the output of the third feature extraction block of the backbone feature extraction network to obtain a first feature image, which is the input of the first CFNet-2 module of the FPN structure; the output of the first CFNet-2 module of the FPN structure is channel-concatenated with the output of the second feature extraction block of the backbone feature extraction network to obtain a second feature image, which is the input of the second CFNet-2 module of the FPN structure. The output of the second CFNet-2 module of the FPN structure serves as the input of the PAN structure: it is channel-concatenated with the second feature image and then fed into the first CFNet-2 module of the PAN structure; the output of the first CFNet-2 module of the PAN structure is channel-concatenated with the first feature image and then fed into the second CFNet-2 module of the PAN structure, whose output is the output of the neck feature processing network.
Preferably, the convolution combination includes a 1×1 convolution, BN normalization and a SiLU activation function; the input of the convolution combination passes sequentially through the 1×1 convolution, the BN normalization and the SiLU activation function, and the output of the SiLU activation function is the output of the convolution combination.
Compared with the prior art, the invention has the beneficial effects that:
(1) According to the invention, the CFNet module is introduced into the backbone feature extraction network and the neck feature processing network of the LFNet model, which greatly reduces the overall parameter count of the model and lowers its memory access and computing resource demands; meanwhile, the SCABlock is introduced into the backbone feature extraction network of the LFNet model, which strengthens the spatial perception capability and channel correlation of the model and greatly enhances its battery pack recognition capability. The introduction of the CFNet module and the SCABlock preserves detection precision while accelerating the inference speed of the LFNet model;
(2) The invention can automatically detect, in real time, the battery pack of a new energy heavy-duty truck after it enters the swapping station, and offers high detection precision, high detection speed, high efficiency and high safety;
(3) The invention effectively applies a deep learning model to the battery pack detection of the new energy heavy-duty truck intelligent battery swapping system, so its application scenarios are wide: it can be used in ports, mining areas, logistics and the like, and it is an effective route toward intelligent battery swapping technology.
Drawings
FIG. 1 is a flow chart of the lightweight new energy heavy-duty truck battery pack identification method based on an improved YOLOv5 model;
FIG. 2 is a diagram illustrating placement of data set files in accordance with the present invention;
FIG. 3 is a block diagram of the LFNet model of the present invention;
FIG. 4 is a block diagram of the CFNet-1 module of the present invention;
FIG. 5 is a block diagram of the CFBlock of the present invention;
FIG. 6 is a block diagram of the SCABlock module of the present invention;
FIG. 7 is a block diagram of an SPPF module of the present invention;
FIG. 8 is a graph of the final loss values of the LFNet model and existing lightweight models trained for the same number of epochs;
FIG. 9 is a graph comparing the LFNet model and the YOLOv5 models on lightweight indexes;
FIG. 10 is a visualization of the predictions of the LFNet model and the YOLOv5 model.
Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings; obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the protection scope of the present invention.
Referring to fig. 1, the invention provides a light new energy heavy-duty truck battery pack identification method based on an improved YOLOv5 model, which comprises the following steps:
s1, acquiring an image data set of the battery pack; the data set contains rich information and rules, is a basic stone for model learning, and is a material for model learning and optimization, so that the acquisition of the data set is important. The new energy heavy-duty battery pack is black in color and cubic in shape, and the battery pack is singly collected and used for training a deep learning model, so that the model detection effect is poor, and the model generalization capability is also poor. Therefore, in order to improve the detection precision of the model and enhance the generalization capability of the model, the battery pack image data are collected from different visual angles, different brightness, different distance sizes and the like, the collected battery pack image data are effectively expanded by using a data enhancement mode, mainly the operations such as fuzzy processing, sharpening processing, brightness change and the like are carried out on an image data set, and the collection and processing mode can enrich the battery pack data on one hand, so that the model obtains abundant information and insight; on the other hand, the model can be simulated and adapted to different scene changes in a training stage so as to be used in complex and changeable practical application scenes.
S2, referring to FIG. 2, preprocessing the image dataset to construct a battery pack dataset, and dividing the battery pack dataset into a training set, a verification set and a test set in proportion;
the method for preprocessing the image dataset in S1 is as follows: the image dataset is converted into the YOLOv5 format to obtain, for each image, a txt file containing the upper-left and lower-right corner coordinates of the battery pack;
the method for constructing the battery pack dataset is as follows: three sibling folders are created under a root folder, and two subfolders are created under each of the three sibling folders; the two subfolders store the images and the corresponding txt files respectively, forming the battery pack dataset;
all the images and txt files in the battery pack dataset are divided at a ratio of 8:1:1 to generate the training set, the verification set and the test set; a classes.txt file is automatically created in the label folder under the folder where each training set is located, and the English word "pack" is written into the classes.txt file.
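For illustration, a minimal sketch of this 8:1:1 split and folder layout follows; the .jpg extension, the helper name and the random seed are assumptions, not details disclosed in the patent:

```python
import random
import shutil
from pathlib import Path

def split_dataset(src_images: Path, src_labels: Path, root: Path, seed: int = 0) -> None:
    """Copy image/label pairs into train, val and test folders at an 8:1:1 ratio."""
    pairs = sorted(src_images.glob("*.jpg"))
    random.Random(seed).shuffle(pairs)
    n = len(pairs)
    splits = {"train": pairs[:int(0.8 * n)],
              "val": pairs[int(0.8 * n):int(0.9 * n)],
              "test": pairs[int(0.9 * n):]}
    for name, files in splits.items():
        img_dir, lbl_dir = root / name / "images", root / name / "labels"
        img_dir.mkdir(parents=True, exist_ok=True)
        lbl_dir.mkdir(parents=True, exist_ok=True)
        for img in files:
            shutil.copy(img, img_dir / img.name)
            shutil.copy(src_labels / (img.stem + ".txt"), lbl_dir / (img.stem + ".txt"))
        # single-class dataset: the class file holds only the English word "pack"
        (lbl_dir / "classes.txt").write_text("pack\n")
```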
S3, referring to FIG. 3, constructing the LFNet model:
the LFNet model comprises a backbone feature extraction network and a neck feature processing network; the input of the LFNet model is the input of the backbone feature extraction network, and the output of the backbone feature extraction network is the input of the neck feature processing network;
the backbone feature extraction network comprises four feature extraction blocks and an SPPF module; among the four feature extraction blocks, the output of each feature extraction block serves as the input of the next one, the input of the backbone feature extraction network is the input of the first feature extraction block, the output of the last feature extraction block is the input of the SPPF module, and the output of the SPPF module is the input of the neck feature processing network;
each feature extraction block consists of a convolution combination and a CFNet-1 module; the input of the feature extraction block passes through the convolution combination, the output of the convolution combination serves as the input of the CFNet-1 module, and the output of the CFNet-1 module is the output of the feature extraction block;
referring to FIG. 4, the CFNet-1 module includes a convolution combination, multiple layers of CFBlock and an SCABlock; the number of CFBlock layers can be set to 3, 6 and 9 in turn as the model depth increases. Among the CFBlock layers, the output of each CFBlock serves as the input of the next; every CFBlock layer has a residual edge, the residual edge being an edge added directly from the input to the output. The input of the CFNet-1 module passes through the convolution combination and then serves as the input of the first CFBlock layer; the output of the last CFBlock layer is channel-concatenated with the input of the first CFBlock layer and serves as the input of the SCABlock, and the output of the SCABlock is the output of the CFNet-1 module. The CFNet-1 module is built on the CSPNet (cross-stage partial network) module, with the original partial core module of CSPNet replaced by the CFBlock, which preserves the advantages of CSPNet's cross-stage structure while further strengthening the model's learning capacity and reducing memory cost;
the convolution combination comprises a 1×1 convolution, BN normalization and a SiLU activation function, wherein the input of the convolution combination sequentially passes through the 1×1 convolution, the BN normalization and the SiLU activation function, and the output of the SiLU activation function is the output of the convolution combination.
Referring to FIG. 5, the CFBlock applies PConv to extract features from part of its input channels; the extracted features pass sequentially through two 1×1 convolutions with different channel numbers, and the sum of the output of the second 1×1 convolution and the input of the CFBlock is the output of the CFBlock.
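For illustration, a minimal PyTorch sketch of the CFBlock follows; the PConv channel fraction (1/4) and the 2x hidden-channel expansion of the first 1×1 convolution are assumed values, and the `shortcut` flag models the residual edge (present in CFNet-1, absent in CFNet-2):

```python
import torch
import torch.nn as nn

class PConv(nn.Module):
    """Partial convolution: convolve only the first 1/ratio of the channels, pass the rest through."""
    def __init__(self, channels: int, ratio: int = 4):
        super().__init__()
        self.c_conv = channels // ratio
        self.conv = nn.Conv2d(self.c_conv, self.c_conv, kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        head, tail = x[:, :self.c_conv], x[:, self.c_conv:]
        return torch.cat((self.conv(head), tail), dim=1)

class CFBlock(nn.Module):
    """PConv followed by two 1x1 convolutions with different channel numbers, plus an optional residual edge."""
    def __init__(self, channels: int, expansion: int = 2, shortcut: bool = True):
        super().__init__()
        hidden = channels * expansion
        self.pconv = PConv(channels)
        self.conv1 = nn.Sequential(nn.Conv2d(channels, hidden, 1, bias=False),
                                   nn.BatchNorm2d(hidden), nn.SiLU())
        self.conv2 = nn.Conv2d(hidden, channels, 1, bias=False)
        self.shortcut = shortcut  # True in CFNet-1, False in CFNet-2

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.conv2(self.conv1(self.pconv(x)))
        return x + y if self.shortcut else y
```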
Referring to FIG. 6, the SCABlock includes a CABlock and a SABlock. The input of the SCABlock serves as the input of both the CABlock and the SABlock, and the sum of the output of the CABlock and the output of the SABlock is the output of the SCABlock; introducing the CABlock and the SABlock strengthens the model's perception of spatial features and also strengthens the correlation between channels within the features.
The input of the CABlock passes through a convolution combination and then feeds both a global average pooling and a global max pooling; the global average pooling output and the global max pooling output are added and passed through a sigmoid function, which outputs a weighted channel vector; the output of the sigmoid function is weighted onto the input of the CABlock to obtain the output of the CABlock, completing the recalibration of the SCABlock input in the channel dimension;
in this embodiment, an input feature map of the SCABlock is taken as an example: the input feature map enters the CABlock and passes through a convolution combination to obtain a feature map A of the same size as the input; feature map A then passes through a global average pooling layer and a global max pooling layer to obtain two global feature maps; the two global feature maps are added, and a weighted channel vector is obtained through a sigmoid function; finally, the channel vector weights are applied to the input feature map channel by channel and output by the CABlock, completing the recalibration of the input feature map in the channel dimension.
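For illustration, a minimal PyTorch sketch of the CABlock as described follows; the class name and layer hyperparameters are assumptions:

```python
import torch
import torch.nn as nn

class CABlock(nn.Module):
    """Channel attention: conv combination -> global average/max pooling -> sigmoid channel weights."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Sequential(nn.Conv2d(channels, channels, 1, bias=False),
                                  nn.BatchNorm2d(channels), nn.SiLU())
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        self.max_pool = nn.AdaptiveMaxPool2d(1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        a = self.conv(x)                                        # feature map A, same size as input
        w = torch.sigmoid(self.avg_pool(a) + self.max_pool(a))  # weighted channel vector
        return x * w                                            # recalibrate the input channel by channel
```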
The SABlock mainly consists of two parallel convolution combinations together with reshape, transpose and softmax operations. The input of the SABlock passes through the two parallel convolution combinations; the product of one convolution combination's output after a reshape and the other convolution combination's output after a reshape and a transpose is fed into a softmax function, and the output of the softmax function is weighted onto the input of the SABlock to obtain the output of the SABlock, completing the recalibration of the SCABlock input in the spatial dimension;
in this embodiment, an input feature map of the SCABlock is taken as an example: the input feature map enters the SABlock and passes through the two parallel convolution combinations to obtain a feature map A and a feature map B of the same size as the input. To let feature map A and feature map B satisfy the matrix multiplication condition and thus generate a spatial weight matrix, a reshape operation is applied to both, giving a matrix R ∈ ℝ^((H×W)×I), where H is the height, W the width, and I the number of channels of feature map A after the reshape; feature map B is additionally transposed, giving a matrix R ∈ ℝ^(I×(H×W)). The two transformed matrices are multiplied, and the product passes through a softmax function to obtain a weight feature matrix in which every point carries a certain spatial relationship; finally, the spatial-relationship weight matrix is restored to the size of the input feature map, weighted onto the input feature map and output by the SABlock, completing the recalibration of the input feature map in the spatial dimension.
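For illustration, a minimal PyTorch sketch of the SABlock and the combining SCABlock follows, reusing the CABlock class from the sketch above. The reduced channel width of the two parallel convolution combinations is an assumption, and the final weighting is implemented as a position-attention matrix product, which is one reading of the reshape-and-restore description in the text:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SABlock(nn.Module):
    """Spatial attention: two parallel conv combinations -> softmax spatial weight matrix -> weighting."""
    def __init__(self, channels: int, reduced: int = 8):
        super().__init__()
        self.conv_a = nn.Sequential(nn.Conv2d(channels, reduced, 1, bias=False),
                                    nn.BatchNorm2d(reduced), nn.SiLU())
        self.conv_b = nn.Sequential(nn.Conv2d(channels, reduced, 1, bias=False),
                                    nn.BatchNorm2d(reduced), nn.SiLU())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, h, w = x.shape
        a = self.conv_a(x).flatten(2).transpose(1, 2)      # reshape A: (N, H*W, I)
        b = self.conv_b(x).flatten(2)                      # reshape + transpose B: (N, I, H*W)
        attn = F.softmax(torch.bmm(a, b), dim=-1)          # (N, H*W, H*W) spatial weight matrix
        y = torch.bmm(attn, x.flatten(2).transpose(1, 2))  # weight the input spatially
        return y.transpose(1, 2).reshape(n, c, h, w)

class SCABlock(nn.Module):
    """Spatial-channel dual attention: element-wise sum of the CABlock and SABlock outputs."""
    def __init__(self, channels: int):
        super().__init__()
        self.ca = CABlock(channels)  # CABlock as sketched earlier
        self.sa = SABlock(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.ca(x) + self.sa(x)
```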
Referring to FIG. 7, the input of the SPPF module passes sequentially through a convolution combination and three max-pooling layers; the output of the convolution combination and the output of each max-pooling layer are channel-concatenated and then passed through a convolution combination to obtain the output of the SPPF module.
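For illustration, a minimal PyTorch sketch of the SPPF module follows; the 5×5 pooling kernel and the halved hidden width are assumptions carried over from common YOLOv5 practice, not values stated in the patent:

```python
import torch
import torch.nn as nn

class SPPF(nn.Module):
    """Conv combination -> three successive max-pooling layers -> concat of all four -> conv combination."""
    def __init__(self, c_in: int, c_out: int):
        super().__init__()
        hidden = c_in // 2
        self.conv1 = nn.Sequential(nn.Conv2d(c_in, hidden, 1, bias=False),
                                   nn.BatchNorm2d(hidden), nn.SiLU())
        self.pool = nn.MaxPool2d(kernel_size=5, stride=1, padding=2)
        self.conv2 = nn.Sequential(nn.Conv2d(hidden * 4, c_out, 1, bias=False),
                                   nn.BatchNorm2d(c_out), nn.SiLU())

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y0 = self.conv1(x)
        y1 = self.pool(y0)
        y2 = self.pool(y1)
        y3 = self.pool(y2)
        return self.conv2(torch.cat((y0, y1, y2, y3), dim=1))
```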
The neck feature processing network comprises an FPN structure and a PAN structure, each containing two CFNet-2 modules. Compared with the CFNet-1 module, each CFBlock layer in the CFNet-2 module has no residual edge; that is, in the CFNet-2 module the output of the second 1×1 convolution is directly the output of the CFBlock. The output of the SPPF module serves as the input of the FPN structure: it is channel-concatenated with the output of the third feature extraction block of the backbone feature extraction network to obtain a first feature image, which is the input of the first CFNet-2 module of the FPN structure; the output of the first CFNet-2 module of the FPN structure is channel-concatenated with the output of the second feature extraction block of the backbone feature extraction network to obtain a second feature image, which is the input of the second CFNet-2 module of the FPN structure. The output of the second CFNet-2 module of the FPN structure serves as the input of the PAN structure: it is channel-concatenated with the second feature image and then fed into the first CFNet-2 module of the PAN structure; the output of the first CFNet-2 module of the PAN structure is channel-concatenated with the first feature image and then fed into the second CFNet-2 module of the PAN structure, whose output is the output of the neck feature processing network;
in this embodiment, an input feature image of the FPN structure is taken as an example: the input feature image of the FPN structure is channel-concatenated with the output of the third feature extraction block of the backbone feature extraction network to obtain a feature image C; feature image C passes through the first CFNet-2 module of the FPN structure, which outputs a feature image D; feature image D is channel-concatenated with the output of the second feature extraction block of the backbone feature extraction network to obtain a feature image E; feature image E passes through the second CFNet-2 module of the FPN structure, which outputs a feature image F, and feature image F serves as the input of the PAN structure. The FPN structure aims to make full use of the features at every level, so that each level contains both rich semantic information and rich detail information.
The feature fusion direction of the PAN structure is opposite to that of the FPN: the input feature image F of the PAN structure is channel-concatenated with feature image E and then passes through the first CFNet-2 module of the PAN structure, which outputs a feature image G; feature image G is channel-concatenated with feature image C and serves as the input of the second CFNet-2 module of the PAN structure, and the output of the second CFNet-2 module of the PAN structure is the output of the neck feature processing network. The PAN structure aims to further cascade and integrate the feature information and improve the expression capacity of the model.
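For illustration, a minimal PyTorch sketch of the CFNet-1/CFNet-2 modules follows, reusing the CFBlock and SCABlock classes from the sketches above; the half-width inner branch is an assumption borrowed from typical cross-stage designs so that the channel concatenation restores the output width, and `shortcut=False` yields the residual-free CFNet-2 variant:

```python
import torch
import torch.nn as nn

class CFNet(nn.Module):
    """Conv combination -> n CFBlocks -> channel concat with the first CFBlock's input -> SCABlock.
    shortcut=True sketches CFNet-1; shortcut=False sketches CFNet-2 (CFBlocks without residual edges)."""
    def __init__(self, c_in: int, c_out: int, n: int = 3, shortcut: bool = True):
        super().__init__()
        hidden = c_out // 2  # assumed half-width branch so the concat restores c_out channels
        self.stem = nn.Sequential(nn.Conv2d(c_in, hidden, 1, bias=False),
                                  nn.BatchNorm2d(hidden), nn.SiLU())
        self.blocks = nn.Sequential(*[CFBlock(hidden, shortcut=shortcut) for _ in range(n)])
        self.sca = SCABlock(c_out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.stem(x)                            # input of the first CFBlock
        z = torch.cat((self.blocks(y), y), dim=1)   # last CFBlock output ++ first CFBlock input
        return self.sca(z)
```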
S4, repeatedly adjusting parameters of and training the LFNet model with the training set to obtain an optimal new energy heavy-duty truck battery pack detection model;
the pictures in the training set are input into the LFNet model, which is repeatedly tuned and retrained; in this embodiment, the loss value obtained after the model has been trained for 300 epochs and the lightweight indexes of the model (number of floating-point operations and parameter count) serve as the evaluation indexes of model performance.
The loss values obtained after training the LFNet model and the existing lightweight models (YOLOv3, PP-YOLO, YOLOv4, YOLOv5s, YOLOX-s) for 300 epochs are shown in FIG. 8. As FIG. 8 shows, the LFNet model, which uses the CFNet module and the SCABlock, converges significantly faster than the original YOLOv5 model and reaches a lower final loss value, indicating that the LFNet model of the invention infers faster and detects more accurately.
The lightweight indexes of the LFNet model and the existing lightweight models (YOLOv5s, YOLOv5n) are shown in FIG. 9, where the horizontal axis is the parameter count (Params) and the vertical axis is the number of floating-point operations (FLOPs). The LFNet-n model is the LFNet model with 1.2M parameters and the LFNet-s model is the LFNet model with 5.4M parameters; the parameter count of LFNet-n is 36.8% lower than that of YOLOv5n, the parameter count of LFNet-s is 25% lower than that of YOLOv5s, and the LFNet models also have lower FLOPs than the corresponding YOLOv5n and YOLOv5s, i.e. relatively lower computational complexity.
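Both lightweight indexes can be measured with a short PyTorch snippet; the 640×640 input size and the third-party thop profiler are assumptions, not details from the patent:

```python
import torch

def count_params_m(model: torch.nn.Module) -> float:
    """Parameter count in millions (the Params axis of FIG. 9)."""
    return sum(p.numel() for p in model.parameters()) / 1e6

# FLOPs (the vertical axis of FIG. 9) can be estimated with a profiler such as thop:
#   from thop import profile
#   flops, _ = profile(model, inputs=(torch.zeros(1, 3, 640, 640),))
```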
FIGS. 8 and 9 verify that the LFNet models are lighter than the existing lightweight models.
S5, inputting the pictures in the test set into the optimal new energy heavy-duty truck battery pack detection model; the model predicts the pictures in the test set and identifies the specific positions of the battery packs.
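For illustration, a minimal prediction-loop sketch follows; the 640×640 input size is an assumption, and box decoding/NMS are left out because they depend on the detection head, which the patent does not detail:

```python
from pathlib import Path

import cv2
import torch

def predict_folder(model: torch.nn.Module, test_dir: Path) -> None:
    """Run a trained detector over every test-set image and print the raw output shapes."""
    model.eval()
    with torch.no_grad():
        for img_path in sorted(test_dir.glob("*.jpg")):
            img = cv2.resize(cv2.imread(str(img_path)), (640, 640))  # assumed input size
            x = torch.from_numpy(img).permute(2, 0, 1).float().unsqueeze(0) / 255.0
            pred = model(x)  # raw head outputs; decoding to corner boxes is model-specific
            shapes = [tuple(p.shape) for p in pred] if isinstance(pred, (list, tuple)) else tuple(pred.shape)
            print(img_path.name, shapes)
```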
Referring to FIG. 10, to further demonstrate the performance of LFNet, this embodiment compares visualizations of the detection results of the LFNet model and the YOLOv5 model on part of the images in the battery pack dataset. The upper row of pictures in FIG. 10 visualizes the YOLOv5n detection results and the lower row visualizes the LFNet-n detection results. As can be seen from the first column from the left, the overall detection effect of LFNet-n (accuracy 0.9) is better than that of YOLOv5n (accuracy 0.8). As can be seen from the second column from the left, YOLOv5n misses a detection (only one target detected) whereas LFNet-n does not (two targets detected). As can be seen from the third column from the left, the small-target detection effect of LFNet-n (small-target accuracy 0.9) is better than that of YOLOv5n (small-target accuracy 0.8). As can be seen from the fourth column from the left, both LFNet-n and YOLOv5n can detect multiple targets, but LFNet-n detects them better; YOLOv5n also detects targets repeatedly, with many detection boxes appearing on the same target, whereas LFNet-n does not. Therefore, the LFNet model can effectively detect whole battery packs, small battery pack targets and multiple battery pack targets, without missed or repeated detections.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, and that the above embodiments and descriptions are merely illustrative of the principles of the present invention, and various changes and modifications may be made without departing from the spirit and scope of the invention, which is defined in the appended claims.

Claims (6)

1. The lightweight new energy heavy-duty truck battery pack identification method based on the improved YOLOv5 model is characterized by comprising the following steps:
s1, acquiring an image data set of the battery pack;
s2, preprocessing the image dataset to construct a battery pack dataset, and dividing the battery pack dataset into a training set, a verification set and a test set in proportion;
s3, constructing an LFNet model;
the LFNet model comprises a backbone feature extraction network and a neck feature processing network; the input of the LFNet model is the input of the backbone feature extraction network, and the output of the backbone feature extraction network is the input of the neck feature processing network;
the backbone feature extraction network comprises four feature extraction blocks and an SPPF module; among the four feature extraction blocks, the output of each feature extraction block serves as the input of the next one, the input of the backbone feature extraction network is the input of the first feature extraction block, the output of the last feature extraction block is the input of the SPPF module, and the output of the SPPF module is the input of the neck feature processing network;
each feature extraction block consists of a convolution combination and a CFNet-1 module; the input of the feature extraction block passes through the convolution combination, the output of the convolution combination serves as the input of the CFNet-1 module, and the output of the CFNet-1 module is the output of the feature extraction block;
the CFNet-1 module comprises a convolution combination, multiple layers of CFBlock and an SCABlock; among the CFBlock layers, the output of each CFBlock serves as the input of the next; the input of the CFNet-1 module passes through the convolution combination and then serves as the input of the first CFBlock layer, the output of the last CFBlock layer is channel-concatenated with the input of the first CFBlock layer and then serves as the input of the SCABlock, and the output of the SCABlock is the output of the CFNet-1 module;
the SCABlock comprises a CABlock and a SABlock; the input of the SCABlock serves as the input of both the CABlock and the SABlock, and the sum of the output of the CABlock and the output of the SABlock is the output of the SCABlock;
the neck feature processing network comprises an FPN structure and a PAN structure, each containing two CFNet-2 modules; compared with the CFNet-1 module, each CFBlock layer in the CFNet-2 module has no residual edge. The output of the SPPF module serves as the input of the FPN structure: it is channel-concatenated with the output of the third feature extraction block of the backbone feature extraction network to obtain a first feature image, which is the input of the first CFNet-2 module of the FPN structure; the output of the first CFNet-2 module of the FPN structure is channel-concatenated with the output of the second feature extraction block of the backbone feature extraction network to obtain a second feature image, which is the input of the second CFNet-2 module of the FPN structure. The output of the second CFNet-2 module of the FPN structure serves as the input of the PAN structure: it is channel-concatenated with the second feature image and then fed into the first CFNet-2 module of the PAN structure; the output of the first CFNet-2 module of the PAN structure is channel-concatenated with the first feature image and then fed into the second CFNet-2 module of the PAN structure, whose output is the output of the neck feature processing network;
s4, repeatedly adjusting parameters of and training the LFNet model with the training set to obtain an optimal new energy heavy-duty truck battery pack detection model;
s5, predicting the test set with the optimal new energy heavy-duty truck battery pack detection model, and identifying the specific position of the battery pack.
2. The method for identifying the lightweight new energy heavy-duty truck battery pack based on the improved YOLOv5 model according to claim 1, wherein S2 specifically comprises:
s21, converting the image dataset in S1 into the YOLOv5 format to obtain, for each image, a txt file containing the upper-left and lower-right corner coordinates of the battery pack;
s22, creating three sibling folders under a root folder, and creating two subfolders under each of the three sibling folders, the two subfolders storing the images and the corresponding txt files respectively, so as to form the battery pack dataset;
s23, dividing all the images and txt files in the battery pack dataset at a ratio of 8:1:1 to generate the training set, the verification set and the test set, automatically creating a classes.txt file in the label folder under the folder where each training set is located, and writing the English word "pack" into the classes.txt file.
3. The method for identifying the lightweight new energy heavy-duty truck battery pack based on the improved YOLOv5 model according to claim 1, wherein the CFBlock applies PConv to extract features from part of its input channels; the extracted features pass sequentially through two 1×1 convolutions with different channel numbers, and the sum of the output of the second 1×1 convolution and the input of the CFBlock is the output of the CFBlock.
4. The method for identifying the lightweight new energy heavy-duty truck battery pack based on the improved YOLOv5 model according to claim 1, wherein the input of the CABlock passes through a convolution combination and then feeds both a global average pooling and a global max pooling; the global average pooling output and the global max pooling output are added and passed through a sigmoid function, and the output of the sigmoid function is weighted onto the input of the CABlock to obtain the output of the CABlock;
and the input of the SABlock passes through two parallel convolution combinations; the product of one convolution combination's output after a reshape and the other convolution combination's output after a reshape and a transpose is fed into a softmax function, and the output of the softmax function is weighted onto the input of the SABlock to obtain the output of the SABlock.
5. The method for identifying the lightweight new energy heavy-duty truck battery pack based on the improved YOLOv5 model according to claim 1, wherein the input of the SPPF module passes sequentially through a convolution combination and three max-pooling layers, and the output of the convolution combination and the output of each max-pooling layer are channel-concatenated and then passed through a convolution combination to obtain the output of the SPPF module.
6. The method for identifying the lightweight new energy heavy-duty truck battery pack based on the improved YOLOv5 model according to any one of claims 1 to 5, wherein the convolution combination comprises a 1×1 convolution, BN normalization and a SiLU activation function; the input of the convolution combination passes sequentially through the 1×1 convolution, the BN normalization and the SiLU activation function, and the output of the SiLU activation function is the output of the convolution combination.
CN202410149737.3A 2024-02-02 2024-02-02 Lightweight new energy heavy-duty battery pack identification method based on improved YOLOv5 model Active CN117689731B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410149737.3A CN117689731B (en) 2024-02-02 2024-02-02 Lightweight new energy heavy-duty battery pack identification method based on improved YOLOv5 model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410149737.3A CN117689731B (en) 2024-02-02 2024-02-02 Lightweight new energy heavy-duty battery pack identification method based on improved YOLOv5 model

Publications (2)

Publication Number Publication Date
CN117689731A true CN117689731A (en) 2024-03-12
CN117689731B CN117689731B (en) 2024-04-26

Family

ID=90137509

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410149737.3A Active CN117689731B (en) 2024-02-02 2024-02-02 Lightweight new energy heavy-duty battery pack identification method based on improved YOLOv5 model

Country Status (1)

Country Link
CN (1) CN117689731B (en)


Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230351573A1 (en) * 2021-03-17 2023-11-02 Southeast University Intelligent detection method and unmanned surface vehicle for multiple type faults of near-water bridges
CN114565900A (en) * 2022-01-18 2022-05-31 广州软件应用技术研究院 Target detection method based on improved YOLOv5 and binocular stereo vision
CN115187921A (en) * 2022-05-13 2022-10-14 华南理工大学 Power transmission channel smoke detection method based on improved YOLOv3
CN114897857A (en) * 2022-05-24 2022-08-12 河北工业大学 Solar cell defect detection method based on light neural network
CN115457395A (en) * 2022-09-22 2022-12-09 南京信息工程大学 Lightweight remote sensing target detection method based on channel attention and multi-scale feature fusion
US11810366B1 (en) * 2022-09-22 2023-11-07 Zhejiang Lab Joint modeling method and apparatus for enhancing local features of pedestrians
CN116188849A (en) * 2023-02-02 2023-05-30 苏州大学 Target identification method and system based on lightweight network and sweeping robot
CN116206185A (en) * 2023-02-27 2023-06-02 山东浪潮科学研究院有限公司 Lightweight small target detection method based on improved YOLOv7
CN116343150A (en) * 2023-03-24 2023-06-27 湖南师范大学 Road sign target detection method based on improved YOLOv7
CN116681962A (en) * 2023-05-05 2023-09-01 江苏宏源电气有限责任公司 Power equipment thermal image detection method and system based on improved YOLOv5
CN116863539A (en) * 2023-07-20 2023-10-10 吴剑飞 Fall figure target detection method based on optimized YOLOv8s network structure

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
VINA: "Adding an attention mechanism to YOLOv5", Retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/573870094> *
今天炼丹了吗: "Learn-to-use-in-three-minutes series (YOLOv5) | CBAM attention mechanism, a score-boosting tool!", Retrieved from the Internet <URL:https://blog.csdn.net/StopAndGoyyy/article/details/135873724> *
卡卡猡特: "Partial Convolution explained (part 1) | a classic work in image inpainting | computation mechanism and model structure", Retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/519446359> *
嘿♚: "[YOLOv5] Detailed explanation of the Backbone, Neck and Head modules", Retrieved from the Internet <URL:https://blog.csdn.net/qq_44878985/article/details/129287587> *
小酒馆燃着灯: "YOLOv5 series (part 1): analyzing the YOLOv5 network structure (in detail)", Retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/669301460> *
ZHAO Yongqiang et al.: "A survey of deep learning object detection methods", Journal of Image and Graphics, vol. 25, no. 4, 15 April 2020 (2020-04-15) *
WEI Runchen et al.: "YOLO-Person: pedestrian detection in road areas", Computer Engineering and Applications, vol. 56, no. 19, 9 June 2020 (2020-06-09) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117975173A * 2024-04-02 2024-05-03 华侨大学 Children's cult-style picture identification method and device based on a lightweight vision transformer

Also Published As

Publication number Publication date
CN117689731B (en) 2024-04-26

Similar Documents

Publication Publication Date Title
CN117689731B (en) Lightweight new energy heavy-duty battery pack identification method based on improved YOLOv5 model
CN111860693A (en) Lightweight visual target detection method and system
CN111563507A (en) Indoor scene semantic segmentation method based on convolutional neural network
CN113052254B (en) Multi-attention ghost residual fusion classification model and classification method thereof
CN109919084B (en) Pedestrian re-identification method based on depth multi-index hash
CN114092697B (en) Building facade semantic segmentation method with attention fused with global and local depth features
CN115077556A (en) Unmanned vehicle field operation path planning method based on multi-dimensional map
CN113095251B (en) Human body posture estimation method and system
CN115100238A (en) Knowledge distillation-based light single-target tracker training method
CN113159067A (en) Fine-grained image identification method and device based on multi-grained local feature soft association aggregation
CN113870160A (en) Point cloud data processing method based on converter neural network
CN114758129A (en) RandLA-Net outdoor scene semantic segmentation method based on local feature enhancement
CN117576149A (en) Single-target tracking method based on attention mechanism
CN114120046B (en) Lightweight engineering structure crack identification method and system based on phantom convolution
CN115937594A (en) Remote sensing image classification method and device based on local and global feature fusion
CN115641469A GYOLOv5-based X-ray dangerous goods detection and identification method
CN115424012A (en) Lightweight image semantic segmentation method based on context information
CN115063352A (en) Salient object detection device and method based on multi-graph neural network collaborative learning architecture
CN113887536A (en) Multi-stage efficient crowd density estimation method based on high-level semantic guidance
CN113313030A (en) Human behavior identification method based on motion trend characteristics
CN117553807B (en) Automatic driving navigation method and system based on laser radar
CN117557857B (en) Detection network light weight method combining progressive guided distillation and structural reconstruction
CN117935031B (en) Saliency target detection method integrating mixed attention
Cong et al. Object Detection and Image Segmentation for Autonomous Vehicles
Lun et al. Dual-Branch Point Cloud Feature Learning for 3D Object Detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant