CN115294548B - Lane line detection method based on position selection and classification method in row direction - Google Patents

Lane line detection method based on position selection and classification method in row direction

Info

Publication number
CN115294548B
CN115294548B (application CN202210897544.7A)
Authority
CN
China
Prior art keywords
row
resnet
layer
data
test
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210897544.7A
Other languages
Chinese (zh)
Other versions
CN115294548A (en)
Inventor
宋永超
王璇
黄涛
阎维青
徐金东
刘兆伟
王莹洁
吕骏
赵金东
孔令甲
齐泉智
李凯强
毕季平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yantai University
Original Assignee
Yantai University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yantai University filed Critical Yantai University
Priority to CN202210897544.7A priority Critical patent/CN115294548B/en
Publication of CN115294548A publication Critical patent/CN115294548A/en
Application granted granted Critical
Publication of CN115294548B publication Critical patent/CN115294548B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a lane line detection method based on a position selection and classification method in the row direction. A ResNet-based feature extraction module extracts shallow lane-line features, a CBAM attention mechanism directs the model toward important features, an auxiliary segmentation module adds a segmentation task during training to strengthen visual features, and finally a row-anchor-based classification module divides the lane image into feature blocks and detects whether each block contains a lane line, thereby realizing lane line detection.

Description

Lane line detection method based on position selection and classification method in row direction
Technical Field
The invention relates to the technical field of intelligent traffic, in particular to a lane line detection method based on a position selection and classification method in a row direction.
Background
Detecting lane lines accurately and in real time with effective technical means is an urgent problem. Lane line detection plays a major role in autonomous driving and in assisting drivers to drive safely, and it is an important component of the intelligent transportation field.
Lane line detection methods fall mainly into traditional methods and deep-learning-based methods. Traditional methods are sensitive to scene changes and require experimental parameters to be retuned, so their robustness is relatively poor. Deep-learning-based methods can use different models with different parameters and adapt to a variety of complex environments, but they still need improvement when facing curves, vehicle occlusion, missing visual cues, and similar problems.
Disclosure of Invention
In view of this situation, and to overcome the shortcomings of the prior art, the invention provides a lane line detection method based on a position selection and classification method in the row direction. A ResNet-based feature extraction module extracts shallow lane-line features, a CBAM attention mechanism directs the model toward important features, an auxiliary segmentation module adds a segmentation task during training to strengthen visual features, and finally a row-anchor-based classification module divides the lane image into feature blocks and detects whether each block contains a lane line, thereby realizing lane line detection.
The technical scheme adopted by the invention is as follows: a lane line detection method based on a position selection and classification method in a row direction comprises a training part and a testing part;
the training part comprises the following steps:
step one, loading the depth and training parameters of the ResNet backbone network: reading, from a preset .py configuration file, the backbone network depth backbone, the batch_size, the data_root, the number of feature blocks per row grid_num, the row anchor rows row_anchor, the flag use_aux indicating whether the auxiliary segmentation module is used, and the number of lanes in the image num_lanes;
step two, loading the data set and the model: finding the training set path from the data_root of step one, acquiring the training list file train_gt.txt of the training set, resizing the data set images to 288 x 800 resolution, converting the 288 x 800 images to the Tensor data type, setting 56 fixed row anchors for the TuSimple data set, loading the model to be used, and passing the parameters acquired in step one into the model;
step three, performing shallow feature extraction on the input image with the ResNet-based feature extraction module: ResNet is adopted as the backbone network, and an identity mapping is added so that the current output is transmitted directly to the next layer; the input passes in sequence through conv, bn, relu, CBAM and the residual blocks of layers 1-4 of ResNet, and since the module uses ResNet-18 the residual block is BasicBlock; the input Tensor image is downsampled four times to extract shallow features, and the output of each residual layer is transmitted to the auxiliary segmentation module;
step four, processing the features obtained by the feature extraction module with the auxiliary segmentation module: the three layers of shallow features are processed by semantic segmentation, concatenated by tensor splicing (Concat), and upsampled after splicing, with cross entropy taken as the segmentation loss, so that the visual features are enhanced;
step five, obtaining the model evaluation parameters and saving the accuracies of the prediction results top1, top2 and top3: if the true cell is x, top1 counts a prediction as correct only when it equals x, top2 counts a prediction as correct when it is any of x-1, x or x+1, and top3 counts a prediction as correct when it falls within [x-2, x+2]; the TensorBoard test indices are then updated;
step six, obtaining the loss terms: Lsim (similarity loss), Lshp (shape loss) and Laux (segmentation loss);
the similarity loss Lsim is shown in formula (1):
L_{sim} = \sum_{i=1}^{C} \sum_{j=1}^{h-1} \left\| P_{i,j,:} - P_{i,j+1,:} \right\|_{1}    (1)
the shape loss Lshp is shown in equation (2):
L_{shp} = \sum_{i=1}^{C} \sum_{j=1}^{h-2} \left\| \left( Loc_{i,j} - Loc_{i,j+1} \right) - \left( Loc_{i,j+1} - Loc_{i,j+2} \right) \right\|_{1}    (2)
wherein the expected location Loc_{i,j} is given by formula (3), and the probability Prob_{i,j,:} of the lane at each position is given by formula (4):
Loc_{i,j} = \sum_{k=1}^{w} k \cdot Prob_{i,j,k}    (3)
Prob_{i,j,:} = \mathrm{softmax}(P_{i,j,1:w})    (4)
the segmentation loss Laux is a cross-entropy loss, as shown in formula (5):
L_{aux} = -\sum_{m,n} \sum_{c} T^{seg}_{m,n,c} \log\left( P^{seg}_{m,n,c} \right)    (5)
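For readability, the structural losses of formulas (1)-(3) can be written directly in PyTorch. The sketch below is illustrative only: the tensor layout (C lanes, h row anchors, w + 1 cells per row) and the reduction by mean are assumptions, not the exact implementation of this method.

import torch
import torch.nn.functional as F

def similarity_loss(logits):
    # logits: (C, h, w + 1) classification scores per lane, row anchor and cell
    return (logits[:, :-1, :] - logits[:, 1:, :]).abs().mean()        # formula (1): L1 between adjacent rows

def shape_loss(logits):
    prob = F.softmax(logits[:, :, :-1], dim=2)                        # formula (4), dropping the "no lane" cell
    k = torch.arange(1, prob.shape[2] + 1, dtype=prob.dtype, device=prob.device)
    loc = (prob * k).sum(dim=2)                                       # formula (3): expected cell index per (lane, row)
    first_diff = loc[:, 1:] - loc[:, :-1]
    return (first_diff[:, 1:] - first_diff[:, :-1]).abs().mean()      # formula (2): second-order difference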
step seven, saving the training parameters to a txt file, and saving the weights obtained from each training epoch;
the test part comprises the following steps:
step one, loading the depth and training parameters of the ResNet backbone network: reading, from a preset .py configuration file, the backbone network depth backbone, the batch_size, the data_root, the number of feature blocks per row grid_num, the row anchor rows row_anchor, the flag use_aux indicating whether the auxiliary segmentation module is used, and the number of lanes in the image num_lanes;
step two, loading the data set and the model: finding the test set path from the data_root of step one, acquiring the test list file test.txt of the test set, resizing the data set images to 288 x 800 resolution, converting the 288 x 800 images to the Tensor data type, and setting 56 fixed row anchors for the TuSimple data set; loading the model and weights used for the test, setting pretrained=False and use_aux=False, and loading the calibration standard test_label.json used for the test;
step three, performing shallow feature extraction on the input image with the ResNet-based feature extraction module: ResNet is adopted as the backbone network, and an identity mapping is added so that the current output is transmitted directly to the next layer; the input passes through conv, bn, relu, CBAM and the residual blocks of layers 1-4 of ResNet, and since the module uses ResNet-18 the residual block is BasicBlock; the input Tensor image is downsampled four times to extract shallow features, and the data processed by the four residual layers are transmitted to the classification module based on row anchors;
step four, processing the data transmitted by the feature extraction module with the classification module based on row anchors (Row Anchor): according to the row anchor index, a position selection and classification method in the row direction (row-based classification) is applied on the global features to detect whether each candidate point contains a lane line; the data from the previous step are upsampled, and the fully connected FC layer detects whether each feature block contains a lane line;
step five, obtaining the test accuracy by comparing the detected lane line points with the actual calibration standard, as shown in formula (6):
accuracy = \frac{\sum_{clip} C_{clip}}{\sum_{clip} S_{clip}}    (6)
where C_{clip} is the number of correctly predicted lane points in a clip and S_{clip} is the total number of ground-truth points in that clip;
further, the training part comprises the following steps:
(1) Sequentially performing conv, bn, relu, ca, sa, maxpool and other operations;
wherein conv=nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3, bias=False);
bn=norm_layer(self.inplanes);
relu=nn.ReLU(inplace=True);
ca=ChannelAttention(self.inplanes);
sa=SpatialAttention();
maxpool=nn.MaxPool2d(kernel_size=3,stride=2,padding=1);
the channel attention mechanism channel attention and the spatial attention mechanism space attention are used in series to form a CBAM attention mechanism;
(2) After these operations, the data processed by the layer2-layer4 residual blocks of the ResNet network are taken in turn by the aux-handler 2-4 layers as the input data of the auxiliary segmentation module; the aux-handler 2-4 layers are used only during training, so the computation at inference time is reduced and the inference speed is higher.
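A minimal PyTorch sketch of the serial channel-plus-spatial (CBAM) attention used after the stem convolution is given below. It is illustrative only: the reduction ratio of 16 and the 7x7 spatial kernel are common defaults, not values fixed by this description.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        # shared MLP applied to both the average-pooled and max-pooled descriptors
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False))

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return x * torch.sigmoid(avg + mx)          # channel-wise reweighting

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)
        mx, _ = torch.max(x, dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))   # spatial reweighting

Applied in series after the stem, x = sa(ca(x)), this corresponds to the ca and sa calls listed above.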
Further, step four of the training part comprises the following specific steps:
(1) Processing the image information output by the feature extraction module by using an aux-handler 2 layer, an aux-handler 3 layer and an aux-handler 4 layer respectively, wherein each aux-handler layer comprises different conv_bn_relu operations, and tensor splicing the processed data;
(2) Four conv_bn_relu operations are then performed on the spliced tensor, and a final convolution operation produces the segmented image.
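The auxiliary segmentation head can be sketched as follows. This is a hedged illustration: the channel widths (128/256/512 for layer2-layer4 of ResNet-18, 128 per branch, num_lanes + 1 output classes) are assumptions chosen to keep the example self-contained, not values prescribed here.

import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_relu(c_in, c_out, k=3):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, padding=k // 2, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True))

class AuxSegHead(nn.Module):
    def __init__(self, num_lanes=4):
        super().__init__()
        self.head2 = conv_bn_relu(128, 128)   # processes layer2 features
        self.head3 = conv_bn_relu(256, 128)   # processes layer3 features
        self.head4 = conv_bn_relu(512, 128)   # processes layer4 features
        self.combine = nn.Sequential(
            conv_bn_relu(384, 256), conv_bn_relu(256, 256),
            conv_bn_relu(256, 256), conv_bn_relu(256, 256),   # the four conv_bn_relu operations
            nn.Conv2d(256, num_lanes + 1, 1))                 # final convolution: lanes + background

    def forward(self, f2, f3, f4):
        size = f2.shape[-2:]
        x = torch.cat([self.head2(f2),
                       F.interpolate(self.head3(f3), size=size, mode='bilinear', align_corners=False),
                       F.interpolate(self.head4(f4), size=size, mode='bilinear', align_corners=False)], dim=1)
        return self.combine(x)   # trained with the cross-entropy segmentation loss of formula (5)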
Further, the third specific step in the test section is as follows:
sequentially performing the conv, bn, relu, ca, sa, maxpool and layer1-layer4 operations;
wherein conv=nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3, bias=False);
bn=norm_layer(self.inplanes);
relu=nn.ReLU(inplace=True);
ca=ChannelAttention(self.inplanes);
sa=SpatialAttention();
maxpool=nn.MaxPool2d(kernel_size=3,stride=2,padding=1);
the channel attention mechanism (channel attention) and the spatial attention mechanism (spatial attention) are used in series to form the CBAM attention mechanism;
layer1-4 are the BasicBlock residual blocks of the ResNet residual network when the ResNet-18 network is used.
Further, the specific steps of the fourth step in the test section are as follows:
upsampling through torch.nn.Conv2d and processing by the fully connected FC layer are carried out in sequence;
wherein the fully connected FC layer comprises the following operations:
torch.nn.Linear(1800, 2048)  # linear transformation function, sets up the fully connected layer;
torch.nn.ReLU();
torch.nn.Linear(2048,self.total_dim);
detecting each divided row and the feature blocks on each row with the position selection and classification method in the row direction, and detecting whether each feature block contains a lane line; lane line detection is thus converted into selecting specific row anchors on predefined rows. The detected picture is resized to 288 x 800; the height and width of the picture are H and W respectively, the number of predefined rows is h, the number of feature blocks divided on each row is w, and the number of lanes in the image is C;
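A sketch of this row-anchor classification head is given below. It is an illustration under stated assumptions: the 1800-dimensional input is taken to be the flattened pooled feature map feeding the Linear(1800, 2048) above, and grid_num = 100, 56 rows and 4 lanes are example values rather than requirements of the method.

import torch
import torch.nn as nn

class RowAnchorHead(nn.Module):
    def __init__(self, grid_num=100, num_rows=56, num_lanes=4, in_dim=1800):
        super().__init__()
        self.out_shape = (grid_num + 1, num_rows, num_lanes)            # (w + 1, h, C)
        self.cls = nn.Sequential(
            nn.Linear(in_dim, 2048),
            nn.ReLU(),
            nn.Linear(2048, (grid_num + 1) * num_rows * num_lanes))

    def forward(self, feat):
        # feat: (N, in_dim) flattened global feature; output: one (w + 1)-way
        # choice per row anchor and per lane, the extra cell meaning "no lane"
        return self.cls(feat).view(-1, *self.out_shape)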
the prediction of the lanes P_{i,j,:} is shown in formula (7):
P_{i,j,:} = f^{i,j}(X),\ \mathrm{s.t.}\ i \in [1, C],\ j \in [1, h]    (7)
the classification loss Lcls at this module is shown in equation (8):
L_{cls} = \sum_{i=1}^{C} \sum_{j=1}^{h} L_{CE}\left( P_{i,j,:}, T_{i,j,:} \right)    (8)
wherein i ranges over the number of lanes and j ranges over the number of rows set by the row anchors; here the computation for the segmentation approach is H x W x (C + 1), while for the position selection and classification method based on the row direction it is h x (w + 1) x C; because the row anchor mechanism makes h and w much smaller than H and W, this method greatly reduces the computation and naturally also provides a faster speed.
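For concreteness, with the illustrative values used elsewhere in this description (H = 288, W = 800, C = 4 lanes, h = 56 row anchors, and an assumed grid_num of w = 100 cells per row), per-pixel segmentation needs H x W x (C + 1) = 288 x 800 x 5 = 1,152,000 classifications per image, whereas the row-direction method needs h x (w + 1) x C = 56 x 101 x 4 = 22,624, roughly a fifty-fold reduction.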
The beneficial effects obtained by the invention with this structure are as follows: the lane line detection method based on the position selection and classification method in the row direction uses a ResNet-based feature extraction module, an auxiliary segmentation module, and a row-anchor-based classification module. A lane structure loss function is adopted; with context information and global information from the image, the method has a larger receptive field, and the use of global features handles missing visual cues, poor lighting, and occlusion well. The position selection and classification method in the row direction divides the image into feature blocks, so compared with the pixel-by-pixel segmentation of semantic segmentation methods it needs less computation and the model infers faster. A CBAM attention mechanism is added to the ResNet residual network, giving better test results and robustness.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a schematic diagram of a model training and testing process;
FIG. 2 is a ResNet based feature extraction module;
FIG. 3 is a channel attention mechanism;
FIG. 4 is a spatial attention mechanism;
FIG. 5 is an auxiliary segmentation module;
FIG. 6 is a classification module based on row anchors;
FIG. 7 is a method of selecting and classifying based on position in the row direction;
FIG. 8 is a graph comparing test results for ResNet with and without the attention mechanism;
FIG. 9 shows the detection effect of the model on sunny days;
FIG. 10 shows the detection effect of the model at night;
FIG. 11 shows the detection effect of a model in a complex scene.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention; all other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in FIGS. 1-11, a lane line detection method based on a position selection and classification method in the row direction includes a training part and a test part;
the training part comprises the following steps:
step one, loading the depth and training parameters of the ResNet backbone network: reading, from a preset .py configuration file, the backbone network depth backbone, the batch_size, the data_root, the number of feature blocks per row grid_num, the row anchor rows row_anchor, the flag use_aux indicating whether the auxiliary segmentation module is used, and the number of lanes in the image num_lanes;
step two, loading the data set and the model: finding the training set path from the data_root of step one, acquiring the training list file train_gt.txt of the training set, resizing the data set images to 288 x 800 resolution, converting the 288 x 800 images to the Tensor data type, setting 56 fixed row anchors for the TuSimple data set, loading the model to be used, and passing the parameters acquired in step one into the model;
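As a hedged illustration of step two, the data loading can look roughly like the sketch below. The transform chain and the row-anchor list are assumptions for this example (range(64, 288, 4) simply yields 56 evenly spaced rows on the 288-pixel-high resized input); they are not claimed to be the exact values used here.

import torchvision.transforms as transforms

img_transform = transforms.Compose([
    transforms.Resize((288, 800)),   # resize every frame to 288 x 800
    transforms.ToTensor(),           # convert to a Tensor data type in [0, 1]
])

# 56 fixed row anchors for TuSimple, expressed on the 288-pixel-high resized image
tusimple_row_anchor = list(range(64, 288, 4))   # assumed spacing; yields 56 rows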
step three, performing shallow feature extraction on the input image with the ResNet-based feature extraction module: ResNet is adopted as the backbone network, and an identity mapping is added so that the current output is transmitted directly to the next layer; the input passes in sequence through conv, bn, relu, CBAM and the residual blocks of layers 1-4 of ResNet, and since the module uses ResNet-18 the residual block is BasicBlock; the input Tensor image is downsampled four times to extract shallow features, and the output of each residual layer is transmitted to the auxiliary segmentation module;
step four, processing the features obtained by the feature extraction module with the auxiliary segmentation module: the three layers of shallow features are processed by semantic segmentation, concatenated by tensor splicing (Concat), and upsampled after splicing, with cross entropy taken as the segmentation loss, so that the visual features are enhanced;
step five, obtaining the model evaluation parameters and saving the accuracies of the prediction results top1, top2 and top3: if the true cell is x, top1 counts a prediction as correct only when it equals x, top2 counts a prediction as correct when it is any of x-1, x or x+1, and top3 counts a prediction as correct when it falls within [x-2, x+2]; the TensorBoard test indices are then updated;
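A minimal sketch of the top1/top2/top3 metrics of step five is given below, under the assumption that predictions and labels are compared as integer grid-cell indices; the helper name and tensor layout are illustrative, not the exact evaluation code of this method.

import torch

def topk_accuracy(pred_cells, gt_cells):
    # pred_cells, gt_cells: integer grid indices of shape (N,)
    diff = (pred_cells - gt_cells).abs()
    top1 = (diff == 0).float().mean().item()   # exactly the true cell x
    top2 = (diff <= 1).float().mean().item()   # any of x-1, x, x+1
    top3 = (diff <= 2).float().mean().item()   # within [x-2, x+2]
    return top1, top2, top3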
step six, obtaining the loss terms: Lsim (similarity loss), Lshp (shape loss) and Laux (segmentation loss);
the similarity loss Lsim is shown in formula (1):
L_{sim} = \sum_{i=1}^{C} \sum_{j=1}^{h-1} \left\| P_{i,j,:} - P_{i,j+1,:} \right\|_{1}    (1)
the shape loss Lshp is shown in formula (2):
L_{shp} = \sum_{i=1}^{C} \sum_{j=1}^{h-2} \left\| \left( Loc_{i,j} - Loc_{i,j+1} \right) - \left( Loc_{i,j+1} - Loc_{i,j+2} \right) \right\|_{1}    (2)
Wherein the expected location Loc_{i,j} is given by formula (3), and the probability Prob_{i,j,:} of the lane at each position is given by formula (4):
Loc_{i,j} = \sum_{k=1}^{w} k \cdot Prob_{i,j,k}    (3)
Prob_{i,j,:} = \mathrm{softmax}(P_{i,j,1:w})    (4)
the segmentation loss Laux is a cross-entropy loss, as shown in formula (5):
L_{aux} = -\sum_{m,n} \sum_{c} T^{seg}_{m,n,c} \log\left( P^{seg}_{m,n,c} \right)    (5)
step seven, saving the training parameters to a txt file, and saving the weights obtained from each training epoch;
the test part comprises the following steps:
step one, loading the depth and training parameters of the ResNet backbone network: reading, from a preset .py configuration file, the backbone network depth backbone, the batch_size, the data_root, the number of feature blocks per row grid_num, the row anchor rows row_anchor, the flag use_aux indicating whether the auxiliary segmentation module is used, and the number of lanes in the image num_lanes;
step two, loading the data set and the model: finding the test set path from the data_root of step one, acquiring the test list file test.txt of the test set, resizing the data set images to 288 x 800 resolution, converting the 288 x 800 images to the Tensor data type, and setting 56 fixed row anchors for the TuSimple data set; loading the model and weights used for the test, setting pretrained=False and use_aux=False, and loading the calibration standard test_label.json used for the test;
step three, performing shallow feature extraction on the input image with the ResNet-based feature extraction module: ResNet is adopted as the backbone network, and an identity mapping is added so that the current output is transmitted directly to the next layer; the input passes through conv, bn, relu, CBAM and the residual blocks of layers 1-4 of ResNet, and since the module uses ResNet-18 the residual block is BasicBlock; the input Tensor image is downsampled four times to extract shallow features, and the data processed by the four residual layers are transmitted to the classification module based on row anchors;
step four, processing the data transmitted by the feature extraction module with the classification module based on row anchors (Row Anchor): according to the row anchor index, a position selection and classification method in the row direction (row-based classification) is applied on the global features to detect whether each candidate point contains a lane line; the data from the previous step are upsampled, and the fully connected FC layer detects whether each feature block contains a lane line;
step five, obtaining the test accuracy by comparing the detected lane line points with the actual calibration standard, as shown in formula (6):
accuracy = \frac{\sum_{clip} C_{clip}}{\sum_{clip} S_{clip}}    (6)
where C_{clip} is the number of correctly predicted lane points in a clip and S_{clip} is the total number of ground-truth points in that clip;
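A hedged sketch of how formula (6) can be evaluated is shown below; the 20-pixel matching threshold and the use of -2 as the "no lane" marker follow common TuSimple conventions and are assumptions, not values fixed by this description.

def clip_accuracy(pred_x, gt_x, threshold=20):
    # pred_x, gt_x: x-coordinates on the same row anchors for one lane; -2 marks "no lane"
    correct = sum(1 for p, g in zip(pred_x, gt_x) if g != -2 and abs(p - g) < threshold)
    total = sum(1 for g in gt_x if g != -2)
    return correct, total        # contributions to C_clip and S_clip

def dataset_accuracy(clips):
    # clips: iterable of (pred_x, gt_x) pairs; implements sum(C_clip) / sum(S_clip)
    counts = [clip_accuracy(p, g) for p, g in clips]
    return sum(c for c, _ in counts) / max(sum(s for _, s in counts), 1)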
Step three of the training part comprises the following steps:
(1) Sequentially performing conv, bn, relu, ca, sa, maxpool and other operations;
wherein conv=nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3, bias=False);
bn=norm_layer(self.inplanes);
relu=nn.ReLU(inplace=True);
ca=ChannelAttention(self.inplanes);
sa=SpatialAttention();
maxpool=nn.MaxPool2d(kernel_size=3,stride=2,padding=1);
the channel attention mechanism (channel attention) and the spatial attention mechanism (spatial attention) are used in series to form the CBAM attention mechanism;
(2) After these operations, the data processed by the layer2-layer4 residual blocks of the ResNet network are taken in turn by the aux-handler 2-4 layers as the input data of the auxiliary segmentation module; the aux-handler 2-4 layers are used only during training, so the computation at inference time is reduced and the inference speed is higher.
Step four of the training part comprises the following specific steps:
(1) Processing the image information output by the feature extraction module by using an aux-handler 2 layer, an aux-handler 3 layer and an aux-handler 4 layer respectively, wherein each aux-handler layer comprises different conv_bn_relu operations, and tensor splicing the processed data;
(2) Four conv_bn_relu operations are then performed on the spliced tensor, and a final convolution operation produces the segmented image.
The third specific step in the test section is as follows:
sequentially performing the conv, bn, relu, ca, sa, maxpool and layer1-layer4 operations;
wherein conv=nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3, bias=False);
bn=norm_layer(self.inplanes);
relu=nn.ReLU(inplace=True);
ca=ChannelAttention(self.inplanes);
sa=SpatialAttention();
maxpool=nn.MaxPool2d(kernel_size=3,stride=2,padding=1);
the channel attention mechanism (channel attention) and the spatial attention mechanism (spatial attention) are used in series to form the CBAM attention mechanism;
layer1-4 are the BasicBlock residual blocks of the ResNet residual network when the ResNet-18 network is used.
The specific steps of the fourth step in the test part are as follows:
upsampling through torch.nn.Conv2d and processing by the fully connected FC layer are carried out in sequence;
wherein the fully connected FC layer comprises the following operations:
torch.nn.Linear(1800, 2048)  # linear transformation function, sets up the fully connected layer;
torch.nn.ReLU();
torch.nn.Linear(2048,self.total_dim);
detecting each divided row and the feature blocks on each row with the position selection and classification method in the row direction, and detecting whether each feature block contains a lane line; lane line detection is thus converted into selecting specific row anchors on predefined rows. The detected picture is resized to 288 x 800; the height and width of the picture are H and W respectively, the number of predefined rows is h, the number of feature blocks divided on each row is w, and the number of lanes in the image is C;
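At inference time the (w + 1, h, C) output has to be turned back into lane points. The sketch below shows one hedged way to do this, following the expectation of formula (3); the scaling from cell index to pixel column assumes the 288 x 800 network input and is illustrative rather than the exact post-processing of this method.

import torch

def decode_lanes(out, grid_num=100, img_w=800):
    # out: (w + 1, h, C) raw scores for one image
    prob = torch.softmax(out[:-1, :, :], dim=0)                  # softmax over the w location cells
    idx = torch.arange(1, grid_num + 1, dtype=prob.dtype, device=prob.device).view(-1, 1, 1)
    loc = (prob * idx).sum(dim=0)                                # expected cell per (row anchor, lane)
    no_lane = out.argmax(dim=0) == grid_num                      # the extra (w + 1)-th cell won
    x = loc * img_w / grid_num                                   # cell index -> pixel column
    x[no_lane] = -1                                              # mark rows where this lane is absent
    return x                                                     # (h, C) x-coordinates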
the prediction of the lanes P_{i,j,:} is shown in formula (7):
P_{i,j,:} = f^{i,j}(X),\ \mathrm{s.t.}\ i \in [1, C],\ j \in [1, h]    (7)
the classification loss Lcls at this module is shown in equation (8):
L_{cls} = \sum_{i=1}^{C} \sum_{j=1}^{h} L_{CE}\left( P_{i,j,:}, T_{i,j,:} \right)    (8)
wherein i ranges over the number of lanes and j ranges over the number of rows set by the row anchors; here the computation for the segmentation approach is H x W x (C + 1), while for the position selection and classification method based on the row direction it is h x (w + 1) x C; because the row anchor mechanism makes h and w much smaller than H and W, this method greatly reduces the computation and naturally also provides a faster speed.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (5)

1. A lane line detection method based on a position selection and classification method in a row direction is characterized by comprising a training part and a testing part;
the training part comprises the following steps:
step one, loading the depth and training parameters of the ResNet backbone network: reading, from a preset .py configuration file, the backbone network depth backbone, the batch_size, the data_root, the number of feature blocks per row grid_num, the row anchor rows row_anchor, the flag use_aux indicating whether the auxiliary segmentation module is used, and the number of lanes in the image num_lanes;
step two, loading the data set and the model: finding the training set path from the data_root of step one, acquiring the training list file train_gt.txt of the training set, resizing the data set images to 288 x 800 resolution, converting the 288 x 800 images to the Tensor data type, setting 56 fixed row anchors for the TuSimple data set, loading the model to be used, and passing the parameters acquired in step one into the model;
step three, performing shallow feature extraction on the input image with the ResNet-based feature extraction module: ResNet is adopted as the backbone network, and an identity mapping is added so that the current output is transmitted directly to the next layer; the input passes in sequence through conv, bn, relu, CBAM and the residual blocks of layers 1-4 of ResNet, and since the module uses ResNet-18 the residual block is BasicBlock; the input Tensor image is downsampled four times to extract shallow features, and the output of each residual layer is transmitted to the auxiliary segmentation module;
step four, processing the features obtained by the feature extraction module with the auxiliary segmentation module: the three layers of shallow features are processed by semantic segmentation, concatenated by tensor splicing (Concat), and upsampled after splicing, with cross entropy taken as the segmentation loss, so that the visual features are enhanced;
step five, obtaining the model evaluation parameters and saving the accuracies of the prediction results top1, top2 and top3: if the true cell is x, top1 counts a prediction as correct only when it equals x, top2 counts a prediction as correct when it is any of x-1, x or x+1, and top3 counts a prediction as correct when it falls within [x-2, x+2]; the TensorBoard test indices are then updated;
step six, obtaining the loss terms: Lsim similarity loss, Lshp shape loss, and Laux segmentation loss;
the similarity loss Lsim is shown in formula (1):
L_{sim} = \sum_{i=1}^{C} \sum_{j=1}^{h-1} \left\| P_{i,j,:} - P_{i,j+1,:} \right\|_{1}    (1)
the shape loss Lshp is shown in equation (2):
L_{shp} = \sum_{i=1}^{C} \sum_{j=1}^{h-2} \left\| \left( Loc_{i,j} - Loc_{i,j+1} \right) - \left( Loc_{i,j+1} - Loc_{i,j+2} \right) \right\|_{1}    (2)
wherein the expected location Loc_{i,j} is given by formula (3), and the probability Prob_{i,j,:} of the lane at each position is given by formula (4):
Loc_{i,j} = \sum_{k=1}^{w} k \cdot Prob_{i,j,k}    (3)
Prob_{i,j,:} = \mathrm{softmax}(P_{i,j,1:w})    (4)
the segmentation loss Laux is a cross-entropy loss, as shown in formula (5):
L_{aux} = -\sum_{m,n} \sum_{c} T^{seg}_{m,n,c} \log\left( P^{seg}_{m,n,c} \right)    (5)
step seven, saving the training parameters to a txt file, and saving the weights obtained from each training epoch;
the test part comprises the following steps:
step one, loading the depth and training parameters of the ResNet backbone network: reading, from a preset .py configuration file, the backbone network depth backbone, the batch_size, the data_root, the number of feature blocks per row grid_num, the row anchor rows row_anchor, the flag use_aux indicating whether the auxiliary segmentation module is used, and the number of lanes in the image num_lanes;
step two, loading the data set and the model: finding the test set path from the data_root of step one, acquiring the test list file test.txt of the test set, resizing the data set images to 288 x 800 resolution, converting the 288 x 800 images to the Tensor data type, and setting 56 fixed row anchors for the TuSimple data set; loading the model and weights used for the test, setting pretrained=False and use_aux=False, and loading the calibration standard test_label.json used for the test;
step three, performing shallow feature extraction on the input image with the ResNet-based feature extraction module: ResNet is adopted as the backbone network, and an identity mapping is added so that the current output is transmitted directly to the next layer; the input passes through conv, bn, relu, CBAM and the residual blocks of layers 1-4 of ResNet, and since the module uses ResNet-18 the residual block is BasicBlock; the input Tensor image is downsampled four times to extract shallow features, and the data processed by the four residual layers are transmitted to the classification module based on row anchors;
step four, processing the data transmitted by the feature extraction module with the classification module based on row anchors (Row Anchor): according to the row anchor index, a position selection and classification method in the row direction (row-based classification) is applied on the global features to detect whether each candidate point contains a lane line; the data from the previous step are upsampled, and the fully connected FC layer detects whether each feature block contains a lane line;
step five, obtaining the test accuracy by comparing the detected lane line points with the actual calibration standard, as shown in formula (6):
accuracy = \frac{\sum_{clip} C_{clip}}{\sum_{clip} S_{clip}}    (6)
2. the lane line detection method based on the position selection and classification method in the row direction according to claim 1, wherein the training part step three specifically comprises the steps of:
(1) The conv, bn, relu, ca, sa and maxpool operations are carried out in sequence, and the channel attention mechanism (channel attention) and the spatial attention mechanism (spatial attention) are used in series to form the CBAM attention mechanism;
(2) After these operations, the data processed by the layer2-layer4 residual blocks of the ResNet network are taken in turn by the aux-handler 2-4 layers as the input data of the auxiliary segmentation module; the aux-handler 2-4 layers are used only during training, so the computation at inference time is reduced and the inference speed is higher.
3. The lane line detection method based on the position selection and classification method in the row direction according to claim 1, wherein the training section step four specifically comprises the steps of:
(1) Processing the image information output by the feature extraction module by using an aux-handler 2 layer, an aux-handler 3 layer and an aux-handler 4 layer respectively, wherein each aux-handler layer comprises different conv_bn_relu operations, and tensor splicing the processed data;
(2) Four conv_bn_relu operations are then performed on the spliced tensor, and a final convolution operation produces the segmented image.
4. The lane line detection method based on the position selection and classification method in the row direction according to claim 1, wherein the step three in the test section specifically includes the steps of:
sequentially performing the conv, bn, relu, ca, sa, maxpool and layer1-layer4 operations;
the channel attention mechanism (channel attention) and the spatial attention mechanism (spatial attention) set in these operations are used in series to form the CBAM attention mechanism;
layer1-4 are the BasicBlock residual blocks of the ResNet residual network when the ResNet-18 network is used.
5. The lane line detection method based on the position selection and classification method in the row direction according to claim 1, wherein the step four in the test section is specifically as follows:
upsampling through torch.nn.Conv2d and processing by the fully connected FC layer are carried out in sequence;
detecting each divided row and the feature blocks on each row with the position selection and classification method in the row direction, and detecting whether each feature block contains a lane line; lane line detection is thus converted into selecting specific row anchors on predefined rows. The detected picture is resized to 288 x 800; the height and width of the picture are H and W respectively, the number of predefined rows is h, the number of feature blocks divided on each row is w, and the number of lanes in the image is C;
the prediction of the lanes P_{i,j,:} is shown in formula (7):
P_{i,j,:} = f^{i,j}(X),\ \mathrm{s.t.}\ i \in [1, C],\ j \in [1, h]    (7)
the classification loss Lcls at this module is shown in equation (8):
L_{cls} = \sum_{i=1}^{C} \sum_{j=1}^{h} L_{CE}\left( P_{i,j,:}, T_{i,j,:} \right)    (8)
wherein i ranges over the number of lanes and j ranges over the number of rows set by the row anchors; at this time the computation for the segmentation approach is H x W x (C + 1), while for the position selection and classification method based on the row direction it is h x (w + 1) x C, and because of the row anchor mechanism h and w are much smaller than H and W.
CN202210897544.7A 2022-07-28 2022-07-28 Lane line detection method based on position selection and classification method in row direction Active CN115294548B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210897544.7A CN115294548B (en) 2022-07-28 2022-07-28 Lane line detection method based on position selection and classification method in row direction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210897544.7A CN115294548B (en) 2022-07-28 2022-07-28 Lane line detection method based on position selection and classification method in row direction

Publications (2)

Publication Number Publication Date
CN115294548A CN115294548A (en) 2022-11-04
CN115294548B true CN115294548B (en) 2023-05-02

Family

ID=83823793

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210897544.7A Active CN115294548B (en) 2022-07-28 2022-07-28 Lane line detection method based on position selection and classification method in row direction

Country Status (1)

Country Link
CN (1) CN115294548B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115861951B (en) * 2022-11-27 2023-06-09 石家庄铁道大学 Complex environment lane line accurate detection method based on dual-feature extraction network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114463715A (en) * 2021-12-27 2022-05-10 江苏航天大为科技股份有限公司 Lane line detection method
CN114743126A (en) * 2022-03-09 2022-07-12 上海瀚所信息技术有限公司 Lane line sign segmentation method based on graph attention machine mechanism network

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110395257B (en) * 2018-04-20 2021-04-23 北京图森未来科技有限公司 Lane line example detection method and device and automatic driving vehicle
CN111242037B (en) * 2020-01-15 2023-03-21 华南理工大学 Lane line detection method based on structural information
CN112528878B (en) * 2020-12-15 2024-01-09 中国科学院深圳先进技术研究院 Method and device for detecting lane line, terminal equipment and readable storage medium
CN113313031B (en) * 2021-05-31 2022-04-22 南京航空航天大学 Deep learning-based lane line detection and vehicle transverse positioning method
CN113468967B (en) * 2021-06-02 2023-08-18 北京邮电大学 Attention mechanism-based lane line detection method, attention mechanism-based lane line detection device, attention mechanism-based lane line detection equipment and attention mechanism-based lane line detection medium
CN113822149A (en) * 2021-08-06 2021-12-21 武汉卓目科技有限公司 Emergency lane visual detection method and system based on view angle of unmanned aerial vehicle
CN113902915B (en) * 2021-10-12 2024-06-11 江苏大学 Semantic segmentation method and system based on low-light complex road scene
CN114429621A (en) * 2021-12-27 2022-05-03 南京信息工程大学 UFSA algorithm-based improved lane line intelligent detection method
CN114463721A (en) * 2022-01-30 2022-05-10 哈尔滨理工大学 Lane line detection method based on spatial feature interaction
CN114550118B (en) * 2022-02-23 2023-07-11 烟台大学 Full-automatic intelligent highway marking method based on video image driving
CN114677560A (en) * 2022-03-21 2022-06-28 国科温州研究院(温州生物材料与工程研究所) Deep learning algorithm-based lane line detection method and computer system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114463715A (en) * 2021-12-27 2022-05-10 江苏航天大为科技股份有限公司 Lane line detection method
CN114743126A (en) * 2022-03-09 2022-07-12 上海瀚所信息技术有限公司 Lane line sign segmentation method based on graph attention machine mechanism network

Also Published As

Publication number Publication date
CN115294548A (en) 2022-11-04

Similar Documents

Publication Publication Date Title
CN112200161B (en) Face recognition detection method based on mixed attention mechanism
CN112183203B (en) Real-time traffic sign detection method based on multi-scale pixel feature fusion
CN108520238B (en) Scene prediction method of night vision image based on depth prediction coding network
EP3690741A2 (en) Method for automatically evaluating labeling reliability of training images for use in deep learning network to analyze images, and reliability-evaluating device using the same
CN111008633B (en) License plate character segmentation method based on attention mechanism
CN111914838B (en) License plate recognition method based on text line recognition
CN110717493B (en) License plate recognition method containing stacked characters based on deep learning
CN111414807A (en) Tidal water identification and crisis early warning method based on YO L O technology
CN112861619A (en) Model training method, lane line detection method, equipment and device
CN112990065A (en) Optimized YOLOv5 model-based vehicle classification detection method
CN114913498A (en) Parallel multi-scale feature aggregation lane line detection method based on key point estimation
CN115294548B (en) Lane line detection method based on position selection and classification method in row direction
CN113129336A (en) End-to-end multi-vehicle tracking method, system and computer readable medium
CN114596548A (en) Target detection method, target detection device, computer equipment and computer-readable storage medium
CN114495050A (en) Multitask integrated detection method for automatic driving forward vision detection
CN117612136A (en) Automatic driving target detection method based on increment small sample learning
CN116630917A (en) Lane line detection method
CN115035429A (en) Aerial photography target detection method based on composite backbone network and multiple measuring heads
CN117809289B (en) Pedestrian detection method for traffic scene
CN113313091B (en) Density estimation method based on multiple attention and topological constraints under warehouse logistics
CN116486203B (en) Single-target tracking method based on twin network and online template updating
CN115272814B (en) Long-distance space self-adaptive multi-scale small target detection method
CN116524203B (en) Vehicle target detection method based on attention and bidirectional weighting feature fusion
CN117392392B (en) Rubber cutting line identification and generation method
CN114863122B (en) Intelligent high-precision pavement disease identification method based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant