CN115294548B - Lane line detection method based on position selection and classification method in row direction - Google Patents

Lane line detection method based on position selection and classification method in row direction

Info

Publication number
CN115294548B
CN115294548B (application CN202210897544.7A)
Authority
CN
China
Prior art keywords
row
resnet
layer
data
test
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210897544.7A
Other languages
Chinese (zh)
Other versions
CN115294548A (en)
Inventor
宋永超
王璇
黄涛
阎维青
徐金东
刘兆伟
王莹洁
吕骏
赵金东
孔令甲
齐泉智
李凯强
毕季平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yantai University
Original Assignee
Yantai University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yantai University filed Critical Yantai University
Priority to CN202210897544.7A priority Critical patent/CN115294548B/en
Publication of CN115294548A publication Critical patent/CN115294548A/en
Application granted granted Critical
Publication of CN115294548B publication Critical patent/CN115294548B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a lane line detection method based on a position selection and classification method in the row direction. A ResNet-based feature extraction module extracts shallow lane-line features, a CBAM attention mechanism directs the model toward important features, an auxiliary segmentation module adds a segmentation task during training to strengthen visual features, and finally a row-anchor-based classification module divides the lane image into feature blocks and detects whether each block contains a lane line, thereby realizing lane line detection.

Description

Lane line detection method based on position selection and classification method in row direction
Technical Field
The invention relates to the technical field of intelligent traffic, in particular to a lane line detection method based on a position selection and classification method in a row direction.
Background
Detecting lane lines accurately and in real time with effective technical means is an urgent problem. Lane line detection plays a major role in autonomous driving and in assisting drivers to drive safely, and it is an important component of the intelligent transportation field.
Lane line detection methods fall mainly into traditional methods and deep-learning-based methods. Traditional methods are sensitive to scene changes and require experimental parameters to be retuned, so their robustness is relatively poor. Deep-learning-based methods can use different models with different parameters and adapt to a variety of complex environments, but they still need improvement when facing curves, vehicle occlusion, missing visual cues, and similar problems.
Disclosure of Invention
In view of this situation, and to overcome the shortcomings of the prior art, the invention provides a lane line detection method based on a position selection and classification method in the row direction. A ResNet-based feature extraction module extracts shallow lane-line features, a CBAM attention mechanism directs the model toward important features, an auxiliary segmentation module adds a segmentation task during training to strengthen visual features, and finally a row-anchor-based classification module divides the lane image into feature blocks and detects whether each block contains a lane line, thereby realizing lane line detection.
The technical scheme adopted by the invention is as follows: a lane line detection method based on a position selection and classification method in a row direction comprises a training part and a testing part;
the training part comprises the following steps:
step one, loading the depth and training parameters of the ResNet backbone network: reading, from a preset .py configuration file, the backbone network depth backbone, the batch_size, the data_root, the number of feature blocks per row grid_num, the row anchor rows row_anchor, the flag use_aux indicating whether the auxiliary segmentation module is used, and the number of lanes in the image num_lanes;
step two, loading the data set and the model: finding the training set path from the data_root of step one, acquiring the training list file train_gt.txt of the training set, resizing the data set images to 288 x 800 resolution, converting the 288 x 800 images to the Tensor data type, setting 56 fixed row anchors for the TuSimple data set, loading the model to be used, and passing the parameters acquired in step one into the model;
step three, performing shallow feature extraction on the input image with the ResNet-based feature extraction module: ResNet is adopted as the backbone network, and an identity mapping is added so that the current output is transmitted directly to the next layer; the input passes in sequence through conv, bn, relu, CBAM and the residual blocks of layers 1-4 of ResNet, and since the module uses ResNet-18 the residual block is BasicBlock; the input Tensor image is downsampled four times to extract shallow features, and the output of each residual layer is transmitted to the auxiliary segmentation module;
step four, processing the features obtained by the feature extraction module with the auxiliary segmentation module: the three layers of shallow features are processed by semantic segmentation, concatenated by tensor splicing (Concat), and upsampled after splicing, with cross entropy taken as the segmentation loss, so that the visual features are enhanced;
step five, obtaining the model evaluation parameters and saving the accuracies of the prediction results top1, top2 and top3: if the true cell is x, top1 counts a prediction as correct only when it equals x, top2 counts a prediction as correct when it is any of x-1, x or x+1, and top3 counts a prediction as correct when it falls within [x-2, x+2]; the TensorBoard test indices are then updated;
step six, obtaining the loss terms: Lsim (similarity loss), Lshp (shape loss) and Laux (segmentation loss);
the similarity loss Lsim is shown in formula (1):
L_{sim} = \sum_{i=1}^{C} \sum_{j=1}^{h-1} \left\| P_{i,j,:} - P_{i,j+1,:} \right\|_{1}    (1)
the shape loss Lshp is shown in equation (2):
L_{shp} = \sum_{i=1}^{C} \sum_{j=1}^{h-2} \left\| \left( Loc_{i,j} - Loc_{i,j+1} \right) - \left( Loc_{i,j+1} - Loc_{i,j+2} \right) \right\|_{1}    (2)
wherein the expected location Loc_{i,j} is given by formula (3), and the probability Prob_{i,j,:} of the lane at each position is given by formula (4):
Loc_{i,j} = \sum_{k=1}^{w} k \cdot Prob_{i,j,k}    (3)
Prob_{i,j,:} = \mathrm{softmax}(P_{i,j,1:w})    (4)
the segmentation loss Laux is a cross-entropy loss, as shown in formula (5):
L_{aux} = -\sum_{m,n} \sum_{c} T^{seg}_{m,n,c} \log\left( P^{seg}_{m,n,c} \right)    (5)
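For readability, the structural losses of formulas (1)-(3) can be written directly in PyTorch. The sketch below is illustrative only: the tensor layout (C lanes, h row anchors, w + 1 cells per row) and the reduction by mean are assumptions, not the exact implementation of this method.

import torch
import torch.nn.functional as F

def similarity_loss(logits):
    # logits: (C, h, w + 1) classification scores per lane, row anchor and cell
    return (logits[:, :-1, :] - logits[:, 1:, :]).abs().mean()        # formula (1): L1 between adjacent rows

def shape_loss(logits):
    prob = F.softmax(logits[:, :, :-1], dim=2)                        # formula (4), dropping the "no lane" cell
    k = torch.arange(1, prob.shape[2] + 1, dtype=prob.dtype, device=prob.device)
    loc = (prob * k).sum(dim=2)                                       # formula (3): expected cell index per (lane, row)
    first_diff = loc[:, 1:] - loc[:, :-1]
    return (first_diff[:, 1:] - first_diff[:, :-1]).abs().mean()      # formula (2): second-order difference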
step seven, saving the training parameters to a txt file, and saving the weights obtained from each training epoch;
the test part comprises the following steps:
step one, loading the depth and training parameters of the ResNet backbone network: reading, from a preset .py configuration file, the backbone network depth backbone, the batch_size, the data_root, the number of feature blocks per row grid_num, the row anchor rows row_anchor, the flag use_aux indicating whether the auxiliary segmentation module is used, and the number of lanes in the image num_lanes;
step two, loading the data set and the model: finding the test set path from the data_root of step one, acquiring the test list file test.txt of the test set, resizing the data set images to 288 x 800 resolution, converting the 288 x 800 images to the Tensor data type, and setting 56 fixed row anchors for the TuSimple data set; loading the model and weights used for the test, setting pretrained=False and use_aux=False, and loading the calibration standard test_label.json used for the test;
step three, performing shallow feature extraction on the input image with the ResNet-based feature extraction module: ResNet is adopted as the backbone network, and an identity mapping is added so that the current output is transmitted directly to the next layer; the input passes through conv, bn, relu, CBAM and the residual blocks of layers 1-4 of ResNet, and since the module uses ResNet-18 the residual block is BasicBlock; the input Tensor image is downsampled four times to extract shallow features, and the data processed by the four residual layers are transmitted to the classification module based on row anchors;
step four, processing the data transmitted by the feature extraction module with the classification module based on row anchors (Row Anchor): according to the row anchor index, a position selection and classification method in the row direction (row-based classification) is applied on the global features to detect whether each candidate point contains a lane line; the data from the previous step are upsampled, and the fully connected FC layer detects whether each feature block contains a lane line;
step five, obtaining the test accuracy by comparing the detected lane line points with the actual calibration standard, as shown in formula (6):
accuracy = \frac{\sum_{clip} C_{clip}}{\sum_{clip} S_{clip}}    (6)
where C_{clip} is the number of correctly predicted lane points in a clip and S_{clip} is the total number of ground-truth points in that clip;
further, the training part comprises the following steps:
(1) Sequentially performing conv, bn, relu, ca, sa, maxpool and other operations;
wherein conv=nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3, bias=False);
bn=norm_layer(self.inplanes);
relu=nn.ReLU(inplace=True);
ca=ChannelAttention(self.inplanes);
sa=SpatialAttention();
maxpool=nn.MaxPool2d(kernel_size=3,stride=2,padding=1);
the channel attention mechanism channel attention and the spatial attention mechanism space attention are used in series to form a CBAM attention mechanism;
(2) After these operations, the data processed by the layer2-layer4 residual blocks of the ResNet network are taken in turn by the aux-handler 2-4 layers as the input data of the auxiliary segmentation module; the aux-handler 2-4 layers are used only during training, so the computation at inference time is reduced and the inference speed is higher.
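A minimal PyTorch sketch of the serial channel-plus-spatial (CBAM) attention used after the stem convolution is given below. It is illustrative only: the reduction ratio of 16 and the 7x7 spatial kernel are common defaults, not values fixed by this description.

import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    def __init__(self, channels, reduction=16):
        super().__init__()
        # shared MLP applied to both the average-pooled and max-pooled descriptors
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False))

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return x * torch.sigmoid(avg + mx)          # channel-wise reweighting

class SpatialAttention(nn.Module):
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2, bias=False)

    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)
        mx, _ = torch.max(x, dim=1, keepdim=True)
        return x * torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))   # spatial reweighting

Applied in series after the stem, x = sa(ca(x)), this corresponds to the ca and sa calls listed above.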
Further, step four of the training part comprises the following specific steps:
(1) Processing the image information output by the feature extraction module by using an aux-handler 2 layer, an aux-handler 3 layer and an aux-handler 4 layer respectively, wherein each aux-handler layer comprises different conv_bn_relu operations, and tensor splicing the processed data;
(2) Four conv_bn_relu operations are then performed on the spliced tensor, and a final convolution operation produces the segmented image.
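The auxiliary segmentation head can be sketched as follows. This is a hedged illustration: the channel widths (128/256/512 for layer2-layer4 of ResNet-18, 128 per branch, num_lanes + 1 output classes) are assumptions chosen to keep the example self-contained, not values prescribed here.

import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_relu(c_in, c_out, k=3):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, k, padding=k // 2, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True))

class AuxSegHead(nn.Module):
    def __init__(self, num_lanes=4):
        super().__init__()
        self.head2 = conv_bn_relu(128, 128)   # processes layer2 features
        self.head3 = conv_bn_relu(256, 128)   # processes layer3 features
        self.head4 = conv_bn_relu(512, 128)   # processes layer4 features
        self.combine = nn.Sequential(
            conv_bn_relu(384, 256), conv_bn_relu(256, 256),
            conv_bn_relu(256, 256), conv_bn_relu(256, 256),   # the four conv_bn_relu operations
            nn.Conv2d(256, num_lanes + 1, 1))                 # final convolution: lanes + background

    def forward(self, f2, f3, f4):
        size = f2.shape[-2:]
        x = torch.cat([self.head2(f2),
                       F.interpolate(self.head3(f3), size=size, mode='bilinear', align_corners=False),
                       F.interpolate(self.head4(f4), size=size, mode='bilinear', align_corners=False)], dim=1)
        return self.combine(x)   # trained with the cross-entropy segmentation loss of formula (5)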
Further, the third specific step in the test section is as follows:
sequentially performing the conv, bn, relu, ca, sa, maxpool and layer1-layer4 operations;
wherein conv=nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3, bias=False);
bn=norm_layer(self.inplanes);
relu=nn.ReLU(inplace=True);
ca=ChannelAttention(self.inplanes);
sa=SpatialAttention();
maxpool=nn.MaxPool2d(kernel_size=3,stride=2,padding=1);
the channel attention mechanism (channel attention) and the spatial attention mechanism (spatial attention) are used in series to form the CBAM attention mechanism;
layer1-4 are the BasicBlock residual blocks of the ResNet residual network when the ResNet-18 network is used.
Further, the specific steps of the fourth step in the test section are as follows:
upsampling through torch.nn.Conv2d and processing by the fully connected FC layer are carried out in sequence;
wherein the fully connected FC layer comprises the following operations:
torch.nn.Linear(1800, 2048)  # linear transformation function, sets up the fully connected layer;
torch.nn.ReLU();
torch.nn.Linear(2048,self.total_dim);
detecting each divided row and the feature blocks on each row with the position selection and classification method in the row direction, and detecting whether each feature block contains a lane line; lane line detection is thus converted into selecting specific row anchors on predefined rows. The detected picture is resized to 288 x 800; the height and width of the picture are H and W respectively, the number of predefined rows is h, the number of feature blocks divided on each row is w, and the number of lanes in the image is C;
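A sketch of this row-anchor classification head is given below. It is an illustration under stated assumptions: the 1800-dimensional input is taken to be the flattened pooled feature map feeding the Linear(1800, 2048) above, and grid_num = 100, 56 rows and 4 lanes are example values rather than requirements of the method.

import torch
import torch.nn as nn

class RowAnchorHead(nn.Module):
    def __init__(self, grid_num=100, num_rows=56, num_lanes=4, in_dim=1800):
        super().__init__()
        self.out_shape = (grid_num + 1, num_rows, num_lanes)            # (w + 1, h, C)
        self.cls = nn.Sequential(
            nn.Linear(in_dim, 2048),
            nn.ReLU(),
            nn.Linear(2048, (grid_num + 1) * num_rows * num_lanes))

    def forward(self, feat):
        # feat: (N, in_dim) flattened global feature; output: one (w + 1)-way
        # choice per row anchor and per lane, the extra cell meaning "no lane"
        return self.cls(feat).view(-1, *self.out_shape)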
the prediction of the lanes P_{i,j,:} is shown in formula (7):
P_{i,j,:} = f^{i,j}(X),\ \mathrm{s.t.}\ i \in [1, C],\ j \in [1, h]    (7)
the classification loss Lcls at this module is shown in equation (8):
L_{cls} = \sum_{i=1}^{C} \sum_{j=1}^{h} L_{CE}\left( P_{i,j,:}, T_{i,j,:} \right)    (8)
wherein i ranges over the number of lanes and j ranges over the number of rows set by the row anchors; here the computation for the segmentation approach is H x W x (C + 1), while for the position selection and classification method based on the row direction it is h x (w + 1) x C; because the row anchor mechanism makes h and w much smaller than H and W, this method greatly reduces the computation and naturally also provides a faster speed.
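For concreteness, with the illustrative values used elsewhere in this description (H = 288, W = 800, C = 4 lanes, h = 56 row anchors, and an assumed grid_num of w = 100 cells per row), per-pixel segmentation needs H x W x (C + 1) = 288 x 800 x 5 = 1,152,000 classifications per image, whereas the row-direction method needs h x (w + 1) x C = 56 x 101 x 4 = 22,624, roughly a fifty-fold reduction.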
The beneficial effects obtained by the invention with this structure are as follows: the lane line detection method based on the position selection and classification method in the row direction uses a ResNet-based feature extraction module, an auxiliary segmentation module, and a row-anchor-based classification module. A lane structure loss function is adopted; with context information and global information from the image, the method has a larger receptive field, and the use of global features handles missing visual cues, poor lighting, and occlusion well. The position selection and classification method in the row direction divides the image into feature blocks, so compared with the pixel-by-pixel segmentation of semantic segmentation methods it needs less computation and the model infers faster. A CBAM attention mechanism is added to the ResNet residual network, giving better test results and robustness.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate the invention and together with the embodiments of the invention, serve to explain the invention. In the drawings:
FIG. 1 is a schematic diagram of a model training and testing process;
FIG. 2 is a ResNet based feature extraction module;
FIG. 3 is a channel attention mechanism;
FIG. 4 is a spatial attention mechanism;
FIG. 5 is an auxiliary segmentation module;
FIG. 6 is a classification module based on row anchors;
FIG. 7 is a method of selecting and classifying based on position in the row direction;
FIG. 8 is a graph comparing test results for ResNet with and without the attention mechanism;
FIG. 9 shows the detection effect of the model on sunny days;
FIG. 10 shows the detection effect of the model at night;
FIG. 11 shows the detection effect of a model in a complex scene.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all embodiments of the invention; all other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
As shown in FIGS. 1-11, a lane line detection method based on a position selection and classification method in the row direction includes a training part and a test part;
the training part comprises the following steps:
step one, loading the depth and training parameters of the ResNet backbone network: reading, from a preset .py configuration file, the backbone network depth backbone, the batch_size, the data_root, the number of feature blocks per row grid_num, the row anchor rows row_anchor, the flag use_aux indicating whether the auxiliary segmentation module is used, and the number of lanes in the image num_lanes;
step two, loading the data set and the model: finding the training set path from the data_root of step one, acquiring the training list file train_gt.txt of the training set, resizing the data set images to 288 x 800 resolution, converting the 288 x 800 images to the Tensor data type, setting 56 fixed row anchors for the TuSimple data set, loading the model to be used, and passing the parameters acquired in step one into the model;
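As a hedged illustration of step two, the data loading can look roughly like the sketch below. The transform chain and the row-anchor list are assumptions for this example (range(64, 288, 4) simply yields 56 evenly spaced rows on the 288-pixel-high resized input); they are not claimed to be the exact values used here.

import torchvision.transforms as transforms

img_transform = transforms.Compose([
    transforms.Resize((288, 800)),   # resize every frame to 288 x 800
    transforms.ToTensor(),           # convert to a Tensor data type in [0, 1]
])

# 56 fixed row anchors for TuSimple, expressed on the 288-pixel-high resized image
tusimple_row_anchor = list(range(64, 288, 4))   # assumed spacing; yields 56 rows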
step three, performing shallow feature extraction on the input image with the ResNet-based feature extraction module: ResNet is adopted as the backbone network, and an identity mapping is added so that the current output is transmitted directly to the next layer; the input passes in sequence through conv, bn, relu, CBAM and the residual blocks of layers 1-4 of ResNet, and since the module uses ResNet-18 the residual block is BasicBlock; the input Tensor image is downsampled four times to extract shallow features, and the output of each residual layer is transmitted to the auxiliary segmentation module;
step four, processing the features obtained by the feature extraction module with the auxiliary segmentation module: the three layers of shallow features are processed by semantic segmentation, concatenated by tensor splicing (Concat), and upsampled after splicing, with cross entropy taken as the segmentation loss, so that the visual features are enhanced;
step five, obtaining the model evaluation parameters and saving the accuracies of the prediction results top1, top2 and top3: if the true cell is x, top1 counts a prediction as correct only when it equals x, top2 counts a prediction as correct when it is any of x-1, x or x+1, and top3 counts a prediction as correct when it falls within [x-2, x+2]; the TensorBoard test indices are then updated;
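A minimal sketch of the top1/top2/top3 metrics of step five is given below, under the assumption that predictions and labels are compared as integer grid-cell indices; the helper name and tensor layout are illustrative, not the exact evaluation code of this method.

import torch

def topk_accuracy(pred_cells, gt_cells):
    # pred_cells, gt_cells: integer grid indices of shape (N,)
    diff = (pred_cells - gt_cells).abs()
    top1 = (diff == 0).float().mean().item()   # exactly the true cell x
    top2 = (diff <= 1).float().mean().item()   # any of x-1, x, x+1
    top3 = (diff <= 2).float().mean().item()   # within [x-2, x+2]
    return top1, top2, top3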
step six, obtaining the loss terms: Lsim (similarity loss), Lshp (shape loss) and Laux (segmentation loss);
the similarity loss Lsim is shown in formula (1):
L_{sim} = \sum_{i=1}^{C} \sum_{j=1}^{h-1} \left\| P_{i,j,:} - P_{i,j+1,:} \right\|_{1}    (1)
the shape loss Lshp is shown in formula (2):
L_{shp} = \sum_{i=1}^{C} \sum_{j=1}^{h-2} \left\| \left( Loc_{i,j} - Loc_{i,j+1} \right) - \left( Loc_{i,j+1} - Loc_{i,j+2} \right) \right\|_{1}    (2)
Wherein the expected location Loc_{i,j} is given by formula (3), and the probability Prob_{i,j,:} of the lane at each position is given by formula (4):
Loc_{i,j} = \sum_{k=1}^{w} k \cdot Prob_{i,j,k}    (3)
Prob_{i,j,:} = \mathrm{softmax}(P_{i,j,1:w})    (4)
the segmentation loss Laux is a cross-entropy loss, as shown in formula (5):
L_{aux} = -\sum_{m,n} \sum_{c} T^{seg}_{m,n,c} \log\left( P^{seg}_{m,n,c} \right)    (5)
step seven, saving the training parameters to a txt file, and saving the weights obtained from each training epoch;
the test part comprises the following steps:
step one, loading the depth and training parameters of the ResNet backbone network: reading, from a preset .py configuration file, the backbone network depth backbone, the batch_size, the data_root, the number of feature blocks per row grid_num, the row anchor rows row_anchor, the flag use_aux indicating whether the auxiliary segmentation module is used, and the number of lanes in the image num_lanes;
step two, loading the data set and the model: finding the test set path from the data_root of step one, acquiring the test list file test.txt of the test set, resizing the data set images to 288 x 800 resolution, converting the 288 x 800 images to the Tensor data type, and setting 56 fixed row anchors for the TuSimple data set; loading the model and weights used for the test, setting pretrained=False and use_aux=False, and loading the calibration standard test_label.json used for the test;
step three, performing shallow feature extraction on the input image with the ResNet-based feature extraction module: ResNet is adopted as the backbone network, and an identity mapping is added so that the current output is transmitted directly to the next layer; the input passes through conv, bn, relu, CBAM and the residual blocks of layers 1-4 of ResNet, and since the module uses ResNet-18 the residual block is BasicBlock; the input Tensor image is downsampled four times to extract shallow features, and the data processed by the four residual layers are transmitted to the classification module based on row anchors;
step four, processing the data transmitted by the feature extraction module with the classification module based on row anchors (Row Anchor): according to the row anchor index, a position selection and classification method in the row direction (row-based classification) is applied on the global features to detect whether each candidate point contains a lane line; the data from the previous step are upsampled, and the fully connected FC layer detects whether each feature block contains a lane line;
step five, obtaining the test accuracy by comparing the detected lane line points with the actual calibration standard, as shown in formula (6):
accuracy = \frac{\sum_{clip} C_{clip}}{\sum_{clip} S_{clip}}    (6)
where C_{clip} is the number of correctly predicted lane points in a clip and S_{clip} is the total number of ground-truth points in that clip;
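A hedged sketch of how formula (6) can be evaluated is shown below; the 20-pixel matching threshold and the use of -2 as the "no lane" marker follow common TuSimple conventions and are assumptions, not values fixed by this description.

def clip_accuracy(pred_x, gt_x, threshold=20):
    # pred_x, gt_x: x-coordinates on the same row anchors for one lane; -2 marks "no lane"
    correct = sum(1 for p, g in zip(pred_x, gt_x) if g != -2 and abs(p - g) < threshold)
    total = sum(1 for g in gt_x if g != -2)
    return correct, total        # contributions to C_clip and S_clip

def dataset_accuracy(clips):
    # clips: iterable of (pred_x, gt_x) pairs; implements sum(C_clip) / sum(S_clip)
    counts = [clip_accuracy(p, g) for p, g in clips]
    return sum(c for c, _ in counts) / max(sum(s for _, s in counts), 1)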
Step three of the training part comprises the following steps:
(1) Sequentially performing conv, bn, relu, ca, sa, maxpool and other operations;
wherein conv=nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3, bias=False);
bn=norm_layer(self.inplanes);
relu=nn.ReLU(inplace=True);
ca=ChannelAttention(self.inplanes);
sa=SpatialAttention();
maxpool=nn.MaxPool2d(kernel_size=3,stride=2,padding=1);
the channel attention mechanism (channel attention) and the spatial attention mechanism (spatial attention) are used in series to form the CBAM attention mechanism;
(2) After these operations, the data processed by the layer2-layer4 residual blocks of the ResNet network are taken in turn by the aux-handler 2-4 layers as the input data of the auxiliary segmentation module; the aux-handler 2-4 layers are used only during training, so the computation at inference time is reduced and the inference speed is higher.
Step four of the training part comprises the following specific steps:
(1) Processing the image information output by the feature extraction module by using an aux-handler 2 layer, an aux-handler 3 layer and an aux-handler 4 layer respectively, wherein each aux-handler layer comprises different conv_bn_relu operations, and tensor splicing the processed data;
(2) Four conv_bn_relu operations are then performed on the spliced tensor, and a final convolution operation produces the segmented image.
The third specific step in the test section is as follows:
sequentially performing the conv, bn, relu, ca, sa, maxpool and layer1-layer4 operations;
wherein conv=nn.Conv2d(3, self.inplanes, kernel_size=7, stride=2, padding=3, bias=False);
bn=norm_layer(self.inplanes);
relu=nn.ReLU(inplace=True);
ca=ChannelAttention(self.inplanes);
sa=SpatialAttention();
maxpool=nn.MaxPool2d(kernel_size=3,stride=2,padding=1);
the channel attention mechanism (channel attention) and the spatial attention mechanism (spatial attention) are used in series to form the CBAM attention mechanism;
layer1-4 are the BasicBlock residual blocks of the ResNet residual network when the ResNet-18 network is used.
The specific steps of the fourth step in the test part are as follows:
upsampling through torch.nn.Conv2d and processing by the fully connected FC layer are carried out in sequence;
wherein the fully connected FC layer comprises the following operations:
torch.nn.Linear(1800, 2048)  # linear transformation function, sets up the fully connected layer;
torch.nn.ReLU();
torch.nn.Linear(2048,self.total_dim);
detecting each divided row and the feature blocks on each row with the position selection and classification method in the row direction, and detecting whether each feature block contains a lane line; lane line detection is thus converted into selecting specific row anchors on predefined rows. The detected picture is resized to 288 x 800; the height and width of the picture are H and W respectively, the number of predefined rows is h, the number of feature blocks divided on each row is w, and the number of lanes in the image is C;
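At inference time the (w + 1, h, C) output has to be turned back into lane points. The sketch below shows one hedged way to do this, following the expectation of formula (3); the scaling from cell index to pixel column assumes the 288 x 800 network input and is illustrative rather than the exact post-processing of this method.

import torch

def decode_lanes(out, grid_num=100, img_w=800):
    # out: (w + 1, h, C) raw scores for one image
    prob = torch.softmax(out[:-1, :, :], dim=0)                  # softmax over the w location cells
    idx = torch.arange(1, grid_num + 1, dtype=prob.dtype, device=prob.device).view(-1, 1, 1)
    loc = (prob * idx).sum(dim=0)                                # expected cell per (row anchor, lane)
    no_lane = out.argmax(dim=0) == grid_num                      # the extra (w + 1)-th cell won
    x = loc * img_w / grid_num                                   # cell index -> pixel column
    x[no_lane] = -1                                              # mark rows where this lane is absent
    return x                                                     # (h, C) x-coordinates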
the prediction of the lanes P_{i,j,:} is shown in formula (7):
P_{i,j,:} = f^{i,j}(X),\ \mathrm{s.t.}\ i \in [1, C],\ j \in [1, h]    (7)
the classification loss Lcls at this module is shown in equation (8):
L_{cls} = \sum_{i=1}^{C} \sum_{j=1}^{h} L_{CE}\left( P_{i,j,:}, T_{i,j,:} \right)    (8)
wherein i ranges over the number of lanes and j ranges over the number of rows set by the row anchors; here the computation for the segmentation approach is H x W x (C + 1), while for the position selection and classification method based on the row direction it is h x (w + 1) x C; because the row anchor mechanism makes h and w much smaller than H and W, this method greatly reduces the computation and naturally also provides a faster speed.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (5)

1. A lane line detection method based on a position selection and classification method in a row direction is characterized by comprising a training part and a testing part;
the training part comprises the following steps:
step one, loading the depth and training parameters of the ResNet backbone network: reading, from a preset .py configuration file, the backbone network depth backbone, the batch_size, the data_root, the number of feature blocks per row grid_num, the row anchor rows row_anchor, the flag use_aux indicating whether the auxiliary segmentation module is used, and the number of lanes in the image num_lanes;
step two, loading the data set and the model: finding the training set path from the data_root of step one, acquiring the training list file train_gt.txt of the training set, resizing the data set images to 288 x 800 resolution, converting the 288 x 800 images to the Tensor data type, setting 56 fixed row anchors for the TuSimple data set, loading the model to be used, and passing the parameters acquired in step one into the model;
step three, performing shallow feature extraction on the input image with the ResNet-based feature extraction module: ResNet is adopted as the backbone network, and an identity mapping is added so that the current output is transmitted directly to the next layer; the input passes in sequence through conv, bn, relu, CBAM and the residual blocks of layers 1-4 of ResNet, and since the module uses ResNet-18 the residual block is BasicBlock; the input Tensor image is downsampled four times to extract shallow features, and the output of each residual layer is transmitted to the auxiliary segmentation module;
step four, processing the features obtained by the feature extraction module with the auxiliary segmentation module: the three layers of shallow features are processed by semantic segmentation, concatenated by tensor splicing (Concat), and upsampled after splicing, with cross entropy taken as the segmentation loss, so that the visual features are enhanced;
step five, obtaining the model evaluation parameters and saving the accuracies of the prediction results top1, top2 and top3: if the true cell is x, top1 counts a prediction as correct only when it equals x, top2 counts a prediction as correct when it is any of x-1, x or x+1, and top3 counts a prediction as correct when it falls within [x-2, x+2]; the TensorBoard test indices are then updated;
step six, obtaining the loss terms: Lsim similarity loss, Lshp shape loss, and Laux segmentation loss;
the similarity loss Lsim is shown in formula (1):
L_{sim} = \sum_{i=1}^{C} \sum_{j=1}^{h-1} \left\| P_{i,j,:} - P_{i,j+1,:} \right\|_{1}    (1)
the shape loss Lshp is shown in equation (2):
L_{shp} = \sum_{i=1}^{C} \sum_{j=1}^{h-2} \left\| \left( Loc_{i,j} - Loc_{i,j+1} \right) - \left( Loc_{i,j+1} - Loc_{i,j+2} \right) \right\|_{1}    (2)
wherein the expected location Loc_{i,j} is given by formula (3), and the probability Prob_{i,j,:} of the lane at each position is given by formula (4):
Loc_{i,j} = \sum_{k=1}^{w} k \cdot Prob_{i,j,k}    (3)
Prob_{i,j,:} = \mathrm{softmax}(P_{i,j,1:w})    (4)
the segmentation loss Laux is a cross-entropy loss, as shown in formula (5):
L_{aux} = -\sum_{m,n} \sum_{c} T^{seg}_{m,n,c} \log\left( P^{seg}_{m,n,c} \right)    (5)
step seven, saving the training parameters to a txt file, and saving the weights obtained from each training epoch;
the test part comprises the following steps:
step one, loading the depth and training parameters of the ResNet backbone network: reading, from a preset .py configuration file, the backbone network depth backbone, the batch_size, the data_root, the number of feature blocks per row grid_num, the row anchor rows row_anchor, the flag use_aux indicating whether the auxiliary segmentation module is used, and the number of lanes in the image num_lanes;
step two, loading the data set and the model: finding the test set path from the data_root of step one, acquiring the test list file test.txt of the test set, resizing the data set images to 288 x 800 resolution, converting the 288 x 800 images to the Tensor data type, and setting 56 fixed row anchors for the TuSimple data set; loading the model and weights used for the test, setting pretrained=False and use_aux=False, and loading the calibration standard test_label.json used for the test;
step three, performing shallow feature extraction on the input image with the ResNet-based feature extraction module: ResNet is adopted as the backbone network, and an identity mapping is added so that the current output is transmitted directly to the next layer; the input passes through conv, bn, relu, CBAM and the residual blocks of layers 1-4 of ResNet, and since the module uses ResNet-18 the residual block is BasicBlock; the input Tensor image is downsampled four times to extract shallow features, and the data processed by the four residual layers are transmitted to the classification module based on row anchors;
step four, processing the data transmitted by the feature extraction module with the classification module based on row anchors (Row Anchor): according to the row anchor index, a position selection and classification method in the row direction (row-based classification) is applied on the global features to detect whether each candidate point contains a lane line; the data from the previous step are upsampled, and the fully connected FC layer detects whether each feature block contains a lane line;
step five, obtaining the test accuracy by comparing the detected lane line points with the actual calibration standard, as shown in formula (6):
accuracy = \frac{\sum_{clip} C_{clip}}{\sum_{clip} S_{clip}}    (6)
2. the lane line detection method based on the position selection and classification method in the row direction according to claim 1, wherein the training part step three specifically comprises the steps of:
(1) The conv, bn, relu, ca, sa and maxpool operations are carried out in sequence, and the channel attention mechanism (channel attention) and the spatial attention mechanism (spatial attention) are used in series to form the CBAM attention mechanism;
(2) After these operations, the data processed by the layer2-layer4 residual blocks of the ResNet network are taken in turn by the aux-handler 2-4 layers as the input data of the auxiliary segmentation module; the aux-handler 2-4 layers are used only during training, so the computation at inference time is reduced and the inference speed is higher.
3. The lane line detection method based on the position selection and classification method in the row direction according to claim 1, wherein the training section step four specifically comprises the steps of:
(1) Processing the image information output by the feature extraction module by using an aux-handler 2 layer, an aux-handler 3 layer and an aux-handler 4 layer respectively, wherein each aux-handler layer comprises different conv_bn_relu operations, and tensor splicing the processed data;
(2) Four conv_bn_relu operations are then performed on the spliced tensor, and a final convolution operation produces the segmented image.
4. The lane line detection method based on the position selection and classification method in the row direction according to claim 1, wherein the step three in the test section specifically includes the steps of:
sequentially performing the conv, bn, relu, ca, sa, maxpool and layer1-layer4 operations;
the channel attention mechanism (channel attention) and the spatial attention mechanism (spatial attention) set in these operations are used in series to form the CBAM attention mechanism;
layer1-4 are the BasicBlock residual blocks of the ResNet residual network when the ResNet-18 network is used.
5. The lane line detection method based on the position selection and classification method in the row direction according to claim 1, wherein the step four in the test section is specifically as follows:
upsampling through torch.nn.Conv2d and processing by the fully connected FC layer are carried out in sequence;
detecting each divided row and the feature blocks on each row with the position selection and classification method in the row direction, and detecting whether each feature block contains a lane line; lane line detection is thus converted into selecting specific row anchors on predefined rows. The detected picture is resized to 288 x 800; the height and width of the picture are H and W respectively, the number of predefined rows is h, the number of feature blocks divided on each row is w, and the number of lanes in the image is C;
the prediction of the lanes P_{i,j,:} is shown in formula (7):
P_{i,j,:} = f^{i,j}(X),\ \mathrm{s.t.}\ i \in [1, C],\ j \in [1, h]    (7)
the classification loss Lcls at this module is shown in equation (8):
L_{cls} = \sum_{i=1}^{C} \sum_{j=1}^{h} L_{CE}\left( P_{i,j,:}, T_{i,j,:} \right)    (8)
wherein i ranges over the number of lanes and j ranges over the number of rows set by the row anchors; at this time the computation for the segmentation approach is H x W x (C + 1), while for the position selection and classification method based on the row direction it is h x (w + 1) x C, and because of the row anchor mechanism h and w are much smaller than H and W.
CN202210897544.7A 2022-07-28 2022-07-28 Lane line detection method based on position selection and classification method in row direction Active CN115294548B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210897544.7A CN115294548B (en) 2022-07-28 2022-07-28 Lane line detection method based on position selection and classification method in row direction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210897544.7A CN115294548B (en) 2022-07-28 2022-07-28 Lane line detection method based on position selection and classification method in row direction

Publications (2)

Publication Number Publication Date
CN115294548A CN115294548A (en) 2022-11-04
CN115294548B true CN115294548B (en) 2023-05-02

Family

ID=83823793

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210897544.7A Active CN115294548B (en) 2022-07-28 2022-07-28 Lane line detection method based on position selection and classification method in row direction

Country Status (1)

Country Link
CN (1) CN115294548B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115861951B (en) * 2022-11-27 2023-06-09 石家庄铁道大学 Complex environment lane line accurate detection method based on dual-feature extraction network

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114463715A (en) * 2021-12-27 2022-05-10 江苏航天大为科技股份有限公司 Lane line detection method
CN114743126A (en) * 2022-03-09 2022-07-12 上海瀚所信息技术有限公司 Lane line sign segmentation method based on graph attention machine mechanism network

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110395257B (en) * 2018-04-20 2021-04-23 北京图森未来科技有限公司 Lane line example detection method and device and automatic driving vehicle
CN111242037B (en) * 2020-01-15 2023-03-21 华南理工大学 Lane line detection method based on structural information
CN112528878B (en) * 2020-12-15 2024-01-09 中国科学院深圳先进技术研究院 Method and device for detecting lane line, terminal equipment and readable storage medium
CN113313031B (en) * 2021-05-31 2022-04-22 南京航空航天大学 Deep learning-based lane line detection and vehicle transverse positioning method
CN113468967B (en) * 2021-06-02 2023-08-18 北京邮电大学 Attention mechanism-based lane line detection method, attention mechanism-based lane line detection device, attention mechanism-based lane line detection equipment and attention mechanism-based lane line detection medium
CN113822149A (en) * 2021-08-06 2021-12-21 武汉卓目科技有限公司 Emergency lane visual detection method and system based on view angle of unmanned aerial vehicle
CN113902915B (en) * 2021-10-12 2024-06-11 江苏大学 Semantic segmentation method and system based on low-light complex road scene
CN114429621A (en) * 2021-12-27 2022-05-03 南京信息工程大学 UFSA algorithm-based improved lane line intelligent detection method
CN114463721A (en) * 2022-01-30 2022-05-10 哈尔滨理工大学 Lane line detection method based on spatial feature interaction
CN114550118B (en) * 2022-02-23 2023-07-11 烟台大学 Full-automatic intelligent highway marking method based on video image driving
CN114677560A (en) * 2022-03-21 2022-06-28 国科温州研究院(温州生物材料与工程研究所) Deep learning algorithm-based lane line detection method and computer system

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114463715A (en) * 2021-12-27 2022-05-10 江苏航天大为科技股份有限公司 Lane line detection method
CN114743126A (en) * 2022-03-09 2022-07-12 上海瀚所信息技术有限公司 Lane line sign segmentation method based on graph attention machine mechanism network

Also Published As

Publication number Publication date
CN115294548A (en) 2022-11-04

Similar Documents

Publication Publication Date Title
CN112200161B (en) Face recognition detection method based on mixed attention mechanism
CN112183203B (en) Real-time traffic sign detection method based on multi-scale pixel feature fusion
CN108520238B (en) Scene prediction method of night vision image based on depth prediction coding network
EP3690741A2 (en) Method for automatically evaluating labeling reliability of training images for use in deep learning network to analyze images, and reliability-evaluating device using the same
CN111008633B (en) License plate character segmentation method based on attention mechanism
CN111914838B (en) License plate recognition method based on text line recognition
CN110717493B (en) License plate recognition method containing stacked characters based on deep learning
CN111414807A (en) Tidal water identification and crisis early warning method based on YO L O technology
CN112861619A (en) Model training method, lane line detection method, equipment and device
CN112990065A (en) Optimized YOLOv5 model-based vehicle classification detection method
CN114913498A (en) Parallel multi-scale feature aggregation lane line detection method based on key point estimation
CN115294548B (en) Lane line detection method based on position selection and classification method in row direction
CN113129336A (en) End-to-end multi-vehicle tracking method, system and computer readable medium
CN114596548A (en) Target detection method, target detection device, computer equipment and computer-readable storage medium
CN114495050A (en) Multitask integrated detection method for automatic driving forward vision detection
CN117612136A (en) Automatic driving target detection method based on increment small sample learning
CN116630917A (en) Lane line detection method
CN115035429A (en) Aerial photography target detection method based on composite backbone network and multiple measuring heads
CN117809289B (en) Pedestrian detection method for traffic scene
CN113313091B (en) Density estimation method based on multiple attention and topological constraints under warehouse logistics
CN116486203B (en) Single-target tracking method based on twin network and online template updating
CN115272814B (en) Long-distance space self-adaptive multi-scale small target detection method
CN116524203B (en) Vehicle target detection method based on attention and bidirectional weighting feature fusion
CN117392392B (en) Rubber cutting line identification and generation method
CN114863122B (en) Intelligent high-precision pavement disease identification method based on artificial intelligence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant