CN117011819A - Lane line detection method, device and equipment based on feature guidance attention - Google Patents

Lane line detection method, device and equipment based on feature guidance attention

Info

Publication number
CN117011819A
CN117011819A (application CN202310992383.4A)
Authority
CN
China
Prior art keywords
feature
lane line
attention
map
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310992383.4A
Other languages
Chinese (zh)
Inventor
刘登峰
郭文静
陈世海
郭虓赫
朱烁
王然
柴志雷
吴秦
陈璟
周浩杰
王宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangnan University
Original Assignee
Jiangnan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangnan University filed Critical Jiangnan University
Priority to CN202310992383.4A priority Critical patent/CN117011819A/en
Publication of CN117011819A publication Critical patent/CN117011819A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/42Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the technical field of lane line detection, in particular to a lane line detection method, device, equipment and storage medium based on feature-guided attention. The lane line detection method comprises the following steps: using a feature residual neural network as the backbone network of a lane line detection model, extracting lane line features from a picture containing lane lines, and outputting multi-size feature maps; using a balanced feature pyramid network as the cross-scale feature fusion module of the lane line detection model, generating, for each feature size produced by the feature residual neural network, the sizes corresponding to the pyramid levels through up- and down-sampling operations, and fusing feature maps of the same size; and using ROIGather as the detection module of the lane line detection model, iteratively updating preset lane lines, and outputting the finally detected lane lines. By concentrating analysis on key regions, the method improves detection speed while maintaining accuracy.

Description

Lane line detection method, device and equipment based on feature guidance attention
Technical Field
The application relates to the technical field of lane line detection, in particular to a lane line detection method, device and equipment based on feature-guided attention, and a readable storage medium.
Background
Lane lines are important cues that constrain vehicles to travel on roads, and they play a vital role in modern advanced driver-assistance systems and automatic driving systems. The goal of a lane line detection system is to help intelligent vehicles localize better and drive more safely.
In actual lane line detection there are many adverse scenes, such as bad weather, dim or dazzling light, and lane lines occluded by other vehicles. Although a human can easily infer the position of a lane line from context and fill in the occluded portion, it is difficult for a lane line detection system to distinguish the lane line from the surrounding environment without high-level semantics and global context information. The prior art proposes a message-passing mechanism to collect global context information, but this method performs pixel-by-pixel prediction and can hardly meet real-time requirements.
Moreover, existing line-anchor-based lane line detection methods do not reuse local and global features, which easily causes missed detections and false detections. Methods that model the local geometry of lane lines and integrate it into global features easily mistake road markings for lane lines, causing false detections; methods that build a fully connected layer on global features to detect lanes easily lead to inaccurate localization of the predicted lanes.
Disclosure of Invention
Therefore, the technical problem to be solved by the application is that prior-art lane line detection methods are inaccurate in severe scenes and can hardly meet real-time requirements.
In order to solve the technical problems, the application provides a lane line detection method based on feature-guided attention, which comprises the following specific steps:
using a feature residual neural network as the backbone network of a lane line detection model, extracting lane line features from a picture containing lane lines, and outputting multi-size feature maps;
the feature residual neural network comprises basic residual blocks and basic blocks composed of feature-guided attention; the basic blocks composed of feature-guided attention adopt skip connections and comprise several convolution blocks, batch normalization layers, rectified linear unit (ReLU) activation layers and the feature-guided attention; the feature-guided attention is constructed on the basis of a convolutional attention module: the input feature map is processed by the convolutional attention module to output a spatial refinement map, and the input feature map and the spatial refinement map undergo a channel shuffle operation to obtain the final output refined feature map;
using a balanced feature pyramid network as the cross-scale feature fusion module of the lane line detection model, generating, for each feature size produced by the feature residual neural network, the sizes corresponding to the pyramid levels through up- and down-sampling operations, and fusing feature maps of the same size;
and using ROIGather as the detection module of the lane line detection model, inputting the multi-size feature maps fused by the balanced feature pyramid network into the ROIGather detection module, iteratively updating preset lane lines, and outputting the finally detected lane lines.
In one embodiment of the present application, using the feature residual neural network as the backbone network of the lane line detection model to extract lane line features from the picture containing lane lines and output multi-size feature maps comprises:
sequentially passing the lane line image to be detected through a convolution layer, a batch normalization layer, a ReLU activation function, a convolution layer and a batch normalization layer, the output serving as the input feature map of the feature-guided attention;
the feature-guided attention processing the input feature map through the convolutional attention module to output a spatial refinement map, the input feature map and the spatial refinement map undergoing a channel shuffle operation to obtain the final output refined feature map;
and fusing the lane line image to be detected with the refined feature map output by the feature-guided attention through a skip connection, then outputting the multi-size feature maps through a ReLU activation function.
In one embodiment of the present application, the spatial refinement map is used as a guide, and the spatial refinement map and the input feature map are rearranged in an alternating manner through a channel shuffle operation to obtain the final output refined feature map, according to the formula:
F_out = σ(GC_7×7(CS([F_in, F_s])))
wherein F_out is the final output refined feature map, σ denotes the sigmoid operation, CS(·) denotes the channel shuffle operation, GC_7×7(·) denotes a group convolution layer with kernel size 7×7, F_in is the input feature map, and F_s is the spatial refinement map.
In one embodiment of the application, the spatial refinement map F_s is obtained by applying 2D spatial attention to the channel refinement map F_c, with the specific formula:
F_s = F_SRM ⊗ F_c
wherein F_SRM is the spatial attention map and ⊗ denotes element-wise multiplication;
the channel refinement map F_c is obtained by applying 1D channel attention to the input feature map F_in, with the specific formula:
F_c = F_CRM ⊗ F_in
wherein F_CRM is the channel attention map;
the channel attention map F_CRM has the specific formula:
F_CRM = σ(MLP(F^c_avg) + MLP(F^c_max)) = σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max)))
the spatial attention map F_SRM has the specific formula:
F_SRM = σ(Conv_7×7([F^s_avg; F^s_max]))
wherein MLP is a multi-layer perceptron whose hidden layer size is R^{C/r×1×1}, C is the number of channels in the feature map, r is the reduction ratio, W_1 and W_0 are the weights of the multi-layer perceptron, F^c_avg and F^c_max are the channel descriptors obtained by global average pooling and global max pooling over the spatial dimensions, and F^s_avg and F^s_max are the 2D maps obtained by average pooling and max pooling along the channel dimension.
In one embodiment of the present application, the balanced feature pyramid network is used as the cross-scale feature fusion module; for each feature size generated by the feature residual neural network, the sizes corresponding to the pyramid levels are generated through up- and down-sampling operations, the up-sampling operation using dilated (hole) convolution and the down-sampling using convolution kernels with stride.
In one embodiment of the present application, the specific method of inputting the multi-size feature maps fused by the balanced feature pyramid network into the ROIGather detection module, iteratively updating the preset lane lines, and outputting the finally detected lane lines comprises:
after the preset lane lines are assigned to the smallest-size feature map output by the balanced feature pyramid network, each preset lane line having N points; uniformly sampling N_p points from each preset lane line, computing their exact feature values by bilinear interpolation, and zero-padding features beyond the picture range, to obtain the ROI feature X_p ∈ R^{C×N_p} of each preset lane line, wherein C is the number of channels in the feature map;
performing a 9×9 one-dimensional convolution on the ROI features of each preset lane line to collect the features near each pixel of each channel;
further extracting with a fully connected layer to obtain the feature X'_p ∈ R^{C×1} of each preset lane line;
resizing the global feature map X_f ∈ R^{C×H×W} to the same size as the smallest-size feature map output by the balanced feature pyramid network and flattening it to X'_f ∈ R^{C×HW};
establishing the relation between the further-extracted preset lane line feature X'_p and the global feature map X'_f to obtain the attention matrix W between the preset lane line features and the global feature map;
and computing the aggregated feature G through the attention matrix W, adding the aggregated feature G to the preset lane line feature X'_p, assigning the new preset lane lines to the next-level feature map output by the balanced feature pyramid network, and cycling in turn until all feature maps output by the balanced feature pyramid network have been used.
In one embodiment of the application, the attention matrix W between the preset lane line feature X'_p and the global feature map X'_f is:
W = f((X'_p)ᵀ X'_f / √C)
wherein f is the softmax normalization function and C is the number of channels in the feature map;
the aggregated feature is G = W (X'_f)ᵀ.
The application also provides a lane line detection device based on feature-guided attention, which comprises:
the feature extraction module, which uses a feature residual neural network as the backbone network of the lane line detection model to extract lane line features from a picture containing lane lines and output multi-size feature maps; the feature residual neural network comprises basic residual blocks and basic blocks composed of feature-guided attention; the basic blocks composed of feature-guided attention adopt skip connections and comprise several convolution blocks, batch normalization layers, ReLU activation layers and the feature-guided attention; the feature-guided attention is constructed on the basis of a convolutional attention module: the input feature map is processed by the convolutional attention module to output a spatial refinement map, and the input feature map and the spatial refinement map undergo a channel shuffle operation to obtain the final output refined feature map;
the feature fusion module, which uses a balanced feature pyramid network as the cross-scale feature fusion module of the lane line detection model, generates, for each feature size produced by the feature residual neural network, the sizes corresponding to the pyramid levels through up- and down-sampling operations, and fuses feature maps of the same size;
and the lane line detection module, which uses ROIGather as the detection module of the lane line detection model, inputs the multi-size feature maps fused by the balanced feature pyramid network into the ROIGather detection module, iteratively updates the preset lane lines, and outputs the finally detected lane lines.
The application also provides lane line detection equipment based on feature-guided attention, comprising:
the camera acquisition unit is used for acquiring lane line images;
a memory for storing a computer program;
and the processor, which is used for processing the lane line images acquired by the camera acquisition unit, implementing the steps of the above feature-guided-attention-based lane line detection method when executing the computer program, and outputting the finally detected lane lines.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above-described feature-guided-attention-based lane line detection method.
Compared with the prior art, the technical scheme of the application has the following advantages:
according to the feature-based attention-guiding lane line detection method, the lane is roughly positioned by utilizing the global features, the global features are refined, so that fine local features are obtained, more accurate position and high-precision detection results are obtained, and the detection speed is improved while the precision is ensured by intensively analyzing key areas. The method has higher accuracy in the task of detecting the lane lines, particularly in a severe environment, has higher detection speed compared with a pixel-by-pixel prediction method, and can meet the requirement of instantaneity.
The basic blocks composed of feature-guided attention described in the present application focus more on lane line pixels and on the more important channel information. The feature-guided attention generates the refined feature map in a coarse-to-fine manner, and the input feature map and the spatial refinement map undergo a channel shuffle operation to obtain the final output refined feature map, so that information is fully exchanged between the spatial attention features and the channel attention features and the important information in the features is emphasized.
The balanced feature pyramid network makes full use of global and local features, solves the loss of detail information caused by high-level features gradually fading during the fusion process, achieves balanced feature fusion across levels, and reduces the false detection and missed detection rates of lane line detection.
Drawings
In order that the application may be more readily understood, a more particular description of the application will be rendered by reference to specific embodiments thereof that are illustrated in the appended drawings, in which
FIG. 1 is an overall architecture diagram of the feature-guided-attention-based lane line detection method provided by the present application;
FIG. 2 is a structural diagram of the basic block composed of feature-guided attention provided by the present application;
FIG. 3 is a structural diagram of the feature-guided attention provided by the present application;
FIG. 4 is a structural diagram of the balanced feature pyramid network provided by the present application;
FIG. 5 shows visualization results on the CULane and TuSimple datasets provided in this embodiment.
Detailed Description
The present application will be further described with reference to the accompanying drawings and specific examples, which are not intended to be limiting, so that those skilled in the art will better understand the application and practice it.
Referring to FIG. 1, the present application provides a lane line detection method based on feature-guided attention, whose input is a picture I ∈ R^{3×H×W} containing lane lines. The method specifically comprises the following steps:
and the characteristic residual neural network is used as a backbone network of a lane line detection model (FGANet), and is used for carrying out lane line characteristic extraction on pictures containing lane lines and outputting a multi-size characteristic diagram. Wherein the characteristic residual neural network comprises a basic residual block and a basic block (FGAB) of characteristic directing attention composition.
Referring to FIG. 2, the present application uses the basic block composed of feature-guided attention as the building block of the feature residual neural network: the lane line image to be detected is sequentially passed through a convolution layer, a batch normalization layer, a ReLU activation function, a convolution layer and a batch normalization layer, and the output serves as the input feature map of the feature-guided attention (FGA).
The lane line image to be detected and the refined feature map output by the feature-guided attention are fused through a skip connection, and the multi-size feature maps are output through a ReLU activation function. The skip connection keeps the lane line detection model simple enough, with fewer parameters and faster processing, so that it can be deployed on real-time devices; it also helps the deep learning model mitigate the vanishing-gradient problem and prevents the over-fitting caused by excessive network depth.
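By way of illustration only, the basic block described above may be sketched in PyTorch as follows. This is a minimal rendering under stated assumptions — the 3×3 kernel size, channel-preserving convolutions and identity skip connection are not fixed by the application — and not the authoritative implementation:

```python
import torch.nn as nn

class FGABasicBlock(nn.Module):
    """Sketch of the feature-guided-attention basic block (FGAB):
    conv-BN-ReLU-conv-BN, then FGA, fused with the input via a skip
    connection and passed through a final ReLU."""

    def __init__(self, channels, fga):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),  # assumed 3x3
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.fga = fga                      # feature-guided attention module
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.fga(self.body(x))        # refined feature map from the FGA
        return self.relu(out + x)           # skip connection, then ReLU
```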
The feature-guided attention is optimized on the basis of a convolutional attention module, which consists of channel attention and spatial attention placed in sequence to compute attention weights along the channel and spatial dimensions. Channel attention computes channel-wise vectors using both average pooling and max pooling operations to generate a channel attention map F_CRM ∈ R^{C×1×1}; multiplying this recalibration map element-wise with the input feature map greatly improves the representation capability of the network. Spatial attention applies average pooling and max pooling operations along the channel axis and applies a convolution layer to generate a spatial attention map F_SRM ∈ R^{1×H×W}, which adaptively represents the importance of different regions and is multiplied element-wise with the channel-refined features.
The main purpose of the attention mechanism is to focus more on the road portion of the image while ignoring other objects such as sky, trees, pedestrians and utility poles. However, the spatial attention inside the convolutional attention module only addresses focusing on the road at the image level, while focusing on the road at the feature level is ignored. The channel attention inside the convolutional attention module models channel differences without considering context information; as the feature channels expand, information about other objects is encoded into the feature map, meaning that for each feature channel this other-object information is unevenly distributed along the spatial dimension. Furthermore, another problem with the convolutional attention module is that the spatial attention and channel attention features are computed sequentially, with insufficient information exchange between them.
In order to solve the above problems, the present application proposes feature-guided attention: the input feature map F_in ∈ R^{C×H×W} is processed by the convolutional attention module to output a spatial refinement map, and the input feature map and the spatial refinement map undergo a channel shuffle operation to obtain the final output refined feature map, which has the same dimensions as the input feature map.
The feature-guided attention first applies 1D channel attention to the input feature map F_in ∈ R^{C×H×W} to find the important channels and generate the channel refinement map F_c:
F_c = F_CRM ⊗ F_in
wherein the channel attention map F_CRM has the specific formula:
F_CRM = σ(MLP(F^c_avg) + MLP(F^c_max)) = σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max)))
2D spatial attention is then applied to the channel refinement map F_c to generate the spatial refinement map F_s:
F_s = F_SRM ⊗ F_c
wherein the spatial attention map F_SRM has the specific formula:
F_SRM = σ(Conv_7×7([F^s_avg; F^s_max]))
wherein MLP is a multi-layer perceptron whose hidden layer size is R^{C/r×1×1} in order to reduce the number of parameters and limit the complexity of the lane line detection model, C is the number of channels in the feature map, and r is the reduction ratio. W_1 and W_0 are the weights of the multi-layer perceptron, with W_0 ∈ R^{C/r×C} and W_1 ∈ R^{C×C/r}. F^c_avg and F^c_max are the channel descriptors obtained by global average pooling and global max pooling over the spatial dimensions, and F^s_avg and F^s_max are the 2D maps obtained by average pooling and max pooling along the channel dimension.
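By way of illustration only, the channel and spatial attention described above may be sketched as follows; the reduction ratio r = 16 is an assumption, while the pooling pairs, shared MLP and 7×7 convolution follow the formulas above:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """F_CRM = sigmoid(MLP(F^c_avg) + MLP(F^c_max)), output shape C x 1 x 1."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(           # shared weights W_0 then W_1
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False),
        )

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))  # F^c_avg
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))   # F^c_max
        return torch.sigmoid(avg + mx)

class SpatialAttention(nn.Module):
    """F_SRM = sigmoid(Conv_7x7([F^s_avg; F^s_max])), output shape 1 x H x W."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, 7, padding=3, bias=False)

    def forward(self, x):
        avg = torch.mean(x, dim=1, keepdim=True)                  # F^s_avg
        mx, _ = torch.max(x, dim=1, keepdim=True)                 # F^s_max
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class ConvAttention(nn.Module):
    """Channel then spatial refinement: F_c = F_CRM * F_in, F_s = F_SRM * F_c."""

    def __init__(self, channels, reduction=16):
        super().__init__()
        self.ca = ChannelAttention(channels, reduction)
        self.sa = SpatialAttention()

    def forward(self, f_in):
        f_c = self.ca(f_in) * f_in          # channel refinement map F_c
        return self.sa(f_c) * f_c           # spatial refinement map F_s
```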
Finally, the spatial refinement map F_s is used as a guide to generate the final feature map: the spatial refinement map F_s and the input feature map F_in are rearranged in an alternating manner through a channel shuffle operation to obtain the final output refined feature map, guaranteeing information interaction. The specific formula is:
F_out = σ(GC_7×7(CS([F_in, F_s])))
wherein F_out is the final output refined feature map, σ denotes the sigmoid operation, CS(·) denotes the channel shuffle operation, GC_7×7 denotes a group convolution layer with kernel size 7×7, F_in is the input feature map, and F_s is the spatial refinement map.
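By way of illustration only, this shuffle-and-group-convolution step may be sketched as follows; the group count of the GC_7×7 layer and the reduction of the shuffled 2C channels back to C are assumptions, and the ConvAttention module is the sketch above:

```python
import torch
import torch.nn as nn

def channel_shuffle(x, groups=2):
    """Interleave the channels of the two concatenated maps (the CS operation)."""
    b, c, h, w = x.shape
    x = x.view(b, groups, c // groups, h, w).transpose(1, 2).contiguous()
    return x.view(b, c, h, w)

class FeatureGuidedAttention(nn.Module):
    """Sketch of F_out = sigmoid(GC_7x7(CS([F_in, F_s])))."""

    def __init__(self, channels, conv_attention):
        super().__init__()
        self.cam = conv_attention   # produces the spatial refinement map F_s
        # group convolution over the shuffled 2C channels, kernel 7x7;
        # one group per channel pair (F_in_i, F_s_i) is an assumption
        self.gc = nn.Conv2d(2 * channels, channels, 7, padding=3,
                            groups=channels, bias=False)

    def forward(self, f_in):
        f_s = self.cam(f_in)                 # spatial refinement map F_s
        x = torch.cat([f_in, f_s], dim=1)    # [F_in, F_s]
        x = channel_shuffle(x, groups=2)     # CS(.), alternating channels
        return torch.sigmoid(self.gc(x))     # sigma(GC_7x7(.))
```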
Referring to FIG. 4, the balanced feature pyramid network is used as the cross-scale feature fusion module of the lane line detection model: for each feature size generated by the feature residual neural network, the sizes corresponding to the pyramid levels are generated through up- and down-sampling operations, and feature maps of the same size are fused.
One of the difficulties in lane line detection is how to efficiently represent and process multi-scale features. Current deep-learning-based lane line detection models already use a feature pyramid network as the neck module, detecting large objects in low-resolution pyramid feature maps and small objects in high-resolution pyramid feature maps, in order to handle multi-scale objects in an image effectively. The insight that higher-level neurons respond strongly to whole objects while other neurons are more likely to be activated by local textures and patterns suggests the importance of high-level features and the necessity of propagating semantically strong features. However, conventional top-down feature pyramid networks are limited by unidirectional information flow. To address this, existing lane line detection models add an extra bottom-up path aggregation network on top of the feature pyramid network.
In order to solve the problem that a conventional feature pyramid network, when acquiring multi-scale features, gradually fades the useful information in high-level features during the fusion process and loses detail information, the application proposes a balanced feature pyramid network based on the multi-scale feature network as the cross-scale feature fusion module of the lane line detection model, retaining the useful and detail information in high-level features so as to reduce the influence of scale differences on model performance.
The balanced feature pyramid network is used as the cross-scale feature fusion module: for each feature size generated by the feature residual neural network, the sizes corresponding to the pyramid levels are generated through up- and down-sampling operations; the up-sampling operation uses dilated (hole) convolution, and the down-sampling uses convolution kernels with stride. Compared with ordinary convolution, dilated convolution enlarges the receptive field without increasing the number of network parameters.
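By way of illustration only, the two resampling branches may be sketched as follows; the 3×3 kernels, the dilation rate of 2 and the bilinear resize are assumptions, the text fixing only that up-sampling uses dilated convolution and down-sampling uses a convolution kernel with stride:

```python
import torch.nn as nn
import torch.nn.functional as F

class DilatedUpsample(nn.Module):
    """Up-sampling branch: bilinear resize followed by a dilated 3x3
    convolution, enlarging the receptive field at no extra parameter cost
    relative to an ordinary 3x3 convolution."""

    def __init__(self, channels, dilation=2):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=dilation,
                              dilation=dilation, bias=False)

    def forward(self, x):
        x = F.interpolate(x, scale_factor=2, mode="bilinear",
                          align_corners=False)
        return self.conv(x)

class StridedDownsample(nn.Module):
    """Down-sampling branch: 3x3 convolution with stride 2."""

    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, stride=2,
                              padding=1, bias=False)

    def forward(self, x):
        return self.conv(x)
```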
For some extreme cases in which there is no local visual evidence of a lane, it is necessary to look at the features near the pixel, i.e. contextual features, in order to determine whether the current pixel belongs to a lane. Therefore, the application adopts ROIGather as the detection module of the lane line detection model, inputs the multi-size feature maps fused by the balanced feature pyramid network into the ROIGather detection module, iteratively updates the preset lane lines, and outputs the finally detected lane lines. The specific steps are as follows:
After the preset lane lines are assigned to the smallest-size feature map output by the balanced feature pyramid network, each preset lane line has N points. N_p points are uniformly sampled from each preset lane line, their exact feature values are computed by bilinear interpolation, and features beyond the picture range are zero-padded, giving the ROI feature X_p ∈ R^{C×N_p} of each preset lane line, where C is the number of channels in the feature map.
A 9×9 one-dimensional convolution is performed on the ROI features of each preset lane line to collect the features near each pixel of each channel.
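By way of illustration only, the point sampling and 9×9 feature collection may be sketched as follows; the normalized-coordinate convention and tensor layouts are assumptions, while the bilinear interpolation and zero padding for out-of-image points follow the text:

```python
import torch.nn.functional as F

def gather_roi_features(feat, points, conv1d_9x1):
    """Sample lane-prior points bilinearly, then mix nearby features.

    feat:       (B, C, H, W) smallest-size pyramid feature map
    points:     (B, L, Np, 2) sampling locations normalized to [-1, 1];
                out-of-image points are zero-padded by grid_sample
    conv1d_9x1: nn.Conv1d(C, C, kernel_size=9, padding=4)
    returns     (B, L, C, Np) ROI feature X_p per lane prior
    """
    b, c = feat.shape[:2]
    l, n_p = points.shape[1], points.shape[2]
    grid = points.view(b, l * n_p, 1, 2)
    roi = F.grid_sample(feat, grid, mode="bilinear",
                        padding_mode="zeros", align_corners=True)
    roi = roi.view(b, c, l, n_p).permute(0, 2, 1, 3)   # (B, L, C, Np)
    roi = conv1d_9x1(roi.reshape(b * l, c, n_p))       # gather neighbor features
    return roi.view(b, l, c, n_p)
```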
A fully connected layer is then used for further extraction to obtain the feature X'_p ∈ R^{C×1} of each preset lane line.
The global feature map X_f ∈ R^{C×H×W} is resized to the same size as the smallest-size feature map output by the balanced feature pyramid network and flattened to X'_f ∈ R^{C×HW}.
The relation between the further-extracted preset lane line feature X'_p and the global feature map X'_f is established to obtain the attention matrix W between the preset lane line features and the global feature map:
W = f((X'_p)ᵀ X'_f / √C)
wherein f is the softmax normalization function and C is the number of channels in the feature map.
The aggregated feature G = W (X'_f)ᵀ is computed through the attention matrix W; the aggregated feature G is added to the preset lane line feature X'_p, the new preset lane lines are assigned to the next-level feature map output by the balanced feature pyramid network, and this cycle repeats until all feature maps output by the balanced feature pyramid network have been used.
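By way of illustration only, one such update may be sketched as follows; the tensor layouts are assumptions, while the scaled softmax attention and the additive update follow the formulas of this section:

```python
import math
import torch.nn.functional as F

def roi_gather_step(x_p, x_f):
    """One ROIGather update of the lane prior features.

    x_p: (L, C) lane prior features X'_p after the fully connected layer
    x_f: (C, HW) flattened global feature map X'_f
    returns (L, C) updated features X'_p + G
    """
    c = x_p.shape[1]
    w = F.softmax(x_p @ x_f / math.sqrt(c), dim=-1)   # attention W, (L, HW)
    g = w @ x_f.t()                                   # aggregated feature G
    return x_p + g
```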
The loss function of the lane line detection model comprises classification loss and regression loss; the total loss function is defined as:
L_total = w_cls·L_cls + w_xytl·L_xytl + w_LIoU·L_LIoU
wherein L_cls is the focal loss between predictions and labels, L_xytl is the smooth-L1 loss for regressing the start point coordinates, the angle θ and the lane line length, and L_LIoU is the line IoU loss between the predicted lane and the ground truth. The hyperparameters w_cls, w_xytl and w_LIoU are set to 6.0, 0.5 and 2.0, respectively.
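By way of illustration only, the weighted combination, together with one plausible form of the line IoU term, may be sketched as follows; the widening radius and the exact LIoU formulation are assumptions, only the weights 6.0, 0.5 and 2.0 coming from the text:

```python
import torch

def line_iou(pred_x, gt_x, radius=7.5):
    """Line IoU: widen each x-coordinate into [x - radius, x + radius] on its
    image row; overlap may go negative, penalizing lanes that drift apart."""
    ovr = (torch.min(pred_x + radius, gt_x + radius)
           - torch.max(pred_x - radius, gt_x - radius))
    union = (torch.max(pred_x + radius, gt_x + radius)
             - torch.min(pred_x - radius, gt_x - radius))
    return ovr.sum(dim=-1) / union.sum(dim=-1).clamp(min=1e-9)

def total_loss(l_cls, l_xytl, pred_x, gt_x,
               w_cls=6.0, w_xytl=0.5, w_liou=2.0):
    """L_total = w_cls*L_cls + w_xytl*L_xytl + w_LIoU*L_LIoU."""
    l_liou = (1 - line_iou(pred_x, gt_x)).mean()
    return w_cls * l_cls + w_xytl * l_xytl + w_liou * l_liou
```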
In order to broadly evaluate the proposed method, this embodiment performs experiments on two widely used lane detection benchmark datasets, CULane and TuSimple. CULane is one of the widely used large-scale lane detection datasets and one of the most complex, containing nine challenging scenarios such as congestion, night and no lane lines. The TuSimple lane detection benchmark is also one of the most widely used datasets in lane detection; it was collected on highways under steady illumination conditions. Details of the CULane and TuSimple datasets are shown in Table 1.
Dataset    Train    Val.    Test     Road type          Resolution    Scenarios
CULane     88.9K    9.7K    34.7K    Urban & Highway    1640×590      9
TuSimple   3.3K     0.4K    2.8K     Highway            1280×720      1
Table 1: details of the CULane and TuSimple datasets
In this embodiment, the feature residual neural network is used as the backbone network of the lane line detection model, and all input images are resized to 320×800. For optimization, this embodiment uses the AdamW optimizer with an initial learning rate of 1×10^-3 and a cosine-decay learning-rate schedule. This embodiment trains 70 epochs for CULane and 300 epochs for TuSimple; the large difference follows from the large difference in dataset sizes. For data augmentation, this embodiment uses random affine transformations (translation, rotation and scaling) and random horizontal flipping. The network of this embodiment is implemented in PyTorch, and all experiments are run on 2 GPUs. All experimental results were computed on machines with an Intel(R) Xeon(R) Silver 4110 CPU and an RTX 2080 Ti GPU.
For the CULane dataset, mF1 is used as the metric: the intersection over union (IoU) between real and predicted lane lines is calculated, and detection results are then classified into true positives (TP), false positives (FP) and false negatives (FN) according to a preset threshold.
For the TuSimple dataset, three official metrics are adopted: accuracy (Acc), false positives (FP) and false negatives (FN). The evaluation formulas are:
Acc = Σ C_clip / Σ S_clip
wherein C_clip is the number of correctly predicted lane points of an image and S_clip is the number of ground-truth points of the image. A predicted lane is considered correct if more than 85% of its predicted lane points lie within 20 pixels of the ground-truth points.
FP = F_pred / N_pred, FN = M_pred / N_gt
wherein F_pred is the number of incorrectly predicted lane lines, N_pred is the number of predicted lane lines, M_pred is the number of real lane lines that are not correctly predicted, and N_gt is the number of real lane lines.
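By way of illustration only, the three metrics reduce to the following sketch, with variable names mirroring the formulas above:

```python
def tusimple_metrics(c_clip, s_clip, f_pred, n_pred, m_pred, n_gt):
    """Acc = sum(C_clip)/sum(S_clip); FP = F_pred/N_pred; FN = M_pred/N_gt.

    c_clip, s_clip: per-image correctly predicted points and ground-truth points
    f_pred, n_pred: wrongly predicted lanes and total predicted lanes
    m_pred, n_gt:   missed ground-truth lanes and total ground-truth lanes
    """
    acc = sum(c_clip) / sum(s_clip)
    fp = f_pred / n_pred
    fn = m_pred / n_gt
    return acc, fp, fn
```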
Referring to FIG. 5, the first row shows visualization results on the CULane dataset and the second row shows visualization results on the TuSimple dataset; different lane instances are represented by different colors.
The results of the feature-guided-attention-based lane line detection method on the CULane dataset are shown in Table 2.
TABLE 2 results of the application on CULane dataset
Referring to Table 2, the proposed method achieves state-of-the-art results with an mF1 score of 52.88. In addition, the method achieves the best performance in eight of the nine scenarios and is robust to different scenes. For difficult cases such as curves and night, the method has significant advantages. Moreover, compared with the line-anchor-based method LaneATT-S, the method achieves a 2.18% mF1 improvement at similar efficiency. In most cases on the CULane dataset, the ResNet101 version of the method exceeds all previous methods.
The results of the feature-guided-attention-based lane line detection method on the TuSimple dataset are shown in Table 3.
TABLE 3 results of the application on Tusimple dataset
Referring to Table 3, the differences between methods on this dataset are relatively small because the amount of data is small and a single scenario dominates. The method achieves the best F1 score of 97.45 and a state-of-the-art Acc of 96.67.
To verify the effect of the different components of the proposed method, this embodiment performs qualitative and quantitative experiments on the TuSimple dataset. Experiments were conducted with the same training setup and different combinations of modules; Table 4 shows the quantitative results. Gradually adding feature-guided attention and the balanced feature pyramid network to the baseline, feature-guided attention raises F1 from 95.04 to 96.13, which verifies a significant improvement in localization accuracy. Furthermore, the balanced feature pyramid network raises F1 to 96.72. The consistent improvements in F1, Acc, FP and FN verify the effectiveness of the algorithm.
TABLE 4 results of the ablation experiments of the application on a Tusimple dataset
In summary, based on the evaluation results on the two lane detection benchmark datasets CULane and TuSimple, the method of the present application is superior to prior-art lane line detection methods.
The feature-guided-attention-based lane line detection method achieves higher accuracy in lane line detection tasks. The feature-guided attention proposed by the application emphasizes the more useful information encoded in the features, and the balanced feature pyramid network solves the loss of detail information caused by high-level features gradually fading during the fusion process, achieving balanced feature fusion across levels.
The application also provides a lane line detection device based on feature-guided attention, which comprises:
the feature extraction module, which uses a feature residual neural network as the backbone network of the lane line detection model to extract lane line features from a picture containing lane lines and output multi-size feature maps; the feature residual neural network comprises basic residual blocks and basic blocks composed of feature-guided attention; the basic blocks composed of feature-guided attention adopt skip connections and comprise several convolution blocks, batch normalization layers, ReLU activation layers and the feature-guided attention; the feature-guided attention is constructed on the basis of a convolutional attention module: the input feature map is processed by the convolutional attention module to output a spatial refinement map, and the input feature map and the spatial refinement map undergo a channel shuffle operation to obtain the final output refined feature map;
the feature fusion module, which uses a balanced feature pyramid network as the cross-scale feature fusion module of the lane line detection model, generates, for each feature size produced by the feature residual neural network, the sizes corresponding to the pyramid levels through up- and down-sampling operations, and fuses feature maps of the same size;
and the lane line detection module, which uses ROIGather as the detection module of the lane line detection model, inputs the multi-size feature maps fused by the balanced feature pyramid network into the ROIGather detection module, iteratively updates the preset lane lines, and outputs the finally detected lane lines.
The application also provides lane line detection equipment based on feature-guided attention, comprising:
the camera acquisition unit is used for acquiring lane line images;
a memory for storing a computer program;
and the processor, which is used for processing the lane line images acquired by the camera acquisition unit, implementing the steps of the above feature-guided-attention-based lane line detection method when executing the computer program, and outputting the finally detected lane lines.
The present application also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the above-described feature-guided-attention-based lane line detection method.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It is apparent that the above examples are given by way of illustration only and are not limiting of the embodiments. Other variations and modifications of the present application will be apparent to those of ordinary skill in the art in light of the foregoing description. It is not necessary here nor is it exhaustive of all embodiments. And obvious variations or modifications thereof are contemplated as falling within the scope of the present application.

Claims (10)

1. A lane line detection method based on feature-guided attention, characterized by comprising the following specific steps:
using a feature residual neural network as the backbone network of a lane line detection model, extracting lane line features from a picture containing lane lines, and outputting multi-size feature maps;
the feature residual neural network comprising basic residual blocks and basic blocks composed of feature-guided attention; the basic blocks composed of feature-guided attention adopting skip connections and comprising several convolution blocks, batch normalization layers, rectified linear unit (ReLU) activation layers and the feature-guided attention; the feature-guided attention being constructed on the basis of a convolutional attention module, the input feature map being processed by the convolutional attention module to output a spatial refinement map, and the input feature map and the spatial refinement map undergoing a channel shuffle operation to obtain the final output refined feature map;
using a balanced feature pyramid network as the cross-scale feature fusion module of the lane line detection model, generating, for each feature size produced by the feature residual neural network, the sizes corresponding to the pyramid levels through up- and down-sampling operations, and fusing feature maps of the same size;
and using ROIGather as the detection module of the lane line detection model, inputting the multi-size feature maps fused by the balanced feature pyramid network into the ROIGather detection module, iteratively updating preset lane lines, and outputting the finally detected lane lines.
2. The feature-guided-attention-based lane line detection method according to claim 1, wherein using the feature residual neural network as the backbone network of the lane line detection model to extract lane line features from the picture containing lane lines and output multi-size feature maps comprises:
sequentially passing the lane line image to be detected through a convolution layer, a batch normalization layer, a ReLU activation function, a convolution layer and a batch normalization layer, the output serving as the input feature map of the feature-guided attention;
the feature-guided attention processing the input feature map through the convolutional attention module to output a spatial refinement map, the input feature map and the spatial refinement map undergoing a channel shuffle operation to obtain the final output refined feature map;
and fusing the lane line image to be detected with the refined feature map output by the feature-guided attention through a skip connection, then outputting the multi-size feature maps through a ReLU activation function.
3. The feature-guided-attention-based lane line detection method according to claim 2, wherein the spatial refinement map is used as a guide, and the spatial refinement map and the input feature map are rearranged in an alternating manner through a channel shuffle operation to obtain the final output refined feature map, according to the formula:
F_out = σ(GC_7×7(CS([F_in, F_s])))
wherein F_out is the final output refined feature map, σ denotes the sigmoid operation, CS(·) denotes the channel shuffle operation, GC_7×7(·) denotes a group convolution layer with kernel size 7×7, F_in is the input feature map, and F_s is the spatial refinement map.
4. The feature-guided-attention-based lane line detection method according to claim 3, wherein the spatial refinement map F_s is obtained by applying 2D spatial attention to the channel refinement map F_c, with the specific formula:
F_s = F_SRM ⊗ F_c
wherein F_SRM is the spatial attention map and ⊗ denotes element-wise multiplication;
the channel refinement map F_c is obtained by applying 1D channel attention to the input feature map F_in, with the specific formula:
F_c = F_CRM ⊗ F_in
wherein F_CRM is the channel attention map;
the channel attention map F_CRM has the specific formula:
F_CRM = σ(MLP(F^c_avg) + MLP(F^c_max)) = σ(W_1(W_0(F^c_avg)) + W_1(W_0(F^c_max)))
the spatial attention map F_SRM has the specific formula:
F_SRM = σ(Conv_7×7([F^s_avg; F^s_max]))
wherein MLP is a multi-layer perceptron whose hidden layer size is R^{C/r×1×1}, C is the number of channels in the feature map, r is the reduction ratio, W_1 and W_0 are the weights of the multi-layer perceptron, F^c_avg and F^c_max are the channel descriptors obtained by global average pooling and global max pooling over the spatial dimensions, and F^s_avg and F^s_max are the 2D maps obtained by average pooling and max pooling along the channel dimension.
5. The feature-guided-attention-based lane line detection method according to claim 1, wherein the balanced feature pyramid network is used as the cross-scale feature fusion module; for each feature size generated by the feature residual neural network, the sizes corresponding to the pyramid levels are generated through up- and down-sampling operations, the up-sampling operation using dilated (hole) convolution and the down-sampling using convolution kernels with stride.
6. The feature-guided-attention-based lane line detection method according to claim 1, wherein the specific method of inputting the multi-size feature maps fused by the balanced feature pyramid network into the ROIGather detection module, iteratively updating the preset lane lines, and outputting the finally detected lane lines comprises:
after the preset lane lines are assigned to the smallest-size feature map output by the balanced feature pyramid network, each preset lane line having N points; uniformly sampling N_p points from each preset lane line, computing their exact feature values by bilinear interpolation, and zero-padding features beyond the picture range, to obtain the ROI feature X_p ∈ R^{C×N_p} of each preset lane line, wherein C is the number of channels in the feature map;
performing a 9×9 one-dimensional convolution on the ROI features of each preset lane line to collect the features near each pixel of each channel;
further extracting with a fully connected layer to obtain the feature X'_p ∈ R^{C×1} of each preset lane line;
resizing the global feature map X_f ∈ R^{C×H×W} to the same size as the smallest-size feature map output by the balanced feature pyramid network and flattening it to X'_f ∈ R^{C×HW};
establishing the relation between the further-extracted preset lane line feature X'_p and the global feature map X'_f to obtain the attention matrix W between the preset lane line features and the global feature map;
and computing the aggregated feature G through the attention matrix W, adding the aggregated feature G to the preset lane line feature X'_p, assigning the new preset lane lines to the next-level feature map output by the balanced feature pyramid network, and cycling in turn until all feature maps output by the balanced feature pyramid network have been used.
7. The feature-guided-attention-based lane line detection method according to claim 6, wherein the attention matrix W between the preset lane line feature X'_p and the global feature map X'_f is:
W = f((X'_p)ᵀ X'_f / √C)
wherein f is the softmax normalization function and C is the number of channels in the feature map;
the aggregated feature is G = W (X'_f)ᵀ.
8. A lane line detection device based on feature-guided attention, characterized by comprising:
the feature extraction module, which uses a feature residual neural network as the backbone network of the lane line detection model to extract lane line features from a picture containing lane lines and output multi-size feature maps; the feature residual neural network comprises basic residual blocks and basic blocks composed of feature-guided attention; the basic blocks composed of feature-guided attention adopt skip connections and comprise several convolution blocks, batch normalization layers, ReLU activation layers and the feature-guided attention; the feature-guided attention is constructed on the basis of a convolutional attention module: the input feature map is processed by the convolutional attention module to output a spatial refinement map, and the input feature map and the spatial refinement map undergo a channel shuffle operation to obtain the final output refined feature map;
the feature fusion module, which uses a balanced feature pyramid network as the cross-scale feature fusion module of the lane line detection model, generates, for each feature size produced by the feature residual neural network, the sizes corresponding to the pyramid levels through up- and down-sampling operations, and fuses feature maps of the same size;
and the lane line detection module, which uses ROIGather as the detection module of the lane line detection model, inputs the multi-size feature maps fused by the balanced feature pyramid network into the ROIGather detection module, iteratively updates the preset lane lines, and outputs the finally detected lane lines.
9. Lane line detection equipment based on feature-guided attention, characterized by comprising:
the camera acquisition unit, which is used for acquiring lane line images;
a memory for storing a computer program;
and a processor for processing the lane line images acquired by the camera acquisition unit, implementing the feature-guided-attention-based lane line detection method according to any one of claims 1 to 7 when executing the computer program, and outputting the finally detected lane lines.
10. A computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of the feature-guided-attention-based lane line detection method according to any one of claims 1 to 7.
CN202310992383.4A 2023-08-08 2023-08-08 Lane line detection method, device and equipment based on feature guidance attention Pending CN117011819A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310992383.4A CN117011819A (en) 2023-08-08 2023-08-08 Lane line detection method, device and equipment based on feature guidance attention

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310992383.4A CN117011819A (en) 2023-08-08 2023-08-08 Lane line detection method, device and equipment based on feature guidance attention

Publications (1)

Publication Number Publication Date
CN117011819A true CN117011819A (en) 2023-11-07

Family

ID=88575814

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310992383.4A Pending CN117011819A (en) 2023-08-08 2023-08-08 Lane line detection method, device and equipment based on feature guidance attention

Country Status (1)

Country Link
CN (1) CN117011819A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117789144A (en) * 2023-12-11 2024-03-29 深圳职业技术大学 Cross network lane line detection method and device based on weight fusion


Similar Documents

Publication Publication Date Title
CN107274445B (en) Image depth estimation method and system
CN113468967B (en) Attention mechanism-based lane line detection method, attention mechanism-based lane line detection device, attention mechanism-based lane line detection equipment and attention mechanism-based lane line detection medium
CN109753913B (en) Multi-mode video semantic segmentation method with high calculation efficiency
CN112132156B (en) Image saliency target detection method and system based on multi-depth feature fusion
CN111696110B (en) Scene segmentation method and system
CN108805016B (en) Head and shoulder area detection method and device
CN112990065B (en) Vehicle classification detection method based on optimized YOLOv5 model
Sun et al. Adaptive multi-lane detection based on robust instance segmentation for intelligent vehicles
CN116188999B (en) Small target detection method based on visible light and infrared image data fusion
CN117011819A (en) Lane line detection method, device and equipment based on feature guidance attention
CN112613434A (en) Road target detection method, device and storage medium
US11367206B2 (en) Edge-guided ranking loss for monocular depth prediction
CN115272437A (en) Image depth estimation method and device based on global and local features
CN111597913A (en) Lane line picture detection and segmentation method based on semantic segmentation model
Tang et al. HIC-YOLOv5: Improved YOLOv5 for small object detection
Ni et al. Scene-adaptive 3D semantic segmentation based on multi-level boundary-semantic-enhancement for intelligent vehicles
CN114495060A (en) Road traffic marking identification method and device
WO2022120996A1 (en) Visual position recognition method and apparatus, and computer device and readable storage medium
CN116229406B (en) Lane line detection method, system, electronic equipment and storage medium
Wang et al. Cbwloss: constrained bidirectional weighted loss for self-supervised learning of depth and pose
CN110610184B (en) Method, device and equipment for detecting salient targets of images
EP4235492A1 (en) A computer-implemented method, data processing apparatus and computer program for object detection
Yang et al. A novel vision-based framework for real-time lane detection and tracking
CN116091784A (en) Target tracking method, device and storage medium
CN112446292B (en) 2D image salient object detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination