CN113591756A - Lane line detection method based on heterogeneous information interaction convolutional network - Google Patents

Lane line detection method based on heterogeneous information interaction convolutional network

Info

Publication number: CN113591756A (application granted and published as CN113591756B)
Application number: CN202110904312.5A
Authority: CN (China)
Original language: Chinese (zh)
Inventors: 周庆, 周晶
Original and current assignee: Nanjing Aerospace Technology Co., Ltd.
Application filed by Nanjing Aerospace Technology Co., Ltd.
Legal status: Active (granted)

Classifications

    • G06F18/214: Pattern recognition - Analysing - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/241: Pattern recognition - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/045: Neural networks - Architecture, e.g. interconnection topology - Combinations of networks
    • G06N3/084: Neural networks - Learning methods - Backpropagation, e.g. using gradient descent


Abstract

The invention discloses a lane line detection method based on a heterogeneous information interaction convolutional network. The constructed network contains two prediction branches: pixel-level lane line segmentation and lane line block classification. A reverse feature representation space is built from the lane line segmentation prediction and, as a complementary space, is concatenated with the image feature space, which strengthens the network's feature extraction capability, particularly for lane lines that are partially missing due to occlusion and similar factors. Global and local modules are further designed to improve feature extraction, so that the network attends to both global context information and local detail information. Finally, redundant computation is avoided in the network design, improving inference efficiency; the method has significant application value in autonomous driving and driver assistance.

Description

Lane line detection method based on heterogeneous information interaction convolutional network
Technical Field
The invention relates to the technical field of image processing and pattern recognition, in particular to a lane line detection method based on a heterogeneous information interaction convolutional network.
Background
Autonomous driving technology realizes unmanned driving through a computer system and improves the user's driving experience; it is a hot topic of current research at universities and in industry. Within this technology, lane line detection is one of the basic modules: it guides the vehicle in autonomously judging the driving direction, regularizes driving behavior, and helps avoid collisions, thereby enabling a better driving experience.
Lane line detection faces significant challenges in practical applications. First, image quality problems: distortion and blur caused by vehicle shake during driving; illumination changes and shadows cast by buildings and trees; and reduced visibility in fog, rain, and similar conditions. Second, lane line quality problems: lane lines on different road sections differ in clarity, degree of wear, and width. Third, viewpoint changes, occlusion, and similar problems. Single-source features cannot provide a rich representation for the network and thus limit its generalization capability; the invention therefore achieves robust lane line detection by means of heterogeneous information interaction.
Disclosure of Invention
The invention uses the lane line pixel segmentation prediction to construct an inverse-attention feature space, which serves as a complementary space to the image features and is concatenated with them to form a complete feature representation; secondly, through joint learning of global and local features, the network attends not only to global context information but also to local detail information; finally, a lightweight design reduces the computational complexity of the network;
In order to achieve the above purpose, the invention adopts the following technical scheme:
The lane line detection method based on the heterogeneous information interaction convolutional network comprises the following steps:
Step 1: prepare the training data; preprocess each training picture and its corresponding label, specifically as follows:
suppose the training data consist of N pictures and their corresponding labels (the height and width of each picture are 288 and 800, respectively); denote the pictures in the training set by {I_1, I_2, ..., I_N} and the label corresponding to each picture by {l_1, l_2, ..., l_N};
Step 101: read the coordinate points of a lane line in the label and its designated color p, draw a polygon using the ImageDraw.Draw.polygon() function of the PIL package, and fill the drawn polygon with the designated color via the function's fill argument;
Step 102: assign a color p to each lane line in order as yellow (p = 1), green (p = 2), blue (p = 3), red (p = 4) and purple (p = 5); read each lane line in a picture label, record the total number of lane lines as C (C ≤ 5) and the current lane index as k (k ≤ C), set the lane line color to p = k, and perform the operation of step 101 in turn, thereby obtaining a picture in which the lane lines are colored; then convert this picture into a grayscale picture, map the gray values of the corresponding colors to 1, 2, 3, 4 and 5 in order, and record uncolored pixels as 0; finally the labels of all pictures are converted into {mask_1, mask_2, ..., mask_N};
Step 103: generate multi-target segmentation labels using one-hot encoding, with dimensions 288 × 800 × (C + 1); finally the labels are converted into {seg_1, seg_2, ..., seg_N};
Step 104: predefine the row anchors row_anchor, e.g., [121, 131, 141, 150, 160, 170, 180, 189, 199, 209, 219, 228, 238, 248, 258, 267, 277, 287]; divide the height of the label obtained in step 102 into 18 parts according to the predefined row anchors, divide the width of the label equally into 200 parts, and at the same time generate an 18 × 201 × C zero matrix U;
Step 105: using the predefined row anchors of step 104, in the row of the grayscale label generated in step 102 where the i-th row anchor lies, select the block (i, j) containing the largest share of the k-th lane line as that lane line's label for the row, i.e., set U(i, j, k) = 1; if there is no lane line at the m-th row anchor of the grayscale label generated in step 102, set U(m, 201, k) = 1;
Step 106: perform the operation of step 105 in turn on the labels {mask_1, mask_2, ..., mask_N} of step 102 to generate the block labels {block_1, block_2, ..., block_N};
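To make steps 101 to 106 concrete, the following Python sketch shows one way to produce the grayscale mask, the one-hot segmentation label, and the block label. It is an illustrative reading of the steps above, not the patented implementation; the function names make_labels and make_block_label are chosen here, and the index p = k is drawn directly instead of drawing named colors and converting to grayscale, which yields the same mask:

```python
import numpy as np
from PIL import Image, ImageDraw

H, W = 288, 800                                   # picture height and width used in the patent
ROW_ANCHOR = [121, 131, 141, 150, 160, 170, 180, 189, 199, 209,
              219, 228, 238, 248, 258, 267, 277, 287]   # the 18 predefined row anchors
NUM_COLS = 200                                    # the width is split into 200 blocks

def make_labels(lane_polys, C):
    """lane_polys: list of lane-line polygons, each a list of (x, y) points, ordered k = 1..C.
    Returns the grayscale index mask of step 102 and the one-hot label of step 103."""
    img = Image.new("L", (W, H), 0)               # background pixels stay 0
    draw = ImageDraw.Draw(img)
    for k, poly in enumerate(lane_polys, start=1):
        draw.polygon(poly, fill=k)                # steps 101-102: draw and fill lane k with value p = k
    mask = np.array(img, dtype=np.int64)          # 288 x 800, values 0..C
    seg = np.eye(C + 1, dtype=np.float32)[mask]   # step 103: 288 x 800 x (C + 1) one-hot label
    return mask, seg

def make_block_label(mask, C):
    """Steps 104-106: build the 18 x 201 x C block label U from the grayscale mask."""
    U = np.zeros((len(ROW_ANCHOR), NUM_COLS + 1, C), dtype=np.float32)
    col_edges = np.linspace(0, W, NUM_COLS + 1).astype(int)   # 200 equal-width blocks
    for i, r in enumerate(ROW_ANCHOR):
        row = mask[r]                                         # image row at the i-th row anchor
        for k in range(1, C + 1):
            counts = [np.sum(row[col_edges[j]:col_edges[j + 1]] == k) for j in range(NUM_COLS)]
            if max(counts) > 0:
                U[i, int(np.argmax(counts)), k - 1] = 1.0     # block with the largest share of lane k
            else:
                U[i, NUM_COLS, k - 1] = 1.0                   # no lane k at this anchor: "no lane" bin
    return U
```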
Step 2: establishing a lane line detection network for heterogeneous information interaction; the concrete model of the network is as follows:
Convolutional layer 1: convolve the 288 × 800 × 3 input image with 64 7 × 7 convolution kernels with stride 2; after a batch normalization (BN) layer and a ReLU activation function, obtain 144 × 400 × 64 features;
Pooling layer 1: pass the output features of convolutional layer 1 through a 3 × 3 max pooling layer with stride 2 to obtain 72 × 200 × 64 features;
Convolutional layer 2: convolve the output of pooling layer 1 with 64 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 72 × 200 × 64 features;
Convolutional layer 3: convolve the output of convolutional layer 2 with 64 3 × 3 convolution kernels with stride 1; after a BN layer, add the result to the output of pooling layer 1; after a ReLU activation function, obtain 72 × 200 × 64 features;
Convolutional layer 4: convolve the output of convolutional layer 3 with 64 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 72 × 200 × 64 features;
Convolutional layer 5: convolve the output of convolutional layer 4 with 64 3 × 3 convolution kernels with stride 1; after a BN layer, add the result to the output of convolutional layer 3; after a ReLU activation function, obtain 72 × 200 × 64 features;
Convolutional layer 6: convolve the output of convolutional layer 5 with 128 3 × 3 convolution kernels with stride 2; after a BN layer and a ReLU activation function, obtain 36 × 100 × 128 features;
Convolutional layer 7: convolve the output of convolutional layer 6 with 128 3 × 3 convolution kernels with stride 1; after a BN layer, obtain 36 × 100 × 128 features;
Convolutional layer 7_1: convolve the output of convolutional layer 5 with 128 1 × 1 convolution kernels with stride 2; after a BN layer, add the output features of convolutional layer 7; after a ReLU activation function, obtain 36 × 100 × 128 features;
Convolutional layer 8: convolve the output of convolutional layer 7_1 with 128 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 36 × 100 × 128 features;
Convolutional layer 9: convolve the output of convolutional layer 8 with 128 3 × 3 convolution kernels with stride 1; after a BN layer, add the output of convolutional layer 7_1; after a ReLU activation function, obtain 36 × 100 × 128 features;
Convolutional layer 10: convolve the output of convolutional layer 9 with 256 3 × 3 convolution kernels with stride 2; after a BN layer and a ReLU activation function, obtain 18 × 50 × 256 features;
Convolutional layer 11: convolve the output of convolutional layer 10 with 256 3 × 3 convolution kernels with stride 1; after a BN layer, obtain 18 × 50 × 256 features;
Convolutional layer 11_1: convolve the output of convolutional layer 9 with 256 1 × 1 convolution kernels with stride 2; after a BN layer, add the output features of convolutional layer 11; after a ReLU activation function, obtain 18 × 50 × 256 features;
Convolutional layer 12: convolve the output of convolutional layer 11_1 with 256 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 18 × 50 × 256 features;
Convolutional layer 13: convolve the output of convolutional layer 12 with 256 3 × 3 convolution kernels with stride 1; after a BN layer, add the output of convolutional layer 11_1; after a ReLU activation function, obtain 18 × 50 × 256 features;
Convolutional layer 14: convolve the output of convolutional layer 13 with 512 3 × 3 convolution kernels with stride 2; after a BN layer and a ReLU activation function, obtain 9 × 25 × 512 features;
Convolutional layer 15: convolve the output of convolutional layer 14 with 512 3 × 3 convolution kernels with stride 1; after a BN layer, obtain 9 × 25 × 512 features;
Convolutional layer 15_1: convolve the output of convolutional layer 13 with 512 1 × 1 convolution kernels with stride 2; after a BN layer, add the output features of convolutional layer 15; after a ReLU activation function, obtain 9 × 25 × 512 features;
Convolutional layer 16: convolve the output of convolutional layer 15_1 with 512 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 9 × 25 × 512 features;
Convolutional layer 17: convolve the output of convolutional layer 16 with 512 3 × 3 convolution kernels with stride 1; after a BN layer, add the output of convolutional layer 15_1; after a ReLU activation function, obtain 9 × 25 × 512 features;
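Convolutional layers 2 to 17 follow a standard residual pattern: two 3 × 3 convolution-BN stages, a skip connection added before the final ReLU, and a 1 × 1 stride-2 projection (layers 7_1, 11_1, 15_1) whenever the resolution or channel count changes. The following PyTorch sketch illustrates this pattern under the assumption of ordinary ResNet-style basic blocks; it is an illustration, not the exact patented network:

```python
import torch.nn as nn

class BasicBlock(nn.Module):
    """Two 3x3 conv-BN stages with a skip connection, as in convolutional layers 2-5,
    8-9, 12-13 and 16-17; a 1x1 stride-2 projection (layers 7_1, 11_1, 15_1) is used
    when the spatial size or channel count changes."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, 1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        self.down = None
        if stride != 1 or in_ch != out_ch:
            self.down = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride, bias=False),
                nn.BatchNorm2d(out_ch))

    def forward(self, x):
        identity = x if self.down is None else self.down(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)          # add, then ReLU, as in layers 3, 5, 9, 13, 17

# Backbone stages roughly matching convolutional layers 2-17 (64/128/256/512 channels).
backbone = nn.Sequential(
    BasicBlock(64, 64), BasicBlock(64, 64),
    BasicBlock(64, 128, stride=2), BasicBlock(128, 128),
    BasicBlock(128, 256, stride=2), BasicBlock(256, 256),
    BasicBlock(256, 512, stride=2), BasicBlock(512, 512),
)
```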
Convolutional layer 18: convolve the output of convolutional layer 9 with 128 1 × 1 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 36 × 100 × 128 features;
Convolutional layer 19: convolve the output of convolutional layer 18 with 128 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 36 × 100 × 128 features;
Convolutional layer 20: convolve the output of convolutional layer 19 with 128 1 × 1 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 36 × 100 × 128 features;
Convolutional layer 21: convolve the output of convolutional layer 20 with 128 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 36 × 100 × 128 features;
Convolutional layer 22: convolve the output of convolutional layer 13 with 128 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 18 × 50 × 128 features;
Convolutional layer 23: convolve the output of convolutional layer 22 with 128 1 × 1 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 18 × 50 × 128 features;
Convolutional layer 24: convolve the output of convolutional layer 23 with 128 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 18 × 50 × 128 features;
Upsampling layer 1: upsample the output of convolutional layer 24 by a factor of 2 using bilinear interpolation to obtain 36 × 100 × 128 features;
Convolutional layer 25: convolve the output of convolutional layer 17 with 128 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 9 × 25 × 128 features;
Convolutional layer 26: convolve the output of convolutional layer 25 with 128 1 × 1 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 9 × 25 × 128 features;
Upsampling layer 2: upsample the output of convolutional layer 26 by a factor of 4 using bilinear interpolation to obtain 36 × 100 × 128 features;
Cascade layer 1: concatenate the outputs of upsampling layer 1, upsampling layer 2, and convolutional layer 9 along the channel dimension to obtain 36 × 100 × 384 features;
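Convolutional layers 22 to 26, the two upsampling layers, and cascade layer 1 form the multi-scale (global-local) fusion described above: features from the 18 × 50 and 9 × 25 scales are reduced to 128 channels, upsampled bilinearly, and concatenated with the 36 × 100 features of convolutional layer 9. A hedged PyTorch sketch of this fusion follows; the module and variable names (MultiScaleFusion, branch_mid, branch_low) are illustrative only, and layers 18 to 21, which refine the layer-9 features in parallel, are omitted because, per the text, cascade layer 1 concatenates the layer-9 output itself:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_bn_relu(in_ch, out_ch, k):
    # the 1x1 / 3x3 conv-BN-ReLU units used in convolutional layers 18-26
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, k, 1, padding=k // 2, bias=False),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

class MultiScaleFusion(nn.Module):
    """Project layer-13 and layer-17 features to 128 channels, upsample them bilinearly
    to the 36x100 resolution of layer 9, and concatenate (cascade layer 1)."""
    def __init__(self):
        super().__init__()
        self.branch_mid = nn.Sequential(conv_bn_relu(256, 128, 3), conv_bn_relu(128, 128, 1),
                                        conv_bn_relu(128, 128, 3))                  # layers 22-24
        self.branch_low = nn.Sequential(conv_bn_relu(512, 128, 3), conv_bn_relu(128, 128, 1))  # layers 25-26

    def forward(self, f9, f13, f17):              # 128x36x100, 256x18x50, 512x9x25 feature maps
        up1 = F.interpolate(self.branch_mid(f13), scale_factor=2, mode="bilinear", align_corners=False)
        up2 = F.interpolate(self.branch_low(f17), scale_factor=4, mode="bilinear", align_corners=False)
        return torch.cat([up1, up2, f9], dim=1)   # cascade layer 1: 384-channel 36x100 features
```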
Convolutional layer 27: convolve the output of cascade layer 1 with 256 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 36 × 100 × 256 features;
Convolutional layer 28: convolve the output of convolutional layer 27 with 128 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 36 × 100 × 128 features;
Convolutional layer 29: convolve the output of convolutional layer 28 with 128 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 36 × 100 × 128 features;
Convolutional layer 30: convolve the output of convolutional layer 29 with (C + 1) convolution kernels of size 1 × 1 with stride 1; after a Sigmoid activation function, obtain 36 × 100 × (C + 1) features;
Convolutional layer 30_1: upsample the output of convolutional layer 30 by a factor of 8 to obtain 288 × 800 × (C + 1) features;
Convolutional layer 30_2: add the negative of the output of convolutional layer 30 to an all-ones tensor of the same dimensions (i.e., compute one minus the segmentation output) to obtain 36 × 100 × (C + 1) features;
Convolutional layer 31: convolve the output of convolutional layer 30_2 with 256 3 × 3 convolution kernels with stride 2; after a BN layer and a ReLU activation function, obtain 18 × 50 × 256 features;
Convolutional layer 32: convolve the output of convolutional layer 31 with 256 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 18 × 50 × 256 features;
Convolutional layer 33: convolve the output of convolutional layer 32 with 256 1 × 1 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 18 × 50 × 256 features;
Upsampling layer 3: upsample the output of convolutional layer 17 by a factor of 2 using bilinear interpolation to obtain 18 × 50 × 512 features;
Cascade layer 2: concatenate the outputs of convolutional layer 33 and upsampling layer 3 along the channel dimension to obtain 18 × 50 × 768 features;
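Convolutional layers 30_2 to 33 and cascade layer 2 are where the heterogeneous information interacts: the segmentation prediction is inverted (one minus the prediction) to form the complementary reverse feature space, processed, and concatenated with the backbone image features. A minimal PyTorch sketch of this step, with illustrative names rather than the exact patented module:

```python
import torch
import torch.nn.functional as F

def reverse_feature_fusion(seg_pred, f17, conv31_33):
    """seg_pred:  N x (C+1) x 36 x 100 Sigmoid output of convolutional layer 30.
    f17:          N x 512 x 9 x 25 backbone features (convolutional layer 17).
    conv31_33:    the stride-2 / 3x3 / 1x1 convolution stack of layers 31-33 (a module)."""
    reverse = torch.ones_like(seg_pred) - seg_pred        # convolutional layer 30_2: 1 - prediction
    rev_feat = conv31_33(reverse)                         # 256-channel 18x50 features from the reverse space
    up17 = F.interpolate(f17, scale_factor=2, mode="bilinear", align_corners=False)  # upsampling layer 3
    return torch.cat([rev_feat, up17], dim=1)             # cascade layer 2: 768-channel 18x50 features
```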
Convolutional layer 34: convolve the output of cascade layer 2 with 256 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 18 × 50 × 256 features;
Convolutional layer 35: convolve the output of convolutional layer 34 with 64 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, upsample by bilinear interpolation to size [18, 201] to obtain 18 × 201 × 64 features;
Convolutional layer 36: convolve the output of convolutional layer 35 with 16 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 18 × 201 × 16 features;
Convolutional layer 37: convolve the output of cascade layer 2 with 128 3 × 3 convolution kernels with stride 1; after a BN layer and a ReLU activation function, obtain 18 × 50 × 128 features;
Convolutional layer 38: convolve the output of convolutional layer 37 with 2 convolution kernels with stride 1; after a ReLU activation function, reshape the output features to obtain 1 × 1 × 1800 features;
Convolutional layer 39: convolve the output of convolutional layer 38 with 57888 1 × 1 convolution kernels with stride 1; after a ReLU activation function, reshape the output features to obtain 18 × 201 × 16 features;
Cascade layer 3: concatenate the outputs of convolutional layer 36 and convolutional layer 39 along the channel dimension to obtain 18 × 201 × 32 features;
Convolutional layer 39: convolve the output of cascade layer 3 with C convolution kernels of size 1 × 1 with stride 1; after a Softmax activation function, obtain 18 × 201 × C features;
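Convolutional layers 37 to 39 replace a fully connected layer with convolutions only: the 18 × 50 × 128 features are squeezed to 2 channels, reshaped to a 1 × 1 × 1800 tensor, and expanded with 57888 1 × 1 kernels (57888 = 18 × 201 × 16), so the whole head stays convolutional as claimed. The sketch below illustrates this trick; the kernel size of convolutional layer 38 is not stated in the text, so a 1 × 1 kernel is assumed, and the channel ordering of the reshapes is likewise an assumption:

```python
import torch.nn as nn

class FullyConvHead(nn.Module):
    """1x1 convolutions on a 1x1x1800 tensor play the role of a fully connected layer
    (sketch of convolutional layers 37-39, illustrative only)."""
    def __init__(self):
        super().__init__()
        self.squeeze = nn.Conv2d(128, 2, kernel_size=1)                   # layer 38: 2 kernels (size assumed 1x1)
        self.fc_as_conv = nn.Conv2d(1800, 18 * 201 * 16, kernel_size=1)   # layer 39: 57888 1x1 kernels
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):                       # x: N x 128 x 18 x 50 (output of convolutional layer 37)
        n = x.shape[0]
        y = self.relu(self.squeeze(x))          # N x 2 x 18 x 50
        y = y.reshape(n, 1800, 1, 1)            # reshape to 1 x 1 x 1800 per sample
        y = self.relu(self.fc_as_conv(y))       # N x 57888 x 1 x 1
        return y.reshape(n, 16, 18, 201)        # reshape back to 18 x 201 x 16 features
```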
Step 3: train the network model established in step 2 with the training data of step 1, learn the model parameters with an SGD (stochastic gradient descent) optimization strategy, and save the final trained model, specifically as follows:
Step 301: the network designed by the invention is trained and its parameters are learned in a multitask manner; the initial learning rate of the network is set to γ;
Step 302: denote the output of convolutional layer 30_1 in step 2 by Pre_seg and the output of convolutional layer 39 by Pre_block; the parameters of the network are learned against the labels given in step 1, with the loss function

[the loss function is given as an equation image in the original; it combines a segmentation loss on Pre_seg with a block-classification loss on Pre_block weighted by λ1]

where λ1 is a hyperparameter;
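Because the loss formula itself appears only as an image in the original, the sketch below shows one plausible reading consistent with the surrounding text (a segmentation term on Pre_seg plus a λ1-weighted block-classification term on Pre_block); the choice of binary cross-entropy for both terms is an assumption, not the patented loss:

```python
import torch.nn.functional as F

def total_loss(pre_seg, seg_label, pre_block, block_label, lam1=1.0):
    """pre_seg:     N x (C+1) x 288 x 800 output of convolutional layer 30_1 (in [0, 1]).
    seg_label:      N x (C+1) x 288 x 800 one-hot segmentation label from step 103.
    pre_block:      N x 18 x 201 x C output of convolutional layer 39 (Softmax).
    block_label:    N x 18 x 201 x C block label from step 106.
    lam1 is the hyperparameter written as lambda_1 in the patent."""
    loss_seg = F.binary_cross_entropy(pre_seg, seg_label)        # assumed form of the segmentation term
    loss_block = F.binary_cross_entropy(pre_block, block_label)  # assumed form of the classification term
    return loss_seg + lam1 * loss_block
```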
Step 303: after the network is trained according to steps 301 and 302, save the network parameters;
Step 4: test the deep network model; test an input picture based on the parameters saved in step 303, specifically as follows:
Step 401: initialize parameters: denote the size of the input picture as [given as an equation image in the original], where the two quantities are the height and width of the picture, respectively, and the width scaling factor is [given as an equation image in the original]; generate a vector with stride 1 ranging from 1 to 200, reshape it into a 200 × 1 matrix, and denote it Idx;
Step 402: reshape the output of convolutional layer 39 of step 2 into 201 × 18 × C, take its first 200 rows as Cut_block, multiply Cut_block by Idx, and sum along the rows; record the result as Loc;
Step 403: record the row-wise index of the maximum of the output of convolutional layer 39 of step 2 as Maxind, and set to 0 the entries of Loc at the positions where Maxind equals 200;
Step 404: traverse the elements of the Loc matrix in turn; if the number of non-zero elements in a column of Loc is greater than 2, compute the position in the original image of the k-th lane line at the i-th preset row anchor according to the following formula:

[the position formula is given as an equation image in the original]

where int(·) denotes rounding.
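A NumPy sketch of the test-time decoding in steps 401 to 404 is given below; because the symbols and the final coordinate formula appear only as images in the original, the mapping back to image coordinates is an assumption based on the 800-pixel training width, the 288-pixel training height, and the 200 horizontal blocks:

```python
import numpy as np

ROW_ANCHOR = [121, 131, 141, 150, 160, 170, 180, 189, 199, 209,
              219, 228, 238, 248, 258, 267, 277, 287]

def decode(block_pred, img_w, img_h, C):
    """block_pred: 18 x 201 x C output of convolutional layer 39 for one picture."""
    out = np.transpose(block_pred, (1, 0, 2))          # step 402: reshape to 201 x 18 x C
    idx = np.arange(1, 201).reshape(200, 1, 1)         # step 401: Idx, values 1..200
    loc = np.sum(out[:200] * idx, axis=0)              # 18 x C expected block position (Loc)
    maxind = np.argmax(out, axis=0)                    # step 403: index of the per-anchor maximum
    loc[maxind == 200] = 0                             # index 200 is the "no lane" bin
    lanes = []
    for k in range(C):                                 # step 404: keep lanes seen at > 2 row anchors
        if np.count_nonzero(loc[:, k]) > 2:
            pts = [(int(loc[i, k] * img_w / 200),      # assumed column mapping (blocks -> pixels)
                    int(ROW_ANCHOR[i] * img_h / 288))  # assumed row mapping (288-pixel training height)
                   for i in range(len(ROW_ANCHOR)) if loc[i, k] > 0]
            lanes.append(pts)
    return lanes
```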
Compared with the prior art, the invention has the beneficial effects that:
1. The method uses multi-scale features to achieve pixel-level lane line segmentation, learns a set of prior features from the reversed segmentation output, enables the network to discover heterogeneous complementary features, and combines them with the image features to construct a complete feature representation;
2. Global and local feature learning modules are constructed, so that the network perceives global context information without ignoring local detail information;
3. In the network design, 3 × 3 and 1 × 1 convolutions are stacked, which reduces the computational complexity of the network while enhancing its expressive capability; in addition, the network model uses only convolution operations, avoiding the heavy computational cost of fully connected (linear) layers.
Drawings
FIG. 1 is a framework diagram of a deep network model according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments;
Example 1: referring to FIG. 1, the lane line detection method based on the heterogeneous information interaction convolutional network is carried out according to steps 1 to 4 exactly as set out above: the training data are prepared as in step 1 (sub-steps 101 to 106), the heterogeneous-information-interaction detection network is built as in step 2 (convolutional layers 1 to 39 together with the pooling, upsampling, and cascade layers), the network is trained as in step 3 (sub-steps 301 to 303), and the trained model is tested as in step 4 (sub-steps 401 to 404).
The above description covers only preferred embodiments of the present invention, but the scope of the present invention is not limited thereto; any equivalent substitution or modification of the technical solutions and inventive concepts of the present invention that a person skilled in the art could readily conceive shall fall within the scope of the present invention.

Claims (5)

1. The lane line detection method based on the heterogeneous information interaction convolutional network, characterized by comprising the following steps:
Step 1: prepare training data;
Step 2: construct a lane line detection network based on heterogeneous information interaction;
Step 3: train the network model established in step 2 with the training data of step 1, learn the model parameters with an SGD (stochastic gradient descent) optimization strategy, and save the final trained model;
Step 4: test the final network model of step 3.
2. The method of claim 1, wherein
Step 1: prepare the training data; preprocess each training picture and its corresponding label, specifically as follows:
suppose the training data consist of N pictures and their corresponding labels (the height and width of each picture are 288 and 800, respectively); denote the pictures in the training set by {I_1, I_2, ..., I_N} and the label corresponding to each picture by {l_1, l_2, ..., l_N};
Step 101: read the coordinate points of a lane line in the label and its designated color p, draw a polygon using the ImageDraw.Draw.polygon() function of the PIL package, and fill the drawn polygon with the designated color via the function's fill argument;
Step 102: assign a color p to each lane line in order as yellow (p = 1), green (p = 2), blue (p = 3), red (p = 4) and purple (p = 5); read each lane line in a picture label, record the total number of lane lines as C (C ≤ 5) and the current lane index as k (k ≤ C), set the lane line color to p = k, and perform the operation of step 101 in turn, thereby obtaining a picture in which the lane lines are colored; then convert this picture into a grayscale picture, map the gray values of the corresponding colors to 1, 2, 3, 4 and 5 in order, and record uncolored pixels as 0; finally the labels of all pictures are converted into {mask_1, mask_2, ..., mask_N};
Step 103: generate multi-target segmentation labels using one-hot encoding, with dimensions 288 × 800 × (C + 1); finally the labels are converted into {seg_1, seg_2, ..., seg_N};
Step 104: predefine the row anchors row_anchor, e.g., [121, 131, 141, 150, 160, 170, 180, 189, 199, 209, 219, 228, 238, 248, 258, 267, 277, 287]; divide the height of the label obtained in step 102 into 18 parts according to the predefined row anchors, divide the width of the label equally into 200 parts, and at the same time generate an 18 × 201 × C zero matrix U;
Step 105: using the predefined row anchors of step 104, in the row of the grayscale label generated in step 102 where the i-th row anchor lies, select the block (i, j) containing the largest share of the k-th lane line as that lane line's label for the row, i.e., set U(i, j, k) = 1; if there is no lane line at the m-th row anchor of the grayscale label generated in step 102, set U(m, 201, k) = 1;
Step 106: perform the operation of step 105 in turn on the labels {mask_1, mask_2, ..., mask_N} of step 102 to generate the block labels {block_1, block_2, ..., block_N}.
3. The method of claim 1, wherein the method comprises the steps of,
step 2: establishing a lane line detection network for heterogeneous information interaction; the concrete model of the network is as follows:
the convolutional layer 1: deconvoluting an image with input of 288 × 800 × 3 using 64 7 × 7 convolution kernels with step size of 2, and obtaining features of 144 × 400 × 64 after normalization (BN) layer and ReLU activation function;
a pooling layer 1: the output features of the convolutional layer 1 are subjected to a maximum pooling layer of 3 × 3 with the step length of 2 to obtain features of 72 × 200 × 64;
and (3) convolutional layer 2: deconvolving the output of the pooling layer 1 with 64 3 × 3 convolution kernels with step size 1, and obtaining a 72 × 200 × 64 feature after normalization (BN) layer and ReLU activation function;
and (3) convolutional layer: deconvoluting the output of the convolutional layer 2 by using 64 convolutional kernels with the step length of 3 × 3, adding the characteristics obtained after normalization (BN) layer to the characteristics output by the pooling layer 1, and obtaining the characteristics of 72 × 200 × 64 after a ReLU activation function;
and (4) convolutional layer: deconvoluting the output of the convolutional layer 3 by using 64 convolutional kernels with the step size of 3 × 3, and obtaining the characteristics of 72 × 200 × 64 after a normalization (BN) layer and a ReLU activation function;
and (5) convolutional layer: deconvolving the output of the convolutional layer 4 by using 64 convolutional kernels with the step length of 3 × 3, adding the characteristics of the output of the convolutional layer 3 to the characteristics obtained after normalization (BN) layer, and obtaining the characteristics of 72 × 200 × 64 after a ReLU activation function;
and (6) a convolutional layer: deconvolving the output of the convolutional layer 5 with 128 convolutional kernels of 3 × 3 with the step size of 2, and obtaining the characteristics of 36 × 100 × 128 after a normalization (BN) layer and a ReLU activation function;
and (3) a convolutional layer 7: deconvoluting the output of the convolutional layer 6 by using 128 convolution kernels of 3 × 3 with the step size of 1, and obtaining a characteristic of 36 × 100 × 128 through the characteristic obtained after normalization (BN) layer;
convolutional layer 7_ 1: deconvolving the output of the convolutional layer 5 by using 128 convolutional kernels with the step length of 1 × 1 of 2, adding the output characteristics of the convolutional layer 7 after passing through a normalization (BN) layer, and obtaining the characteristics of 36 × 100 × 128 after passing through a ReLU activation function;
and (3) convolutional layer 8: deconvolving the output of the convolution layer 7_1 with 128 convolution kernels of 3 × 3 with step size 1, and obtaining the characteristics of 36 × 100 × 128 after normalization (BN) layer and ReLU activation function;
a convolutional layer 9: deconvoluting the output of the convolutional layer 8 by using 128 convolutional kernels with the step length of 3 × 3, performing normalization (BN) layer and adding the characteristics of the convolutional layer 7_1 output, and performing a ReLU activation function to obtain the characteristics of 36 × 100 × 128;
the convolutional layer 10: deconvoluting the output of the convolutional layer 9 by using 256 convolutional kernels with the step size of 3 × 3, and obtaining the characteristics of 18 × 50 × 256 after normalization (BN) layer and ReLU activation function;
the convolutional layer 11: deconvolving the output of the convolution layer 10 with 256 convolution kernels of 3 × 3 with a step size of 1, and obtaining a feature of 18 × 50 × 256 through the feature obtained after normalization (BN) layer;
convolutional layer 11_ 1: deconvolving the output of the convolutional layer 9 by using 256 convolutional kernels with the step length of 1 × 1 of 2, adding the characteristics output by the convolutional layer 11 after passing through a normalization (BN) layer, and obtaining the characteristics of 18 × 50 × 256 after passing through a ReLU activation function;
the convolutional layer 12: deconvoluting the output of the convolutional layer 11_1 by using 256 convolutional kernels with the step size of 3 × 3, and obtaining the characteristics of 18 × 50 × 256 after a normalization (BN) layer and a ReLU activation function;
a convolutional layer 13: deconvoluting the output of the convolutional layer 12 by using 256 convolutional kernels with the step length of 3 × 3, performing normalization (BN) layer and adding the characteristics of the convolutional layer 11_1 output, and performing a ReLU activation function to obtain the characteristics of 18 × 50 × 256;
the convolutional layer 14: deconvoluting the output of the convolutional layer 13 by using 512 convolution kernels with the step length of 3 × 3 and 2, and obtaining the characteristics of 9 × 25 × 512 after a normalization (BN) layer and a ReLU activation function;
a convolution layer 15: deconvolving the output of the convolutional layer 14 with 512 convolutional kernels of 3 × 3 with the step size of 1, and obtaining the characteristics of 9 × 25 × 512 through the characteristics obtained after normalization (BN) layer;
convolutional layer 15_ 1: deconvolving the output of the convolutional layer 13 by using 512 convolutional kernels with the step length of 1 × 1 of 2, adding the characteristics output by the convolutional layer 15 after passing through a normalization (BN) layer, and obtaining the characteristics of 9 × 25 × 512 after passing through a ReLU activation function;
a convolutional layer 16: the output of the convolution layer 15_1 is deconvoluted by using 512 convolution kernels with the step size of 1 and the convolution kernel is deconvoluted by 3 multiplied by 3, and the characteristics of 9 multiplied by 25 multiplied by 512 are obtained after a normalization (BN) layer and a ReLU activation function;
a convolutional layer 17: deconvoluting the output of the convolutional layer 16 by using 512 convolutional kernels with the step length of 3 × 3, performing normalization (BN) layer and adding the characteristics of the output of the convolutional layer 15_1, and performing a ReLU activation function to obtain the characteristics of 9 × 25 × 512;
convolutional layer 18: convolving the output of convolutional layer 9 with 128 1 × 1 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 36 × 100 × 128 features are obtained;
convolutional layer 19: convolving the output of convolutional layer 18 with 128 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 36 × 100 × 128 features are obtained;
convolutional layer 20: convolving the output of convolutional layer 19 with 128 1 × 1 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 36 × 100 × 128 features are obtained;
convolutional layer 21: convolving the output of convolutional layer 20 with 128 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 36 × 100 × 128 features are obtained;
convolutional layer 22: convolving the output of convolutional layer 13 with 128 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 18 × 50 × 128 features are obtained;
convolutional layer 23: convolving the output of convolutional layer 22 with 128 1 × 1 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 18 × 50 × 128 features are obtained;
convolutional layer 24: convolving the output of convolutional layer 23 with 128 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 18 × 50 × 128 features are obtained;
upsampling layer 1: the output of convolutional layer 24 is upsampled by a factor of 2 using bilinear interpolation to obtain 36 × 100 × 128 features;
convolutional layer 25: convolving the output of convolutional layer 17 with 128 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 9 × 25 × 128 features are obtained;
convolutional layer 26: convolving the output of convolutional layer 25 with 128 1 × 1 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 9 × 25 × 128 features are obtained;
upsampling layer 2: the output of convolutional layer 26 is upsampled by a factor of 4 using bilinear interpolation to obtain 36 × 100 × 128 features;
cascade layer 1: the outputs of upsampling layer 1, upsampling layer 2 and convolutional layer 9 are concatenated along the channel dimension to obtain 36 × 100 × 384 features (see the fusion sketch after the layer list);
convolutional layer 27: convolving the output of cascade layer 1 with 256 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 36 × 100 × 256 features are obtained;
convolutional layer 28: convolving the output of convolutional layer 27 with 128 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 36 × 100 × 128 features are obtained;
convolutional layer 29: convolving the output of convolutional layer 28 with 128 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 36 × 100 × 128 features are obtained;
convolutional layer 30: convolving the output of convolutional layer 29 with C + 1 1 × 1 convolution kernels with stride 1; after a Sigmoid activation function, 36 × 100 × (C + 1) features are obtained;
convolutional layer 30_1: the output of convolutional layer 30 is upsampled by a factor of 8 to obtain 288 × 800 × (C + 1) features;
convolutional layer 30_2: the negation of the output of convolutional layer 30 is added to an all-ones tensor of the same dimensions, i.e. 1 minus the output of convolutional layer 30, to obtain 36 × 100 × (C + 1) features;
convolutional layer 31: convolving the output of convolutional layer 30_2 with 256 3 × 3 convolution kernels with stride 2; after a normalization (BN) layer and a ReLU activation function, 18 × 50 × 256 features are obtained;
convolutional layer 32: convolving the output of convolutional layer 31 with 256 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 18 × 50 × 256 features are obtained;
convolutional layer 33: convolving the output of convolutional layer 32 with 256 1 × 1 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 18 × 50 × 256 features are obtained;
upsampling layer 3: the output of convolutional layer 17 is upsampled by a factor of 2 using bilinear interpolation to obtain 18 × 50 × 512 features;
cascade layer 2: the outputs of convolutional layer 33 and upsampling layer 3 are concatenated along the channel dimension to obtain 18 × 50 × 768 features;
convolutional layer 34: convolving the output of cascade layer 2 with 256 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 18 × 50 × 256 features are obtained;
convolutional layer 35: convolving the output of convolutional layer 34 with 64 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, the features are upsampled to a size of [18, 201] by bilinear interpolation, obtaining 18 × 201 × 64 features;
convolutional layer 36: convolving the output of convolutional layer 35 with 16 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 18 × 201 × 16 features are obtained;
convolutional layer 37: convolving the output of cascade layer 2 with 128 3 × 3 convolution kernels with stride 1; after a normalization (BN) layer and a ReLU activation function, 18 × 50 × 128 features are obtained;
convolutional layer 38: convolving the output of convolutional layer 37 with 2 convolution kernels with stride 1; after a ReLU activation function, the dimensions of the output features are reshaped to obtain 1 × 1800 features;
convolutional layer 39: convolving the output of convolutional layer 38 with 57888 1 × 1 convolution kernels with stride 1; after a ReLU activation function, the dimensions of the output features are reshaped to obtain 18 × 201 × 16 features;
cascade layer 3: the outputs of convolutional layer 36 and convolutional layer 39 are concatenated along the channel dimension to obtain 18 × 201 × 32 features;
convolutional layer 39: convolving the output of cascade layer 3 with C 1 × 1 convolution kernels with stride 1; after a Softmax activation function, 18 × 201 × C features are obtained.
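Convolutional layers 8 to 17 repeat one pattern: two 3 × 3 convolutions with batch normalization, a 1 × 1 strided shortcut convolution with batch normalization, element-wise addition and a ReLU, i.e. residual blocks; cascade layer 1 fuses three resolutions by bilinear upsampling and channel concatenation, and convolutional layer 30_2 turns the sigmoid segmentation map into a "1 minus probability" map that feeds the row-anchor branch. The PyTorch-style code below is a minimal sketch of these three building blocks only; the names (ResidualStage, fuse_scales, background_gate), the bias and interpolation settings, and the module grouping are assumptions of this sketch, not taken from the filing.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualStage(nn.Module):
    """Sketch of the pattern of convolutional layers 10-13 (and 14-17):
    one strided residual block followed by one identity residual block."""
    def __init__(self, in_ch, out_ch, stride=2):
        super().__init__()
        # cf. convolutional layers 10 and 11: 3x3 stride-2 conv + BN + ReLU, then 3x3 conv + BN
        self.main = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(out_ch))
        # cf. convolutional layer 11_1: 1x1 stride-2 shortcut + BN
        self.shortcut = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
            nn.BatchNorm2d(out_ch))
        # cf. convolutional layers 12 and 13: identity residual block
        self.ident = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False),
            nn.BatchNorm2d(out_ch))

    def forward(self, x):
        y = F.relu(self.main(x) + self.shortcut(x))  # add shortcut, then ReLU
        return F.relu(self.ident(y) + y)             # identity residual, then ReLU

def fuse_scales(feat_36x100, feat_18x50, feat_9x25):
    """Sketch of upsampling layers 1-2 and cascade layer 1: bilinear upsampling of the
    18x50 and 9x25 branches to 36x100, then concatenation along the channel dimension."""
    up2 = F.interpolate(feat_18x50, scale_factor=2, mode="bilinear", align_corners=False)
    up4 = F.interpolate(feat_9x25, scale_factor=4, mode="bilinear", align_corners=False)
    return torch.cat([feat_36x100, up2, up4], dim=1)  # 128 + 128 + 128 = 384 channels

def background_gate(seg_prob):
    """Sketch of convolutional layer 30_2: an all-ones tensor plus the negation of the
    sigmoid segmentation map, i.e. 1 - probability, which is fed to convolutional layer 31."""
    return torch.ones_like(seg_prob) - seg_prob

# Example usage: 36x100x128 -> 18x50x256 -> 9x25x512, matching layers 10-13 and 14-17
stage3 = ResidualStage(128, 256)
stage4 = ResidualStage(256, 512)
```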
4. The method of claim 1, wherein the method comprises the following steps:
step 3: training the network model established in step 2 with the training data from step 1, learning the model parameters with an SGD (stochastic gradient descent) optimization strategy, and saving the final trained model, which specifically comprises the following steps:
step 301: the network designed by the invention learns the network parameters in a multi-task manner, and the initial learning rate of the network is set to γ;
step 302: the output of convolutional layer 30_1 in step 2 is recorded as Pre_seg and the output of convolutional layer 39 is recorded as Pre_block; the parameters in the network are learned from the labels given in step 1, with the loss function
[loss function formula, given as image FDA0003200950310000061 in the original filing]
where λ1 is a hyperparameter (a hedged sketch of a loss of this general form is given after step 303);
step 303: after the network is trained in step 301 and step 302, the network parameters are saved.
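Since the step-302 loss is given only as an image formula in the filing, the snippet below is a hedged sketch of a loss of that general form: a per-pixel term on Pre_seg plus a λ1-weighted term on Pre_block. The choice of binary cross-entropy and cross-entropy, and every tensor shape shown, are assumptions of this sketch rather than the patented formula.

```python
import torch.nn.functional as F

def multitask_loss(pre_seg, seg_label, pre_block, cls_label, lambda1=1.0):
    """Hedged sketch of a step-302-style loss: segmentation term + lambda1 * row-anchor term.
    pre_seg:   (N, C+1, 288, 800) sigmoid probabilities (cf. convolutional layer 30_1)
    seg_label: (N, C+1, 288, 800) per-pixel lane masks
    pre_block: (N, 201, 18, C)    row-anchor scores before Softmax (assumed layout)
    cls_label: (N, 18, C)         index of the correct grid cell, 200 meaning "no lane"
    Both loss terms are assumptions; only the lambda1 weighting is stated in the claim."""
    seg_loss = F.binary_cross_entropy(pre_seg, seg_label)  # assumed per-pixel segmentation term
    cls_loss = F.cross_entropy(pre_block, cls_label)       # assumed row-anchor classification term
    return seg_loss + lambda1 * cls_loss
```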
5. The method of claim 1, wherein the method comprises the following steps:
step 4: testing the deep network model; the input picture is tested based on the parameters saved in step 303, which specifically comprises the following steps:
step 401: initializing parameters: the size of the input picture is recorded as an expression given by image formula FDA0003200950310000062 of the original filing, whose two quantities (image formulas FDA0003200950310000063 and FDA0003200950310000064) are respectively the height and the width of the picture; the width scaling ratio is given by image formula FDA0003200950310000065; a sequence with step 1 over the range 1 to 200 is generated and reshaped into a 200 × 1 matrix, recorded as Idx;
step 402: the output of convolutional layer 39 in step 2 is reshaped to 201 × 18 × C, the first 200 rows of the reshaped output are taken and recorded as Cut_block, Cut_block is multiplied by Idx, and the result is summed over the rows and recorded as Loc;
step 403: the row-wise index of the maximum of the output of convolutional layer 39 in step 2 is recorded as Maxind, and the entries of Loc at the positions where Maxind equals 200 are set to 0;
step 404: the elements of the Loc matrix are traversed in order; if the number of non-zero elements in a column of Loc is greater than 2, the position in the original image of the k-th lane line at the i-th preset row anchor is calculated according to the following formula (a hedged sketch of the whole decoding procedure of steps 401 to 404 is given after step 404):
[position formula, given as image FDA0003200950310000071 in the original filing]
where int(·) denotes rounding.
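Steps 401 to 404 amount to a row-anchor decoding procedure: the 201-way output of convolutional layer 39 gives, for each preset row anchor and each lane, an expected grid location Loc; anchors whose maximum falls in the 201st ("no lane") bin are discarded, and the remaining locations are mapped back to original-image coordinates. The sketch below follows those steps; because the mapping formula itself is an image in the filing, the final scaling by the 800-cell grid width and the 288-pixel anchor height, the row_anchors argument, and the name decode_lanes are assumptions of this sketch.

```python
import numpy as np

def decode_lanes(block_out, img_h, img_w, row_anchors):
    """Hedged sketch of steps 401-404.
    block_out:   (201, 18, C) Softmax output of convolutional layer 39, reshaped as in step 402
    row_anchors: 18 preset row positions on the 288-pixel network input height (assumed)
    Returns, per lane, a list of (x, y) points in original-image coordinates."""
    idx = np.arange(1, 201).reshape(200, 1, 1)        # step 401: Idx, values 1..200
    cut_block = block_out[:200]                       # step 402: first 200 rows as Cut_block
    loc = np.sum(cut_block * idx, axis=0)             # step 402: expected cell, shape (18, C)
    maxind = np.argmax(block_out, axis=0)             # step 403: row-wise argmax
    loc[maxind == 200] = 0                            # step 403: 201st bin means "no lane"
    lanes = []
    for k in range(loc.shape[1]):                     # step 404: traverse the lanes
        if np.count_nonzero(loc[:, k]) > 2:
            lanes.append([(int(loc[i, k] * img_w / 800),        # assumed width scaling
                           int(row_anchors[i] * img_h / 288))   # assumed height scaling
                          for i in range(loc.shape[0]) if loc[i, k] > 0])
    return lanes
```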
CN202110904312.5A 2021-08-06 2021-08-06 Lane line detection method based on heterogeneous information interaction convolution network Active CN113591756B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110904312.5A CN113591756B (en) 2021-08-06 2021-08-06 Lane line detection method based on heterogeneous information interaction convolution network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110904312.5A CN113591756B (en) 2021-08-06 2021-08-06 Lane line detection method based on heterogeneous information interaction convolution network

Publications (2)

Publication Number Publication Date
CN113591756A true CN113591756A (en) 2021-11-02
CN113591756B CN113591756B (en) 2024-06-28

Family

ID=78256031

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110904312.5A Active CN113591756B (en) 2021-08-06 2021-08-06 Lane line detection method based on heterogeneous information interaction convolution network

Country Status (1)

Country Link
CN (1) CN113591756B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009524A (en) * 2017-12-25 2018-05-08 西北工业大学 A kind of method for detecting lane lines based on full convolutional network
CN109886125A (en) * 2019-01-23 2019-06-14 青岛慧拓智能机器有限公司 A kind of method and Approach for road detection constructing Road Detection model
US20190220746A1 (en) * 2017-08-29 2019-07-18 Boe Technology Group Co., Ltd. Image processing method, image processing device, and training method of neural network
CN110276267A (en) * 2019-05-28 2019-09-24 江苏金海星导航科技有限公司 Method for detecting lane lines based on Spatial-LargeFOV deep learning network

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190220746A1 (en) * 2017-08-29 2019-07-18 Boe Technology Group Co., Ltd. Image processing method, image processing device, and training method of neural network
CN108009524A (en) * 2017-12-25 2018-05-08 西北工业大学 A kind of method for detecting lane lines based on full convolutional network
CN109886125A (en) * 2019-01-23 2019-06-14 青岛慧拓智能机器有限公司 A kind of method and Approach for road detection constructing Road Detection model
CN110276267A (en) * 2019-05-28 2019-09-24 江苏金海星导航科技有限公司 Method for detecting lane lines based on Spatial-LargeFOV deep learning network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
王帅帅; 刘建国; 纪郭: "Lane line detection based on fully convolutional neural networks" (基于全卷积神经网络的车道线检测), 数字制造科学 (Digital Manufacturing Science), no. 02, 15 June 2020 (2020-06-15) *

Also Published As

Publication number Publication date
CN113591756B (en) 2024-06-28

Similar Documents

Publication Publication Date Title
CN112651973B (en) Semantic segmentation method based on cascade of feature pyramid attention and mixed attention
Zhang et al. Deep gated attention networks for large-scale street-level scene segmentation
CN111275713B (en) Cross-domain semantic segmentation method based on countermeasure self-integration network
CN111612807B (en) Small target image segmentation method based on scale and edge information
CN110276354B (en) High-resolution streetscape picture semantic segmentation training and real-time segmentation method
CN113158862B (en) Multitasking-based lightweight real-time face detection method
CN110717851A (en) Image processing method and device, neural network training method and storage medium
CN113837938B (en) Super-resolution method for reconstructing potential image based on dynamic vision sensor
CN113657388A (en) Image semantic segmentation method fusing image super-resolution reconstruction
CN112396607A (en) Streetscape image semantic segmentation method for deformable convolution fusion enhancement
CN110378344B (en) Spectral dimension conversion network-based convolutional neural network multispectral image segmentation method
CN111476133B (en) Unmanned driving-oriented foreground and background codec network target extraction method
CN111723812B (en) Real-time semantic segmentation method based on sequence knowledge distillation
WO2022156621A1 (en) Artificial intelligence-based image coloring method and apparatus, electronic device, computer readable storage medium, and computer program product
CN111310766A (en) License plate identification method based on coding and decoding and two-dimensional attention mechanism
CN111382759A (en) Pixel level classification method, device, equipment and storage medium
CN114724155A (en) Scene text detection method, system and equipment based on deep convolutional neural network
CN116863194A (en) Foot ulcer image classification method, system, equipment and medium
CN112801029B (en) Attention mechanism-based multitask learning method
CN114445620A (en) Target segmentation method for improving Mask R-CNN
CN115376195B (en) Method for training multi-scale network model and face key point detection method
CN116977631A (en) Streetscape semantic segmentation method based on DeepLabV3+
CN116863437A (en) Lane line detection model training method, device, equipment, medium and vehicle
CN113591756B (en) Lane line detection method based on heterogeneous information interaction convolution network
CN115909378A (en) Document text detection model training method and document text detection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant