CN110852157A - Deep learning track line detection method based on binarization network - Google Patents

Info

Publication number
CN110852157A
Authority
CN
China
Prior art keywords
network
bisenet
layer
binarization
deep learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910940999.0A
Other languages
Chinese (zh)
Inventor
段章领
洪予晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Ho Chi Chi Intelligent Technology Co Ltd
Original Assignee
Hefei Ho Chi Chi Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Ho Chi Chi Intelligent Technology Co Ltd
Priority to CN201910940999.0A
Publication of CN110852157A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588 Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/253 Fusion techniques of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/28 Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a deep learning track line detection method based on a binarization network, comprising the following stages: a sample preparation stage, in which a plurality of road scene videos are first collected by a camera, the videos are split into single frames, and the pictures are calibrated, trained, corrected, rotationally transformed and post-processed to obtain a picture data set P, which is divided into a training set X and a test set Y at a ratio of 4:1; a network creation stage, in which a BiSeNet is created; and a further network creation stage, in which the BiSeNet is created as a binarized network. The method can detect track lines in pictures with low time and power requirements during detection, that is, a small memory footprint and a high training speed. Binarizing the parameters of the BiSeNet convolution layers greatly compresses and accelerates the neural network, which helps bring deep learning to embedded devices and gives the method good application prospects.

Description

Deep learning track line detection method based on binarization network
Technical Field
The invention relates to the field of image- and video-based track line detection, and in particular to a deep learning track line detection method based on a binarization network.
Background
With the development of deep learning, artificial intelligence has been widely applied in many fields. In recent years, more and more research teams have worked on transportation problems of underground locomotives, such as underground obstacle detection, underground track detection and underground pedestrian avoidance. To ensure the safe transportation of underground rail locomotives, reduce casualties among miners, and ultimately build intelligent, digital mines, automatically driven locomotives are gradually being put into service on underground tracks. On the one hand, underground lighting is weak and visibility is low, roadways are narrow, track layouts are complex, and the operating environment is harsh. On the other hand, some miners do not follow safe production specifications, so a running locomotive can easily collide with mining personnel at work, causing locomotive transportation accidents and irreparable losses to the country and its people.
Track line detection means recognizing the track area in a video or image by image processing and displaying the specific position of the track line; the goal is to find the position of the track line in the picture accurately and quickly with a suitable algorithm. In practice, however, underground track detection is easily affected by complex environmental factors such as changes in illumination and shadow, standing water covering the track, and occlusion by vehicles. As a result, conventional track line recognition is slow and inaccurate across varied track scenes and cannot produce results in real time.
Disclosure of Invention
The main purpose of the invention is to provide a deep learning track line detection method based on a binarization network that effectively solves the problems described in the background. The method detects track lines in pictures with low time and power requirements during detection, that is, a small memory footprint and a high training speed.
To achieve this purpose, the invention adopts the following technical scheme:
A deep learning track line detection method based on a binarization network comprises the following steps:
a. a sample preparation stage: a plurality of road scene videos are first collected by a camera; the videos are split into single frames, and the pictures are calibrated, trained, corrected, rotationally transformed and post-processed to obtain a picture data set P, which is divided into a training set X and a test set Y at a ratio of 4:1;
b. a network creation stage: a BiSeNet is created;
c. a network creation stage: the BiSeNet is created as a binarized network;
d. a network operation stage: track line detection is performed with a binarized conditional generative adversarial network.
Preferably, in step a, video cropping in the sample preparation stage finally yields 2500 pictures with a resolution of 1280 × 720. Pictures of one part of the highway lanes are selected as the training set, 2000 pictures in total, and pictures of another, different part of the highway lanes are used as the test set, 500 pictures in total.
Preferably, all pictures in the data set are annotated. The data set is annotated uniformly with AutoCAD, and people, vehicles and the surrounding environment are marked in different colors so that they can be told apart easily and the model parameters converge as quickly as possible during training.
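The 4:1 division into 2000 training pictures and 500 test pictures can be illustrated with a minimal sketch. The file names, the `split_dataset` helper and the random shuffle are assumptions made for illustration; the text selects the two sets from different road sections rather than at random.

```python
import random

def split_dataset(frames, train_parts=4, test_parts=1, seed=0):
    """Shuffle the frame names and split them train_parts : test_parts."""
    shuffled = list(frames)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_parts // (train_parts + test_parts)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical file names standing in for the 2500 extracted 1280 x 720 frames.
frames = [f"frame_{i:04d}.png" for i in range(2500)]
train_set, test_set = split_dataset(frames)
print(len(train_set), len(test_set))  # 2000 500
```

Fixing the seed keeps the split reproducible between runs.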
Preferably, in step b, the network creation stage comprises the following steps:
b1. The left branch of the BiSeNet model structure is a spatial path module, which addresses the loss of spatial information. It consists of three layers, each comprising a convolution with a stride of 2 followed by batch normalization and ReLU nonlinear activation. The feature map output by this path is 1/8 the size of the original image, so a relatively large feature map is available and rich spatial information is retained. The context path module on the right branch uses a pre-trained lightweight Xception model as its backbone network and down-samples rapidly to obtain a receptive field; a global average pooling layer is then appended to the tail of the Xception model to provide a global receptive field. Finally, the features of the last two stages are fused by means of a partial U-shaped structure. The spatial path module encodes rich spatial information and the context path module provides a sufficient receptive field; the two modules serve different needs and together complete the detection task;
b2. The BiSeNet network includes an attention refinement module, which optimizes the features of each stage. Because the features of the two paths differ in their level of representation, the outputs of the two paths cannot be combined directly. The module consists of a global pooling layer (Global pool), a 1 × 1 convolution layer (conv (1 × 1)), a batch normalization layer (Batch norm) and an activation function layer (sigmoid);
b3. The BiSeNet network uses the attention refinement module together with a feature fusion module, which fuses the output features of the two paths for the final prediction. The feature fusion module consists of a concatenate layer, a convolution layer (Conv), a batch normalization layer (bn), an activation function layer (relu), a global pooling layer (Global pool), a 1 × 1 convolution layer (Conv (1 × 1)), a batch normalization layer (Batch norm) and an activation function layer (sigmoid);
b4. The cross-entropy loss function is given by formula (1):
$$\mathrm{CE}(p, y) = -\left[\, y \log p + (1 - y) \log (1 - p) \,\right] \tag{1}$$
where p is the predicted probability that the sample label is 1, p ∈ [0, 1], and y is the value of the label.
Formula (1) can then be written as:
$$\mathrm{CE}(p, y) = \begin{cases} -\log p, & y = 1 \\ -\log (1 - p), & \text{otherwise} \end{cases} \tag{2}$$
If p_t is defined as:
$$p_t = \begin{cases} p, & y = 1 \\ 1 - p, & \text{otherwise} \end{cases} \tag{3}$$
then:
$$\mathrm{loss} = -\log(p_t) \tag{4}$$
To balance the imbalance between the track and the background, an improved cross-entropy loss is introduced, as shown in formula (5):
$$L_{\mathrm{loss}} = -\alpha \, (1 - p_t)^{2} \log(p_t) \tag{5}$$
Experiments show that with α = 0.35 the improved loss function lets the optimizer fit the network model better, and the model performs better.
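A minimal sketch of the improved cross-entropy loss of formula (5) for a single pixel follows. The function name, the `gamma` parameter (fixed to the exponent 2 of formula (5)) and the `eps` clamp that avoids log(0) are assumptions added for illustration.

```python
import math

def improved_cross_entropy(p, y, alpha=0.35, gamma=2.0, eps=1e-12):
    """Loss of formula (5): -alpha * (1 - p_t)^gamma * log(p_t)."""
    # p_t is the probability assigned to the true class, as in formula (3).
    p_t = p if y == 1 else 1.0 - p
    # Clamp to avoid log(0); the clamp is an implementation detail, not from the text.
    p_t = min(max(p_t, eps), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)

# A well-classified pixel contributes far less than a badly classified one.
easy = improved_cross_entropy(0.9, 1)
hard = improved_cross_entropy(0.1, 1)
print(easy, hard)
```

The factor (1 - p_t)^2 shrinks the contribution of pixels that are already classified well, which is how the loss counteracts the dominance of background pixels.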
Preferably, in step c, the network creation stage comprises the following steps:
c1. The BiSeNet is created as a binarized BiSeNet, in which the weights and activation values are constrained to +1 or -1, as shown in formula (6):
[Formula (6), equation image]
where a_b is the binary activation value, a_r is the real-valued activation, w_b is the binary weight, and w_r is the real-valued weight;
c2. With the BiSeNet constrained to binarized weights, the convolution operation can be approximated by formula (7):
[Formula (7), equation image]
Here one symbol denotes the conventional convolution operation; because the weights in the convolution are binary, the operator shown in
[operator symbol, equation image]
denotes a convolution that involves only addition and subtraction, without multiplication. The optimal estimates of E and β are obtained by solving the following optimization problem:
[Formula (8), equation image]
Solving formula (8) gives:
[Formula (9), equation image]
Training the binary-weight network: a BiSeNet with binarized weights is trained, and the weights are binarized during both forward and backward propagation. The binary weights are computed according to formula (9), and the forward propagation of activations and the backward propagation of gradients are then computed with the scaled binary weights, where the gradient formula is:
[Gradient formula, equation image]
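The idea of steps c1 and c2 can be sketched in a few lines of plain Python. Since the equation images are not available, this sketch assumes the common binary-weight formulation: the binary weights are the signs of the real weights (zero mapped to +1) and the scale β is the mean absolute weight. The function names and toy values are hypothetical.

```python
def sign(x):
    """Map a real value to +1 or -1 (zero is mapped to +1 by assumption)."""
    return 1 if x >= 0 else -1

def binarize_weights(weights):
    """Return (binary weights E, scale beta), with beta = mean absolute weight."""
    binary = [sign(w) for w in weights]
    scale = sum(abs(w) for w in weights) / len(weights)
    return binary, scale

def binary_dot(inputs, binary, scale):
    """Dot product using only additions and subtractions, then one scaling."""
    acc = 0.0
    for x, b in zip(inputs, binary):
        acc = acc + x if b == 1 else acc - x
    return scale * acc

# Toy values standing in for one convolution window and its filter.
weights = [0.4, -0.2, 0.1, -0.5]
inputs = [1.0, 2.0, 3.0, 4.0]
binary, scale = binarize_weights(weights)
approx = binary_dot(inputs, binary, scale)
exact = sum(x * w for x, w in zip(inputs, weights))
print(binary, scale, approx, exact)
```

Each window needs only one multiplication (by the scale), and the weights can be stored as one bit each; this is the source of the compression and speed-up the text describes.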
Preferably, in step d, the network operation stage: the picture to be detected is input into the trained binarized BiSeNet to obtain the corresponding classification result and thereby detect the track line. The first and last layers of the binarized network keep full-precision weights. Analysis of the results shows that the pictures generated by the method are fine-grained and realistic, that good results are obtained for all scenes, and that track information not annotated in the training pictures can still be detected well, which demonstrates the robustness of the algorithm.
Compared with the prior art, the deep learning track line detection method based on a binarization network has the following beneficial effects:
1. The method is based on the bilateral segmentation network. By adjusting the structure and parameters of the network, track detection is treated as an instance segmentation problem; by improving the context path module, the attention refinement module and the feature fusion module of the BiSeNet, a deep-learning-based track detection network is built for end-to-end, real-time track detection, which improves the accuracy of the model and speeds up detection;
2. The end-to-end deep learning algorithm of the invention produces finer and more realistic results and does not rely on extensive post-processing;
3. The invention compresses the BiSeNet, which makes it possible to use the deep learning algorithm on embedded terminals and promotes the application of deep learning to devices such as mobile terminals.
Drawings
FIG. 1 is a flowchart of the overall deep learning track line detection method based on a binarization network according to the invention;
FIG. 2 is a structural diagram of the BiSeNet network in the deep learning track line detection method based on a binarization network according to the invention;
FIG. 3 is a structural diagram of the attention refinement module in the deep learning track line detection method based on a binarization network according to the invention;
FIG. 4 is a structural diagram of the feature fusion module in the deep learning track line detection method based on a binarization network according to the invention.
Detailed Description
To make the technical means, inventive features, objectives and effects of the invention easy to understand, the invention is further described below with reference to specific embodiments.
A deep learning track line detection method based on a binarization network comprises the following steps:
a. A sample preparation stage: a plurality of road scene videos are first collected by a camera; the videos are split into single frames, and the pictures are calibrated, trained, corrected, rotationally transformed and post-processed to obtain a picture data set P, which is divided into a training set X and a test set Y at a ratio of 4:1.
In the sample preparation stage, video cropping finally yields 2500 pictures with a resolution of 1280 × 720. Pictures of one part of the road lanes are selected as the training set, 2000 pictures in total, and pictures of another, different part of the road lanes are used as the test set, 500 pictures in total.
All pictures in the data set are annotated. The data set is annotated uniformly with AutoCAD, and people, vehicles and the surrounding environment are marked in different colors so that they can be told apart easily and the model parameters converge as quickly as possible during training.
b. A network creation stage: a BiSeNet is created.
The network creation stage comprises the following steps:
b1. The left branch of the BiSeNet model structure is a spatial path module, which addresses the loss of spatial information. It consists of three layers, each comprising a convolution with a stride of 2 followed by batch normalization and ReLU nonlinear activation. The feature map output by this path is 1/8 the size of the original image, so a relatively large feature map is available and rich spatial information is retained. The context path module on the right branch uses a pre-trained lightweight Xception model as its backbone network and down-samples rapidly to obtain a receptive field; a global average pooling layer is then appended to the tail of the Xception model to provide a global receptive field. Finally, the features of the last two stages are fused by means of a partial U-shaped structure. The spatial path module encodes rich spatial information and the context path module provides a sufficient receptive field; the two modules serve different needs and together complete the detection task;
b2. The BiSeNet network includes an attention refinement module, which optimizes the features of each stage. Because the features of the two paths differ in their level of representation, the outputs of the two paths cannot be combined directly. The module consists of a global pooling layer (Global pool), a 1 × 1 convolution layer (conv (1 × 1)), a batch normalization layer (Batch norm) and an activation function layer (sigmoid);
b3. The BiSeNet network uses the attention refinement module together with a feature fusion module, which fuses the output features of the two paths for the final prediction. The feature fusion module consists of a concatenate layer, a convolution layer (Conv), a batch normalization layer (bn), an activation function layer (relu), a global pooling layer (Global pool), a 1 × 1 convolution layer (Conv (1 × 1)), a batch normalization layer (Batch norm) and an activation function layer (sigmoid);
b4. The cross-entropy loss function is given by formula (1):
$$\mathrm{CE}(p, y) = -\left[\, y \log p + (1 - y) \log (1 - p) \,\right] \tag{1}$$
where p is the predicted probability that the sample label is 1, p ∈ [0, 1], and y is the value of the label.
Formula (1) can then be written as:
$$\mathrm{CE}(p, y) = \begin{cases} -\log p, & y = 1 \\ -\log (1 - p), & \text{otherwise} \end{cases} \tag{2}$$
If p_t is defined as:
$$p_t = \begin{cases} p, & y = 1 \\ 1 - p, & \text{otherwise} \end{cases} \tag{3}$$
then:
$$\mathrm{loss} = -\log(p_t) \tag{4}$$
To balance the imbalance between the track and the background, an improved cross-entropy loss is introduced, as shown in formula (5):
$$L_{\mathrm{loss}} = -\alpha \, (1 - p_t)^{2} \log(p_t) \tag{5}$$
Experiments show that with α = 0.35 the improved loss function lets the optimizer fit the network model better, and the model performs better.
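As a worked illustration of formula (5) with α = 0.35, consider two pixels chosen purely as examples: one classified well (p_t = 0.9) and one classified badly (p_t = 0.1). The numerical values below use natural logarithms and are rounded.

```latex
\begin{aligned}
p_t = 0.9:\quad & -\log(p_t) \approx 0.105, &
L_{\mathrm{loss}} &= 0.35 \times (0.1)^2 \times 0.105 \approx 3.7 \times 10^{-4} \\
p_t = 0.1:\quad & -\log(p_t) \approx 2.303, &
L_{\mathrm{loss}} &= 0.35 \times (0.9)^2 \times 2.303 \approx 0.653
\end{aligned}
```

Under the plain loss of formula (4) the hard pixel weighs about 22 times more than the easy one; under formula (5) the ratio grows to roughly 1770, so the many easy background pixels contribute far less to the total loss.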
c. A network creation stage: the BiSeNet is created as a binarized network.
The network creation stage comprises the following steps:
c1. The BiSeNet is created as a binarized BiSeNet, in which the weights and activation values are constrained to +1 or -1, as shown in formula (6):
[Formula (6), equation image]
where a_b is the binary activation value, a_r is the real-valued activation, w_b is the binary weight, and w_r is the real-valued weight;
c2. With the BiSeNet constrained to binarized weights, the convolution operation can be approximated by formula (7):
[Formula (7), equation image]
Here one symbol denotes the conventional convolution operation; because the weights in the convolution are binary, the operator shown in
[operator symbol, equation image]
denotes a convolution that involves only addition and subtraction, without multiplication. The optimal estimates of E and β are obtained by solving the following optimization problem:
[Formula (8), equation image]
Solving formula (8) gives:
[Formula (9), equation image]
Training the binary-weight network: a BiSeNet with binarized weights is trained, and the weights are binarized during both forward and backward propagation. The binary weights are computed according to formula (9), and the forward propagation of activations and the backward propagation of gradients are then computed with the scaled binary weights, where the gradient formula is:
[Gradient formula, equation image]
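The equation images for formulas (6) to (9) and for the gradient are not available in the text. One plausible reading, assuming the widely used binary-weight formulation that the symbols E and β and the surrounding description suggest, is sketched below. The symbols I (input), W (real-valued filter), n (number of weights in the filter), C (cost) and W̃ (scaled binary filter) are assumptions and do not appear in the text, as are the operator symbols * and ⊕.

```latex
% Assumed reconstruction, not taken from the source images.
\begin{aligned}
a_b &= \operatorname{sign}(a_r), \qquad w_b = \operatorname{sign}(w_r), \qquad
\operatorname{sign}(x) = \begin{cases} +1, & x \ge 0 \\ -1, & x < 0 \end{cases}
  && \text{(6)} \\
I * W &\approx (I \oplus E)\,\beta
  && \text{(7)} \\
(E^{*}, \beta^{*}) &= \arg\min_{E,\,\beta} \; \lVert W - \beta E \rVert^{2}
  && \text{(8)} \\
E^{*} &= \operatorname{sign}(W), \qquad \beta^{*} = \frac{1}{n} \lVert W \rVert_{\ell_1}
  && \text{(9)} \\
\frac{\partial C}{\partial W_i} &= \frac{\partial C}{\partial \widetilde{W}_i}
  \left( \frac{1}{n} + \frac{\partial \operatorname{sign}}{\partial W_i}\, \beta \right)
  && \text{(gradient)}
\end{aligned}
```

Under this reading, * is the conventional convolution and ⊕ is the convolution carried out with additions and subtractions only, which matches the description in step c2.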
d. A network operation stage: track line detection is performed with a binarized conditional generative adversarial network.
Network operation stage: the picture to be detected is input into the trained binarized BiSeNet to obtain the corresponding classification result and thereby detect the track line. The first and last layers of the binarized network keep full-precision weights. Analysis of the results shows that the pictures generated by the method are fine-grained and realistic, that good results are obtained for all scenes, and that track information not annotated in the training pictures can still be detected well, which demonstrates the robustness of the algorithm.
The invention relates to a deep learning track line detection method based on a binarization network, which is used to detect lanes in traffic pictures.
First, picture sets for training and testing the conditional generative adversarial network are created, with a training-to-test ratio of 4:1. The data set is annotated uniformly with AutoCAD to distinguish the targets from the surrounding environment; people, vehicles and the surrounding environment are marked in different colors so that they can be told apart easily and the model parameters converge as quickly as possible during training. Video cropping finally yields 8000 pictures with a resolution of 1280 × 720.
The main contribution of the BiSeNet network is a new approach: acquiring image spatial information and enlarging the receptive field of the image are decomposed into two paths, and the image features extracted by the two paths are finally fused by a feature fusion module.
The left branch of the BiSeNet model structure is a spatial path module, which addresses the loss of spatial information. It consists of three layers, each comprising a convolution with a stride of 2 followed by batch normalization and ReLU nonlinear activation. The feature map output by this path is 1/8 the size of the original image, so a relatively large feature map is available and rich spatial information is retained. The context path module on the right branch uses a pre-trained lightweight Xception model as its backbone network and down-samples rapidly to obtain a receptive field; a global average pooling layer is then appended to the tail of the Xception model to provide a global receptive field, and the features of the last two stages are finally fused by means of a partial U-shaped structure. The spatial path module encodes rich spatial information and the context path module provides a sufficient receptive field; the two modules serve different needs and together complete the detection task.
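The 1/8 output size of the spatial path follows from its three stride-2 convolutions. The short sketch below checks this for a 1280 × 720 input; the kernel size of 3 and padding of 1 are assumptions, since the text specifies only the stride.

```python
def conv_output_size(size, kernel=3, stride=2, padding=1):
    """Standard convolution output-size formula along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def spatial_path_output(width, height, num_layers=3):
    """Apply the stride-2 convolution size rule once per spatial path layer."""
    for _ in range(num_layers):
        width = conv_output_size(width)
        height = conv_output_size(height)
    return width, height

print(spatial_path_output(1280, 720))  # (160, 90)
```

Each layer halves both dimensions, so three layers give 1280/8 = 160 and 720/8 = 90.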
The BiSeNet network includes an attention refinement module (ARM), which optimizes the features of each stage. Because the features of the two paths differ in their level of representation, the outputs of the two paths cannot be combined directly. The module consists of a global pooling layer (Global pool), a 1 × 1 convolution layer (conv (1 × 1)), a batch normalization layer (Batch norm) and an activation function layer (sigmoid).
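The layer sequence of the attention refinement module can be traced with a small pure-Python sketch on nested lists. The final channel-wise reweighting of the input follows the usual BiSeNet design and is an assumption here, as are the function names, the fixed batch normalization statistics and the toy tensors.

```python
import math

def global_average_pool(feature_map):
    """feature_map[c][h][w] -> one mean value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_map]

def conv1x1(vector, weights, bias):
    """A 1 x 1 convolution on a pooled vector is a matrix-vector product."""
    return [sum(w * v for w, v in zip(row, vector)) + b for row, b in zip(weights, bias)]

def batch_norm(vector, mean, var, eps=1e-5):
    """Inference-style batch normalization with fixed (assumed) statistics."""
    return [(v - m) / math.sqrt(s + eps) for v, m, s in zip(vector, mean, var)]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_refine(feature_map, weights, bias, mean, var):
    """Global pool -> 1 x 1 conv -> batch norm -> sigmoid -> reweight channels."""
    pooled = global_average_pool(feature_map)
    normed = batch_norm(conv1x1(pooled, weights, bias), mean, var)
    attention = [sigmoid(v) for v in normed]
    return [[[a * x for x in row] for row in ch] for ch, a in zip(feature_map, attention)]

# Toy input: 2 channels of size 2 x 2, identity 1 x 1 convolution.
features = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, -1.0], [1.0, 0.0]]]
identity = [[1.0, 0.0], [0.0, 1.0]]
refined = attention_refine(features, identity, [0.0, 0.0], [0.0, 0.0], [1.0, 1.0])
print(refined)
```

The sigmoid output acts as one attention weight per channel, so channels with a strong global response are kept and weak ones are attenuated.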
The BiSeNet network uses the attention refinement module (ARM) together with a feature fusion module (FFM), which fuses the output features of the two paths for the final prediction. The feature fusion module consists of a concatenate layer, a convolution layer (Conv), a batch normalization layer (bn), an activation function layer (relu), a global pooling layer (Global pool), a 1 × 1 convolution layer (Conv (1 × 1)), a batch normalization layer (Batch norm) and an activation function layer (sigmoid).
Because lane lines and tracks share the same characteristics, the network model is pre-trained on CULane, the large lane line data set proposed by Pan et al. of The Chinese University of Hong Kong, and is then retrained on a self-built track data set to learn and adjust the parameters.
The cross-entropy loss function is given by formula (1):
$$\mathrm{CE}(p, y) = -\left[\, y \log p + (1 - y) \log (1 - p) \,\right] \tag{1}$$
where p is the predicted probability that the sample label is 1, p ∈ [0, 1], and y is the value of the label.
Formula (1) can then be written as:
$$\mathrm{CE}(p, y) = \begin{cases} -\log p, & y = 1 \\ -\log (1 - p), & \text{otherwise} \end{cases} \tag{2}$$
If p_t is defined as:
$$p_t = \begin{cases} p, & y = 1 \\ 1 - p, & \text{otherwise} \end{cases} \tag{3}$$
then:
$$\mathrm{loss} = -\log(p_t) \tag{4}$$
To balance the imbalance between the track and the background, an improved cross-entropy loss is introduced, as shown in formula (5):
$$L_{\mathrm{loss}} = -\alpha \, (1 - p_t)^{2} \log(p_t) \tag{5}$$
Experiments show that with α = 0.35 the improved loss function lets the optimizer fit the network model better, and the model performs better.
The BiSeNet is created as a binarized BiSeNet, in which the weights and activation values are constrained to +1 or -1, as shown in formula (6):
[Formula (6), equation image]
where a_b is the binary activation value, a_r is the real-valued activation, w_b is the binary weight, and w_r is the real-valued weight.
With the BiSeNet constrained to binarized weights, the convolution operation can be approximated by formula (7):
[Formula (7), equation image]
Here one symbol denotes the conventional convolution operation; because the weights in the convolution are binary, the operator shown in
[operator symbol, equation image]
denotes a convolution that involves only addition and subtraction, without multiplication. The optimal estimates of E and β are obtained by solving the following optimization problem:
[Formula (8), equation image]
Solving formula (8) gives:
[Formula (9), equation image]
Training the binary-weight network: a BiSeNet with binarized weights is trained, and the weights are binarized during both forward and backward propagation. The binary weights are computed according to formula (9), and the forward propagation of activations and the backward propagation of gradients are then computed with the scaled binary weights, where the gradient formula is:
[Gradient formula, equation image]
The picture to be detected is input into the trained binarized BiSeNet to obtain the corresponding classification result and thereby detect the track line; the first and last layers of the binarized network keep full-precision weights. Analysis of the results shows that the pictures generated by the method are fine-grained and realistic, that good results are obtained for all scenes, and that track information not annotated in the training pictures can still be detected well, which demonstrates the robustness of the algorithm.
For a binarized convolutional neural network, binarizing the earlier layers causes a loss of accuracy, whereas binarizing the later layers has only a slight effect; full-precision weights are therefore retained for the first and last layers during binarization.
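The rule that the first and last layers keep full-precision weights can be expressed as a simple selection over an ordered list of layers. The layer names below are hypothetical placeholders, not names from the text.

```python
def layers_to_binarize(layer_names):
    """Return the layers whose weights are binarized: all but the first and last."""
    if len(layer_names) <= 2:
        return []
    return layer_names[1:-1]

# Hypothetical layer names, listed in forward order.
layers = ["conv1", "conv2", "conv3", "conv4", "classifier"]
print(layers_to_binarize(layers))  # ['conv2', 'conv3', 'conv4']
```

Only the intermediate layers are handed to the binarization step; the input-facing and output-facing layers are left untouched.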
In addition, because the vehicle body occupies space beyond the outer side of the track as the vehicle passes, pedestrians are not safe if they avoid only the rails and the space between them; room for the vehicle body to pass must also be left. Therefore, after the track is detected, a monocular distance measurement method is used to determine the safe distance, and the actual coordinates of each pixel are calculated by similar-triangle proportions. The known quantities are: the camera height H; the distance along the y axis between the camera and the world coordinate point corresponding to the image center; the image coordinates of the lens center; the image coordinates of the measured pixel; the actual pixel length xpix; the actual pixel width ypix; and the camera focal length f. The y-axis calculation is the same as in the previous model, and the x-axis coordinate is obtained from the y-axis coordinate by proportion, so the coordinate in the perpendicular direction is obtained as well. From the known safe distance on both sides of the track, the safe distance from the track is back-projected and drawn in the picture. If a pedestrian is detected who is not keeping the safe distance, a prompt must be issued so that the pedestrian can move away in time and underground rail traffic accidents are avoided.
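The text does not give the ranging formulas, so the following is only a sketch of one standard pinhole, flat-ground model built from the listed known quantities. The function name, the parameter names and the sample camera values are assumptions; `focal`, `xpix` and `ypix` must share one unit (for example millimetres), and `cam_height` and `center_distance` another (for example metres).

```python
import math

def pixel_to_ground(u, v, u0, v0, cam_height, center_distance, focal, xpix, ypix):
    """Map image pixel (u, v) to ground coordinates (x, y) in front of the camera."""
    # Pitch of the optical axis below the horizontal, from H and the distance
    # of the ground point seen at the image center.
    theta = math.atan2(cam_height, center_distance)
    dx = (u - u0) * xpix   # metric offset to the right of the image center
    dv = (v - v0) * ypix   # metric offset below the image center
    denom = focal * math.sin(theta) + dv * math.cos(theta)
    if denom <= 0:
        raise ValueError("pixel lies at or above the horizon")
    # Similar triangles: forward distance first, then the lateral offset.
    y = cam_height * (focal * math.cos(theta) - dv * math.sin(theta)) / denom
    x = cam_height * dx / denom
    return x, y

# Hypothetical camera: 2 m high, image center looks at a point 10 m ahead,
# 4 mm focal length, 0.003 mm pixels, 1280 x 720 image.
print(pixel_to_ground(640, 360, 640, 360, 2.0, 10.0, 4.0, 0.003, 0.003))
print(pixel_to_ground(700, 460, 640, 360, 2.0, 10.0, 4.0, 0.003, 0.003))
```

With ground coordinates for the detected rail pixels, the safe distance on either side of the track can be added in metres and projected back into the picture.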
finally, inputting pictures for testing by using a trained lane line detection technology; the experimental results show that the lane detection algorithm of the invention has better effect, can quickly identify lane lines, is obviously superior to the traditional lane identification method of highways, and contributes to the development of embedded deep learning.
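The monocular ranging step described above lists its known quantities but does not spell out the formulas. As a rough illustration only, the following Python sketch uses one common flat-ground pinhole formulation based on similar triangles. The function name `pixel_to_ground`, its parameter names, the convention that image rows increase downward, and the numeric values in the example are all assumptions and are not taken from the patent.

```python
import math


def pixel_to_ground(u, v, *, cam_height, center_ground_dist,
                    u0, v0, xpix, ypix, focal):
    """Map an image pixel (u, v) to ground coordinates (x, y).

    Hypothetical flat-ground model: the camera sits at height cam_height and
    its optical axis hits the ground center_ground_dist ahead on the y axis.
    xpix, ypix and focal must share one length unit (for example mm).
    """
    # Depression angle of the optical axis below the horizontal.
    alpha = math.atan2(cam_height, center_ground_dist)
    # Vertical offset of the pixel on the sensor (positive below the centre).
    dy = (v - v0) * ypix
    beta = math.atan2(dy, focal)
    gamma = alpha + beta
    if gamma <= 0:
        raise ValueError("pixel lies at or above the horizon")
    # Forward distance along the ground.
    y = cam_height / math.tan(gamma)
    # Similar triangles: lateral offset scales with ray length over image ray length.
    ray_ground = math.hypot(cam_height, y)
    ray_image = math.hypot(focal, dy)
    x = (u - u0) * xpix * ray_ground / ray_image
    return x, y


if __name__ == "__main__":
    # Illustrative camera values only.
    print(pixel_to_ground(740, 460, cam_height=2.0, center_ground_dist=10.0,
                          u0=640, v0=360, xpix=0.003, ypix=0.003, focal=4.0))
```

For the image-centre pixel this sketch returns the known centre distance on the y axis and a zero lateral offset, which gives a quick sanity check of the geometry.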
In use, the structure and parameters of the bilateral segmentation network (BiSeNet) are adjusted and track detection is treated as an instance segmentation problem. A deep-learning track detection network is built by improving the context path module, the attention refinement module and the feature fusion module of the BiSeNet network, and it performs end-to-end real-time detection of the track, which improves the accuracy of the model and speeds up detection.
The end-to-end deep learning algorithm provided by the invention produces finer and more realistic results and does not rely on extensive post-processing.
The invention also compresses BiSeNet, which makes it possible to run the deep learning algorithm on embedded terminals and promotes the application of deep learning algorithms on devices such as mobile terminals.
First, a picture data set is obtained. BiSeNet is then created by constructing a Spatial Path with 3 binarized convolution layers, 3 batch normalization layers and 3 activation layers; an Attention Refinement Module with a global pooling layer, a binarized convolution layer, a batch normalization layer and an activation layer; and a Feature Fusion Module with 2 binarized convolution layers, 2 batch normalization layers, an activation layer, a global pooling layer and a further activation layer. The BiSeNet network is then built as a binarized network: the convolution-layer weight parameters and the activation values are binarized, which accelerates the neural network.
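As a small worked example of the Spatial Path sizing, the sketch below computes the feature-map size after three convolution layers. The stride of 2 and the 1280 × 720 input size are stated in the claims; the 3 × 3 kernel and padding of 1 are assumptions chosen for illustration, and the function names are hypothetical.

```python
def conv_output_size(size, kernel=3, stride=2, padding=1):
    """Standard convolution output-size formula for one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def spatial_path_output(width, height, num_layers=3):
    """Apply num_layers stride-2 convolutions and return (width, height)."""
    for _ in range(num_layers):
        width = conv_output_size(width)
        height = conv_output_size(height)
    return width, height


if __name__ == "__main__":
    # A 1280 x 720 picture shrinks by a factor of 8 in each dimension.
    print(spatial_path_output(1280, 720))
```

Under these assumptions a 1280 × 720 picture yields a 160 × 90 feature map, which matches the 1/8 output size given for the spatial path.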
The foregoing shows and describes the basic principles, main features and advantages of the present invention. Those skilled in the art will understand that the invention is not limited to the embodiments described above; the embodiments and the description only illustrate the principle of the invention, and various changes and modifications may be made without departing from its spirit and scope, all of which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and their equivalents.

Claims (6)

1. A deep learning track line detection method based on a binarization network, characterized in that the method comprises the following steps:
a. a sample preparation stage: a plurality of road scene videos are first collected by a camera; after the videos are split into single frames, the pictures are calibrated, trained, corrected, rotation-transformed and post-processed to obtain a picture data set P, and the data set P is divided into a training set X and a test set Y at a ratio of 4:1;
b. a network creation stage: creating the BiSeNet;
c. a network creation stage: building the BiSeNet as a binarized network;
d. a network operation stage: performing track line detection with the binarized conditional generative adversarial network.
2. The deep learning track line detection method based on a binarization network according to claim 1, characterized in that: in step a, the sample preparation stage finally obtains 2500 pictures with a resolution of 1280 × 720 through video cutting; the pictures of one part of the highway lanes are selected as the training set, which contains 2000 pictures, and the pictures of the other, different highway lanes are selected as the test set, which contains 500 pictures.
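The split described in claims 1 and 2 (whole lanes assigned to either training or testing, 2000 versus 500 pictures) can be sketched as follows. This is only an illustration: the scene identifiers, the file names, the function name `split_by_scene` and the assumption of five equally sized scenes are hypothetical.

```python
def split_by_scene(frames, train_scenes):
    """Split (scene_id, path) pairs so that no scene appears in both sets."""
    train = [path for scene, path in frames if scene in train_scenes]
    test = [path for scene, path in frames if scene not in train_scenes]
    return train, test


# Hypothetical data set: 5 scenes with 500 frames each, 2500 frames in total.
frames = [(f"scene_{i // 500}", f"frame_{i:04d}.png") for i in range(2500)]
train_scenes = {"scene_0", "scene_1", "scene_2", "scene_3"}

train_set, test_set = split_by_scene(frames, train_scenes)
print(len(train_set), len(test_set))  # 2000 500, a 4:1 ratio
```

Splitting by scene rather than by shuffled frame keeps near-identical neighbouring frames from the same video out of both sets at once.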
3. The deep learning track line detection method based on a binarization network according to claim 2, characterized in that: all pictures in the data set are annotated, the annotation is performed uniformly with AutoCAD, and people, vehicles and the surrounding environment are marked in different colors.
4. The deep learning track line detection method based on a binarization network according to claim 1, characterized in that: in step b, the network creation stage comprises the following steps:
b1, the left branch of the BiSeNet network model structure is a spatial path module composed of three layers, each comprising a convolution with a stride of 2 followed by batch normalization and ReLU nonlinear activation, and the feature map output by this path is 1/8 the size of the original image; the context path module of the right branch uses a pre-trained lightweight Xception model as the backbone network to down-sample quickly and obtain a receptive field, a global average pooling layer is then added at the tail of the Xception model to provide a global receptive field, and finally the features of the last two stages are fused by means of a partial U-shaped structure; the spatial path module encodes rich spatial information, the context path module provides a sufficient receptive field, and the two modules together complete the detection task;
b2, the BiSeNet network includes an attention refinement module, which refines the features of each stage; the module consists of a global pooling layer (Global pool), a 1 × 1 convolution layer (Conv 1 × 1), a batch normalization layer (Batch norm) and an activation function layer (sigmoid);
b3, the BiSeNet network uses the attention refinement module together with a feature fusion module; the feature fusion module fuses the output features of the two paths for the final prediction, and consists of a concatenate layer, a convolution layer (Conv), a batch normalization layer (bn), an activation function layer (relu), a global pooling layer (Global pool), a 1 × 1 convolution layer (Conv 1 × 1), a batch normalization layer (Batch norm) and an activation function layer (sigmoid);
b4, the cross-entropy loss function is shown in formula (1):
$$CE = -\left[\, y \log p + (1 - y) \log(1 - p) \,\right] \tag{1}$$
where p represents the probability that the sample label is 1, p ∈ [0, 1], and y represents the value of the label;
formula (1) can then be written as:
$$CE = \begin{cases} -\log(p), & y = 1 \\ -\log(1 - p), & \text{otherwise} \end{cases} \tag{2}$$
if p_t is defined as:
$$p_t = \begin{cases} p, & y = 1 \\ 1 - p, & \text{otherwise} \end{cases} \tag{3}$$
then:
$$loss = -\log(p_t) \tag{4}$$
to balance the imbalance between the track and the background, an improved cross-entropy loss is introduced, as shown in formula (5):
$$L_{loss} = -\alpha \,(1 - p_t)^{2} \log(p_t) \tag{5}$$
where α is 0.35.
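The improved cross-entropy loss of formula (5) can be evaluated per pixel as in the following minimal Python sketch. The value α = 0.35 and the exponent 2 come from the claim; the function name `improved_ce_loss` and the small clamp `eps` that avoids log(0) are assumptions added for illustration.

```python
import math

ALPHA = 0.35      # class-balance weight given in the claim
EXPONENT = 2      # exponent of (1 - p_t) in formula (5)


def improved_ce_loss(p, y, eps=1e-12):
    """Loss of formula (5) for predicted probability p and binary label y."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if y not in (0, 1):
        raise ValueError("y must be 0 or 1")
    # p_t is the probability assigned to the true class.
    p_t = p if y == 1 else 1.0 - p
    # Clamp away from zero so the logarithm stays finite (illustrative choice).
    p_t = max(p_t, eps)
    return -ALPHA * (1.0 - p_t) ** EXPONENT * math.log(p_t)


if __name__ == "__main__":
    # A confident correct prediction is penalised far less than a wrong one.
    print(improved_ce_loss(0.9, 1), improved_ce_loss(0.1, 1))
```

The factor (1 − p_t)² shrinks the contribution of pixels that are already classified well, so the many easy background pixels do not dominate the few track pixels.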
5. The deep learning track line detection method based on a binarization network according to claim 1, characterized in that: in step c, the network creation stage comprises the following steps:
c1, the BiSeNet is built as a binarized BiSeNet, in which the weights and the activation values are constrained to +1 or −1, as shown in formula (6):
$$a_b = \operatorname{sign}(a_r) = \begin{cases} +1, & a_r \ge 0 \\ -1, & \text{otherwise} \end{cases} \qquad w_b = \operatorname{sign}(w_r) = \begin{cases} +1, & w_r \ge 0 \\ -1, & \text{otherwise} \end{cases} \tag{6}$$
where a_b is the binary activation value, a_r is the real-valued activation, w_b is the binary weight, and w_r is the real-valued weight;
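c2, with the weights of the BiSeNet constrained to binary values, the convolution operation can be approximated by formula (7):
$$I * W \approx (I \oplus E)\,\beta \tag{7}$$
where ∗ denotes the conventional convolution operation and ⊕ denotes a convolution operation involving only addition and subtraction, not multiplication; I is the layer input, W is the real-valued weight, E is the binary weight and β is a scaling factor; the optimal estimates of E and β are obtained by solving the following optimization problem:
$$J(E, \beta) = \lVert W - \beta E \rVert^{2}, \qquad \beta^{*}, E^{*} = \underset{\beta,\,E}{\arg\min}\; J(E, \beta) \tag{8}$$
expanding formula (8) and solving gives:
$$E^{*} = \operatorname{sign}(W), \qquad \beta^{*} = \frac{1}{n} \lVert W \rVert_{\ell 1} \tag{9}$$
where n is the number of elements of W;
training the binary weight network: a BiSeNet with binarized weights is trained, and the weights are binarized during both forward propagation and back-propagation; the binary weight is calculated according to formula (9), and the forward propagation of the activations and the back-propagation of the gradients are then computed with the scaled binary weight; the gradient formula is
$$\frac{\partial C}{\partial W_i} = \frac{\partial C}{\partial \widetilde{W}_i} \left( \frac{1}{n} + \frac{\partial \operatorname{sign}}{\partial W_i}\, \beta \right)$$
where C is the cost function and $\widetilde{W} = \beta E$ is the scaled binary weight.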
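Step c2 approximates a real-valued weight filter by a binary weight multiplied by a scaling factor (named E and β in the text). The sketch below illustrates the usual closed-form choice for such a binary weight network, namely the sign of each weight and the mean absolute weight. It assumes that this is the solution the claim refers to; the function names and sample weights are hypothetical.

```python
def binarize_weights(weights):
    """Return (signs, scale) so that scale * signs approximates weights."""
    if not weights:
        raise ValueError("weights must be non-empty")
    # Binary weight: +1 for non-negative entries, -1 otherwise.
    signs = [1.0 if w >= 0 else -1.0 for w in weights]
    # Scaling factor: mean absolute value (L1 norm divided by n).
    scale = sum(abs(w) for w in weights) / len(weights)
    return signs, scale


def approximation_error(weights, signs, scale):
    """Squared error between the real weights and their scaled binary form."""
    return sum((w - scale * s) ** 2 for w, s in zip(weights, signs))


if __name__ == "__main__":
    w = [0.5, -1.5, 0.25, -0.75]
    e, beta = binarize_weights(w)
    print(e, beta, approximation_error(w, e, beta))
```

For a fixed sign pattern, the mean absolute value is the scale that minimises the squared error, so nudging the scale up or down in the example only makes the approximation worse.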
6. The deep learning track line detection method based on a binarization network according to claim 1, characterized in that: in step d, the network operation stage, the picture to be detected is input into the trained binarized BiSeNet network to obtain the corresponding classification result, which completes the detection of the track line, wherein the first layer and the last layer of the binarized network keep full-precision weights.
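The rule that the first and last layers keep full-precision weights can be expressed as a simple per-layer plan, sketched below. The layer names and the function name `binarization_plan` are hypothetical placeholders and do not correspond to identifiers from the patent or from any BiSeNet implementation.

```python
def binarization_plan(layer_names):
    """Return (name, binarize) pairs; first and last layers stay full precision."""
    last = len(layer_names) - 1
    # Only layers strictly between the first and the last are binarized.
    return [(name, 0 < i < last) for i, name in enumerate(layer_names)]


if __name__ == "__main__":
    # Hypothetical layer order for illustration.
    layers = ["stem", "spatial_2", "spatial_3", "fusion", "head"]
    for name, binarize in binarization_plan(layers):
        print(name, "binary" if binarize else "full precision")
```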

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910940999.0A CN110852157A (en) 2019-09-30 2019-09-30 Deep learning track line detection method based on binarization network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910940999.0A CN110852157A (en) 2019-09-30 2019-09-30 Deep learning track line detection method based on binarization network

Publications (1)

Publication Number Publication Date
CN110852157A true CN110852157A (en) 2020-02-28

Family

ID=69597420

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910940999.0A Pending CN110852157A (en) 2019-09-30 2019-09-30 Deep learning track line detection method based on binarization network

Country Status (1)

Country Link
CN (1) CN110852157A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190266418A1 (en) * 2018-02-27 2019-08-29 Nvidia Corporation Real-time detection of lanes and boundaries by autonomous vehicles
CN109635744A (en) * 2018-12-13 2019-04-16 合肥工业大学 A kind of method for detecting lane lines based on depth segmentation network
CN110147794A (en) * 2019-05-21 2019-08-20 东北大学 A kind of unmanned vehicle outdoor scene real time method for segmenting based on deep learning

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
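Changqian Yu et al.: "BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation", 《ARXIV[CS.CV]》 *
Davy Neven et al.: "Towards End-to-End Lane Detection: an Instance Segmentation Approach", 《2018 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV)》 *
Matthieu Courbariaux et al.: "Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1", 《ARXIV[CS.LG]》 *
李乔伊: "基于视觉的车道线检测技术研究" [Research on vision-based lane line detection technology], 《中国优秀博硕士学位论文全文数据库(硕士)工程科技Ⅱ辑》 [China Master's Theses Full-text Database, Engineering Science and Technology II] *
李升波 等: "深度神经网络的关键技术及其在自动驾驶领域的应用" [Key technologies of deep neural networks and their application in autonomous driving], 《汽车安全与节能学报》 [Journal of Automotive Safety and Energy] *
韩江洪 等: "基于空间卷积神经网络的井下轨道检测方法" [Underground track detection method based on a spatial convolutional neural network], 《电子测量与仪器学报》 [Journal of Electronic Measurement and Instrumentation] *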

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112001878A (en) * 2020-05-21 2020-11-27 合肥合工安驰智能科技有限公司 Deep learning ore scale measuring method based on binarization neural network and application system
CN112164047A (en) * 2020-09-25 2021-01-01 上海联影医疗科技股份有限公司 X-ray image metal detection method and device and computer equipment
CN116206224A (en) * 2023-03-06 2023-06-02 北京交通大学 Full-angle identification method for railway line of Unmanned Aerial Vehicle (UAV) railway inspection
CN116206224B (en) * 2023-03-06 2024-01-02 北京交通大学 Full-angle identification method for railway line of Unmanned Aerial Vehicle (UAV) railway inspection

Similar Documents

Publication Publication Date Title
WO2022141910A1 (en) Vehicle-road laser radar point cloud dynamic segmentation and fusion method based on driving safety risk field
CN111582029B (en) Traffic sign identification method based on dense connection and attention mechanism
CN111814621A (en) Multi-scale vehicle and pedestrian detection method and device based on attention mechanism
CN105260712A (en) Method and system for detecting pedestrian in front of vehicle
He et al. Rail transit obstacle detection based on improved CNN
CN110852157A (en) Deep learning track line detection method based on binarization network
CN106934374B (en) Method and system for identifying traffic signboard in haze scene
Liu et al. Real-time signal light detection based on yolov5 for railway
CN113052159B (en) Image recognition method, device, equipment and computer storage medium
CN114677507A (en) Street view image segmentation method and system based on bidirectional attention network
Cao et al. MCS-YOLO: A multiscale object detection method for autonomous driving road environment recognition
CN112949633A (en) Improved YOLOv 3-based infrared target detection method
CN104881661A (en) Vehicle detection method based on structure similarity
CN114782949B (en) Traffic scene semantic segmentation method for boundary guide context aggregation
CN117994206A (en) Dam crack detection model
CN114973199A (en) Rail transit train obstacle detection method based on convolutional neural network
Abraham et al. Traffic lights and traffic signs detection system using modified you only look once
CN114821510B (en) Lane line detection method and device based on improved U-Net network
Sato et al. Scene recognition for blind spot via road safety mirror and in-vehicle camera
CN116311146A (en) Traffic sign detection method based on deep learning
Chen et al. Near real-time situation awareness and anomaly detection for complex railway environment
CN113537000A (en) Monocular vision instance segmentation depth chain type feature extraction network, method and system
CN113869239A (en) Traffic signal lamp countdown identification system and construction method and application method thereof
Zhang YOLO Series Target Detection Technology and Application
Yao et al. TL-detector: Lightweight based real-time traffic light detection model for intelligent vehicles

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
AD01 Patent right deemed abandoned (effective date of abandoning: 2023-04-07)