CN116311140A - Method, apparatus and storage medium for detecting lane lines - Google Patents

Method, apparatus and storage medium for detecting lane lines

Info

Publication number
CN116311140A
CN116311140A (application CN202310526923.XA); granted publication CN116311140B
Authority
CN
China
Prior art keywords
image
lane line
view
lane
instance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310526923.XA
Other languages
Chinese (zh)
Other versions
CN116311140B (en)
Inventor
王鸿飞
姚佳丽
李波
赖福辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jika Intelligent Robot Co ltd
Original Assignee
Jika Intelligent Robot Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jika Intelligent Robot Co ltd filed Critical Jika Intelligent Robot Co ltd
Priority to CN202310526923.XA priority Critical patent/CN116311140B/en
Publication of CN116311140A publication Critical patent/CN116311140A/en
Application granted granted Critical
Publication of CN116311140B publication Critical patent/CN116311140B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/588Recognition of the road, e.g. of lane markings; Recognition of the vehicle driving pattern in relation to the road
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4046Scaling of whole images or parts thereof, e.g. expanding or contracting using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure relates to a method, apparatus, and storage medium for detecting lane lines. The method comprises: acquiring a lane line image set to be used for training a lane line detection model, the set comprising left front, left rear, right front, and right rear view images captured by a target vehicle at the same moment; determining lane line instance point set information and lane line instance classification information associated with the lane lines in the images; combining the left front and right front view images to obtain a front-view combined image, combining the left rear and right rear view images to obtain a rear-view combined image, and scaling the two combined images respectively; converting the lane line instance point set information to the coordinates of the scaled images; and obtaining a target data set for training the lane line detection model based on the lane line instance classification information and the converted lane line instance point set information. In this way, lane lines of different appearances, positions, and features can be detected accurately under lateral viewing angles.

Description

Method, apparatus and storage medium for detecting lane lines
Technical Field
The present disclosure relates generally to the field of intelligent traffic and image detection, and in particular to a method, apparatus and storage medium for detecting lane lines.
Background
Lane lines are a basic component of road traffic, and lane line detection is an important basis for automatic navigation and vehicle trajectory planning in the field of automatic driving. With accurate lane line detection, an autonomous vehicle can perform in-lane driving, overtaking, lane changing, and other operations, improving safety and efficiency; it can also provide real-time lane line information and warnings to a driver, helping the driver keep the vehicle in the correct lane and reducing the risk of traffic accidents.
The current lane line detection methods are mainly divided into two types: traditional lane line detection algorithms and lane line detection algorithms based on deep learning.
Conventional lane line detection methods generally rely on visual cues, for example extracting image features, applying edge detection algorithms, and fitting straight lines or curves. However, such methods are not robust to illumination changes, complex backgrounds, occlusion, and similar conditions, and their detection results are easily affected by environmental factors.
Deep-learning-based lane line detection algorithms can automatically learn target features and correct the learning parameters during model training, which improves robustness and accuracy to a certain extent. However, current research on lane line detection mainly focuses on the forward view, and there is little research on detection under lateral views.
To ensure the safety of automatic driving, an automatic driving system cannot rely on a forward-looking camera alone; it must also perform multi-view perception. For multi-view perception, the perception range is not limited to lane lines under the forward view: lane lines under lateral views must also be detected. Because the appearance, position, and features of lane lines differ between the lateral and forward views, directly applying a trained forward-view lane line detection model to the lateral views places particularly high demands on the generalization capability of the network and is generally difficult to realize. Some models are not used for lateral detection at all, because the features of lateral lane lines change greatly relative to forward lane lines.
Addressing the difficulty of applying Line Anchors to the detection of lane lines in small lateral views, Chinese patent application CN115205800A (a surround-view lane line model training and detection method for automatic driving) adopts a "rotate first, then merge" strategy to make the lateral-view lane lines consistent with the forward-view sample space and thereby detect lateral lane lines. However, this method requires strong prior information, and its generalization and accuracy for lane line detection are poor. Moreover, the model has a large computation load and high hardware requirements.
Therefore, there is an urgent need for a lane line detection scheme under a lateral multi-view angle to at least partially solve the technical problems existing in the prior art.
Disclosure of Invention
According to an example embodiment of the present disclosure, a scheme for detecting lane lines is provided.
In a first aspect of the present disclosure, a method for training a lane line detection model is provided. The lane line detection model is used for lane line detection under lateral multi-view angles, and the method comprises: acquiring a lane line image set to be used for training the lane line detection model, wherein the lane line image set comprises a left front view image, a left rear view image, a right front view image, and a right rear view image captured by a target vehicle at the same moment; determining lane line instance point set information and lane line instance classification information associated with lane lines in the images of the lane line image set; combining the left front view image and the right front view image to obtain a front-view combined image, combining the left rear view image and the right rear view image to obtain a rear-view combined image, and scaling the front-view combined image and the rear-view combined image respectively; converting the lane line instance point set information of the scaled front-view combined image and the scaled rear-view combined image; and obtaining a target data set for training the lane line detection model based on the lane line instance classification information and the converted lane line instance point set information.
In some embodiments, acquiring a set of lane line images to be used for training a lane line detection model may include: obtaining a plurality of pieces of video data of a left front side view angle, a left rear side view angle, a right front side view angle, and a right rear side view angle by using a camera arranged on a target vehicle; and extracting lane line image sets from the multiple segments of video data based on the preset frame extraction frequency.
In some embodiments, determining the lane line instance point set information and lane line instance classification information associated with the lane lines of the lane line image set may include: labeling the lane lines as point sets and labeling classification information for the lane lines, to obtain the lane line instance point set information and the lane line instance classification information.
In some embodiments, combining the left and right front view images to obtain a front view combined image, combining the left and right rear view images to obtain a rear view combined image, and scaling the front view combined image and the rear view combined image, respectively, may include: combining the left front view image and the right front view image in the horizontal direction, wherein the left front view image is positioned on the left side, and the right front view image is positioned on the right side; combining the left back view image and the right back view image in a horizontal direction, wherein the left back view image is positioned on the right side, and the right back view image is positioned on the left side; and scaling the forward-looking combined image and the backward-looking combined image to a single-size image size.
In some embodiments, converting the lane line instance point set information of the scaled front-view combined image and the scaled rear-view combined image may include: for the left front view image and the right rear view image, the lane line instance point set conversion rule may be U_after = U_before / 2, V_after = V_before; for the right front view image and the left rear view image, the conversion rule may be U_after = U_before / 2 + w / 2, V_after = V_before; where (U_after, V_after) is the converted coordinate point, (U_before, V_before) is the coordinate point before conversion, and w is the width of the scaled front-view or rear-view combined image.
In some embodiments, obtaining the target data set for training the lane line detection model based on the lane line instance classification information and the converted lane line instance point set information may include: connecting the lane line instance point set information to obtain a lane line instance segmentation map; processing the lane line starting point coordinates to obtain a lane line instance starting point heat map; generating lane line instance classification parameters from the lane line instance classification information; and obtaining the target data set based on the lane line instance segmentation map, the lane line instance starting point heat map, and the lane line instance classification parameters.
In a second aspect of the present disclosure, a method for detecting a lane line is provided. The lane lines include lateral multi-view lane lines, and the method includes: acquiring an image to be detected, wherein the image to be detected comprises a left front view angle image, a left rear view angle image, a right front view angle image and a right rear view angle image which are captured by a target vehicle at the same moment; and detecting the image to be detected by using the lane line detection model according to the first aspect of the disclosure to obtain a lane line detection result.
In some embodiments, the method may further comprise: combining the left front view image and the right front view image to obtain a front-view combined image, combining the left rear view image and the right rear view image to obtain a rear-view combined image, and scaling the front-view combined image and the rear-view combined image respectively; determining a lane line instance segmentation map and lane line instance classification information corresponding to the scaled front-view combined image and the scaled rear-view combined image; and determining the actual coordinates of the left and right lane lines based on the determined lane line instance segmentation map and lane line instance classification information.
In a third aspect of the present disclosure, an electronic device is provided. The device includes one or more processors; and a memory coupled to the processors, the memory having instructions stored therein that, when executed by the processors, cause the device to perform actions comprising: acquiring a lane line image set to be used for training a lane line detection model, wherein the lane line image set comprises a left front view image, a left rear view image, a right front view image, and a right rear view image captured by a target vehicle at the same moment; determining lane line instance point set information and lane line instance classification information associated with lane lines in the images of the lane line image set; combining the left front view image and the right front view image to obtain a front-view combined image, combining the left rear view image and the right rear view image to obtain a rear-view combined image, and scaling the front-view combined image and the rear-view combined image respectively; converting the lane line instance point set information of the scaled front-view combined image and the scaled rear-view combined image; and obtaining a target data set for training the lane line detection model based on the lane line instance classification information and the converted lane line instance point set information.
In a fourth aspect of the present disclosure, an electronic device is provided. The apparatus includes one or more processors; and a memory coupled to the processor, the memory having instructions stored therein that, when executed by the processor, cause the device to perform actions comprising: acquiring an image to be detected, wherein the image to be detected comprises a left front view angle image, a left rear view angle image, a right front view angle image and a right rear view angle image which are captured by a target vehicle at the same moment; and detecting the image to be detected by using the lane line detection model according to the first aspect of the disclosure to obtain a lane line detection result.
In a fifth aspect of the present disclosure, there is provided a computer readable medium having stored thereon a computer program which when executed by a processor implements a method according to the first aspect of the present disclosure.
In a sixth aspect of the present disclosure, there is provided a computer readable medium having stored thereon a computer program which when executed by a processor implements a method according to the second aspect of the present disclosure.
Various embodiments according to the present disclosure can at least provide the following technical effects:
In the training process of the detection model, in view of the obvious differences in appearance, position, and features between lane lines under lateral views and lane lines under the forward view, multiple lateral images are merged and then scaled, and the scaled images are converted to obtain the target training data, so that the model can accurately detect lane lines under lateral views. The trained model can predict lane lines in multiple lateral view images simultaneously, so that lane line detection is better adapted to lateral views.
Relatively few parameters are needed for merging and scaling the images, and many parameters used for scaling a single-size image can be reused, so that the computing power required by the model can be significantly reduced and the hardware cost lowered.
The lane line prediction model fully combines the main branches and the auxiliary branch during learning, which improves the learning capability of the model network; during inference, only the main branches may be retained, which increases the running speed of the model while still producing good inference results, so the model is robust and accurate.
The model network is fully applicable to lane line detection under lateral views: images of lane lines at different angles do not need to be rotated by different angles, accurate inference results can be obtained, strong prior information is not required, and the generalization and accuracy of the system are improved.
It should be understood that what is described in this summary is not intended to limit the critical or essential features of the embodiments of the disclosure nor to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The above and other features, advantages and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. In the drawings, the same or similar reference numerals denote the same or similar elements. The accompanying drawings are included to provide a better understanding of the present disclosure, and are not to be construed as limiting the disclosure, wherein:
FIG. 1 illustrates a schematic diagram of an example environment in which various embodiments of the present disclosure may be implemented;
FIG. 2 illustrates a schematic flow diagram of a process for training a lane line detection model according to some embodiments of the present disclosure;
FIG. 3 illustrates a lane line detection network architecture schematic diagram according to some embodiments of the present disclosure;
FIG. 4 illustrates a schematic representation of lane marking of an image to be detected in accordance with some embodiments of the present disclosure;
FIG. 5 illustrates a schematic comparison of an original image and the corresponding instance segmentation label according to some embodiments of the present disclosure;
FIG. 6 illustrates a schematic flow diagram of a process for detecting lane lines according to some embodiments of the present disclosure;
FIG. 7 illustrates a schematic block diagram of an apparatus for training a lane line detection model according to some embodiments of the present disclosure;
FIG. 8 illustrates a schematic block diagram of an apparatus for detecting lane lines according to some embodiments of the present disclosure; and
FIG. 9 illustrates a block diagram of an example computing device capable of implementing various embodiments of the disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure have been shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein, but are provided to provide a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In describing embodiments of the present disclosure, the term "comprising" and its like should be taken to be open-ended, i.e., including, but not limited to. The term "based on" should be understood as "based at least in part on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The terms "first," "second," and the like, may refer to different or the same object. Other explicit and implicit definitions are also possible below.
As described above, conventional lane line detection methods are not robust to illumination changes, complex backgrounds, occlusion, and similar conditions, and their detection results are easily affected by environmental factors. Deep-learning-based detection methods mainly focus on the forward view; research on lane line detection under lateral views is insufficient, and the network generalization capability and lateral detection accuracy of existing methods are inadequate, with some methods unusable for lateral detection at all. In addition, existing models have large computation loads and high hardware requirements, which is not conducive to saving cost.
To this end, the present disclosure provides a solution for detecting lane lines. During training of the detection model, in view of the obvious differences in appearance, position, and features between lane lines under lateral views and lane lines under the forward view, the solution merges and scales multiple lateral images and converts the scaled images to obtain the target training data, realizing accurate detection of lane lines under lateral views. The trained model can simultaneously predict lane lines in multiple lateral view images at the same time stamp, so that lane line detection is better adapted to lateral views. In addition, relatively few model parameters are required for merging and scaling the images, and many parameters can be reused when scaling a single-size image, so that the computing power required by the model can be significantly reduced and the hardware cost lowered.
Exemplary embodiments of the present disclosure will be described below in conjunction with fig. 1-9.
FIG. 1 illustrates a schematic diagram of an example environment 100 in which various embodiments of the present disclosure may be implemented. In this example environment 100, as shown in fig. 1, the present disclosure illustrates the manner in which model training and application is illustrated with lane line entity detection models. Generally, the example environment 100 includes a target vehicle 101, a set of lane line images to be detected 110, a computing device 120 and detection results 130, a detection model 140, and a set of lane line images to be trained 150. Wherein the detection model 140 and the set of lane line images to be trained 150 constitute a model training system 160, and the set of lane line images to be detected 110, the computing device 120, and the detection results 130 constitute a model application system 170.
In some embodiments, the target vehicle 101 may be a vehicle for acquiring a set 150 of lane line images to be trained, the acquired images of which may be input to the detection model 140 for model training. The target vehicle 101 may also be a vehicle traveling on a road, which captures a set of images to be detected and inputs the images to the computing device 120 for detection, thereby outputting a final lane line detection result to assist in automatic driving or semi-automatic driving of the target vehicle 101. That is, the target vehicle 101 may capture the set of lane line images 150 to be trained for training of the detection model 140 through a camera provided on itself, or may capture the set of lane line images 110 to be detected as input for lane line detection using the model application system 170. In other embodiments, the lane line image set 150 to be trained may also be acquired in any other suitable manner, which is not limited by the present disclosure.
In the example of fig. 1, target vehicle 101 may be any type of vehicle that may carry a person and/or object and that is moved by a power system such as an engine, including, but not limited to, a car, truck, bus, electric car, motorcycle, caravan, train, collection vehicle, and the like. In some embodiments, the target vehicle 101 in the environment 100 may be a vehicle having some autonomous capability, such a vehicle also being referred to as an unmanned vehicle or an autonomous vehicle. In some embodiments, the target vehicle 101 may also be a vehicle with semi-autonomous driving capabilities.
In some implementations, as shown in fig. 1, the target vehicle 101 may mount a plurality of cameras on the vehicle body for data acquisition, for example 4 cameras acquiring data from the 4 viewing angles of left front, left rear, right front, and right rear. The viewing angles of the 2 cameras on either side partially overlap, ensuring that the blind areas on the left and right sides of the vehicle are covered. The acquired raw data is the video recorded by the 4 cameras while the vehicle is driving.
With continued reference to fig. 1, in some embodiments, the computing device 120 may be communicatively coupled to the target vehicle 101. Although shown as a separate entity, the computing device 120 may be embedded in the target vehicle 101. The computing device 120 may also be an entity external to the target vehicle 101 and may communicate with the target vehicle 101 via a wireless network. Computing device 120 may be any device having computing capabilities.
In some embodiments, computing device 120 may include, but is not limited to, personal computers, server computers, hand-held or laptop devices, mobile devices (such as mobile phones, personal digital assistants PDAs, media players, etc.), consumer electronics, minicomputers, mainframe computers, cloud computing resources, and the like.
The training and use of models in computing device 120 will be described below in terms of a machine or deep learning model. As shown in FIG. 1, in an example environment 100, a model training system 160 and/or a model application system 170 may be implemented, for example, in a computing device 120 as shown in FIG. 1. It should be understood that the description of the structure and functionality of the example environment 100 is for illustrative purposes only and is not intended to limit the scope of the subject matter described herein. The subject matter described herein may be implemented in different structures and/or functions.
As described above, the process of determining the detection result 130 of the lane line image set 110 to be detected may be divided into two stages: a model training phase and a model application phase. As an example, in the model training phase, the model training system 160 may utilize the target data set in the data image set to be trained 150 to train the detection model 140 for achieving lane line detection at lateral multi-view. In the model application phase, the model application system 170 may receive a trained detection model 140 such that the detection results 130 of the lane lines are determined by the detection model 140 based on the set of lane line images 110 to be detected.
In some embodiments, the detection model 140 may be constructed as a deep or machine learning network. In some embodiments, the learning network may include a plurality of networks, wherein each network may be a multi-layer neural network, which may be composed of a large number of neurons. Through the training process, the corresponding parameters of the neurons in each network can be determined. The training process of the detection model 140 may be performed in an iterative manner until at least some of the parameters of the detection model 140 converge or until a predetermined number of iterations is reached, thereby obtaining final model parameters.
In order to implement the embodiments of the present disclosure, the lane line detection network of the lane line detection model 140 needs to be specifically set according to the detection requirement, for example, the network structure shown in fig. 3 may be adopted. Fig. 3 illustrates a lane line detection network architecture schematic diagram according to some embodiments of the present disclosure.
As shown in fig. 3, the lane line detection network can be divided into a coding network and a decoding network. The coding network extracts multi-scale lane line features from the input image, and the decoding network decodes them into the required lane line information, including classification information (such as color, line type, and whether the line is a fishbone line) for each of the two lane line instances on the left and right sides of the target vehicle, for example an acquisition vehicle, under the lateral views. Meanwhile, a lane line instance starting point heat map is adopted as an auxiliary branch to improve the learning ability of the network.
Specifically, for the coding network: analysis of lane line characteristics shows that lane lines are generally white or yellow strip-shaped areas against the road background, and their features are difficult to extract once occluded or worn, so conventional image processing algorithms have difficulty extracting the lane line features in an image accurately, and a deep learning network is used instead. In some embodiments, a residual network (ResNet) may be employed as the backbone network to extract multi-scale features of the image. Assuming the size of the input image is (h, w, 3), the size of the output feature map after the coding network is (h/n, w/n, c), where h is the height of the image, w is the width of the image, 3 is the number of RGB channels of the input image, n is the downsampling multiple, and c is the number of channels of the extracted feature map. The coding network can extract rich semantic information through the convolution network, and spatial resolution and computation can be balanced by adjusting the downsampling multiple n.
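For illustration, a minimal sketch of such a backbone is given below. The use of torchvision's resnet18, the truncation after layer3 (giving n = 16 and c = 256), and the input size are assumptions for this example, not the patent's exact configuration.

```python
import torch
import torchvision.models as models

class Encoder(torch.nn.Module):
    """Maps an input image (batch, 3, h, w) to features (batch, c, h/n, w/n)."""
    def __init__(self):
        super().__init__()
        resnet = models.resnet18(weights=None)
        # Truncate after layer3: overall stride n = 16, c = 256 channels.
        self.backbone = torch.nn.Sequential(
            resnet.conv1, resnet.bn1, resnet.relu, resnet.maxpool,
            resnet.layer1, resnet.layer2, resnet.layer3,
        )

    def forward(self, x):
        return self.backbone(x)

feats = Encoder()(torch.randn(1, 3, 320, 640))
print(feats.shape)  # torch.Size([1, 256, 20, 40])
```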
For the decoding network, the decoding network may be responsible for decoding and outputting the extracted lane line features and may include an instance segmentation main branch, a lane line instance classification main branch and a lane line instance starting point auxiliary branch.
In one embodiment, the lane line instance starting point auxiliary branch may take the feature map extracted by the ResNet network in the coding network and reduce its channels to one through a 1×1 convolution kernel, thereby predicting the lane line instance starting point heat map. The auxiliary branch is used only during model training to improve the learning performance of the network; at test time the network keeps only the main branches, which increases the running speed of the model.
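A minimal sketch of such an auxiliary head is shown below; the channel count of 256 and the sigmoid activation are assumptions for this example.

```python
import torch

# (batch, 256, h/n, w/n) -> (batch, 1, h/n, w/n) start-point probability map.
start_point_head = torch.nn.Sequential(
    torch.nn.Conv2d(256, 1, kernel_size=1),  # 1x1 conv reduces channels to one
    torch.nn.Sigmoid(),                      # per-cell probability in [0, 1]
)
```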
In one embodiment, the lane line instance segmentation main branch may take the feature map extracted by the ResNet network in the coding network and reduce its number of channels through a 1×1 convolution. On this basis, exploiting the sparsity of lane lines (lane lines are strip-shaped, slender, and sparse relative to the road background), slices of the feature map are repeatedly shifted in the horizontal and vertical directions, using the strong shape prior of lane lines and the captured cross-row and cross-column spatial information between pixels, so that each pixel obtains global information. The upsampling stage restores the feature map to the size of the image fed into the network. Unlike semantic segmentation, instance segmentation distinguishes each lane line: the input is a feature map of size (h/n, w/n, c), and the output is an instance segmentation feature map of size (h, w, 5), where 5 is the number of lane line instance segmentation classes (including background).
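A minimal sketch of the input/output shape of this branch is given below. The SCNN-style slice shifting described above is omitted for brevity, and the channel count c = 256, stride n = 16, and bilinear upsampling are assumptions for this example.

```python
import torch

class SegHead(torch.nn.Module):
    """(batch, c, h/n, w/n) -> (batch, 5, h, w) instance segmentation logits."""
    def __init__(self, c=256, num_classes=5, n=16):
        super().__init__()
        self.reduce = torch.nn.Conv2d(c, num_classes, kernel_size=1)
        self.n = n

    def forward(self, feats):
        logits = self.reduce(feats)               # 1x1 conv reduces channel count
        return torch.nn.functional.interpolate(   # upsample back to image size
            logits, scale_factor=self.n, mode="bilinear", align_corners=False)
```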
In one embodiment, the lane line instance classification branch may include four branches of classification information, namely whether a lane line instance exists, the lane line instance color, the lane line instance line type, and whether the lane line instance is a fishbone line. The network structures of the four branches are identical; the basic structure is described here taking the lane line instance presence branch as an example. The input is the feature map of size (h/n, w/n, c) extracted by the ResNet network in the coding network. First the number of channels of the feature map is reduced through one 1×1 convolution; the feature map is then flattened and passed through a fully connected network for prediction. The size of the fully connected output layer depends on the number of predicted instances and is generally set to 4, corresponding to the classification information of the different lane line instances (without background).
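A minimal sketch of one such branch follows; the reduced channel count of 8 and the feature-map size (20, 40) are assumptions for this example.

```python
import torch

class ClsBranch(torch.nn.Module):
    """One classification branch: 1x1 conv -> flatten -> fully connected."""
    def __init__(self, c=256, reduced=8, feat_hw=(20, 40), num_instances=4):
        super().__init__()
        self.reduce = torch.nn.Conv2d(c, reduced, kernel_size=1)
        self.fc = torch.nn.Linear(reduced * feat_hw[0] * feat_hw[1], num_instances)

    def forward(self, feats):                # feats: (batch, c, h/n, w/n)
        x = self.reduce(feats).flatten(1)    # (batch, reduced * h/n * w/n)
        return self.fc(x)                    # (batch, 4): one logit per instance
```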
It should be appreciated that the training of the model training system 160 is primarily training the detection network of the lane line detection model 140. That is, the set of data images 150 to be trained is processed into a set of target data that can be used to train the detection model 140, as will be described in more detail below in connection with FIG. 2.
The technical solutions described above are only for example and not limiting the present disclosure. It should be understood that the individual networks may also be arranged in other ways and connections. In order to more clearly explain the principles of the disclosed solution, the process of model training will be described in more detail below with reference to fig. 2.
FIG. 2 illustrates a schematic flow diagram of a process 200 for training a lane line detection model according to some embodiments of the present disclosure. In some embodiments, process 200 may be implemented in computing device 120 of fig. 1. A process 200 for training a model according to an embodiment of the present disclosure is now described with reference to fig. 2 in conjunction with fig. 1. For ease of understanding, the specific examples mentioned in the following description are illustrative and are not intended to limit the scope of the disclosure.
At block 201, a lane line image set to be used for training the lane line detection model is acquired, the lane line image set including a left front view image, a left rear view image, a right front view image, and a right rear view image captured by a target vehicle at the same moment. The lane line image set to be used for training the lane line detection model may be the lane line image set 150 to be trained shown in fig. 1.
In one embodiment, multiple segments of video data of the left front, left rear, right front, and right rear views may be obtained using cameras arranged on the target vehicle 101, and the lane line image set 150 to be trained may then be extracted from the multiple segments of video data based on a preset frame extraction frequency. The lane line image set 150 to be trained may include multiple groups of left front view images, left rear view images, right front view images, and right rear view images captured by the target vehicle 101 under the same time stamp, and each group of left front, left rear, right front, and right rear view images under the same time stamp may be regarded as one training sample.
In this way, for the video data obtained during the driving process of the target vehicle 101, according to a certain frame extraction frequency, image data with the same time stamp and a certain amount of four lateral directions are extracted, and according to the principle of data diversity, lane line data of different scenes can be screened.
At block 203, lane line instance point set information and lane line instance classification information associated with the lane lines of the lane line image set are determined. After the lane line image set to be trained is extracted, the images may be annotated; this may be done automatically using the computing device 120 in fig. 1, manually, or semi-automatically, which is not limited by the present disclosure.
Fig. 4 illustrates a schematic representation of lane marking of an image to be detected in accordance with some embodiments of the present disclosure. Referring to fig. 4, a left lane line in the left front view image is denoted by 1, and a right lane line is denoted by 2; the left lane line in the right front view image is marked as 3, and the right lane line is marked as 4; the left lane line in the left rear view angle image is marked as 3, and the right lane line is marked as 4; and the left lane line in the right rear view image is marked as 1, and the right lane line is marked as 2.
In one embodiment, the lane labeling schematic of fig. 4 (taking the acquisition vehicle as an example) may be obtained by labeling according to the following rules:
a) Labeling type: the lane lines are labeled as point sets, i.e., each lane line instance is expressed as a set of points;
b) Labeling range: taking the acquisition vehicle as the reference, the left and right lane line instances are labeled, four lane line instances in total;
c) Labeling position: the labeled point set of each lane line instance lies along the line edge on the side closer to the acquisition vehicle;
d) Labeling information: lane line labeling is strongly associated with instance segmentation, and the category information to be labeled for each lane line instance is shown in table (1) below:
table (1): lane line instance classification information table
Classification item                               | Value 0  | Value 1
Whether the lane line instance exists             | absent   | present
Lane line instance color                          | white    | yellow
Lane line instance line type                      | solid    | dashed
Whether the lane line instance is a fishbone line | no       | yes
The labeled image set is input into the detection network of the detection model 140 as described above, providing the original lane line data set and the training labels required by the lane line detection network; the training labels specifically include a lane line instance segmentation label map, a lane line instance starting point heat map, and a lane line instance classification information parameter matrix.
At block 205, the left and right front view images are combined to obtain a front view combined image, the left and right rear view images are combined to obtain a rear view combined image, and the front view combined image and the rear view combined image are scaled, respectively.
Fig. 5 illustrates a schematic comparison of an original image and the corresponding instance segmentation label according to some embodiments of the present disclosure. In one embodiment, as shown on the left of fig. 5, the left front view image and the right front view image may be combined in the horizontal direction, with the left front view image on the left and the right front view image on the right. Subsequently, the left rear view image and the right rear view image are combined in the horizontal direction, with the left rear view image on the right and the right rear view image on the left, and finally the front-view and rear-view combined images are scaled to a single image size.
In such embodiments, specifically, the lane line sample space under a single lateral view differs greatly from the lane line sample space under the forward view, and the feature enhancement operations adopted in the lane line instance segmentation network of the decoding network would be significantly limited. The four lateral-view lane line sample spaces are therefore analyzed further and the following strategy is adopted: first, the four lateral view images with the same time stamp and the labeled lane line information are obtained.
In one embodiment, the two front images are then combined horizontally, with the left front image on the left and the right front image on the right; the left rear image and the right rear image are combined in the horizontal direction, with the left rear image on the right and the right rear image on the left. After the two stitched images are scaled back to the single-image size, their lane line sample space is relatively consistent with that of the forward view. Because the two combined images have the same size, the detection network parameters can be shared during model learning, which reduces the required computing power and the hardware cost.
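A minimal sketch of this merge-then-scale step is given below; the use of OpenCV and NumPy and the function name are assumptions for this example.

```python
import cv2
import numpy as np

def merge_and_scale(front_left, front_right, rear_left, rear_right, size_wh):
    """Concatenate each pair horizontally, then scale to a single-image size."""
    front = np.hstack([front_left, front_right])  # left-front on the left
    rear = np.hstack([rear_right, rear_left])     # left-rear on the right
    return cv2.resize(front, size_wh), cv2.resize(rear, size_wh)
```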
At block 207, the lane line instance point set information of the scaled front-view combined image and the scaled rear-view combined image is converted. Merging the four pictures yields new image data and new lane labeling information. The lane line instance ID, the lane line instance color, the line type, and the fishbone-line flag in the labeled lane line information remain unchanged, but because the point positions change after merging and scaling, the labeled lane line instance point sets must be converted.
In one embodiment, the labeled lane line instance point sets may be converted as follows. The conversion rule for the lane line instance point sets in the left front view image and the right rear view image may be U_after = U_before / 2, V_after = V_before; and the conversion rule for the lane line instance point sets in the right front view image and the left rear view image may be U_after = U_before / 2 + w / 2, V_after = V_before; where (U_after, V_after) is the converted coordinate point, (U_before, V_before) is the coordinate point before conversion, and w is the width of the scaled front-view or rear-view combined image.
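A minimal sketch of these conversion rules follows; the view names are assumptions for this example.

```python
def convert_point(u, v, view, w):
    """Map a labeled point from one lateral view into the scaled merged image."""
    if view in ("front_left", "rear_right"):  # these views land in the left half
        return u / 2, v
    if view in ("front_right", "rear_left"):  # these views land in the right half
        return u / 2 + w / 2, v
    raise ValueError(f"unknown view: {view}")
```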
At block 209, the target data set for training the lane line detection model is obtained based on the lane line instance classification information and the converted lane line instance point set information. After the lane line instance point sets are converted, they can be combined with the lane line instance classification information to obtain the target data set, which may include lane line instance segmentation label information, lane line instance starting point heat map label information, and lane line instance classification information.
In some embodiments, the lane line instance point set information may be connected to obtain a lane line instance segmentation map. The lane line starting point coordinates are then processed to obtain a lane line instance starting point heat map. Next, re-labeling is performed according to the lane line instance classification information to generate the lane line instance classification parameters. Finally, the target data set is obtained based on the lane line instance segmentation map, the lane line instance starting point heat map, and the lane line instance classification parameters.
In one embodiment, as shown on the right of fig. 5, to generate the image instance segmentation label, the point set annotations may be connected, with the line width set to 16 pixels, to obtain the final image label. The instance segmentation ID is consistent with the originally labeled lane line instance ID. The original labels and the corresponding instance segmentation label map are shown in fig. 5. The lane line pixels in the ground-truth label map may be represented by 0, 1, 2, 3, and 4; in fig. 5, to show the label data clearly, the 0, 1, 2, 3, and 4 pixel values of the original image label are replaced by the five colors black, red, yellow, green, and blue, or by different gray levels.
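A minimal sketch of rendering such a label map follows; the use of OpenCV's polyline drawing and the instances data structure are assumptions for this example.

```python
import cv2
import numpy as np

def render_instance_label(h, w, instances):
    """instances: dict mapping instance ID (1..4) to a list of (u, v) points."""
    label = np.zeros((h, w), dtype=np.uint8)  # 0 is the background class
    for inst_id, points in instances.items():
        pts = np.asarray(points, dtype=np.int32).reshape(-1, 1, 2)
        cv2.polylines(label, [pts], isClosed=False,
                      color=int(inst_id), thickness=16)
    return label
```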
It should be noted that the above color or gray line substitutions are merely exemplary, and other suitable means may be used to replace the original image label, which is not limited by the present disclosure.
In one embodiment, to improve the accuracy of lane line detection, a lane line instance starting point heat map supervision branch may be added. Specifically, the lane line starting point coordinates in the input image are processed to provide supervision information for the lane line starting point detection module: first a starting point matrix of size (h/n, w/n, 1) is constructed, in which each point corresponds to an n×n region of the input image; then the starting point coordinates of all lane lines on the input image are obtained from the labeling information, and a Gaussian blur kernel centered on each starting point is generated at the corresponding position of the starting point matrix. Here h is the height of the image, w is the width of the image, n is the scaling multiple of the feature coding network, and n×n is the region of the original input image corresponding to any scaled pixel.
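A minimal sketch of building this supervision target is shown below; the kernel radius and sigma are illustrative assumptions, not values given in the patent.

```python
import numpy as np

def start_point_heatmap(h, w, n, start_points, radius=3, sigma=1.0):
    """Place a Gaussian kernel at each start point on an (h/n, w/n) grid."""
    hm = np.zeros((h // n, w // n), dtype=np.float32)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    kernel = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma ** 2))
    for u, v in start_points:                 # start points in image coordinates
        cx, cy = int(u) // n, int(v) // n
        x0, x1 = max(cx - radius, 0), min(cx + radius + 1, hm.shape[1])
        y0, y1 = max(cy - radius, 0), min(cy + radius + 1, hm.shape[0])
        kx0, ky0 = x0 - (cx - radius), y0 - (cy - radius)
        patch = kernel[ky0:ky0 + (y1 - y0), kx0:kx0 + (x1 - x0)]
        hm[y0:y1, x0:x1] = np.maximum(hm[y0:y1, x0:x1], patch)
    return hm
```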
In one embodiment, lane line instance classification information is an integral part of lane line detection, allowing each detected lane line instance to be identified more specifically. A 4×4 matrix is generated from the original labeling information: the first row of the matrix corresponds to whether a lane line instance exists, with 0 for absent and 1 for present; the second row corresponds to the lane line instance color, with 0 for white and 1 for yellow; the third row corresponds to the lane line instance line type, with 0 for solid and 1 for dashed; and the fourth row corresponds to whether the lane line instance is a fishbone line, with 0 for no and 1 for yes.
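A minimal sketch of assembling this 4×4 label matrix follows; the dict keys are assumptions for this example.

```python
import numpy as np

def classification_matrix(instances):
    """instances: list of 4 dicts, e.g.
    {"present": 1, "yellow": 0, "dashed": 1, "fishbone": 0}."""
    mat = np.zeros((4, 4), dtype=np.int64)
    for col, inst in enumerate(instances):
        mat[0, col] = inst["present"]   # row 1: 0 absent, 1 present
        mat[1, col] = inst["yellow"]    # row 2: 0 white,  1 yellow
        mat[2, col] = inst["dashed"]    # row 3: 0 solid,  1 dashed
        mat[3, col] = inst["fishbone"]  # row 4: 0 no,     1 yes
    return mat
```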
Therefore, the tag data necessary for training the lane line detection network can be obtained through the data processing, namely, a target data set is obtained and is used for training the detection model.
In some embodiments, the training data set obtained by the data preprocessing is utilized, the built lane line detection network is subjected to end-to-end training according to the designed lane line detection loss function, and the converged lane line detection model is obtained after multiple iterations. The lane line detection loss function may include, for example, a lane line instance segmentation loss function, a lane line instance classification loss function, and a lane line instance start point loss function.
In one embodiment, the lane line instance segmentation loss function and the lane line instance classification loss function refer to the loss functions for the lane line instance segmentation module task and the lane line instance classification module task, respectively; both are cross-entropy losses, and the calculation formula may be:
$$L = -\frac{1}{N}\sum_{i}\sum_{c=1}^{M} y_{ic}\log\left(p_{ic}\right)$$

where L is the classification loss to be calculated, i indexes the prediction samples, c indexes the corresponding instance categories, N is the number of units over which the loss is computed, and M is the number of categories (e.g., 5 in the instance segmentation loss, 4 in the instance classification loss); y_ic is an indicator (0 or 1) that takes 1 if the true category of prediction sample i equals c and 0 otherwise; and p_ic is the predicted probability that prediction sample i belongs to category c.
In one embodiment, the lane line instance starting point loss function refers to the loss function for the lane line instance starting point prediction module task; a focal loss may be used, and the calculation formula may be:
$$l_{point} = -\sum_{i,j}\begin{cases}\left(1-p_{ij}\right)^{\alpha}\log\left(p_{ij}\right), & y_{ij}=1\\ \left(1-y_{ij}\right)^{\beta}\,p_{ij}^{\alpha}\,\log\left(1-p_{ij}\right), & \text{otherwise}\end{cases}$$

where l_point is the lane line instance starting point loss to be calculated, α and β are adjustable factors, i and j are the row and column coordinates of the starting point heat map, y_ij is the label at position (i, j), and p_ij is the predicted value at that position.
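A minimal sketch of a focal loss of this form follows. The patent specifies only the α and β factors, so the penalty-reduced (CenterNet-style) variant and the normalization by the positive count here are assumptions.

```python
import torch

def heatmap_focal_loss(pred, target, alpha=2.0, beta=4.0, eps=1e-6):
    """Penalty-reduced focal loss over a predicted start-point heat map."""
    pred = pred.clamp(eps, 1 - eps)
    pos = target.eq(1).float()                # cells that are exact start points
    pos_loss = pos * (1 - pred) ** alpha * torch.log(pred)
    neg_loss = (1 - pos) * (1 - target) ** beta * pred ** alpha \
        * torch.log(1 - pred)
    num_pos = pos.sum().clamp(min=1.0)        # normalize by positive count
    return -(pos_loss + neg_loss).sum() / num_pos
```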
It should be appreciated that the lane line detection loss function described above is merely exemplary, and that any other suitable loss function may be employed by those skilled in the art, as this disclosure is not limited in this regard.
Fig. 6 illustrates a schematic flow diagram of a process 600 for detecting lane lines according to some embodiments of the present disclosure.
At block 601, an image to be detected is acquired, the image to be detected including a left front view image, a left rear view image, a right front view image, and a right rear view image captured by the target vehicle at the same moment. The target vehicle 101 may acquire the lane line image set 110 to be detected through its cameras and input the set 110 into the computing device 120 that includes the detection model 140 for detection.
In one embodiment, specifically, for a given time stamp, the four view images synchronously acquired by the four lateral cameras are collected first; the front pair and the rear pair are then each merged and scaled back to the original single-image size.
At block 603, the image to be detected is detected using the lane line detection model to obtain the lane line detection result. The detection model 140 performs detection and finally outputs the detection result 130. In such an embodiment, inference is run on the two preprocessed images, and the corresponding lane line instance segmentation maps and lane line instance classification information are detected. The detection result is, for example, a data matrix that can be visualized on the image.
To further improve the output of the detection result 130, the obtained data matrix may be post-processed. In one embodiment, the post-processing may include two steps. First, in the obtained lane line instance segmentation map, the pixel with the maximum probability above a specified threshold is selected in each of a set of rows spaced at equal vertical intervals, which yields the continuous lane line pixel coordinates (U_det, V_det) of the corresponding instance. Then, taking the image center line as the boundary, the left and right sides are separated. The actual lane line coordinates corresponding to the left side are U_after_left = 2 * U_det, V_after_left = V_det; and the actual lane line coordinates on the right side are U_after_right = 2 * (U_det - w / 2), V_after_right = V_det; where (U_det, V_det) is a lane line coordinate point detected by the network, (U_after, V_after) is the corresponding lane line coordinate point on the original image, and w is the width of the scaled image. Finally, after the lane line detection flow completes, the detected lane line result is visualized on the original image.
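A minimal sketch of this decoding step is given below; the threshold value and row spacing are illustrative assumptions.

```python
import numpy as np

def decode_instance(prob_map, w, threshold=0.5, row_step=10):
    """prob_map: (h, w) probability map for one lane line instance."""
    left_pts, right_pts = [], []
    for v in range(0, prob_map.shape[0], row_step):  # equally spaced rows
        u = int(np.argmax(prob_map[v]))              # max-probability column
        if prob_map[v, u] < threshold:
            continue
        if u < w / 2:                                # left half of the composite
            left_pts.append((2 * u, v))
        else:                                        # right half of the composite
            right_pts.append((2 * (u - w / 2), v))
    return left_pts, right_pts
```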
In the training process of the detection model according to the various embodiments of the present disclosure, in view of the obvious differences in appearance, position, and features between lane lines under lateral views and lane lines under the forward view, multiple lateral images are merged and scaled, and the scaled images are converted to obtain the target training data, so that the model can accurately detect lane lines under lateral views. The trained model can predict lane lines in multiple lateral view images simultaneously, so that lane line detection is better adapted to lateral views. Relatively few parameters are needed for merging and scaling the images, and many parameters used for scaling a single-size image can be reused, so that the computing power required by the model can be significantly reduced and the hardware cost lowered. The lane line prediction model fully combines the main branches and the auxiliary branch during learning, which improves the learning capability of the model network; during inference only the main branches may be retained, which increases the running speed of the model while still producing good inference results, so the model is robust and accurate. The model network is fully applicable to lane line detection under lateral views: images of lane lines at different angles do not need to be rotated by different angles, accurate inference results can be obtained, strong prior information is not required, and the generalization and accuracy of the system are improved.
Fig. 7 illustrates a schematic block diagram of an apparatus 700 for training a lane line detection model according to some embodiments of the present disclosure. As shown in fig. 7, the apparatus 700 includes: a to-be-trained lane line image set acquisition module 701 configured to acquire a lane line image set to be used for training a lane line detection model, the lane line image set including a left front view image, a left rear view image, a right front view image and a right rear view image captured by a target vehicle at the same moment; a lane line instance point set information and classification information determination module 703 configured to determine lane line instance point set information and lane line instance classification information associated with lane lines of the images in the lane line image set; an image merge scaling module 705 configured to merge the left and right front view images to obtain a front view merged image, merge the left and right rear view images to obtain a rear view merged image, and scale the front view merged image and the rear view merged image respectively; a lane line point set information conversion module 707 configured to convert the lane line instance point set information of the scaled front-view merged image and the scaled rear-view merged image; and a target data set generation module 709 configured to derive a target data set for training the lane line detection model based on the lane line instance classification information and the converted lane line instance point set information.
In some embodiments, the lane line instance point set information and classification information determination module 703 may be further configured to label the lane lines in a point set manner and to label classification information for the lane lines, thereby obtaining the lane line instance point set information and the lane line instance classification information.
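Purely as an illustration, a per-instance annotation under this labeling scheme might look like the following; the field names and class encoding are assumptions, not part of the disclosure:

```python
# Hypothetical annotation record for a single lane line instance: the
# lane line is labeled as an ordered point set plus a class label.
lane_annotation = {
    "points": [(412, 96), (405, 128), (397, 160), (388, 192)],  # (U, V) samples
    "class_id": 1,  # e.g. 1 = solid white, 2 = dashed white, ...
}
```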
In some embodiments, the image merge scaling module 705 may be further configured to merge the left front view image with the right front view image in the horizontal direction, with the left front view image on the left and the right front view image on the right; to merge the left rear view image with the right rear view image in the horizontal direction, with the left rear view image on the right and the right rear view image on the left; and to scale the front-view merged image and the rear-view merged image to a single-size image size.
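A minimal sketch of this merge-and-scale layout, assuming OpenCV and NumPy; the target size and function name are illustrative assumptions, not values specified by the disclosure:

```python
import cv2
import numpy as np

def build_merged_inputs(left_front, right_front, left_rear, right_rear,
                        target_size=(1280, 720)):
    """Sketch of the merge-and-scale step under the stated layout rules:
    the front pair is concatenated left-to-right as captured, while the
    rear pair is swapped (left-rear on the right, right-rear on the
    left). target_size is an assumed single-image (width, height)."""
    front_merged = np.hstack([left_front, right_front])  # left | right
    rear_merged = np.hstack([right_rear, left_rear])     # swapped order
    front_merged = cv2.resize(front_merged, target_size)
    rear_merged = cv2.resize(rear_merged, target_size)
    return front_merged, rear_merged
```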
In some embodiments, the lane line point set information conversion module 707 may be further configured such that the lane line instance point set conversion rule for the left front view image and the right rear view image may be: U_after = U_before / 2, V_after = V_before; and the lane line instance point set conversion rule for the right front view image and the left rear view image may be: U_after = U_before / 2 + w / 2, V_after = V_before; where U_after and V_after represent the converted coordinate point, U_before and V_before represent the coordinate point before conversion, and w is the width of the scaled front-view merged image or the scaled rear-view merged image.
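As an illustrative sketch of these conversion rules only (function and parameter names are assumptions; the rule itself is as stated above):

```python
def convert_point(u_before, v_before, scaled_width, source_view):
    """Sketch of the stated conversion rules. Points from the left front
    and right rear views land in the left half of the merged image;
    points from the right front and left rear views land in the right
    half, shifted by w / 2. scaled_width is w, the width of the scaled
    merged image."""
    if source_view in ("left_front", "right_rear"):
        u_after = u_before / 2
    elif source_view in ("right_front", "left_rear"):
        u_after = u_before / 2 + scaled_width / 2
    else:
        raise ValueError(f"unknown source view: {source_view}")
    return u_after, v_before
```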
In some embodiments, the target data set generation module 709 may be further configured to connect the points in the lane line instance point set information to obtain a lane line instance segmentation map; to process the coordinates of the starting point of each lane line to obtain a lane line instance starting point heat map; to re-label according to the lane line instance classification information to generate lane line instance classification parameters; and to obtain the target data set based on the lane line instance segmentation map, the lane line instance starting point heat map and the lane line instance classification parameters.
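As a hedged illustration of how such targets might be rasterized (the line thickness, the flat disc standing in for a Gaussian start-point peak, and the `points`/`class_id` field names are assumptions for illustration):

```python
import cv2
import numpy as np

def build_targets(instances, image_size):
    """Sketch of target generation: rasterize each converted point set
    into an instance segmentation map, mark each instance start point
    in a heat map, and collect the class labels."""
    h, w = image_size
    seg = np.zeros((h, w), dtype=np.uint8)     # 0 = background, k = instance k
    heat = np.zeros((h, w), dtype=np.float32)  # start-point heat map
    classes = []
    for inst_id, inst in enumerate(instances, start=1):
        pts = np.asarray(inst["points"], dtype=np.int32).reshape(-1, 1, 2)
        cv2.polylines(seg, [pts], isClosed=False, color=inst_id, thickness=3)
        u0, v0 = inst["points"][0]             # lane line starting point
        cv2.circle(heat, (int(u0), int(v0)), radius=5, color=1.0, thickness=-1)
        classes.append(inst["class_id"])
    return seg, heat, classes
```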
Fig. 8 illustrates a schematic block diagram of an apparatus 800 for detecting lane lines according to some embodiments of the present disclosure. As shown in fig. 8, the apparatus 800 includes: an image-to-be-detected acquisition module 801 configured to acquire an image to be detected, including a left front view image, a left rear view image, a right front view image and a right rear view image captured by a target vehicle at the same moment; and a lane line detection model application module 803 configured to detect the image to be detected using the lane line detection model to obtain a lane line detection result.
In summary, by merging and scaling the lateral images and converting the corresponding annotations to obtain the target training data as described above, the various embodiments of the present disclosure achieve accurate, efficient and generalizable lane line detection under lateral view angles.
FIG. 9 illustrates a block diagram of an example computing device 900 capable of implementing various embodiments of the disclosure. Device 900 may be used to implement computing device 120 of fig. 1. As shown in fig. 9, the apparatus 900 includes a Central Processing Unit (CPU) 901, which can perform various appropriate actions and processes according to computer program instructions stored in a Read Only Memory (ROM) 902 or computer program instructions loaded from a storage unit 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the device 900 can also be stored. The CPU 901, ROM 902, and RAM 903 are connected to each other through a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
Various components in device 900 are connected to I/O interface 905, including: an input unit 906 such as a keyboard, a mouse, or the like; an output unit 907 such as various types of displays, speakers, and the like; a storage unit 908 such as a magnetic disk, an optical disk, or the like; and a communication unit 909 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunications networks.
The processing unit 901 performs the various methods and processes described above, such as process 200 and/or process 600. For example, in some embodiments, process 200 and/or process 600 may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 900 via the ROM 902 and/or the communication unit 909. When the computer program is loaded into RAM 903 and executed by CPU 901, one or more steps of process 200 and/or process 600 described above may be performed. Alternatively, in other embodiments, CPU 901 may be configured to perform process 200 and/or process 600 in any other suitable manner (e.g., by means of firmware).
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a System on a Chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the discussion above, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate implementations can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (12)

1. A method for training a lane line detection model for lane line detection under lateral multi-view angles, comprising:
Acquiring a lane line image set to be used for training the lane line detection model, wherein the lane line image set comprises a left front view image, a left rear view image, a right front view image and a right rear view image which are captured by a target vehicle at the same moment;
determining lane line instance point set information and lane line instance classification information associated with lane lines of the images in the lane line image set;
combining the left front view image and the right front view image to obtain a front view combined image, combining the left rear view image and the right rear view image to obtain a rear view combined image, and scaling the front view combined image and the rear view combined image respectively;
converting the lane line instance point set information of the scaled front-view combined image and the scaled rear-view combined image; and
obtaining a target data set for training the lane line detection model based on the lane line instance classification information and the converted lane line instance point set information.
2. The method of claim 1, wherein acquiring a set of lane line images to be used for training the lane line detection model comprises:
obtaining multiple segments of video data from a left front side view angle, a left rear side view angle, a right front side view angle and a right rear side view angle by using cameras arranged on the target vehicle; and
extracting the lane line image set from the multiple segments of video data based on a preset frame extraction frequency.
3. The method of claim 1, wherein determining lane line instance point set information and lane line instance classification information associated with lane lines of the images in the lane line image set comprises:
marking the lane lines in a point set manner and marking classification information for the lane lines, to obtain the lane line instance point set information.
4. The method of claim 1, wherein merging the left front view image and the right front view image to obtain a front view merged image, merging the left rear view image and the right rear view image to obtain a rear view merged image, and scaling the front view merged image and the rear view merged image, respectively, comprises:
combining the left front view image and the right front view image in a horizontal direction, the left front view image being on the left side and the right front view image being on the right side;
combining the left rear view image and the right rear view image in a horizontal direction, the left rear view image being on the right side and the right rear view image being on the left side; and
scaling the front-view combined image and the rear-view combined image to a single-size image size.
5. The method of claim 1, wherein converting the lane line instance point set information of the scaled front-view combined image and the scaled rear-view combined image comprises:
the lane line instance point set conversion rule for the left front view image and the right rear view image is:
U_after = U_before / 2;
V_after = V_before; and
the lane line instance point set conversion rule for the right front view image and the left rear view image is:
U_after = U_before / 2 + w / 2;
V_after = V_before;
wherein U_after and V_after represent the converted coordinate point, U_before and V_before represent the coordinate point before conversion, and w is the width of the scaled front-view combined image or the scaled rear-view combined image.
6. The method of claim 1, wherein deriving a target dataset for training the lane-line detection model based on the lane-line instance classification information and the converted lane-line instance point set information comprises:
Connecting the lane line instance point set information to obtain a lane line instance segmentation map;
processing the coordinates of the starting point of the lane line to obtain a lane line instance starting point heat map;
re-marking according to the lane line instance classification information to generate lane line instance classification parameters; and
obtaining the target data set based on the lane line instance segmentation map, the lane line instance starting point heat map and the lane line instance classification parameters.
7. A method for detecting lane lines, including lane lines under lateral multi-view angles, the method comprising:
acquiring an image to be detected, wherein the image to be detected comprises a left front view angle image, a left rear view angle image, a right front view angle image and a right rear view angle image which are captured by a target vehicle at the same moment; and
detecting the image to be detected using a lane line detection model trained according to the method of any one of claims 1 to 6, to obtain a lane line detection result.
8. The method of claim 7, wherein the method further comprises:
merging the left front view image and the right front view image to obtain a front view combined image, merging the left rear view image and the right rear view image to obtain a rear view combined image, and scaling the front view combined image and the rear view combined image respectively;
determining a lane line instance segmentation map and lane line instance classification information corresponding to the scaled front-view combined image and the scaled rear-view combined image; and
determining the actual coordinates of the left and right lane lines based on the determined lane line instance segmentation map and lane line instance classification information.
9. An electronic device, comprising:
a processor; and
a memory coupled with the processor, the memory having instructions stored therein, which when executed by the processor, cause the device to perform actions comprising:
acquiring a lane line image set to be used for training a lane line detection model, wherein the lane line image set comprises a left front view image, a left rear view image, a right front view image and a right rear view image which are captured by a target vehicle at the same moment;
determining lane line instance point set information and lane line instance classification information associated with lane lines of the images in the lane line image set;
combining the left front view image and the right front view image to obtain a front view combined image, combining the left rear view image and the right rear view image to obtain a rear view combined image, and scaling the front view combined image and the rear view combined image respectively;
converting the lane line instance point set information of the scaled front-view combined image and the scaled rear-view combined image; and
obtaining a target data set for training the lane line detection model based on the lane line instance classification information and the converted lane line instance point set information.
10. An electronic device, comprising:
a processor; and
a memory coupled with the processor, the memory having instructions stored therein, which when executed by the processor, cause the device to perform actions comprising:
acquiring an image to be detected, wherein the image to be detected comprises a left front view angle image, a left rear view angle image, a right front view angle image and a right rear view angle image which are captured by a target vehicle at the same moment; and
detecting the image to be detected using a lane line detection model trained according to the method of any one of claims 1 to 6, to obtain a lane line detection result.
11. A computer readable storage medium having stored thereon a computer program which when executed by a processor implements the method according to any of claims 1 to 6.
12. A computer readable storage medium having stored thereon a computer program which when executed by a processor implements the method according to claim 7 or 8.
CN202310526923.XA 2023-05-11 2023-05-11 Method, apparatus and storage medium for detecting lane lines Active CN116311140B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310526923.XA CN116311140B (en) 2023-05-11 2023-05-11 Method, apparatus and storage medium for detecting lane lines


Publications (2)

Publication Number Publication Date
CN116311140A true CN116311140A (en) 2023-06-23
CN116311140B CN116311140B (en) 2023-08-15

Family

ID=86796203

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310526923.XA Active CN116311140B (en) 2023-05-11 2023-05-11 Method, apparatus and storage medium for detecting lane lines

Country Status (1)

Country Link
CN (1) CN116311140B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106064587A (en) * 2016-07-06 2016-11-02 北方工业大学 Multi-lane vehicle distance recognition method and device based on license plate recognition
CN110334634A (en) * 2019-06-28 2019-10-15 广州鹰瞰信息科技有限公司 A kind of detection method and prior-warning device of lane line classification
CN111881878A (en) * 2020-08-07 2020-11-03 吉林大学 Lane line identification method for look-around multiplexing
CN114821510A (en) * 2022-05-26 2022-07-29 重庆长安汽车股份有限公司 Lane line detection method and device based on improved U-Net network
CN114943942A (en) * 2022-05-12 2022-08-26 西安电子科技大学广州研究院 Robust AGV positioning method based on lane line semantic information
CN115205800A (en) * 2022-06-09 2022-10-18 重庆长安汽车股份有限公司 Automatic driving panoramic lane line model training and detecting method
CN115880658A (en) * 2022-12-16 2023-03-31 华南理工大学 Automobile lane departure early warning method and system under night scene


Also Published As

Publication number Publication date
CN116311140B (en) 2023-08-15


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant