CN112862839B - Method and system for enhancing robustness of semantic segmentation of map elements - Google Patents


Info

Publication number
CN112862839B
CN112862839B
Authority
CN
China
Prior art keywords
frame image
semantic segmentation
optical flow
frame
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110203999.XA
Other languages
Chinese (zh)
Other versions
CN112862839A (en)
Inventor
杨蒙蒙 (Yang Mengmeng)
唐雪薇 (Tang Xuewei)
江昆 (Jiang Kun)
杨殿阁 (Yang Diange)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Priority to CN202110203999.XA
Publication of CN112862839A
Application granted
Publication of CN112862839B
Legal status: Active

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 — Image analysis
    • G06T 7/10 — Segmentation; Edge detection
    • G06T 7/11 — Region-based segmentation
    • G06T 7/187 — Segmentation involving region growing, region merging, or connected component labelling
    • G06T 7/20 — Analysis of motion
    • G06T 7/269 — Analysis of motion using gradient-based methods
    • G06T 2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 — Image acquisition modality
    • G06T 2207/10016 — Video; Image sequence
    • G06T 2207/30 — Subject of image; Context of image processing
    • G06T 2207/30248 — Vehicle exterior or interior
    • G06T 2207/30252 — Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a method and a system for enhancing the robustness of semantic segmentation of map elements, comprising the following steps: 1) dividing the driving-scene video acquired by a vehicle-mounted camera sensor into independent video frames in time order; 2) performing semantic segmentation on each independent video frame from step 1) with a preset semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in each frame image, and introducing optical flow information between adjacent frame images to enhance the stability of video semantic segmentation. The method uses only the continuous video information of a camera sensor and connects the per-frame semantic segmentation results through optical flow information, so that robust map elements can be identified accurately and at low cost. The invention can therefore be widely applied in the field of automatic driving.

Description

Method and system for enhancing robustness of semantic segmentation of map elements
Technical Field
The invention belongs to the field of automatic driving, and particularly relates to a video map element semantic segmentation robustness enhancing method and system based on computer vision and optical flow information fusion.
Background
As an indispensable perception carrier for high-level automatic driving, the high-precision map is a key basis for realizing automatic driving: it not only provides lane-level navigation and driving-environment information for the automatic-driving vehicle, but also enriches the vehicle's prior environment information to assist subsequent decision-making. The two major tasks in building high-precision maps are acquisition and updating, and accomplishing them faster and at lower cost is a practical challenge for high-precision maps and a topic of intense research in the current field of automatic driving. Meanwhile, with the continuous deepening of research in computer vision, perceiving the different elements of a high-precision map from images has also become an important way to solve the perception problem.
At present, the mainstream scheme for high-precision map construction and updating is to use video data from a camera sensor and perceive real-world lane information with vision-based methods. Semantic segmentation of the map elements in the camera image is a high-precision, low-cost way to extract map-element information, and it can effectively supply this information to a subsequent three-dimensional map-element modeling module. Semantic segmentation is the task of assigning a semantic class to every pixel in an image. Research on semantic segmentation models based on deep convolutional neural networks is now very common: convolutional neural networks have a very strong feature-learning capability, and a stable output can be obtained after sufficient training on a large amount of data. A deep-learning semantic segmentation network is generally divided into an encoding part and a decoding part: the encoding part is a feature-extraction network that extracts deep features of the input image through multiple convolution layers, and the decoding part is an up-sampling network that produces an output consistent with the input size.
Existing semantic segmentation networks can obtain high-precision segmentation results on existing open-source datasets, but they fall short when applied directly to actual scene data in a real high-precision map-element perception and modeling task. Current single-frame semantic segmentation discards the temporal information captured by the camera sensor; because adjacent frames are processed independently, the segmentation jumps between frames, so directly applying existing methods often yields unstable, jumping results that are difficult to use in real engineering.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide a method and a system for enhancing the robustness of semantic segmentation of map elements based on optical-flow information fusion, in which only the continuous video information of a camera sensor is used and the per-frame semantic segmentation results are connected through optical flow information, so that robust map elements can be identified accurately at low cost.
In order to achieve this purpose, the invention adopts the following technical scheme:
the first aspect of the invention provides a robustness enhancing method for semantic segmentation of map elements, which comprises the following steps: 1) And dividing the driving scene video acquired by the vehicle-mounted camera sensor into independent video frame images according to the time sequence. 2) Semantic segmentation is carried out on each independent video frame image data in the step 1) based on a preset semantic segmentation network to obtain masks corresponding to semantic segmentation results of various map elements in each frame image, and optical flow information is introduced between adjacent frame images to enhance the stability of the semantic segmentation video images.
Further, in step 2), the method for enhancing video semantic segmentation stability includes the following steps:
2.1) reading the i-th frame image and inputting it into the preset semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in the i-th frame image; the preset semantic segmentation network targets three map elements: lane lines, lamp posts and road signboards;
2.2) reading the (i+1)-th frame image and inputting it into the preset semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in the (i+1)-th frame image;
2.3) calculating the optical flow information between the i-th frame image and the (i+1)-th frame image to obtain an inter-frame optical flow map;
2.4) based on the obtained inter-frame optical flow map, propagating the masks corresponding to the semantics of the various map elements in the i-th frame image to the (i+1)-th frame image, and performing an enhancement operation on the semantic segmentation result of the (i+1)-th frame image within a preset restricted region, so that incompletely segmented regions in the (i+1)-th frame image are supplemented;
2.5) iterative enhancement: repeating steps 2.2) to 2.4) until all the independent video frame images from step 1) have been processed.
Further, the optical flow information fusion in step 2.4) includes the following steps:
2.4.1) according to the optical flow map calculated between the two frame images, propagating the map-element semantic segmentation result of the earlier frame image along the displacement vectors of the optical flow map, following the pixel correspondences in the optical flow map, to the corresponding positions in the later frame image, obtaining a corrected semantic-element region;
2.4.2) comparing the corrected semantic-element region obtained in step 2.4.1) with the semantic-element segmentation result of the later frame image obtained in step 2.2), and correcting the incompletely segmented map-element parts of the later frame image based on the comparison, thereby realizing the enhanced supplement.
Further, in step 2.4.1), the map-element semantic segmentation result of the earlier frame image is propagated along the displacement vectors of the optical flow map, following the pixel correspondences in the optical flow map, to the corresponding positions in the later frame image according to:

x^(i+1) = x^(i) + u_(x,y) · dt
y^(i+1) = y^(i) + v_(x,y) · dt

where x^(i) and y^(i) are the horizontal and vertical coordinates of a pixel in the i-th frame image; x^(i+1) and y^(i+1) are the coordinates of the corresponding pixel in the (i+1)-th frame image; dt is the elapsed time between the frames; and (u_(x,y), v_(x,y)) is the velocity vector stored at each coordinate of the optical flow map, describing how the pixel at that position propagates to the next frame.
Further, in step 2.4.2), the corrected semantic segmentation result of the later frame image is:

I_(i+1|i+1) = I_(i+1) ∪ (I_(i+1|i) ∩ I_(restricted area))

where I_(i+1) is the semantic segmentation result of the (i+1)-th frame image; I_(i+1|i) is the corrected segmentation result of the i-th frame image propagated by optical flow; and I_(i+1|i+1), the fusion of I_(i+1) with the propagated result I_(i+1|i), is the final semantic segmentation result of the (i+1)-th frame image.
In a second aspect of the present invention, there is provided a robustness enhancing system for semantic segmentation of map elements, comprising:
a video frame image acquisition module, used for dividing the driving-scene video acquired by the vehicle-mounted camera sensor into independent video frames in time order;
and a semantic enhancement module, used for performing semantic segmentation on each independent video frame with a preset semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in each frame image, and introducing optical flow information between adjacent frame images to enhance video semantic segmentation stability.
Further, the semantic enhancement module includes:
a front-frame image processing module, used for reading the earlier frame image and inputting it into the semantic segmentation network for the three map elements (lane lines, lamp posts and road signboards) to obtain masks corresponding to the semantic segmentation results of the various map elements in the earlier frame image;
a rear-frame image processing module, used for reading the later frame image and inputting it into the same semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in the later frame image;
an optical flow map acquisition module, used for calculating the optical flow information between the two frame images to obtain an inter-frame optical flow map;
an optical flow information fusion module, used for propagating the masks of the earlier frame image to the later frame image through the obtained inter-frame optical flow map, and performing an enhancement operation on the result of the later frame image within a preset restricted region to supplement its incompletely segmented semantic regions;
and an iterative enhancement module, used for advancing the frame index by 1 and returning to the front-frame image processing module until all frame images have been processed.
Further, the optical flow information fusion module includes a correction module and a growth-region restriction module. The correction module propagates the map-element semantic segmentation result of the earlier frame image along the displacement vectors of the calculated optical flow map, following the pixel correspondences in the optical flow map, to the corresponding positions in the later frame image, obtaining a corrected semantic-element region. The growth-region restriction module compares the corrected semantic-element region with the semantic-element segmentation result of the later frame image, and corrects the incompletely segmented map-element parts of the later frame image based on the comparison, thereby realizing the enhanced supplement.
Due to the adoption of the above technical scheme, the invention has the following advantages: 1) it uses only the continuous video information of a camera sensor and connects the per-frame semantic segmentation results through optical flow information, so robust map elements can be identified accurately at low cost; 2) by propagating the information of the earlier frame to the later frame through optical flow, it enhances the segmentation of the later frame, reduces unstable segmentation of map elements, and enhances the robustness of map-element segmentation. The invention can therefore be widely applied in the field of automatic driving.
Drawings
FIG. 1 is a flow chart of the optical flow fusion algorithm of the present invention;
FIG. 2 is a block diagram of the iterative enhancement algorithm of the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and examples.
As shown in FIG. 1, the method for enhancing the robustness of map-element semantic segmentation provided by the invention takes as input a video shot by a vehicle-mounted camera sensor in an actual automatic-driving engineering task and uses a basic semantic segmentation network for the preliminary extraction of map elements. On this basis, by fusing optical flow information between consecutive video frames, the segmentation mask of the earlier frame is propagated to the later frame through optical flow, and with a certain fault-tolerance mechanism the result is iteratively optimized, enhancing map elements whose segmentation jumps. Specifically, the method includes the following steps:
1) Dividing the driving-scene video acquired by the vehicle-mounted camera sensor into independent video frame images in time order.
2) Performing semantic segmentation on each independent video frame image from step 1) with a preset semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in each frame image, and introducing optical flow information between adjacent frame images to enhance the stability of the semantic segmentation of the video images.
Specifically, the method comprises the following steps:
2.1) Processing the i-th frame image: read the i-th frame image and input it into the preset semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in the i-th frame image; the preset semantic segmentation network targets three map elements: lane lines, lamp posts and road signboards.
2.2) Processing the (i+1)-th frame image: read the (i+1)-th frame image and input it into the preset semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in the (i+1)-th frame image.
2.3) Obtaining the optical flow map: calculate the optical flow information between the i-th frame image and the (i+1)-th frame image to obtain an inter-frame optical flow map.
2.4) Optical flow information fusion: based on the obtained inter-frame optical flow map, propagate the masks corresponding to the semantics of the various map elements in the i-th frame image to the (i+1)-th frame image, and perform an enhancement operation on the semantic segmentation result of the (i+1)-th frame image within a preset restricted region to supplement incompletely segmented regions in the (i+1)-th frame image. The restricted region is a neighborhood of the detected regions in the (i+1)-th frame image: for lamp-post map elements it can be a longitudinal (vertical) neighborhood, for road-signboard map elements it can be both a horizontal and a longitudinal neighborhood, and the neighborhood size can be adjusted to the actual problem.
2.5) Iterative enhancement: repeat steps 2.2) to 2.4) until all the independent video frame images from step 1) have been processed.
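The iteration in steps 2.1)–2.5) can be sketched as a small driver loop. This is an illustrative skeleton, not the patent's implementation: `enhance_video`, `segment`, `compute_flow`, `propagate` and `restrict` are hypothetical names, and the segmentation network, flow method, mask propagation and restricted-region construction are all injected as callables, since the text leaves them open.

```python
import numpy as np

def enhance_video(frames, segment, compute_flow, propagate, restrict):
    """Steps 2.1)-2.5): segment every frame, then fuse in the previous
    frame's mask after propagating it along the inter-frame optical flow."""
    prev_frame = frames[0]
    prev_mask = segment(prev_frame)                 # step 2.1)
    results = [prev_mask]
    for frame in frames[1:]:
        cur_mask = segment(frame)                   # step 2.2)
        flow = compute_flow(prev_frame, frame)      # step 2.3)
        warped = propagate(prev_mask, flow)         # step 2.4.1)
        # step 2.4.2): only grow inside the restricted (feasible) region
        fused = cur_mask | (warped & restrict(cur_mask))
        results.append(fused)
        # step 2.5): the fused result is carried forward as the
        # "previous frame" mask, as the text requires
        prev_frame, prev_mask = frame, fused
    return results
```

Note that the fused result, not the raw network output, becomes the previous-frame information for the next iteration.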
Further, as shown in fig. 2, the semantic segmentation networks preset in steps 2.1) and 2.2) generally consist of an encoder and a decoder. The encoder is usually a deep convolutional network with a complex structure whose purpose is to extract deep feature information; different encoder models have different feature-characterization capabilities and thus different segmentation effects. The decoder is usually an up-sampling network whose purpose is to convert the deep feature information extracted by the encoder into a segmentation result consistent with the size of the input image.
In the actual high-precision map modeling task, three map elements need to be extracted: lane lines, lamp posts and road signboards. The semantic segmentation network therefore segments only these three semantics in the video image. Before the network is trained, the labels fed into it are processed so that only the semantic labels of these three map elements are retained; after sufficient training, the semantic segmentation network applied in the invention directly outputs the three segmentation results. This step only provides a preliminary single-frame semantic segmentation of the map elements, so the invention does not prescribe a specific method for realizing semantic segmentation and places no other special requirements on the structure of the semantic segmentation network.
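The label reduction described above (keeping only the lane-line, lamp-post and signboard classes before training) might look as follows. The source label ids here are invented for illustration, since the patent does not name a dataset, and `reduce_labels` is an illustrative name.

```python
import numpy as np

# Hypothetical raw label ids; the patent only states that all labels except
# lane line, lamp post and road signboard are discarded before training.
KEEP = {7: 1, 17: 2, 20: 3}   # assumed source id -> target id (0 = background)

def reduce_labels(label_map):
    """Collapse a full semantic label image to the three map-element
    classes, mapping every other class to background."""
    out = np.zeros_like(label_map)
    for src, dst in KEEP.items():
        out[label_map == src] = dst
    return out
```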
Further, in step 2.3), after the map elements of each frame are extracted, optical flow information is introduced between adjacent frames to enhance video semantic segmentation stability. The invention is not concerned with, and is therefore not limited to, any specific method of computing the optical flow.
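Since the patent deliberately leaves the optical-flow method open, here is one minimal, self-contained stand-in: a pure-NumPy phase-correlation estimator that recovers a single global translation between two frames and broadcasts it into the dense H×W×2 (u, v) map used by the following steps. A real system would use a dense method (e.g. Farneback or a learned flow network) instead; `global_translation_flow` is an illustrative name, and the single-translation assumption is of course far weaker than true per-pixel flow.

```python
import numpy as np

def global_translation_flow(prev, nxt):
    """Estimate one global (u, v) translation between two grayscale frames
    by phase correlation, then broadcast it to a dense flow map."""
    f1 = np.fft.fft2(prev.astype(float))
    f2 = np.fft.fft2(nxt.astype(float))
    cross = f2 * np.conj(f1)
    cross /= np.abs(cross) + 1e-12      # normalized cross-power spectrum
    corr = np.fft.ifft2(cross).real     # delta peak at the displacement
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = prev.shape
    if dy > h // 2:                     # unwrap circular shifts
        dy -= h
    if dx > w // 2:
        dx -= w
    flow = np.zeros((h, w, 2))
    flow[..., 0] = dx                   # u component
    flow[..., 1] = dy                   # v component
    return flow
```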
Further, the optical flow information fusion in step 2.4) includes the following steps:
2.4.1) Iterative enhancement: according to the optical flow map calculated between the two frame images, propagate the map-element semantic segmentation result of the earlier frame image along the displacement vectors of the optical flow map, following the pixel correspondences in the optical flow map, to the corresponding positions in the later frame image, obtaining a corrected semantic-element region.
So that the map-element regions corrected by the optical flow map are used effectively, the invention takes the corrected segmentation result as the final semantic segmentation result of the frame image and carries it into the processing of the next frame as the previous-frame information.
The optical flow map is inter-frame information calculated from the pixel relationship between the two frames; each coordinate stores the velocity vector (u_(x,y), v_(x,y)) with which the pixel at that position propagates to the next frame. Let x^(i) and y^(i) be the horizontal and vertical coordinates of a pixel in the i-th frame image (whose segmentation result already fuses the optical-flow-propagated result of frame i−1), and let dt be the elapsed time between frames. Then the pixel (x^(i), y^(i)) propagates along the optical flow map to the following position in the next frame:

x^(i+1) = x^(i) + u_(x,y) · dt
y^(i+1) = y^(i) + v_(x,y) · dt
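A direct reading of this propagation rule, applied to a whole binary mask: every foreground pixel is moved along its stored (u, v) vector, rounded to the nearest pixel, and dropped if it leaves the image. This is a sketch under the assumption of a NumPy H×W×2 flow array; `propagate_mask` is an illustrative name.

```python
import numpy as np

def propagate_mask(mask, flow, dt=1.0):
    """Forward-propagate the foreground pixels of `mask` using
    x_{i+1} = x_i + u*dt and y_{i+1} = y_i + v*dt, where (u, v) is
    stored in flow[..., 0] and flow[..., 1]."""
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    nx = np.rint(xs + flow[ys, xs, 0] * dt).astype(int)
    ny = np.rint(ys + flow[ys, xs, 1] * dt).astype(int)
    keep = (nx >= 0) & (nx < w) & (ny >= 0) & (ny < h)
    out = np.zeros_like(mask)
    out[ny[keep], nx[keep]] = 1
    return out
```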
2.4.2) Growth-region restriction: compare the corrected semantic-element region obtained in step 2.4.1) with the semantic-element segmentation result of the later frame image obtained in step 2.2), and correct the incompletely segmented map-element parts of the later frame image based on the comparison, thereby realizing the enhanced supplement.
Specifically, the region given by the semantic segmentation result of the later frame image (from the network of step 2.2)) is compared with the corrected semantic-element region at the corresponding position obtained in step 2.4.1) to judge whether the later frame image may be modified, i.e., whether the semantic pixels propagated from the earlier frame may modify the later frame's segmentation result. If the later frame image already has semantic segmentation information within the semantic-element region, that region is considered feasible, so the optical flow information can only grow the result within the action region of the original result.
To limit the influence of erroneous optical-flow propagation on later frame images, to ensure that the optical-flow-enhanced region does not gradually expand with iteration, and to avoid erroneously propagated regions, the invention restricts optical-flow enhancement to a certain feasible region, avoiding error propagation and error iteration. Specifically, let I_(i+1) be the semantic segmentation result of the (i+1)-th frame image, I_(i+1|i) the corrected segmentation result of the i-th frame image propagated by optical flow, and I_(i+1|i+1) the final semantic segmentation result of the (i+1)-th frame image, obtained by fusing I_(i+1) with the propagated result I_(i+1|i). Then:

I_(i+1|i+1) = I_(i+1) ∪ (I_(i+1|i) ∩ I_(restricted area))
Because the precision of optical flow computation is limited, a complete one-to-one correspondence between the pixels of the two frames cannot be achieved, so the algorithm is effective only for map elements whose segmentation region is obviously insufficient in a frame where the segmentation jumps. If the segmentation accuracy is already high enough, the algorithm may instead introduce new errors due to the introduction of optical flow.
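The restricted-region fusion above can be sketched as follows. The directional dilation builds I_(restricted area) as a neighborhood of the current frame's detections — vertical only for lamp posts, both directions for signboards, with tunable sizes, as the text describes. `directional_dilate` and `fuse` are illustrative names, and shift-and-OR dilation is one simple choice of neighborhood construction.

```python
import numpy as np

def directional_dilate(mask, up=0, down=0, left=0, right=0):
    """Grow a binary mask by OR-ing shifted copies of itself, giving a
    directional neighborhood (the feasible region for optical-flow growth)."""
    out = mask.copy()
    for k in range(1, up + 1):
        out[:-k] |= mask[k:]        # rows above a detection become feasible
    for k in range(1, down + 1):
        out[k:] |= mask[:-k]        # rows below
    for k in range(1, left + 1):
        out[:, :-k] |= mask[:, k:]  # columns to the left
    for k in range(1, right + 1):
        out[:, k:] |= mask[:, :-k]  # columns to the right
    return out

def fuse(cur_mask, propagated, restricted):
    """I_(i+1|i+1) = I_(i+1) ∪ (I_(i+1|i) ∩ I_(restricted area))."""
    return cur_mask | (propagated & restricted)
```

Propagated pixels outside the feasible neighborhood are discarded, which is exactly how the formula prevents the enhanced region from expanding across iterations.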
Based on the above map-element semantic segmentation enhancement method, the invention also provides a map-element semantic segmentation enhancement system, comprising: a video frame image acquisition module, used for dividing the driving-scene video acquired by the vehicle-mounted camera sensor into independent video frames in time order; and a semantic enhancement module, used for performing semantic segmentation on each independent video frame with a preset semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in each frame image, and introducing optical flow information between adjacent frame images to enhance video semantic segmentation stability.
Further, the semantic enhancement module comprises:
a front-frame image processing module, used for reading the earlier frame image and inputting it into the semantic segmentation network for the three map elements (lane lines, lamp posts and road signboards) to obtain masks corresponding to the semantic segmentation results of the various map elements in the earlier frame image;
a rear-frame image processing module, used for reading the later frame image and inputting it into the same semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in the later frame image;
an optical flow map acquisition module, used for calculating the optical flow information between the two frame images to obtain an inter-frame optical flow map;
an optical flow information fusion module, used for propagating the masks of the earlier frame image to the later frame image through the obtained inter-frame optical flow map, and performing an enhancement operation on the result of the later frame image within a preset restricted region to supplement its incompletely segmented semantic regions;
and an iterative enhancement module, used for advancing the frame index by 1 and returning to the front-frame image processing module until all frame images have been processed.
Further, the optical flow information fusion module includes a correction module and a growth-region restriction module. The correction module propagates the map-element semantic segmentation result of the earlier frame image along the displacement vectors of the calculated optical flow map, following the pixel correspondences in the optical flow map, to the corresponding positions in the later frame image, obtaining a corrected semantic-element region. The growth-region restriction module compares the corrected semantic-element region with the semantic-element segmentation result of the later frame image, and corrects the incompletely segmented map-element parts of the later frame image based on the comparison, realizing the enhanced supplement.
The above embodiments are only used to illustrate the present invention; the structure, connection mode, manufacturing process, etc. of the components may be changed, and all equivalent changes and modifications made on the basis of the technical solution of the present invention should not be excluded from the protection scope of the present invention.

Claims (3)

1. A method for enhancing robustness of semantic segmentation of map elements is characterized by comprising the following steps:
1) dividing the driving-scene video acquired by a vehicle-mounted camera sensor into independent video frame images in time order;
2) performing semantic segmentation on each independent video frame image from step 1) with a preset semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in each frame image, and introducing optical flow information between adjacent frame images to enhance the stability of the semantic segmentation of the video images;
in step 2), the method for enhancing video semantic segmentation stability comprises the following steps:
2.1) reading the i-th frame image and inputting it into a preset semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in the i-th frame image, wherein the preset semantic segmentation network targets three map elements: lane lines, lamp posts and road signboards;
2.2) reading the (i+1)-th frame image and inputting it into the preset semantic segmentation network to obtain masks corresponding to the semantic segmentation results of the various map elements in the (i+1)-th frame image;
2.3) calculating the optical flow information between the i-th frame image and the (i+1)-th frame image to obtain an inter-frame optical flow map;
2.4) based on the obtained inter-frame optical flow map, propagating the masks corresponding to the semantics of the various map elements in the i-th frame image to the (i+1)-th frame image, and performing an enhancement operation on the semantic segmentation result of the (i+1)-th frame image within a preset restricted region to supplement incompletely segmented regions in the (i+1)-th frame image;
the optical flow information fusion method comprises the following steps:
2.4.1 ) According to the calculated optical flow map between the two frame images, propagating the map element semantic segmentation result of the front frame image along the optical-flow displacement vectors to the corresponding positions of the rear frame image, according to the pixel correspondences in the optical flow map, to obtain a corrected semantic element region;
2.4.2 Comparing the corrected semantic element region obtained in the step 2.4.1) with the semantic element segmentation result of the later frame image obtained in the step 2.2), and correcting the part of the later frame image with incomplete map element segmentation based on the comparison result to realize enhanced supplement;
the corrected semantic segmentation result of the (i + 1) th frame image is:
I(i+1|i+1) = I(i+1) ∪ (I(i+1|i) ∩ I(restricted area))
wherein I(i+1) is the semantic segmentation result of the (i + 1) th frame image; I(i+1|i) is the corrected semantic segmentation result of the ith frame image propagated by optical flow; and I(i+1|i+1), the fusion of the semantic segmentation result I(i+1) of the (i + 1) th frame image with the optical-flow-propagated corrected result I(i+1|i) of the ith frame image, is the final semantic segmentation result of the (i + 1) th frame image;
2.5 Iterative enhancement: and repeating the steps 2.2) to 2.4) until all the independent video frame images in the step 1) are processed.
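Steps 2.1)–2.5) amount to a frame-by-frame loop. A minimal NumPy sketch (the segmentation network and optical-flow estimator are replaced by precomputed stand-in arrays; the function and variable names are hypothetical, not from the patent):

```python
import numpy as np

def enhance_sequence(seg_masks, flows, restricted, dt=1.0):
    """seg_masks[i]: boolean mask of frame i from the segmentation network.
    flows[i]: (h, w, 2) optical-flow map from frame i to frame i+1, as (U, V).
    Returns the per-frame masks after optical-flow enhancement."""
    h, w = seg_masks[0].shape
    enhanced = [seg_masks[0]]                      # step 2.1: first frame as-is
    for i in range(len(seg_masks) - 1):            # steps 2.2-2.5: iterate
        # step 2.4.1: propagate frame i's enhanced mask along the flow
        corrected = np.zeros((h, w), dtype=bool)
        ys, xs = np.nonzero(enhanced[i])
        nx = np.clip(np.round(xs + flows[i][ys, xs, 0] * dt).astype(int), 0, w - 1)
        ny = np.clip(np.round(ys + flows[i][ys, xs, 1] * dt).astype(int), 0, h - 1)
        corrected[ny, nx] = True
        # step 2.4.2: I(i+1|i+1) = I(i+1) ∪ (I(i+1|i) ∩ I(restricted area))
        enhanced.append(seg_masks[i + 1] | (corrected & restricted))
    return enhanced

# Two frames: the element vanishes from frame 1's raw segmentation but is
# recovered by propagating frame 0's mask with zero flow (static scene).
m0 = np.zeros((3, 3), dtype=bool); m0[0, 0] = True
m1 = np.zeros((3, 3), dtype=bool)
out = enhance_sequence([m0, m1], [np.zeros((3, 3, 2))], np.ones((3, 3), dtype=bool))
```

Because the loop propagates the already-enhanced mask of frame i rather than the raw one, a supplemented element can survive across several consecutive frames of segmentation failure.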
2. The method for enhancing robustness of semantic segmentation of map elements as claimed in claim 1, characterized in that: in step 2.4.1), the formula by which the map element semantic segmentation result of the front frame image is propagated along the optical-flow displacement vectors, according to the pixel correspondences in the optical flow map, to the corresponding positions of the rear frame image is as follows:
x(i+1) = x(i) + U(x,y) · dt
y(i+1) = y(i) + V(x,y) · dt
wherein x(i) and y(i) are the horizontal and vertical coordinates of a pixel in the ith frame image; x(i+1) and y(i+1) are the horizontal and vertical coordinates of the corresponding pixel in the (i + 1) th frame image; dt is the elapsed time; and (U(x,y), V(x,y)) is the velocity vector, stored at each coordinate of the optical flow map, with which the pixel at that position propagates to the next frame.
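The displacement relation in claim 2 can be evaluated directly over coordinate grids. A hedged NumPy sketch (the grid size, dt, and the constant flow field are illustrative assumptions):

```python
import numpy as np

h, w, dt = 3, 3, 1.0
ys, xs = np.mgrid[0:h, 0:w]          # y = row index, x = column index
U = np.full((h, w), 2.0)             # horizontal velocity stored in the flow map
V = np.full((h, w), -1.0)            # vertical velocity stored in the flow map
# x(i+1) = x(i) + U(x,y) * dt ;  y(i+1) = y(i) + V(x,y) * dt
x_next = xs + U * dt
y_next = ys + V * dt
```

Each pixel's new position is its old position shifted by the per-pixel velocity scaled by the inter-frame time dt.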
3. A robustness enhancement system for semantic segmentation of map elements, comprising:
the video frame image acquisition module is used for dividing the driving scene video acquired by the vehicle-mounted camera sensor into independent video frames according to time sequence;
the semantic enhancement module is used for performing semantic segmentation on each independent video frame data based on a preset semantic segmentation network to obtain masks corresponding to semantic segmentation results of various map elements in each frame image, and introducing optical flow information between adjacent frame images to enhance the video semantic segmentation stability;
the semantic enhancement module comprises:
the front frame image processing module is used for reading the front frame image and inputting it into a semantic segmentation network targeting three map elements (lane lines, lamp posts and road signboards) for processing, to obtain masks corresponding to the semantic segmentation results of various map elements in the front frame image;
the rear frame image processing module is used for reading the rear frame image and inputting it into the same semantic segmentation network for processing, to obtain masks corresponding to the semantic segmentation results of various map elements in the rear frame image;
the optical flow graph acquisition module is used for calculating optical flow information between the images of the front frame and the rear frame to obtain an inter-frame optical flow graph;
the optical flow information fusion module is used for propagating the masks corresponding to the front frame image to the rear frame image through the obtained inter-frame optical flow map, and performing an enhancement operation on the result of the rear frame image within a preset restricted region, so as to supplement incompletely segmented semantic regions of the rear frame image;
the iteration enhancement module is used for incrementing the frame index by 1 and returning to the front frame image processing module until all frame images are processed;
the optical flow information fusion module comprises a correction module and a growing area limiting module, wherein the correction module is used for transmitting the map element semantic segmentation result of the front frame image to the corresponding position of the rear frame image along the optical flow image displacement vector according to the corresponding relation of pixels in the optical flow map through the optical flow map corresponding to the optical flow information according to the optical flow information between the two calculated frame images to obtain a corrected semantic element area; the growth region limiting module is used for comparing the obtained corrected semantic element region with the semantic element segmentation result of the later frame image, and correcting the part of the later frame image with incomplete map element segmentation based on the comparison result to realize enhanced supplement;
the corrected semantic segmentation result of the rear frame image is:
I(i+1|i+1) = I(i+1) ∪ (I(i+1|i) ∩ I(restricted area))
wherein I(i+1) is the semantic segmentation result of the (i + 1) th frame image; I(i+1|i) is the corrected semantic segmentation result of the ith frame image propagated by optical flow; and I(i+1|i+1), the fusion of I(i+1) with the optical-flow-propagated corrected result I(i+1|i), is the final semantic segmentation result of the (i + 1) th frame image.
CN202110203999.XA 2021-02-24 2021-02-24 Method and system for enhancing robustness of semantic segmentation of map elements Active CN112862839B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110203999.XA CN112862839B (en) 2021-02-24 2021-02-24 Method and system for enhancing robustness of semantic segmentation of map elements


Publications (2)

Publication Number Publication Date
CN112862839A (en) 2021-05-28
CN112862839B (en) 2022-12-23

Family

ID=75990495

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110203999.XA Active CN112862839B (en) 2021-02-24 2021-02-24 Method and system for enhancing robustness of semantic segmentation of map elements

Country Status (1)

Country Link
CN (1) CN112862839B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113780067A (en) * 2021-07-30 2021-12-10 武汉中海庭数据技术有限公司 Lane linear marker detection method and system based on semantic segmentation
CN114529719A (en) * 2022-01-25 2022-05-24 清华大学 Method, system, medium and device for semantic segmentation of ground map elements
CN116168173B (en) * 2023-04-24 2023-07-18 之江实验室 Lane line map generation method, device, electronic device and storage medium
CN117763064A (en) * 2023-11-01 2024-03-26 武汉中海庭数据技术有限公司 Map updating method, system, equipment and storage medium

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
CN106875406B (en) * 2017-01-24 2020-04-14 北京航空航天大学 Image-guided video semantic object segmentation method and device
CN109753913B (en) * 2018-12-28 2023-05-23 东南大学 Multi-mode video semantic segmentation method with high calculation efficiency
CN110147763B (en) * 2019-05-20 2023-02-24 哈尔滨工业大学 Video semantic segmentation method based on convolutional neural network
CN111062395B (en) * 2019-11-27 2020-12-18 北京理工大学 Real-time video semantic segmentation method
CN111652081B (en) * 2020-05-13 2022-08-05 电子科技大学 Video semantic segmentation method based on optical flow feature fusion

Non-Patent Citations (2)

Title
Semantic Segmentation for Urban Planning Maps Based on U-Net; Zhiling Guo et al.; IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium; 2018-11-04; entire document *
Map segmentation algorithm based on weighted K-means clustering and road-network undirected graph; Xiao Shanghua et al.; Modern Computer; 2018-03-31; entire document *


Similar Documents

Publication Publication Date Title
CN112862839B (en) Method and system for enhancing robustness of semantic segmentation of map elements
CN111126359B (en) High-definition image small target detection method based on self-encoder and YOLO algorithm
CN109753913B (en) Multi-mode video semantic segmentation method with high calculation efficiency
CN114782691A (en) Robot target identification and motion detection method based on deep learning, storage medium and equipment
WO2020097840A1 (en) Systems and methods for correcting a high-definition map based on detection of obstructing objects
CN110738121A (en) front vehicle detection method and detection system
CN109974743B (en) Visual odometer based on GMS feature matching and sliding window pose graph optimization
CN112633220B (en) Human body posture estimation method based on bidirectional serialization modeling
CN103049909B (en) A kind of be focus with car plate exposure method
CN111553945B (en) Vehicle positioning method
CN112070049B (en) Semantic segmentation method under automatic driving scene based on BiSeNet
CN104408757A (en) Method and system for adding haze effect to driving scene video
CN103578083A (en) Single image defogging method based on joint mean shift
CN112949633A (en) Improved YOLOv 3-based infrared target detection method
CN115830265A (en) Automatic driving movement obstacle segmentation method based on laser radar
CN116503709A (en) Vehicle detection method based on improved YOLOv5 in haze weather
CN112801021B (en) Method and system for detecting lane line based on multi-level semantic information
CN111160282B (en) Traffic light detection method based on binary Yolov3 network
CN114882328B (en) Target detection method combining visible light image and infrared image
CN116385994A (en) Three-dimensional road route extraction method and related equipment
CN115937449A (en) High-precision map generation method and device, electronic equipment and storage medium
CN114111817A (en) Vehicle positioning method and system based on SLAM map and high-precision map matching
CN115393822A (en) Method and equipment for detecting obstacle in driving in foggy weather
CN115578246B (en) Non-aligned visible light and infrared mode fusion target detection method based on style migration
Du et al. An Urban Road Semantic Segmentation Method Based on Bilateral Segmentation Network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant