WO2021073656A1 - Method for automatically labeling image data and device - Google Patents

Info

Publication number
WO2021073656A1
Authority
WO
WIPO (PCT)
Prior art keywords
map
road
information
coordinate system
image
Prior art date
Application number
PCT/CN2020/122514
Other languages
French (fr)
Chinese (zh)
Inventor
付万增
王哲
石建萍
Original Assignee
上海商汤临港智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 上海商汤临港智能科技有限公司
Priority to KR1020217017022A (published as KR20220053513A)
Priority to JP2021539968A (published as JP2022517961A)
Publication of WO2021073656A1

Classifications

    • G01C 21/3815 — Creation or updating of map data characterised by the type of data: road data
    • G01C 21/32 — Structuring or formatting of map data
    • G01C 21/367 — Details of road-map display, e.g. scale, orientation, zooming, level of detail
    • G01C 21/3844 — Creation or updating of map data from position sensors only, e.g. inertial navigation
    • G01C 21/387 — Organisation of map data, e.g. version management or database structures
    • G06F 16/51 — Still-image retrieval: indexing; data structures and storage structures therefor
    • G06F 16/58 — Still-image retrieval characterised by using metadata
    • G06F 16/5854 — Still-image retrieval using metadata automatically derived from the content, using shape and object relationship
    • G06F 16/587 — Still-image retrieval using geographical or spatial information, e.g. location
    • G06N 3/04 — Neural networks: architecture, e.g. interconnection topology
    • G06N 3/08 — Neural networks: learning methods

Definitions

  • the present disclosure relates to the technical field of image processing, and in particular to a method and device for automatically labeling image data.
  • the embodiments of the present disclosure provide a technical solution for automatic labeling of image data.
  • an embodiment of the present disclosure provides a method for automatically labeling image data.
  • the method includes: acquiring vehicle positioning information, a map image, and a vehicle-collected image, where the map image includes road information; acquiring, according to the vehicle positioning information, the road information on the map image in the local area corresponding to the vehicle positioning information; and projecting the road information on the map image onto the vehicle-collected image to mark the road information on the vehicle-collected image.
  • the acquiring, according to the vehicle positioning information, of the road information on the map image in the local area corresponding to the vehicle positioning information includes: using the map image in the local area as the root node, sequentially querying the attribute information of the map road elements of the map image in the local area, where the attribute information of a map road element includes at least one of the following: semantic information of the map road element, position information of the map road element, and shape information of the map road element.
  • the method further includes: determining the range of the local area according to the vehicle positioning information and the range of the map image; the acquiring, according to the vehicle positioning information, of the road information on the map image in the local area corresponding to the vehicle positioning information then includes: acquiring attribute information of map road elements on the map image within the range of the local area.
  • the map image is based on the world global coordinate system
  • the method further includes: converting the map image based on the world global coordinate system to the vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system; the projecting of the road information on the map image onto the vehicle-collected image includes: converting the map image based on the vehicle body coordinate system to the camera coordinate system and/or the pixel coordinate system, so as to project the road information on the map image onto the vehicle-collected image.
  • the converting of the map image based on the world global coordinate system to the vehicle body coordinate system to obtain the map image based on the vehicle body coordinate system includes: obtaining, according to the vehicle positioning information, the rotation angle and translation amount of a rotation-translation matrix; and converting, according to the rotation-translation matrix, the map image based on the world global coordinate system to the vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system.
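The rotation angle and translation amount described above can be pictured with a minimal sketch, assuming a simplified 2-D vehicle pose (position plus yaw) taken from the positioning information; the function name and pose layout are hypothetical illustrations, not taken from the disclosure:

```python
import math

def world_to_body(px, py, yaw, x_w, y_w):
    """Map a world-frame point (x_w, y_w) into the vehicle body frame.

    (px, py, yaw) is an assumed 2-D vehicle pose from the positioning
    information; the rotation-translation applied here is the inverse of
    the body-to-world pose (translate to the vehicle, rotate by -yaw).
    """
    dx, dy = x_w - px, y_w - py
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * dx + s * dy, -s * dx + c * dy)
```

A 3-D variant would apply the full roll/pitch/yaw rotation matrix and a 3-vector translation in the same way.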
  • the map image is a two-dimensional map
  • the converting of the map image based on the vehicle body coordinate system to the camera coordinate system and/or the pixel coordinate system, so as to project the road information on the map image onto the vehicle-collected image, includes: obtaining a homography matrix between the pixel coordinate system and the vehicle body coordinate system; representing the map image based on the vehicle body coordinate system in homogeneous coordinates; and converting, according to the homography matrix, the map image based on the vehicle body coordinate system represented in homogeneous coordinates to the pixel coordinate system, to obtain the road information of the vehicle-collected image projected on the pixel coordinate system.
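For the two-dimensional case, the homogeneous-coordinate conversion above amounts to one matrix product followed by a perspective division. A minimal sketch, assuming the 3x3 homography H between the vehicle-body ground plane and the pixel plane is already given (e.g. from calibration):

```python
def project_homography(H, x, y):
    """Map a body-frame ground point (x, y) to pixel coordinates.

    The point is written in homogeneous coordinates (x, y, 1), multiplied
    by the 3x3 homography H, and the homogeneous scale w is divided out.
    """
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)
```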
  • the map image is a three-dimensional map
  • the converting of the map image based on the vehicle body coordinate system to the camera coordinate system and/or the pixel coordinate system, so as to project the road information on the map image onto the vehicle-collected image, includes: converting, according to the rotation-translation matrix between the vehicle body coordinate system and the camera coordinate system, the map image based on the vehicle body coordinate system to the camera coordinate system, to obtain the road information of the vehicle-collected image projected on the camera coordinate system; and converting, according to the projection matrix between the camera coordinate system and the pixel coordinate system, the road information projected on the camera coordinate system to the pixel coordinate system, to obtain the road information of the vehicle-collected image projected on the pixel coordinate system.
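The three-dimensional chain (body to camera via the rotation-translation matrix, camera to pixel via the projection matrix) can be sketched as follows; representing the camera intrinsics as a flat (fx, fy, cx, cy) tuple is a simplification for illustration only:

```python
def project_point(R, t, K, p_body):
    """Project a 3-D body-frame point into pixel coordinates.

    p_cam = R @ p_body + t      (rotation-translation, body -> camera)
    pixel = pinhole projection  (camera -> pixel, divide by depth z)
    """
    x, y, z = (sum(R[i][j] * p_body[j] for j in range(3)) + t[i]
               for i in range(3))
    fx, fy, cx, cy = K
    return (fx * x / z + cx, fy * y / z + cy)
```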
  • the method further includes: performing road information extraction on the vehicle-collected image via a neural network for extracting road information to obtain perceived road information; and correcting, based on the perceived road information, the road information projected on the vehicle-collected image.
  • the correcting of the road information projected on the vehicle-collected image according to the perceived road information includes: determining offset information between a perceived road element in the perceived road information and a map road element in the road information projected on the vehicle-collected image; and correcting, according to the offset information, the road information projected on the vehicle-collected image.
  • the determining of the offset information between the perceived road element and the map road element includes: determining, according to the attribute information of the perceived road element, the map road element paired with the perceived road element from the map image; determining the position information of the paired perceived road element and map road element in the same device coordinate system; and determining, based on the position information, the positioning offset between the paired perceived road element and map road element.
  • the determining, according to the attribute information of the perceived road element, of the map road element paired with the perceived road element from the map image includes: searching, in the map image, for map road elements within a preset range based on the vehicle positioning information; pairing the perceived road elements in the vehicle-collected image with the map road elements within the preset range, pairwise, based on the attribute information, to obtain multiple pairing schemes, where at least one perceived road element is paired with a different map road element within the preset range across the different pairing schemes; determining the confidence of each pairing scheme; and determining the map road element paired with the perceived road element from the pairing scheme whose confidence is the highest or exceeds a set threshold among the multiple pairing schemes.
  • the pairing of the perceived road elements in the vehicle-collected image with the map road elements within the preset range includes: in the case that no paired map road element within the preset range can be determined for a perceived road element in the vehicle-collected image, setting an empty or virtual element among the map road elements to be paired with that perceived road element.
  • the determining of the confidence of each pairing scheme includes: separately determining the individual similarity of each pairing of a perceived road element with a map road element in each pairing scheme; determining the overall similarity between the perceived road elements and the map road elements in each pairing scheme; and determining the confidence of each pairing scheme according to its individual similarities and overall similarity.
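One way to read this confidence computation is as a combination of per-pair scores and a scheme-level score. The sketch below uses one illustrative combination (mean individual similarity times overall similarity), not the formula claimed by the disclosure; the similarity functions are supplied by the caller:

```python
def scheme_confidence(pairs, individual_sim, overall_sim):
    """Confidence of one pairing scheme.

    pairs          -- list of (perceived_element, map_element) pairs
    individual_sim -- per-pair similarity callable, returns [0, 1]
    overall_sim    -- scheme-level similarity callable, returns [0, 1]
    """
    if not pairs:
        return 0.0
    mean_individual = sum(individual_sim(p, m) for p, m in pairs) / len(pairs)
    return mean_individual * overall_sim(pairs)

def best_scheme(schemes, individual_sim, overall_sim):
    """Pick the pairing scheme with the highest confidence."""
    return max(schemes,
               key=lambda s: scheme_confidence(s, individual_sim, overall_sim))
```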
  • the positioning offset includes a coordinate offset and/or a direction offset; the determining, based on the position information, of the positioning offset between the paired perceived road element and map road element includes: sampling the pixels of the perceived road element to obtain a set of perceived sampling points; sampling the pixels of the map road element to obtain a set of map sampling points; determining the rotation-translation matrix between the sampling points in the perceived sampling point set and those in the map sampling point set; and obtaining, based on the rotation-translation matrix, the coordinate offset and the direction offset between the perceived road element and the map road element.
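Estimating the rotation-translation matrix between the two sampling-point sets is essentially a rigid point-set alignment. A minimal 2-D sketch using the closed-form Procrustes solution; in practice, closest-point iteration (FIG. 9) would re-pair nearest points and repeat this step:

```python
import math

def align_2d(src, dst):
    """Rigid transform taking point set `src` (perceived sampling points)
    onto `dst` (map sampling points), assumed paired index-by-index.

    Returns (theta, tx, ty): the direction offset and coordinate offset.
    """
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n
    cdy = sum(p[1] for p in dst) / n
    sxx = syy = sxy = syx = 0.0
    for (x, y), (u, v) in zip(src, dst):
        x, y, u, v = x - csx, y - csy, u - cdx, v - cdy
        sxx += x * u
        syy += y * v
        sxy += x * v
        syx += y * u
    theta = math.atan2(sxy - syx, sxx + syy)  # direction offset
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)            # coordinate offset (x)
    ty = cdy - (s * csx + c * csy)            # coordinate offset (y)
    return theta, tx, ty
```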
  • an embodiment of the present disclosure provides an automatic labeling device for image data.
  • the device includes: a first acquiring part configured to acquire vehicle positioning information, a map image, and a vehicle-collected image, where the map image includes road information; a second acquiring part configured to acquire, according to the vehicle positioning information, the road information on the map image in the local area corresponding to the vehicle positioning information; and a projection part configured to project the road information on the map image onto the vehicle-collected image to mark the road information on the vehicle-collected image.
  • the second acquiring part is configured to use the map image in the local area as a root node and sequentially query the attribute information of the map road elements of the map image in the local area, where the attribute information of a map road element includes at least one of the following: semantic information of the map road element, position information of the map road element, and shape information of the map road element.
  • the device further includes: a first determining part configured to determine the range of the local area according to the vehicle positioning information and the range of the map image; the second acquiring part is configured to acquire attribute information of map road elements on the map image within the range of the local area.
  • the map image is based on the world global coordinate system
  • the device further includes: a first conversion part configured to convert the map image based on the world global coordinate system to the vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system; the projection part is configured to convert the map image based on the vehicle body coordinate system to a camera coordinate system and/or a pixel coordinate system, so as to project the road information on the map image onto the vehicle-collected image.
  • the first conversion part is configured to: obtain the rotation angle and translation amount of a rotation-translation matrix according to the vehicle positioning information; and convert, according to the rotation-translation matrix, the map image based on the world global coordinate system to the vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system.
  • the map image is a two-dimensional map
  • the projection part includes: a third acquiring part configured to acquire the homography matrix between the pixel coordinate system and the vehicle body coordinate system; a representation part configured to represent the map image based on the vehicle body coordinate system in homogeneous coordinates; and a second conversion part configured to convert, according to the homography matrix, the map image based on the vehicle body coordinate system represented in homogeneous coordinates to the pixel coordinate system, to obtain the road information of the vehicle-collected image projected on the pixel coordinate system.
  • the map image is a three-dimensional map
  • the projection part includes: a third conversion part configured to convert, according to the rotation-translation matrix between the vehicle body coordinate system and the camera coordinate system, the map image based on the vehicle body coordinate system to the camera coordinate system, to obtain the road information of the vehicle-collected image projected on the camera coordinate system; and a fourth conversion part configured to convert, according to the projection matrix between the camera coordinate system and the pixel coordinate system, the road information projected on the camera coordinate system to the pixel coordinate system, to obtain the road information of the vehicle-collected image projected on the pixel coordinate system.
  • the device further includes: an extraction part configured to perform road information extraction on the vehicle-collected image via a neural network for extracting road information to obtain perceived road information; and a first correction part configured to correct, according to the perceived road information, the road information projected on the vehicle-collected image.
  • the first correction part includes: a second determining part configured to determine the offset information between a perceived road element in the perceived road information and a map road element in the road information projected on the vehicle-collected image; and a second correction part configured to correct, according to the offset information, the road information projected on the vehicle-collected image.
  • the second determining part includes: a third determining part configured to determine, according to the attribute information of the perceived road element, the map road element paired with the perceived road element from the map image; a fourth determining part configured to determine the position information of the paired perceived road element and map road element in the same device coordinate system; and a fifth determining part configured to determine, based on the position information, the positioning offset between the paired perceived road element and map road element.
  • the third determining part includes: a searching part configured to search, in the map image, for map road elements within a preset range based on the vehicle positioning information; a pairing part configured to pair, based on the attribute information, the perceived road elements in the vehicle-collected image with the map road elements within the preset range to obtain multiple pairing schemes, where at least one perceived road element is paired with a different map road element within the preset range across the different pairing schemes; a sixth determining part configured to determine the confidence of each pairing scheme; and a seventh determining part configured to determine, from the pairing scheme whose confidence is the highest or exceeds a set threshold among the multiple pairing schemes, the map road element paired with the perceived road element.
  • the pairing part is configured to, in the case that no paired map road element within the preset range can be determined for a perceived road element in the vehicle-collected image, set an empty or virtual element among the map road elements to be paired with that perceived road element.
  • the sixth determining part is configured to: separately determine the individual similarity of each pairing of a perceived road element with a map road element in each pairing scheme; determine the overall similarity between the perceived road elements and the map road elements in each pairing scheme; and determine the confidence of each pairing scheme according to its individual similarities and overall similarity.
  • the positioning offset includes a coordinate offset and/or a direction offset; the fifth determining part includes: a first sampling part configured to sample the pixels of the perceived road element to obtain a set of perceived sampling points; a second sampling part configured to sample the pixels of the map road element to obtain a set of map sampling points; an eighth determining part configured to determine the rotation-translation matrix between the sampling points in the perceived sampling point set and those in the map sampling point set; and a fourth acquiring part configured to obtain, based on the rotation-translation matrix, the coordinate offset and the direction offset between the perceived road element and the map road element.
  • an embodiment of the present disclosure provides a device for automatically labeling image data.
  • the device includes: an input device, an output device, a memory, and a processor, where the memory stores a set of program codes, and the processor is configured to call the program codes stored in the memory to execute the method described in the first aspect or any of its possible implementations.
  • embodiments of the present disclosure provide a computer-readable storage medium storing instructions that, when run on a computer, cause the computer to execute the method described in the first aspect or any of its possible implementations.
  • the embodiments of the present disclosure provide a computer program including computer-readable code that, when run in an electronic device, causes a processor in the electronic device to execute the method described in the first aspect or any of its possible implementations.
  • By utilizing the rich road information contained in map data, projecting the road information in the map data onto the vehicle-collected image realizes automatic labeling of the road information on the vehicle-collected image, which improves the efficiency of labeling image data, helps reduce the error probability of data labeling, and reduces the labor cost of image data labeling.
  • FIG. 1 is a schematic flowchart of a method for automatically labeling image data according to an embodiment of the disclosure
  • FIG. 2 is a schematic flowchart of another method for automatically labeling image data according to an embodiment of the present disclosure
  • FIGS. 3A-3C show the effect of identifying the semantic information of road elements
  • FIG. 4A is a schematic diagram of the world global coordinate system
  • FIG. 4B is a schematic diagram of the vehicle body coordinate system
  • FIG. 4C is a schematic diagram of the camera coordinate system and the pixel coordinate system
  • FIG. 5 is an example of a method for determining offset information between a perceived road element and a map road element provided by an embodiment of the disclosure
  • FIG. 6 is an example of a method for determining a map road element paired with a perceived road element from a map image provided by an embodiment of the disclosure
  • FIG. 7 is a schematic diagram of a pairing scheme provided by an embodiment of the disclosure.
  • FIG. 8 is an example of a method for determining a positioning offset between a paired perceived road element and a map road element provided by an embodiment of the disclosure
  • FIG. 9 is a schematic diagram of the closest point iteration method provided by an embodiment of the disclosure.
  • FIG. 10 is a schematic structural diagram of a device for automatically labeling image data provided by an embodiment of the disclosure.
  • FIG. 11 is a schematic structural diagram of another image data automatic labeling device provided by an embodiment of the disclosure.
  • FIG. 1 is a schematic flowchart of an image data automatic labeling method provided by an embodiment of the present disclosure.
  • the method may include the following steps:
  • S101: Acquire vehicle positioning information, a map image, and a vehicle-collected image.
  • vehicles are understood in a broad sense here. They can include various types of vehicles with transportation or operation functions in the traditional sense, such as trucks, buses, and cars; they can also include movable robot devices, such as blind-guide devices, smart toys, and sweeping robots among smart home devices, as well as industrial robots, service robots, toy robots, educational robots, and the like, which are not limited in the embodiments of the present disclosure.
  • the vehicle may be equipped with a position sensor to obtain the vehicle positioning information.
  • the vehicle may also be equipped with a vision sensor to collect images around the vehicle in real time; the images obtained may be referred to as vehicle-collected images. Since an image collected by the vision sensor installed on the vehicle amounts to the vehicle driving control system's "perception" of the vehicle's surroundings, a vehicle-collected image may also be referred to as a perceived road image. In the embodiments of the present disclosure, the vehicle-collected image is the raw collected image itself, with no label information on it.
  • the map image can also be obtained from a server or a vehicle-mounted terminal.
  • the map image can be a semantic map, a high-precision map, etc., but is not limited to this, and can also be other types of maps.
  • the map image includes rich road information, which refers to the attribute information of the map road elements identified based on the map image.
  • the map road elements in the road information may include road-related markings and signs, and may include at least one or more of the following: various types of lane lines, stop lines, turning lines, and road edge lines on the road, as well as traffic signs, traffic lights, street lights, and the like installed beside or above the road. Various types of lane lines may include, but are not limited to, white solid lane lines, yellow dashed lane lines, left-edge lane lines, right-edge lane lines, etc.; various types of traffic signs may include, but are not limited to, slow-down signs, no-stopping signs, speed-limit signs, etc. The road elements are not limited to the above.
  • the attribute information of a map road element may include one or more kinds of information related to the map road element, such as its semantic information, position information, and shape information.
  • the semantic information of a road element is the meaning the road element represents, i.e., the information it conveys. For example, when a line on the road is detected in the collected road image, whether the line is a stop line, a lane line, and so on can be determined from its position on the road and its width and length relative to the road. Since lane lines can be subdivided into many types, "lane line" is the basic semantic information, and the specific semantic information, such as left-edge lane line or white solid lane line, can be further determined according to the position and shape of the line; for traffic signs, "slow-down sign" and "no-stopping sign" can be the specific semantic information of the road element. Those skilled in the art should understand that the specific form in which the semantic information of a road element is expressed does not affect the implementation of the method of the present disclosure.
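The attribute information described above can be pictured as a small record per road element. The field names and value formats below are hypothetical, chosen only to make the (basic semantics, specific semantics, position, shape) structure concrete:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class MapRoadElement:
    """Attribute information of one map road element (illustrative)."""
    basic_semantic: str                  # e.g. "lane_line", "stop_line"
    specific_semantic: str               # e.g. "left_edge", "white_solid"
    position: List[Tuple[float, float]]  # representative map points
    shape: str                           # e.g. "solid", "dashed"

# A stop line spanning two map points.
stop_line = MapRoadElement("stop_line", "", [(10.0, 0.0), (13.5, 0.0)], "solid")
```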
  • the above-mentioned position sensor may include at least one of the following: a GPS (Global Positioning System) receiver, an IMU (Inertial Measurement Unit), etc.; the above-mentioned vision sensor may include at least one of the following: a camera, a video camera, a webcam, etc. Those skilled in the art should understand that the vision sensor and the position sensor are not limited to the above.
  • the vehicle positioning information may be positioning information synchronized with each frame of the vehicle-collected images. It may be GPS positioning information, IMU positioning information, or fusion information of the GPS positioning information and the IMU positioning information.
  • the fusion information is a more reliable positioning result obtained from the GPS positioning information and the IMU positioning information. It can be obtained by Kalman filtering the GPS and IMU positioning information, or calculated by averaging or weighted-averaging them.
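As a sketch of the weighted-average option (a full Kalman filter would additionally propagate state and covariance over time), one common weighting is by inverse variance; the variance arguments here are assumed inputs, not something specified by the disclosure:

```python
def fuse(gps, imu, gps_var, imu_var):
    """Inverse-variance weighted average of GPS and IMU position estimates.

    gps, imu  -- equal-length coordinate tuples
    *_var     -- scalar variances of the two estimates
    """
    w_gps = 1.0 / gps_var
    w_imu = 1.0 / imu_var
    return tuple((w_gps * g + w_imu * i) / (w_gps + w_imu)
                 for g, i in zip(gps, imu))
```

With equal variances this reduces to a plain average; a noisier sensor is weighted down proportionally.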
  • the map image includes all or most of the road elements of the road section.
  • the vehicle acquisition image acquired during the positioning process is a partial area image of the road.
  • the scope of this local area can be set.
  • the local area corresponding to the vehicle positioning information is also related to the field of view of the visual sensor.
  • S103 Project the road information on the map image onto the vehicle acquisition image to mark the road information on the vehicle acquisition image.
  • the map image contains rich and accurate road information
  • projecting the road information on the map image onto the vehicle acquisition image essentially marks the road information on the vehicle acquisition image, so that the vehicle acquisition image also contains the above-mentioned road information, which realizes the automatic labeling of the road information.
  • the trained neural network can be used to identify the road information in the collected images of vehicles.
  • the rich road information contained in the map image can be used to automatically label the road information of the image collected by the vehicle, which improves the efficiency of labeling image data, helps reduce the error probability of data labeling, and reduces the labor cost of image data annotation.
  • FIG. 2 is a schematic flowchart of another method for automatically labeling image data according to an embodiment of the present disclosure.
  • the method may include the following steps:
  • S201 Acquire vehicle positioning information, map images, and vehicle collection images, where the map images include road information.
  • the map image of the road can be collected by the collection vehicle, and the attribute information of the map road element in the map image can be recognized.
  • the collected map images can be stored in a server or a vehicle-mounted terminal.
  • a visual sensor is provided on the collection vehicle to collect map images.
  • the visual sensor may include at least one of the following: a camera, a video camera, a camera, and so on.
  • the visual sensor configured in the collection vehicle can be a high-precision visual sensor, so that a map image with high definition and high accuracy can be collected.
  • the visual sensor used to collect the image collected by the vehicle can be a sensor with a relatively low accuracy.
  • the collection vehicle may also be equipped with a high-precision position sensor to obtain the location information of the collection vehicle more accurately.
  • the position sensor that obtains the vehicle positioning information can be a position sensor with a lower positioning accuracy, or an existing position sensor in the vehicle can be used.
  • the attribute information of the map road element may include at least one of the following: semantic information, location information, shape information, and so on.
  • the above attribute information can be obtained by using a trained neural network for road element detection.
  • the above neural network can be trained by road images with label information (may be called sample road images).
  • the road elements in the sample road images have label information.
  • the label information can be attribute information of the sample road elements, including but not limited to one or more of the following: semantic information, shape information, location information, etc.
  • Training the neural network through sample road images can make the model have the ability to recognize the attribute information of road elements in the input road image.
  • the attribute information of the map road element in the image can be output.
  • the neural network's ability to recognize the types of road elements depends on the type of sample road elements used in the training process.
  • the model can be trained with more types of sample road elements to make it have higher recognition capabilities.
  • Figure 3A-3C show the effect diagrams of recognizing the semantic information of road elements.
  • Figure 3A is a road image input to the neural network model, which can be a vehicle acquisition image, a map road image, or other road images
  • Figure 3B shows a road element identified by the neural network model, shown as the horizontal thick solid line 31; its semantic information is obtained as "stopline" 32, which is marked on the upper left of the picture
  • Figure 3C shows the road elements recognized by the neural network model, shown as the lines 33, for which the basic semantic information and the specific semantic information of each line are obtained.
  • the basic semantic information is "laneline", and the specific semantic information is (from left to right): "white solid line", "white solid line", "white solid line", "white solid line", and "right edge", all marked on the upper left of the picture, as shown at 34 on the top left of Figure 3C.
  • the identification method of the map road element attribute information is not limited to the above, and can also be obtained by other identification methods.
  • S202 Determine the range of the local area corresponding to the vehicle positioning information according to the vehicle positioning information and the range of the map image.
  • the map image for a road includes all or most of the road elements of the road.
  • the vehicle acquisition image acquired during the positioning process is a partial area image of the road.
  • the range of the local area can be set manually. Therefore, the range of the local area corresponding to the vehicle positioning information can be determined according to empirical values and the like.
  • the local area corresponding to the vehicle positioning information is related to the field of view of the vision sensor. Therefore, the range of the local area corresponding to the vehicle positioning information can also be determined according to the field of view of the visual sensor.
  • S203 Take the map image in the local area as a root node, and sequentially query the attribute information of the map road elements on the map image in the local area.
  • This step is used to obtain the attribute information of the map road element on the map image within the local area.
  • the tree-like hierarchical relationship can be used to query the road information: take the map image within the local area as the root node; under the root node there are several rectangular areas, each of which is set with a center point to represent the location of the area, and each rectangular area corresponds to a map road element on the map image; then the attribute information of the corresponding map road elements is queried.
  • S204 Convert the map image based on the world global coordinate system to the vehicle body coordinate system to obtain the map image based on the vehicle body coordinate system.
  • the map image uses the world global coordinate system, and any point in the coordinate system has a unique corresponding coordinate (longitude and latitude information) on the earth.
  • the world global coordinate system may be, for example, an ECEF (Earth-Centered, Earth-Fixed) coordinate system, which is centered on and fixed to the earth.
  • FIG. 4A shows such a right-handed Cartesian coordinate system.
  • the Cartesian coordinate system takes the center of the earth as the origin of coordinates.
  • the direction from the origin to the intersection of the prime meridian and the equator (0 degrees latitude) is the positive direction of the x-axis
  • the direction from the origin to the north pole is the positive direction of the z-axis
  • the length is in meters.
  • the vehicle body coordinate system is also a right-handed Cartesian coordinate system, with the on-board high-precision inertial navigation center as the origin, the forward direction of the vehicle as the positive direction of the x-axis, and the left side of the vehicle as the positive direction of the y-axis.
  • the length is in meters.
  • the world global coordinate system and the vehicle body coordinate system are both right-handed Cartesian rectangular coordinate systems, and the conversion between the two right-handed Cartesian rectangular coordinate systems requires only one rotation and translation matrix.
  • S204 includes: obtaining the rotation angle and translation amount of the rotation and translation matrix according to the vehicle positioning information; and according to the rotation and translation matrix, transforming the map image based on the world global coordinate system to the vehicle body coordinate system to obtain the vehicle body coordinate system Map image.
  • the rotation angle and translation amount of the rotation translation matrix between the world global coordinate system and the vehicle body coordinate system are determined. Therefore, the map image based on the world global coordinate system can be converted to the vehicle body coordinate system according to the rotation and translation matrix, and the map image based on the vehicle body coordinate system can be obtained.
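The rotation and translation of S204 can be sketched in two dimensions as follows. The body-frame convention (vehicle forward is the positive x-axis) follows the description above; the function name, the heading convention (angle measured from the world x-axis), and the sample values are illustrative assumptions.

```python
import math

def world_to_body(point, vehicle_pos, heading):
    """Rotate/translate a world-frame 2-D point into the vehicle body frame.

    heading: angle of the vehicle's forward direction, measured from the
    world x-axis (hypothetical convention for this sketch).
    """
    dx = point[0] - vehicle_pos[0]   # translate: vehicle becomes the origin
    dy = point[1] - vehicle_pos[1]
    c, s = math.cos(-heading), math.sin(-heading)
    # rotate by -heading so the vehicle's forward direction becomes +x
    return (c * dx - s * dy, s * dx + c * dy)

# A point 5 m due "north" of a "north"-facing vehicle ends up 5 m ahead (+x).
p = world_to_body((0.0, 5.0), (0.0, 0.0), heading=math.pi / 2)
```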
  • S205 Convert the map image based on the vehicle body coordinate system to the camera coordinate system and/or the pixel coordinate system, so as to project the road information on the map image onto the vehicle collection image.
  • the road information on the map image is projected onto the vehicle acquisition image, and the vehicle acquisition image is based on the camera coordinate system or the pixel coordinate system. Therefore, the above-mentioned map image based on the vehicle body coordinate system needs to be converted to the camera coordinate system or the pixel coordinate system. FIG. 4C shows a schematic diagram of the camera coordinate system and the pixel coordinate system, where the camera coordinate system o-x-y-z is three-dimensional and the pixel coordinate system o'-x'-y' is two-dimensional.
  • the map image is a two-dimensional map
  • S205 includes: obtaining a homography matrix between the pixel coordinate system and the vehicle body coordinate system; adopting a homogeneous coordinate system to represent the map based on the vehicle body coordinate system Image; and according to the homography matrix, the map image based on the vehicle body coordinate system represented by the homogeneous coordinate system is converted to the pixel coordinate system, and the road information of the vehicle acquisition image projected on the pixel coordinate system is obtained.
  • the transformation from the vehicle body coordinate system to the pixel coordinate system can be completed by using a homography matrix transformation.
  • the homography matrix means that a three-dimensional object can be projected onto multiple two-dimensional planes, and the homography matrix can transform the projection of a three-dimensional object on a certain two-dimensional plane to the projection of another two-dimensional plane.
  • the homography matrix can be solved through algebraic analysis.
  • the homography matrix between the pixel coordinate system and the vehicle body coordinate system can be calibrated in advance through manual calibration data.
  • in this way, the plane-to-plane transformation can be completed: the homogeneous coordinate system is used to represent the road information on the map image, and the coordinates of each piece of road information are multiplied by the homography matrix to obtain the road information based on the pixel coordinate system.
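A minimal sketch of the homography step described above: a body-frame ground point is expressed in homogeneous coordinates, multiplied by a pre-calibrated 3x3 homography matrix, and dehomogenized into pixel coordinates. The matrix values here are hypothetical, not a real calibration.

```python
def apply_homography(H, x, y):
    """Map (x, y) through a 3x3 homography H using homogeneous coordinates."""
    # (x, y) -> (x, y, 1), then multiply by H
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)   # dehomogenize back to 2-D

# A toy homography that simply scales by 2: maps (2, 3) to (4, 6).
H = [[2.0, 0.0, 0.0],
     [0.0, 2.0, 0.0],
     [0.0, 0.0, 1.0]]
px = apply_homography(H, 2.0, 3.0)
```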
  • the map image is a three-dimensional map
  • S205 includes: converting the map image based on the vehicle body coordinate system to the camera coordinate system according to the rotation and translation matrix between the vehicle body coordinate system and the camera coordinate system, to obtain the road information of the vehicle collected image projected in the camera coordinate system; and converting the road information of the vehicle collected image projected in the camera coordinate system to the pixel coordinate system according to the projection matrix between the camera coordinate system and the pixel coordinate system, to obtain the road information of the vehicle collected image projected in the pixel coordinate system.
  • the internal and external parameters of the camera can be used to complete the conversion between the car body coordinate system, the camera coordinate system, and the pixel coordinate system.
  • the principle of camera imaging is pinhole imaging.
  • the internal parameters of the camera refer to the focal length of the convex lens of the camera and the coordinates of the optical center in the pixel coordinate system; the external parameters of the camera refer to the rotation and translation matrix between the camera coordinate system and the vehicle body coordinate system.
  • the camera coordinate system is a right-handed Cartesian coordinate system with the optical center of the camera as the origin and the direction the camera faces as the positive direction of one of its axes.
  • the road information on the map image is rotated and translated to the camera coordinate system through the camera extrinsic parameters, and then the road information based on the camera coordinate system is projected to the pixel coordinate system based on the scaling principle of pinhole imaging and the camera intrinsic parameters, to obtain the road information projected onto the image collected by the vehicle.
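Once a point has been rotated and translated into the camera frame by the extrinsic parameters, the intrinsic projection described above can be sketched as a standard pinhole projection. The focal lengths and principal point below are hypothetical, and z-forward is an assumed axis convention.

```python
def project_point(pt_cam, fx, fy, cx, cy):
    """Project a camera-frame 3-D point (z pointing forward) to pixels.

    fx, fy: focal lengths in pixels; (cx, cy): principal point (optical
    center in the pixel coordinate system). All values are illustrative.
    """
    x, y, z = pt_cam
    # pinhole scaling: farther points shrink toward the principal point
    return (fx * x / z + cx, fy * y / z + cy)

# A point 10 m ahead and 1 m to the side, with f = 1000 px and a
# principal point at (640, 360):
uv = project_point((1.0, 0.0, 10.0), 1000.0, 1000.0, 640.0, 360.0)
```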
  • S206 Perform road information extraction processing on the vehicle collected image via a neural network for extracting road information to obtain perceived road information.
  • the perceived road information includes the attribute information of the perceived road element.
  • the attribute information of the perceived road element may include one or more kinds of information related to the perceived road element, such as semantic information, location information, shape information, etc. of the road element.
  • the perceived road elements can include road-related markings and signs, and can include at least one of the following: lane lines, stop lines, and turning lines on the road, as well as traffic signs, traffic lights, street lamps, etc. set beside or in front of the road.
  • the types of perceptual road elements and map road elements can be all the same, or they can be partly the same.
  • the perceived road elements should overlap with the map road elements in the map.
  • This overlap can refer to the overlap of perceived road elements and map road elements in the same coordinate system.
  • the road information projected from the map image may not completely overlap with the actual road information in the vehicle collected image, so it is necessary to correct the road information projected onto the vehicle collected image.
  • a neural network that has undergone preliminary training can be used to extract road information from images collected by vehicles to obtain perceived road information.
  • the above neural network can be trained by road images with label information (may be called sample road images).
  • the road elements in the sample road images carry label information, and the label information may be attribute information of the sample road elements.
  • the attribute information of the sample road element may include, but is not limited to, one or more of the following: semantic information, shape information, location information, and so on.
  • Training the neural network through sample road images can make the model have the ability to recognize the attribute information of road elements in the input road image.
  • the attribute information of the road elements in the image can be output.
  • the neural network's ability to recognize the types of road elements depends on the type of sample road elements used in the training process.
  • the model can be trained with more types of sample road elements, so that the neural network has a higher recognition ability.
  • the road information projected on the image collected by the vehicle can be corrected according to the perceived road information.
  • S207 includes: determining the offset information between the perceived road elements in the perceived road information and the map road elements in the road information projected onto the vehicle-collected image, and correcting the road information projected onto the vehicle-collected image according to the offset information. This will be described in detail in the following examples.
  • the road information on the map data is projected onto the vehicle collection image by using the rich road information contained in the map data, which realizes the automatic labeling of the road information of the vehicle collection image, improves the efficiency of annotating image data, helps reduce the error probability of data annotation, and reduces the labor cost of image data annotation; furthermore, the road information projected onto the vehicle collection image is corrected according to the perceived road information, which improves the accuracy of image data annotation.
  • Fig. 5 shows a method for determining the offset information between the perceived road element and the map road element. As shown in Fig. 5, the method may include:
  • S301 According to the attribute information of the perceived road element, determine the map road element paired with the perceived road element from the map image.
  • for each perceived road element on the vehicle acquisition image, a map road element paired with it can be obtained on the map. That is, for a perceived road element, if it is neither misrecognized nor newly appeared after the map was created or last updated, then usually a map road element can be found on the map to correspond to it.
  • S302 Determine the position information of the paired perceived road element and the map road element in the same device coordinate system.
  • since the position comparison needs to be performed in the same coordinate system, if the obtained position information of the perceived road element and the position information of the map road element are not in the same coordinate system, they need to be converted to the same coordinate system.
  • the map location information of the map road element is the map location information in the latitude and longitude coordinate system
  • the map location information needs to be converted to the vehicle body coordinate system.
  • the following describes the GPS device coordinate system as an example.
  • the process of the coordinate system conversion can be divided into two steps: first, the map location information is converted from the latitude and longitude coordinate system (for example, the WGS84 coordinate system) to the UTM coordinate system; second, based on the vehicle positioning information, the map road elements are converted from the UTM coordinate system to the GPS device coordinate system.
  • this step can be performed by first rotating by the angle of the front of the vehicle relative to true east, and then translating by the GPS longitude and latitude positioning information (x, y).
  • the conversion from the latitude and longitude coordinate system to the vehicle body coordinate system can be performed according to its conversion rules.
  • S303 Determine a positioning offset between the paired perceived road element and the map road element based on the location information.
  • the positioning offset between them can be determined based on the positions of the two.
  • the paired perceptual road elements and map road elements are converted to the same device coordinate system, and then the position information of the two is used to determine the positioning offset between them.
  • Fig. 6 shows a method for determining map road elements paired with perceived road elements from a map image. As shown in Fig. 6, the method may include:
  • S401 In the map image, search for map road elements within a preset range based on the vehicle positioning information.
  • the vehicle positioning information is the location information of the vehicle itself; for example, for an autonomous vehicle, it is the location information of the autonomous vehicle.
  • the location of the vehicle on the map image can be determined, so that map road elements within a set range can be found in the map image, that is, map road elements near the vehicle.
  • the perceptual road element of the image collected by the vehicle is the road element located near the vehicle during the positioning process of the vehicle. Therefore, finding the map road elements near the vehicle on the map is the most likely and fastest way to find the map road elements paired with the perceived road elements.
  • the preset range can be set according to requirements. For example, if high matching accuracy is required, the range can be set relatively large, so that more map road elements can be obtained to pair with the perceived road elements in the subsequent process; if real-time performance is required and faster matching is desired, the range can be set relatively small.
  • the preset range may be a range of 2 to 5 times the difference between the visual range of the visual sensor and the initial positioning recognition on the map with the vehicle positioning information as the center point, thereby weighing the matching speed and accuracy.
  • for example, in the case that the visual range of the visual sensor is 60m and the initial positioning error on the map is 10m, the preset range can be set to (60+10)*2 = 140m. That is to say, in this case, the preset range can be a 140m*140m rectangular frame centered on the vehicle positioning.
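The arithmetic behind the (60+10)*2 example above can be made explicit; the 60 m visual range and 10 m initial positioning error are values implied by the text, and the variable names are illustrative.

```python
# hypothetical values implied by the (60+10)*2 example in the text
visual_range_m = 60        # how far the visual sensor can see
positioning_error_m = 10   # initial positioning error on the map

# half-width of the search square, then the full side length
half_width = visual_range_m + positioning_error_m
side = half_width * 2      # 140 m x 140 m square centered on the vehicle
```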
  • S402 Pair the perceived road elements in the image collected by the vehicle with the map road elements within a preset range based on the attribute information to obtain multiple pairing schemes.
  • each perceived road element in the vehicle acquisition image can be paired with each map road element within the preset range through enumeration, to obtain a variety of different pairing schemes.
  • the above-mentioned different pairing schemes differ in that at least one perceived road element is paired with a different map road element within the preset range.
  • the perceptual road elements in the image collected by the vehicle include a1, a2,..., aM
  • the map road elements within the above preset range include b1, b2,..., bN, where M and N are both positive integers, and N is greater than or equal to M.
  • the number of map road elements is more than, or at least equal to the number of perceived road elements.
  • each of the resulting pairing schemes is a set of two-tuples, and each two-tuple (ai, bj) is one pairing of road elements.
  • the perceptual road elements (a1, a2,..., aM) need to be all paired, and the map road elements (b1, b2,..., bN) may contain elements for which no matching target is found.
  • at least one set of two-tuples (ai, bj) is different.
  • the pairwise pairing of perceived road elements and map road elements can be realized through a bipartite graph model. The steps include constructing a bipartite graph model based on the perceived road elements and the map road elements: each perceived road element in the vehicle's collected image is abstracted as a point, and all perceived road elements form a perception point set; each map road element in the map is likewise abstracted as a point, and all map road elements form a map point set.
  • the perceived road elements with the same semantics can be sorted in order from the left to the right of the vehicle; the map road elements with the same semantics in the map are sorted in a similar way, and the points in the corresponding point sets are arranged according to the order of the road elements.
  • the perception point set and the map point set are connected by edges, and each edge represents the pairing relationship between a perception road element and a map road element. Different connection methods produce different pairing schemes, and each pairing scheme obtained is an edge set.
  • a bipartite graph matching method based on the above model can also be used to obtain a reasonable matching scheme from all the matching schemes.
  • the method includes: among all edge sets, selecting the edge set containing as many mutually disjoint (non-crossing) edges as possible.
  • the disjointness mentioned here means that two edges have no common point and do not cross, that is, the sequence numbers of the two vertices of one edge in their respective point sets are both greater than the sequence numbers of the corresponding vertices of the other edge; it can therefore also be understood as disjoint in the physical sense.
  • An edge set with a number of disjoint edges greater than a set ratio or a set threshold can be called a reasonable edge set, that is, a reasonable pairing scheme is obtained, as shown in Figure 7, for example.
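With both point sets ordered left to right as described, whether two pairing edges are "disjoint" reduces to whether their endpoint orders agree on both sides. A sketch of that check and of counting disjoint edge pairs in a candidate edge set; the index convention and the treatment of shared endpoints are illustrative assumptions.

```python
def edges_cross(e1, e2):
    """Edges (i1, j1) and (i2, j2) cross when their endpoint orders disagree."""
    (i1, j1), (i2, j2) = e1, e2
    return (i1 - i2) * (j1 - j2) < 0   # opposite order on the two sides

def count_disjoint_pairs(edge_set):
    """Count pairs of edges in the set that do not cross each other."""
    n = len(edge_set)
    return sum(1 for a in range(n) for b in range(a + 1, n)
               if not edges_cross(edge_set[a], edge_set[b]))

# (0,0),(1,1),(2,2) is order-preserving: all 3 pairs are disjoint.
good = count_disjoint_pairs([(0, 0), (1, 1), (2, 2)])
# (0,1),(1,0) swaps the order: its single pair crosses.
bad = count_disjoint_pairs([(0, 1), (1, 0)])
```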
  • confidence is an evaluation index of the pairing between perceived road elements and map road elements.
  • each perceived road element is paired with a map road element. The higher the consistency of semantic information between the two and the greater the number of matched pairs, the higher the confidence of the pairing scheme.
  • the confidence of each pairing scheme can be determined in the following way:
  • A. Determine the individual similarity. The individual similarity can refer to the similarity of the attribute information of the two elements in each two-tuple in the pairing scheme. For example, it may include the similarity of semantic information, the similarity of position information, the similarity of shape information, and so on.
  • the individual similarity between the perceived lane line and the map lane line can be calculated by the following formula (1), where the perceived lane line can refer to the lane line in the image collected by the vehicle, and the map lane line can refer to the map Lane line.
  • Weight(i,j) = -Distance(i,j) + O_type(i,j)*LaneWidth + O_edgetype(i,j)*LaneWidth    (1)
  • Weight(i,j) represents the individual similarity between the i-th (counted from left to right, the same below) perceptual lane line and the j-th map lane line, which can also be called the weight;
  • Distance(i,j) represents the distance between the i-th perceived lane line and the j-th map lane line.
  • the lane line is abstracted as a line segment.
  • the distance calculation method can be the Euclidean distance from line segment to line segment, that is, the average of the distances from the two endpoints of one line segment to the other line segment; LaneWidth represents the lane width, that is, the width between two lane lines; O_type(i,j) is 1 if and only if the lane line attributes of the i-th perceived lane line and the j-th map lane line are the same, and 0 otherwise, where the lane line attributes can include lane line color, line type, etc., such as yellow solid line and white dashed line; O_edgetype(i,j) is 1 if and only if the edge lane line attributes of the i-th perceived lane line and the j-th map lane line are the same, and 0 otherwise, where the edge lane line attribute indicates whether the lane line belongs to the road edge.
  • among them, Distance(i,j) is used to measure the similarity of the position information between the perceived lane line and the map lane line, LaneWidth is used to measure the similarity of the shape information between them, and O_type(i,j) and O_edgetype(i,j) are used to measure the similarity of the semantic information between them.
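Formula (1) can be transcribed directly as a sketch; the segment distance, lane width, and attribute flags are supplied by the caller, and all names and values below are illustrative.

```python
def lane_pair_weight(distance, lane_width, same_type, same_edge_type):
    """Individual similarity Weight(i, j) of a lane-line pair, per formula (1)."""
    o_type = 1 if same_type else 0        # O_type(i, j)
    o_edge = 1 if same_edge_type else 0   # O_edgetype(i, j)
    return -distance + o_type * lane_width + o_edge * lane_width

# A pair 0.5 m apart with matching line type but a different edge
# attribute, assuming a 3.5 m lane width:
w = lane_pair_weight(distance=0.5, lane_width=3.5,
                     same_type=True, same_edge_type=False)
```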
  • B. Determine the overall similarity. The overall similarity can be an overall evaluation of the similarity of the attribute information of all two-tuples in a pairing scheme.
  • the attribute information may include location information and semantic information.
  • the overall similarity of location information can be represented by the variance of the distance between two elements in all binary pairs. The smaller the variance, the closer the distance between the two elements in all binary pairs, and the higher the overall similarity of the position information; for the overall similarity of semantic information, the semantic information similarity of the two elements in all binary pairs can be used Perform average or weighted average calculation to obtain.
  • C. Determine the confidence of each pairing plan according to the individual similarity and the overall similarity of each pairing plan. For example, in each pairing scheme, the sum of the individual similarities of each binary group can be averaged with the overall similarity, or weighted average, to obtain the confidence of the pairing scheme.
  • in this way, the confidence of the pairing scheme is comprehensively evaluated, avoiding an extreme influence (extremely good or extremely poor) of an individual pairing on the confidence of the entire scheme, which makes the calculation result of the confidence more reliable.
  • Formula (2) is an example of a function for calculating the confidence score of the pairing scheme, which is to calculate the score through three parts: the sum of individual similarities, the overall similarity of distance information, and the overall similarity of semantic information.
  • match_weight_sum = sum(match_items_[pr_idx][hdm_idx].weight) + CalculateVarianceOfMatchResult(match_result) + CalculateMMConfidence(match_result)    (2)
  • match_weight_sum represents the confidence score of a pairing scheme
  • sum(match_items_[pr_idx][hdm_idx].weight) represents the sum of the individual similarities of the two-tuples in the pairing scheme, which is calculated by summing the weights of the edges selected in the pairing scheme, that is, summing the weights of the edges corresponding to each matched pair of points;
  • CalculateVarianceOfMatchResult(match_result) represents the overall similarity of the distance information of each two-tuple in the pairing scheme, which is calculated by the variance of the distance between two elements in each two-tuple in the pairing scheme.
  • the variance is the variance of all these distances. Theoretically, the distance between all paired perception lane lines and map lane lines should be equal, that is, the variance is zero, but in fact, because of the inevitable introduction of errors, the variance may not be zero;
  • CalculateMMConfidence(match_result) represents the overall similarity of the semantic information of the two-tuples in the pairing scheme, which is calculated by comparing the semantic similarity between the two elements in each two-tuple. Still taking lane lines as an example, it can be judged whether the attributes of all paired lane lines are the same and whether their numbers match. For example, if the attributes are all consistent, the confidence is 100%; for each pair of lane lines whose attributes are inconsistent, the confidence can be set to decrease by 10%; and if the numbers do not match, the confidence directly decreases by 30%.
  • the confidence score of the pairing scheme can be obtained.
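A hedged sketch of the three-part score of formula (2). The semantic term follows the description above (100% baseline, minus 10% per attribute-inconsistent pair). Since a smaller distance variance should indicate a better pairing, this sketch subtracts the variance term; that sign convention, the helper names, and all values are assumptions for illustration.

```python
def variance(values):
    """Population variance of a list of distances."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

def semantic_confidence(pairs_consistent):
    """Start at 100% and drop 10% for each attribute-inconsistent pair."""
    conf = 1.0
    for ok in pairs_consistent:
        if not ok:
            conf -= 0.10
    return max(conf, 0.0)

def scheme_score(weights, distances, pairs_consistent):
    """Three-part pairing-scheme score in the spirit of formula (2)."""
    return (sum(weights)                       # sum of individual similarities
            - variance(distances)              # distance consistency (sign assumed)
            + semantic_confidence(pairs_consistent))

# Three pairs with nearly equal distances and one semantic mismatch:
score = scheme_score(weights=[3.0, 2.5, 3.0],
                     distances=[3.4, 3.5, 3.6],
                     pairs_consistent=[True, True, False])
```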
  • S404 Determine the map road element that is paired with the perceived road element in the matching scheme with the highest confidence level or exceeding the set threshold among the multiple matching schemes.
  • the scheme with the highest confidence can be used as the finally selected pairing scheme, or a pairing scheme whose confidence exceeds the set threshold can be used as the finally selected pairing scheme, so that the map road elements paired with the perceived road elements can be determined.
  • the map road elements near the device are obtained on the map by using the vehicle positioning information to be paired with the perceived road elements.
  • when a perceived road element in the vehicle-collected image cannot be paired with any map road element within the preset range, an empty or virtual element is set among the map road elements to be paired, to be paired with the perceived road element.
  • ideally, the perceived road elements in the image collected by the vehicle correspond one-to-one to the map road elements in the map; however, when a perceived road element is the result of misrecognition, or the perceived road element only appeared after the map was established, the map road element corresponding to that perceived road element cannot be found.
  • in this way, all perceived road elements have matching objects in the process of determining the matching scheme, making the matching schemes richer and facilitating a comprehensive evaluation of the best matching scheme.
  • Fig. 8 shows a method for determining the positioning offset between the paired perceived road element and the map road element. As shown in Fig. 8, the method includes:
  • S501 Sampling the pixel points of the perceptual road element to obtain a set of perceptual sampling points.
  • the pixel points of the perceptual road element may be sampled at a fixed interval (for example, 0.1 m) to obtain a set of perceptual sampling points.
  • the perceived lane line can be abstracted as a set of points.
  • the lane lines can be arranged in the order from the left to the right of the vehicle, and the corresponding point sets are arranged from top to bottom according to the order of the lane lines.
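The fixed-interval sampling of a lane line can be sketched as below, assuming the lane line is given as a polyline of (x, y) vertices in metres; the 0.1 m interval is the example value from the text:

```python
import math

def sample_polyline(vertices, interval=0.1):
    """Walk along the polyline and emit a point every `interval` metres."""
    points = [vertices[0]]
    carried = 0.0  # distance already covered toward the next sample
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        d = interval - carried
        while d <= seg:
            t = d / seg
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            d += interval
        carried = seg - (d - interval)
    return points

# A straight 1 m lane-line segment sampled at 0.1 m yields 11 points.
pts = sample_polyline([(0.0, 0.0), (1.0, 0.0)], interval=0.1)
```

Each perceived lane line then becomes one such point set, and the point sets can be ordered left to right as the text describes.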
  • S502 Sampling the pixel points of the map road element to obtain a map sampling point set.
  • a method similar to step S501 may be applied to sample the map road elements to obtain a map sampling point set.
  • S503 Determine a rotation and translation matrix between the sampling points included in the sensing sampling point set and the map sampling point set.
  • the closest point iteration method may be used to calculate the rotation and translation matrix between the two point sets.
  • Figure 9 shows a schematic diagram of the closest point iteration method.
  • the left side of the arrow represents the two associated point sets (paired point sets) input to the algorithm model.
  • applying the algorithm model, for example a least squares algorithm model, yields a rotation and translation matrix; by applying the rotation and translation matrix to the input point set, the two point sets can be made to overlap.
  • the right side of the arrow shows the two overlapping point sets.
  • S504 Obtain a coordinate offset and a direction offset between the perceived road element and the map road element based on the rotation and translation matrix.
  • the rotation and translation matrix obtained in step S503 is the required positioning offset: the translation coefficient in the rotation and translation matrix corresponds to the coordinate offset, and the rotation coefficient corresponds to the direction offset.
  • the vehicle positioning information may be expressed as (x0, y0, ⁇ 0), and the positioning offset may be expressed as (dx, dy, d ⁇ ).
  • the positioning information obtained by correcting the vehicle positioning information can be as in formula (3).
  • methods such as Kalman filtering, mean value calculation, and weighted average calculation can be used to fuse the obtained positioning information with the vehicle positioning information, avoiding excessive correction of the positioning information by the map information and making the image data labeling more reliable.
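Assuming formula (3) is the simple additive correction of the pose (x0, y0, θ0) by the offset (dx, dy, dθ), the correction and a weighted-average fusion (the weight w is an illustrative choice, not specified by the text) might look like:

```python
def correct_pose(pose, offset):
    """Apply the positioning offset to the vehicle pose (additive form)."""
    x0, y0, theta0 = pose
    dx, dy, dtheta = offset
    return (x0 + dx, y0 + dy, theta0 + dtheta)

def fuse_poses(corrected, original, w=0.8):
    """Weighted average of the corrected and raw poses, so the map
    information cannot over-correct the vehicle positioning."""
    return tuple(w * c + (1.0 - w) * o for c, o in zip(corrected, original))

pose = (10.0, 5.0, 0.10)     # (x0, y0, theta0)
offset = (0.2, -0.1, 0.01)   # (dx, dy, dtheta)
fused = fuse_poses(correct_pose(pose, offset), pose, w=0.8)
```

A Kalman filter would replace the fixed weight with one derived from the noise covariances of the two estimates.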
  • the image data automatic labeling method provided by the embodiments of the present disclosure can complete automatic labeling of the positions and attributes of static road elements such as lane lines, stop lines, and signs in road images. The automatic labeling algorithm is based on high-precision maps, which contain a wealth of road elements and have centimeter-level accuracy. High-precision maps are one of the basic modules in automatic driving; at present they are widely used and have mature acquisition schemes. The automatic labeling method of image data depends only on high-precision maps: if the high-precision map is sufficiently reliable, the labeling algorithm can achieve a sufficiently high accuracy rate. The automatic labeling method of image data is an accessory product of the automatic driving system, so no additional cost is required.
  • the main principle is to obtain high-precision map information near the vehicle through high-precision positioning information, and use the on-board camera parameters to project the map elements onto the road image to obtain the corresponding road element location and semantic information.
  • the embodiments of the present disclosure also provide a high-precision map building solution, which uses a low-cost high-precision positioning solution to assist in completing automatic labeling of image data.
  • the image data automatic labeling method is an accessory product of the automatic driving system, which uses the existing high-precision maps, high-precision positioning schemes, on-board camera calibration schemes, and auxiliary positioning and detection annotations of the automatic driving system.
  • image lane line detection is based on a deep learning model.
  • the image data automatic labeling method first obtains the map information near the vehicle from the high-precision map according to the high-precision positioning scheme, then projects the map road elements into the road image according to the calibration parameters of the on-board camera, then uses the image lane line detection deep learning model to extract the offset between the detected lane lines and the projected lane lines to calibrate the projection function, and finally obtains image road element labeling results with higher accuracy and precision.
  • the high-precision map used in the image data automatic labeling method can be obtained by simply processing the laser point cloud data obtained by the automatic driving data collection vehicle.
  • the point cloud of road elements such as lane lines and stop lines can be obtained by filtering the reflectivity of the laser point cloud, and then template matching, clustering and fitting methods are used to finally obtain a high-precision map containing rich road elements.
  • the method for automatically labeling image data provided by the present disclosure includes three parts: a map query module, a map information projection module, and a projection error correction module.
  • a map query module, based on high-precision maps and high-precision positioning information, comprehensively uses vehicle-mounted GPS positioning equipment, vehicle-mounted high-precision inertial navigation equipment, and vision-based positioning correction information to obtain a positioning result of at least decimeter-level accuracy, and then, based on this positioning result, queries the high-precision map for road information in a 100 m area around the location of the car, including lane line and stop line positions and attribute information.
  • the map information projection module supports two projection methods: the first is based on 2D (2-Dimension) map information and a pre-calibrated camera homography matrix; the second is based on 3D (3-Dimension) map information and pre-calibrated camera internal and external parameters.
  • the two projection methods are essentially spatial transformations of geometric data, but one is the affine transformation from 2D space to 2D space, and the other is the projection transformation from 3D space to 2D space.
  • the projection error correction module uses the pre-prepared lane line detection deep learning model to extract the lane line position and attribute information in the image, and then minimizes the error between the extracted lane line and the projected lane line, and optimizes the projection function. Obtain the optimized label information of the position and attributes of road elements such as lane lines and stop lines.
  • the input of the map query module is high-precision map and high-precision positioning information
  • the output is local map information near the positioning location.
  • the embodiments of the present disclosure are based on three coordinate systems: the world global coordinate system (including the WGS84 latitude and longitude coordinate system and the ECEF geocentric coordinate system), the vehicle body coordinate system, and the camera image pixel coordinate system; the three coordinate systems are shown in Figure 4A, Figure 4B and Figure 4C.
  • the high-precision map uses the world's global coordinate system, and any point in the coordinate system has unique coordinates on the earth, such as latitude and longitude information; among them, the WGS84 latitude and longitude coordinate system uses radian values to represent point coordinate information, so it is inconvenient to use.
  • the ECEF geocentric coordinate system is used.
  • the vehicle body coordinate system is also a right-hand Cartesian rectangular coordinate system
  • the camera image pixel coordinate system is a two-dimensional rectangular coordinate system with pixels as the unit.
  • the query is a recursive process. First, find the area closest to the positioning position, and then find the road closest to the positioning position and the corresponding lane line and stop line information in turn.
  • the KD tree is used to sequentially store the coordinates of each layer of the map to speed up the query process.
  • these road elements need to be converted from the world global coordinate system to the vehicle body local coordinate system; the conversion between the two right-hand Cartesian rectangular coordinate systems only requires one rotation and translation matrix, whose rotation angle and translation amount are obtained from the positioning information.
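The nearest-element query and the world-to-body conversion can be sketched as follows; a linear scan stands in for the KD tree mentioned above, and the field names are hypothetical:

```python
import math

def nearest(candidates, position):
    """Find the map element closest to the positioning position.
    (In practice each map layer is stored in a KD tree to speed this up.)"""
    px, py = position
    return min(candidates, key=lambda c: math.hypot(c["x"] - px, c["y"] - py))

def world_to_body(point, vehicle_pose):
    """Rotate and translate a world-frame point into the vehicle body frame,
    using the rotation angle and translation from the positioning information."""
    vx, vy, heading = vehicle_pose
    dx, dy = point[0] - vx, point[1] - vy
    c, s = math.cos(-heading), math.sin(-heading)
    return (c * dx - s * dy, s * dx + c * dy)

roads = [{"id": "r1", "x": 0.0, "y": 0.0}, {"id": "r2", "x": 50.0, "y": 0.0}]
road = nearest(roads, (48.0, 2.0))
local = world_to_body((51.0, 0.0), (50.0, 0.0, 0.0))
```

The same `nearest` step is repeated recursively per layer (area, road, lane line, stop line), as the query process above describes.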
  • the road elements on the map need to be filtered at the end, and only the lane lines and stop lines within the camera's field of view are retained.
  • the road element information in the occluded part of the field of view can be filtered as needed. For example, nearby objects will occlude distant objects, but this step is not necessary.
  • the input of the map information projection module is local map information near the positioning location
  • the output is the map road element in the pixel coordinate system of the camera image.
  • the accuracy of the height information in the map is low.
  • the homography matrix between the camera image pixel coordinate system and the vehicle body coordinate system can be pre-calibrated through manual calibration data (the matrix is a 3*3 matrix, has 8 degrees of freedom, and can complete the affine transformation from one plane to another). Then it is only necessary to represent the map element information in the homogeneous coordinate system; multiplying the coordinates of each map element by the homography matrix yields the map road elements in the camera image pixel coordinate system.
  • the principle of camera imaging is pinhole imaging.
  • the internal parameters of the camera refer to the focal length of the convex lens of the camera and the coordinates of the optical center in the pixel coordinate system; the external parameters of the camera refer to the rotation and translation matrix between the camera coordinate system and the vehicle body coordinate system.
  • the camera coordinate system is the right-hand Cartesian coordinate system with the camera's optical center as the origin, and the upper and front of the camera are the positive directions of the y-axis and z-axis respectively.
  • the map road elements are rotated and translated into the camera coordinate system through the camera external parameters, and then projected into the camera image pixel coordinate system according to the scaling principle of pinhole imaging and the camera internal parameters.
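A sketch of this two-step projection, with illustrative extrinsics (identity rotation, zero translation) and intrinsics (fx, fy, cx, cy); a calibrated camera would supply the real values:

```python
def project_point(p_body, R, t, fx, fy, cx, cy):
    """Rotate/translate a body-frame point into the camera frame (extrinsics),
    then apply the pinhole scaling with the intrinsics (fx, fy, cx, cy)."""
    # Camera frame: optical centre at the origin, z pointing forward.
    xc = sum(R[0][i] * p_body[i] for i in range(3)) + t[0]
    yc = sum(R[1][i] * p_body[i] for i in range(3)) + t[1]
    zc = sum(R[2][i] * p_body[i] for i in range(3)) + t[2]
    u = fx * xc / zc + cx
    v = fy * yc / zc + cy
    return (u, v)

# Illustrative extrinsics: camera axes aligned with the body frame.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
uv = project_point((1.0, 0.5, 10.0), R, t, fx=1000.0, fy=1000.0, cx=640.0, cy=360.0)
```

The division by `zc` is the pinhole scaling: points farther from the camera project closer to the principal point (cx, cy).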
  • the input of the projection error correction module is the map road element in the pixel coordinate system and the perceptual lane line information extracted by the deep learning detection segmentation model, and the output is the corrected image label.
  • the map elements projected into the pixel coordinate system may not completely coincide with the real information in the image, so projection error correction is an extremely important part.
  • the existing deep learning detection segmentation model is used to extract all lane line information in the image, and these lane lines are regarded as perceptual lane lines.
  • the characteristic of perceived lane lines is high accuracy of position information, but certain errors in attribute information, quantity information, and completeness. The module mainly uses the offset between the perceived lane lines and the map lane lines to correct the projection function. Correcting the projection function is divided into two steps: the first step is to find the correspondence between the map lane lines and the perceived lane lines; the second step is to minimize the distance between the corresponding lane lines. There is a good total order relationship between lane lines, that is, they are generally arranged from left to right.
  • the map lane lines and the perceived lane lines are both abstracted as points, and a perception point set and a map point set can be obtained.
  • edges are drawn between the perception point set and the map point set; points within the perception point set are not connected to each other, and points within the map point set are not connected to each other, so that a bipartite graph model can be obtained. Therefore, the lane line matching problem can be transformed into a bipartite graph matching problem.
  • each edge of the bipartite graph represents the pairing relationship between a perceived lane line and a map lane line.
  • the bipartite graph model can be seen in Figure 7. Next, weights are assigned to the edges of the bipartite graph.
  • the weight can be set to the degree of similarity between the two lane lines plus the negative of the distance between them.
  • the degree of similarity can be quantified by the similarity of the location information, the similarity of the shape information, and whether the attributes of the lane line match.
  • the distance between two lane lines can be converted into the average distance from one point set to the other curve.
  • the goal is to perform a bipartite graph matching search to find a disjoint edge set with the largest sum of edge weights, where disjoint means that no two edges in the set share an endpoint.
  • the maximum sum of edge weights indicates that the lane line matching scheme is optimal, that is, the map lane line and the perceived lane line have the highest similarity under this matching scheme.
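The matching search can be sketched as a brute-force enumeration of disjoint edge sets (a practical implementation would use, e.g., the Hungarian algorithm); the weight matrix values are illustrative:

```python
from itertools import permutations

def best_matching(weights):
    """Find the assignment of perceived lane lines to map lane lines with the
    maximal edge-weight sum. weights[i][j] is the weight of the edge between
    perceived line i and map line j (similarity minus distance)."""
    n = len(weights)
    best_score, best_pairs = float("-inf"), None
    for perm in permutations(range(len(weights[0])), n):
        score = sum(weights[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_pairs = score, list(enumerate(perm))
    return best_pairs, best_score

# Illustrative weights: three perceived lines vs three map lines.
weights = [[0.9, 0.1, 0.0],
           [0.2, 0.8, 0.3],
           [0.0, 0.4, 0.7]]
pairs, score = best_matching(weights)
```

Because each permutation assigns every perceived line a distinct map line, the enumerated edge sets are disjoint by construction.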
  • the problem is then converted to minimizing the distance between corresponding lane lines.
  • the lane line can be expressed as a curve, and a point set can be obtained by sampling the points on the curve.
  • the final problem is converted to minimizing the distance from one point set to another point set. This problem can be solved by the nearest point iteration (i.e., ICP) algorithm.
  • the steps of the ICP algorithm include: (1) pair the points in the two input point sets according to the closest point pairing principle; (2) substitute the point coordinates into the least squares formula to find the rotation and translation matrix that minimizes the distance between the paired points after the points of one point set are transformed by rotation and translation; (3) use singular value decomposition to solve for the rotation and translation matrix, which is the optimal solution of the optimization problem; applying the rotation and translation matrix realizes the overlap of the two point sets (that is, the overlap of the above-mentioned sensing point set and the map point set).
  • the ICP algorithm can output a correction amount. By adding the correction amount to all map elements, the map road element information that is fully consistent with the road image can be obtained. This information includes location and attribute information, that is, image annotation information.
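The least-squares core of one ICP iteration can be sketched in 2D for already-paired points; the closed-form angle below is the 2D equivalent of the SVD solution mentioned above:

```python
import math

def rigid_align(src, dst):
    """Least-squares rotation angle and translation mapping src onto dst,
    for already-paired 2D point sets (the inner step of ICP)."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n
    dy = sum(p[1] for p in dst) / n
    sxx = sxy = syx = syy = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - sx, ay - sy, bx - dx, by - dy
        sxx += ax * bx
        sxy += ax * by
        syx += ay * bx
        syy += ay * by
    theta = math.atan2(sxy - syx, sxx + syy)
    c, s = math.cos(theta), math.sin(theta)
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, (tx, ty)  # direction offset and coordinate offset

# A pure translation of (0.3, -0.2) is recovered as the correction amount.
src = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
dst = [(x + 0.3, y - 0.2) for x, y in src]
theta, t = rigid_align(src, dst)
```

The returned angle and translation are the correction amount described above; adding it to all map elements aligns them with the road image.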
  • the image data automatic labeling method provided by the embodiments of the present disclosure can be deployed on autonomous vehicles to continuously and automatically obtain free labeling data and establish large-scale data sets, which can be used for deep learning research or application model training;
  • the labeling algorithm obtains and classifies labeling data from different weather conditions, time periods, and regions, and uses the classified data to carry out model training related to style conversion. This method essentially projects the map information onto the image to complete the image labeling and uses it for deep learning model training; in turn, this process can be used to project the road element information recognized by the deep learning model to global coordinates for automated mapping.
  • the execution subject of the image data automatic labeling method may be an image processing device.
  • the image data automatic labeling method may be executed by a terminal device, a server, or other processing equipment, where the terminal device may be a user equipment (User Equipment, UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, etc.
  • the method for automatically labeling image data may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • an embodiment of the present disclosure also provides an image data automatic labeling device 1000, which can be applied to the above-mentioned FIG. 1, FIG. 2 and FIG. 5.
  • the device 1000 includes: a first acquisition part 11 configured to acquire vehicle positioning information, a map image, and a vehicle-collected image, wherein the map image includes road information; a second acquisition part 12 configured to acquire, according to the vehicle positioning information, road information on the map image in the local area corresponding to the vehicle positioning information; and a projection part 13 configured to project the road information on the map image onto the vehicle-collected image, so as to mark the road information on the image collected by the vehicle.
  • the second acquiring part 12 is configured to take the map image in the local area as a root node, and sequentially query the attribute information of the map road elements of the map image in the local area.
  • the attribute information of the map road element includes at least one of the following information: semantic information of the map road element, location information of the map road element, and shape information of the map road element.
  • the device 1000 further includes: a first determining part 14 configured to determine the range of the local area according to the vehicle positioning information and the range of the map image;
  • the second acquiring part 12 is configured to acquire attribute information of map road elements on the map image within the range of the local area.
  • the map image is based on the world global coordinate system
  • the device further includes: a first conversion part 15 configured to convert the map image based on the world global coordinate system to vehicle body coordinates System to obtain a map image based on the vehicle body coordinate system; the projection part 13 is configured to convert the map image based on the vehicle body coordinate system to a camera coordinate system and/or a pixel coordinate system to convert the The road information on the map image is projected onto the vehicle collection image.
  • the first conversion part 15 is configured to obtain the rotation angle and translation amount of the rotation and translation matrix according to the vehicle positioning information, and, according to the rotation and translation matrix, convert the map image based on the world global coordinate system to the vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system.
  • the map image is a two-dimensional map
  • the projection part 13 includes: a third acquisition part 131 configured to acquire the homography matrix between the pixel coordinate system and the vehicle body coordinate system; a representation part 132 configured to use a homogeneous coordinate system to represent the map image based on the vehicle body coordinate system; and a second conversion part 133 configured to, according to the homography matrix, convert the map image based on the vehicle body coordinate system represented in the homogeneous coordinate system to the pixel coordinate system, to obtain the road information of the vehicle-collected image projected in the pixel coordinate system.
  • the map image is a three-dimensional map
  • the projection part 13 includes: a third conversion part configured to translate according to the rotation between the vehicle body coordinate system and the camera coordinate system Matrix to convert the map image based on the vehicle body coordinate system to the camera coordinate system to obtain road information of the vehicle-collected image projected on the camera coordinate system; the fourth conversion part is configured to The projection matrix between the camera coordinate system and the pixel coordinate system converts the road information of the vehicle-collected image projected on the camera coordinate system to the pixel coordinate system to obtain the image projected on the pixel coordinate system The road information of the image collected by the vehicle.
  • the device further includes: an extraction part 16 configured to perform road information extraction processing on the vehicle collected image via a neural network for extracting road information to obtain perceived road information; first The correcting part 17 is configured to correct the road information projected on the collected image of the vehicle based on the perceived road information.
  • the first correction part 17 includes: a second determination part configured to determine the perceived road elements in the perceived road information and the road information projected on the vehicle acquisition image The offset information between the map road elements; the second correction part is configured to correct the road information projected on the vehicle collected image according to the offset information.
  • the second determining part includes: a third determining part configured to determine from the map image a pair of the perceived road element according to the attribute information of the perceived road element The map road element; the fourth determining part is configured to determine the position information of the paired perceived road element and the map road element in the same device coordinate system; the fifth determining part is configured to determine the paired perceived road based on the position information The positioning offset between the element and the map road element.
  • the third determining part includes: a searching part configured to search for map road elements within a preset range in the map image based on the vehicle positioning information; It is configured to pair the perceived road elements in the vehicle-collected image with the map road elements within the preset range based on the attribute information to obtain multiple pairing schemes, wherein at least one perceptual road element in the different pairing schemes is The pairing modes of the map road elements within the preset range are different; the sixth determining part is configured to determine the confidence of each of the pairing schemes; the seventh determining part is configured to be among the multiple pairing schemes In the matching scheme with the highest confidence or exceeding the set threshold, the map road element that is paired with the perceived road element is determined.
  • the matching part is further configured to, in a case where a perceived road element in the vehicle-collected image cannot be paired with any map road element within the preset range, set an empty or virtual element among the map road elements to be paired, to be paired with the perceived road element.
  • the sixth determining part is configured to separately determine the individual similarity of the pairing of each perceived road element and the map road element in each of the pairing schemes; determine each of the pairing schemes The overall similarity between each perceived road element and the map road element in the pairing; and the confidence of each pairing scheme is determined according to the individual similarity and the overall similarity of each pairing scheme.
  • the positioning offset includes a coordinate offset and/or a direction offset
  • the fifth determining part includes: a first sampling part configured to sample the pixel points of the perceived road element to obtain a set of perception sampling points;
  • a second sampling part configured to sample the pixel points of the map road element to obtain a set of map sampling points;
  • an eighth determining part configured to determine the rotation and translation matrix between the sampling points included in the perception sampling point set and the map sampling point set;
  • a fourth obtaining part configured to obtain, based on the rotation and translation matrix, the coordinate offset and direction offset between the perceived road element and the map road element.
  • the rich road information contained in the map data can be used to realize automatic labeling of the road information of the image collected by the vehicle, which improves the efficiency of labeling image data, helps reduce the error probability of data labeling, and reduces the labor cost of image data annotation.
  • the embodiment of the present disclosure also provides an automatic labeling device for image data, which is used to execute the above-mentioned automatic labeling method for image data.
  • Part or all of the above methods can be implemented by hardware, and can also be implemented by software or firmware.
  • the device may be a chip or an integrated circuit.
  • the device 1100 may include: an input device 111, an output device 112, a memory 113, and a processor 114 (the processor 114 in the device may be one or more, and one processor is taken as an example in FIG. 11) .
  • the input device 111, the output device 112, the memory 113, and the processor 114 may be connected by a bus or in other ways. Among them, the connection by a bus is taken as an example in FIG. 11.
  • the processor 114 is configured to execute the method steps executed by the devices in FIG. 1, FIG. 2, FIG. 5, FIG. 6, and FIG. 8.
  • the program of the foregoing image data automatic labeling method may be stored in the memory 113.
  • the memory 113 may be a physically independent part, or may be integrated with the processor 114.
  • the memory 113 can also be used to store data.
  • the device may also only include a processor.
  • the memory for storing the program is located outside the device, and the processor is connected to the memory through a circuit or wire for reading and executing the program stored in the memory.
  • the processor may be a central processing unit (CPU), a network processor (NP), or a wide area network (WAN) device.
  • the processor may further include a hardware chip.
  • the aforementioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
  • the above-mentioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a general array logic (generic array logic, GAL), or any combination thereof.
  • the memory may include volatile memory (volatile memory), such as random-access memory (RAM); the memory may also include non-volatile memory (non-volatile memory), such as flash memory (flash memory) , Hard disk drive (HDD) or solid-state drive (solid-state drive, SSD); the memory may also include a combination of the foregoing types of memory.
  • the rich road information contained in the map data can be used to automatically label the road information of the image collected by the vehicle, which improves the efficiency of labeling image data, helps reduce the error probability of data labeling, and reduces the labor cost of image data labeling.
  • the embodiments of the present disclosure may be systems, methods, and/or computer program products.
  • the computer program product may include a computer-readable storage medium loaded with computer-readable program instructions for enabling a processor to implement various aspects of the embodiments of the present disclosure.
  • the computer-readable storage medium may be a tangible device that can hold and store instructions used by the instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • examples of computer-readable storage media include: portable computer disks, hard disks, random access memory (Random Access Memory, RAM), read-only memory (Read Only Memory, ROM), erasable programmable read-only memory (Erasable Programmable Read Only Memory, EPROM, or flash memory), static random access memory (Static Random Access Memory, SRAM), portable compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM), digital versatile discs (Digital Video Disc, DVD), memory sticks, floppy disks, mechanical encoding devices such as punch cards or raised structures in grooves on which instructions are stored, and any suitable combination of the foregoing.
  • the computer-readable storage medium used herein is not interpreted as a transient signal itself, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media, or electrical signals transmitted through wires.
  • the computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to various computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • the network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network, and forwards the computer-readable program instructions for storage in the computer-readable storage medium in each computing/processing device .
  • the computer program instructions used to perform the operations of the embodiments of the present disclosure may be assembly instructions, instruction set architecture (Instruction Set Architecture, ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
  • Computer-readable program instructions can be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
  • When a remote computer is involved, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, through the Internet using an Internet service provider).
  • In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is customized using state information of the computer-readable program instructions; the electronic circuit can execute the computer-readable program instructions to implement various aspects of the embodiments of the present disclosure.
  • These computer-readable program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, or another programmable data processing device, thereby producing a machine such that, when executed by the processor of the computer or other programmable data processing device, the instructions produce a device that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
  • These computer-readable program instructions can also be stored in a computer-readable storage medium.
  • Each block in the flowcharts or block diagrams may represent a module, program segment, or part of an instruction, and the module, program segment, or part of an instruction includes one or more executable instructions for realizing the specified logical functions.
  • In some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two consecutive blocks can actually be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved.
  • Each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Automation & Control Theory (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Navigation (AREA)
  • Traffic Control Systems (AREA)

Abstract

Provided are a method and device for automatically labeling image data. The method includes: obtaining vehicle positioning information, map data, and a vehicle-collected image (S101); obtaining, according to the vehicle positioning information, road information on the map image within a local area corresponding to the vehicle positioning information (S102); and projecting the road information on the map image onto the vehicle-collected image to label the road information on the vehicle-collected image (S103).

Description

Method and device for automatically labeling image data
Cross-reference to related applications
The present disclosure is filed on the basis of the Chinese patent application with application No. 201910983438.9, filed on October 16, 2019, and claims priority to that Chinese patent application, the entire content of which is hereby incorporated into the present disclosure by reference.
Technical field
The present disclosure relates to the technical field of image processing, and relates to a method and device for automatically labeling image data.
Background
Training a neural network to recognize lane lines from a road image requires preparing a sufficiently large set of road images together with corresponding labels of lane-line positions and attributes that have been carefully annotated by humans. Only after the neural network has been fully trained on rich road image data and on label data carrying human prior knowledge can it recognize lane lines in new road images. However, manual labeling of image data is an extremely exhausting, repetitive process with high labor costs and low efficiency, and as labeling time grows, the probability of error also increases significantly.
Summary of the invention
The embodiments of the present disclosure provide a technical solution for automatically labeling image data.
In a first aspect, an embodiment of the present disclosure provides a method for automatically labeling image data. The method includes: acquiring vehicle positioning information, a map image, and a vehicle-collected image, where the map image includes road information; acquiring, according to the vehicle positioning information, road information on the map image within a local area corresponding to the vehicle positioning information; and projecting the road information on the map image onto the vehicle-collected image, so as to label the road information on the vehicle-collected image.
In a possible implementation, acquiring, according to the vehicle positioning information, the road information on the map image within the local area corresponding to the vehicle positioning information includes: taking the map image within the local area as a root node, and sequentially querying attribute information of map road elements of the map image within the local area, where the attribute information of a map road element includes at least one of the following: semantic information of the map road element, position information of the map road element, and shape information of the map road element.
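As a brief sketch of the root-node query described above (the data layout and all field names below are illustrative assumptions, not part of the disclosure), the local-area map can be modeled as a root node whose children are map road elements, each carrying semantic, position, and shape attributes:

```python
# Hypothetical layout: the local-area map image acts as a root node whose
# children are map road elements; each element carries semantic, position,
# and shape attribute information. Names here are illustrative only.

def query_road_elements(local_map):
    """Traverse the local map (root node) and yield each element's attributes."""
    for element in local_map["elements"]:
        yield {
            "semantic": element.get("semantic"),  # e.g. "lane_line", "stop_line"
            "position": element.get("position"),  # e.g. list of (x, y) vertices
            "shape": element.get("shape"),        # e.g. "solid", "dashed"
        }

local_map = {
    "elements": [
        {"semantic": "lane_line", "position": [(0.0, 0.0), (0.0, 30.0)], "shape": "dashed"},
        {"semantic": "stop_line", "position": [(-2.0, 30.0), (2.0, 30.0)], "shape": "solid"},
    ]
}

attributes = list(query_road_elements(local_map))
```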
In a possible implementation, the method further includes: determining the range of the local area according to the vehicle positioning information and the range of the map image; and the acquiring, according to the vehicle positioning information, of the road information on the map image within the local area corresponding to the vehicle positioning information includes: acquiring attribute information of map road elements on the map image within the range of the local area.
In a possible implementation, the map image is based on a world global coordinate system, and before the road information on the map image is projected onto the vehicle-collected image, the method further includes: converting the map image based on the world global coordinate system into a vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system. The projecting of the road information on the map image onto the vehicle-collected image includes: converting the map image based on the vehicle body coordinate system into a camera coordinate system and/or a pixel coordinate system, so as to project the road information on the map image onto the vehicle-collected image.
In a possible implementation, converting the map image based on the world global coordinate system into the vehicle body coordinate system to obtain the map image based on the vehicle body coordinate system includes: obtaining the rotation angle and translation amount of a rotation-translation matrix according to the vehicle positioning information; and converting, according to the rotation-translation matrix, the map image based on the world global coordinate system into the vehicle body coordinate system to obtain the map image based on the vehicle body coordinate system.
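By way of a non-limiting illustration, and under the assumption that the vehicle positioning information provides a 2D position (x, y) and a heading angle yaw in the world global coordinate system (the disclosure does not fix this representation), the rotation angle and translation amount of the world-to-vehicle transform follow directly:

```python
import math

# Illustrative sketch, not the disclosed implementation: the vehicle position
# gives the translation amount and the heading angle gives the rotation angle
# of the rotation-translation matrix from the world frame to the vehicle frame.
# Convention assumed here: vehicle-body x axis points forward, yaw measured
# from the world x axis.

def world_to_vehicle(points, x, y, yaw):
    """Rotate/translate world-frame (x, y) points into the vehicle-body frame."""
    c, s = math.cos(-yaw), math.sin(-yaw)  # inverse rotation by the heading
    out = []
    for px, py in points:
        dx, dy = px - x, py - y            # translate vehicle position to origin
        out.append((c * dx - s * dy, s * dx + c * dy))
    return out

# A map point 10 m due north of a vehicle heading north (yaw = pi/2)
# lands straight ahead on the vehicle's forward axis.
pts = world_to_vehicle([(0.0, 10.0)], 0.0, 0.0, math.pi / 2)
```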
In a possible implementation, the map image is a two-dimensional map, and converting the map image based on the vehicle body coordinate system into the camera coordinate system and/or the pixel coordinate system so as to project the road information on the map image onto the vehicle-collected image includes: acquiring a homography matrix between the pixel coordinate system and the vehicle body coordinate system; representing the map image based on the vehicle body coordinate system in a homogeneous coordinate system; and converting, according to the homography matrix, the map image based on the vehicle body coordinate system represented in the homogeneous coordinate system into the pixel coordinate system, to obtain the road information of the vehicle-collected image projected onto the pixel coordinate system.
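The two-dimensional projection step above can be sketched as follows; the homography matrix H below is a made-up placeholder (in practice it would come from calibration between the ground plane and the image plane), and the de-homogenization shows the role of the homogeneous coordinate representation:

```python
import numpy as np

# Illustrative sketch: a 2D (ground-plane) map point in the vehicle body frame
# is written in homogeneous coordinates and mapped to the pixel coordinate
# system by a homography H. The numeric values of H are placeholders.

H = np.array([
    [100.0,    0.0, 640.0],
    [  0.0, -100.0, 720.0],
    [  0.0,    0.0,   1.0],
])

def project_ground_point(H, x, y):
    """Apply the homography to a ground point given in homogeneous coordinates."""
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w  # de-homogenize to pixel coordinates

u, v = project_ground_point(H, 1.0, 2.0)
```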
In a possible implementation, the map image is a three-dimensional map, and converting the map image based on the vehicle body coordinate system into the camera coordinate system and/or the pixel coordinate system so as to project the road information on the map image onto the vehicle-collected image includes: converting, according to a rotation-translation matrix between the vehicle body coordinate system and the camera coordinate system, the map image based on the vehicle body coordinate system into the camera coordinate system, to obtain the road information of the vehicle-collected image projected onto the camera coordinate system; and converting, according to a projection matrix between the camera coordinate system and the pixel coordinate system, the road information of the vehicle-collected image projected onto the camera coordinate system into the pixel coordinate system, to obtain the road information of the vehicle-collected image projected onto the pixel coordinate system.
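The three-dimensional projection chain (vehicle body frame, then camera frame, then pixel frame) can be sketched with a simplified pinhole model; the extrinsic rotation-translation (R, t), the intrinsic matrix K, and the axis conventions below are placeholder assumptions standing in for calibration results:

```python
import numpy as np

# Illustrative sketch (assumed, simplified pinhole model): a 3D map point in
# the vehicle-body frame is first moved into the camera frame by an extrinsic
# rotation-translation, then projected to pixels by the projection matrix K.
# All numeric values are placeholders, not calibration data from the source.

R = np.eye(3)                  # placeholder: camera axes aligned with body axes
t = np.array([0.0, 0.0, 1.5])  # placeholder extrinsic translation
K = np.array([
    [1000.0,    0.0, 640.0],   # fx, 0, cx
    [   0.0, 1000.0, 360.0],   # 0, fy, cy
    [   0.0,    0.0,   1.0],
])

def body_to_pixel(p_body):
    p_cam = R @ p_body + t     # vehicle-body frame -> camera frame
    uvw = K @ p_cam            # camera frame -> homogeneous pixel coords
    return uvw[0] / uvw[2], uvw[1] / uvw[2]

u, v = body_to_pixel(np.array([1.0, 0.5, 8.5]))
```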
In a possible implementation, the method further includes: performing road information extraction processing on the vehicle-collected image via a neural network for extracting road information, to obtain perceived road information; and correcting, according to the perceived road information, the road information projected onto the vehicle-collected image.
In a possible implementation, correcting, according to the perceived road information, the road information projected onto the vehicle-collected image includes: determining offset information between a perceived road element in the perceived road information and a map road element in the road information projected onto the vehicle-collected image; and correcting, according to the offset information, the road information projected onto the vehicle-collected image.
In a possible implementation, determining the offset information between the perceived road element in the perceived road information and the map road element in the road information projected onto the vehicle-collected image includes: determining, from the map image according to attribute information of the perceived road element, a map road element paired with the perceived road element; determining position information of the paired perceived road element and map road element in the same device coordinate system; and determining, based on the position information, a positioning offset between the paired perceived road element and map road element.
In a possible implementation, determining, from the map according to the attribute information of the perceived road element, the map road element paired with the perceived road element includes: searching the map image for map road elements within a preset range based on the vehicle positioning information; pairing the perceived road elements in the vehicle-collected image with the map road elements within the preset range in pairs based on the attribute information, to obtain multiple pairing schemes, where at least one perceived road element is paired with the map road elements within the preset range differently in different pairing schemes; determining the confidence of each pairing scheme; and determining, in the pairing scheme with the highest confidence among the multiple pairing schemes or with a confidence exceeding a set threshold, the map road element paired with the perceived road element.
In a possible implementation, pairing the perceived road elements in the vehicle-collected image with the map road elements within the preset range includes: in a case where no paired road element can be determined for a perceived road element in the vehicle-collected image among the map road elements within the preset range, setting an empty or virtual element among the map road elements to be paired, to be paired with the perceived road element.
In a possible implementation, determining the confidence of each pairing scheme includes: separately determining the individual similarity of each pairing of a perceived road element and a map road element in each pairing scheme; determining the overall similarity of the pairings of the perceived road elements and the map road elements in each pairing scheme; and determining the confidence of each pairing scheme according to the individual similarities and the overall similarity of each pairing scheme.
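The disclosure does not specify how the individual similarities and the overall similarity are combined into a confidence; one simple assumed combination, shown only for illustration, multiplies the mean individual similarity by the overall similarity and then selects the best-scoring scheme:

```python
# Hypothetical scoring sketch: the confidence of a pairing scheme depends on
# the individual pair similarities and the scheme's overall similarity. The
# combination below (mean individual similarity times overall similarity) is
# an assumption, not the disclosed formula.

def scheme_confidence(individual_sims, overall_sim):
    if not individual_sims:
        return 0.0
    mean_individual = sum(individual_sims) / len(individual_sims)
    return mean_individual * overall_sim

# Two candidate pairing schemes: (list of individual similarities, overall similarity)
schemes = {
    "A": ([0.9, 0.8, 0.95], 0.9),
    "B": ([0.6, 0.7, 0.5], 0.8),
}
best = max(schemes, key=lambda k: scheme_confidence(*schemes[k]))
```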
In a possible implementation, the positioning offset includes a coordinate offset and/or a direction offset, and determining, based on the vehicle positioning information, the positioning offset between the paired perceived road element and map road element includes: sampling pixel points of the perceived road element to obtain a perceived sampling point set; sampling pixel points of the map road element to obtain a map sampling point set; determining a rotation-translation matrix between the sampling points included in the perceived sampling point set and the map sampling point set; and obtaining the coordinate offset and the direction offset between the perceived road element and the map road element based on the rotation-translation matrix.
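The rotation-translation between the two sampling point sets can be estimated, for example, by one least-squares alignment step of the kind used inside a closest-point iteration (Fig. 9 refers to the closest point iteration method); the sketch below assumes point correspondences are already established and is illustrative rather than the disclosed implementation:

```python
import numpy as np

# Illustrative sketch: given matched sample points of a perceived road element
# and its paired map road element, a least-squares (Kabsch) alignment yields
# the rotation-translation between the two point sets; the rotation angle is
# read off as the direction offset and the translation as the coordinate offset.

def estimate_offset(perceived, mapped):
    P, M = np.asarray(perceived), np.asarray(mapped)
    cp, cm = P.mean(axis=0), M.mean(axis=0)
    H = (P - cp).T @ (M - cm)             # 2x2 cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T                        # optimal rotation (Kabsch)
    if np.linalg.det(R) < 0:              # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cm - R @ cp                       # optimal translation
    angle = np.arctan2(R[1, 0], R[0, 0])  # direction offset in radians
    return angle, t                       # (direction offset, coordinate offset)

perceived = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
mapped = [(0.5, 0.2), (1.5, 0.2), (0.5, 1.2)]  # same shape shifted by (0.5, 0.2)
angle, t = estimate_offset(perceived, mapped)
```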
In a second aspect, an embodiment of the present disclosure provides a device for automatically labeling image data. The device includes: a first acquisition part configured to acquire vehicle positioning information, a map image, and a vehicle-collected image, where the map image includes road information; a second acquisition part configured to acquire, according to the vehicle positioning information, road information on the map image within a local area corresponding to the vehicle positioning information; and a projection part configured to project the road information on the map image onto the vehicle-collected image, so as to label the road information on the vehicle-collected image.
In a possible implementation, the second acquisition part is configured to take the map image within the local area as a root node and sequentially query attribute information of map road elements of the map image within the local area, where the attribute information of a map road element includes at least one of the following: semantic information of the map road element, position information of the map road element, and shape information of the map road element.
In a possible implementation, the device further includes: a first determination part configured to determine the range of the local area according to the vehicle positioning information and the range of the map image; and the second acquisition part is configured to acquire attribute information of map road elements on the map image within the range of the local area.
In a possible implementation, the map image is based on a world global coordinate system, and the device further includes: a first conversion part configured to convert the map image based on the world global coordinate system into a vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system; and the projection part is configured to convert the map image based on the vehicle body coordinate system into a camera coordinate system and/or a pixel coordinate system, so as to project the road information on the map image onto the vehicle-collected image.
In a possible implementation, the first conversion part is configured to: obtain the rotation angle and translation amount of a rotation-translation matrix according to the vehicle positioning information; and convert, according to the rotation-translation matrix, the map image based on the world global coordinate system into the vehicle body coordinate system to obtain the map image based on the vehicle body coordinate system.
In a possible implementation, the map image is a two-dimensional map, and the projection part includes: a third acquisition part configured to acquire a homography matrix between the pixel coordinate system and the vehicle body coordinate system; a representation part configured to represent the map image based on the vehicle body coordinate system in a homogeneous coordinate system; and a second conversion part configured to convert, according to the homography matrix, the map image based on the vehicle body coordinate system represented in the homogeneous coordinate system into the pixel coordinate system, to obtain the road information of the vehicle-collected image projected onto the pixel coordinate system.
In a possible implementation, the map image is a three-dimensional map, and the projection part includes: a third conversion part configured to convert, according to a rotation-translation matrix between the vehicle body coordinate system and the camera coordinate system, the map image based on the vehicle body coordinate system into the camera coordinate system, to obtain the road information of the vehicle-collected image projected onto the camera coordinate system;
and a fourth conversion part configured to convert, according to a projection matrix between the camera coordinate system and the pixel coordinate system, the road information of the vehicle-collected image projected onto the camera coordinate system into the pixel coordinate system, to obtain the road information of the vehicle-collected image projected onto the pixel coordinate system.
In a possible implementation, the device further includes: an extraction part configured to perform road information extraction processing on the vehicle-collected image via a neural network for extracting road information, to obtain perceived road information; and a first correction part configured to correct, according to the perceived road information, the road information projected onto the vehicle-collected image.
In a possible implementation, the first correction part includes: a second determination part configured to determine offset information between a perceived road element in the perceived road information and a map road element in the road information projected onto the vehicle-collected image; and a second correction part configured to correct, according to the offset information, the road information projected onto the vehicle-collected image.
In a possible implementation, the second determination part includes: a third determination part configured to determine, from the map image according to attribute information of the perceived road element, a map road element paired with the perceived road element; a fourth determination part configured to determine position information of the paired perceived road element and map road element in the same device coordinate system; and a fifth determination part configured to determine, based on the position information, a positioning offset between the paired perceived road element and map road element.
In a possible implementation, the third determination part includes: a search part configured to search the map image for map road elements within a preset range based on the vehicle positioning information; a matching part configured to pair the perceived road elements in the vehicle-collected image with the map road elements within the preset range in pairs based on attribute information, to obtain multiple pairing schemes, where at least one perceived road element is paired with the map road elements within the preset range differently in different pairing schemes; a sixth determination part configured to determine the confidence of each pairing scheme; and a seventh determination part configured to determine, in the pairing scheme with the highest confidence among the multiple pairing schemes or with a confidence exceeding a set threshold, the map road element paired with the perceived road element.
In a possible implementation, the matching part is configured to, in a case where no paired road element can be determined for a perceived road element in the vehicle-collected image among the map road elements within the preset range, set an empty or virtual element among the map road elements to be paired, to be paired with the perceived road element.
In a possible implementation, the sixth determination part is configured to separately determine the individual similarity of each pairing of a perceived road element and a map road element in each pairing scheme; determine the overall similarity of the pairings of the perceived road elements and the map road elements in each pairing scheme; and determine the confidence of each pairing scheme according to the individual similarities and the overall similarity of each pairing scheme.
In a possible implementation, the positioning offset includes a coordinate offset and/or a direction offset, and the fifth determination part includes: a first sampling part configured to sample pixel points of the perceived road element to obtain a perceived sampling point set; a second sampling part configured to sample pixel points of the map road element to obtain a map sampling point set; an eighth determination part configured to determine a rotation-translation matrix between the sampling points included in the perceived sampling point set and the map sampling point set; and a fourth acquisition part configured to obtain the coordinate offset and the direction offset between the perceived road element and the map road element based on the rotation-translation matrix.
In a third aspect, an embodiment of the present disclosure provides a device for automatically labeling image data. The device includes: an input device, an output device, a memory, and a processor, where the memory stores a set of program codes, and the processor is configured to call the program codes stored in the memory to execute the method described in the first aspect or its various possible implementations.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable storage medium that stores instructions which, when run on a computer, cause the computer to execute the method described in the first aspect or its various possible implementations.
In a fifth aspect, an embodiment of the present disclosure provides a computer program including computer-readable code; when the computer-readable code runs on an electronic device, a processor in the electronic device executes the method described in the first aspect or its various possible implementations.
Adopting the solution of the present disclosure has the following beneficial effects:
By utilizing the rich road information contained in the map data and projecting the road information on the map data onto the vehicle-collected image, the road information of the vehicle-collected image can be labeled automatically, which improves the efficiency of labeling image data, helps reduce the error probability of data labeling, and reduces the labor cost of image data labeling.
Description of the drawings
The drawings needed in the embodiments of the present disclosure or the background art will be described below.
Fig. 1 is a schematic flowchart of a method for automatically labeling image data provided by an embodiment of the present disclosure;
Fig. 2 is a schematic flowchart of another method for automatically labeling image data provided by an embodiment of the present disclosure;
Figs. 3A-3C show effect diagrams of identifying the semantic information of road elements;
Fig. 4A is a schematic diagram of the world global coordinate system;
Fig. 4B is a schematic diagram of the vehicle body coordinate system;
Fig. 4C is a schematic diagram of the camera coordinate system and the pixel coordinate system;
Fig. 5 is an example of a method for determining the offset information between a perceived road element and a map road element provided by an embodiment of the present disclosure;
Fig. 6 is an example of a method for determining, from a map image, the map road element paired with a perceived road element provided by an embodiment of the present disclosure;
Fig. 7 is a schematic diagram of a pairing scheme provided by an embodiment of the present disclosure;
Fig. 8 is an example of a method for determining the positioning offset between a paired perceived road element and a map road element provided by an embodiment of the present disclosure;
Fig. 9 is a schematic diagram of the closest point iteration method provided by an embodiment of the present disclosure;
Fig. 10 is a schematic structural diagram of a device for automatically labeling image data provided by an embodiment of the present disclosure;
Fig. 11 is a schematic structural diagram of another device for automatically labeling image data provided by an embodiment of the present disclosure.
Detailed Description of the Embodiments
The embodiments of the present disclosure are described below with reference to the drawings in the embodiments of the present disclosure.
Referring to FIG. 1, which is a schematic flowchart of a method for automatically labeling image data according to an embodiment of the present disclosure, the method may exemplarily include the following steps:
S101. Acquire vehicle positioning information, a map image, and a vehicle-captured image.
In the embodiments of the present disclosure, "vehicle" is understood in a broad sense. It may include various types of vehicles with transportation or operation functions in the traditional sense, such as trucks, coaches, buses, and cars; it may also include movable robotic devices, for example smart home devices such as guide devices for the blind, smart toys, and sweeping robots, as well as industrial robots, service robots, toy robots, educational robots, and the like, none of which is limited in the embodiments of the present disclosure.
In a possible implementation, the vehicle may be equipped with a position sensor to obtain the vehicle positioning information.
In a possible implementation, the vehicle may also be equipped with a vision sensor to capture images of the vehicle's surroundings in real time; the obtained images may be referred to as vehicle-captured images. Since the images captured by the vision sensor mounted on the vehicle amount to the vehicle driving control system's "perception" of the environment around the vehicle, a vehicle-captured image may also be referred to as a perceived road image. In the embodiments of the present disclosure, the vehicle-captured image is the captured image itself, without any label information on the image.
In a possible implementation, the map image may be obtained from a server or a vehicle-mounted terminal. The map image may be, but is not limited to, a semantic map, a high-precision map, or another type of map. The map image includes rich road information. Road information refers to attribute information of map road elements recognized based on the map image.
In a possible implementation, the map road elements in the road information may include road-related markings and signs, including at least one or more of the following: various types of lane lines, stop lines, turning lines, and road edge lines on the road, as well as traffic signs, traffic lights, street lights, and the like installed beside or on the road. The various types of lane lines may include, but are not limited to, white solid lane lines, yellow dashed lane lines, left-edge lane lines, right-edge lane lines, and so on; the various types of traffic signs may include, but are not limited to, slow-traffic signs, no-parking signs, speed-limit signs, and so on. Those skilled in the art should understand that road elements are not limited to the above.
The attribute information of a map road element may include one or more kinds of information related to the above map road element, for example, the semantic information, position information, and shape information of the road element.
The semantic information of a road element may be the meaning the road element represents and the information it is intended to convey. For example, when a line on the road is detected in a captured road image, the line can be determined to be a stop line, a lane line, or the like according to its position on the road and its width and length relative to the road. Since lane lines can be further subdivided into many types, "lane line" is basic semantic information; more specific semantic information can be further determined according to the position and shape of the line, for example, left-edge lane line, white solid lane line, and so on. For traffic signs, "slow-traffic sign" and "no-parking sign" may be the specific semantic information of the road element. Those skilled in the art should understand that the specific expression form of the semantic information of a road element does not affect the implementation of the method of the present disclosure.
The above position sensor may include at least one of the following: a GPS (Global Positioning System), an IMU (Inertial Measurement Unit), and the like; the above vision sensor may include at least one of the following: a camera, a video camera, a webcam, and the like. Those skilled in the art should understand that the vision sensor and the position sensor are not limited to the above.
The vehicle positioning information may be a piece of synchronized positioning information acquired for each frame of the vehicle-captured image. It may be GPS positioning information, IMU positioning information, or fused information of the GPS positioning information and the IMU positioning information.
The fused information is a more reliable positioning result obtained based on the GPS positioning information and the IMU positioning information. It may be obtained by applying Kalman filtering to the GPS positioning information and the IMU positioning information, or by computing a mean, or a weighted average, of the two.
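Of the fusion options mentioned above, the weighted-average variant can be sketched as follows; this is a minimal illustration, not the embodiment's actual fusion, and the weights are assumptions (in practice they would be derived from each sensor's measurement covariance, as in a Kalman filter):

```python
def fuse_positions(gps_xy, imu_xy, w_gps=0.7, w_imu=0.3):
    """Fuse GPS and IMU position estimates by weighted average.

    gps_xy, imu_xy: (x, y) tuples in a common metric frame.
    w_gps, w_imu: illustrative weights that must sum to 1.
    """
    assert abs(w_gps + w_imu - 1.0) < 1e-9
    return (w_gps * gps_xy[0] + w_imu * imu_xy[0],
            w_gps * gps_xy[1] + w_imu * imu_xy[1])

# One fused fix from two slightly disagreeing sensor readings.
fused = fuse_positions((100.0, 50.0), (102.0, 48.0))
```

A Kalman filter would additionally propagate uncertainty over time, which is why the embodiment describes it as an alternative to this simple per-frame average.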
S102. Acquire, according to the vehicle positioning information, road information on the map image within a local area corresponding to the vehicle positioning information.
In a possible implementation, for the map image of a road, the map image includes all or most of the road elements of that road segment. The vehicle-captured image acquired during positioning is an image of a local area of that road segment. To achieve subsequent automatic labeling with the road information on the map image, it is only necessary to acquire the road information on the map image within the local area corresponding to the vehicle positioning information. The extent of the local area may be configured. The local area corresponding to the vehicle positioning information is also related to the field of view of the vision sensor.
S103. Project the road information on the map image onto the vehicle-captured image, so as to label the road information on the vehicle-captured image.
In a possible implementation, since the map image contains rich and accurate road information, projecting the road information on the map image onto the vehicle-captured image essentially labels the road information on the vehicle-captured image, so that the vehicle-captured image also contains the above road information, realizing automatic labeling of the road information. Using vehicle-captured images automatically labeled with road information to train the above neural network can achieve a more complete training result. In turn, the trained neural network can be used to recognize road information in vehicle-captured images.
According to the method for automatically labeling image data provided by the embodiments of the present disclosure, the rich road information contained in the map image can be used to automatically label the road information of the vehicle-captured image, which improves the efficiency of labeling image data, helps reduce the error probability of data labeling, and reduces the labor cost of image data labeling.
Referring to FIG. 2, which is a schematic flowchart of another method for automatically labeling image data according to an embodiment of the present disclosure, the method may exemplarily include the following steps:
S201. Acquire vehicle positioning information, a map image, and a vehicle-captured image, where the map image includes road information.
For the implementation of this step, refer to step S101 of the embodiment shown in FIG. 1.
It should be noted that the map image of a road may be captured by a collection vehicle, and the attribute information of the map road elements in the map image may be recognized. The captured map image may be stored on a server or a vehicle-mounted terminal.
In a possible implementation, the collection vehicle is equipped with a vision sensor to capture map images. The vision sensor may include at least one of the following: a camera, a video camera, a webcam, and the like. In order for the map image to reach higher precision, the vision sensor on the collection vehicle may be a high-precision vision sensor, so that map images with high definition and high accuracy can be captured. During positioning, by contrast, the vision sensor used to capture the vehicle-captured image may be a sensor of relatively lower precision.
In a possible implementation, the collection vehicle may also be equipped with a high-precision position sensor to obtain the collection vehicle's positioning information more accurately. During positioning, the position sensor used to acquire the vehicle positioning information may be a position sensor with lower positioning accuracy, or a position sensor already present on the vehicle may be used.
The attribute information of a map road element may include at least one of the following: semantic information, position information, shape information, and so on. The above attribute information may be obtained through recognition by a trained neural network for road element detection. The above neural network may be trained on road images carrying label information (which may be called sample road images); the road elements in a sample road image carry label information, and the label information may be attribute information of the sample road elements, for example, including but not limited to one or more of the following: semantic information, shape information, position information, and so on.
Training the neural network on sample road images gives the model the ability to recognize the attribute information of road elements in an input road image. For a map road image input into the neural network, the attribute information of the map road elements in the image can be output.
The categories of road elements the neural network can recognize depend on the types of sample road elements used in training. The model can be trained with more types of sample road elements to give it higher recognition capability.
FIGS. 3A-3C show example results of recognizing the semantic information of road elements. FIG. 3A is a road image input into the neural network model, which may be a vehicle-captured image, a map road image, or another road image. FIG. 3B shows a road element recognized by the neural network model, depicted as the thick horizontal solid line 31, whose semantic information is obtained as "stop line" 32 (stopline), labeled at the upper left of the picture. FIG. 3C shows road elements recognized by the neural network model, depicted as the thick vertical solid lines 33, for which the basic semantic information and the specific semantic information of each line are obtained. The basic semantic information is "lane line" (laneline), and the specific semantic information is, from left to right: "white solid lane line" (white solid line), "white solid lane line" (white solid line), "white solid lane line" (white solid line), "white solid lane line" (white solid line), and "right-edge lane line" (right edge), all labeled at the upper left of the picture, as shown at 34 in FIG. 3C. During the automatic labeling of image data, the above attribute information of the map road elements can be obtained by calling the map image.
Those skilled in the art should understand that the method for recognizing the attribute information of map road elements is not limited to the above, and the attribute information may also be obtained by other recognition methods.
S202. Determine the extent of the local area corresponding to the vehicle positioning information according to the vehicle positioning information and the extent of the map image.
The map image of a road includes all or most of the road elements of that road. The vehicle-captured image acquired during positioning is an image of a local area of that road. To achieve subsequent automatic labeling with the road information on the map image, it is only necessary to acquire the road information on the map image within the local area corresponding to the vehicle positioning information. In a possible implementation, the extent of the local area may be set manually; therefore, the extent of the local area corresponding to the vehicle positioning information may be determined according to empirical values and the like. In another possible implementation, the local area corresponding to the vehicle positioning information is related to the field of view of the vision sensor; therefore, the extent of the local area may also be determined according to the field of view of the vision sensor.
S203. Taking the map image within the local area as a root node, sequentially query the attribute information of the map road elements on the map image within the local area.
This step is used to acquire the attribute information of the map road elements on the map image within the local area. To improve the efficiency of querying the road information on the map image, a tree-like hierarchical relationship may be used to query the road information: the map image within the local area serves as the root node; under the root node are several rectangular regions, each rectangular region is assigned a center point representing the location of that region, and each rectangular region corresponds to a map road element on the map image; the attribute information corresponding to each map road element is then queried.
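A minimal sketch of the tree-like query described above follows. The class and field names are illustrative assumptions, not the embodiment's actual storage format; the point is only the root-to-region traversal order:

```python
class RectRegion:
    """A rectangular region under the root, holding one map road element."""
    def __init__(self, center, element_type, attributes):
        self.center = center          # (x, y) center point locating the region
        self.element_type = element_type
        self.attributes = attributes  # semantic / position / shape information

class LocalMapRoot:
    """Root node: the map image restricted to the local area."""
    def __init__(self, regions):
        self.regions = regions

    def query_attributes(self):
        # Traverse the child regions in order, collecting each
        # map road element's attribute information.
        return [(r.element_type, r.attributes) for r in self.regions]

root = LocalMapRoot([
    RectRegion((0.0, 5.0), "lane_line", {"semantic": "white solid line"}),
    RectRegion((0.0, 20.0), "stop_line", {"semantic": "stop line"}),
])
results = root.query_attributes()
```

A deeper tree (regions subdivided into sub-regions) would follow the same pattern recursively; two levels suffice to show the root-node idea.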
S204. Convert the map image based on the world global coordinate system into the vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system.
In a possible implementation, the map image uses a world global coordinate system, in which any point has unique corresponding coordinates (longitude and latitude information) on the earth. For example, the ECEF (Earth-Centered, Earth-Fixed) geocentric coordinate system may be used; see the schematic diagram of the world global coordinate system shown in FIG. 4A. This coordinate system is a right-handed Cartesian coordinate system with the earth's center as the origin: the direction from the origin toward the intersection of the prime meridian and the 0-degree latitude line is the positive x-axis, the direction from the origin toward the north pole is the positive z-axis, and lengths are in meters.
After the road information on the map image of the local area corresponding to the vehicle positioning information is obtained based on the vehicle positioning information, this road information needs to be converted from the world global coordinate system into the vehicle body coordinate system. As shown in the schematic diagram of the vehicle body coordinate system in FIG. 4B, the vehicle body coordinate system is also a right-handed Cartesian coordinate system, with the vehicle-mounted high-precision inertial navigation center as the origin, the heading direction of the vehicle as the positive x-axis, and the left side of the vehicle body as the positive y-axis. Lengths are in meters.
In a possible implementation, the world global coordinate system and the vehicle body coordinate system are both right-handed Cartesian coordinate systems, and the conversion between two right-handed Cartesian coordinate systems requires only one rotation-translation matrix. S204 then includes: obtaining the rotation angle and translation amount of the rotation-translation matrix according to the vehicle positioning information; and converting, according to the rotation-translation matrix, the map image based on the world global coordinate system into the vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system. Specifically, the rotation angle and translation amount of the rotation-translation matrix between the world global coordinate system and the vehicle body coordinate system are determined according to the position given by the vehicle positioning information in the vehicle body coordinate system and its position in the world global coordinate system. The map image based on the world global coordinate system can thus be converted into the vehicle body coordinate system according to the rotation-translation matrix, yielding the map image based on the vehicle body coordinate system.
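As a hedged sketch of the rotation-translation just described, the following 2D simplification brings a world point into the vehicle body frame. Representing the vehicle pose as (x, y, yaw) is an assumption for illustration; the embodiment's full 3D case uses the same inverse rigid transform with a 3x3 rotation:

```python
import math

def world_to_body(point_w, vehicle_pose):
    """Transform a 2D world point into the vehicle body frame.

    point_w: (x, y) in the world frame.
    vehicle_pose: (x, y, yaw) of the vehicle in the world frame,
    with yaw measured from the world x-axis to the vehicle heading.
    The body frame has x forward and y to the left (right-handed).
    """
    vx, vy, yaw = vehicle_pose
    dx, dy = point_w[0] - vx, point_w[1] - vy   # remove the translation
    c, s = math.cos(yaw), math.sin(yaw)
    # Apply the inverse rotation R(-yaw) to the translated point.
    return (c * dx + s * dy, -s * dx + c * dy)

# A point 10 m north of a vehicle at the origin heading north (yaw = 90 deg)
# lies 10 m straight ahead in the body frame.
p_body = world_to_body((0.0, 10.0), (0.0, 0.0, math.pi / 2))
```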
S205. Convert the map image based on the vehicle body coordinate system into the camera coordinate system and/or the pixel coordinate system, so as to project the road information on the map image onto the vehicle-captured image.
The road information on the map image is to be projected onto the vehicle-captured image, and the vehicle-captured image is based on the camera coordinate system or the pixel coordinate system; therefore, the above map image based on the vehicle body coordinate system further needs to be converted into the camera coordinate system or the pixel coordinate system. FIG. 4C is a schematic diagram of the camera coordinate system and the pixel coordinate system, where the camera coordinate system o-x-y-z is three-dimensional and the pixel coordinate system o'-x'-y' is two-dimensional.
In a possible implementation, where the map image is a two-dimensional map, S205 includes: obtaining a homography matrix between the pixel coordinate system and the vehicle body coordinate system; representing the map image based on the vehicle body coordinate system in homogeneous coordinates; and converting, according to the homography matrix, the map image based on the vehicle body coordinate system represented in homogeneous coordinates into the pixel coordinate system, to obtain the road information projected onto the vehicle-captured image in the pixel coordinate system. In a possible implementation, for a two-dimensional map image, the conversion from the vehicle body coordinate system to the pixel coordinate system can be completed by a homography transformation. A homography matrix reflects the fact that a three-dimensional object can be projected onto multiple two-dimensional planes; the homography matrix converts the projection of a three-dimensional object on one two-dimensional plane into its projection on another two-dimensional plane. According to the principle that three points determine a plane, no fewer than three points are selected on the three-dimensional object, the corresponding projection points of these points on the two two-dimensional projection planes are computed respectively, and the conversion matrix between the two sets of corresponding projection points is the homography matrix, which can be solved algebraically. In a possible implementation, the homography matrix between the pixel coordinate system and the vehicle body coordinate system can be calibrated in advance with manually calibrated data. In an optional implementation, assuming the matrix is a 3x3 matrix with 8 degrees of freedom, it can complete a planar projective transformation from one plane to another. Then, the road information on the map image is represented in homogeneous coordinates, and the coordinates of each piece of road information are multiplied by the homography matrix to obtain the road information based on the pixel coordinate system.
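The final multiplication of homogeneous coordinates by the homography can be sketched as follows. The matrix values here are illustrative placeholders, not a real calibration result:

```python
import numpy as np

def apply_homography(H, points_xy):
    """Map 2D points through a 3x3 homography H.

    points_xy: (N, 2) array of points in the source plane
    (here, the vehicle body ground plane).
    Returns (N, 2) points in the target plane (pixel coordinates).
    """
    pts = np.asarray(points_xy, dtype=float)
    ones = np.ones((pts.shape[0], 1))
    homo = np.hstack([pts, ones])          # lift to homogeneous coordinates
    mapped = homo @ H.T                    # multiply each point by H
    return mapped[:, :2] / mapped[:, 2:3]  # divide by w to dehomogenize

# Illustrative homography: scale meters to pixels by 100 and
# shift to an assumed image center at (640, 360).
H = np.array([[100.0,   0.0, 640.0],
              [  0.0, 100.0, 360.0],
              [  0.0,   0.0,   1.0]])
pixels = apply_homography(H, [[1.0, 2.0]])
```

With a general 8-degree-of-freedom homography (nonzero bottom row entries), the division by w is what produces the perspective effect; the affine-looking example above keeps w = 1 only for readability.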
In a possible implementation, where the map image is a three-dimensional map, S205 includes: converting, according to the rotation-translation matrix between the vehicle body coordinate system and the camera coordinate system, the map image based on the vehicle body coordinate system into the camera coordinate system, to obtain the road information of the vehicle-captured image projected into the camera coordinate system; and converting, according to the projection matrix between the camera coordinate system and the pixel coordinate system, the road information projected into the camera coordinate system into the pixel coordinate system, to obtain the road information projected onto the vehicle-captured image in the pixel coordinate system. For a three-dimensional map image, the camera's intrinsic and extrinsic parameters can be used to complete the conversions among the vehicle body coordinate system, the camera coordinate system, and the pixel coordinate system. The camera imaging principle is pinhole imaging: the camera intrinsics refer to the focal length of the camera's convex lens and the coordinates of the optical center in the pixel coordinate system, while the camera extrinsics are the rotation-translation matrix between the camera coordinate system and the vehicle body coordinate system. The camera coordinate system is a right-handed Cartesian coordinate system with the camera's optical center as the origin, and the upward and forward directions of the camera as the positive y-axis and positive z-axis, respectively. After the camera intrinsics and extrinsics are pre-calibrated with manually calibrated data, the road information on the map image is first rotated and translated into the camera coordinate system via the extrinsics; then, according to the scaling principle of pinhole imaging and the intrinsics, the road information based on the camera coordinate system is projected into the pixel coordinate system, obtaining the road information projected onto the vehicle-captured image.
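The extrinsic-then-intrinsic projection just described can be sketched as follows. The calibration values (R, t, fx, fy, cx, cy) are illustrative assumptions, and the camera's +z axis is taken as the depth direction, consistent with the forward axis described above:

```python
import numpy as np

def project_to_pixels(points_body, R, t, fx, fy, cx, cy):
    """Project 3D points in the vehicle body frame into pixel coordinates.

    R (3x3), t (3,): extrinsics mapping body coordinates to camera coordinates.
    fx, fy: focal lengths in pixels; cx, cy: principal point (optical center).
    """
    pts = np.asarray(points_body, dtype=float)
    cam = pts @ R.T + t                 # extrinsics: body -> camera frame
    z = cam[:, 2]                       # depth along the camera's forward axis
    u = fx * cam[:, 0] / z + cx         # pinhole scaling + principal point
    v = fy * cam[:, 1] / z + cy
    return np.stack([u, v], axis=1)

# Illustrative calibration: camera at the body origin, axes already aligned.
R = np.eye(3)
t = np.zeros(3)
uv = project_to_pixels([[1.0, 0.5, 10.0]], R, t, fx=800, fy=800, cx=640, cy=360)
```

In a real calibration, R and t would rotate the body frame (x forward, y left) into the camera frame; the identity extrinsics here only keep the pinhole scaling step visible.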
S206. Perform road information extraction processing on the vehicle-captured image via a neural network for extracting road information, to obtain perceived road information.
The perceived road information includes attribute information of perceived road elements. The attribute information of a perceived road element may include one or more kinds of information related to the perceived road element, for example, the semantic information, position information, and shape information of the road element. Similar to the map road elements, the perceived road elements may include road-related markings and signs, including at least one of the following: lane lines, stop lines, and turning lines on the road, as well as traffic signs, traffic lights, street lights, and the like installed beside or in front of the road. The types of the perceived road elements and the map road elements may be entirely the same or partially the same.
Ideally, the perceived road elements should coincide with the map road elements in the map. This coincidence may refer to the coincidence of the perceived road elements and the map road elements in the same coordinate system. However, due to positioning deviation of the vehicle positioning information obtained during positioning, or insufficient positioning accuracy, especially when the precision of the vehicle's positioning hardware is not high, the road information projected onto the vehicle-captured image may not completely coincide with the actual road information in the map image, so the road information projected onto the vehicle-captured image needs to be corrected.
In a possible implementation, a preliminarily trained neural network may be used to perform road information extraction processing on the vehicle-captured image to obtain the perceived road information.
The above neural network may be trained on road images carrying label information (which may be called sample road images); the road elements in a sample road image carry label information, and the label information may be attribute information of the sample road elements. For example, the attribute information of a sample road element may include, but is not limited to, one or more of the following: semantic information, shape information, position information, and so on. Training the neural network on sample road images gives the model the ability to recognize the attribute information of road elements in an input road image. For a map road image input into the neural network, the attribute information of the map road elements in the image can be output. The categories of road elements the neural network can recognize depend on the types of sample road elements used in training. The model can be trained with more types of sample road elements to give the neural network higher recognition capability.
S207. Correct the road information projected onto the vehicle-captured image according to the perceived road information.
If the perceived road elements do not completely coincide with the map road elements in the map, the road information projected onto the vehicle-captured image can be corrected according to the perceived road information.
In a possible implementation, S207 includes: determining offset information between the perceived road elements in the perceived road information and the map road elements in the road information projected onto the vehicle-captured image, and correcting the road information projected onto the vehicle-captured image according to the offset information. This will be described in detail in the following embodiments.
According to the information labeling method provided by the embodiments of the present disclosure, the rich road information contained in map data is used to project the road information of the map data onto the vehicle-collected image, enabling automatic labeling of road information in the vehicle-collected image. This improves the efficiency of labeling image data, helps reduce the error probability of data labeling, and reduces the labor cost of image data labeling. Furthermore, correcting the road information projected onto the vehicle-collected image according to the perceived road information improves the accuracy of the image data labeling.
Fig. 5 shows a method for determining offset information between a perceived road element and a map road element. As shown in Fig. 5, the method may include:
S301: Determine, according to the attribute information of the perceived road element, the map road element paired with the perceived road element from the map image.
In a possible implementation, for a vehicle-collected image captured in real time, if a map has been established for the road in advance, a perceived road element in the vehicle-collected image can be paired with a map road element on the map. That is, for a perceived road element, if it is neither a misrecognition nor newly appeared after the map was created or last updated, a corresponding map road element can usually be found on the map.
S302: Determine the position information of the paired perceived road element and map road element in the same device coordinate system.
Since positions can only be compared in the same coordinate system, if the obtained position information of the perceived road element and that of the map road element are not in the same coordinate system, the two need to be converted into the same coordinate system.
When the position information of the map road element is map position information in a latitude-longitude coordinate system, the map position information needs to be converted into the vehicle body coordinate system. The following description takes a GPS device coordinate system as an example.
In a possible implementation, the coordinate system conversion can be divided into two steps: first, the map position information is converted from the latitude-longitude coordinate system (for example, the WGS84 coordinate system) into the UTM coordinate system; then, using the GPS vehicle positioning information, the map road elements are converted from the UTM coordinate system into the GPS device coordinate system. For an autonomous vehicle, this step can be performed by first rotating by the angle θ between the vehicle heading and due east, and then translating by the GPS positioning information (x, y).
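As an illustrative sketch of the second step (not the patent's exact implementation), the following converts a UTM point into the GPS device (vehicle body) frame, assuming the vehicle pose (x, y, θ) is already available in UTM coordinates; the WGS84-to-UTM conversion would typically be handled by a map projection library and is omitted here.

```python
import math

def utm_to_vehicle(px, py, vx, vy, theta):
    """Convert a point (px, py) in UTM coordinates into the vehicle
    (device) frame, given the vehicle position (vx, vy) in UTM and its
    heading theta (radians, measured counter-clockwise from due east).

    The point is first translated so the vehicle is at the origin, then
    rotated by -theta so the vehicle's heading becomes the +x axis.
    """
    dx, dy = px - vx, py - vy
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # rotation by -theta
    return (cos_t * dx + sin_t * dy, -sin_t * dx + cos_t * dy)
```

For example, with the vehicle at the UTM origin heading due north (θ = π/2), a point 10 m north of the vehicle lands at (10, 0) in the device frame, i.e., directly ahead.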
Those skilled in the art should understand that, for other position sensors, the conversion from the latitude-longitude coordinate system to the vehicle body coordinate system can be performed according to the corresponding conversion rules.
S303: Determine a positioning offset between the paired perceived road element and map road element based on the position information.
After the position information of the perceived road element and its paired map road element in the same device coordinate system is obtained, the positioning offset between them can be determined based on the two positions.
In a possible implementation, the paired perceived road element and map road element are converted into the same device coordinate system, and their position information is then used to determine the positioning offset between them.
Fig. 6 shows a method for determining, from the map image, the map road element paired with a perceived road element. As shown in Fig. 6, the method may include:
S401: In the map image, search for map road elements within a preset range based on the vehicle positioning information.
Here, the vehicle positioning information is the position information of the vehicle itself; for an autonomous vehicle, it is the information on where the vehicle is located. With the vehicle positioning information, the position of the vehicle in the map image can be determined, so that map road elements within a set range — that is, map road elements near the vehicle — can be found in the map image.
Since the vehicle-collected image is obtained by a visual sensor mounted on the vehicle, the perceived road elements in the vehicle-collected image are road elements located near the vehicle during positioning. Therefore, searching the map for map road elements near the vehicle is the most likely, and fastest, way to find the map road elements that pair with the perceived road elements.
The preset range can be set according to requirements. For example, if high matching accuracy is required, the range can be set relatively large, so that more map road elements are obtained for pairing with the perceived road elements in the subsequent process; if real-time performance matters and faster matching is desired, the range can be set relatively small. For example, the preset range may be a region on the map centered on the vehicle positioning information whose side length is 2 to 5 times the sum of the visual sensor's visible range and the initial positioning error, thereby trading off matching speed against accuracy.
For example, if the visible range of the visual sensor is 60 m and the initial positioning error is 10 m, the preset range can be set to (60+10)*2. That is, in this case, the preset range can be a 140 m × 140 m rectangular box centered on the vehicle position.
S402: Pair the perceived road elements in the vehicle-collected image with the map road elements within the preset range based on attribute information, obtaining multiple pairing schemes.
In a possible implementation, each perceived road element in the vehicle-collected image can be paired, by enumeration, with each map road element within the preset range, obtaining multiple different pairing schemes.
The different pairing schemes above may differ in how at least one perceived road element is paired with the map road elements within the preset range. For example, suppose the perceived road elements in the vehicle-collected image include a1, a2, …, aM, and the map road elements within the preset range include b1, b2, …, bN, where M and N are both positive integers and N is greater than or equal to M. That is, the number of map road elements is greater than, or at least equal to, the number of perceived road elements. Pairing the perceived road elements (a1, a2, …, aM) pairwise with the map road elements (b1, b2, …, bN), each resulting pairing scheme is a set of two-tuples, where each two-tuple (ai, bj) is one pairing of road elements. In a two-tuple (ai, bj), i ≤ M and i can be any integer in [1, M]; j ≤ N and j can be any integer in [1, N]. Moreover, in a pairing scheme, all of the perceived road elements (a1, a2, …, aM) must be paired, while the map road elements (b1, b2, …, bN) may contain elements for which no pairing target is found. Between different pairing schemes, at least one two-tuple (ai, bj) differs.
In a possible implementation, the pairwise pairing of perceived road elements and map road elements can be realized through a bipartite graph model. This includes constructing a bipartite graph model based on the perceived road elements and map road elements: in the vehicle-collected image, each perceived road element is abstracted as a point, and all perceived road elements form a perception point set; each map road element in the map is likewise abstracted as a point, and all map road elements form a map point set. When multiple road elements with the same semantics exist in the vehicle-collected image — for example, multiple lane lines — the perceived road elements with the same semantics can be sorted in order from the vehicle's left to its right; the map road elements with the same semantics in the map are sorted in a similar way, and the points in the corresponding point sets are arranged in the order of the road elements. The perception point set and the map point set are then connected by edges, where each edge represents a pairing relationship between one perceived road element and one map road element. Different ways of connecting produce different pairing schemes, and each resulting pairing scheme is an edge set.
In a possible implementation, a bipartite graph matching method based on the above model can also be used to select reasonable pairing schemes from among all the pairing schemes. The method includes: among all edge sets, selecting edge sets in which as many edges as possible are non-crossing. Non-crossing here means that the two edges share no common point and that both endpoint indices of one edge within the point sets are greater than the corresponding endpoint indices of the other edge, so it can also be understood as non-crossing in the physical sense. An edge set in which the number of non-crossing edges exceeds a set proportion or a set threshold can be called a reasonable edge set, i.e., a reasonable pairing scheme is obtained, as shown for example in Fig. 7. By filtering out reasonable pairing schemes before computing the confidence, the amount of computation in the subsequent process is reduced.
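The enumeration of fully non-crossing pairing schemes can be sketched as follows. This is an illustrative implementation under the assumption that both point sets are already sorted left-to-right, so a scheme whose chosen map indices are strictly increasing contains no crossing edges; the function name is hypothetical, not from the patent.

```python
from itertools import combinations

def noncrossing_schemes(perceived, map_elems):
    """Enumerate pairing schemes in which no two edges cross.

    Because both point sets are sorted left-to-right, an edge set is
    non-crossing exactly when the chosen map indices are increasing,
    so each scheme corresponds to choosing len(perceived) map elements
    in order.  Every perceived element must be paired; some map
    elements may remain unpaired.
    """
    m, n = len(perceived), len(map_elems)
    schemes = []
    for idxs in combinations(range(n), m):
        schemes.append([(perceived[i], map_elems[j])
                        for i, j in enumerate(idxs)])
    return schemes
```

With M = 2 perceived lane lines and N = 3 map lane lines this yields C(3, 2) = 3 candidate schemes, each of which would then be scored by the confidence computation described below.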
S403: Determine the confidence of each pairing scheme.
For a pairing scheme, the confidence is an evaluation index of how well the perceived road elements pair with the map road elements. Within a pairing scheme, the higher the consistency of semantic information between each paired perceived road element and map road element, and the greater the number of matched pairs, the higher the confidence of the pairing scheme.
In a possible implementation, the confidence of each pairing scheme can be determined as follows:
A. For each pairing scheme, determine the individual similarity of each pairing of a perceived road element and a map road element. The individual similarity may refer to, for each two-tuple in the pairing scheme, the degree of similarity between the attribute information of its two elements. For example, it may include the similarity of semantic information, the similarity of position information, the similarity of shape information, and so on. Taking lane lines as an example, the individual similarity between a perceived lane line and a map lane line can be calculated by the following formula (1), where a perceived lane line refers to a lane line in the vehicle-collected image and a map lane line refers to a lane line in the map.
Weight(i,j) = -Distance(i,j) + O_type(i,j) * LaneWidth + O_edgetype(i,j) * LaneWidth    (1);
Here, Weight(i,j) denotes the individual similarity, also called the weight, of the edge between the i-th perceived lane line and the j-th map lane line (counting from left to right, likewise below). Distance(i,j) denotes the distance between the i-th perceived lane line and the j-th map lane line; the lane lines are abstracted as line segments, and the distance may be computed as the segment-to-segment Euclidean distance, i.e., the mean of the distances from the two endpoints of one segment to the other segment. LaneWidth denotes the lane width, i.e., the width between two lane lines. O_type(i,j) is 1 if and only if the i-th perceived lane line and the j-th map lane line have the same lane line attributes, and 0 otherwise; the lane line attributes may include lane line color, line type, and so on, for example a solid yellow line or a dashed white line. O_edgetype(i,j) is 1 if and only if the i-th perceived lane line and the j-th map lane line have the same edge lane line attribute, and 0 otherwise; the edge lane line attribute indicates whether the lane line belongs to the edge of the road. In the above formula, Distance(i,j) is used to compute the position-information similarity between the perceived lane line and the map lane line, LaneWidth is used to compute the shape-information similarity between them, and O_type(i,j) and O_edgetype(i,j) are used to compute the semantic-information similarity between them. Those skilled in the art should understand that other reasonable formulas can be devised for the individual similarity between other types of road elements.
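A minimal sketch of formula (1), assuming lane lines are abstracted as 2-D line segments; the LaneLine container and helper names are illustrative assumptions, not from the patent.

```python
import math
from dataclasses import dataclass

@dataclass
class LaneLine:
    p0: tuple          # segment endpoints (x, y)
    p1: tuple
    line_type: str     # e.g. "yellow_solid", "white_dashed"
    is_road_edge: bool

def point_to_segment(p, a, b):
    """Euclidean distance from point p to segment ab."""
    ax, ay = a; bx, by = b; px, py = p
    vx, vy = bx - ax, by - ay
    seg_len2 = vx * vx + vy * vy
    t = 0.0 if seg_len2 == 0 else max(
        0.0, min(1.0, ((px - ax) * vx + (py - ay) * vy) / seg_len2))
    cx, cy = ax + t * vx, ay + t * vy
    return math.hypot(px - cx, py - cy)

def weight(perceived, mapped, lane_width=3.5):
    """Individual similarity of formula (1): higher means a better match."""
    # mean of the two endpoint-to-segment distances
    dist = 0.5 * (point_to_segment(perceived.p0, mapped.p0, mapped.p1)
                  + point_to_segment(perceived.p1, mapped.p0, mapped.p1))
    o_type = 1 if perceived.line_type == mapped.line_type else 0
    o_edge = 1 if perceived.is_road_edge == mapped.is_road_edge else 0
    return -dist + o_type * lane_width + o_edge * lane_width
```

For two identical lane lines the weight is simply 2 × LaneWidth; each meter of lateral offset subtracts one unit from the score.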
B. After the individual similarities are determined, determine the overall similarity of the pairings of perceived road elements and map road elements in each pairing scheme.
The overall similarity may be an overall evaluation of how similar the attribute information of all the two-tuples in a pairing scheme is, where the attribute information may include position information and semantic information. The overall similarity of the position information can be represented by the variance of the distances between the two elements of every two-tuple: the smaller the variance, the closer the distances within all two-tuples are to one another, and the higher the overall similarity of the position information. The overall similarity of the semantic information can be obtained by averaging, or taking a weighted average of, the semantic-information similarities of the two elements of all two-tuples.
C. Determine the confidence of each pairing scheme according to its individual similarities and overall similarity. For example, within each pairing scheme, the sum of the individual similarities of the two-tuples can be averaged with the overall similarity, or combined by a weighted average, to obtain the confidence of the pairing scheme.
In this embodiment, the confidence of a pairing scheme is evaluated comprehensively based on both the individual similarities and the overall similarity of its two-tuples, which prevents an extreme individual pairing (extremely good or extremely poor) from dominating the confidence of the whole scheme and makes the computed confidence more reliable.
Formula (2) is an example of a function for computing the confidence score of a pairing scheme. It computes the score from three parts: the sum of the individual similarities, the overall similarity of the distance information, and the overall similarity of the semantic information.
match_weight_sum = sum(match_items_[pr_idx][hdm_idx].weight) + CalculateVarianceOfMatchResult(match_result) + CalculateMMConfidence(match_result)    (2);
where match_weight_sum denotes the confidence score of a pairing scheme;
sum(match_items_[pr_idx][hdm_idx].weight) denotes the sum of the individual similarities of the two-tuples in the pairing scheme, computed by summing the weights of the edges selected in the pairing scheme, i.e., summing the edge weight corresponding to each pair of points;
CalculateVarianceOfMatchResult(match_result) denotes the overall similarity of the distance information of the two-tuples in the pairing scheme, computed from the variance of the distances between the two elements of each two-tuple. Taking lane lines as an example, there is a distance between each pair of matched lane lines, and the variance is that of all these distances. In theory, the distances between all paired perceived lane lines and map lane lines should be equal, i.e., the variance should be zero; in practice, because errors are inevitably introduced, the variance may be non-zero;
CalculateMMConfidence(match_result) denotes the overall similarity of the semantic information of the two-tuples in the pairing scheme, computed by comparing the semantic similarity between the two elements of each two-tuple. Again taking lane lines as an example, it can be checked whether the attributes of all paired lane lines are consistent and whether their numbers match. For example, the confidence is 100% when all attributes are consistent; for every pair of lane lines whose attributes are inconsistent, the confidence may be reduced by 10%, and a number mismatch may directly reduce the confidence by 30%.
The confidence score of the pairing scheme is obtained by computing the three parts above and adding the results together.
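A hedged sketch of formula (2) with simplified stand-ins for its three terms. The variance term is negated here so that, consistent with the text, a smaller variance yields a higher score, and the semantic term follows the 10%-per-mismatch example above; the patent does not fix these exact choices.

```python
def variance(xs):
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)

def confidence_score(pairs):
    """Confidence score in the spirit of formula (2).

    pairs: list of dicts with 'weight' (individual similarity),
    'distance' (between the paired elements), and 'type_match'
    (whether their semantic attributes agree) keys.
    """
    weight_sum = sum(p["weight"] for p in pairs)
    # distance term: paired distances should be nearly equal, so penalize variance
    dist_term = -variance([p["distance"] for p in pairs])
    # semantic term: start at 1.0, subtract 0.1 per attribute mismatch
    semantic = 1.0 - 0.1 * sum(1 for p in pairs if not p["type_match"])
    return weight_sum + dist_term + semantic
```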
S404: Determine the map road element paired with the perceived road element from the pairing scheme whose confidence is the highest among the multiple pairing schemes, or whose confidence exceeds a set threshold.
In a possible implementation, the scheme with the highest confidence can be taken as the finally selected pairing scheme, or a pairing scheme whose confidence exceeds the set threshold can be taken as the finally selected pairing scheme, so that the map road element paired with the perceived road element can be determined.
In this example, the vehicle positioning information is used to obtain, from the map, the map road elements near the device for pairing with the perceived road elements. Compared with searching the whole map for the map road elements that pair with the perceived road elements, this reduces the amount of computation, increases the matching speed, and facilitates real-time positioning.
In a possible implementation, in the process of pairing the perceived road elements in the vehicle-collected image with the map road elements within the preset range, when no paired road element can be determined for a perceived road element among the map road elements within the preset range, a null or virtual element is added among the map road elements to be paired, to pair with that perceived road element.
Ideally, the perceived road elements in the vehicle-collected image correspond one-to-one with the map road elements in the map. However, when a perceived road element is the result of misrecognition, or appeared only after the map was created, no map road element corresponding to that perceived road element can be found. By setting null or virtual elements, every perceived road element has a pairing target during the determination of pairing schemes, which enriches the pairing schemes and facilitates a comprehensive evaluation of the best pairing scheme.
Fig. 8 shows a method for determining the positioning offset between a paired perceived road element and map road element. As shown in Fig. 8, the method includes:
S501: Sample the pixel points of the perceived road element to obtain a perception sampling point set.
In a possible implementation, the pixel points of the perceived road element can be sampled at a fixed interval (for example, 0.1 m) to obtain the perception sampling point set.
Taking lane lines on the road as an example, by sampling a lane line, a perceived lane line can be abstracted as a point set. When multiple lane lines run in parallel, the lane lines can be arranged in order from the vehicle's left to its right, and the corresponding point sets are arranged from top to bottom according to the order of the lane lines.
S502: Sample the pixel points of the map road element to obtain a map sampling point set.
In a possible implementation, the map road elements can be sampled in a manner similar to step S501 to obtain the map sampling point set.
S503: Determine a rotation-translation matrix between the sampling points of the perception sampling point set and those of the map sampling point set.
In a possible implementation, for a paired perception sampling point set and map sampling point set, the iterative closest point method can be applied to compute the rotation-translation matrix between the two point sets. Fig. 9 shows a schematic diagram of the iterative closest point method: the left side of the arrow represents the two associated (paired) point sets input to the algorithm model; applying the algorithm model, which may for example be a least-squares model, yields the rotation-translation matrix. By applying this rotation-translation matrix to the input point set, the two point sets can be brought into coincidence, as shown on the right side of the arrow in Fig. 9.
S504: Obtain the coordinate offset and direction offset between the perceived road element and the map road element based on the rotation-translation matrix.
The rotation-translation matrix obtained in step S503 is the positioning offset to be determined: the translation coefficients of the rotation-translation matrix correspond to the coordinate offset, and the rotation coefficient corresponds to the direction offset.
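The single least-squares alignment step that the iterative closest point method repeats can be sketched as follows for already-corresponded 2-D point sets; this is an illustrative implementation, with the returned (dθ, dx, dy) playing the role of the direction offset and coordinate offset read off the rotation-translation matrix.

```python
import math

def rigid_fit_2d(src, dst):
    """Least-squares rigid transform aligning src onto dst.

    src, dst: equal-length lists of (x, y) with known correspondences.
    Returns (dtheta, dx, dy) such that rotating src by dtheta about the
    origin and then translating by (dx, dy) best matches dst.
    """
    n = len(src)
    csx = sum(p[0] for p in src) / n; csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n; cdy = sum(p[1] for p in dst) / n
    # accumulate the 2x2 cross-covariance of the centred point sets
    sxx = sxy = syx = syy = 0.0
    for (x1, y1), (x2, y2) in zip(src, dst):
        ax, ay = x1 - csx, y1 - csy
        bx, by = x2 - cdx, y2 - cdy
        sxx += ax * bx; sxy += ax * by
        syx += ay * bx; syy += ay * by
    # closed-form optimal 2-D rotation, then the matching translation
    dtheta = math.atan2(sxy - syx, sxx + syy)
    c, s = math.cos(dtheta), math.sin(dtheta)
    dx = cdx - (c * csx - s * csy)
    dy = cdy - (s * csx + c * csy)
    return dtheta, dx, dy
```

A full ICP loop would alternate this fit with re-estimating nearest-neighbor correspondences; here the correspondences are taken as given by the pairing step.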
In a possible implementation, the vehicle positioning information can be expressed as (x0, y0, θ0), and the positioning offset as (dx, dy, dθ). Correspondingly, the positioning information obtained by correcting the vehicle positioning information can be given by formula (3).
(x = x0 + dx, y = y0 + dy, θ = θ0 + dθ)    (3);
For example, the obtained positioning information can be fused with the vehicle positioning information by methods such as Kalman filtering, mean calculation, or weighted averaging, which prevents the map information from over-correcting the positioning information and makes the image data labeling more reliable.
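A minimal sketch of formula (3) followed by a weighted-average fusion with the original pose; the blend factor alpha is an illustrative assumption, as the patent leaves the fusion weights (or Kalman gains) unspecified.

```python
def fuse_pose(pose, offset, alpha=0.5):
    """Apply formula (3) and blend the corrected pose with the original.

    pose:   (x0, y0, theta0) from the positioning sensor
    offset: (dx, dy, dtheta) estimated from map/perception alignment
    alpha:  weight of the corrected pose in the weighted average
    """
    corrected = tuple(p + d for p, d in zip(pose, offset))
    return tuple(alpha * c + (1 - alpha) * p
                 for c, p in zip(corrected, pose))
```

With alpha = 1 the map correction is applied in full; smaller values of alpha keep the fused pose closer to the raw sensor pose, which is the over-correction safeguard mentioned above.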
In the following, an exemplary application of the embodiments of the present disclosure in a practical application scenario will be described.
The image data automatic labeling method provided by the embodiments of the present disclosure can complete fully automatic labeling of the positions and attributes of static road elements, such as lane lines, stop lines, and signs, in road images. The automatic labeling algorithm is based on a high-precision map, which contains rich road elements and has centimeter-level accuracy. High-precision maps are one of the basic modules of autonomous driving, and they already have wide application and mature acquisition schemes. The labeling accuracy of this method depends only on the high-precision map: if the high-precision map is sufficiently trustworthy, the labeling algorithm can reach a sufficiently high accuracy. The method is a by-product of the autonomous driving system and requires no additional cost. Its main principle is to obtain high-precision map information near the vehicle through high-precision positioning information, and to project the map elements onto the road image using the on-board camera parameters, thereby obtaining the corresponding road element positions and semantic information. In addition, the embodiments of the present disclosure provide a high-precision map building scheme that uses a low-cost high-precision positioning scheme to assist in completing the automatic labeling of image data.
In a possible implementation, the image data automatic labeling method is a by-product of the autonomous driving system, built on the system's existing high-precision map, high-precision positioning scheme, on-board camera calibration scheme, and an image lane line detection deep learning model used to assist positioning and to check the labeling results.
In a possible implementation, the image data automatic labeling method first obtains the map information near the vehicle from the high-precision map according to the high-precision positioning scheme, then projects the map road elements into the road image according to the on-board camera calibration parameters, and then calibrates the projection function by comparing the offset between the lane lines extracted by the image lane line detection deep learning model and the projected lane lines, finally obtaining image road element labeling results of high accuracy and precision.
In a possible implementation, the high-precision map used by the image data automatic labeling method can be obtained by simple processing of the laser point cloud data acquired by an autonomous driving data collection vehicle. Generally, the point clouds of road elements such as lane lines and stop lines can be obtained by filtering the laser point cloud by reflectivity, and then template matching, clustering, and fitting are used to finally obtain a high-precision map containing rich road elements.
In a possible implementation, the image data automatic labeling method provided by the present disclosure includes three parts: a map query module, a map information projection module, and a projection error correction module. First, the map query module, based on the high-precision map and high-precision positioning information, comprehensively uses the on-board GPS positioning device, the on-board high-precision inertial navigation device, and vision-based positioning correction information to obtain a positioning result of at least decimeter-level accuracy, and then queries the high-precision map for road information within a 100 m radius of the vehicle's location, including lane line and stop line positions and attribute information. Second, the map information projection module supports two projection methods: the first is based on 2D (2-Dimension) map information and a pre-calibrated camera homography matrix; the second is based on 3D (3-Dimension) map information and pre-calibrated camera intrinsic and extrinsic parameters. Both projection methods are essentially spatial transformations of geometric data, except that one is an affine transformation from 2D space to 2D space, and the other is a projective transformation from 3D space to 2D space. Finally, the projection error correction module uses a pre-prepared lane line detection deep learning model to extract the lane line positions and attribute information in the image, then minimizes the error between the extracted lane lines and the projected lane lines to optimize the projection function, obtaining optimized labeling information for the positions and attributes of road elements such as lane lines and stop lines.
The input of the map query module is the high-precision map and the high-precision positioning information, and its output is the local map information near the positioning location.
In a possible implementation, the embodiments of the present disclosure are based on three coordinate systems: the world global coordinate system (including the WGS84 latitude-longitude coordinate system and the ECEF Earth-centered coordinate system), the vehicle body coordinate system, and the camera image pixel coordinate system; the three coordinate systems are shown in Figure 4A, Figure 4B, and Figure 4C. The high-precision map uses the world global coordinate system, in which every point has unique coordinates on the Earth, such as latitude and longitude. Because the WGS84 latitude-longitude coordinate system represents point coordinates in radians and is inconvenient to use, the ECEF Earth-centered coordinate system is used whenever coordinate system conversion is involved. The vehicle body coordinate system is also a right-handed Cartesian coordinate system, and the camera image pixel coordinate system is a two-dimensional rectangular coordinate system in units of pixels.
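As a concrete illustration of the WGS84-to-ECEF conversion mentioned above, the following sketch uses the standard WGS84 ellipsoid constants; the function name and interface are illustrative, not part of the disclosed embodiments:

```python
import math

def wgs84_to_ecef(lat_deg, lon_deg, h=0.0):
    # Standard WGS84 ellipsoid constants.
    a = 6378137.0                  # semi-major axis, metres
    f = 1.0 / 298.257223563        # flattening
    e2 = f * (2.0 - f)             # first eccentricity squared
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    # Prime vertical radius of curvature at this latitude.
    N = a / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (N + h) * math.cos(lat) * math.cos(lon)
    y = (N + h) * math.cos(lat) * math.sin(lon)
    z = (N * (1.0 - e2) + h) * math.sin(lat)
    return x, y, z
```

For example, a point on the equator at the prime meridian lands on the x-axis at one Earth semi-major axis from the origin.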
In a possible implementation, after the unique positioning information of the vehicle in the world global coordinate system is obtained, only the local map information within a certain range in front of the vehicle (related to the camera's field of view) is of interest, so the road element information of the corresponding area needs to be extracted.
In a possible implementation, to improve map query efficiency, road elements can be stored in a tree-shaped hierarchy: the map root node contains several rectangular regions, each rectangular region has a center point representing its location, and each region contains Road nodes as basic road nodes, with each Road node internally storing the positions and attribute information of lane lines and stop lines.
In a possible implementation, the query is a recursive process: first find the region closest to the positioning location, and then find in turn the Road closest to the positioning location and the corresponding lane line and stop line information. To respond efficiently to large batches of queries, a KD-tree stores the node coordinates of each map layer in order, which accelerates the query process.
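The tree-shaped storage and recursive query described above can be sketched as follows. This is a minimal illustration: a brute-force nearest lookup stands in for the per-layer KD-tree (in practice each layer's centers would go into e.g. `scipy.spatial.cKDTree` to speed up batch queries), and all names (`MapNode`, `query_map`) are hypothetical:

```python
class MapNode:
    # One node of the hierarchy: a representative 2D center and either
    # children (root -> regions -> Roads) or a payload (lane/stop-line data).
    def __init__(self, name, center, children=None, payload=None):
        self.name, self.center = name, center
        self.children, self.payload = children or [], payload

def nearest(nodes, query):
    # Brute-force nearest lookup by squared Euclidean distance to the center.
    return min(nodes, key=lambda n: (n.center[0] - query[0]) ** 2
                                    + (n.center[1] - query[1]) ** 2)

def query_map(root, position):
    # Recursive descent: nearest region, then nearest Road, then its records.
    region = nearest(root.children, position)
    road = nearest(region.children, position)
    return road.payload
```

A query then descends two levels and returns the stored lane line and stop line records of the closest Road.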
In a possible implementation, after the lane line and stop line information near the positioning location is obtained, these road elements need to be converted from the world global coordinate system to the vehicle body local coordinate system. Conversion between two right-handed Cartesian coordinate systems requires only one rotation-translation matrix, whose rotation angle and translation amount are obtained from the positioning information.
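In the planar case, the world-to-vehicle conversion described above reduces to a single 2D rotation-translation built from the positioning result. A minimal sketch (the axis convention, with the vehicle heading along +x, is an illustrative assumption):

```python
import numpy as np

def world_to_vehicle(points_world, vehicle_pos, yaw):
    # Translate world points so the vehicle is at the origin, then rotate by
    # -yaw so the x-axis points along the vehicle heading. vehicle_pos and
    # yaw come from the positioning information.
    c, s = np.cos(-yaw), np.sin(-yaw)
    R = np.array([[c, -s],
                  [s,  c]])
    return (points_world - vehicle_pos) @ R.T
```

For example, with the vehicle at (10, 5) facing the world +y direction (yaw = π/2), a world point one metre ahead of the vehicle maps to (1, 0) in the body frame.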
In a possible implementation, the map road elements finally need to be filtered so that only the lane lines and stop lines within the camera's field of view are retained. If needed, road element information occluded within the field of view can also be filtered out here (for example, nearby objects occlude distant ones), but this step is not mandatory.
The input of the map information projection module is the local map information near the positioning location, and its output is the map road elements in the camera image pixel coordinate system.
In a possible implementation, the embodiments of the present disclosure provide two methods for converting the vehicle body coordinate system to the camera image pixel coordinate system, suited to 2D maps and 3D maps respectively.
If a 2D map is used, the height information in the map has low precision, so a homography matrix transformation can be used to convert the vehicle body coordinate system to the camera image pixel coordinate system. The homography matrix between the camera image pixel coordinate system and the vehicle body coordinate system is pre-calibrated from manually labeled calibration data (this matrix is a 3*3 matrix with 8 degrees of freedom, and maps one plane onto another). Then the map element information only needs to be expressed in homogeneous coordinates, and the coordinates of each map element are multiplied by the homography matrix to obtain the map road elements in the camera image pixel coordinate system.
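The homogeneous-coordinate multiplication described above can be sketched as follows, assuming a pre-calibrated 3x3 homography `H` is given; the function name is illustrative:

```python
import numpy as np

def project_with_homography(H, points_xy):
    # points_xy: N x 2 map element coordinates in the vehicle body ground plane.
    pts = np.hstack([points_xy, np.ones((len(points_xy), 1))])  # homogeneous
    proj = pts @ H.T                     # apply the 3x3 homography
    return proj[:, :2] / proj[:, 2:3]    # divide out w -> pixel coordinates
```

The perspective divide in the last line is what distinguishes a general plane-to-plane homography from a plain linear map.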
If a 3D map is used, the camera intrinsic and extrinsic parameters can be used directly to complete the coordinate system conversion. The camera imaging principle is pinhole imaging: the camera intrinsic parameters refer to the focal length of the camera lens and the coordinates of the optical center in the pixel coordinate system; the camera extrinsic parameters are the rotation-translation matrix between the camera coordinate system and the vehicle body coordinate system. The camera coordinate system is the right-handed Cartesian coordinate system whose origin is the camera optical center, with the upward and forward directions of the camera being the positive y-axis and z-axis respectively. After the camera intrinsic and extrinsic parameters are pre-calibrated from manually labeled calibration data, the map road elements are first rotated and translated into the camera coordinate system through the camera extrinsic parameters, and then projected onto the camera image pixel coordinate system according to the pinhole scaling principle and the camera intrinsic parameters.
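The two-stage 3D projection described above (extrinsics first, then intrinsics) can be sketched as follows; the interface is illustrative and assumes calibrated `K`, `R`, `t` are given:

```python
import numpy as np

def project_pinhole(K, R, t, points_body):
    # Extrinsics (R, t): rotate/translate vehicle-body points into the
    # camera frame. Intrinsics K (3x3): pinhole scaling into pixels.
    cam = points_body @ R.T + t
    uv = cam @ K.T
    return uv[:, :2] / uv[:, 2:3]        # perspective divide by depth
```

For example, with focal length 500 px and optical center (320, 240), a point at unit lateral offsets and depth 10 projects to pixel (370, 290).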
Figures 3A, 3B, and 3C illustrate map elements projected onto the pixel coordinate system.
The input of the projection error correction module is the map road elements in the pixel coordinate system together with the perceived lane line information extracted by the deep learning detection and segmentation model, and its output is the corrected image labels.
In a possible implementation, due to objective factors such as positioning error, map error, and camera intrinsic and extrinsic calibration error, the map elements projected into the pixel coordinate system do not necessarily coincide exactly with the real information in the image, so projection error correction is an extremely important step.
In a possible implementation, an existing deep learning detection and segmentation model is used to extract all lane line information in the image, and these lane lines are regarded as perceived lane lines. Perceived lane lines are characterized by highly accurate position information, but their attribute information, quantity, and completeness contain certain errors. The method mainly uses the offset between the perceived lane lines and the map lane lines to correct the projection function. Correcting the projection function is divided into two steps: the first step is to find the correspondence between map lane lines and perceived lane lines; the second step is to minimize the distance between corresponding lane lines. Lane lines have a good total order: they are generally arranged from left to right. Both map lane lines and perceived lane lines are abstracted as points, yielding a perception point set and a map point set. By connecting points in the perception point set with points in the map point set (i.e., adding edges), while points within the perception point set are not connected to each other and points within the map point set are not connected to each other, a bipartite graph model is obtained. The lane line matching problem can thus be transformed into a bipartite graph matching problem, where each edge of the bipartite graph represents a pairing between one perceived lane line and one map lane line; the bipartite graph model is shown in Figure 7. Weights are then assigned to the edges of the bipartite graph; a weight can equal the similarity between the two lane lines plus the negative of their distance. The similarity can be quantified by the similarity of position information, the similarity of shape information, and whether the lane line attributes match. The distance between two lane lines can be converted into the average distance from one point set to the other curve. After the weighted bipartite graph is obtained, the goal is a bipartite graph matching search: find a disjoint edge set whose sum of edge weights is the largest. Disjoint means that no two edges share an endpoint; the maximum sum of edge weights indicates the optimal lane line matching scheme, i.e., the scheme under which the map lane lines and perceived lane lines are most similar. After the matching scheme is obtained, the problem becomes minimizing the distance between corresponding lane lines. A lane line can be expressed as a curve, and sampling points on the curve yields a point set; the final problem becomes minimizing the distance from one point set to another, which can be solved by the Iterative Closest Point (ICP) algorithm.
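The bipartite matching step described above can be illustrated with a toy edge weight and an exhaustive search over assignments. This is a stand-in for a proper maximum-weight bipartite matching algorithm (e.g. the Hungarian method, as in `scipy.optimize.linear_sum_assignment`): the weight here uses only positional distance, whereas the disclosure also weighs shape and attribute similarity, and exhaustive search is acceptable only for the handful of lanes visible in one image:

```python
import itertools

def edge_weight(perceived, mapped):
    # Toy weight: negative average x-distance between two lane lines, each
    # given as a list of (x, y) sample points at matching y values.
    return -sum(abs(p[0] - m[0]) for p, m in zip(perceived, mapped)) / len(perceived)

def best_matching(perc_lanes, map_lanes):
    # Search every assignment of perceived lanes to distinct map lanes and
    # keep the one with the largest total edge weight (the "disjoint edge
    # set with maximum weight sum" of the bipartite model).
    best, best_score = None, float("-inf")
    for perm in itertools.permutations(range(len(map_lanes)), len(perc_lanes)):
        score = sum(edge_weight(perc_lanes[i], map_lanes[j])
                    for i, j in enumerate(perm))
        if score > best_score:
            best, best_score = list(enumerate(perm)), score
    return best
```

With two perceived lanes near x = 0 and x = 3 and two map lanes near x = 3.1 and x = 0.2, the search correctly crosses the pairing rather than matching by index.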
Referring to Figure 9, the steps of the ICP algorithm include: (1) pair the points of the two input point sets according to the closest-point pairing principle; (2) substitute the point coordinates into the least-squares formula to find the rotation-translation matrix that minimizes the sum of distances between paired points after the points of one point set are rotated and translated; (3) solve for the rotation-translation matrix using singular value decomposition. This rotation-translation matrix is the optimal solution of the optimization problem, and through it the two point sets can be brought into coincidence (i.e., the above perception point set and map point set are aligned). The ICP algorithm outputs a correction amount; by adding this correction amount to all map elements, map road element information fully consistent with the road image is obtained. This information includes position and attribute information, i.e., the labeling information of the image.
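A minimal sketch of the ICP loop just described, with the SVD-based closed-form solution of steps (2)-(3) (the Kabsch procedure); function names and the 2D setting are illustrative:

```python
import numpy as np

def best_rigid_transform(src, dst):
    # Least-squares rotation R and translation t mapping src onto dst,
    # solved in closed form via singular value decomposition (step 2-3).
    cs, cd = src.mean(0), dst.mean(0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cd - R @ cs
    return R, t

def icp(src, dst, iters=20):
    # Each iteration: pair every source point with its nearest target point
    # (step 1), then apply the closed-form rigid correction.
    cur = src.copy()
    for _ in range(iters):
        nn = np.argmin(((cur[:, None] - dst[None]) ** 2).sum(-1), axis=1)
        R, t = best_rigid_transform(cur, dst[nn])
        cur = cur @ R.T + t
    return cur
```

When the initial misalignment is small enough that nearest-neighbor pairing is already correct, a single iteration aligns the point sets exactly; in the labeling pipeline the accumulated rigid correction would then be applied to all projected map elements.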
According to the image data automatic labeling method provided by the embodiments of the present disclosure, since the method is attached to the autonomous driving system, no additional cost is required; automatic labeling is achieved while maintaining high accuracy and position precision; and the method is suited to large-scale, fully automatic batch production. The method can be deployed on autonomous vehicles to continuously and automatically obtain free labeled data and build large-scale data sets for deep learning research or applied model training. The automatic labeling algorithm can also obtain and classify labeled data from different weather conditions, time periods, and regions, and the classified data can be used for model training related to style transfer. In essence, the method projects map information onto the image to complete image labeling for use in training deep learning models; conversely, the road element information recognized by a deep learning model can be projected to global coordinates for automated map building.
In a possible implementation, the execution subject of the image data automatic labeling method may be an image processing apparatus. For example, the method may be executed by a terminal device, a server, or other processing equipment, where the terminal device may be user equipment (User Equipment, UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (Personal Digital Assistant, PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, etc. In some possible implementations, the method may be implemented by a processor invoking computer-readable instructions stored in a memory.
Based on the same concept as the image data automatic labeling method in the above embodiments, as shown in FIG. 10, an embodiment of the present disclosure further provides an image data automatic labeling apparatus 1000, which can be applied to the methods shown in FIG. 1, FIG. 2, FIG. 5, FIG. 6, and FIG. 8. The apparatus 1000 includes: a first acquisition part 11 configured to acquire vehicle positioning information, a map image, and a vehicle-collected image, where the map image includes road information; a second acquisition part 12 configured to acquire, according to the vehicle positioning information, the road information on the map image within the local area corresponding to the vehicle positioning information; and a projection part 13 configured to project the road information on the map image onto the vehicle-collected image, so as to label the road information on the vehicle-collected image.
In a possible implementation, the second acquisition part 12 is configured to take the map image in the local area as a root node and sequentially query the attribute information of the map road elements of the map image in the local area, where the attribute information of a map road element includes at least one of the following: semantic information of the map road element, position information of the map road element, and shape information of the map road element.
In a possible implementation, the apparatus 1000 further includes a first determination part 14 configured to determine the range of the local area according to the vehicle positioning information and the range of the map image; the second acquisition part 12 is configured to acquire the attribute information of map road elements on the map image within the range of the local area.
In a possible implementation, the map image is based on the world global coordinate system, and the apparatus further includes a first conversion part 15 configured to convert the map image based on the world global coordinate system to the vehicle body coordinate system, to obtain a map image based on the vehicle body coordinate system; the projection part 13 is configured to convert the map image based on the vehicle body coordinate system to the camera coordinate system and/or the pixel coordinate system, so as to project the road information on the map image onto the vehicle-collected image.
In a possible implementation, the first conversion part 15 is configured to obtain the rotation angle and translation amount of a rotation-translation matrix according to the vehicle positioning information, and, according to the rotation-translation matrix, convert the map image based on the world global coordinate system to the vehicle body coordinate system, to obtain a map image based on the vehicle body coordinate system.
In a possible implementation, the map image is a two-dimensional map, and the projection part 13 includes: a third acquisition part 131 configured to acquire the homography matrix between the pixel coordinate system and the vehicle body coordinate system; a representation part 132 configured to represent the map image based on the vehicle body coordinate system in homogeneous coordinates; and a second conversion part 133 configured to convert, according to the homography matrix, the map image based on the vehicle body coordinate system represented in homogeneous coordinates to the pixel coordinate system, to obtain the road information of the vehicle-collected image projected onto the pixel coordinate system.
In a possible implementation, the map image is a three-dimensional map, and the projection part 13 includes: a third conversion part configured to convert the map image based on the vehicle body coordinate system to the camera coordinate system according to the rotation-translation matrix between the vehicle body coordinate system and the camera coordinate system, to obtain the road information of the vehicle-collected image projected onto the camera coordinate system; and a fourth conversion part configured to convert, according to the projection matrix between the camera coordinate system and the pixel coordinate system, the road information of the vehicle-collected image projected onto the camera coordinate system to the pixel coordinate system, to obtain the road information of the vehicle-collected image projected onto the pixel coordinate system.
In a possible implementation, the apparatus further includes: an extraction part 16 configured to perform road information extraction processing on the vehicle-collected image via a neural network for extracting road information, to obtain perceived road information; and a first correction part 17 configured to correct the road information projected onto the vehicle-collected image according to the perceived road information.
In a possible implementation, the first correction part 17 includes: a second determination part configured to determine offset information between perceived road elements in the perceived road information and map road elements in the road information projected onto the vehicle-collected image; and a second correction part configured to correct, according to the offset information, the road information projected onto the vehicle-collected image.
In a possible implementation, the second determination part includes: a third determination part configured to determine, from the map image, the map road element paired with a perceived road element according to the attribute information of the perceived road element; a fourth determination part configured to determine the position information of the paired perceived road element and map road element in the same device coordinate system; and a fifth determination part configured to determine, based on the position information, the positioning offset between the paired perceived road element and map road element.
In a possible implementation, the third determination part includes: a search part configured to search the map image for map road elements within a preset range based on the vehicle positioning information; a matching part configured to pair the perceived road elements in the vehicle-collected image with the map road elements within the preset range based on attribute information, to obtain multiple pairing schemes, where in different pairing schemes at least one perceived road element is paired differently with the map road elements within the preset range; a sixth determination part configured to determine the confidence of each pairing scheme; and a seventh determination part configured to determine, from the pairing scheme with the highest confidence among the multiple pairing schemes or with a confidence exceeding a set threshold, the map road element paired with the perceived road element.
In a possible implementation, the matching part is configured to, in the case that no paired road element can be determined among the map road elements within the preset range for a perceived road element in the vehicle-collected image, set an empty or virtual element among the map road elements to be paired, to pair with that perceived road element.
In a possible implementation, the sixth determination part is configured to: separately determine the individual similarity of each pairing of a perceived road element and a map road element in each pairing scheme; determine the overall similarity of the pairings of perceived road elements and map road elements in each pairing scheme; and determine the confidence of each pairing scheme according to its individual similarities and overall similarity.
In a possible implementation, the positioning offset includes a coordinate offset and/or a direction offset, and the fifth determination part includes: a first sampling part configured to sample the pixel points of the perceived road element to obtain a perception sampling point set; a second sampling part configured to sample the pixel points of the map road element to obtain a map sampling point set; an eighth determination part configured to determine the rotation-translation matrix between the sampling points included in the perception sampling point set and the map sampling point set; and a fourth acquisition part configured to obtain the coordinate offset and direction offset between the perceived road element and the map road element based on the rotation-translation matrix.
For more detailed descriptions of the foregoing parts, reference may be made to the related descriptions in the method embodiments shown in FIG. 1, FIG. 2, FIG. 5, FIG. 6, and FIG. 8, which are not repeated here.
According to the image data automatic labeling apparatus provided by the embodiments of the present disclosure, the rich road information contained in map data can be used to automatically label the road information of vehicle-collected images, which improves the efficiency of labeling image data, helps reduce the error probability of data labeling, and reduces the labor cost of image data labeling.
An embodiment of the present disclosure further provides an image data automatic labeling apparatus for executing the above image data automatic labeling method. Some or all of the above methods may be implemented by hardware, or by software or firmware.
In a possible implementation, the apparatus may be a chip or an integrated circuit. Optionally, when some or all of the image data automatic labeling method of the above embodiments is implemented by software or firmware, it may be implemented by an image data automatic labeling apparatus 1100 provided in FIG. 11. As shown in FIG. 11, the apparatus 1100 may include: an input device 111, an output device 112, a memory 113, and a processor 114 (the apparatus may have one or more processors 114; one processor is taken as an example in FIG. 11). In this embodiment, the input device 111, the output device 112, the memory 113, and the processor 114 may be connected by a bus or in other ways; connection by a bus is taken as an example in FIG. 11.
The processor 114 is configured to execute the method steps performed by the apparatus in FIG. 1, FIG. 2, FIG. 5, FIG. 6, and FIG. 8. Optionally, the program of the above image data automatic labeling method may be stored in the memory 113. The memory 113 may be a physically independent unit, or may be integrated with the processor 114; the memory 113 may also be used to store data. Optionally, when some or all of the image data automatic labeling method of the above embodiments is implemented by software, the apparatus may include only a processor; the memory for storing the program is then located outside the apparatus, and the processor is connected to the memory through a circuit or wire for reading and executing the program stored in the memory.
The processor may be a central processing unit (CPU), a network processor (NP), or a wide area network (WAN) device. The processor may further include a hardware chip. The above hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The above PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof. The memory may include volatile memory, such as random-access memory (RAM); the memory may also include non-volatile memory, such as flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory may also include a combination of the above kinds of memory.
根据本公开实施例提供的一种图像数据自动标注装置,利用地图数据包含的丰富的道路信息,可以实现对车辆采集图像的道路信息的自动标注,提高了标注图像数据的效率,有利于降低数据标注的出错概率,减少图像数据标注的人工成本。According to the image data automatic labeling device provided by an embodiment of the present disclosure, the rich road information contained in map data can be used to automatically label the road information of images collected by a vehicle, which improves the efficiency of labeling image data, helps reduce the error probability of data labeling, and reduces the labor cost of image data labeling.
本公开实施例可以是***、方法和/或计算机程序产品。计算机程序产品可以包括计算机可读存储介质,其上载有用于使处理器实现本公开实施例的各个方面的计算机可读程序指令。The embodiments of the present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium loaded with computer-readable program instructions for enabling a processor to implement various aspects of the embodiments of the present disclosure.
计算机可读存储介质可以是可以保持和存储由指令执行设备使用的指令的有形设备。计算机可读存储介质例如可以但不限于是电存储设备、磁存储设备、光存储设备、电磁存储设备、半导体存储设备或者上述的任意合适的组合。计算机可读存储介质的例子(非穷举的列表)包括:便携式计算机盘、硬盘、随机存取存储器(Random Access Memory,RAM)、只读存储器(Read Only Memory,ROM)、可擦式可编程只读存储器(Electrical Programmable Read Only Memory,EPROM或闪存)、静态随机存取存储器(Static Random Access Memory,SRAM)、便携式压缩盘只读存储器(Compact Disc Read-Only Memory,CD-ROM)、数字多功能盘(Digital Video Disc,DVD)、记忆棒、软盘、机械编码设备、例如其上存储有指令的打孔卡或凹槽内凸起结构、以及上述的任意合适的组合。这里所使用的计算机可读存储介质不被解释为瞬时信号本身,诸如无线电波或者其他自由传播的电磁波、通过波导或其他传输媒介传播的电磁波、或者通过电线传输的电信号。The computer-readable storage medium may be a tangible device that can hold and store instructions used by an instruction execution device. The computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. Examples of computer-readable storage media (a non-exhaustive list) include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disc read-only memory (CD-ROM), digital versatile discs (DVD), memory sticks, floppy disks, and mechanical encoding devices such as punch cards or raised structures in grooves on which instructions are stored, as well as any suitable combination of the foregoing. The computer-readable storage medium used herein is not to be interpreted as a transient signal itself, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media, or electrical signals transmitted through wires.
这里所描述的计算机可读程序指令可以从计算机可读存储介质下载到各个计算/处理设备,或者通过网络、例如因特网、局域网、广域网和/或无线网下载到外部计算机或外部存储设备。网络可以包括铜传输电缆、光纤传输、无线传输、路由器、防火墙、交换机、网关计算机和/或边缘服务器。每个计算/处理设备中的网络适配卡或者网络接口从网络接收计算机可读程序指令,并转发该计算机可读程序指令,以供存储在各个计算/处理设备中的计算机可读存储介质中。The computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to various computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network, and forwards the computer-readable program instructions for storage in the computer-readable storage medium in each computing/processing device.
用于执行本公开实施例操作的计算机程序指令可以是汇编指令、指令集架构(Industry Standard Architecture,ISA)指令、机器指令、机器相关指令、微代码、固件指令、状态设置数据、或者以一种或多种编程语言的任意组合编写的源代码或目标代码,所述编程语言包括面向对象的编程语言—诸如Smalltalk、C++等,以及常规的过程式编程语言—诸如“C”语言或类似的编程语言。计算机可读程序指令可以完全地在用户计算机上执行、部分地在用户计算机上执行、作为一个独立的软件包执行、部分在用户计算机上部分在远程计算机上执行、或者完全在远程计算机或服务器上执行。在涉及远程计算机的情形中,远程计算机可以通过任意种类的网络—包括局域网(local area network,LAN)或广域网(Wide Area Network,WAN)—连接到用户计算机,或者,可以连接到外部计算机(例如利用因特网服务提供商来通过因特网连接)。在一些实施例中,通过利用计算机可读程序指令的状态信息来个性化定制电子电路,例如可编程逻辑电路、现场可编程门阵列(FPGA)或可编程逻辑阵列(programmable logic array,PLA),该电子电路可以执行计算机可读程序指令,从而实现本公开实施例的各个方面。The computer program instructions used to perform the operations of the embodiments of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The computer-readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), is customized by using the state information of the computer-readable program instructions; the electronic circuit can execute computer-readable program instructions to implement various aspects of the embodiments of the present disclosure.
这里参照根据本公开实施例的方法、装置(***)和计算机程序产品的流程图和/或框图描述了本公开实施例的各个方面。应当理解,流程图和/或框图的每个方框以及流程图和/或框图中各方框的组合,都可以由计算机可读程序指令实现。Here, various aspects of the embodiments of the present disclosure are described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to the embodiments of the present disclosure. It should be understood that each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer-readable program instructions.
这些计算机可读程序指令可以提供给通用计算机、专用计算机或其它可编程数据处理装置的处理器,从而生产出一种机器,使得这些指令在通过计算机或其它可编程数据处理装置的处理器执行的情况下,产生了实现流程图和/或框图中的一个或多个方框中规定的功能/动作的装置。也可以把这些计算机可读程序指令存储在计算机可读存储介质中,这些指令使得计算机、可编程数据处理装置和/或其他设备以特定方式工作,从而,存储有指令的计算机可读介质则包括一个制造品,其包括实现流程图和/或框图中的一个或多个方框中规定的功能/动作的各个方面的指令。These computer-readable program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, or other programmable data processing apparatus to produce a machine, such that when these instructions are executed by the processor of the computer or other programmable data processing apparatus, an apparatus that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams is produced. These computer-readable program instructions may also be stored in a computer-readable storage medium; these instructions cause computers, programmable data processing apparatuses, and/or other devices to work in a specific manner, so that the computer-readable medium storing the instructions includes an article of manufacture, which includes instructions for implementing various aspects of the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
也可以把计算机可读程序指令加载到计算机、其它可编程数据处理装置、或其它设备上,使得在计算机、其它可编程数据处理装置或其它设备上执行一系列操作步骤,以产生计算机实现的过程,从而使得在计算机、其它可编程数据处理装置、或其它设备上执行的指令实现流程图和/或框图中的一个或多个方框中规定的功能/动作。The computer-readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other equipment, so that a series of operation steps are executed on the computer, other programmable data processing apparatus, or other equipment to produce a computer-implemented process, so that the instructions executed on the computer, other programmable data processing apparatus, or other equipment implement the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams.
附图中的流程图和框图显示了根据本公开实施例的多个实施例的***、方法和计算机程序产品的可能实现的体系架构、功能和操作。流程图或框图中的每个方框可以代表一个模块、程序段或指令的一部分,所述模块、程序段或指令的一部分包含一个或多个用于实现规定的逻辑功能的可执行指令。在有些作为替换的实现中,方框中所标注的功能也可以以不同于附图中所标注的顺序发生。例如,两个连续的方框实际上可以基本并行地执行,它们有时也可以按相反的顺序执行,这依所涉及的功能而定。也要注意的是,框图和/或流程图中的每个方框、以及框图和/或流程图中的方框的组合,可以用执行规定的功能或动作的专用的基于硬件的***来实现,或者可以用专用硬件与计算机指令的组合来实现。The flowcharts and block diagrams in the accompanying drawings show the possible implementation architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present disclosure. Each block in the flowcharts or block diagrams may represent a module, a program segment, or part of an instruction, which contains one or more executable instructions for realizing the specified logical functions. In some alternative implementations, the functions marked in the blocks may also occur in an order different from that marked in the drawings. For example, two consecutive blocks can actually be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
以上已经描述了本公开的各实施例,上述说明是示例性的,并非穷尽性的,并且也不限于所披露的各实施例。在不偏离所说明的各实施例的范围和精神的情况下,对于本技术领域的普通技术人员来说许多修改和变更都是显而易见的。本文中所用术语的选择,旨在最好地解释各实施例的原理、实际应用或对市场中的技术改进,或者使本技术领域的其它普通技术人员能理解本文披露的各实施例。The embodiments of the present disclosure have been described above; the above description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and changes are obvious to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The choice of terms used herein is intended to best explain the principles, practical applications, or technical improvements over the market of the various embodiments, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (18)

  1. 一种图像数据自动标注方法,所述方法包括:A method for automatically labeling image data, the method comprising:
    获取车辆定位信息、地图图像和车辆采集图像,其中,所述地图图像包括道路信息;Acquiring vehicle positioning information, map images, and vehicle collection images, where the map images include road information;
    根据所述车辆定位信息,获取所述车辆定位信息对应的局部区域内的所述地图图像上的道路信息;Acquiring, according to the vehicle positioning information, road information on the map image in a local area corresponding to the vehicle positioning information;
    将所述地图图像上的道路信息投影到所述车辆采集图像上,以在所述车辆采集图像上标注所述道路信息。Projecting the road information on the map image onto the vehicle acquisition image to mark the road information on the vehicle acquisition image.
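The three steps of claim 1 can be illustrated with a minimal sketch (illustrative only; the data structure, helper names, and radius value are hypothetical assumptions, not the patent's implementation). Before projection, map road elements are restricted to a local area around the vehicle positioning result:

```python
import math
from dataclasses import dataclass, field

@dataclass
class RoadElement:
    kind: str                                    # semantic type, e.g. "lane_line"
    points: list = field(default_factory=list)   # world-frame coordinates [(x, y), ...]

def elements_in_local_area(elements, vehicle_xy, radius):
    """Keep only the map road elements with at least one point within
    `radius` metres of the vehicle position (the 'local area' of claim 1)."""
    vx, vy = vehicle_xy
    return [e for e in elements
            if any(math.hypot(x - vx, y - vy) <= radius for x, y in e.points)]

# Tiny demo: one nearby lane line, one far-away stop line.
elements = [
    RoadElement("lane_line", [(3.0, 1.0), (3.0, 20.0)]),
    RoadElement("stop_line", [(500.0, 500.0), (501.0, 500.0)]),
]
local = elements_in_local_area(elements, vehicle_xy=(0.0, 0.0), radius=50.0)
```

The elements retained this way are the ones subsequently projected onto the vehicle-collected image as labels.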
  2. 根据权利要求1所述的方法,其中,所述根据所述车辆定位信息,获取所述车辆定位信息对应的局部区域内的所述地图图像上的道路信息,包括:The method according to claim 1, wherein the acquiring, according to the vehicle positioning information, the road information on the map image in the local area corresponding to the vehicle positioning information comprises:
    以所述局部区域内的地图图像为根节点,依次查询所述局部区域内的地图图像的地图道路元素的属性信息,其中,所述地图道路元素的属性信息包括以下至少一个信息:所述地图道路元素的语义信息,所述地图道路元素的位置信息,所述地图道路元素的形状信息。Using the map image in the local area as a root node, sequentially querying the attribute information of the map road elements of the map image in the local area, wherein the attribute information of a map road element includes at least one of the following: semantic information of the map road element, position information of the map road element, and shape information of the map road element.
  3. 根据权利要求2所述的方法,其中,所述方法还包括:The method according to claim 2, wherein the method further comprises:
    根据所述车辆定位信息和所述地图图像的范围,确定所述局部区域的范围;Determining the range of the local area according to the vehicle positioning information and the range of the map image;
    所述根据所述车辆定位信息,获取所述车辆定位信息对应的局部区域内的所述地图图像上的道路信息,包括:The acquiring, according to the vehicle positioning information, the road information on the map image in the local area corresponding to the vehicle positioning information includes:
    获取所述局部区域的范围内的所述地图图像上的地图道路元素的属性信息。Acquiring the attribute information of the map road element on the map image within the range of the local area.
  4. 根据权利要求1-3任一项所述的方法,其中,所述地图图像基于世界全局坐标系,所述将所述地图图像上的道路信息投影到车辆采集图像上之前,所述方法还包括:The method according to any one of claims 1 to 3, wherein the map image is based on the world global coordinate system, and before the projecting the road information on the map image onto the vehicle-collected image, the method further comprises:
    将基于所述世界全局坐标系的地图图像转换到车体坐标系上,得到基于所述车体坐标系的地图图像;Converting the map image based on the world global coordinate system to the vehicle body coordinate system to obtain the map image based on the vehicle body coordinate system;
    所述将所述地图图像上的道路信息投影到车辆采集图像上,包括:The projecting the road information on the map image onto the vehicle collection image includes:
    将基于所述车体坐标系的地图图像转换到相机坐标系和/或像素坐标系,以将所述地图图像上的道路信息投影到所述车辆采集图像上。The map image based on the vehicle body coordinate system is converted to a camera coordinate system and/or a pixel coordinate system, so as to project the road information on the map image onto the vehicle collection image.
  5. 根据权利要求4所述的方法,其中,所述将基于所述世界全局坐标系的地图图像转换到车体坐标系上,得到基于所述车体坐标系的地图图像,包括:The method according to claim 4, wherein the converting the map image based on the world global coordinate system to the vehicle body coordinate system to obtain the map image based on the vehicle body coordinate system comprises:
    根据所述车辆定位信息,得到旋转平移矩阵的旋转角度和平移量;Obtain the rotation angle and the translation amount of the rotation and translation matrix according to the vehicle positioning information;
    根据所述旋转平移矩阵,将基于所述世界全局坐标系的地图图像转换到所述车体坐标系上,得到基于所述车体坐标系的地图图像。According to the rotation and translation matrix, a map image based on the world global coordinate system is converted to the vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system.
  6. 根据权利要求4或5所述的方法,其中,所述地图图像为二维地图,所述将基于所述车体坐标系的地图图像转换到相机坐标系和/或像素坐标系,以将所述地图图像上的道路信息投影到所述车辆采集图像上,包括:The method according to claim 4 or 5, wherein the map image is a two-dimensional map, and the converting the map image based on the vehicle body coordinate system to a camera coordinate system and/or a pixel coordinate system to project the road information on the map image onto the vehicle-collected image comprises:
    获取所述像素坐标系和所述车体坐标系之间的单应性矩阵;Acquiring a homography matrix between the pixel coordinate system and the vehicle body coordinate system;
    采用齐次坐标系表示基于所述车体坐标系的地图图像;Using a homogeneous coordinate system to represent the map image based on the vehicle body coordinate system;
    根据所述单应性矩阵,将所述采用齐次坐标系表示的基于所述车体坐标系的地图图像,转换到所述像素坐标系上,得到投影到所述像素坐标系上的车辆采集图像的道路信息。According to the homography matrix, converting the map image based on the vehicle body coordinate system, represented in the homogeneous coordinate system, to the pixel coordinate system, to obtain the road information of the vehicle-collected image projected onto the pixel coordinate system.
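For a two-dimensional map, the projection of claim 6 reduces to applying a homography to ground-plane points expressed in homogeneous coordinates. A minimal sketch under assumed values (the matrix entries below are illustrative, not a calibrated homography):

```python
def apply_homography(H, x, y):
    """Map a vehicle-body ground-plane point (x, y) to pixel coordinates:
    [u', v', w']^T = H . [x, y, 1]^T, followed by division by w'."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w

# Illustrative homography: scale by 100 px/m, shift to an assumed
# image centre (640, 360); real values come from camera calibration.
H = [[100.0,   0.0, 640.0],
     [  0.0, 100.0, 360.0],
     [  0.0,   0.0,   1.0]]
u, v = apply_homography(H, 1.0, 2.0)   # a point 1 m right, 2 m ahead
```

Each sampled point of a map road element is pushed through this transform, and the resulting pixel coordinates form the label drawn on the vehicle-collected image.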
  7. 根据权利要求5所述的方法,其中,所述地图图像为三维地图,所述将基于所述车体坐标系的地图图像转换到相机坐标系和/或像素坐标系,以将所述地图图像上的道路信息投影到所述车辆采集图像上,包括:The method according to claim 5, wherein the map image is a three-dimensional map, and the converting the map image based on the vehicle body coordinate system to a camera coordinate system and/or a pixel coordinate system to project the road information on the map image onto the vehicle-collected image comprises:
    根据所述车体坐标系与所述相机坐标系之间的旋转平移矩阵,将所述基于车体坐标系的地图图像转换到所述相机坐标系上,得到投影到所述相机坐标系上的车辆采集图像的道路信息;According to the rotation-translation matrix between the vehicle body coordinate system and the camera coordinate system, converting the map image based on the vehicle body coordinate system to the camera coordinate system, to obtain the road information of the vehicle-collected image projected onto the camera coordinate system;
    根据所述相机坐标系与所述像素坐标系之间的投影矩阵,将投影到所述相机坐标系上的车辆采集图像的道路信息转换到所述像素坐标系上,得到投影到所述像素坐标系上的车辆采集图像的道路信息。According to the projection matrix between the camera coordinate system and the pixel coordinate system, converting the road information of the vehicle-collected image projected onto the camera coordinate system to the pixel coordinate system, to obtain the road information of the vehicle-collected image projected onto the pixel coordinate system.
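For a three-dimensional map, claim 7 chains an extrinsic body-to-camera transform with an intrinsic pinhole projection. A minimal sketch with assumed calibration values (identity extrinsics and an illustrative intrinsic matrix, not a real calibration):

```python
def project_point(R, t, K, p_body):
    """Body frame -> camera frame (p_cam = R . p_body + t), then pinhole
    projection to pixels (u = fx*X/Z + cx, v = fy*Y/Z + cy)."""
    p_cam = [sum(R[i][j] * p_body[j] for j in range(3)) + t[i] for i in range(3)]
    X, Y, Z = p_cam
    return K[0][0] * X / Z + K[0][2], K[1][1] * Y / Z + K[1][2]

# Illustrative calibration: identity rotation, zero translation,
# fx = fy = 1000 px, principal point (640, 360).
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]
K = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
u, v = project_point(R, t, K, [1.0, 0.5, 10.0])   # a 3-D map point 10 m ahead
```

This is the standard two-stage projection the claim describes: the rotation-translation matrix plays the role of the camera extrinsics, and the projection matrix the role of the intrinsics.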
  8. 根据权利要求1-7任一项所述的方法,其中,所述方法还包括:The method according to any one of claims 1-7, wherein the method further comprises:
    经用于提取道路信息的神经网络对所述车辆采集图像进行道路信息提取处理,得到感知道路信息;Perform road information extraction processing on the vehicle collected image via a neural network for extracting road information to obtain perceived road information;
    根据所述感知道路信息,对投影到所述车辆采集图像上的道路信息进行修正。According to the perceived road information, the road information projected on the vehicle-collected image is corrected.
  9. 根据权利要求8所述的方法,其中,所述根据所述感知道路信息,对投影到所述车辆采集图像上的道路信息进行修正,包括:The method according to claim 8, wherein the correcting the road information projected on the vehicle-collected image according to the perceived road information comprises:
    确定所述感知道路信息中的感知道路元素与投影到所述车辆采集图像上的道路信息中的地图道路元素之间的偏移信息;Determining the offset information between the perceived road element in the perceived road information and the map road element in the road information projected on the vehicle collected image;
    根据所述偏移信息,对投影到所述车辆采集图像上的道路信息进行修正。According to the offset information, the road information projected on the collected image of the vehicle is corrected.
  10. 根据权利要求9所述的方法,其中,所述确定所述感知道路信息中的感知道路元素与投影到所述车辆采集图像上的道路信息中的地图道路元素之间的偏移信息,包括:The method according to claim 9, wherein the determining the offset information between the perceived road element in the perceived road information and the map road element in the road information projected on the vehicle-collected image comprises:
    根据所述感知道路元素的属性信息,从所述地图图像中确定与所述感知道路元素配对的地图道路元素;Determine, from the map image, a map road element paired with the perceived road element according to the attribute information of the perceived road element;
    确定配对的感知道路元素和地图道路元素在同一设备坐标系下的位置信息;Determine the location information of the paired perceptual road elements and map road elements in the same device coordinate system;
    基于所述位置信息确定配对的感知道路元素和地图道路元素之间的定位偏移量。The positioning offset between the paired perceived road element and the map road element is determined based on the position information.
  11. 根据权利要求10所述的方法,其中,根据所述感知道路元素的属性信息,从所述地图中确定与所述感知道路元素配对的地图道路元素,包括:The method according to claim 10, wherein, according to the attribute information of the perceived road element, determining the map road element paired with the perceived road element from the map comprises:
    在所述地图图像中,基于所述车辆定位信息查找预设范围内的地图道路元素;In the map image, search for map road elements within a preset range based on the vehicle positioning information;
    将所述车辆采集图像中的感知道路元素与所述预设范围内的地图道路元素基于属性信息进行两两配对,获得多种配对方案,其中,不同配对方案中至少一感知道路元素与所述预设范围内的地图道路元素的配对方式不同;Pairing the perceived road elements in the vehicle-collected image with the map road elements within the preset range based on attribute information, to obtain multiple pairing schemes, wherein different pairing schemes differ in how at least one perceived road element is paired with the map road elements within the preset range;
    确定每个所述配对方案的置信度;Determine the confidence level of each of the pairing schemes;
    在所述多种配对方案中置信度最高或超过设定阈值的配对方案中,确定与所述感知道路元素配对的地图道路元素。Determining, from the pairing scheme whose confidence is the highest among the multiple pairing schemes or exceeds a set threshold, the map road element paired with the perceived road element.
  12. 根据权利要求11所述的方法,其中,将所述车辆采集图像中的感知道路元素与所述预设范围内的地图道路元素进行配对,包括:The method according to claim 11, wherein the pairing of the perceived road elements in the vehicle-collected image with the map road elements within the preset range comprises:
    在所述车辆采集图像中的感知道路元素在所述预设范围内的地图道路元素无法确定配对的道路元素的情况下,在待进行配对的地图道路元素中设置空或虚拟元素与所述感知道路元素配对。In a case where, for a perceived road element in the vehicle-collected image, no paired road element can be determined among the map road elements within the preset range, setting an empty or virtual element among the map road elements to be paired, to be paired with the perceived road element.
  13. 根据权利要求11或12所述的方法,其中,确定每个所述配对方案的置信度,包括:The method according to claim 11 or 12, wherein determining the confidence level of each of the pairing schemes comprises:
    分别确定每个所述配对方案中每个感知道路元素和地图道路元素的配对的个体相似度;Respectively determining the individual similarity of the pairing of each perceived road element and the map road element in each of the pairing schemes;
    确定每个所述配对方案中各感知道路元素和地图道路元素配对的整体相似度;Determine the overall similarity between the perceptual road elements and the map road elements in each pairing scheme;
    根据每个所述配对方案的各个体相似度和整体相似度,确定每个所述配对方案的置信度。According to the individual similarity and the overall similarity of each of the pairing schemes, the confidence of each of the pairing schemes is determined.
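One plausible realisation of claims 11 and 13 scores each pairing scheme by combining the per-pair (individual) similarities with the scheme's overall similarity, then keeps the highest-scoring scheme. The equal weighting and the demo numbers below are assumptions for illustration, not values from the disclosure:

```python
def scheme_confidence(pair_sims, overall_sim, w=0.5):
    """Confidence of one pairing scheme: a weighted mix of the mean
    individual pair similarity and the scheme's overall similarity.
    The 50/50 weight is an illustrative assumption."""
    individual = sum(pair_sims) / len(pair_sims)
    return w * individual + (1.0 - w) * overall_sim

def best_scheme(schemes):
    """schemes: list of (pair_sims, overall_sim) tuples; returns the index
    of the pairing scheme with the highest confidence."""
    scores = [scheme_confidence(p, o) for p, o in schemes]
    return max(range(len(scores)), key=scores.__getitem__)

# Two candidate pairing schemes; the first matches elements more consistently.
idx = best_scheme([([0.9, 0.8], 0.85), ([0.4, 0.5], 0.30)])
```

The map road elements of the winning scheme are then taken as the pairs used for offset estimation in claim 14.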
  14. 根据权利要求10-13任一项所述的方法,其中,所述定位偏移量包括坐标偏移量和/或方向偏移量;The method according to any one of claims 10-13, wherein the positioning offset includes a coordinate offset and/or a direction offset;
    基于所述车辆定位信息确定配对的感知道路元素和地图道路元素之间的定位偏移量,包括:Determining the positioning offset between the paired perceived road element and the map road element based on the vehicle positioning information includes:
    对所述感知道路元素的像素点进行采样,获得感知采样点集;Sampling the pixels of the perceived road elements to obtain a set of perceived sampling points;
    对所述地图道路元素的像素点进行采样,获得地图采样点集;Sampling the pixel points of the map road element to obtain a map sampling point set;
    确定所述感知采样点集与所述地图采样点集各自包括的采样点之间的旋转平移矩阵;Determining a rotation and translation matrix between the sampling points included in the sensing sampling point set and the map sampling point set;
    基于所述旋转平移矩阵获得所述感知道路元素与所述地图道路元素的坐标偏移量和方向偏移量。Obtain the coordinate offset and the direction offset of the perceived road element and the map road element based on the rotation and translation matrix.
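The rotation-translation matrix of claim 14 between the perceived and map sampling point sets can be estimated by least-squares rigid alignment (a 2-D Kabsch/Procrustes solve); the recovered angle corresponds to the direction offset and the translation to the coordinate offset. A sketch assuming equal-length, index-matched point sets (the matching step itself is not shown):

```python
import math

def align_2d(src, dst):
    """Least-squares rigid alignment of matched 2-D point sets:
    returns (theta, tx, ty) such that R(theta) . src + t ~= dst."""
    n = len(src)
    csx = sum(p[0] for p in src) / n; csy = sum(p[1] for p in src) / n
    cdx = sum(p[0] for p in dst) / n; cdy = sum(p[1] for p in dst) / n
    sxx = sxy = syx = syy = 0.0
    for (x, y), (u, v) in zip(src, dst):          # centred cross-covariance terms
        x -= csx; y -= csy; u -= cdx; v -= cdy
        sxx += x * u; sxy += x * v; syx += y * u; syy += y * v
    theta = math.atan2(sxy - syx, sxx + syy)      # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    tx = cdx - (c * csx - s * csy)                # translation after rotation
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty

# Perceived samples vs. map samples shifted by (2, 3) with no rotation.
theta, tx, ty = align_2d([(0, 0), (1, 0), (0, 1)], [(2, 3), (3, 3), (2, 4)])
```

Here `theta` would be read off as the direction offset and `(tx, ty)` as the coordinate offset between the paired road elements.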
  15. 一种图像数据自动标注装置,其中,所述装置包括:A device for automatically labeling image data, wherein the device includes:
    第一获取部分,被配置为获取车辆定位信息、地图图像和车辆采集图像,其中,所述地图图像包括道路信息;The first acquiring part is configured to acquire vehicle positioning information, a map image, and a vehicle acquisition image, wherein the map image includes road information;
    第二获取部分,被配置为根据所述车辆定位信息,获取所述车辆定位信息对应的局部区域内的所述地图图像上的道路信息;The second acquiring part is configured to acquire road information on the map image in a local area corresponding to the vehicle positioning information according to the vehicle positioning information;
    投影部分,被配置为将所述地图图像上的道路信息投影到所述车辆采集图像上,以在所述车辆采集图像上标注所述道路信息。The projection part is configured to project the road information on the map image onto the vehicle acquisition image to mark the road information on the vehicle acquisition image.
  16. 一种图像数据自动标注装置,其中,所述装置包括:存储器和处理器;其中,所述存储器中存储一组程序指令,且所述处理器用于调用所述存储器中存储的程序指令,执行如权利要求1-14中任一项所述的方法。An image data automatic labeling device, wherein the device includes a memory and a processor; wherein the memory stores a set of program instructions, and the processor is configured to call the program instructions stored in the memory to execute the method according to any one of claims 1-14.
  17. 一种计算机可读存储介质,其上存储有计算机程序指令,其中,所述计算机程序指令被处理器执行时实现权利要求1-14中任意一项所述的方法。A computer-readable storage medium having computer program instructions stored thereon, wherein the computer program instructions implement the method of any one of claims 1-14 when executed by a processor.
  18. 一种计算机程序,包括计算机可读代码,当所述计算机可读代码在电子设备中运行时,所述电子设备中的处理器执行用于实现权利要求1-14中任意一项所述的方法。A computer program, comprising computer-readable code, wherein when the computer-readable code runs in an electronic device, a processor in the electronic device executes the method according to any one of claims 1-14.
PCT/CN2020/122514 2019-10-16 2020-10-21 Method for automatically labeling image data and device WO2021073656A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020217017022A KR20220053513A (en) 2019-10-16 2020-10-21 Image data automatic labeling method and device
JP2021539968A JP2022517961A (en) 2019-10-16 2020-10-21 Method and device for automatically annotating image data

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910983438.9A CN112667837A (en) 2019-10-16 2019-10-16 Automatic image data labeling method and device
CN201910983438.9 2019-10-16

Publications (1)

Publication Number Publication Date
WO2021073656A1 true WO2021073656A1 (en) 2021-04-22

Family

ID=75400660

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/122514 WO2021073656A1 (en) 2019-10-16 2020-10-21 Method for automatically labeling image data and device

Country Status (4)

Country Link
JP (1) JP2022517961A (en)
KR (1) KR20220053513A (en)
CN (1) CN112667837A (en)
WO (1) WO2021073656A1 (en)


Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113205447A (en) * 2021-05-11 2021-08-03 北京车和家信息技术有限公司 Road picture marking method and device for lane line identification
CN113269165B (en) * 2021-07-16 2022-04-22 智道网联科技(北京)有限公司 Data acquisition method and device
CN114136333A (en) * 2021-10-15 2022-03-04 阿波罗智能技术(北京)有限公司 High-precision map road data generation method, device and equipment based on hierarchical features
CN114018240A (en) * 2021-10-29 2022-02-08 广州小鹏自动驾驶科技有限公司 Map data processing method and device
CN115526987A (en) * 2022-09-22 2022-12-27 清华大学 Label element reconstruction method, system, device and medium based on monocular camera
CN117894015B (en) * 2024-03-15 2024-05-24 浙江华是科技股份有限公司 Point cloud annotation data optimization method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11271074A (en) * 1998-03-20 1999-10-05 Fujitsu Ltd Device and method for comparing mark image and program storage medium
CN105701449A (en) * 2015-12-31 2016-06-22 百度在线网络技术(北京)有限公司 Method and device for detecting lane lines on road surface
CN108305475A (en) * 2017-03-06 2018-07-20 腾讯科技(深圳)有限公司 A kind of traffic lights recognition methods and device
CN109949439A (en) * 2019-04-01 2019-06-28 星觅(上海)科技有限公司 Driving outdoor scene information labeling method, apparatus, electronic equipment and medium
CN110135323A (en) * 2019-05-09 2019-08-16 北京四维图新科技股份有限公司 Image labeling method, device, system and storage medium
CN110136199A (en) * 2018-11-13 2019-08-16 北京初速度科技有限公司 A kind of vehicle location based on camera, the method and apparatus for building figure

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5971112B2 (en) * 2012-12-25 2016-08-17 富士通株式会社 Image processing method, image processing apparatus, and image processing program
US10209089B2 (en) * 2017-04-03 2019-02-19 Robert Bosch Gmbh Automated image labeling for vehicles based on maps
JP6908843B2 (en) * 2017-07-26 2021-07-28 富士通株式会社 Image processing equipment, image processing method, and image processing program
US11544938B2 (en) * 2018-12-21 2023-01-03 Continental Autonomous Mobility US, LLC Systems and methods for automatic labeling of images for supervised machine learning
WO2020210127A1 (en) * 2019-04-12 2020-10-15 Nvidia Corporation Neural network training using ground truth data augmented with map information for autonomous machine applications
CN112069856B (en) * 2019-06-10 2024-06-14 商汤集团有限公司 Map generation method, driving control device, electronic equipment and system

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11877066B2 (en) 2018-09-10 2024-01-16 Tusimple, Inc. Adaptive illumination for a time-of-flight camera on a vehicle
US20210405651A1 (en) * 2020-06-26 2021-12-30 Tusimple, Inc. Adaptive sensor control
US11932238B2 (en) 2020-06-29 2024-03-19 Tusimple, Inc. Automated parking technology
JP7494809B2 (en) 2021-06-29 2024-06-04 株式会社デンソー Support device, support method, and support program
US20230030660A1 (en) * 2021-08-02 2023-02-02 Nio Technology (Anhui) Co., Ltd Vehicle positioning method and system for fixed parking scenario
CN113609241B (en) * 2021-08-13 2023-11-14 武汉市规划研究院(武汉市交通发展战略研究院) Road network and public transport network matching method and system
CN113609241A (en) * 2021-08-13 2021-11-05 武汉市交通发展战略研究院 Road network and public traffic network matching method and system
CN113822943A (en) * 2021-09-17 2021-12-21 中汽创智科技有限公司 External parameter calibration method, device and system of camera and storage medium
CN113822943B (en) * 2021-09-17 2024-06-11 中汽创智科技有限公司 External parameter calibration method, device and system of camera and storage medium
CN114111817B (en) * 2021-11-22 2023-10-13 武汉中海庭数据技术有限公司 Vehicle positioning method and system based on SLAM map and high-precision map matching
CN114111817A (en) * 2021-11-22 2022-03-01 武汉中海庭数据技术有限公司 Vehicle positioning method and system based on SLAM map and high-precision map matching
CN114419882A (en) * 2021-12-30 2022-04-29 联通智网科技股份有限公司 Method for optimizing layout parameters of sensing system, equipment terminal and storage medium
CN115223118A (en) * 2022-06-09 2022-10-21 广东省智能网联汽车创新中心有限公司 High-precision map confidence judgment method and system and vehicle
CN115223118B (en) * 2022-06-09 2024-03-01 广东省智能网联汽车创新中心有限公司 High-precision map confidence judging method, system and vehicle
CN114782447A (en) * 2022-06-22 2022-07-22 小米汽车科技有限公司 Road surface detection method, device, vehicle, storage medium and chip
CN116468870A (en) * 2023-06-20 2023-07-21 佛山科学技术学院 Three-dimensional visual modeling method and system for urban road
CN116468870B (en) * 2023-06-20 2024-01-23 佛山科学技术学院 Three-dimensional visual modeling method and system for urban road

Also Published As

Publication number Publication date
CN112667837A (en) 2021-04-16
KR20220053513A (en) 2022-04-29
JP2022517961A (en) 2022-03-11

Similar Documents

Publication Publication Date Title
WO2021073656A1 (en) Method for automatically labeling image data and device
WO2020224305A1 (en) Method and apparatus for device positioning, and device
WO2021233029A1 (en) Simultaneous localization and mapping method, device, system and storage medium
US20200089971A1 (en) Sensor calibration method and device, computer device, medium, and vehicle
CN110146910B (en) Positioning method and device based on data fusion of GPS and laser radar
CN112861653A (en) Detection method, system, equipment and storage medium for fusing image and point cloud information
US11625851B2 (en) Geographic object detection apparatus and geographic object detection method
Xiao et al. Monocular localization with vector HD map (MLVHM): A low-cost method for commercial IVs
CN111121754A (en) Mobile robot positioning navigation method and device, mobile robot and storage medium
WO2020043081A1 (en) Positioning technique
CN112232275B (en) Obstacle detection method, system, equipment and storage medium based on binocular recognition
EP4105600A2 (en) Method for automatically producing map data, related apparatus and computer program product
WO2020258297A1 (en) Image semantic segmentation method, movable platform, and storage medium
CN114111774B (en) Vehicle positioning method, system, equipment and computer readable storage medium
CN112150448B (en) Image processing method, device and equipment and storage medium
CN116997771A (en) Vehicle, positioning method, device, equipment and computer readable storage medium thereof
Liao et al. SE-Calib: Semantic Edge-Based LiDAR–Camera Boresight Online Calibration in Urban Scenes
CN111833443A (en) Landmark position reconstruction in autonomous machine applications
CN113838129A (en) Method, device and system for obtaining pose information
US20220164595A1 (en) Method, electronic device and storage medium for vehicle localization
KR102249381B1 (en) System for generating spatial information of mobile device using 3D image information and method therefor
WO2023131203A1 (en) Semantic map updating method, path planning method, and related apparatuses
CN109785388B (en) Short-distance accurate relative positioning method based on binocular camera
WO2023283929A1 (en) Method and apparatus for calibrating external parameters of binocular camera
CN112880692B (en) Map data labeling method and device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20876002

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021539968

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20876002

Country of ref document: EP

Kind code of ref document: A1


32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 031122)
