WO2021073656A1 - Method and device for automatically labeling image data - Google Patents

Method and device for automatically labeling image data

Info

Publication number
WO2021073656A1
Authority
WO
WIPO (PCT)
Prior art keywords: map, road, information, coordinate system, image
Application number
PCT/CN2020/122514
Other languages
English (en)
Chinese (zh)
Inventor
付万增
王哲
石建萍
Original Assignee
上海商汤临港智能科技有限公司
Application filed by 上海商汤临港智能科技有限公司
Priority to KR1020217017022A (published as KR20220053513A)
Priority to JP2021539968A (published as JP2022517961A)
Publication of WO2021073656A1

Classifications

    • G01C 21/3815: Creation or updating of map data, characterised by the type of data: road data
    • G01C 21/32: Map- or contour-matching; structuring or formatting of map data
    • G01C 21/367: Display of a road map; details, e.g. road map scale, orientation, zooming, level of detail
    • G01C 21/3844: Creation or updating of map data, characterised by the source of data: data obtained from position sensors only, e.g. from inertial navigation
    • G01C 21/387: Structures of map data; organisation of map data, e.g. version management or database structures
    • G06F 16/51: Information retrieval of still image data; indexing; data structures therefor; storage structures
    • G06F 16/58: Retrieval of still image data characterised by using metadata
    • G06F 16/5854: Retrieval using metadata automatically derived from the content, using shape and object relationship
    • G06F 16/587: Retrieval using metadata, using geographical or spatial information, e.g. location
    • G06N 3/04: Neural networks; architecture, e.g. interconnection topology
    • G06N 3/08: Neural networks; learning methods

Definitions

  • the present disclosure relates to the technical field of image processing, and relates to a method and device for automatically labeling image data.
  • the embodiments of the present disclosure provide a technical solution for automatic labeling of image data.
  • an embodiment of the present disclosure provides a method for automatically labeling image data.
  • the method includes: acquiring vehicle positioning information, a map image, and a vehicle collection image, wherein the map image includes road information; acquiring, according to the vehicle positioning information, the road information on the map image in the local area corresponding to the vehicle positioning information; and projecting the road information on the map image onto the vehicle collection image to mark the road information on the vehicle collection image.
  • the acquiring, according to the vehicle positioning information, of the road information on the map image in the local area corresponding to the vehicle positioning information includes: taking the map image in the local area as a root node and sequentially querying the attribute information of the map road elements of the map image in the local area, where the attribute information of a map road element includes at least one of the following: semantic information of the map road element, position information of the map road element, and shape information of the map road element.
  • the method further includes: determining the range of the local area according to the vehicle positioning information and the range of the map image; and the acquiring of the road information on the map image in the local area corresponding to the vehicle positioning information includes: acquiring attribute information of map road elements on the map image within the range of the local area.
  • the map image is based on the world global coordinate system, and the method further includes: converting the map image based on the world global coordinate system to the vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system; the projecting of the road information on the map image onto the vehicle collection image includes: converting the map image based on the vehicle body coordinate system to the camera coordinate system and/or the pixel coordinate system, so as to project the road information on the map image onto the vehicle collection image.
  • the converting of the map image based on the world global coordinate system to the vehicle body coordinate system to obtain the map image based on the vehicle body coordinate system includes: obtaining the rotation angle and translation amount of a rotation-translation matrix according to the vehicle positioning information; and converting, according to the rotation-translation matrix, the map image based on the world global coordinate system to the vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system.
  • the map image is a two-dimensional map, and the converting of the map image based on the vehicle body coordinate system to the camera coordinate system and/or the pixel coordinate system so as to project the road information on the map image onto the vehicle collection image includes: obtaining a homography matrix between the pixel coordinate system and the vehicle body coordinate system; representing the map image based on the vehicle body coordinate system in a homogeneous coordinate system; and converting, according to the homography matrix, the map image based on the vehicle body coordinate system represented in the homogeneous coordinate system to the pixel coordinate system, to obtain the road information of the vehicle collection image projected on the pixel coordinate system.
  • the map image is a three-dimensional map, and the converting of the map image based on the vehicle body coordinate system to the camera coordinate system and/or the pixel coordinate system so as to project the road information on the map image onto the vehicle collection image includes: converting the map image based on the vehicle body coordinate system to the camera coordinate system according to the rotation-translation matrix between the vehicle body coordinate system and the camera coordinate system, to obtain the road information of the vehicle collection image projected on the camera coordinate system; and converting the road information of the vehicle collection image projected on the camera coordinate system to the pixel coordinate system according to the projection matrix between the camera coordinate system and the pixel coordinate system, to obtain the road information of the vehicle collection image projected on the pixel coordinate system.
  • the method further includes: performing road information extraction on the vehicle collection image via a neural network for extracting road information, to obtain perceived road information; and correcting, according to the perceived road information, the road information projected on the vehicle collection image.
  • the correcting of the road information projected on the vehicle collection image according to the perceived road information includes: determining offset information between a perceived road element in the perceived road information and a map road element in the road information projected on the vehicle collection image; and correcting, according to the offset information, the road information projected on the vehicle collection image.
  • the determining of the offset information between the perceived road element in the perceived road information and the map road element in the road information projected on the vehicle collection image includes: determining, from the map image and according to the attribute information of the perceived road element, the map road element paired with the perceived road element; determining the position information of the paired perceived road element and map road element in the same device coordinate system; and determining, based on the position information, the positioning offset between the paired perceived road element and map road element.
  • determining the map road element paired with the perceived road element from the map image according to the attribute information of the perceived road element includes: searching, in the map image, for map road elements within a preset range based on the vehicle positioning information; pairing the perceived road elements in the vehicle collection image with the map road elements within the preset range, pairwise and based on the attribute information, to obtain multiple pairing schemes, where in different pairing schemes at least one perceived road element is paired with the map road elements within the preset range in a different way; determining the confidence of each pairing scheme; and determining, from the pairing scheme whose confidence is the highest or exceeds a set threshold among the multiple pairing schemes, the map road element paired with the perceived road element.
  • pairing the perceived road elements in the vehicle collection image with the map road elements within the preset range includes: in the case that no paired element can be determined among the map road elements within the preset range for a perceived road element in the vehicle collection image, setting an empty or virtual element among the map road elements to be paired with that perceived road element.
  • determining the confidence of each pairing scheme includes: separately determining, for each pairing scheme, the individual similarity of each pairing of a perceived road element with a map road element; determining the overall similarity between the perceived road elements and the map road elements in each pairing scheme; and determining the confidence of each pairing scheme according to its individual similarities and overall similarity.
  • the positioning offset includes a coordinate offset and/or a direction offset; determining the positioning offset between the paired perceived road element and map road element based on the position information includes: sampling the pixel points of the perceived road element to obtain a set of perceived sampling points; sampling the pixel points of the map road element to obtain a set of map sampling points; determining the rotation-translation matrix between the sampling points included in the perceived sampling point set and the map sampling point set; and obtaining the coordinate offset and the direction offset between the perceived road element and the map road element based on the rotation-translation matrix.
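As a hedged illustration of the sampling-and-alignment step just described, the sketch below estimates the rotation-translation between two already-paired 2D point sets in closed form via SVD (the Kabsch method); a full closest-point iteration would alternate this step with re-pairing nearest points. The planar setting and all names are assumptions for illustration, not the patent's prescribed implementation.

```python
import numpy as np

def align_point_sets(perceived_pts, map_pts):
    """Estimate the rotation-translation between paired 2D point sets.

    perceived_pts, map_pts: (N, 2) point sets, assumed paired
    index-to-index. Returns (theta, t): theta approximates the
    direction offset (radians), t the coordinate offset.
    """
    P = np.asarray(perceived_pts, dtype=float)
    Q = np.asarray(map_pts, dtype=float)
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    H = (P - p_mean).T @ (Q - q_mean)     # cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:              # guard against a reflection solution
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = q_mean - R @ p_mean               # coordinate offset
    theta = np.arctan2(R[1, 0], R[0, 0])  # direction offset in radians
    return theta, t
```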
  • an embodiment of the present disclosure provides an automatic labeling device for image data.
  • the device includes: a first acquisition part configured to acquire vehicle positioning information, a map image, and a vehicle collection image, wherein the map image includes road information; a second acquisition part configured to acquire, according to the vehicle positioning information, road information on the map image in a local area corresponding to the vehicle positioning information; and a projection part configured to project the road information on the map image onto the vehicle collection image to mark the road information on the vehicle collection image.
  • the second acquisition part is configured to take the map image in the local area as a root node and sequentially query the attribute information of the map road elements of the map image in the local area, where the attribute information of a map road element includes at least one of the following: semantic information of the map road element, position information of the map road element, and shape information of the map road element.
  • the device further includes: a first determining part configured to determine the range of the local area according to the vehicle positioning information and the range of the map image; and the second acquisition part is configured to obtain attribute information of map road elements on the map image within the range of the local area.
  • the map image is based on the world global coordinate system, and the device further includes: a first conversion part configured to convert the map image based on the world global coordinate system to the vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system; the projection part is configured to convert the map image based on the vehicle body coordinate system to the camera coordinate system and/or the pixel coordinate system, so as to project the road information on the map image onto the vehicle collection image.
  • the first conversion part is configured to: obtain the rotation angle and translation amount of a rotation-translation matrix according to the vehicle positioning information; and convert, according to the rotation-translation matrix, the map image based on the world global coordinate system to the vehicle body coordinate system to obtain a map image based on the vehicle body coordinate system.
  • the map image is a two-dimensional map, and the projection part includes: a third acquisition part configured to acquire the homography matrix between the pixel coordinate system and the vehicle body coordinate system; a representation part configured to represent the map image based on the vehicle body coordinate system in a homogeneous coordinate system; and a second conversion part configured to convert, according to the homography matrix, the map image based on the vehicle body coordinate system represented in the homogeneous coordinate system to the pixel coordinate system, to obtain the road information of the vehicle collection image projected on the pixel coordinate system.
  • the map image is a three-dimensional map, and the projection part includes: a third conversion part configured to convert the map image based on the vehicle body coordinate system to the camera coordinate system according to the rotation-translation matrix between the vehicle body coordinate system and the camera coordinate system, to obtain the road information of the vehicle collection image projected on the camera coordinate system; and a fourth conversion part configured to convert the road information of the vehicle collection image projected on the camera coordinate system to the pixel coordinate system according to the projection matrix between the camera coordinate system and the pixel coordinate system, to obtain the road information of the vehicle collection image projected on the pixel coordinate system.
  • the device further includes: an extraction part configured to perform road information extraction on the vehicle collection image via a neural network for extracting road information, to obtain perceived road information; and a first correction part configured to correct, according to the perceived road information, the road information projected on the vehicle collection image.
  • the first correction part includes: a second determining part configured to determine the offset information between a perceived road element in the perceived road information and a map road element in the road information projected on the vehicle collection image; and a second correction part configured to correct, according to the offset information, the road information projected on the vehicle collection image.
  • the second determining part includes: a third determining part configured to determine, from the map image and according to the attribute information of the perceived road element, the map road element paired with the perceived road element; a fourth determining part configured to determine the position information of the paired perceived road element and map road element in the same device coordinate system; and a fifth determining part configured to determine, based on the position information, the positioning offset between the paired perceived road element and map road element.
  • the third determining part includes: a searching part configured to search, in the map image, for map road elements within a preset range based on the vehicle positioning information; a pairing part configured to pair the perceived road elements in the vehicle collection image with the map road elements within the preset range based on the attribute information to obtain multiple pairing schemes, wherein in different pairing schemes at least one perceived road element is paired with the map road elements within the preset range in a different way; a sixth determining part configured to determine the confidence of each pairing scheme; and a seventh determining part configured to determine, from the pairing scheme whose confidence is the highest or exceeds a set threshold among the multiple pairing schemes, the map road element paired with the perceived road element.
  • the pairing part is configured to, in the case that no paired element can be determined among the map road elements within the preset range for a perceived road element in the vehicle collection image, set an empty or virtual element among the map road elements to be paired with that perceived road element.
  • the sixth determining part is configured to: separately determine, for each pairing scheme, the individual similarity of each pairing of a perceived road element with a map road element; determine the overall similarity between the perceived road elements and the map road elements in each pairing scheme; and determine the confidence of each pairing scheme according to its individual similarities and overall similarity.
  • the positioning offset includes a coordinate offset and/or a direction offset, and the fifth determining part includes: a first sampling part configured to sample the pixel points of the perceived road element to obtain a set of perceived sampling points; a second sampling part configured to sample the pixel points of the map road element to obtain a set of map sampling points; an eighth determining part configured to determine the rotation-translation matrix between the sampling points included in the perceived sampling point set and the map sampling point set; and a fourth acquisition part configured to obtain, based on the rotation-translation matrix, the coordinate offset and the direction offset between the perceived road element and the map road element.
  • an embodiment of the present disclosure provides a device for automatically labeling image data. The device includes: an input device, an output device, a memory, and a processor, wherein the memory stores a set of program codes, and the processor is configured to call the program codes stored in the memory to execute the method described in the first aspect or its various possible implementations.
  • embodiments of the present disclosure provide a computer-readable storage medium storing instructions which, when run on a computer, cause the computer to execute the method described in the first aspect or its various possible implementations.
  • the embodiments of the present disclosure provide a computer program, including computer-readable code, which, when run in an electronic device, causes a processor in the electronic device to execute the method described in the first aspect or its various possible implementations.
  • By utilizing the rich road information contained in the map data and projecting the road information in the map data onto the vehicle collection image, automatic labeling of the road information of the vehicle collection image can be realized, which improves the efficiency of labeling image data, helps reduce the error probability of data labeling, and reduces the labor cost of image data labeling.
  • FIG. 1 is a schematic flowchart of a method for automatically labeling image data according to an embodiment of the disclosure
  • FIG. 2 is a schematic flowchart of another method for automatically labeling image data according to an embodiment of the present disclosure
  • Figures 3A-3C show the effect diagrams of identifying the semantic information of road elements
  • Figure 4A is a schematic diagram of the world global coordinate system
  • Figure 4B is a schematic diagram of the vehicle body coordinate system
  • FIG. 4C is a schematic diagram of the camera coordinate system and the pixel coordinate system
  • FIG. 5 is an example of a method for determining offset information between a perceived road element and a map road element provided by an embodiment of the disclosure
  • FIG. 6 is an example of a method for determining a map road element paired with a perceived road element from a map image provided by an embodiment of the disclosure
  • FIG. 7 is a schematic diagram of a pairing scheme provided by an embodiment of the disclosure.
  • Fig. 8 is an example of a method for determining a positioning offset between a paired perceived road element and a map road element provided by an embodiment of the disclosure
  • FIG. 9 is a schematic diagram of the closest point iteration method provided by an embodiment of the disclosure.
  • FIG. 10 is a schematic structural diagram of a device for automatically labeling image data provided by an embodiment of the disclosure.
  • FIG. 11 is a schematic structural diagram of another image data automatic labeling device provided by an embodiment of the disclosure.
  • FIG. 1 is a schematic flowchart of an image data automatic labeling method provided by an embodiment of the present disclosure.
  • the method may include the following steps:
  • S101 Acquire vehicle positioning information, map images, and vehicle collection images.
  • vehicles are understood in a broad sense here and can include various types of vehicles with transportation or operation functions in the traditional sense, such as trucks, buses, and cars; they can also include movable robot devices, such as smart home devices like blind-guiding devices, smart toys, and sweeping robots, as well as industrial robots, service robots, toy robots, educational robots, etc., which are not limited in the embodiments of the present disclosure.
  • the vehicle may be equipped with a position sensor to obtain the vehicle positioning information.
  • the vehicle may also be equipped with a vision sensor to collect images around the vehicle in real time, and the obtained images may be referred to as vehicle collection images. Since the image collected by the vision sensor installed on the vehicle is equivalent to the "perception" of the vehicle's surrounding environment by the vehicle driving control system, the vehicle collection image can also be referred to as a perceived road image. In the embodiments of the present disclosure, the vehicle collection image is the collected image itself, and there is no label information on the image.
  • the map image can also be obtained from a server or a vehicle-mounted terminal.
  • the map image can be a semantic map, a high-precision map, etc., but is not limited to this, and can also be other types of maps.
  • the map image includes rich road information. Road information refers to attribute information of map road elements identified based on the map image.
  • the map road elements in the road information may include road-related markings and signs, and may include at least one or more of the following: various types of lane lines, stop lines, turning lines, and road edge lines on the road, as well as traffic signs, traffic lights, street lights, etc. installed beside or above the road.
  • Various types of lane lines can include, but are not limited to, white solid lane lines, yellow dashed lane lines, left edge lane lines, right edge lane lines, etc.
  • Various types of traffic signs can include, but are not limited to, slow-down traffic signs, no-stopping traffic signs, speed limit traffic signs, etc.
  • the road elements are not limited to the above.
  • the attribute information of the map road element may include one or more kinds of information related to the above-mentioned map road element, such as semantic information, position information, shape information, and so on of the road element.
  • the semantic information of a road element can be the meaning represented by the road element, i.e., the information it is intended to convey. For example, when a line on the road is detected in the collected road image, it can be determined from the line's position and its width and length relative to the road that the line is a stop line, a lane line, and so on. Since lane lines can be subdivided into many types, "lane line" is the basic semantic information, and the specific semantic information, such as left edge lane line or white solid lane line, can be further determined according to the position and shape of the line; for traffic signs, slow-down signs and no-stopping signs can be the specific semantic information of the road element. Those skilled in the art should understand that the specific expression form of the semantic information of a road element does not affect the implementation of the method of the present disclosure.
  • the above-mentioned position sensor may include at least one of the following: a GPS (Global Positioning System), an IMU (inertial measurement unit), etc.; the above-mentioned vision sensor may include at least one of the following: a camera, a video camera, a webcam, etc. Those skilled in the art should understand that the vision sensor and the position sensor are not limited to the above.
  • the vehicle positioning information may be synchronized positioning information obtained for each frame of the vehicle collection image. It may be GPS positioning information, IMU positioning information, or fusion information of the GPS positioning information and the IMU positioning information.
  • the fusion information is a more reliable positioning result obtained based on the GPS positioning information and the IMU positioning information. It can be obtained by Kalman filtering of the GPS positioning information and the IMU positioning information, or calculated from them by averaging or weighted averaging.
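A minimal sketch of such a fusion, assuming a simple inverse-variance weighted average (which is what a static Kalman update reduces to for this case); the variance values below are illustrative assumptions, not values from the patent.

```python
import numpy as np

def fuse_positions(gps_xy, imu_xy, gps_var=4.0, imu_var=1.0):
    """Inverse-variance weighted average of GPS and IMU position estimates.

    gps_var / imu_var: assumed measurement variances in m^2; the less
    noisy source receives the larger weight.
    """
    w_gps, w_imu = 1.0 / gps_var, 1.0 / imu_var
    return (w_gps * np.asarray(gps_xy, dtype=float)
            + w_imu * np.asarray(imu_xy, dtype=float)) / (w_gps + w_imu)
```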
  • the map image includes all or most of the road elements of the road section.
  • the vehicle acquisition image acquired during the positioning process is a partial area image of the road.
  • the scope of this local area can be set.
  • the local area corresponding to the vehicle positioning information is also related to the field of view of the visual sensor.
  • S103 Project the road information on the map image onto the vehicle acquisition image to mark the road information on the vehicle acquisition image.
  • Since the map image contains rich and accurate road information, projecting the road information on the map image onto the vehicle collection image is essentially marking the road information on the vehicle collection image, so that the vehicle collection image also contains the above-mentioned road information; this realizes the automatic labeling of the road information.
  • the trained neural network can be used to identify the road information in the collected images of vehicles.
  • the rich road information contained in the map image can be used to automatically label the road information of the vehicle collection image, which improves the efficiency of labeling image data, helps reduce the error probability of data labeling, and reduces the labor cost of image data annotation.
  • FIG. 2 is a schematic flowchart of another method for automatically labeling image data according to an embodiment of the present disclosure.
  • the method may include the following steps:
  • S201 Acquire vehicle positioning information, map images, and vehicle collection images, where the map images include road information.
  • the map image of the road can be collected by the collection vehicle, and the attribute information of the map road element in the map image can be recognized.
  • the collected map images can be stored in a server or a vehicle-mounted terminal.
  • a visual sensor is provided on the collection vehicle to collect map images.
  • the visual sensor may include at least one of the following: a camera, a video camera, a camera, and so on.
  • the visual sensor configured in the collection vehicle can be a high-precision visual sensor, so that a map image with high definition and high accuracy can be collected.
  • the visual sensor used to collect the image collected by the vehicle can be a sensor with a relatively low accuracy.
  • the collection vehicle may also be equipped with a high-precision position sensor to obtain the location information of the collection vehicle more accurately.
  • the position sensor that obtains the vehicle positioning information can be a position sensor with a lower positioning accuracy, or an existing position sensor in the vehicle can be used.
  • the attribute information of the map road element may include at least one of the following: semantic information, location information, shape information, and so on.
  • the above attribute information can be obtained by using a trained neural network for road element detection.
  • the above neural network can be trained with road images carrying label information (which may be called sample road images). The road elements in the sample road images have label information, and the label information can be attribute information of the sample road elements, including but not limited to one or more of the following: semantic information, shape information, location information, etc.
  • Training the neural network through sample road images can make the model have the ability to recognize the attribute information of road elements in the input road image.
  • the attribute information of the map road element in the image can be output.
  • the neural network's ability to recognize the types of road elements depends on the type of sample road elements used in the training process.
  • the model can be trained with more types of sample road elements to make it have higher recognition capabilities.
  • Figures 3A-3C show the effect of recognizing the semantic information of road elements.
  • Figure 3A is a road image input to the neural network model, which can be a vehicle collection image, a map road image, or another road image.
  • Figure 3B shows a road element identified by the neural network model, namely the horizontal thick solid line 31, whose semantic information is obtained as "stopline" 32, marked at the upper left of the picture.
  • Figure 3C shows further road elements recognized by the neural network model, namely the lines 33, for which the basic semantic information and the specific semantic information of each line are obtained: the basic semantic information is "laneline", and the specific semantic information is (from left to right) "white solid line", "white solid line", "white solid line", "white solid line", and "right edge", all marked at the upper left of the picture, as shown at 34 in Figure 3C.
  • the identification method of the map road element attribute information is not limited to the above, and can also be obtained by other identification methods.
  • S202 Determine the range of the local area corresponding to the vehicle positioning information according to the vehicle positioning information and the range of the map image.
  • the map image for a road includes all or most of the road elements of the road.
  • the vehicle acquisition image acquired during the positioning process is a partial area image of the road.
  • the range of the local area can be set manually. Therefore, the range of the local area corresponding to the vehicle positioning information can be determined according to empirical values and the like.
  • the local area corresponding to the vehicle positioning information is related to the field of view of the vision sensor. Therefore, the range of the local area corresponding to the vehicle positioning information can also be determined according to the field of view of the visual sensor.
  • S203 Take the map image in the local area as a root node, and sequentially query the attribute information of the map road elements on the map image in the local area.
  • This step is used to obtain the attribute information of the map road element on the map image within the local area.
  • A tree-like hierarchical relationship can be used to query the road information: take the map image within the local area as the root node; under the root node there are several rectangular areas, each set with a center point representing the location of the area, and each rectangular area corresponds to map road elements on the map image; the attribute information of the corresponding map road elements is then queried.
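A minimal sketch of one possible two-level structure matching this description, with the local-area map image as the root and rectangular regions as children; all class and field names are illustrative assumptions, not from the patent.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class MapRoadElement:
    semantics: str                      # e.g. "laneline", "stopline"
    position: Tuple[float, float]       # location in the map frame
    shape: List[Tuple[float, float]]    # polyline vertices

@dataclass
class RectRegion:
    center: Tuple[float, float]         # center point representing the region
    elements: List[MapRoadElement] = field(default_factory=list)

@dataclass
class LocalAreaRoot:                    # root node: map image in the local area
    regions: List[RectRegion] = field(default_factory=list)

def query_road_elements(root: LocalAreaRoot):
    """Walk the tree in order and yield each element's attribute information."""
    for region in root.regions:
        for elem in region.elements:
            yield elem.semantics, elem.position, elem.shape
```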
  • S204 Convert the map image based on the world global coordinate system to the vehicle body coordinate system to obtain the map image based on the vehicle body coordinate system.
  • the map image uses the world global coordinate system, and any point in the coordinate system has a unique corresponding coordinate (longitude and latitude information) on the earth.
  • For example, the ECEF (Earth-Centered, Earth-Fixed) coordinate system shown in FIG. 4A is a right-handed Cartesian coordinate system: the center of the earth is the origin of the coordinates, the direction from the origin to the intersection of the prime meridian and the equator (0 degrees latitude) is the positive direction of the x-axis, the direction from the origin to the north pole is the positive direction of the z-axis, and length is measured in meters.
  • the vehicle body coordinate system, shown in FIG. 4B, is also a right-handed Cartesian coordinate system, with the on-board high-precision inertial navigation center as the origin, the direction straight ahead of the vehicle as the positive direction of the x-axis, and the left side of the vehicle as the positive direction of the y-axis; length is measured in meters.
  • the world global coordinate system and the vehicle body coordinate system are both right-handed Cartesian rectangular coordinate systems, and the conversion between the two right-handed Cartesian rectangular coordinate systems requires only one rotation and translation matrix.
  • S204 includes: obtaining the rotation angle and translation amount of the rotation and translation matrix according to the vehicle positioning information; and according to the rotation and translation matrix, transforming the map image based on the world global coordinate system to the vehicle body coordinate system to obtain the vehicle body coordinate system Map image.
  • According to the vehicle positioning information, the rotation angle and translation amount of the rotation-translation matrix between the world global coordinate system and the vehicle body coordinate system can be determined. Therefore, the map image based on the world global coordinate system can be converted to the vehicle body coordinate system according to the rotation-translation matrix, obtaining a map image based on the vehicle body coordinate system.
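A minimal sketch of this conversion under a planar (2D) assumption, where the vehicle positioning information supplies a position (translation amount) and a yaw angle (rotation angle); names are illustrative.

```python
import numpy as np

def world_to_body(points_world, vehicle_xy, yaw):
    """Transform 2D map points from the world frame to the vehicle body frame.

    vehicle_xy: vehicle position in the world frame (translation amount).
    yaw:        vehicle heading in radians (rotation angle).
    """
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s],
                  [s,  c]])             # body -> world rotation
    # world -> body is the inverse: remove the translation, then rotate back
    # (row-vector form: right-multiplying by R applies R transposed).
    return (np.asarray(points_world, dtype=float) - np.asarray(vehicle_xy)) @ R
```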
  • S205 Convert the map image based on the vehicle body coordinate system to the camera coordinate system and/or the pixel coordinate system, so as to project the road information on the map image onto the vehicle collection image.
  • The road information on the map image is to be projected onto the vehicle collection image, and the vehicle collection image is based on the camera coordinate system or the pixel coordinate system; therefore, the above map image based on the vehicle body coordinate system needs to be converted to the camera coordinate system or the pixel coordinate system. A schematic diagram of the camera coordinate system and the pixel coordinate system is shown in FIG. 4C, where the camera coordinate system o-x-y-z is three-dimensional and the pixel coordinate system o'-x'-y' is two-dimensional.
  • In the case that the map image is a two-dimensional map, S205 includes: obtaining a homography matrix between the pixel coordinate system and the vehicle body coordinate system; representing the map image based on the vehicle body coordinate system in a homogeneous coordinate system; and converting, according to the homography matrix, the map image based on the vehicle body coordinate system represented in the homogeneous coordinate system to the pixel coordinate system, obtaining the road information of the vehicle collection image projected on the pixel coordinate system.
  • the transformation from the vehicle body coordinate system to the pixel coordinate system can be completed by using a homography matrix transformation.
  • the homography matrix reflects that a three-dimensional object can be projected onto multiple two-dimensional planes, and it can transform the projection of a three-dimensional object on one two-dimensional plane into its projection on another two-dimensional plane.
  • the homography matrix can be solved through algebraic analysis, and the homography matrix between the pixel coordinate system and the vehicle body coordinate system can be calibrated in advance using manual calibration data.
  • With this matrix, the transformation from one plane to another can be completed: the homogeneous coordinate system is used to represent the road information on the map image, and the coordinates of each piece of road information are multiplied by the homography matrix to obtain the road information based on the pixel coordinate system.
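A minimal sketch of this multiplication, assuming a pre-calibrated 3x3 homography H is given: the 2D points are lifted to homogeneous coordinates, multiplied by H, and de-homogenized.

```python
import numpy as np

def project_with_homography(H, points_body):
    """Map 2D points from the vehicle body (ground) plane to the pixel plane."""
    pts = np.asarray(points_body, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coordinates
    projected = homog @ H.T                           # apply the homography
    return projected[:, :2] / projected[:, 2:3]       # de-homogenize
```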
  • In the case that the map image is a three-dimensional map, S205 includes: converting the map image based on the vehicle body coordinate system to the camera coordinate system according to the rotation-translation matrix between the vehicle body coordinate system and the camera coordinate system, to obtain the road information of the vehicle collection image projected on the camera coordinate system; and converting the road information of the vehicle collection image projected on the camera coordinate system to the pixel coordinate system according to the projection matrix between the camera coordinate system and the pixel coordinate system, to obtain the road information of the vehicle collection image projected on the pixel coordinate system.
  • the internal and external parameters of the camera can be used to complete the conversion between the car body coordinate system, the camera coordinate system, and the pixel coordinate system.
  • the principle of camera imaging is pinhole imaging.
  • the internal parameters of the camera refer to the focal length of the convex lens of the camera and the coordinates of the optical center in the pixel coordinate system; the external parameters of the camera refer to the rotation and translation matrix between the camera coordinate system and the vehicle body coordinate system.
  • the camera coordinate system is a right-handed Cartesian coordinate system with the optical center of the camera as the origin and the direction directly in front of the camera as the positive direction of the z-axis.
  • the road information on the map image is rotated and translated into the camera coordinate system through the camera extrinsic parameters, and then the road information based on the camera coordinate system is projected onto the pixel coordinate system based on the scaling principle of pinhole imaging and the camera intrinsic parameters, obtaining the road information projected on the vehicle collection image.
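A minimal sketch of this two-stage projection, assuming given extrinsics (R, t, body to camera) and a 3x3 intrinsic matrix K, and assuming all points lie in front of the camera; names are illustrative.

```python
import numpy as np

def project_3d_points(points_body, R, t, K):
    """Project 3D map points (vehicle body frame) into pixel coordinates.

    R, t: camera extrinsics (rotation-translation, body -> camera).
    K:    camera intrinsics (focal lengths and principal point).
    """
    pts_cam = np.asarray(points_body, dtype=float) @ R.T + t  # body -> camera
    pix = pts_cam @ K.T                                       # pinhole projection
    return pix[:, :2] / pix[:, 2:3]                           # divide by depth z
```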
  • S206 Perform road information extraction processing on the vehicle collected image via a neural network for extracting road information to obtain perceived road information.
  • the perceived road information includes the attribute information of the perceived road element.
  • the attribute information of the perceived road element may include one or more kinds of information related to the perceived road element, such as semantic information, location information, shape information, etc. of the road element.
  • the perceived road elements can include road-related markings and signs, and can include at least one of the following: lane lines, stop lines, and turning lines on the road, as well as traffic signs, traffic lights, street lamps, etc. set beside or in front of the road.
  • the types of perceptual road elements and map road elements can be all the same, or they can be partly the same.
  • the perceived road elements should overlap with the map road elements in the map.
  • This overlap can refer to the overlap of perceived road elements and map road elements in the same coordinate system.
  • In practice, the road information projected from the map image may not completely overlap the actual road information in the vehicle collection image, so it is necessary to correct the road information projected on the vehicle collection image.
  • a neural network that has undergone preliminary training can be used to extract road information from images collected by vehicles to obtain perceived road information.
  • the above neural network can be trained with road images carrying label information (which may be called sample road images). The road elements in the sample road images carry label information, and the label information may be attribute information of the sample road elements, which may include, but is not limited to, one or more of the following: semantic information, shape information, location information, and so on.
  • Training the neural network through sample road images can make the model have the ability to recognize the attribute information of road elements in the input road image.
  • after a vehicle collection image is input, the attribute information of the perceived road elements in the image can be output.
  • the neural network's ability to recognize the types of road elements depends on the type of sample road elements used in the training process.
  • the model can be trained with more types of sample road elements, so that the neural network has a higher recognition ability.
  • S207 Correct the road information projected on the vehicle collection image according to the perceived road information.
  • S207 includes: determining the offset information between the perceived road element in the perceived road information and the map road element in the road information projected on the vehicle collection image, and correcting, according to the offset information, the road information projected on the vehicle collection image. This will be described in detail in the following examples.
  • In the embodiments of the present disclosure, the road information in the map data is projected onto the vehicle collection image by using the rich road information contained in the map data, which realizes automatic labeling of the road information of the vehicle collection image, improves the efficiency of annotating image data, helps reduce the error probability of data annotation, and reduces the labor cost of image data annotation; moreover, correcting the road information projected on the vehicle collection image according to the perceived road information improves the accuracy of image data annotation.
  • Fig. 5 shows a method for determining the offset information between the perceived road element and the map road element. As shown in Fig. 5, the method may include:
  • S301 According to the attribute information of the perceived road element, determine the map road element paired with the perceived road element from the map image.
  • For a perceived road element on the vehicle collection image, the map road element paired with it can usually be obtained from the map. That is, for a perceived road element, if it is neither misrecognized nor newly appeared after the map was created or last updated, then a map road element can usually be found on the map to correspond to it.
  • S302 Determine the position information of the paired perceived road element and the map road element in the same device coordinate system.
  • Since the position comparison needs to be performed in the same coordinate system, if the obtained position information of the perceived road element and the position information of the map road element are not in the same coordinate system, they need to be converted to the same coordinate system.
  • For example, if the map location information of a map road element is map location information in the latitude and longitude coordinate system, the map location information needs to be converted to the vehicle body coordinate system.
  • the following describes the GPS device coordinate system as an example.
  • the process of the coordinate system conversion can be divided into two steps: first, the map location information is converted from the latitude and longitude coordinate system (for example, the WGS84 coordinate system) to the UTM coordinate system; second, according to the vehicle positioning information, the map road elements are converted from the UTM coordinate system to the GPS device coordinate system.
  • the second step can be performed by first rotating by the angle θ between the direction of the front of the vehicle and due east, and then translating by the GPS longitude-latitude positioning information (x, y).
  • the conversion from the latitude and longitude coordinate system to the vehicle body coordinate system can be performed according to its conversion rules.
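A minimal sketch of this two-step conversion, delegating the WGS84-to-UTM step to the pyproj library (the EPSG zone shown is illustrative and depends on the vehicle's longitude) and then rotating by the heading-to-east angle θ; subtracting the vehicle's UTM position before rotating is one common arrangement of the rotate-and-translate step described above, shown here as an assumption rather than the patent's exact formulation.

```python
import numpy as np
from pyproj import Transformer

# WGS84 lat/lon -> UTM; EPSG:32651 (UTM zone 51N) is an illustrative choice.
to_utm = Transformer.from_crs("EPSG:4326", "EPSG:32651", always_xy=True)

def map_to_device(lon, lat, vehicle_utm_xy, theta):
    """Convert a map point to the GPS device coordinate system.

    theta: angle between the vehicle's heading and due east, in radians.
    """
    ex, ey = to_utm.transform(lon, lat)          # step 1: lat/lon -> UTM
    dx, dy = ex - vehicle_utm_xy[0], ey - vehicle_utm_xy[1]
    c, s = np.cos(theta), np.sin(theta)
    # step 2: rotate the UTM offset into the heading-aligned device frame
    return c * dx + s * dy, -s * dx + c * dy
```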
  • S303 Determine a positioning offset between the paired perceived road element and the map road element based on the location information.
  • the positioning offset between them can be determined based on the positions of the two.
  • the paired perceptual road elements and map road elements are converted to the same device coordinate system, and then the position information of the two is used to determine the positioning offset between them.
  • Fig. 6 shows a method for determining map road elements paired with perceived road elements from a map image. As shown in Fig. 6, the method may include:
  • S401 In the map image, search for map road elements within a preset range based on the vehicle positioning information.
  • the vehicle positioning information is the location information of the vehicle itself; for example, for an autonomous vehicle, it is the location information of the autonomous vehicle.
  • the location of the vehicle on the map image can be determined, so that map road elements within a set range can be found in the map image, that is, map road elements near the vehicle.
  • the perceptual road element of the image collected by the vehicle is the road element located near the vehicle during the positioning process of the vehicle. Therefore, finding the map road elements near the vehicle on the map is the most likely and fastest way to find the map road elements paired with the perceived road elements.
  • the preset range can be set according to requirements. For example, if high matching accuracy is required, the range can be set relatively large, so that more map road elements are obtained to pair with the perceived road elements in the subsequent process; if real-time performance matters more and a faster matching speed is desired, the range can be set relatively small.
  • the preset range may be, with the vehicle positioning information as the center point, on the order of 2 to 5 times the sum of the visual range of the visual sensor and the initial positioning error on the map, thereby balancing matching speed against accuracy. For example, with a visual range of 60 m and an initial positioning error of 10 m, the preset range can be set to (60+10)*2; that is, in this case, the preset range can be a 140 m * 140 m rectangular frame centered on the vehicle positioning.
  • S402 Pair the perceived road elements in the image collected by the vehicle with the map road elements within a preset range based on the attribute information to obtain multiple pairing schemes.
  • through enumeration, each perceived road element in the vehicle collection image can be paired with each map road element within the preset range, to obtain a variety of different pairing schemes.
  • the above-mentioned different pairing schemes differ in that at least one perceived road element is paired with the map road elements within the preset range in a different way.
  • Suppose the perceived road elements in the vehicle collection image include a1, a2, ..., aM, and the map road elements within the above preset range include b1, b2, ..., bN, where M and N are both positive integers and N is greater than or equal to M; that is, the number of map road elements is greater than, or at least equal to, the number of perceived road elements.
  • each of the resulting pairing schemes is a set of two-tuples, and each two-tuple (ai, bj) is one pairing of road elements.
  • In the pairing, the perceived road elements (a1, a2, ..., aM) all need to be paired, while the map road elements (b1, b2, ..., bN) may contain elements for which no matching target is found. Between any two different pairing schemes, at least one two-tuple (ai, bj) differs.
  • the pairwise pairing of perceived road elements and map road elements can be realized through a bipartite graph model. The steps include constructing a bipartite graph model based on the perceived road elements and the map road elements: each perceived road element in the vehicle collection image is abstracted as a point, and all perceived road elements form a perception point set; each map road element in the map is likewise abstracted as a point, and all map road elements form a map point set.
  • the perceived road elements with the same semantics can be sorted in order from the left of the vehicle to the right; the map road elements with the same semantics in the map are sorted using a similar method, and the points in the corresponding point sets are arranged according to the order of the road elements.
  • the perception point set and the map point set are connected by edges, and each edge represents the pairing relationship between a perception road element and a map road element. Different connection methods produce different pairing schemes, and each pairing scheme obtained is an edge set.
  • a bipartite graph matching method based on the above model can also be used to obtain a reasonable matching scheme from all the matching schemes.
  • the method includes: in all edge sets, selecting as many edge sets that do not intersect (not cross) as many edges as possible.
  • the disjointness mentioned here means that the two edges do not intersect when there is no common point, and the sequence numbers of the two vertices of one side in the point set are greater than the sequence numbers of the two vertices of the other side in the point set, so it can also It is understood as disjoint in the physical sense.
• an edge set in which the number of pairwise disjoint edges exceeds a set ratio or a set threshold can be called a reasonable edge set, that is, a reasonable pairing scheme is obtained; see Figure 7 for an example.
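• as an illustration of how such non-crossing pairing schemes can be enumerated, the following Python sketch pairs M left-to-right-ordered perceived lane lines with N map lane lines; the function name and the representation of a scheme as (perceived, map) index tuples are illustrative assumptions, not the patent's implementation:

```python
from itertools import combinations

def enumerate_noncrossing_schemes(num_perceived, num_map):
    """Enumerate all pairing schemes in which every perceived element is
    paired, in left-to-right order, with a distinct map element and no two
    edges cross. Each scheme is a list of (perceived_idx, map_idx) tuples."""
    schemes = []
    # Choosing M of the N map indices in increasing order guarantees that
    # the edges (0, j0), (1, j1), ... never cross each other.
    for map_indices in combinations(range(num_map), num_perceived):
        schemes.append(list(enumerate(map_indices)))
    return schemes

# Example: 2 perceived lane lines against 3 map lane lines.
for scheme in enumerate_noncrossing_schemes(2, 3):
    print(scheme)
# [(0, 0), (1, 1)], [(0, 0), (1, 2)], [(0, 1), (1, 2)]
```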
• confidence is an evaluation index of the pairing between perceived road elements and map road elements.
• in a pairing scheme, each perceived road element is paired with a map road element; the higher the consistency of semantic information between the paired elements, and the greater the number of matched pairs, the higher the confidence of the pairing scheme.
  • the confidence of each pairing scheme can be determined in the following way:
• A. Determine the individual similarity: the individual similarity refers to the similarity of the attribute information of the two elements in each two-tuple of the pairing scheme, which may include the similarity of semantic information, position information, shape information, and so on.
• the individual similarity between a perceived lane line and a map lane line can be calculated by the following formula (1), where the perceived lane line refers to a lane line in the vehicle-collected image, and the map lane line refers to a lane line in the map.
• Weight(i,j) = -Distance(i,j) + O_type(i,j) * LaneWidth + O_edgetype(i,j) * LaneWidth    (1)
• where Weight(i,j) represents the individual similarity, also called the weight, between the i-th perceived lane line (counted from left to right, the same below) and the j-th map lane line;
• Distance(i,j) represents the distance between the i-th perceived lane line and the j-th map lane line, where each lane line is abstracted as a line segment;
• the distance can be computed as a Euclidean segment-to-segment distance, i.e. the average of the distances from the two endpoints of one segment to the other segment. LaneWidth represents the lane width, i.e. the width between two adjacent lane lines. O_type(i,j) is 1 if and only if the lane line attributes of the i-th perceived lane line and the j-th map lane line are the same, and 0 otherwise, where the lane line attributes may include lane line color and line type, such as yellow solid line or white dashed line. O_edgetype(i,j) is 1 if and only if the edge-lane-line attributes of the i-th perceived lane line and the j-th map lane line are the same, and 0 otherwise, where the edge-lane-line attribute indicates whether the lane line belongs to the road edge.
  • Distance (i, j) is used to calculate the similarity of the position information between the perceived lane line and the lane under the map
  • LaneWidth is used to calculate the similarity of the shape information between them
  • O type (i, j) and O edgetype (i, j) is used to calculate the similarity of semantic information between them.
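• a minimal Python rendering of formula (1), assuming a nominal lane width and boolean attribute comparisons supplied by the caller (the function name and the 3.5 m default are illustrative assumptions):

```python
def lane_pair_weight(distance_ij, same_line_type, same_edge_type, lane_width=3.5):
    """Individual similarity Weight(i, j) per formula (1): a hedged sketch.
    distance_ij    : segment-to-segment distance between perceived lane line i
                     and map lane line j, in metres
    same_line_type : True if colour/line-type attributes match (O_type = 1)
    same_edge_type : True if both are (or both are not) road-edge lines
    lane_width     : assumed nominal lane width in metres
    """
    o_type = 1.0 if same_line_type else 0.0
    o_edgetype = 1.0 if same_edge_type else 0.0
    return -distance_ij + o_type * lane_width + o_edgetype * lane_width

# A close pair with matching attributes scores higher than a distant mismatch.
print(lane_pair_weight(0.4, True, True))    # 6.6
print(lane_pair_weight(4.0, False, True))   # -0.5
```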
• B. Determine the overall similarity: the overall similarity is an overall evaluation of the similarity of the attribute information across all two-tuples in a pairing scheme.
• the attribute information may include location information and semantic information.
• the overall similarity of location information can be represented by the variance of the distances between the two elements of all two-tuples: the smaller the variance, the more consistent the distances between the paired elements, and the higher the overall similarity of position information. The overall similarity of semantic information can be obtained by averaging, or weighted-averaging, the semantic similarities of the two elements of all two-tuples.
• C. Determine the confidence of each pairing scheme according to its individual similarities and overall similarity. For example, in each pairing scheme, the sum of the individual similarities of the two-tuples can be averaged, or weighted-averaged, with the overall similarity to obtain the confidence of the pairing scheme.
• in this way, the confidence of the pairing scheme is evaluated comprehensively, avoiding an extreme influence (extremely good or extremely poor) of any individual pairing on the confidence of the whole scheme and making the confidence calculation more reliable.
• formula (2) is an example of a function for calculating the confidence score of a pairing scheme; the score is computed from three parts: the sum of the individual similarities, the overall similarity of distance information, and the overall similarity of semantic information.
• match_weight_sum = sum(match_items_[pr_idx][hdm_idx].weight) + CalculateVarianceOfMatchResult(match_result) + CalculateMMConfidence(match_result)    (2)
  • match_weight_sum represents the confidence score of a pairing scheme
• sum(match_items_[pr_idx][hdm_idx].weight) represents the sum of the individual similarities of the two-tuples in the pairing scheme; it is calculated by summing the weights of the edges selected in the pairing scheme, i.e. the weights of the edges connecting the paired points of the two point sets;
  • CalculateVarianceOfMatchResult(match_result) represents the overall similarity of the distance information of each two-tuple in the pairing scheme, which is calculated by the variance of the distance between two elements in each two-tuple in the pairing scheme.
• the variance is taken over all of these distances. Theoretically, the distances between all paired perceived lane lines and map lane lines should be equal, i.e. the variance should be zero; in practice, errors are inevitably introduced, so the variance may not be zero;
• CalculateMMConfidence(match_result) represents the overall similarity of the semantic information of the two-tuples in the pairing scheme, calculated by comparing the semantic similarity of the two elements in each two-tuple. Taking lane lines as an example again, it can be judged whether the attributes of all paired lane lines are the same and whether their numbers match: for example, the confidence is 100% when all attributes are consistent, decreases by 10% for each pair of lane lines with inconsistent attributes, and decreases by 30% directly when the numbers do not match.
• by summing these three parts, the confidence score of the pairing scheme is obtained.
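• the following sketch combines the three parts in the spirit of formula (2); the variance term and the 10%/30% semantic penalties follow the text above, but the function and its inputs are otherwise assumptions rather than the patent's implementation:

```python
from statistics import pvariance

def pairing_confidence(weights, distances, semantic_matches, count_matches=True):
    """Confidence score in the spirit of formula (2): a hedged sketch.
    weights          : per-pair individual similarities from formula (1)
    distances        : per-pair perceived-to-map distances
    semantic_matches : per-pair booleans, True if attributes agree
    count_matches    : False if the numbers of elements do not match
    """
    weight_sum = sum(weights)
    # Overall similarity of position: a low variance of the pair distances
    # means the pairs are consistently offset, so the variance is penalised.
    variance_term = -pvariance(distances) if len(distances) > 1 else 0.0
    # Overall similarity of semantics: start at 1.0, drop 10% per attribute
    # mismatch and 30% if the element counts disagree (values per the text).
    semantic_conf = 1.0 - 0.1 * sum(1 for m in semantic_matches if not m)
    if not count_matches:
        semantic_conf -= 0.3
    return weight_sum + variance_term + max(semantic_conf, 0.0)

print(pairing_confidence([6.6, 6.1], [0.4, 0.5], [True, True]))  # ~13.70
```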
  • S404 Determine the map road element that is paired with the perceived road element in the matching scheme with the highest confidence level or exceeding the set threshold among the multiple matching schemes.
• the scheme with the highest confidence, or a scheme whose confidence exceeds the set threshold, can be used as the finally selected pairing scheme, so that the map road elements paired with the perceived road elements can be determined.
• in this way, the map road elements near the device are obtained from the map by using the vehicle positioning information, and are then paired with the perceived road elements.
• in the case that a perceived road element in the vehicle-collected image cannot be paired with any map road element within the preset range, an empty or virtual element is set among the map road elements to be paired with that perceived road element.
• ideally, the perceived road elements in the vehicle-collected image correspond one-to-one to the map road elements in the map; however, when a perceived road element is the result of misrecognition, or when a perceived road element appeared only after the map was built, no map road element corresponding to the perceived road element can be found.
• with empty or virtual elements, all perceived road elements have matching objects during the determination of the pairing schemes, making the set of pairing schemes richer and facilitating a comprehensive evaluation to select the best pairing scheme.
  • Fig. 8 shows a method for determining the positioning offset between the paired perceived road element and the map road element. As shown in Fig. 8, the method includes:
  • S501 Sampling the pixel points of the perceptual road element to obtain a set of perceptual sampling points.
  • the pixel points of the perceptual road element may be sampled at a fixed interval (for example, 0.1 m) to obtain a set of perceptual sampling points.
  • the perceived lane line can be abstracted as a set of points.
  • the lane lines can be arranged in the order from the left to the right of the vehicle, and the corresponding point sets are arranged from top to bottom according to the order of the lane lines.
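• a minimal sketch of the fixed-interval sampling of step S501, assuming a lane line given as a 2D polyline; the function name and representation are illustrative:

```python
import math

def resample_polyline(points, step=0.1):
    """Sample a lane-line polyline at a fixed arc-length interval, e.g.
    0.1 m, as in step S501; `points` is a list of (x, y) vertices."""
    samples = [points[0]]
    dist_to_next = step           # arc length remaining until the next sample
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        start = 0.0
        while dist_to_next <= seg - start:
            start += dist_to_next
            t = start / seg       # interpolate along the current segment
            samples.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            dist_to_next = step
        dist_to_next -= seg - start
    return samples

print(len(resample_polyline([(0.0, 0.0), (1.0, 0.0)], 0.1)))  # 11 points
```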
  • S502 Sampling the pixel points of the map road element to obtain a map sampling point set.
• a method similar to step S501 may be applied to sample the map road elements to obtain the map sampling point set.
  • S503 Determine a rotation and translation matrix between the sampling points included in the sensing sampling point set and the map sampling point set.
  • the closest point iteration method may be used to calculate the rotation and translation matrix between the two point sets.
  • Figure 9 shows a schematic diagram of the closest point iteration method.
• the left side of the arrow represents the two associated (paired) point sets input to the algorithm model.
• by applying the algorithm model, for example a least squares model, a rotation and translation matrix can be obtained.
• applying the rotation and translation matrix to the input point set brings the two point sets into overlap; the right side of the arrow shows the two overlapped point sets.
  • S504 Obtain a coordinate offset and a direction offset between the perceived road element and the map road element based on the rotation and translation matrix.
• the rotation and translation matrix obtained in step S503 gives the required positioning offset: the translation coefficients of the matrix correspond to the coordinate offset, and the rotation coefficient corresponds to the direction offset.
• the vehicle positioning information may be expressed as (x0, y0, θ0), and the positioning offset may be expressed as (dx, dy, dθ).
• the positioning information obtained by correcting the vehicle positioning information can then be written as in formula (3): (x, y, θ) = (x0 + dx, y0 + dy, θ0 + dθ)    (3).
• methods such as Kalman filtering, mean value calculation and weighted average calculation can be used to fuse the corrected positioning information with the original vehicle positioning information, avoiding excessive correction of the positioning information by the map information and making the image data labeling more reliable.
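• a sketch of the correction of formula (3) followed by a simple weighted-average fusion; the fusion weight alpha is an illustrative assumption, not a value from the disclosure:

```python
def fuse_positioning(pose, offset, alpha=0.5):
    """Correct vehicle positioning (x0, y0, theta0) with the offset
    (dx, dy, dtheta) per formula (3), then fuse the corrected pose with
    the original one by a weighted average so the map information cannot
    over-correct the positioning. alpha is an assumed fusion weight."""
    x0, y0, th0 = pose
    dx, dy, dth = offset
    corrected = (x0 + dx, y0 + dy, th0 + dth)
    return tuple(alpha * c + (1.0 - alpha) * p for c, p in zip(corrected, pose))

print(fuse_positioning((100.0, 50.0, 0.10), (0.4, -0.2, 0.01)))
```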
• the image data automatic labeling method provided by the embodiments of the present disclosure can be used to automatically label the positions and attributes of static road elements such as lane lines, stop lines, and signs in road images. The labeling algorithm is based on high-precision maps, which contain rich road elements and have centimeter-level accuracy; high-precision maps are one of the basic modules of automatic driving and already have widely used, mature acquisition schemes. The labeling method depends only on the high-precision map: if the map is sufficiently trustworthy, the labeling algorithm can achieve a sufficiently high accuracy rate. Moreover, the method is a by-product of the automatic driving system, so no additional cost is required.
• the main principle is to obtain the high-precision map information near the vehicle through high-precision positioning information, and to use the on-board camera parameters to project the map elements onto the road image, obtaining the corresponding road element positions and semantic information.
  • the embodiments of the present disclosure also provide a high-precision map building solution, which uses a low-cost high-precision positioning solution to assist in completing automatic labeling of image data.
  • the image data automatic labeling method is an accessory product of the automatic driving system, which uses the existing high-precision maps, high-precision positioning schemes, on-board camera calibration schemes, and auxiliary positioning and detection annotations of the automatic driving system.
• image lane line detection is performed by a deep learning model.
• the image data automatic labeling method first obtains the map information near the vehicle from the high-precision map according to the high-precision positioning scheme, then projects the map road elements into the road image according to the calibration parameters of the on-board camera, then extracts the offset between the lane lines detected by the image lane line detection deep learning model and the projected lane lines to calibrate the projection function, and finally obtains image road element labeling results with higher accuracy and precision.
  • the high-precision map used in the image data automatic labeling method can be obtained by simply processing the laser point cloud data obtained by the automatic driving data collection vehicle.
  • the point cloud of road elements such as lane lines and stop lines can be obtained by filtering the reflectivity of the laser point cloud, and then template matching, clustering and fitting methods are used to finally obtain a high-precision map containing rich road elements.
  • the method for automatically labeling image data provided by the present disclosure includes three parts: a map query module, a map information projection module, and a projection error correction module.
• the map query module, based on the high-precision map and high-precision positioning information, comprehensively uses vehicle-mounted GPS positioning equipment, vehicle-mounted high-precision inertial navigation equipment, and vision-based positioning correction information to obtain a positioning result of at least decimeter-level accuracy, and then, based on this positioning result, queries the road information in a 100 m area around the vehicle's location on the high-precision map, including lane line and stop line positions and attribute information.
• the map information projection module supports two projection methods: the first is based on 2D (2-Dimension) map information and a pre-calibrated camera homography matrix; the second is based on 3D (3-Dimension) map information and pre-calibrated camera intrinsic and extrinsic parameters.
• the two projection methods are essentially spatial transformations of geometric data: one is an affine transformation from 2D space to 2D space, the other a projective transformation from 3D space to 2D space.
• the projection error correction module uses a pre-prepared lane line detection deep learning model to extract the lane line positions and attribute information in the image, then minimizes the error between the extracted lane lines and the projected lane lines and optimizes the projection function, thereby obtaining optimized label information for the positions and attributes of road elements such as lane lines and stop lines.
  • the input of the map query module is high-precision map and high-precision positioning information
  • the output is local map information near the positioning location.
• the embodiments of the present disclosure are based on three coordinate systems: the world global coordinate system (including the WGS84 latitude-longitude coordinate system and the ECEF geocentric coordinate system), the vehicle body coordinate system, and the camera image pixel coordinate system; the three coordinate systems are shown in Figure 4A, Figure 4B and Figure 4C.
• the high-precision map uses the world global coordinate system, in which any point has unique coordinates on the earth, such as latitude and longitude; the WGS84 latitude-longitude coordinate system represents point coordinates as radian values, which is inconvenient to use,
• so the ECEF geocentric coordinate system is used instead.
  • the vehicle body coordinate system is also a right-hand Cartesian rectangular coordinate system
  • the camera image pixel coordinate system is a two-dimensional rectangular coordinate system with pixels as the unit.
• the query is a recursive process: first find the area closest to the positioning position, then find the road closest to the positioning position and, in turn, the corresponding lane line and stop line information.
• a KD tree is used to store the coordinates of each layer of the map in turn, so as to speed up the query process.
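• a sketch of the idea using SciPy's cKDTree, with made-up anchor coordinates for one map layer; the layer layout and the query radius are assumptions:

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical layer of map element anchor coordinates (x, y); one KD tree
# per map layer (areas, roads, lane lines, ...) speeds up nearest queries.
lane_line_anchors = np.array([[10.0, 2.0], [10.0, 5.5], [60.0, 2.0], [60.0, 5.5]])
tree = cKDTree(lane_line_anchors)

vehicle_xy = [12.0, 3.0]
# All lane-line anchors within a 10 m query radius of the positioning result.
idx = tree.query_ball_point(vehicle_xy, r=10.0)
print(idx)  # indices of nearby lane lines, e.g. [0, 1]
```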
• these road elements need to be converted from the world global coordinate system to the vehicle body local coordinate system; the conversion between two right-hand Cartesian rectangular coordinate systems requires only one rotation and translation matrix, whose rotation angle and translation amount are obtained from the positioning information.
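• a minimal 2D sketch of this world-to-vehicle conversion, assuming planar coordinates and a yaw angle taken from the positioning information; names are illustrative:

```python
import numpy as np

def world_to_vehicle(points_xy, vehicle_xy, heading):
    """Convert 2D world-frame points into the vehicle body frame using the
    single rotation-and-translation implied by the positioning information.
    heading is the vehicle yaw in radians."""
    c, s = np.cos(heading), np.sin(heading)
    rot = np.array([[c, s], [-s, c]])          # inverse (transpose) rotation
    return (np.asarray(points_xy) - np.asarray(vehicle_xy)) @ rot.T

pts = [[105.0, 52.0], [110.0, 50.0]]
print(world_to_vehicle(pts, vehicle_xy=[100.0, 50.0], heading=np.pi / 2))
```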
  • the road elements on the map need to be filtered at the end, and only the lane lines and stop lines within the camera's field of view are retained.
  • the road element information in the occluded part of the field of view can be filtered as needed. For example, nearby objects will occlude distant objects, but this step is not necessary.
  • the input of the map information projection module is local map information near the positioning location
  • the output is the map road element in the pixel coordinate system of the camera image.
• when the accuracy of the height information in the map is low, the first, 2D-based projection method can be used.
• the homography matrix between the camera image pixel coordinate system and the vehicle body coordinate system can be pre-calibrated from manual calibration data (the matrix is a 3*3 matrix with 8 degrees of freedom that realizes the affine transformation from one plane to another). It is then only necessary to represent the map element information in homogeneous coordinates and multiply the coordinates of each map element by the homography matrix to obtain the map road elements in the camera image pixel coordinate system.
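• a sketch of the homogeneous-coordinate projection, with a made-up homography H standing in for the pre-calibrated matrix:

```python
import numpy as np

def project_with_homography(H, points_xy):
    """Project 2D map points (vehicle body frame, metres) into the image
    with a pre-calibrated 3x3 homography."""
    pts = np.hstack([np.asarray(points_xy, float),
                     np.ones((len(points_xy), 1))])      # homogeneous coords
    uvw = pts @ H.T
    return uvw[:, :2] / uvw[:, 2:3]                      # divide by scale w

H = np.array([[200.0, 0.0, 640.0],     # hypothetical calibration result
              [0.0, -50.0, 700.0],
              [0.0, 0.01, 1.0]])
print(project_with_homography(H, [[0.0, 10.0], [1.75, 10.0]]))
```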
• the principle of camera imaging is pinhole imaging.
  • the internal parameters of the camera refer to the focal length of the convex lens of the camera and the coordinates of the optical center in the pixel coordinate system; the external parameters of the camera refer to the rotation and translation matrix between the camera coordinate system and the vehicle body coordinate system.
  • the camera coordinate system is the right-hand Cartesian coordinate system with the camera's optical center as the origin, and the upper and front of the camera are the positive directions of the y-axis and z-axis respectively.
• the map road elements are rotated and translated into the camera coordinate system through the camera extrinsic parameters, and then projected into the camera image pixel coordinate system according to the scaling principle of pinhole imaging and the camera intrinsic parameters.
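• a sketch of this extrinsic-then-intrinsic pinhole projection; K, R and t below are illustrative calibration values, not real ones:

```python
import numpy as np

def project_pinhole(K, R, t, points_vehicle):
    """Project 3D map points from the vehicle body frame into pixels:
    rotate/translate into the camera frame with extrinsics (R, t), then
    apply the intrinsic matrix K per the pinhole model."""
    pts = np.asarray(points_vehicle, float) @ R.T + t    # camera frame
    uvw = pts @ K.T
    return uvw[:, :2] / uvw[:, 2:3]                      # perspective divide

K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
R = np.array([[0.0, -1.0, 0.0],    # vehicle x (forward) -> camera z, etc.
              [0.0, 0.0, -1.0],
              [1.0, 0.0, 0.0]])
t = np.array([0.0, 1.2, 0.0])      # camera 1.2 m above the vehicle origin
print(project_pinhole(K, R, t, [[10.0, 0.0, 0.0], [10.0, 1.75, 0.0]]))
```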
• the input of the projection error correction module is the map road elements in the pixel coordinate system and the perceived lane line information extracted by the deep learning detection and segmentation model; the output is the corrected image labels.
• the map elements projected into the pixel coordinate system may not completely coincide with the real information in the image, so projection error correction is an extremely important step.
  • the existing deep learning detection segmentation model is used to extract all lane line information in the image, and these lane lines are regarded as perceptual lane lines.
• the characteristic of perceived lane lines is that their position information is highly accurate, while their attribute information, quantity information and completeness contain certain errors. The module mainly uses the offset between the perceived lane lines and the map lane lines to correct the projection function. The correction is divided into two steps: the first step is to find the correspondence between the map lane lines and the perceived lane lines; the second step is to minimize the distance between corresponding lane lines. Lane lines have a good total order, that is, they are generally arranged from left to right.
  • the map lane line and the perception lane line are both abstracted as points, and the perception point set and the map point set can be obtained.
• points in the perception point set are connected by edges to points in the map point set,
• while points within the perception point set are not connected to each other,
• and points within the map point set are not connected to each other, so that a bipartite graph model is obtained. The lane line matching problem can therefore be transformed into a bipartite graph matching problem.
  • Each side of the bipartite graph represents the pairing relationship between a perceived lane line and a map lane line.
• the bipartite graph model can be seen in Figure 7. Next, weights are assigned to the edges of the bipartite graph.
• the weight of an edge can be set to the degree of similarity between the two lane lines plus the negative of the distance between them.
  • the degree of similarity can be quantified by the similarity of the location information, the similarity of the shape information, and whether the attributes of the lane line match.
  • the distance between the lane line and the lane line can be converted into the average distance from one point set to another curve.
• the goal is to perform a bipartite graph matching search to find a disjoint edge set with the largest sum of edge weights, where disjoint means that no two edges in the set share an endpoint.
  • the maximum sum of edge weights indicates that the lane line matching scheme is optimal, that is, the map lane line and the perceived lane line have the highest similarity under this matching scheme.
• the problem is then converted into minimizing the distance between corresponding lane lines.
  • the lane line can be expressed as a curve, and a point set can be obtained by sampling the points on the curve.
• the final problem is thus converted into minimizing the distance from one point set to the other, which can be solved by the iterative closest point (ICP) algorithm.
• the steps of the ICP algorithm include: (1) pair the points in the two input point sets according to the closest-point pairing principle; (2) substitute the point coordinates into the least squares formula to find the rotation and translation matrix that minimizes the distance between the paired points after one of the point sets is transformed by rotation and translation; (3) solve the rotation and translation matrix by singular value decomposition, which is the optimal solution of the optimization problem; applying this rotation and translation matrix brings the two point sets into overlap (that is, the overlap of the above-mentioned perception point set and map point set).
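• a compact 2D ICP sketch following the three steps above (closest-point pairing, least squares solved by SVD, overlap); the synthetic test data and all names are illustrative:

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_2d(src, dst, iters=20):
    """Minimal 2D ICP sketch: (1) pair each source point with its closest
    destination point, (2)-(3) solve the rotation/translation in closed
    form with SVD (Kabsch), then repeat until the sets overlap."""
    src = np.asarray(src, float).copy()
    dst = np.asarray(dst, float)
    tree = cKDTree(dst)
    R_total, t_total = np.eye(2), np.zeros(2)
    for _ in range(iters):
        matched = dst[tree.query(src)[1]]           # closest-point pairing
        mu_s, mu_d = src.mean(0), matched.mean(0)
        U, _, Vt = np.linalg.svd((src - mu_s).T @ (matched - mu_d))
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:                    # guard against reflection
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_d - R @ mu_s
        src = src @ R.T + t                         # apply this iteration's fix
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total

# Synthetic test: perception points are map points shifted by (0.2, -0.1).
rng = np.random.default_rng(0)
map_pts = rng.random((50, 2)) * 10
per_pts = map_pts + np.array([0.2, -0.1])
R, t = icp_2d(per_pts, map_pts)
print(np.round(t, 2))   # recovered correction, close to (-0.2, 0.1)
```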
• the ICP algorithm outputs a correction amount; by adding the correction amount to all map elements, map road element information fully consistent with the road image can be obtained. This information includes position and attribute information, i.e. the image annotation information.
• the image data automatic labeling method provided by the embodiments of the present disclosure can be deployed on autonomous vehicles to continuously and automatically obtain labeling data at no extra cost and to build large-scale data sets, which can be used for deep learning research or application model training;
• the labeling algorithm can obtain and classify labeling data for different weather conditions, time periods, and regions, and the classified data can be used for model training related to style conversion. This method essentially projects map information onto images to complete image labeling for deep learning model training; conversely, the same process can be used to project the road element information recognized by a deep learning model into global coordinates for automated mapping.
  • the execution subject of the image data automatic labeling method may be an image processing device.
  • the image data automatic labeling method may be executed by a terminal device or a server or other processing equipment, where the terminal device may be a user Equipment (User Equipment, UE), mobile devices, user terminals, terminals, cellular phones, cordless phones, personal digital assistants (PDAs), handheld devices, computing devices, in-vehicle devices, wearable devices, etc.
  • the method for automatically labeling image data may be implemented by a processor invoking computer-readable instructions stored in a memory.
  • an embodiment of the present disclosure also provides an image data automatic labeling device 1000, which can be applied to the above-mentioned FIG. 1, FIG. 2 and FIG. 5.
• the device 1000 includes: a first acquisition part 11 configured to acquire vehicle positioning information, a map image, and a vehicle-collected image, wherein the map image includes road information; a second acquisition part 12 configured to acquire, according to the vehicle positioning information, the road information on the map image in the local area corresponding to the vehicle positioning information;
  • the projection part 13 is configured to project the road information on the map image onto the vehicle collection image, so as to The road information is marked on the image collected by the vehicle.
  • the second acquiring part 12 is configured to take the map image in the local area as a root node, and sequentially query the attribute information of the map road elements of the map image in the local area.
  • the attribute information of the map road element includes at least one of the following information: semantic information of the map road element, location information of the map road element, and shape information of the map road element.
  • the device 1000 further includes: a first determining part 14 configured to determine the range of the local area according to the vehicle positioning information and the range of the map image;
  • the second acquiring part 12 is configured to acquire attribute information of map road elements on the map image within the range of the local area.
  • the map image is based on the world global coordinate system
• the device further includes: a first conversion part 15 configured to convert the map image based on the world global coordinate system to the vehicle body coordinate system, to obtain a map image based on the vehicle body coordinate system; the projection part 13 is configured to convert the map image based on the vehicle body coordinate system to a camera coordinate system and/or a pixel coordinate system, so as to project the road information on the map image onto the vehicle-collected image.
• the first conversion part 15 is configured to obtain the rotation angle and translation amount of the rotation and translation matrix according to the vehicle positioning information, and to convert the map image based on the world global coordinate system to the vehicle body coordinate system according to the rotation and translation matrix, so as to obtain a map image based on the vehicle body coordinate system.
  • the map image is a two-dimensional map
• the projection part 13 includes: a third acquiring part 131 configured to acquire the homography matrix between the pixel coordinate system and the vehicle body coordinate system; a representation part 132 configured to represent the map image based on the vehicle body coordinate system in a homogeneous coordinate system; and a second conversion part 133 configured to convert, according to the homography matrix, the map image based on the vehicle body coordinate system represented in homogeneous coordinates to the pixel coordinate system, so as to obtain the road information of the vehicle-collected image projected on the pixel coordinate system.
  • the map image is a three-dimensional map
• the projection part 13 includes: a third conversion part configured to convert the map image based on the vehicle body coordinate system to the camera coordinate system according to the rotation and translation matrix between the vehicle body coordinate system and the camera coordinate system, to obtain the road information of the vehicle-collected image projected on the camera coordinate system; and a fourth conversion part configured to convert, according to the projection matrix between the camera coordinate system and the pixel coordinate system, the road information of the vehicle-collected image projected on the camera coordinate system to the pixel coordinate system, to obtain the road information of the vehicle-collected image projected on the pixel coordinate system.
• the device further includes: an extraction part 16 configured to perform road information extraction processing on the vehicle-collected image via a neural network for extracting road information, to obtain perceived road information; and a first correcting part 17 configured to correct, based on the perceived road information, the road information projected on the vehicle-collected image.
• the first correction part 17 includes: a second determination part configured to determine the offset information between the perceived road elements in the perceived road information and the map road elements in the road information projected on the vehicle-collected image; and a second correction part configured to correct the road information projected on the vehicle-collected image according to the offset information.
• the second determining part includes: a third determining part configured to determine, from the map image, the map road element paired with the perceived road element according to the attribute information of the perceived road element; a fourth determining part configured to determine the position information of the paired perceived road element and map road element in the same device coordinate system; and a fifth determining part configured to determine, based on the position information, the positioning offset between the paired perceived road element and map road element.
• the third determining part includes: a searching part configured to search for map road elements within a preset range in the map image based on the vehicle positioning information; a matching part configured to pair the perceived road elements in the vehicle-collected image with the map road elements within the preset range based on the attribute information to obtain multiple pairing schemes, wherein different pairing schemes differ in the pairing of at least one perceived road element with the map road elements within the preset range; a sixth determining part configured to determine the confidence of each pairing scheme; and a seventh determining part configured to determine, in the pairing scheme with the highest confidence or with confidence exceeding a set threshold among the multiple pairing schemes, the map road elements paired with the perceived road elements.
• the matching part is further configured to, in the case that a perceived road element in the vehicle-collected image cannot be paired with any map road element within the preset range, set an empty or virtual element among the map road elements to be paired with that perceived road element.
• the sixth determining part is configured to separately determine the individual similarity of each pairing of a perceived road element and a map road element in each pairing scheme; determine the overall similarity of the pairings of perceived road elements and map road elements in each pairing scheme; and determine the confidence of each pairing scheme according to its individual similarities and overall similarity.
  • the positioning offset includes a coordinate offset and/or a direction offset
• the fifth determining part includes: a first sampling part configured to sample the pixel points of the perceived road element to obtain a perception sampling point set;
  • the second sampling part is configured to sample the pixel points of the map road elements to obtain a set of map sampling points;
• the eighth determining part is configured to determine the rotation and translation matrix between the sampling points included in the perception sampling point set and the map sampling point set;
• the fourth obtaining part is configured to obtain the coordinate offset and direction offset between the perceived road element and the map road element based on the rotation and translation matrix.
• the rich road information contained in the map data can be used to realize automatic labeling of the road information in vehicle-collected images, which improves the efficiency of image data labeling, helps reduce the error probability of data labeling, and reduces the labor cost of image data annotation.
  • the embodiment of the present disclosure also provides an automatic labeling device for image data, which is used to execute the above-mentioned automatic labeling method for image data.
  • Part or all of the above methods can be implemented by hardware, and can also be implemented by software or firmware.
  • the device may be a chip or an integrated circuit.
  • the device 1100 may include: an input device 111, an output device 112, a memory 113, and a processor 114 (the processor 114 in the device may be one or more, and one processor is taken as an example in FIG. 11) .
  • the input device 111, the output device 112, the memory 113, and the processor 114 may be connected by a bus or in other ways. Among them, the connection by a bus is taken as an example in FIG. 11.
  • the processor 114 is configured to execute the method steps executed by the devices in FIG. 1, FIG. 2, FIG. 5, FIG. 6, and FIG. 8.
  • the program of the foregoing image data automatic labeling method may be stored in the memory 113.
  • the memory 113 may be a physically independent part, or may be integrated with the processor 114.
  • the memory 113 can also be used to store data.
  • the device may also only include a processor.
  • the memory for storing the program is located outside the device, and the processor is connected to the memory through a circuit or wire for reading and executing the program stored in the memory.
• the processor may be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
  • the processor may further include a hardware chip.
  • the aforementioned hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof.
  • the above-mentioned PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a general array logic (generic array logic, GAL), or any combination thereof.
  • the memory may include volatile memory (volatile memory), such as random-access memory (RAM); the memory may also include non-volatile memory (non-volatile memory), such as flash memory (flash memory) , Hard disk drive (HDD) or solid-state drive (solid-state drive, SSD); the memory may also include a combination of the foregoing types of memory.
• the rich road information contained in the map data can be used to automatically label the road information of vehicle-collected images, which improves the efficiency of image data labeling, helps reduce the error probability of data labeling, and reduces the labor cost of image data labeling.
  • the embodiments of the present disclosure may be systems, methods, and/or computer program products.
  • the computer program product may include a computer-readable storage medium loaded with computer-readable program instructions for enabling a processor to implement various aspects of the embodiments of the present disclosure.
  • the computer-readable storage medium may be a tangible device that can hold and store instructions used by the instruction execution device.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
• examples of computer-readable storage media include: portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), static random access memory (SRAM), portable compact disk read-only memory (CD-ROM), digital versatile disks (DVD), memory sticks, floppy disks, and mechanical encoding devices such as punch cards or raised structures in grooves on which instructions are stored, as well as any suitable combination of the foregoing.
  • the computer-readable storage medium used herein is not interpreted as a transient signal itself, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media, or electrical signals transmitted through wires.
  • the computer-readable program instructions described herein can be downloaded from a computer-readable storage medium to various computing/processing devices, or downloaded to an external computer or external storage device via a network, such as the Internet, a local area network, a wide area network, and/or a wireless network.
  • the network may include copper transmission cables, optical fiber transmission, wireless transmission, routers, firewalls, switches, gateway computers, and/or edge servers.
  • the network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network, and forwards the computer-readable program instructions for storage in the computer-readable storage medium in each computing/processing device .
• the computer program instructions used to perform the operations of the embodiments of the present disclosure may be assembly instructions, instruction set architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" language or similar programming languages.
• computer-readable program instructions can be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
• in the latter case, the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (for example, through the Internet using an Internet service provider).
• an electronic circuit, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), can be customized by using the state information of the computer-readable program instructions, and the electronic circuit can execute computer-readable program instructions to implement various aspects of the embodiments of the present disclosure.
• these computer-readable program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, or another programmable data processing device, thereby producing a machine such that, when the instructions are executed by the processor, a device is produced that implements the functions/actions specified in one or more blocks of the flowcharts and/or block diagrams. These computer-readable program instructions can also be stored in a computer-readable storage medium.
  • each block in the flowchart or block diagram may represent a module, program segment, or part of an instruction, and the module, program segment, or part of an instruction includes one or more executable instructions for realizing specified logical functions.
  • the functions marked in the block may also occur in a different order than the order marked in the drawings. For example, two consecutive blocks can actually be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved.
• each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Automation & Control Theory (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)
  • Traffic Control Systems (AREA)
  • Navigation (AREA)

Abstract

An image data automatic labeling method and device are disclosed. The method comprises: obtaining vehicle positioning information, map data, and a vehicle-collected image (S101); obtaining, according to the vehicle positioning information, the road information on the map image in the local area corresponding to the vehicle positioning information (S102); and projecting the road information on the map image onto the vehicle-collected image, so as to label the road information on the vehicle-collected image (S103).
PCT/CN2020/122514 2019-10-16 2020-10-21 Procédé de marquage automatique de données d'image et dispositif WO2021073656A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
KR1020217017022A KR20220053513A (ko) 2019-10-16 2020-10-21 이미지 데이터 자동 라벨링 방법 및 장치
JP2021539968A JP2022517961A (ja) 2019-10-16 2020-10-21 画像データを自動的にアノテーションする方法及び装置

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910983438.9 2019-10-16
CN201910983438.9A CN112667837A (zh) 2019-10-16 2019-10-16 图像数据自动标注方法及装置

Publications (1)

Publication Number Publication Date
WO2021073656A1 true WO2021073656A1 (fr) 2021-04-22

Family

ID=75400660

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/122514 WO2021073656A1 (fr) 2019-10-16 2020-10-21 Procédé de marquage automatique de données d'image et dispositif

Country Status (4)

Country Link
JP (1) JP2022517961A (fr)
KR (1) KR20220053513A (fr)
CN (1) CN112667837A (fr)
WO (1) WO2021073656A1 (fr)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113609241A (zh) * 2021-08-13 2021-11-05 武汉市交通发展战略研究院 一种道路网络与公交线网匹配方法与***
CN113822943A (zh) * 2021-09-17 2021-12-21 中汽创智科技有限公司 一种相机的外参标定方法、装置、***及存储介质
US20210405651A1 (en) * 2020-06-26 2021-12-30 Tusimple, Inc. Adaptive sensor control
CN114111817A (zh) * 2021-11-22 2022-03-01 武汉中海庭数据技术有限公司 基于slam地图与高精度地图匹配的车辆定位方法及***
CN114419882A (zh) * 2021-12-30 2022-04-29 联通智网科技股份有限公司 感知***布置参数优化方法、设备终端及存储介质
CN114782447A (zh) * 2022-06-22 2022-07-22 小米汽车科技有限公司 路面检测方法、装置、车辆、存储介质及芯片
CN115223118A (zh) * 2022-06-09 2022-10-21 广东省智能网联汽车创新中心有限公司 一种高精地图置信度判断方法、***及车辆
US20230030660A1 (en) * 2021-08-02 2023-02-02 Nio Technology (Anhui) Co., Ltd Vehicle positioning method and system for fixed parking scenario
CN116468870A (zh) * 2023-06-20 2023-07-21 佛山科学技术学院 一种城市道路三维可视化建模方法及***
US11877066B2 (en) 2018-09-10 2024-01-16 Tusimple, Inc. Adaptive illumination for a time-of-flight camera on a vehicle
US11932238B2 (en) 2020-06-29 2024-03-19 Tusimple, Inc. Automated parking technology
JP7494809B2 (ja) 2021-06-29 2024-06-04 株式会社デンソー 支援装置、支援方法、支援プログラム
KR102676291B1 (ko) 2023-06-28 2024-06-19 주식회사 카비 딥 러닝 학습 데이터 구축을 위하여 영상데이터에서 이미지프레임 자동 추출 및 레이블링 방법 및 장치

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113205447A (zh) * 2021-05-11 2021-08-03 北京车和家信息技术有限公司 用于车道线识别的道路图片标注方法和装置
CN113269165B (zh) * 2021-07-16 2022-04-22 智道网联科技(北京)有限公司 数据获取方法及其装置
CN114136333A (zh) * 2021-10-15 2022-03-04 阿波罗智能技术(北京)有限公司 基于分层特征的高精地图道路数据生成方法、装置、设备
CN114018240A (zh) * 2021-10-29 2022-02-08 广州小鹏自动驾驶科技有限公司 一种地图数据的处理方法和装置
CN115526987A (zh) * 2022-09-22 2022-12-27 清华大学 基于单目相机的标牌要素重建方法、***、设备及介质
CN117894015B (zh) * 2024-03-15 2024-05-24 浙江华是科技股份有限公司 点云标注数据优选方法及***

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11271074A (ja) * 1998-03-20 1999-10-05 Fujitsu Ltd 目印画像照合装置及び目印画像照合方法及びプログラム記憶媒体
CN105701449A (zh) * 2015-12-31 2016-06-22 百度在线网络技术(北京)有限公司 路面上的车道线的检测方法和装置
CN108305475A (zh) * 2017-03-06 2018-07-20 腾讯科技(深圳)有限公司 一种交通灯识别方法及装置
CN109949439A (zh) * 2019-04-01 2019-06-28 星觅(上海)科技有限公司 行车实景信息标注方法、装置、电子设备和介质
CN110135323A (zh) * 2019-05-09 2019-08-16 北京四维图新科技股份有限公司 图像标注方法、装置、***及存储介质
CN110136199A (zh) * 2018-11-13 2019-08-16 北京初速度科技有限公司 一种基于摄像头的车辆定位、建图的方法和装置

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5971112B2 (ja) * 2012-12-25 2016-08-17 富士通株式会社 画像処理方法、画像処理装置および画像処理プログラム
US10209089B2 (en) * 2017-04-03 2019-02-19 Robert Bosch Gmbh Automated image labeling for vehicles based on maps
JP6908843B2 (ja) * 2017-07-26 2021-07-28 富士通株式会社 画像処理装置、画像処理方法、及び画像処理プログラム
US11544938B2 (en) * 2018-12-21 2023-01-03 Continental Autonomous Mobility US, LLC Systems and methods for automatic labeling of images for supervised machine learning
WO2020210127A1 (fr) * 2019-04-12 2020-10-15 Nvidia Corporation Entraînement de réseau neuronal à l'aide de données de réalité de terrain augmentées avec des informations cartographiques pour des applications de machine autonome
CN112069856B (zh) * 2019-06-10 2024-06-14 商汤集团有限公司 地图生成方法、驾驶控制方法、装置、电子设备及***

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11271074A (ja) * 1998-03-20 1999-10-05 Fujitsu Ltd 目印画像照合装置及び目印画像照合方法及びプログラム記憶媒体
CN105701449A (zh) * 2015-12-31 2016-06-22 百度在线网络技术(北京)有限公司 路面上的车道线的检测方法和装置
CN108305475A (zh) * 2017-03-06 2018-07-20 腾讯科技(深圳)有限公司 一种交通灯识别方法及装置
CN110136199A (zh) * 2018-11-13 2019-08-16 北京初速度科技有限公司 一种基于摄像头的车辆定位、建图的方法和装置
CN109949439A (zh) * 2019-04-01 2019-06-28 星觅(上海)科技有限公司 行车实景信息标注方法、装置、电子设备和介质
CN110135323A (zh) * 2019-05-09 2019-08-16 北京四维图新科技股份有限公司 图像标注方法、装置、***及存储介质

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11877066B2 (en) 2018-09-10 2024-01-16 Tusimple, Inc. Adaptive illumination for a time-of-flight camera on a vehicle
US20210405651A1 (en) * 2020-06-26 2021-12-30 Tusimple, Inc. Adaptive sensor control
US11932238B2 (en) 2020-06-29 2024-03-19 Tusimple, Inc. Automated parking technology
JP7494809B2 (ja) 2021-06-29 2024-06-04 株式会社デンソー 支援装置、支援方法、支援プログラム
US20230030660A1 (en) * 2021-08-02 2023-02-02 Nio Technology (Anhui) Co., Ltd Vehicle positioning method and system for fixed parking scenario
CN113609241B (zh) * 2021-08-13 2023-11-14 武汉市规划研究院(武汉市交通发展战略研究院) 一种道路网络与公交线网匹配方法与***
CN113609241A (zh) * 2021-08-13 2021-11-05 武汉市交通发展战略研究院 一种道路网络与公交线网匹配方法与***
CN113822943A (zh) * 2021-09-17 2021-12-21 中汽创智科技有限公司 一种相机的外参标定方法、装置、***及存储介质
CN113822943B (zh) * 2021-09-17 2024-06-11 中汽创智科技有限公司 一种相机的外参标定方法、装置、***及存储介质
CN114111817B (zh) * 2021-11-22 2023-10-13 武汉中海庭数据技术有限公司 基于slam地图与高精度地图匹配的车辆定位方法及***
CN114111817A (zh) * 2021-11-22 2022-03-01 武汉中海庭数据技术有限公司 基于slam地图与高精度地图匹配的车辆定位方法及***
CN114419882A (zh) * 2021-12-30 2022-04-29 联通智网科技股份有限公司 感知***布置参数优化方法、设备终端及存储介质
CN115223118A (zh) * 2022-06-09 2022-10-21 广东省智能网联汽车创新中心有限公司 一种高精地图置信度判断方法、***及车辆
CN115223118B (zh) * 2022-06-09 2024-03-01 广东省智能网联汽车创新中心有限公司 一种高精地图置信度判断方法、***及车辆
CN114782447A (zh) * 2022-06-22 2022-07-22 小米汽车科技有限公司 路面检测方法、装置、车辆、存储介质及芯片
CN116468870A (zh) * 2023-06-20 2023-07-21 佛山科学技术学院 一种城市道路三维可视化建模方法及***
CN116468870B (zh) * 2023-06-20 2024-01-23 佛山科学技术学院 一种城市道路三维可视化建模方法及***
KR102676291B1 (ko) 2023-06-28 2024-06-19 주식회사 카비 딥 러닝 학습 데이터 구축을 위하여 영상데이터에서 이미지프레임 자동 추출 및 레이블링 방법 및 장치

Also Published As

Publication number Publication date
CN112667837A (zh) 2021-04-16
KR20220053513A (ko) 2022-04-29
JP2022517961A (ja) 2022-03-11

Similar Documents

Publication Publication Date Title
WO2021073656A1 (fr) Procédé de marquage automatique de données d'image et dispositif
WO2020224305A1 (fr) Procédé et appareil de positionnement de dispositif et dispositif
WO2021233029A1 (fr) Procédé de localisation et de cartographie simultanées, dispositif, système et support de stockage
CN110146910B (zh) 一种基于gps与激光雷达数据融合的定位方法及装置
US20200089971A1 (en) Sensor calibration method and device, computer device, medium, and vehicle
CN112861653A (zh) 融合图像和点云信息的检测方法、***、设备及存储介质
US11625851B2 (en) Geographic object detection apparatus and geographic object detection method
Xiao et al. Monocular localization with vector HD map (MLVHM): A low-cost method for commercial IVs
CN111121754A (zh) 移动机器人定位导航方法、装置、移动机器人及存储介质
WO2020043081A1 (fr) Technique de positionnement
CN112232275B (zh) 基于双目识别的障碍物检测方法、***、设备及存储介质
EP4105600A2 (fr) Procédé de production automatique de données cartographiques, appareil associé et produit de programme informatique
WO2020258297A1 (fr) Procédé de segmentation sémantique d'image, plateforme mobile et support de stockage
CN114111774B (zh) 车辆的定位方法、***、设备及计算机可读存储介质
CN112150448B (zh) 图像处理方法、装置及设备、存储介质
CN116997771A (zh) 车辆及其定位方法、装置、设备、计算机可读存储介质
Liao et al. SE-Calib: Semantic Edge-Based LiDAR–Camera Boresight Online Calibration in Urban Scenes
CN111833443A (zh) 自主机器应用中的地标位置重建
CN113838129A (zh) 一种获得位姿信息的方法、装置以及***
KR102249381B1 (ko) 3차원 영상 정보를 이용한 모바일 디바이스의 공간 정보 생성 시스템 및 방법
CN109785388B (zh) 一种基于双目摄像头的短距离精确相对定位方法
WO2023283929A1 (fr) Procédé et appareil permettant d'étalonner des paramètres externes d'une caméra binoculaire
CN112880692B (zh) 地图数据标注方法及装置、存储介质
CN115345944A (zh) 外参标定参数确定方法、装置、计算机设备和存储介质
CN112802095B (zh) 定位方法、装置及设备、以及自动驾驶定位***

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20876002

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021539968

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20876002

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 031122)
