CN116311157A - Obstacle recognition method and obstacle recognition model training method - Google Patents


Info

Publication number
CN116311157A
CN116311157A
Authority
CN
China
Prior art keywords
image
obstacle
recognition
sample
images
Prior art date
Legal status
Pending
Application number
CN202310180237.1A
Other languages
Chinese (zh)
Inventor
苏金明
段祎婷
赵佳伟
杨旺旺
郭亭佚
罗钧峰
魏晓林
Current Assignee
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN202310180237.1A
Publication of CN116311157A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/50 Context or environment of the image
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses an obstacle recognition method and an obstacle recognition model training method, and belongs to the technical field of automatic driving. The method comprises the following steps: acquiring an image of a target scene; recognizing the image through a closed-set recognition model to obtain a first recognition result, wherein the closed-set recognition model is used for recognizing obstacle images of the known classes it has learned, and the first recognition result comprises the recognized obstacle images and the classes of the obstacle images; recognizing the image through an open-set recognition model to obtain a second recognition result, wherein the open-set recognition model is used for recognizing obstacle images of unknown classes based on the learned obstacle images of known classes, and the second recognition result comprises the recognized obstacle images; and fusing the first recognition result and the second recognition result to obtain a third recognition result. The method improves the accuracy of the recognition result.

Description

Obstacle recognition method and obstacle recognition model training method
Technical Field
The application relates to the technical field of automatic driving, in particular to an obstacle recognition method and an obstacle recognition model training method.
Background
Automatic driving is a technology in which a computer system controls a vehicle to travel on a road automatically. Because actual road conditions are complex and contain a large number of obstacles such as pedestrians and vehicles, recognizing obstacles, and then planning a driving route that avoids them, has become a key problem in automatic driving.
In the related art, a camera is arranged on an autonomous vehicle, environment images are collected by the camera, and obstacles in the environment images are recognized by a closed-set recognition model. However, the closed-set recognition model is trained on manually labeled samples: if only cars, pedestrians and bicycles are labeled in the samples, the model can only recognize cars, pedestrians and bicycles. Because the obstacles on the roads where autonomous vehicles travel are diverse, a closed-set recognition model often misses detections, so the accuracy of obstacle recognition is low.
Disclosure of Invention
The embodiment of the application provides an obstacle recognition method and an obstacle recognition model training method, which improve the accuracy of obstacle recognition. The technical scheme is as follows:
In one aspect, an obstacle recognition method is provided, the method comprising:
acquiring an image of a target scene;
recognizing the image through a closed-set recognition model to obtain a first recognition result, wherein the closed-set recognition model is used for recognizing obstacle images of the known classes it has learned, and the first recognition result comprises the recognized obstacle images and the classes of the obstacle images;
recognizing the image through an open-set recognition model to obtain a second recognition result, wherein the open-set recognition model is used for recognizing obstacle images of unknown classes based on the learned obstacle images of known classes, and the second recognition result comprises the recognized obstacle images;
and fusing the first recognition result and the second recognition result to obtain a third recognition result.
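The fusion step above can be sketched as follows; the box representation, the IoU threshold, and the `unknown` label for unmatched open-set detections are illustrative assumptions rather than details taken from the claims:

```python
# Hypothetical sketch of the result-fusion step: detections from the closed-set
# model (with class labels) and the class-agnostic open-set model are merged,
# and open-set boxes that overlap a closed-set box are treated as duplicates.

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def fuse_results(closed_dets, open_dets, iou_thr=0.5):
    """closed_dets: [(box, class_name)]; open_dets: [box] -> third result."""
    fused = [(box, cls) for box, cls in closed_dets]
    for box in open_dets:
        # keep an open-set detection only if no known-class box already covers it
        if all(iou(box, cbox) < iou_thr for cbox, _ in closed_dets):
            fused.append((box, "unknown"))
    return fused
```

The closed-set result contributes its class labels; the open-set result contributes only the obstacles the closed-set model missed, which matches the complementary roles the claims describe.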
In a possible implementation manner, the determining, based on the second recognition result, a distance between the obstacle and the road in the recognized obstacle image includes:
identifying a road image from an image to which any obstacle image belongs in the second identification result;
based on the positions of the obstacle image and the road image in the image, a distance between an obstacle indicated by the obstacle image and a road indicated by the road image in the image is determined.
In one possible implementation manner, the identifying the road image from the images to which the obstacle image belongs includes:
acquiring point cloud data corresponding to the image, wherein the image and the point cloud data correspond to the same scene; acquiring points belonging to a road surface from the point cloud data based on the height of the points in the point cloud data; projecting the obtained points into the image to obtain a road image in the image; or,
and processing the image through a road segmentation model to obtain a road image in the image.
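The first road-identification option above (selecting road-surface points by height and projecting them into the image) might look like the following sketch; the camera-frame coordinates, the pinhole intrinsic matrix `K`, and the y-axis height convention are assumptions:

```python
import numpy as np

# Illustrative sketch: points whose height is close to an assumed ground level
# are taken as road surface, then projected into the image with a pinhole
# camera model to form a boolean road mask.

def road_mask_from_points(points, K, img_h, img_w, ground_y=0.0, tol=0.1):
    """points: (N, 3) array in camera coordinates; K: (3, 3) intrinsics."""
    # pick points whose height coordinate is within `tol` of the ground level
    road_pts = points[np.abs(points[:, 1] - ground_y) < tol]
    mask = np.zeros((img_h, img_w), dtype=bool)
    in_front = road_pts[road_pts[:, 2] > 0]          # only points ahead of camera
    if len(in_front) == 0:
        return mask
    uvw = (K @ in_front.T).T                          # project: rows [u*w, v*w, w]
    uv = (uvw[:, :2] / uvw[:, 2:3]).astype(int)       # normalize to pixel coords
    ok = (uv[:, 0] >= 0) & (uv[:, 0] < img_w) & (uv[:, 1] >= 0) & (uv[:, 1] < img_h)
    mask[uv[ok, 1], uv[ok, 0]] = True
    return mask
```

The second option in the claims would instead obtain the same mask from a learned road segmentation model applied to the image.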
In one possible implementation, the method further includes:
for any obstacle image in the second recognition result, acquiring point cloud data based on an image to which the obstacle image belongs, wherein the image and the point cloud data correspond to the same scene; mapping points in the point cloud data into the image, determining a depth map of the image based on the depth of the points in the point cloud data, and determining the image depth of the obstacle image in the image based on the depth map of the image; or,
for any obstacle image in the second recognition result, acquiring point cloud data based on an image to which the obstacle image belongs, wherein the image and the point cloud data correspond to the same scene; mapping points in the point cloud data into the image, and determining the image depth of the obstacle image based on the depth of the point cloud data corresponding to the obstacle image in the image; or,
And determining the image depth of any obstacle image in the second recognition result through a depth determination model.
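The second depth option above (taking the depth of the point cloud corresponding to the obstacle image) could be sketched as below; using the median of the projected point depths inside the bounding box is an assumption, since the claims do not fix the statistic:

```python
import numpy as np

# Sketch: lidar points are projected into the image, and the obstacle's image
# depth is taken as the median forward distance (z) of the points that land
# inside its bounding box.

def obstacle_depth(points, K, box):
    """points: (N, 3) in the camera frame; box: (x1, y1, x2, y2) pixel bounds."""
    pts = points[points[:, 2] > 0]                    # points in front of camera
    if len(pts) == 0:
        return None
    uvw = (K @ pts.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]
    x1, y1, x2, y2 = box
    inside = (uv[:, 0] >= x1) & (uv[:, 0] <= x2) & (uv[:, 1] >= y1) & (uv[:, 1] <= y2)
    if not inside.any():
        return None                                   # no lidar support for this box
    return float(np.median(pts[inside, 2]))           # depth = forward distance z
```

The first option in the claims densifies these projected depths into a full depth map before reading off the obstacle's depth, and the third replaces the point cloud entirely with a learned depth determination model.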
In one aspect, there is provided a method for training an obstacle recognition model, the method comprising:
obtaining a sample set, the sample set comprising a plurality of sample images;
in the N-th iteration process of the training process, identifying an unknown class of obstacle image in a sample set corresponding to the N-1 th iteration through an open set identification model corresponding to the N-1 th iteration, marking the identified obstacle image as the unknown class, wherein the open set identification model is used for identifying the unknown class of obstacle image based on the learned known class of obstacle image, and N is a positive integer greater than 1;
training a closed set recognition model based on the marked sample set to obtain a closed set recognition model corresponding to the nth iteration, wherein the closed set recognition model is used for recognizing known class obstacle images;
if the training process reaches a cycle termination condition, outputting a closed set recognition model corresponding to the nth iteration as an obstacle recognition model;
and if the training process does not reach the loop termination condition, identifying the plurality of sample images in the sample set through a closed set identification model corresponding to the nth iteration, marking the plurality of sample images in the sample set based on an identification result to obtain a sample set corresponding to the nth iteration, and training an open set identification model corresponding to the (N-1) th iteration based on the sample set corresponding to the nth iteration to obtain the open set identification model corresponding to the nth iteration.
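The alternating iteration described in the steps above can be summarized in a schematic loop; the model objects and the `train_closed`/`train_open`/`mine_unknown`/`relabel` callables are placeholders for the actual training, mining, and labeling procedures, not part of the patent:

```python
# Schematic of the alternating open-set / closed-set training loop: in each
# iteration the open-set model mines unknown-class obstacles to enrich the
# sample set, the closed-set model is retrained on it, and (unless the loop
# terminates) the closed-set model's output relabels the samples used to
# retrain the open-set model.

def train_obstacle_model(sample_set, num_iters,
                         train_closed, train_open, mine_unknown, relabel):
    """Returns the closed-set model of the final iteration as the output model."""
    open_model = train_open(sample_set)               # iteration-1 bootstrap
    samples = sample_set
    closed_model = None
    for n in range(2, num_iters + 1):
        samples = mine_unknown(open_model, samples)   # label unknown-class obstacles
        closed_model = train_closed(samples)          # closed-set model for iter n
        if n == num_iters:                            # loop termination condition
            break
        samples = relabel(closed_model, samples)      # re-annotate with closed model
        open_model = train_open(samples)              # open-set model for iter n
    return closed_model
```

The loop termination condition here is simply a fixed iteration count, matching the "target number of iterations" implementation mentioned later in the claims.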
In one possible implementation, the labeling the identified obstacle image as an unknown class includes:
screening out an obstacle image meeting an obstacle condition from the identified obstacle image, wherein the obstacle condition is a condition that an object influences the running of the automatic driving vehicle;
and labeling the screened obstacle image as the unknown class.
In one possible implementation manner, the screening, from the identified obstacle images, of an obstacle image meeting the obstacle condition includes at least one of the following:
the obstacle condition includes a distance threshold indicating that an object having a distance to the road less than the distance threshold is an obstacle affecting the travel of the autonomous vehicle; determining the distance between the obstacle and the road in the identified obstacle image, and screening the obstacle image with the distance smaller than a distance threshold value from the identified obstacle image;
the obstacle condition includes a depth threshold value indicating a maximum image depth in a driving environment image of an object affecting the driving of the autonomous vehicle; screening an obstacle image with an image depth greater than the depth threshold from the identified obstacle images based on the image depth of the identified obstacle images;
The obstacle condition includes a size threshold indicating a minimum image size in a driving environment image of an object affecting the driving of the autonomous vehicle; and screening the obstacle images with the image sizes larger than the size threshold value from the identified obstacle images based on the image sizes of the identified obstacle images.
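A minimal sketch of this screening step is given below. Each obstacle is represented as a dict with hypothetical field names, and the comparison directions follow the claim text verbatim:

```python
# Sketch of the obstacle-condition screening: an obstacle is kept only if it
# passes every threshold that is configured (distance to road below the
# distance threshold, image depth above the depth threshold, image size above
# the size threshold, exactly as the claims state them).

def screen_obstacles(obstacles, dist_thr=None, depth_thr=None, size_thr=None):
    kept = []
    for ob in obstacles:
        if dist_thr is not None and not ob["road_distance"] < dist_thr:
            continue   # too far from the road to affect driving
        if depth_thr is not None and not ob["image_depth"] > depth_thr:
            continue   # depth condition as stated in the claims
        if size_thr is not None and not ob["image_size"] > size_thr:
            continue   # too small in the image to matter
        kept.append(ob)
    return kept
```

With no thresholds configured the function is the identity, so each of the three conditions can be enabled independently, matching the "at least one of the following" phrasing.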
In one possible implementation, the determining the distance between the obstacle and the road in the identified obstacle image includes:
identifying a road image from a sample image to which any identified obstacle image belongs;
a distance between an obstacle indicated by the obstacle image in the sample image and a road indicated by the road image is determined based on the positions of the obstacle image and the road image in the sample image.
In one possible implementation manner, the identifying the road image from the sample image to which the obstacle image belongs includes:
acquiring point cloud data corresponding to the sample image, wherein the sample image and the point cloud data correspond to the same scene; acquiring points belonging to a road surface from the point cloud data based on the height of the points in the point cloud data; projecting the obtained points into the sample image to obtain a road image in the sample image; or,
And processing the sample image through a road segmentation model to obtain a road image in the sample image.
In one possible implementation, the method further includes:
for any identified obstacle image, acquiring point cloud data based on a sample image to which the obstacle image belongs, wherein the sample image and the point cloud data correspond to the same scene; mapping points in the point cloud data into the sample image, determining a depth map of the sample image based on the depth of the points in the point cloud data, and determining the image depth of the obstacle image in the sample image based on the depth map of the sample image; or,
for any identified obstacle image, acquiring point cloud data based on a sample image to which the obstacle image belongs, wherein the sample image and the point cloud data correspond to the same scene; mapping points in the point cloud data into the sample image, and determining the image depth of the obstacle image based on the depth of the point cloud data corresponding to the obstacle image in the sample image; or,
for any identified obstacle image, determining the image depth of the obstacle image through a depth determination model.
In one possible implementation, the method further includes:
in any iteration process, acquiring point cloud data corresponding to the sample image, wherein the sample image and the point cloud data correspond to the same scene;
clustering points, of which the distance between the points and the road does not exceed a distance threshold value, in the point cloud data to obtain a plurality of point clusters;
and mapping the plurality of point clusters into the sample image, and if an object image corresponding to any point cluster in the sample image is not an obstacle image identified by the open set identification model, determining the object image as an obstacle image and marking the object image as an unknown class.
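The point-cluster mining above could be sketched with a simple single-linkage grouping; a production system would more likely use an established clustering algorithm such as DBSCAN, and `eps` here is an assumed merging radius:

```python
import math

# Sketch of clustering near-road lidar points: points closer than `eps` are
# joined into one cluster via union-find (single linkage). The resulting
# clusters would then be mapped into the sample image and compared against the
# open-set model's detections to find missed obstacles.

def cluster_points(points, eps=0.5):
    """points: list of (x, y, z) tuples; returns a list of clusters."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) < eps:
                parent[find(i)] = find(j)   # merge the two clusters

    clusters = {}
    for i, p in enumerate(points):
        clusters.setdefault(find(i), []).append(p)
    return list(clusters.values())
```

Each cluster stands in for one candidate object; a cluster whose image projection is not covered by any open-set detection would be labeled as an unknown-class obstacle, as the step above describes.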
In one possible implementation manner, the labeling the plurality of sample images in the sample set based on the identification result, to obtain a sample set corresponding to the nth iteration, includes:
labeling the identified obstacle images in the sample set based on the category to which the identified obstacle images belong;
screening out obstacle images which meet a confidence condition and belong to the unknown class from the identified obstacle images;
and resetting the position of the screened obstacle image according to a target mode to obtain a sample set corresponding to the Nth iteration.
In one possible implementation manner, if the training process reaches a loop termination condition, outputting the closed-set recognition model corresponding to the nth iteration as an obstacle recognition model includes:
and if the iteration times of the training process reach the target times, outputting the closed set recognition model corresponding to the Nth iteration as the obstacle recognition model.
In one aspect, there is provided an obstacle identifying apparatus, comprising:
the acquisition module is used for acquiring an image of the target scene;
the first recognition module is used for recognizing the image through a closed set recognition model to obtain a first recognition result, wherein the closed set recognition model is used for recognizing the learned known type of obstacle image, and the first recognition result comprises the recognized obstacle image and the type of the obstacle image;
the second recognition module is used for recognizing the image through an open set recognition model to obtain a second recognition result, the open set recognition model is used for recognizing the unknown type of obstacle image based on the learned known type of obstacle image, and the second recognition result comprises the recognized obstacle image;
And the fusion module is used for fusing the first identification result and the second identification result to obtain a third identification result.
In one possible implementation, the fusing module includes:
the screening unit is used for screening out an obstacle image meeting an obstacle condition from the second identification result to obtain a fourth identification result, wherein the obstacle condition is a condition that an object influences the running of the automatic driving vehicle;
and the fusion unit is used for fusing the first identification result and the fourth identification result to obtain the third identification result.
In one possible implementation, the obstacle condition includes a distance threshold indicating that an object having a distance to the road less than the distance threshold is an obstacle affecting travel of the autonomous vehicle; the screening unit is used for determining the distance between the obstacle and the road in the identified obstacle image based on the second identification result, and screening the obstacle image with the distance smaller than a distance threshold value from the second identification result;
the obstacle condition includes a depth threshold value indicating a maximum image depth in a driving environment image of an object affecting the driving of the autonomous vehicle; the screening unit is used for determining the image depth of the identified obstacle image based on the second identification result, and screening the obstacle image with the image depth larger than the depth threshold value from the second identification result;
The obstacle condition includes a size threshold indicating a minimum image size in a driving environment image of an object affecting the driving of the autonomous vehicle; the screening unit is used for screening the obstacle image with the image size larger than the size threshold value from the second recognition result based on the image size of the obstacle image in the second recognition result.
In one possible implementation manner, the screening unit is configured to identify, for any obstacle image in the second identification result, a road image from images to which the obstacle image belongs; based on the positions of the obstacle image and the road image in the image, a distance between an obstacle indicated by the obstacle image and a road indicated by the road image in the image is determined.
In a possible implementation manner, the filtering unit is configured to obtain point cloud data corresponding to the image, where the image and the point cloud data correspond to the same scene; acquiring points belonging to a road surface from the point cloud data based on the height of the points in the point cloud data; projecting the obtained points into the image to obtain a road image in the image; or,
And the screening unit is used for processing the image through a road segmentation model to obtain a road image in the image.
In a possible implementation manner, the screening unit is configured to, for any obstacle image in the second recognition result, obtain point cloud data based on an image to which the obstacle image belongs, where the image and the point cloud data correspond to the same scene; mapping points in the point cloud data into the image, determining a depth map of the image based on the depth of the points in the point cloud data, and determining the image depth of the obstacle image in the image based on the depth map of the image; or,
the screening unit is configured to, for any obstacle image in the second recognition result, obtain point cloud data based on an image to which the obstacle image belongs, where the image and the point cloud data correspond to the same scene; mapping points in the point cloud data into the image, and determining the image depth of the obstacle image based on the depth of the point cloud data corresponding to the obstacle image in the image; or,
the screening unit is configured to determine, for any obstacle image in the second recognition result, an image depth of the obstacle image through a depth determination model.
In one possible implementation manner, the fusion module is configured to obtain point cloud data of the target scene; clustering points, of which the distance between the points and the road does not exceed a distance threshold value, in the point cloud data to obtain a plurality of point clusters; mapping the plurality of point clusters into the sample image to obtain object images corresponding to the plurality of point clusters respectively; determining a fifth recognition result based on the object images respectively corresponding to the plurality of point clusters and the second recognition result, wherein the fifth recognition result comprises object images which do not belong to the second recognition result in the object images respectively corresponding to the plurality of point clusters; and fusing the first identification result, the second identification result and the fifth identification result to obtain the third identification result.
In one aspect, there is provided an obstacle recognition model training apparatus, the apparatus comprising:
an acquisition module for acquiring a sample set comprising a plurality of sample images;
the labeling module is used for identifying unknown class obstacle images in a sample set corresponding to the N-1 th iteration through an open set identification model corresponding to the N-1 th iteration in the N-th iteration of the training process, labeling the identified obstacle images as the unknown class, wherein the open set identification model is used for identifying the unknown class obstacle images based on the learned known class obstacle images, and N is a positive integer greater than 1;
The training module is used for training the closed set recognition model based on the marked sample set to obtain a closed set recognition model corresponding to the nth iteration, wherein the closed set recognition model is used for recognizing the known class of obstacle images;
the output module is used for outputting the closed set recognition model corresponding to the nth iteration as an obstacle recognition model if the training process reaches a cycle termination condition;
the training module is further configured to identify, by using a closed set identification model corresponding to an nth iteration, the plurality of sample images in the sample set if the training process does not reach a loop termination condition, label the plurality of sample images in the sample set based on an identification result, obtain a sample set corresponding to the nth iteration, and train, by using a sample set corresponding to the nth iteration, an open set identification model corresponding to the nth-1 th iteration, so as to obtain an open set identification model corresponding to the nth iteration.
In one possible implementation, the labeling module includes:
a screening unit for screening out an obstacle image satisfying an obstacle condition, which is a condition that an object affects the travel of the autonomous vehicle, from the identified obstacle images;
And the labeling unit is used for labeling the screened obstacle images as the unknown categories.
In a possible implementation manner, the screening unit is configured to at least one of the following:
the obstacle condition includes a distance threshold indicating that an object having a distance to the road less than the distance threshold is an obstacle affecting the travel of the autonomous vehicle; determining the distance between the obstacle and the road in the identified obstacle image, and screening the obstacle image with the distance smaller than a distance threshold value from the identified obstacle image;
the obstacle condition includes a depth threshold value indicating a maximum image depth in a driving environment image of an object affecting the driving of the autonomous vehicle; screening an obstacle image with an image depth greater than the depth threshold from the identified obstacle images based on the image depth of the identified obstacle images;
the obstacle condition includes a size threshold indicating a minimum image size in a driving environment image of an object affecting the driving of the autonomous vehicle; and screening the obstacle images with the image sizes larger than the size threshold value from the identified obstacle images based on the image sizes of the identified obstacle images.
In one possible implementation manner, the screening unit is configured to identify, for any identified obstacle image, a road image from a sample image to which the obstacle image belongs; a distance between an obstacle indicated by the obstacle image in the sample image and a road indicated by the road image is determined based on the positions of the obstacle image and the road image in the sample image.
In a possible implementation manner, the screening unit is configured to obtain point cloud data corresponding to the sample image, where the sample image and the point cloud data correspond to the same scene; acquiring points belonging to a road surface from the point cloud data based on the height of the points in the point cloud data; projecting the obtained points into the sample image to obtain a road image in the sample image; or,
and the screening unit is used for processing the sample image through a road segmentation model to obtain a road image in the sample image.
In one possible implementation, the apparatus further includes:
the determining module is used for acquiring point cloud data for any identified obstacle image based on a sample image to which the obstacle image belongs, wherein the sample image and the point cloud data correspond to the same scene; mapping points in the point cloud data into the sample image, determining a depth map of the sample image based on the depth of the points in the point cloud data, and determining the image depth of the obstacle image in the sample image based on the depth map of the sample image; or,
The determining module is used for acquiring point cloud data for any identified obstacle image based on a sample image to which the obstacle image belongs, wherein the sample image and the point cloud data correspond to the same scene; mapping points in the point cloud data into the sample image, and determining the image depth of the obstacle image based on the depth of the point cloud data corresponding to the obstacle image in the sample image; or,
the determining module is used for determining the image depth of any identified obstacle image through a depth determining model.
In one possible implementation, the apparatus further includes:
the acquisition module is further configured to acquire point cloud data corresponding to the sample image in any iteration process, where the sample image and the point cloud data correspond to the same scene;
the clustering module is used for clustering the points, the distance between the points and the road of which does not exceed a distance threshold value, in the point cloud data to obtain a plurality of point clusters;
the labeling module is further configured to map the plurality of point clusters into the sample image, and if an object image corresponding to any point cluster in the sample image is not an obstacle image identified by the open set identification model, determine the object image as an obstacle image and label the object image as an unknown class.
In one possible implementation manner, the training module is configured to label, in the sample set, the identified obstacle image based on a category to which the identified obstacle image belongs; screen out obstacle images which meet a confidence condition and belong to the unknown class from the identified obstacle images; and reset the position of the screened obstacle image according to a target mode to obtain a sample set corresponding to the Nth iteration.
In one possible implementation manner, the output module is configured to output the closed-set recognition model corresponding to the nth iteration as the obstacle recognition model if the number of iterations of the training process reaches a target number.
In one aspect, a computer device is provided that includes one or more processors and one or more memories having stored therein at least one piece of program code that is loaded and executed by the one or more processors to implement operations performed by an obstacle recognition method, or to implement operations performed by an obstacle recognition model training method, as in any of the possible implementations described above.
In one aspect, a computer readable storage medium is provided having at least one program code stored therein, the at least one program code loaded and executed by a processor to implement operations performed by an obstacle recognition method, or to implement operations performed by an obstacle recognition model training method, as any one of the possible implementations described above.
In one aspect, there is provided a computer program or computer program product comprising: computer program code which, when executed by a computer, causes the computer to perform the operations performed by the obstacle recognition method of any one of the possible implementations described above, or to perform the operations performed by the obstacle recognition model training method of any one of the possible implementations described above.
In the obstacle recognition method provided by the embodiments of the present application, it is considered that the obstacle images recognized by the closed-set recognition model are not comprehensive enough, although the accuracy of its recognition results is high, while the obstacle images recognized by the open-set recognition model are comprehensive but the accuracy of its recognition results is low. Therefore, the embodiments of the present application recognize the same image through both the closed-set recognition model and the open-set recognition model, and fuse the two recognition results into a final recognition result, which improves the accuracy of the recognition result.
According to the obstacle recognition model training method, the open-set recognition model is used to mine new classes of obstacle images in the sample set, and the open-set recognition model and the closed-set recognition model are trained iteratively, so that both models can recognize more classes of obstacle images. This improves the obstacle recognition capability of the open-set recognition model and the closed-set recognition model, and the finally obtained closed-set recognition model is used as the obstacle recognition model, which improves the accuracy of the obstacle recognition model.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of an implementation environment provided by an embodiment of the present application;
Fig. 2 is a flowchart of an obstacle recognition model training method provided by an embodiment of the present application;
Fig. 3 is a flowchart of an obstacle recognition model training method provided by an embodiment of the present application;
Fig. 4 is a flowchart of an obstacle recognition model training method provided by an embodiment of the present application;
Fig. 5 is a flowchart of an obstacle recognition method provided by an embodiment of the present application;
Fig. 6 is a schematic structural diagram of an obstacle recognition model training apparatus provided by an embodiment of the present application;
Fig. 7 is a schematic structural diagram of another obstacle recognition model training apparatus provided by an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an obstacle recognition apparatus provided by an embodiment of the present application;
Fig. 9 is a schematic structural diagram of another obstacle recognition apparatus provided by an embodiment of the present application;
Fig. 10 is a schematic structural diagram of a terminal provided by an embodiment of the present application;
Fig. 11 is a schematic structural diagram of a server provided by an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
It will be understood that the terms "first", "second", and the like, as used herein, may be used to describe various concepts, but these concepts are not limited by these terms unless otherwise specified. These terms are only used to distinguish one concept from another. For example, a first sample image may be referred to as a second sample image, and similarly a second sample image may be referred to as a first sample image, without departing from the scope of the present application.
As used herein, the terms "at least one", "a plurality", "each", and "any" are used as follows: "at least one" includes one, two, or more; "a plurality" includes two or more; "each" refers to every one of a corresponding plurality; and "any" refers to any one of a plurality. For example, if a plurality of sample images includes 3 sample images, "each" refers to every one of the 3 sample images, and "any" refers to any one of the 3 sample images, that is, the first, the second, or the third.
It should be noted that, information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals referred to in this application are all authorized by the user or are fully authorized by the parties, and the collection, use, and processing of relevant data is required to comply with relevant laws and regulations and standards of relevant countries and regions. For example, the sample images, point cloud data, etc., referred to in this application are acquired with sufficient authorization. And the information and the data are processed and then used in big data application scenes, and can not be identified to any natural person or generate specific association with the natural person.
In some embodiments, the obstacle recognition model training method provided by the embodiments of the present application is performed by a computer device. In some embodiments, the computer device is a terminal, which may be any type of terminal, such as a mobile phone, a computer, a tablet computer, an autonomous vehicle, and the like. The autonomous vehicle includes vehicles traveling on the ground (e.g., automobiles, trucks, buses, etc.), vehicles traveling in the air (e.g., drones, airplanes, helicopters, etc.), and vehicles traveling on or in water (e.g., ships, submarines, etc.). The autonomous vehicle may or may not accommodate one or more passengers. In addition, the autonomous vehicle can be applied to the unmanned delivery field, such as express logistics and takeout meal delivery.
In other embodiments, the computer device is a server, which is a server, or a server cluster comprising a plurality of servers, or a cloud computing service center. In other embodiments, the computer device includes an autonomous vehicle and a server.
It should be noted that, in the embodiment of the present application, the execution subject of the training method of the obstacle recognition model is not limited.
Fig. 1 is a schematic diagram of an implementation environment provided in an embodiment of the present application, and as shown in fig. 1, the implementation environment includes an autonomous vehicle 101 and a server 102, where the autonomous vehicle 101 and the server 102 are connected through a wireless or wired network.
In some embodiments, the server 102 is a server that provides services to the autonomous vehicle 101. Optionally, the server 102 provides an electronic map for the autonomous vehicle 101, and the server 102 is used for updating an obstacle recognition model of the autonomous vehicle, etc., and the effect of the server 102 is not limited in the embodiment of the present application.
In some embodiments, the server 102 is configured to train the obstacle recognition model and deploy the trained obstacle recognition model into the autonomous vehicle 101.
Fig. 2 is a flowchart of an obstacle recognition model training method according to an embodiment of the present application. The embodiment of the present application is exemplified by taking an execution subject as a computer device, and the embodiment includes:
201. the computer device obtains a sample set comprising a plurality of sample images.
The sample image may be any image. Optionally, the sample image is an image of a photographed road. In some embodiments, the sample image is an image captured by an autonomous vehicle while driving, an image captured by a camera disposed on a road, or an image obtained by other means.
202. In the Nth iteration of the training process, the computer device identifies unknown-class obstacle images in the sample set corresponding to the (N-1)th iteration through the open-set recognition model corresponding to the (N-1)th iteration, and labels the identified obstacle images as the unknown class. The open-set recognition model is used to identify unknown-class obstacle images based on the known-class obstacle images it has learned, and N is a positive integer greater than 1.
In the embodiment of the application, the open set recognition model corresponding to the N-1 iteration is obtained through training of the sample set corresponding to the N-1 iteration, so that the known type of obstacle image learned by the open set recognition model corresponding to the N-1 iteration is the obstacle image marked in the sample set corresponding to the N-1 iteration. The sample set corresponding to the N-1 iteration is identified through the open set identification model corresponding to the N-1 iteration, so that not only can the obstacle images marked in the sample set corresponding to the N-1 iteration be identified, but also the unlabeled obstacle images in the sample set corresponding to the N-1 iteration can be identified, and the categories of the obstacle images are determined to be unknown.
In some embodiments, when the open-set recognition model recognizes an image, it first detects an object image from the image and determines a confidence of the object image, where the confidence represents the probability that the object in the object image is an obstacle. If the confidence of the object image is larger than a first threshold, the object image is classified among the known categories, and the known category that best matches the object image is selected as the category corresponding to the object image. If the confidence of the object image is smaller than the first threshold and larger than a second threshold, the object image is determined to be an obstacle image, and the category corresponding to the object image is the unknown category. The first threshold is greater than the second threshold; the specific values of the two thresholds are not limited in the embodiments of the present application. Optionally, the first threshold is 0.8 and the second threshold is 0.3.
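The two-threshold decision rule described above can be sketched as follows. The threshold values 0.8 and 0.3 follow the optional values given in the text; the function name and its inputs are illustrative, not an API defined by the application.

```python
FIRST_THRESHOLD = 0.8   # above this: classify into a known category
SECOND_THRESHOLD = 0.3  # between the two thresholds: unknown-class obstacle

def classify_detection(confidence, known_class_scores):
    """Return the label for one detected object image.

    confidence: probability that the object in the object image is an obstacle.
    known_class_scores: dict mapping known category name -> match score.
    """
    if confidence > FIRST_THRESHOLD:
        # Select the best-matching known category.
        return max(known_class_scores, key=known_class_scores.get)
    if confidence > SECOND_THRESHOLD:
        # Obstacle image of an unknown category.
        return "unknown"
    return None  # below both thresholds: not treated as an obstacle

# classify_detection(0.9, {"car": 0.7, "pedestrian": 0.2}) -> "car"
# classify_detection(0.5, {"car": 0.4})                    -> "unknown"
```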
If a certain sample set is used to train the open-set recognition model, the confidence of the obstacle images labeled in that sample set is increased in the open-set recognition model, even close to 1. Correspondingly, the confidence of other object images similar to these obstacle images also increases, so that the open-set recognition model judges these similar object images as unknown-class obstacle images.
For example, using a sample set labeled with three types of obstacles, such as an automobile, a pedestrian and a bicycle, the open set recognition model is trained, and the trained open set recognition model can recognize not only the three types of obstacles, such as the automobile, the pedestrian and the bicycle, but also other obstacles similar to the three types of obstacles, such as the automobile, the pedestrian and the bicycle, such as a tricycle, and determine the recognized other obstacles as unknown types.
203. The computer device trains a closed-set recognition model based on the marked sample set to obtain a closed-set recognition model corresponding to the nth iteration, wherein the closed-set recognition model is used for recognizing the obstacle images of known categories.
The closed set identification model can only identify known classes of obstacle images. For example, the closed set recognition model is trained using a sample set labeled with three types of obstacles, namely an automobile, a pedestrian and a bicycle, and the trained closed set recognition model can only recognize the three types of obstacles, namely the automobile, the pedestrian and the bicycle.
Since the open set recognition model labels the obstacle image of the recognized unknown class as the unknown class, the "unknown class" can be regarded as the "known class". Based on the marked sample set, training the closed set recognition model, so that the closed set recognition model can learn the known type of obstacle image and the unknown type of obstacle image marked in the sample set.
204. And if the training process reaches the cycle termination condition, outputting the closed set recognition model corresponding to the nth iteration as an obstacle recognition model.
The training process in the embodiment of the application is a process of multiple iterations, and training is stopped when the training process reaches the loop termination condition. Optionally, the loop termination condition is that the number of iterations reaches a target number.
Through the N iterations, the open-set recognition model continuously mines new obstacle images in the sample set, so that the closed-set recognition model corresponding to the Nth iteration can recognize obstacle images of much richer categories, which greatly improves its obstacle recognition capability. The closed-set recognition model corresponding to the Nth iteration can therefore be output as the obstacle recognition model and subsequently put into use.
205. If the training process does not reach the loop termination condition, the computer device identifies the plurality of sample images in the sample set based on the closed-set recognition model corresponding to the Nth iteration, and labels the plurality of sample images in the sample set based on the recognition result, to obtain the sample set corresponding to the Nth iteration. The computer device then trains the open-set recognition model corresponding to the (N-1)th iteration based on the sample set corresponding to the Nth iteration, to obtain the open-set recognition model corresponding to the Nth iteration.
If the training process does not reach the cycle termination condition, then training needs to be continued. In order to enable the open set recognition model to continue to mine out new classes of obstacles from the sample set, it is also necessary to retrain the open set recognition model.
Taking the case where the open-set recognition model corresponding to the (N-1)th iteration mines the tricycle as an obstacle as an example, that model is likely unable to recognize all tricycle images in the sample set, so the accuracy of the "labeled sample set" in step 203 above is low. To obtain a more accurate sample set, the embodiments of the present application identify the plurality of sample images in the sample set based on the closed-set recognition model corresponding to the Nth iteration, and label the plurality of sample images based on the recognition result to obtain the sample set corresponding to the Nth iteration. The open-set recognition model corresponding to the (N-1)th iteration is then trained based on the sample set corresponding to the Nth iteration, to obtain the open-set recognition model corresponding to the Nth iteration.
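The iterative procedure of steps 202-205 can be outlined as below. The model objects and their `label_unknown`/`train`/`relabel` methods are assumed interfaces introduced only for illustration; the application does not define such an API.

```python
def train_obstacle_model(sample_set, open_model, closed_model, target_iters):
    """Alternate open-set mining and closed-set training until the
    loop termination condition (a target iteration count) is reached."""
    for n in range(1, target_iters + 1):
        # Open-set model mines unknown-class obstacle images in the
        # sample set and labels them as the unknown class.
        labeled = open_model.label_unknown(sample_set)
        # Closed-set model is trained on the labeled sample set.
        closed_model.train(labeled)
        if n == target_iters:
            # Loop termination: output as the obstacle recognition model.
            return closed_model
        # Otherwise re-label the sample set with the new closed-set
        # model and retrain the open-set model on it (step 205).
        sample_set = closed_model.relabel(sample_set)
        open_model.train(sample_set)
```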
According to the obstacle recognition model training method, the open-set recognition model is used to mine new classes of obstacle images in the sample set, and the open-set recognition model and the closed-set recognition model are trained iteratively, so that both models can recognize more classes of obstacle images. This improves the obstacle recognition capability of the open-set recognition model and the closed-set recognition model, and the finally obtained closed-set recognition model is used as the obstacle recognition model, which improves the accuracy of the obstacle recognition model.
Fig. 3 is a flowchart of an obstacle recognition model training method according to an embodiment of the present application. The embodiment of the present application is exemplified by taking an execution subject as a computer device, and the embodiment includes:
301. the computer device obtains a sample set comprising a plurality of sample images.
The sample set may be a labeled sample set or an unlabeled sample set, which is not limited in this embodiment of the present application.
302. The computer device identifies a plurality of sample images in the sample set through a closed set identification model during a 1 st iteration of the training process.
The closed-set recognition model in step 302 is a trained closed-set recognition model capable of recognizing M classes of obstacles, where M is any positive integer. The closed-set recognition model may be trained from a sample set labeled with the M classes of obstacles, and may be, for example, a cascade model or an R-CNN (Region-based Convolutional Neural Network) model.
By the closed set recognition model, a plurality of sample images in a sample set are recognized, so that not only M types of obstacles in the sample images can be recognized, but also the types of the obstacles can be determined.
303. The computer device identifies unknown-class obstacle images in the sample set through the open-set recognition model.
The open-set recognition model in step 303 is a trained open-set recognition model, trained from a sample set labeled with the M classes of obstacles. Therefore, relying on the M known classes, the open-set recognition model can mine one unknown class of obstacles in the sample set.
The open-set recognition model may be an OpenDet model, which consists of two learners: one learner learns to extract more accurate image features, and the other learns more accurate first and second thresholds. In some embodiments, the OpenDet model may be trained with stochastic gradient descent, using an initial learning rate of 0.001, a momentum of 0.9, and a weight decay coefficient of 0.00001; the learning rate is divided by 10 at the 30,000th and the 50,000th training iterations, for a total of 60,000 iterations.
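The learning-rate schedule just described (initial rate 0.001, divided by 10 at the 30,000th and 50,000th of 60,000 iterations) can be written as a simple step function; the function name is illustrative.

```python
def learning_rate(step, base_lr=0.001, drops=(30_000, 50_000)):
    """Step schedule: divide the learning rate by 10 at each drop point."""
    lr = base_lr
    for drop_step in drops:
        if step >= drop_step:
            lr /= 10
    return lr

# learning_rate(0)      -> 0.001
# learning_rate(30_000) -> 0.0001
# learning_rate(59_999) -> 0.00001
```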
304. The computer device screens out, from the obstacle images identified by the open-set identification model, the obstacle images meeting an obstacle condition, where the obstacle condition describes an object that affects the driving of the autonomous vehicle.
The "obstacle image recognized by the open set recognition model" in the above step 304 is only an obstacle image of an unknown class recognized by the open set recognition model, which can recognize both an obstacle image of a known class and an obstacle image of an unknown class.
As can be seen from the description in step 202, the open set recognition model determines, as an unknown obstacle image, an object image having a confidence level smaller than the first threshold value and larger than the second threshold value, and the determined obstacle image is not accurate enough, and the determined obstacle image may include a non-obstacle image. Therefore, after the unknown class obstacle images in the sample set are identified through the open set identification model, the identified obstacle images are screened to eliminate unreliable detection results of the unknown class obstacle images.
In one possible implementation manner, the computer device screens out the obstacle image meeting the obstacle condition from the obstacle images identified by the open set identification model, and the method comprises at least one of the following steps:
(1) The obstacle condition includes a distance threshold indicating that an object having a distance to the road less than the distance threshold is an obstacle affecting the travel of the autonomous vehicle; and determining the distance between the obstacle and the road in the identified obstacle images, and screening the obstacle images with the distance smaller than a distance threshold value from the identified obstacle images.
It should be noted that, an object on the road surface or an object closer to the road surface may affect the running of the autonomous vehicle and may be regarded as an obstacle, while an object farther from the road surface is an object higher in the air and does not affect the running of the autonomous vehicle, so that an object farther from the road surface is not an obstacle. The computer device will screen out the obstacle image of the obstacle located on the road or nearer to the road from the identified obstacle images.
The distance threshold may be any distance, and the distance threshold may be a tested value or a value set by a technician.
The road in the sample image needs to be identified before the distance of the obstacle from the road in the identified obstacle image is determined. In some embodiments, determining the distance of the obstacle from the road in the identified obstacle image comprises: identifying a road image from a sample image to which any identified obstacle image belongs; based on the position of the obstacle image and the road image in the sample image, a distance between an obstacle indicated by the obstacle image in the sample image and a road indicated by the road image is determined.
In one possible implementation, the sample set further includes point cloud data corresponding to the sample image, and the computer device may determine the road image in the sample image based on the point cloud data. The computer device identifies a road image from a sample image to which the obstacle image belongs, including: acquiring point cloud data corresponding to the sample image, wherein the sample image and the point cloud data correspond to the same scene; acquiring points belonging to the road surface from the point cloud data based on the height of the points in the point cloud data; and projecting the acquired points into the sample image to obtain a road image in the sample image.
The road surface can be regarded as a plane, and the computer equipment can cluster the point clouds on the road surface based on a clustering algorithm to obtain a point cloud road segmentation result, namely, obtain points belonging to the road surface in the point cloud data; and projecting points belonging to the road surface into the sample image, and determining concave polygon formed by the projection points to obtain the road image in the sample image.
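A minimal sketch of the height-based road-point selection and projection described above, assuming the point cloud is an (N, 3) array in a frame where z is the height above the road plane, and a 3x4 projection matrix P maps 3D points to image pixels. The height threshold and function names are illustrative, and the concave-polygon construction over the projected points is omitted.

```python
import numpy as np

def road_points(points, max_height=0.2):
    """Keep the points whose height marks them as road surface."""
    return points[points[:, 2] < max_height]

def project_to_image(points, P):
    """Project 3D points into the sample image with a 3x4 matrix P."""
    homo = np.hstack([points, np.ones((len(points), 1))])
    uvw = homo @ P.T
    return uvw[:, :2] / uvw[:, 2:3]  # pixel coordinates (u, v)
```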
In another possible implementation, the computer device identifies the road image in the sample image by a road segmentation model. The computer device identifies a road image from a sample image to which the obstacle image belongs, including: and processing the sample image through the road segmentation model to obtain a road image in the sample image.
In some embodiments, the sample set includes point cloud data corresponding to a portion of the sample image, as shown in fig. 4, the computer device may perform road segmentation on the portion of the sample image based on the point cloud data corresponding to the portion of the sample image, to obtain a road image in the portion of the sample image, train a road segmentation model based on the road image in the portion of the sample image, and then directly process the sample image using the road segmentation model to obtain the road image in the sample image.
(2) The obstacle condition includes a depth threshold, the depth threshold indicating the maximum image depth, in a driving environment image, of an object that affects the driving of the autonomous vehicle; based on the image depths of the identified obstacle images, the obstacle images whose image depth does not exceed the depth threshold are screened out from the identified obstacle images.
It should be noted that the depth of an object in the sample image represents the distance between the object and the autonomous vehicle: the greater the depth of the object, the farther the object is from the autonomous vehicle and the less it affects the driving of the autonomous vehicle; the smaller the depth, the closer the object is and the greater its impact on the driving of the autonomous vehicle. Moreover, small distant objects are generally difficult to detect, and determining unknown obstacles at a distance introduces more errors, so in the embodiments of the present application the small distant objects are filtered out.
The depth threshold may be any value, alternatively, the depth threshold is a tested value; optionally, the depth threshold is a value set by a technician, and the embodiment of the present application does not limit the depth threshold.
In some embodiments, to obtain the depth of the obstacle image, the computer device determines the depth of the obstacle image based on the point cloud data. Optionally, the method further comprises: for any identified obstacle image, acquiring point cloud data based on a sample image to which the obstacle image belongs, wherein the sample image and the point cloud data correspond to the same scene; mapping points in the point cloud data into a sample image, determining a depth map of the sample image based on the depth of the points in the point cloud data, and determining the image depth of the obstacle image in the sample image based on the depth map of the sample image.
Wherein, after the computer device maps the points in the point cloud data to the sample image, a conventional depth-complement method (e.g., IP-Basic method) may be used to obtain a depth map of the sample image, where the depth map is a pixel-level depth map.
In other embodiments, the depth map of the sample image is not required to be determined, and the image depth of the obstacle image is determined directly based on the depth of the point cloud data corresponding to the obstacle image. The method further comprises the steps of: for any identified obstacle image, acquiring point cloud data based on a sample image to which the obstacle image belongs, wherein the sample image and the point cloud data correspond to the same scene; and mapping the point cloud in the point cloud data into a sample image, and determining the image depth of the obstacle image based on the depth of the point cloud data corresponding to the obstacle image in the sample image.
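The variant above, which derives the image depth of an obstacle image directly from the point cloud points projected inside it, can be sketched as below. Aggregating the point depths with a median is an assumption; the application does not fix the aggregation rule.

```python
import numpy as np

def obstacle_depth(pixels, depths, box):
    """Image depth of one obstacle image from supporting point cloud points.

    pixels: (N, 2) projected pixel positions of the point cloud points.
    depths: (N,) depth of each point.
    box: (u_min, v_min, u_max, v_max) region of the obstacle image.
    """
    u_min, v_min, u_max, v_max = box
    inside = ((pixels[:, 0] >= u_min) & (pixels[:, 0] <= u_max) &
              (pixels[:, 1] >= v_min) & (pixels[:, 1] <= v_max))
    if not inside.any():
        return None  # no supporting points: depth unknown
    return float(np.median(depths[inside]))
```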
In other embodiments, the computer device may also determine an image depth of the obstacle image based on the depth determination model. The method further comprises the steps of: for any identified obstacle image, determining the image depth of the obstacle image by a depth determination model.
Optionally, the sample set includes point cloud data corresponding to a portion of the sample image, as shown in fig. 4, the computer device may determine a depth map of the portion of the sample image based on the point cloud data corresponding to the portion of the sample image, train a depth determination model based on the depth map of the portion of the sample image, and then determine an image depth of the obstacle image directly based on the depth determination model.
(3) The obstacle condition includes a size threshold indicating a minimum image size in a running environment image of an object affecting the running of the autonomous vehicle; and screening the obstacle images with the image size larger than the size threshold value from the identified obstacle images based on the image sizes of the identified obstacle images.
It should be noted that if the image size of an obstacle image is small, the obstacle itself may be small, or the obstacle may be far from the autonomous vehicle; in either case, the obstacle hardly affects the driving of the autonomous vehicle. Therefore, obstacle images with a small image size can be filtered out without affecting the driving of the autonomous vehicle.
Wherein the size threshold may be any value, alternatively the size threshold is a tested value; optionally, the size threshold is a value set by a technician, and the embodiment of the present application does not limit the size threshold.
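The three screening conditions (1)-(3) can be combined into one filter, as sketched below: distance to the road below the distance threshold, image depth not exceeding the depth threshold, and image size above the size threshold. The record fields and the threshold values are illustrative assumptions, since the text leaves all three thresholds open.

```python
def satisfies_obstacle_condition(det,
                                 dist_thresh=5.0,
                                 depth_thresh=60.0,
                                 size_thresh=400.0):
    """det: dict holding the three measurements for one obstacle image."""
    return (det["road_distance"] < dist_thresh      # condition (1)
            and det["image_depth"] <= depth_thresh  # condition (2)
            and det["image_size"] > size_thresh)    # condition (3)

def screen(detections, **thresholds):
    """Keep only the obstacle images meeting the obstacle condition."""
    return [d for d in detections
            if satisfies_obstacle_condition(d, **thresholds)]
```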
305. The computer device annotates the sample set based on the recognition result of the closed set recognition model and the obstacle image satisfying the obstacle condition.
The recognition result of the closed set recognition model is M types of known obstacle detection results with labeling information, and the obstacle image meeting the obstacle condition is 1 type of unknown obstacle detection results. The computer equipment marks the sample set based on the recognition result of the closed set recognition model and the obstacle image meeting the obstacle condition, and the sample set marked with the M+1 type obstacle can be obtained.
In some embodiments, the sample set further includes point cloud data corresponding to a portion of the sample image, and in order to improve recall rate of the unknown class of obstacle, the embodiments of the present application may further supplement the unknown class of obstacle based on the point cloud data. As shown in fig. 4, the method further includes: acquiring point cloud data corresponding to a sample image, wherein the sample image and the point cloud data correspond to the same scene; clustering points, the distance between the points and the road of which does not exceed a distance threshold value, in the point cloud data to obtain a plurality of point clusters; and mapping the plurality of point clusters into a sample image, and if the object image corresponding to any point cluster in the sample image is not an obstacle image identified by the open set identification model, determining the object image as the obstacle image and marking the object image as an unknown type.
The obstacle images identified by the open set identification model include unknown class obstacle images and M class obstacle images of known classes. The object image corresponding to any point cluster in the sample image is not an obstacle image identified by the open set identification model, which means that: the object image corresponding to the point cluster in the sample image is neither an unknown class of obstacle image identified by the open set identification model nor an M class of obstacle image of known class identified by the open set identification model.
After the points, the distance between which and the road does not exceed the distance threshold value, in the point cloud data are clustered, the clusters of the clustered points can be framed by an obstacle preselection frame, the preselection frames are compared with preselection frames obtained by an open set recognition model, and overlapping preselection frames are filtered.
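The overlap filtering just described — framing each point cluster with a preselection box and dropping boxes that overlap one already produced by the open-set recognition model — can be sketched with an intersection-over-union test. The IoU criterion and its 0.5 cutoff are assumptions; the text only says overlapping preselection frames are filtered.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)

    def area(r):
        return (r[2] - r[0]) * (r[3] - r[1])

    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def new_unknown_boxes(cluster_boxes, open_set_boxes, iou_cutoff=0.5):
    """Keep cluster boxes not already covered by an open-set detection."""
    return [c for c in cluster_boxes
            if all(iou(c, o) < iou_cutoff for o in open_set_boxes)]
```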
306. The computer equipment trains the closed set recognition model based on the marked sample set to obtain a closed set recognition model corresponding to the 1 st iteration.
The marked sample set comprises M types of known obstacles and 1 type of unknown obstacles, a closed set recognition model is trained based on the marked sample set, and the obtained closed set recognition model corresponding to the 1 st iteration is a model for recognizing the M+1 types of known obstacles.
307. The computer equipment identifies a plurality of sample images in the sample set through the closed set identification model corresponding to the 1st iteration, marks the plurality of sample images in the sample set based on the identification result to obtain a sample set corresponding to the 1st iteration, and trains the open set identification model based on the sample set corresponding to the 1st iteration to obtain an open set identification model corresponding to the 1st iteration.
Although the open set recognition model mines 1 class of unknown obstacles, obstacles of the same class may remain unmined in the sample set. For example, the open set recognition model may mine tricycles as unknown-class obstacles while the sample set still contains some unlabeled tricycle images. To improve the accuracy of the sample set, in the embodiment of the present application the sample set is marked by the closed set recognition model corresponding to the 1st iteration. Since the closed set recognition model corresponding to the 1st iteration can recognize the M+1 classes of known obstacles, it can relatively accurately recognize the M+1 classes of known obstacles in the sample set.
In some embodiments, the training process of the closed set recognition model typically suffers from class imbalance, and the open set recognition model suffers from detection errors; in order to alleviate these problems, the computer device may also resample the unknown-class obstacles. That is, unknown-class obstacle images with high confidence are selected from the unknown-class obstacles mined in the sample set, the positions of these obstacle images are reset according to methods such as random placement, linear placement, or dense placement along a radius, and the labeling information of these obstacle images is used as training supervision information.
In one possible implementation manner, the computer device marks a plurality of sample images in the sample set based on the identification result to obtain a sample set corresponding to the 1st iteration, including: labeling the identified obstacle images in the sample set based on the category to which the identified obstacle images belong; screening out obstacle images which meet a confidence condition and belong to the unknown category from the identified obstacle images; and resetting the positions of the screened obstacle images according to a target mode to obtain the sample set corresponding to the 1st iteration.
For example, based on the confidence of the obstacle images, obstacle images far from the class center are removed by clustering and filtering the clustering result. Targets are then sampled uniformly within each cluster according to the appearance probability of the samples and placed using a Poisson fusion scheme, achieving a visually natural augmentation. Training the model with this sample set ensures both the accuracy and the recall rate of unknown-obstacle prediction.
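A highly simplified sketch of the resampling idea follows, assuming detections are dictionaries with hypothetical `label` and `score` keys; the class-center clustering and Poisson fusion steps described above are omitted, and the 0.8 confidence threshold is an arbitrary assumption.

```python
import random

def select_reliable_unknowns(detections, conf_threshold=0.8):
    """Keep unknown-class detections whose confidence passes the threshold."""
    return [d for d in detections
            if d["label"] == "unknown" and d["score"] >= conf_threshold]

def random_placements(n, img_w, img_h, obj_w, obj_h, seed=0):
    """Sample top-left paste positions for the selected obstacle crops; a real
    pipeline would blend each crop in (e.g. Poisson fusion) rather than paste it."""
    rng = random.Random(seed)
    return [(rng.randint(0, img_w - obj_w), rng.randint(0, img_h - obj_h))
            for _ in range(n)]
```

The repositioned crops, together with their labels, then serve as the training supervision information mentioned above.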
308. In the Nth iteration of the training process, the computer equipment identifies unknown-class obstacle images in the sample set corresponding to the (N-1)th iteration through the open set identification model corresponding to the (N-1)th iteration, and marks the identified obstacle images as the unknown class, where N is a positive integer greater than 1.
Step 308 may be the 2nd iteration, the 3rd iteration, and so on, up to the last iteration of the training process.
In one possible implementation, the computer device labeling the identified obstacle image as an unknown class includes: the computer equipment screens out an obstacle image meeting an obstacle condition from the identified obstacle images, wherein the obstacle condition is a condition that an object influences the running of the automatic driving vehicle; and labeling the screened obstacle images as unknown categories.
The step 308 is similar to the steps 303 and 304, and will not be described in detail herein.
309. The computer equipment trains the closed set recognition model based on the marked sample set to obtain a closed set recognition model corresponding to the Nth iteration.
The step 309 is similar to the step 306, and will not be described in detail herein.
310. If the iteration number of the training process reaches the target number, the computer equipment outputs the closed set recognition model corresponding to the Nth iteration as the obstacle recognition model.
The target number of times may be any number, for example, 5, 10, 20, 100, or the like. The embodiments of the present application do not limit the target number of times.
311. If the iteration number of the training process does not reach the target number, the computer equipment identifies a plurality of sample images in the sample set through the closed set identification model corresponding to the Nth iteration, marks the plurality of sample images in the sample set based on the identification result to obtain a sample set corresponding to the Nth iteration, and trains the open set identification model corresponding to the (N-1)th iteration based on the sample set corresponding to the Nth iteration to obtain the open set identification model corresponding to the Nth iteration.
The step 311 is the same as the step 307, and will not be described in detail here.
The embodiment of the present application loops through steps 308, 309, and 311 until the number of iterations of the training process reaches the target number, and then executes step 310 to end.
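The control flow of steps 308-311 can be sketched as the following loop; the four callables are hypothetical placeholders for the mining, training, and relabeling procedures described above, not APIs from the patent.

```python
def alternate_training(sample_set, open_model, closed_model, target_iters,
                       mine_unknowns, train_closed, relabel, train_open):
    """Alternate open-set mining and closed-set training until the target
    number of iterations is reached (steps 308-311)."""
    for n in range(1, target_iters + 1):
        sample_set = mine_unknowns(open_model, sample_set)     # step 308
        closed_model = train_closed(closed_model, sample_set)  # step 309
        if n == target_iters:                                  # step 310
            return closed_model
        sample_set = relabel(closed_model, sample_set)         # step 311
        open_model = train_open(open_model, sample_set)
    return closed_model
```

The returned closed set model is the obstacle recognition model output in step 310.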
According to the method and the device, reliable unknown-class obstacle detection results are obtained through the open set recognition model, and the unknown-class obstacles are marked as the unknown class and then fused with the original M classes of known obstacles to train a closed set recognition model for M+1 classes of known obstacles, so that the closed set recognition model can detect more unknown-class obstacles while the detection performance on unknown-class obstacles is guaranteed. The open set recognition model is then trained based on the detection results of the trained closed set recognition model for known obstacles, so that the open set recognition model can mine new unknown-class obstacles. By executing these steps in cyclic iteration, a powerful closed set recognition model for M+1 classes of known obstacles and a powerful open set recognition model for unknown obstacles can be obtained.
According to the barrier identification model training method, the open set identification model is used for mining new types of barrier images in the sample set, and the open set identification model and the closed set identification model are used for iteratively training, so that the open set identification model and the closed set identification model can identify more types of barrier images, the barrier identification capacity of the open set identification model and the closed set identification model is improved, the finally obtained closed set identification model is used as a barrier identification model, and the accuracy of the barrier identification model is improved.
It should be noted that, in ECCV 2022 SSLAD Track 3: Corner Case Detection (an international competition on detecting obstacles using large-scale unlabeled data and small-scale labeled data in autonomous driving scenarios), first place was obtained based on the method provided by the embodiments of the present application, with an overall index far higher than that of the second place.
It should be noted that, in the obstacle recognition model training method provided in the embodiments of the present application, the open set recognition model is used to mine new classes of obstacle images in the sample set, the open set recognition model and the closed set recognition model are trained iteratively so that more classes of obstacle images are mined from the sample set, and the closed set recognition model is trained on the sample set containing these mined classes to obtain the obstacle recognition model. An accurate recognition result can also be obtained by combining the open set recognition model and the closed set recognition model. Therefore, in practical application, either the obstacle recognition model trained by the method shown in fig. 3 may be used for recognition, or the open set recognition model and the closed set recognition model may be used together for recognition. The embodiment shown in fig. 5 illustrates the latter, namely obstacle recognition by the open set recognition model and the closed set recognition model.
Fig. 5 is a flowchart of a method for identifying an obstacle according to an embodiment of the present application, where an execution subject is an autonomous vehicle, and referring to fig. 5, the embodiment includes:
501. an autonomous vehicle acquires an image of a target scene.
The image of the target scene may be any image captured while the autonomous vehicle is running. It should be noted that the embodiment of the present application merely takes the autonomous vehicle as an example of the execution subject; when the execution subject is another device, the target scene may also be another scene. The embodiment of the present application does not limit the target scene, which may be any scene.
502. The automatic driving vehicle recognizes the image through a closed set recognition model to obtain a first recognition result, wherein the closed set recognition model is used for recognizing the learned known obstacle image, and the first recognition result comprises the recognized obstacle image and the category of the obstacle image.
Step 502 is the same as step 302, and will not be described in detail here.
503. The automatic driving vehicle recognizes the image through an open set recognition model to obtain a second recognition result, wherein the open set recognition model is used for recognizing the unknown type of obstacle image based on the learned known type of obstacle image, and the second recognition result comprises the recognized obstacle image.
Step 503 is the same as step 303, and will not be described in detail here.
504. And the automatic driving vehicle fuses the first recognition result and the second recognition result to obtain a third recognition result.
The obstacle image identified by the closed set identification model is not comprehensive enough, but the accuracy of the identification result is higher; the obstacle images identified by the open set identification model are comprehensive, but the accuracy of the identification result is low. Therefore, the embodiment of the application identifies the same image through the closed set identification model and the open set identification model, fuses the two identification results, and improves the accuracy of the identification results as the final identification result.
In one possible implementation manner, the automatic driving vehicle fuses the first recognition result and the second recognition result to obtain the third recognition result, including: the automatic driving vehicle determines, in the second recognition result, the obstacle images that are the same as obstacle images in the first recognition result, and obtains the confidence of the determined obstacle images; screens obstacle images from the second recognition result based on the confidence; and fuses the screened obstacle images with the first recognition result to obtain the third recognition result.
Wherein, the screening of obstacle images from the second recognition result based on the confidence may be: screening, from the second recognition result, obstacle images whose confidence is not less than the determined confidence; or screening obstacle images whose confidence is not less than a confidence threshold, where the confidence threshold is determined based on the determined confidence and may be less than it.
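A sketch of this screening, under loud assumptions: detections are dictionaries with hypothetical `box` and `score` keys, "the same obstacle image" is approximated by exact box equality, and `margin` stands in for how far the threshold may sit below the reference confidence.

```python
def screen_by_confidence(first_result, second_result, margin=0.0):
    """Filter open-set (second) detections using the confidences of detections
    they share with the closed-set (first) result."""
    first_boxes = {tuple(d["box"]) for d in first_result}
    shared = [d["score"] for d in second_result if tuple(d["box"]) in first_boxes]
    if not shared:
        return list(second_result)
    threshold = min(shared) - margin  # margin > 0 lowers the threshold
    return [d for d in second_result if d["score"] >= threshold]
```

The surviving detections are then merged with the first recognition result to form the third recognition result.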
In another possible implementation manner, the automatically driving vehicle fuses the first recognition result and the second recognition result to obtain a third recognition result, including: screening an obstacle image meeting an obstacle condition from the second identification result by the automatic driving vehicle to obtain a fourth identification result, wherein the obstacle condition is a condition that an object influences the running of the automatic driving vehicle; and fusing the first recognition result and the fourth recognition result to obtain a third recognition result.
Optionally, the autonomous vehicle screens out an obstacle image satisfying the obstacle condition from the second recognition result, including at least one of the following:

(1) The obstacle condition includes a distance threshold, the distance threshold indicating that an object whose distance to the road is less than the distance threshold is an obstacle affecting the travel of the autonomous vehicle; the distance between the obstacle and the road in the identified obstacle images is determined based on the second recognition result, and obstacle images whose distance is less than the distance threshold are screened from the second recognition result.

(2) The obstacle condition includes a depth threshold, the depth threshold indicating the maximum image depth, in the driving environment image, of an object affecting the driving of the autonomous vehicle; the image depth of the identified obstacle images is determined based on the second recognition result, and obstacle images whose image depth is greater than the depth threshold are screened from the second recognition result.

(3) The obstacle condition includes a size threshold, the size threshold indicating the minimum image size, in the driving environment image, of an object affecting the driving of the autonomous vehicle; obstacle images whose image size is greater than the size threshold are screened from the second recognition result based on the image size of the obstacle images in the second recognition result.
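The three conditions can be sketched as a single predicate. The dictionary keys are hypothetical, the comparison directions follow the text verbatim, and passing `None` disables a check to reflect the "at least one of" wording.

```python
def meets_obstacle_condition(det, dist_thr=None, depth_thr=None, size_thr=None):
    """Distance / image-depth / image-size conditions; None skips a check."""
    if dist_thr is not None and not det["road_distance"] < dist_thr:
        return False
    if depth_thr is not None and not det["image_depth"] > depth_thr:
        return False
    if size_thr is not None and not det["image_size"] > size_thr:
        return False
    return True

def screen_obstacles(second_result, **thresholds):
    """Derive the fourth recognition result from the second."""
    return [d for d in second_result if meets_obstacle_condition(d, **thresholds)]
```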
Optionally, the automatic driving vehicle determines a distance between the obstacle and the road in the identified obstacle image based on the second identification result, including: identifying a road image from an image to which the obstacle image belongs for any obstacle image in the second identification result; based on the positions of the obstacle image and the road image in the image, a distance between the obstacle indicated by the obstacle image and the road indicated by the road image in the image is determined.
Optionally, the automatic driving vehicle identifies the road image from the images to which the obstacle image belongs, including: acquiring point cloud data corresponding to an image, wherein the image and the point cloud data correspond to the same scene; acquiring points belonging to the road surface from the point cloud data based on the height of the points in the point cloud data; projecting the obtained points into an image to obtain a road image in the image; or processing the image through the road segmentation model to obtain a road image in the image.
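The first option above — projecting near-road-surface points into the image — can be sketched as below. The pinhole model, the camera-frame convention (x right, y down, z forward), the camera height, and the height tolerance are all assumptions.

```python
import numpy as np

def road_mask_from_points(pts_cam, fx, fy, cx, cy, cam_height=1.5,
                          tol=0.2, img_h=480, img_w=640):
    """Select points whose height is close to the road surface and project
    them through the pinhole intrinsics to form a road mask in the image."""
    near_ground = np.abs(pts_cam[:, 1] - cam_height) < tol  # y down: road ~ cam_height
    pts = pts_cam[near_ground]
    pts = pts[pts[:, 2] > 0]                                # in front of the camera
    u = (fx * pts[:, 0] / pts[:, 2] + cx).astype(int)
    v = (fy * pts[:, 1] / pts[:, 2] + cy).astype(int)
    mask = np.zeros((img_h, img_w), dtype=bool)
    keep = (u >= 0) & (u < img_w) & (v >= 0) & (v < img_h)
    mask[v[keep], u[keep]] = True
    return mask
```

The alternative in the text — a learned road segmentation model — would replace this function entirely.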
Optionally, the method further comprises: for any obstacle image in the second recognition result, acquiring point cloud data based on an image to which the obstacle image belongs, wherein the image and the point cloud data correspond to the same scene; mapping points in the point cloud data into an image, determining a depth map of the image based on the depth of the points in the point cloud data, and determining the image depth of the obstacle image in the image based on the depth map of the image; or, for any obstacle image in the second recognition result, acquiring point cloud data based on an image to which the obstacle image belongs, wherein the image and the point cloud data correspond to the same scene; mapping points in the point cloud data into an image, and determining the image depth of the obstacle image based on the depth of the point cloud data corresponding to the obstacle image in the image; or, for any obstacle image in the second recognition result, determining the image depth of the obstacle image by a depth determination model.
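A sketch of the second variant above (image depth from the point cloud points corresponding to the obstacle image): project the cloud into the image and aggregate the depths of the points landing inside the obstacle's bounding box. The median aggregation, the `[x1, y1, x2, y2]` box format, and the intrinsics are assumptions.

```python
import numpy as np

def box_depth_from_points(pts_cam, box, fx, fy, cx, cy):
    """Project the point cloud into the image and take the median depth of
    the points landing inside the obstacle's [x1, y1, x2, y2] box."""
    pts = pts_cam[pts_cam[:, 2] > 0]
    u = fx * pts[:, 0] / pts[:, 2] + cx
    v = fy * pts[:, 1] / pts[:, 2] + cy
    x1, y1, x2, y2 = box
    inside = (u >= x1) & (u <= x2) & (v >= y1) & (v <= y2)
    return float(np.median(pts[inside, 2])) if inside.any() else None
```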
The step of obtaining the fourth recognition result may refer to the step 304, which is not described herein.
In another possible implementation manner, the third recognition result may also be obtained based on the point cloud data. Optionally, fusing the first recognition result and the second recognition result to obtain the third recognition result includes: acquiring point cloud data of the target scene; clustering the points in the point cloud data whose distance to the road does not exceed a distance threshold value to obtain a plurality of point clusters; mapping the plurality of point clusters into the image to obtain object images corresponding to the plurality of point clusters respectively; determining a fifth recognition result based on the object images corresponding to the plurality of point clusters and the second recognition result, wherein the fifth recognition result comprises the object images, among those corresponding to the plurality of point clusters, that do not belong to the second recognition result; and fusing the first recognition result, the second recognition result, and the fifth recognition result to obtain the third recognition result.
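A minimal sketch of the final fusion, deduplicating the union of the three results; matching by exact box equality and the `box` dictionary key are simplifications (a real system would match detections by IoU).

```python
def fuse_results(first, second, fifth):
    """Union the closed-set, open-set, and cluster-derived detections,
    keeping the first occurrence of each bounding box."""
    fused, seen = [], set()
    for det in list(first) + list(second) + list(fifth):
        key = tuple(det["box"])
        if key not in seen:
            seen.add(key)
            fused.append(det)
    return fused
```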
Fig. 6 is a schematic structural diagram of an obstacle recognition model training device provided in an embodiment of the present application, referring to fig. 6, the device includes:
an acquisition module 601, configured to acquire a sample set, where the sample set includes a plurality of sample images;
The labeling module 602 is configured to identify, in an nth iteration process of the training process, an unknown class of obstacle image in a sample set corresponding to an nth-1 iteration through an open set identification model corresponding to the nth-1 iteration, label the identified obstacle image as the unknown class, where the open set identification model is used to identify the unknown class of obstacle image based on the learned known class of obstacle image, and N is a positive integer greater than 1;
the training module 603 is configured to train a closed-set recognition model based on the labeled sample set, so as to obtain a closed-set recognition model corresponding to the nth iteration, where the closed-set recognition model is used to recognize an obstacle image of a known class;
an output module 604, configured to output the closed set recognition model corresponding to the nth iteration as an obstacle recognition model if the training process reaches a loop termination condition;
the training module 603 is further configured to identify, by using a closed set identification model corresponding to an nth iteration, the plurality of sample images in the sample set if the training process does not reach a loop termination condition, label the plurality of sample images in the sample set based on an identification result, obtain a sample set corresponding to the nth iteration, and train, based on the sample set corresponding to the nth iteration, an open set identification model corresponding to the N-1 th iteration, so as to obtain an open set identification model corresponding to the nth iteration.
As shown in fig. 7, in one possible implementation, the labeling module 602 includes:
a screening unit 6021 for screening out an obstacle image satisfying an obstacle condition, which is a condition that an object affects the travel of the automated driving vehicle, from the identified obstacle images;
and a labeling unit 6022, configured to label the screened obstacle image as the unknown class.
In a possible implementation manner, the screening unit 6021 is configured to perform at least one of the following:
the obstacle condition includes a distance threshold indicating that an object having a distance to the road less than the distance threshold is an obstacle affecting the travel of the autonomous vehicle; determining the distance between the obstacle and the road in the identified obstacle image, and screening the obstacle image with the distance smaller than a distance threshold value from the identified obstacle image;
the obstacle condition includes a depth threshold value indicating a maximum image depth in a driving environment image of an object affecting the driving of the autonomous vehicle; screening an obstacle image with an image depth greater than the depth threshold from the identified obstacle images based on the image depth of the identified obstacle images;
The obstacle condition includes a size threshold indicating a minimum image size in a driving environment image of an object affecting the driving of the autonomous vehicle; and screening the obstacle images with the image sizes larger than the size threshold value from the identified obstacle images based on the image sizes of the identified obstacle images.
In a possible implementation manner, the screening unit 6021 is configured to identify, for any identified obstacle image, a road image from a sample image to which the obstacle image belongs; a distance between an obstacle indicated by the obstacle image in the sample image and a road indicated by the road image is determined based on the positions of the obstacle image and the road image in the sample image.
In a possible implementation manner, the filtering unit 6021 is configured to obtain point cloud data corresponding to the sample image, where the sample image and the point cloud data correspond to the same scene; acquiring points belonging to a road surface from the point cloud data based on the height of the points in the point cloud data; projecting the obtained points into the sample image to obtain a road image in the sample image; or,
The screening unit 6021 is configured to process the sample image through a road segmentation model to obtain a road image in the sample image.
In one possible implementation, the apparatus further includes:
a determining module 605, configured to obtain, for any identified obstacle image, point cloud data based on a sample image to which the obstacle image belongs, where the sample image and the point cloud data correspond to the same scene; mapping points in the point cloud data into the sample image, determining a depth map of the sample image based on the depth of the points in the point cloud data, and determining the image depth of the obstacle image in the sample image based on the depth map of the sample image; or,
the determining module 605 is configured to obtain, for any identified obstacle image, point cloud data based on a sample image to which the obstacle image belongs, where the sample image and the point cloud data correspond to the same scene; mapping points in the point cloud data into the sample image, and determining the image depth of the obstacle image based on the depth of the point cloud data corresponding to the obstacle image in the sample image; or,
The determining module 605 is configured to determine, for any identified obstacle image, an image depth of the obstacle image by using a depth determination model.
In one possible implementation, the apparatus further includes:
the obtaining module 601 is further configured to obtain, in any iteration process, point cloud data corresponding to the sample image, where the sample image and the point cloud data correspond to the same scene;
a clustering module 606, configured to cluster points in the point cloud data, where the distance between the points and the road does not exceed a distance threshold value, to obtain a plurality of point clusters;
the labeling module 602 is further configured to map the plurality of point clusters into the sample image, and if an object image corresponding to any point cluster in the sample image is not an obstacle image identified by the open set identification model, determine the object image as an obstacle image and label the object image as an unknown class.
In a possible implementation manner, the training module 603 is configured to label, in the sample set, the identified obstacle image based on a category to which the identified obstacle image belongs; screening out obstacle images which meet a confidence coefficient condition and belong to the unknown class from the identified obstacle images; and resetting the position of the screened obstacle image according to a target mode to obtain a sample set corresponding to the Nth iteration.
In one possible implementation, the output module 604 is configured to output the closed-set identification model corresponding to the nth iteration as the target closed-set identification model if the iteration process of the training process reaches the target number of times.
It should be noted that: in the training device for the obstacle recognition model provided in the above embodiment, only the division of the functional modules is used for illustration when training the obstacle recognition model, and in practical application, the functional allocation may be performed by different functional modules according to needs, that is, the internal structure of the computer device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the device for training the obstacle recognition model provided in the above embodiment and the method embodiment for training the obstacle recognition model belong to the same concept, and detailed implementation processes of the device are shown in the method embodiment, and are not repeated here.
Fig. 8 is a schematic structural diagram of an obstacle identifying apparatus according to an embodiment of the present application, referring to fig. 8, the apparatus includes:
an acquiring module 801, configured to acquire an image of a target scene;
a first recognition module 802, configured to recognize the image through a closed-set recognition model, to obtain a first recognition result, where the closed-set recognition model is used to recognize the learned known class of obstacle image, and the first recognition result includes the recognized obstacle image and the class of the obstacle image;
A second recognition module 803, configured to recognize the image through an open-set recognition model, to obtain a second recognition result, where the open-set recognition model is configured to recognize an unknown class of obstacle image based on the learned known class of obstacle image, and the second recognition result includes the recognized obstacle image;
and the fusion module 804 is configured to fuse the first recognition result and the second recognition result to obtain a third recognition result.
As shown in fig. 9, in one possible implementation, the fusing module 804 includes:
a screening unit 8041, configured to screen an obstacle image that meets an obstacle condition from the second recognition result, to obtain a fourth recognition result, where the obstacle condition is a condition that the object affects the running of the autonomous vehicle;
and a fusion unit 8042, configured to fuse the first recognition result and the fourth recognition result, and obtain a third recognition result.
In one possible implementation, the obstacle condition includes a distance threshold value indicating that an object having a distance to the road less than the distance threshold value is an obstacle affecting travel of the autonomous vehicle; a screening unit 8041, configured to determine a distance between an obstacle and a road in the identified obstacle image based on the second identification result, and screen an obstacle image with a distance less than a distance threshold from the second identification result;
The obstacle condition includes a depth threshold value indicating a maximum image depth in the driving environment image of an object affecting the driving of the autonomous vehicle; a screening unit 8041, configured to determine an image depth of the identified obstacle image based on the second identification result, and screen an obstacle image whose image depth is greater than a depth threshold value from the second identification result;
the obstacle condition includes a size threshold value indicating a minimum image size in the running environment image of an object affecting the running of the autonomous vehicle; a screening unit 8041, configured to screen, based on the image size of the obstacle image in the second recognition result, an obstacle image whose image size is larger than the size threshold from the second recognition result.
In a possible implementation manner, the screening unit 8041 is configured to identify, for any obstacle image in the second identification result, a road image from images to which the obstacle image belongs; based on the positions of the obstacle image and the road image in the image, a distance between the obstacle indicated by the obstacle image and the road indicated by the road image in the image is determined.
In a possible implementation manner, the filtering unit 8041 is configured to obtain point cloud data corresponding to an image, where the image and the point cloud data correspond to the same scene; acquiring points belonging to the road surface from the point cloud data based on the height of the points in the point cloud data; projecting the obtained points into an image to obtain a road image in the image; or,
And a screening unit 8041, configured to process the image through the road segmentation model, so as to obtain a road image in the image.
In a possible implementation manner, the screening unit 8041 is configured to, for any obstacle image in the second recognition result, obtain point cloud data based on an image to which the obstacle image belongs, where the image and the point cloud data correspond to the same scene; mapping points in the point cloud data into an image, determining a depth map of the image based on the depth of the points in the point cloud data, and determining the image depth of the obstacle image in the image based on the depth map of the image; or,
a screening unit 8041, configured to, for any obstacle image in the second recognition result, obtain point cloud data based on an image to which the obstacle image belongs, where the image and the point cloud data correspond to the same scene; mapping points in the point cloud data into an image, and determining the image depth of the obstacle image based on the depth of the point cloud data corresponding to the obstacle image in the image; or,
a screening unit 8041 for determining, for any obstacle image in the second recognition result, an image depth of the obstacle image by the depth determination model.
In one possible implementation, the fusion module 804 is configured to obtain point cloud data of the target scene; cluster the points in the point cloud data whose distance to the road does not exceed a distance threshold value to obtain a plurality of point clusters; map the plurality of point clusters into the image to obtain object images corresponding to the plurality of point clusters respectively; determine a fifth recognition result based on the object images corresponding to the plurality of point clusters and the second recognition result, wherein the fifth recognition result comprises the object images, among those corresponding to the plurality of point clusters, that do not belong to the second recognition result; and fuse the first recognition result, the second recognition result, and the fifth recognition result to obtain the third recognition result.
It should be noted that the obstacle recognition apparatus provided in the above embodiment is illustrated, when recognizing an obstacle, only by the division of the above functional modules; in practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the autonomous vehicle may be divided into different functional modules to perform all or part of the functions described above. In addition, the obstacle recognition apparatus and the obstacle recognition method provided in the foregoing embodiments belong to the same concept; their detailed implementation processes are described in the method embodiments and are not repeated here.
Fig. 10 is a block diagram of a terminal 1000 according to an embodiment of the present application. Terminal 1000 includes: a processor 1001 and a memory 1002.
The processor 1001 may include one or more processing cores, for example, a 4-core processor or an 8-core processor. The processor 1001 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 1001 may also include a main processor and a coprocessor. The main processor, also referred to as a CPU (Central Processing Unit), is a processor for processing data in an awake state; the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 1001 may be integrated with a GPU (Graphics Processing Unit) for rendering and drawing content that needs to be displayed on the display screen. In some embodiments, the processor 1001 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 1002 may include one or more computer-readable storage media, which may be non-transitory. Memory 1002 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 1002 is used to store at least one program code for execution by processor 1001 to implement the obstacle recognition model training method or the obstacle recognition method provided by the method embodiments herein.
In some embodiments, the terminal 1000 may optionally further include a peripheral interface 1003 and at least one peripheral device. The processor 1001, the memory 1002, and the peripheral interface 1003 may be connected by a bus or a signal line. Each peripheral device may be connected to the peripheral interface 1003 via a bus, a signal line, or a circuit board. Specifically, the peripheral devices include at least one of a radio frequency circuit 1004, a display screen 1005, a camera 1006, and a power supply 1007.
Those skilled in the art will appreciate that the structure shown in Fig. 10 is not limiting, and that the terminal 1000 may include more or fewer components than shown, combine certain components, or adopt a different arrangement of components.
Fig. 11 is a schematic structural diagram of a server provided in an embodiment of the present application. The server 1100 may vary considerably in configuration or performance, and may include one or more processors (Central Processing Units, CPUs) 1101 and one or more memories 1102, where the memories 1102 store at least one program code that is loaded and executed by the processors 1101 to implement the methods provided by the respective method embodiments described above. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for implementing the functions of the device, which are not described in detail here.
The server 1100 is configured to perform the steps performed by the server in the method embodiments described above.
In an exemplary embodiment, a computer-readable storage medium, for example a memory including program code, is also provided; the program code is executable by a processor in a computer device to perform the obstacle recognition model training method or the obstacle recognition method of the above embodiments. For example, the computer-readable storage medium may be a ROM (Read-Only Memory), a RAM (Random Access Memory), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
In an exemplary embodiment, there is also provided a computer program or a computer program product comprising computer program code which, when executed by a computer, causes the computer to implement the obstacle recognition model training method or the obstacle recognition method in the above embodiments.
It will be understood by those skilled in the art that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium may be a read-only memory, a magnetic disk, an optical disk, or the like.
The foregoing description covers merely preferred embodiments of the present application and is not intended to limit it; any modifications, equivalent replacements, improvements, and the like made within the spirit and principles of the present application shall fall within its scope of protection.
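The obstacle-condition screening described in the embodiments above (distance to the road, image depth, image size) could be sketched as below; the detection fields and the reading of the depth threshold as the maximum image depth at which an object still affects driving are assumptions for illustration:

```python
def satisfies_obstacle_condition(det, distance_thr=None, depth_thr=None,
                                 size_thr=None):
    """Screen one detection from the second recognition result; the field
    names on `det` are hypothetical. An obstacle image is kept when its
    object is close enough to the road, near enough in image depth, and
    large enough in the image. A threshold of None disables that check."""
    if distance_thr is not None and det["road_distance"] >= distance_thr:
        return False  # too far from the road to affect travel
    if depth_thr is not None and det["image_depth"] > depth_thr:
        return False  # beyond the maximum relevant image depth
    if size_thr is not None and det["image_size"] <= size_thr:
        return False  # below the minimum relevant image size
    return True

def fourth_recognition_result(second_result, **thresholds):
    """Keep only the obstacle images satisfying the obstacle condition."""
    return [det for det in second_result
            if satisfies_obstacle_condition(det, **thresholds)]
```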

Claims (10)

1. A method of identifying an obstacle, the method comprising:
acquiring an image of a target scene;
identifying the image through a closed set recognition model to obtain a first recognition result, wherein the closed set recognition model is used for identifying learned known types of obstacle images, and the first recognition result comprises the identified obstacle image and the type of the obstacle image;
identifying the image through an open set recognition model to obtain a second recognition result, wherein the open set recognition model is used for identifying unknown types of obstacle images based on the learned known types of obstacle images, and the second recognition result comprises the identified obstacle image;
and fusing the first recognition result and the second recognition result to obtain a third recognition result.
2. The method of claim 1, wherein fusing the first recognition result and the second recognition result to obtain a third recognition result comprises:
screening an obstacle image meeting an obstacle condition from the second recognition result to obtain a fourth recognition result, wherein the obstacle condition is a condition indicating that an object affects the travel of an autonomous vehicle;
and fusing the first recognition result and the fourth recognition result to obtain the third recognition result.
3. The method of claim 2, wherein the screening the second recognition result for the obstacle image satisfying the obstacle condition comprises at least one of:
the obstacle condition comprises a distance threshold, the distance threshold indicating that an object whose distance to the road is smaller than the distance threshold is an obstacle affecting the travel of the autonomous vehicle; determining, based on the second recognition result, the distance between the road and the obstacle in the identified obstacle image, and screening, from the second recognition result, the obstacle image whose distance is smaller than the distance threshold;
the obstacle condition comprises a depth threshold, the depth threshold indicating a maximum image depth, in a driving environment image, of an object affecting the travel of the autonomous vehicle; determining the image depth of the identified obstacle image based on the second recognition result, and screening, from the second recognition result, the obstacle image whose image depth is not larger than the depth threshold;
the obstacle condition comprises a size threshold, the size threshold indicating a minimum image size, in a driving environment image, of an object affecting the travel of the autonomous vehicle; and screening, from the second recognition result, the obstacle image whose image size is larger than the size threshold, based on the image sizes of the obstacle images in the second recognition result.
4. The method of claim 1, wherein fusing the first recognition result and the second recognition result to obtain a third recognition result comprises:
acquiring point cloud data of the target scene;
clustering points, of which the distance between the points and the road does not exceed a distance threshold value, in the point cloud data to obtain a plurality of point clusters;
mapping the plurality of point clusters into the image to obtain object images corresponding to the plurality of point clusters respectively;
determining a fifth recognition result based on the object images corresponding to the plurality of point clusters respectively and the second recognition result, wherein the fifth recognition result comprises the object images, among the object images corresponding to the plurality of point clusters respectively, that do not belong to the second recognition result;
and fusing the first recognition result, the second recognition result, and the fifth recognition result to obtain the third recognition result.
5. A method of training an obstacle recognition model, the method comprising:
obtaining a sample set, the sample set comprising a plurality of sample images;
in the N-th iteration of the training process, identifying obstacle images of an unknown class in a sample set corresponding to the (N-1)-th iteration through an open set recognition model corresponding to the (N-1)-th iteration, and labeling the identified obstacle images as the unknown class, wherein the open set recognition model is used for identifying obstacle images of unknown classes based on learned obstacle images of known classes, and N is a positive integer greater than 1;
training a closed set recognition model based on the labeled sample set to obtain a closed set recognition model corresponding to the N-th iteration, wherein the closed set recognition model is used for identifying obstacle images of known classes;
if the training process reaches a loop termination condition, outputting the closed set recognition model corresponding to the N-th iteration as an obstacle recognition model;
and if the training process does not reach the loop termination condition, identifying the plurality of sample images in the sample set through the closed set recognition model corresponding to the N-th iteration, labeling the plurality of sample images in the sample set based on the recognition result to obtain a sample set corresponding to the N-th iteration, and training the open set recognition model corresponding to the (N-1)-th iteration based on the sample set corresponding to the N-th iteration to obtain an open set recognition model corresponding to the N-th iteration.
6. The method according to claim 5, wherein the method further comprises:
in any iteration process, acquiring point cloud data corresponding to the sample image, wherein the sample image and the point cloud data correspond to the same scene;
clustering points, of which the distance between the points and the road does not exceed a distance threshold value, in the point cloud data to obtain a plurality of point clusters;
and mapping the plurality of point clusters into the sample image, and if an object image corresponding to any point cluster in the sample image is not an obstacle image identified by the open set identification model, determining the object image as an obstacle image and marking the object image as an unknown class.
7. An obstacle recognition device, the device comprising:
the acquisition module is used for acquiring an image of the target scene;
the first recognition module is used for recognizing the image through a closed set recognition model to obtain a first recognition result, wherein the closed set recognition model is used for recognizing the learned known type of obstacle image, and the first recognition result comprises the recognized obstacle image and the type of the obstacle image;
the second recognition module is used for recognizing the image through an open set recognition model to obtain a second recognition result, the open set recognition model is used for recognizing the unknown type of obstacle image based on the learned known type of obstacle image, and the second recognition result comprises the recognized obstacle image;
and the fusion module is used for fusing the first identification result and the second identification result to obtain a third identification result.
8. An obstacle recognition model training device, the device comprising:
an acquisition module for acquiring a sample set comprising a plurality of sample images;
the labeling module is configured to, in the N-th iteration of the training process, identify obstacle images of an unknown class in a sample set corresponding to the (N-1)-th iteration through an open set recognition model corresponding to the (N-1)-th iteration, and label the identified obstacle images as the unknown class, wherein the open set recognition model is used for identifying obstacle images of unknown classes based on learned obstacle images of known classes, and N is a positive integer greater than 1;
the training module is configured to train a closed set recognition model based on the labeled sample set to obtain a closed set recognition model corresponding to the N-th iteration, wherein the closed set recognition model is used for identifying obstacle images of known classes;
the output module is configured to output the closed set recognition model corresponding to the N-th iteration as an obstacle recognition model if the training process reaches a loop termination condition;
and the training module is further configured to, if the training process does not reach the loop termination condition, identify the plurality of sample images in the sample set through the closed set recognition model corresponding to the N-th iteration, label the plurality of sample images in the sample set based on the recognition result to obtain a sample set corresponding to the N-th iteration, and train the open set recognition model corresponding to the (N-1)-th iteration based on the sample set corresponding to the N-th iteration, so as to obtain an open set recognition model corresponding to the N-th iteration.
9. A computer device comprising one or more processors and one or more memories, the one or more memories having stored therein at least one program code that is loaded and executed by the one or more processors to implement the operations performed by the obstacle recognition method of any of claims 1-4 or to implement the operations performed by the obstacle recognition model training method of any of claims 5-6.
10. A computer-readable storage medium, characterized in that at least one program code is stored in the storage medium, the at least one program code being loaded and executed by a processor to implement operations performed by the obstacle recognition method as claimed in any one of claims 1 to 4, or to implement operations performed by the obstacle recognition model training method as claimed in any one of claims 5 to 6.
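The alternating training loop of claims 5 and 8 could be sketched as follows. The model interfaces are passed in as callables because the claims do not fix concrete APIs; every name here is illustrative, and the loop termination condition is simplified to an iteration cap:

```python
def train_obstacle_recognizer(samples, detect_unknown, fit_closed,
                              predict_closed, fit_open, max_iters=3):
    """Alternating open-set / closed-set training (sketch of claim 5).

    samples: list of (image, labels) pairs; detect_unknown, fit_closed,
    predict_closed, and fit_open are hypothetical model callables.
    """
    labeled = list(samples)
    closed = None
    for n in range(1, max_iters + 1):
        # The open-set model from iteration n-1 marks unknown-class obstacles.
        labeled = [
            (img, labels + [("unknown", box) for box in detect_unknown(img)])
            for img, labels in labeled
        ]
        # Retrain the closed-set model on the relabeled sample set.
        closed = fit_closed(labeled)
        # Loop termination condition (here simply an iteration cap).
        if n == max_iters:
            return closed  # output as the obstacle recognition model
        # Closed-set predictions relabel the samples, then the open-set
        # model is retrained on the refreshed sample set.
        labeled = [(img, predict_closed(closed, img)) for img, _ in labeled]
        fit_open(labeled)
    return closed
```

Each iteration thus lets the two models bootstrap one another: open-set detections expand the labels the closed-set model trains on, and closed-set predictions refresh the set the open-set model learns from.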
CN202310180237.1A 2023-02-15 2023-02-15 Obstacle recognition method and obstacle recognition model training method Pending CN116311157A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310180237.1A CN116311157A (en) 2023-02-15 2023-02-15 Obstacle recognition method and obstacle recognition model training method

Publications (1)

Publication Number Publication Date
CN116311157A true CN116311157A (en) 2023-06-23

Family

ID=86814330

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310180237.1A Pending CN116311157A (en) 2023-02-15 2023-02-15 Obstacle recognition method and obstacle recognition model training method

Country Status (1)

Country Link
CN (1) CN116311157A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118097625A (en) * 2024-04-24 2024-05-28 广汽埃安新能源汽车股份有限公司 Obstacle recognition method and device

Similar Documents

Publication Publication Date Title
CN109145680B (en) Method, device and equipment for acquiring obstacle information and computer storage medium
CN110969130B (en) Driver dangerous action identification method and system based on YOLOV3
CN113688652B (en) Abnormal driving behavior processing method and device
EP3806064A1 (en) Method and apparatus for detecting parking space usage condition, electronic device, and storage medium
CN105574550A (en) Vehicle identification method and device
CN111784685A (en) Power transmission line defect image identification method based on cloud edge cooperative detection
CN108960124B (en) Image processing method and device for pedestrian re-identification
CN109767637A (en) The method and apparatus of the identification of countdown signal lamp and processing
CN113723377B (en) Traffic sign detection method based on LD-SSD network
CN109993138A (en) A kind of car plate detection and recognition methods and device
CN104615986A (en) Method for utilizing multiple detectors to conduct pedestrian detection on video images of scene change
CN106257490A (en) The method and system of detection driving vehicle information
CN109902610A (en) Traffic sign recognition method and device
CN112949578B (en) Vehicle lamp state identification method, device, equipment and storage medium
US20210303899A1 (en) Systems and methods for automatic recognition of vehicle information
CN111295666A (en) Lane line detection method, device, control equipment and storage medium
CN110163109A (en) A kind of lane line mask method and device
CN108960175A (en) A kind of licence plate recognition method based on deep learning
CN116311157A (en) Obstacle recognition method and obstacle recognition model training method
CN114926791A (en) Method and device for detecting abnormal lane change of vehicles at intersection, storage medium and electronic equipment
CN114972911A (en) Method and equipment for collecting and processing output data of automatic driving perception algorithm model
CN111881984A (en) Target detection method and device based on deep learning
CN112446375A (en) License plate recognition method, device, equipment and storage medium
Vijayalakshmi et al. Design of algorithm for vehicle identification by number plate recognition
CN116721396A (en) Lane line detection method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination