CN112447065B

CN112447065B - Trajectory planning method and device

Info

Publication number: CN112447065B
Application number: CN201910760761.XA
Authority: CN
Inventors: 林鹏宏
Original assignee: Beijing Horizon Robotics Technology Research and Development Co Ltd
Current assignee: Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority date: 2019-08-16
Filing date: 2019-08-16
Publication date: 2022-04-26
Anticipated expiration: 2039-08-16
Also published as: CN112447065A

Abstract

The invention discloses a trajectory planning method and device. Driving data corresponding to a movable device is processed according to a preset model to obtain a first cost map; relevant scene data corresponding to the movable device is mapped to obtain a second cost map; a spatiotemporal cost map is determined based on the first cost map and the second cost map; and a target driving trajectory corresponding to the movable device is determined according to a target driving strategy corresponding to the movable device and the spatiotemporal cost map.

Description

Trajectory planning method and device

Technical Field

The application relates to the technical field of automatic driving, in particular to a trajectory planning method and device.

Background

With the continuous development of science and technology, automatic driving has also developed rapidly. Automatic driving does not require a driver; the whole driving process is controlled automatically by a computer.

A key point of automatic driving research is driving trajectory planning, and whether the planned trajectory is reasonable directly affects driving safety and driving efficiency. For example, if the driving trajectory of a vehicle is planned poorly and the vehicle automatically drives its passengers to the site of a traffic accident, driving safety is seriously affected. As another example, if the driving trajectory of a drone is planned poorly, the drone may collide with other drones.

However, current trajectory planning has low precision, which leads to unreasonable planned trajectories.

Disclosure of Invention

The present application is proposed to solve the above-mentioned technical problems.

According to an aspect of the present application, there is provided a trajectory planning method, the method including: processing driving data corresponding to a movable device according to a preset model to obtain a first cost map; mapping relevant scene data corresponding to the movable device to obtain a second cost map; determining a spatiotemporal cost map based on the first cost map and the second cost map; and determining a target driving trajectory corresponding to the movable device according to a target driving strategy corresponding to the movable device and the spatiotemporal cost map.

According to another aspect of the present application, there is provided a trajectory planning apparatus comprising: a processing module, configured to call a preset model to process driving data corresponding to a movable device to obtain a first cost map; an obtaining module, configured to map relevant scene data corresponding to the movable device to obtain a second cost map; a first determining module, configured to determine a spatiotemporal cost map based on the first cost map and the second cost map; and a second determining module, configured to determine a target driving trajectory corresponding to the movable device according to a target driving strategy corresponding to the movable device and the spatiotemporal cost map.

According to still another aspect of the present application, there is provided a mobile device including: a processor; and a memory having stored therein computer program instructions which, when executed by the processor, cause the processor to perform the method as described above.

According to yet another aspect of the application, there is provided a computer readable medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the method as described above.

Compared with the prior art, the present application processes the driving data corresponding to the movable device with a preset model to obtain the first cost map, so that the cost of travelling along each trajectory can be predicted objectively and accurately. The relevant scene data of the movable device is then mapped to obtain the second cost map; because the second cost map is mapped from scene data, it accurately reflects the static cost of the obstacles along each trajectory. The spatiotemporal cost map is then determined based on the first cost map and the second cost map, so that it combines, in both the time domain and the space domain, the trajectory cost reflected by the first cost map with the static cost in the second cost map, and therefore reflects the cost of each trajectory accurately and comprehensively in both time and space. Finally, the target driving trajectory corresponding to the movable device is generated according to the target driving strategy corresponding to the movable device and the spatiotemporal cost map. Because the spatiotemporal cost map reflects the cost of each trajectory accurately and comprehensively, the target driving trajectory can be matched accurately, which effectively improves the accuracy and reasonableness of the target driving trajectory.

The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.

Drawings

The above and other objects, features and advantages of the present application will become more apparent by describing in more detail embodiments of the present application with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the principles of the application. In the drawings, like reference numbers generally represent like parts or steps.

Fig. 1 is a schematic flowchart of a trajectory planning method according to an exemplary embodiment of the present application.

FIG. 2 is a schematic diagram of a spatiotemporal cost map provided by another exemplary embodiment of the present application.

Fig. 3 is a schematic flowchart of determining a target driving trajectory corresponding to a mobile device according to an exemplary embodiment of the present application.

FIG. 4 is a flow chart illustrating a process for calculating a respective cost value of each track according to an exemplary embodiment of the present application.

Fig. 5 is a schematic diagram of a trajectory planning apparatus according to an exemplary embodiment of the present application.

FIG. 6 is an exemplary block diagram of a first determination module provided by an exemplary embodiment of the present application.

FIG. 7 is an exemplary block diagram of a generation module provided by an exemplary embodiment of the present application.

FIG. 8 is an exemplary block diagram of a movable device provided by an exemplary embodiment of the present application.

Detailed Description

Hereinafter, example embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be understood that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and that the present application is not limited by the example embodiments described herein.

Summary of the application

Existing trajectory planning generally collects surrounding environment data through various sensors and converts the data into a cost function, so the positions of various obstacles in the surrounding environment (such as steps, guard posts, underground passages and slopes) need to be considered. If the movable device passes through a position, the environment data collected by the sensors for that position is converted by the cost function into the cost of the movable device passing through that position. In current trajectory planning schemes, terrain trafficability is set very conservatively in order to avoid collisions, so a passable path is often identified as impassable, and the planned path is unreasonable.

To solve the above problems, the present application processes the driving data corresponding to the movable device with a preset model to obtain a first cost map, so that the cost paid by the movable device for performing driving operations while driving can be predicted objectively and accurately. The relevant scene data of the movable device is then mapped to obtain a second cost map; because the second cost map is mapped from scene data, it accurately reflects the static cost of the obstacles along each trajectory. A spatiotemporal cost map is then determined based on the first cost map and the second cost map, so that it combines, in both the time domain and the space domain, the trajectory cost reflected by the first cost map with the static cost in the second cost map, and therefore reflects the cost of the driving operations of the movable device accurately and comprehensively in both space and time. Finally, the target driving trajectory corresponding to the movable device is generated according to the target driving strategy corresponding to the movable device and the spatiotemporal cost map. Because the resulting spatiotemporal cost map reflects that cost accurately and comprehensively, the target driving trajectory can be matched accurately according to it, which effectively improves the accuracy and reasonableness of the target driving trajectory.

Exemplary method

Fig. 1 is a schematic flowchart of a trajectory planning method according to an exemplary embodiment of the present application. This embodiment can be applied to a movable device. The movable device of this embodiment includes devices capable of autonomous movement, such as unmanned vehicles, unmanned aerial vehicles, robotic arms and mobile robots.

Because the trajectory planning method of this embodiment is directed to a movable device, its application scenario differs from that of existing route planning. When an existing map app plans a route for a user, it plans according to the start position and end position input by the user, then displays several routes from the start point to the end point on the screen, gives the navigation time, and may display the congested sections of each route; it considers only road data and cannot take vehicle driving data into account. The trajectory planning of this embodiment is different: because the environmental information, speed, position and so on of the movable device change from moment to moment, the cost of driving the movable device is calculated at each moment, taking both the time domain and the space domain into account. In particular, the movable device may have several driving possibilities at each moment. For example, at a given moment an unmanned vehicle may turn left while accelerating, decelerating or keeping its current speed, turn right while accelerating, decelerating or keeping its current speed, go straight while accelerating, decelerating or keeping its current speed, or stop. It is therefore necessary to determine the cost of the unmanned vehicle performing each operation, and then to control the unmanned vehicle according to the trajectory point with the lowest cost. For example, an unmanned vehicle may pass through a certain intersection at a speed of 40 or at a speed of 20, while accelerating or while decelerating, and so on; if passing through the intersection at a speed of 40 has the lowest cost value, the unmanned vehicle is controlled to pass through the intersection at a speed of 40.

Therefore, the trajectory planning of the present application differs from route planning in practice: the present application considers the cost of all possible operations of the movable device and then plans the trajectory according to those costs.
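
The enumeration of driving operations and the choice of the lowest-cost one can be sketched as follows. This is a minimal illustration only: the operation names, the helper functions `candidate_operations` and `lowest_cost_operation`, and the idea of passing the costs as a dictionary are assumptions made for this sketch, not details given in the text.

```python
from itertools import product

# Hypothetical labels for the operations listed in the text.
STEERING = ("left turn", "right turn", "straight")
SPEED_CHANGE = ("accelerate", "decelerate", "keep speed")


def candidate_operations():
    """Return the ten example operations: 3 steering x 3 speed changes, plus stop."""
    ops = [f"{speed} {steer}" for steer, speed in product(STEERING, SPEED_CHANGE)]
    ops.append("stop")
    return ops


def lowest_cost_operation(costs):
    """Pick the operation with the smallest cost value.

    `costs` maps an operation name to its cost value (assumed to come
    from the cost map at the current moment).
    """
    return min(costs, key=costs.get)
```

In use, the cost of every candidate operation would be looked up for the current moment, and the operation returned by `lowest_cost_operation` would be the one used to control the vehicle.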

The trajectory planning method described in one or more embodiments of the present application is shown in fig. 1, and includes the following steps:

Step 101, processing driving data corresponding to the movable device according to a preset model to obtain a first cost map.

The preset model of this embodiment includes, but is not limited to, a maximum entropy nonlinear deep inverse reinforcement learning model.

Further, the maximum entropy nonlinear deep inverse reinforcement learning model needs to be trained in advance: a large amount of driving data is modeled as a distribution over trajectories, and that distribution is constrained to be the one with maximum entropy.
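
The maximum entropy constraint on the trajectory distribution can be written compactly. The expression below is the standard maximum entropy inverse reinforcement learning form, given here only as an illustration; the symbols $\tau$ (a trajectory), $c_\theta$ (a cost with parameters $\theta$) and $Z(\theta)$ (the normalizing constant) are notation assumed for this sketch and do not appear in the original text.

```latex
P(\tau \mid \theta) = \frac{\exp\bigl(-c_\theta(\tau)\bigr)}{Z(\theta)},
\qquad
Z(\theta) = \sum_{\tau'} \exp\bigl(-c_\theta(\tau')\bigr),
\qquad
c_\theta(\tau) = \sum_{s \in \tau} c_\theta(s)
```

Under this form, trajectories with lower cost are exponentially more likely, and training adjusts $\theta$ so that the demonstrated driving trajectories become likely under the distribution.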

The driving data is used as training samples to train the model. The driving data includes time, speed, position, route, trajectory, real-time environmental information, and the like. Although this embodiment uses the maximum entropy nonlinear deep inverse reinforcement learning model for all movable devices, the driving data differs between movable devices. Taking an unmanned vehicle as an example, its driving data includes data such as the time, the specific position (for example, a certain intersection), the vehicle speed, the trajectory followed through the intersection, and the real-time road conditions at the intersection. The driving data of a drone includes the specific position (a certain altitude point), the speed, the trajectory followed and the altitude at which a certain obstacle is passed, and so on.

In particular, the cost function is learned from demonstration data by the inverse reinforcement learning (IRL) method. IRL infers the reward function of a Markov decision process (MDP) given a policy or a set of operation demonstrations. The idea is to let the movable device derive, from the expert demonstrations of a human driver, a reward function that guides the movable device to converge to the driver's driving policy; that is, a cost map is inferred from the demonstrated policy. IRL determines the relative importance of each task and computes a reward function, which serves as the guiding principle for a series of decisions. In fields where the reward function is difficult to quantify, the reward function that human drivers use to make decisions can be learned through IRL.

Further, this embodiment uses a maximum entropy deep inverse reinforcement learning (Max-Ent DIRL) framework based on fully convolutional networks (FCNs) as the preset model. A large number of driving data samples are used to realize an end-to-end mapping from raw driving data samples to cost, so that the cost map output by the preset model objectively, comprehensively and accurately reflects the cost paid by the movable device when it performs driving operations while driving.

The driving data corresponding to the movable device includes time, speed, position, route, trajectory, real-time environmental information, and the like.

The first cost map represents the cost paid by the movable device for performing driving operations while driving. The first cost map is displayed on the movable device in the form of a three-dimensional map.

The first cost map comprises multiple layers (more than two layers), and each layer represents the cost map at one moment. The first cost map therefore reflects, in both the time domain and the space domain, the cost paid by the movable device for performing driving operations while driving.

Further, each pixel in the first cost map has a cost value, which comprises a static cost value and a dynamic cost value. The dynamic cost value is affected by the static cost value.

Step 102, mapping the relevant scene data corresponding to the movable device to obtain a second cost map.

Specifically, the second cost map is also called a manual cost map. Various sensors on the movable device collect the relevant scene data, and a mapping relationship exists between the relevant scene data and the cost function, so the relevant scene data can be mapped through the cost function to form the second cost map.

The second cost map reflects, in the time domain and the space domain, the cost paid by the movable device for performing driving operations while driving. Notably, the second cost map contains the static cost values of obstacles and of the map boundaries, so the cost of the movable device performing driving operations is reflected through static cost values. In particular, the second cost map depicts the positions, shapes and boundaries of various obstacles. For example, static obstacles such as stairs, slopes, bollards and underground tunnels each have their own static cost, and the map boundaries may also have a cost.
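
One way to picture the mapping from scene data to a second cost map is to rasterize obstacles and the map boundary onto a grid of static cost values. The sketch below is an assumption-laden illustration: the rectangle-based obstacle format, the function name `build_static_cost_map` and the choice of 100 as the boundary cost are hypothetical and are not specified in the text.

```python
from typing import List, Tuple

COLLISION_COST = 100  # assumed top of the 0..100 cost range used as an example in the text

# Hypothetical obstacle format: (row_start, col_start, row_end, col_end, cost),
# with the end indices exclusive.
Obstacle = Tuple[int, int, int, int, int]


def build_static_cost_map(height: int, width: int, obstacles: List[Obstacle],
                          boundary_cost: int = COLLISION_COST) -> List[List[int]]:
    """Rasterize the map boundary and rectangular obstacles into one 2D layer."""
    grid = [[0] * width for _ in range(height)]

    # Cells on the outer edge of the grid carry the boundary cost.
    for r in range(height):
        for c in range(width):
            if r in (0, height - 1) or c in (0, width - 1):
                grid[r][c] = boundary_cost

    # Each obstacle writes its own static cost; the higher cost wins on overlap.
    for r0, c0, r1, c1, cost in obstacles:
        for r in range(max(r0, 0), min(r1, height)):
            for c in range(max(c0, 0), min(c1, width)):
                grid[r][c] = max(grid[r][c], cost)
    return grid
```

A sequence of such layers, one per moment, would give the second cost map its time dimension when obstacle positions change over time.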

Step 103, determining a spatiotemporal cost map based on the first cost map and the second cost map.

In particular, the spatiotemporal cost map reflects the cost of the driving operations of the movable device in both the time domain and the space domain. It differs from the first cost map in that the static cost values in the second cost map, which account for the positions and shapes of the various obstacles, the boundaries of the cost map and so on, are merged into the first cost map, so that the cost values of the spatiotemporal cost map are more accurate and more comprehensive.

As shown in fig. 2, the spatiotemporal cost map comprises multiple layers (more than two layers), and each layer represents the spatiotemporal cost map at one moment. If the spatial plane is 2-dimensional, the spatiotemporal cost map is 3-dimensional because the time dimension is added. Similarly, if the space is n-dimensional, the spatiotemporal cost map is (n+1)-dimensional. For ease of understanding, a 2-dimensional space is used as the example.

In fig. 2, each layer (a 2D grid cost map) represents the spatiotemporal cost map at one moment, and the spatiotemporal cost map includes static cost values and dynamic cost values. The dynamic cost value is affected by the static cost value.

Further, the static cost value is the cost value of environmental information, such as the cost value of a barrier at a certain intersection, a pillar, a red traffic light, and so on. The static cost value in the spatiotemporal cost map is obtained by combining the static cost value in the first cost map with the static cost value in the second cost map.

The dynamic cost value is the cost value paid for performing a dynamic operation at a given moment. The dynamic cost value in the spatiotemporal cost map is obtained by combining the dynamic cost value in the first cost map with the static cost value in the second cost map. For example, suppose a road in the first cost map carries a U-turn indication line whose static cost value is 10, so that the dynamic cost value of the movable device making a U-turn on that road may be 1. If the second cost map contains a barrier at the position of the U-turn indication line on the same road, a U-turn is not allowed there, and after the second cost map is merged in, both the static cost value and the dynamic cost value in the spatiotemporal cost map may be 100, indicating that a collision may occur when making a U-turn.

Time increases upward along the height direction at equal intervals: the bottom layer represents the current time t, the second layer represents t + Δt, the third layer represents t + 2Δt, and so on.

Each pixel in the graph has a cost value, and the range of cost values can be arbitrarily specified, such as from 0 to 100. 0 indicates the lowest cost and 100 indicates the highest cost, 100 indicating that a collision will occur at that point.
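
The layered structure can be pictured as a nested list indexed by layer, row and column, where layer k corresponds to time t + kΔt. The following sketch shows one way to look up a cost value for a given time; the function names `layer_index` and `cost_at`, and the nested-list representation, are assumptions made for illustration.

```python
def layer_index(time, t0, dt, num_layers):
    """Map a time to its layer: the bottom layer is t0, the next t0 + dt, and so on."""
    k = int(round((time - t0) / dt))
    if not 0 <= k < num_layers:
        raise ValueError("time falls outside the planning horizon of the cost map")
    return k


def cost_at(st_map, t0, dt, time, row, col):
    """Return the cost value of one pixel of the spatiotemporal cost map.

    `st_map` is assumed to be a list of 2D layers, each a list of rows.
    """
    return st_map[layer_index(time, t0, dt, len(st_map))][row][col]
```

With a range of 0 to 100, a returned value of 100 would mark a pixel where a collision occurs at that time.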

Step 104, determining a target driving trajectory corresponding to the movable device according to the target driving strategy corresponding to the movable device and the spatiotemporal cost map.

Specifically, the target driving strategy is a strategy that specifies the preset driving operations to be performed, and the number of preset driving operations is not fixed. For example, when the movable device is driving straight in the middle lane of three one-way lanes, the target driving strategy may include preset driving operations such as changing to the right lane, changing to the left lane and keeping the lane. The target driving trajectory is the planned trajectory with the minimum cost value for the movable device while driving. The purpose of this step is to determine, from the spatiotemporal cost map, the cost values corresponding to the preset driving operations in the target driving strategy, and then to take the trajectory with the minimum cost value as the target driving trajectory.

Through the above analysis, this embodiment of the invention processes the driving data corresponding to the movable device with a preset model to obtain the first cost map, so that the cost paid by the movable device for performing driving operations while driving can be predicted objectively and accurately. The relevant scene data of the movable device is then mapped to obtain the second cost map; because the second cost map is mapped from scene data, it accurately reflects the static cost of the obstacles along each trajectory. The spatiotemporal cost map is then determined based on the first cost map and the second cost map, so that it combines, in both the time domain and the space domain, the trajectory cost reflected by the first cost map with the static cost in the second cost map, and therefore reflects the cost of the driving operations of the movable device accurately and comprehensively in both space and time. Finally, the target driving trajectory corresponding to the movable device is generated according to the target driving strategy corresponding to the movable device and the spatiotemporal cost map. Because the resulting spatiotemporal cost map reflects that cost accurately and comprehensively, the target driving trajectory can be matched accurately according to it, which effectively improves the accuracy and reasonableness of the target driving trajectory.

On the basis of the embodiment shown in fig. 1, the preset model comprises a maximum entropy nonlinear deep inverse reinforcement learning model. As an optional implementation of this embodiment, in the specific implementation of step 101, the driving data of the movable device is processed according to the maximum entropy nonlinear deep inverse reinforcement learning model to obtain an inverse reinforcement learning cost map. Because the model has been trained in advance, inputting the driving data of the movable device into the model yields the first cost map, which is also called the inverse reinforcement learning cost map.

The operation utilizes a large number of driving data samples, and end-to-end mapping from original driving data to cost is realized, so that the cost graph output by the preset model can objectively, comprehensively and accurately reflect the cost paid by the movable equipment when driving operation is performed.

Specifically, the driving data corresponding to the movable device includes time, speed, position, route, trajectory, real-time environmental information, and the like. These parameters are input into the model to obtain the inverse reinforcement learning cost map.

The inverse reinforcement learning cost map contains the cost values of driving operations. For example, at the current moment the unmanned vehicle may turn left while accelerating, decelerating or keeping its current speed, turn right while accelerating, decelerating or keeping its current speed, go straight while accelerating, decelerating or keeping its current speed, or stop, and each of these operations has its own inverse reinforcement learning cost map. Further, the inverse reinforcement learning cost map includes static cost values and dynamic cost values. The static cost value is the cost value of an obstacle in the inverse reinforcement learning cost map. The dynamic cost value is the cost value of performing each possible operation without reference to the surrounding environment information. The dynamic cost value is affected by the static cost value. For example, when passing through a certain intersection, if the cost value of an obstacle in the intersection is not taken into account, passing at a speed of 40 may have the lowest cost value; but once the static cost value of the environmental information is considered, passing at a speed of 20 has the lower cost value in the cost map, and the unmanned vehicle is controlled to pass through the intersection at a speed of 20.

On the basis of the embodiment shown in fig. 1, as an optional implementation manner of this embodiment, in the process of step 103, the boundary and/or the obstacle in the second cost map are rendered into the first cost map, so as to obtain the spatiotemporal cost map.

The reason for this step is that, although the driving data behind the first cost map includes relevant environmental data, the movable device in that data may not have strictly complied with it. For example, traffic rules require slowing down at a crosswalk or when turning left, but the training data may include driving data in which the vehicle accelerates through a crosswalk or a left turn, which reduces the influence of the relevant environmental data on the cost value. Therefore, the first cost map needs to be adjusted with the second cost map to obtain a more accurate spatiotemporal cost map.

During rendering, the boundaries and/or obstacles in the second cost map are taken as the standard: they are rendered into the first cost map and replace the boundaries and/or obstacles at the corresponding positions in the first cost map. Because the boundaries and/or obstacles in the second cost map have their own static cost values, rendering them into the first cost map changes the cost values at the corresponding positions, so the cost values in the first cost map are updated and driving safety is ensured. For example, when the movable device passes through an intersection, the cost value of turning left in the first cost map is 40, so a left turn appears possible. Once the boundaries and/or obstacles in the second cost map are rendered in, it turns out that the road for the left turn has been fenced off for maintenance in the second cost map and is blocked. Therefore, in the rendered spatiotemporal cost map, the cost value of the left turn may be updated to 100, indicating that a collision may occur. This ensures driving safety; without rendering the second cost map, a collision accident might occur. In addition, rendering the boundaries of the second cost map into the first cost map also changes the cost values at the corresponding positions in the first cost map. Because the first cost map is produced by a model, its boundary positions are not distinct, whereas the cost value of a boundary in the second cost map is very high (indicating that the boundary must not be hit). Therefore, in the rendered spatiotemporal cost map, the cost value of driving onto the boundary may be updated to 100, indicating that a collision may occur. The trajectory can thus be planned more precisely, and driving safety is ensured.

Further, the first cost map represents, in the time domain and the space domain, the cost values of the driving operations performed by the movable device, and the positions of obstacles in the second cost map may change over time. Therefore, during rendering, the second cost map at a given moment is rendered into the first cost map at the same moment, so that the resulting spatiotemporal cost map represents the cost values paid by the movable device while driving more accurately and comprehensively, and driving safety is ensured.
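
The layer-by-layer rendering described above can be sketched as follows. This is only an illustration under stated assumptions: both maps are taken to be nested lists with the same shape, a value of 0 in the second cost map is assumed to mean "no obstacle or boundary at this pixel", and the function name `render_into` is hypothetical.

```python
def render_into(first_layers, second_layers):
    """Render obstacles and boundaries of the second cost map into the first.

    Layers are paired by time index, so the second cost map at a given
    moment only affects the first cost map at the same moment. Wherever
    the second map marks an obstacle or boundary (assumed: value > 0),
    its static cost replaces the value from the first map.
    """
    if len(first_layers) != len(second_layers):
        raise ValueError("both cost maps must cover the same moments")

    fused = []
    for first, second in zip(first_layers, second_layers):
        fused.append([
            [s if s > 0 else f for f, s in zip(f_row, s_row)]
            for f_row, s_row in zip(first, second)
        ])
    return fused
```

In the left-turn example, a pixel with cost 40 in the first cost map would become 100 once the fenced-off road from the second cost map is rendered in, while pixels without obstacles keep their learned cost.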

Referring to fig. 3, on the basis of the embodiment shown in fig. 1, as an alternative implementation manner of this embodiment, the step 104 specifically includes the following steps:

Step 301, sampling from the spatiotemporal cost map according to the target driving strategy to obtain sampled cost maps.

Specifically, sampling is performed from the spatiotemporal cost map according to the preset driving operations contained in the target driving strategy, and the sampled cost maps corresponding to each preset driving operation are determined.

Further, the current moment corresponding to the preset driving operation is determined, and the sampled cost maps corresponding to the preset driving operation are determined according to the preset driving operation at the current moment. Notably, sampling according to a target driving strategy may yield a large number of sampled cost maps.

Continuing the above example, when the movable device is driving straight in the middle lane of three one-way lanes, the target driving strategy may include changing to the right lane, changing to the left lane and keeping the lane. Taking the right lane change as an example, the current moment is determined, and all sampled cost maps corresponding to the right lane change at the current moment are determined from the spatiotemporal cost map according to the current moment and the right lane change strategy. The sampling operations for the left lane change strategy, the lane keeping strategy and so on are similar and are not described again.

As an alternative embodiment, because the number of samples produced during sampling is large, the sampling can be restricted in order to simplify subsequent calculation and storage: an image region corresponding to the preset driving operation is cropped from the spatiotemporal cost map according to that operation. The image region contains the cost values for performing the preset driving operation and is taken as the sampled cost map corresponding to the preset driving operation.
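
Cropping an image region from the cost map can be sketched as a slice applied to every time layer. The function name `crop_region`, the rectangular region and the nested-list representation are assumptions for this illustration; the text does not say how the region for an operation is chosen.

```python
def crop_region(st_map, row_start, row_end, col_start, col_end):
    """Cut the same rectangular region out of every time layer.

    The end indices are exclusive. The result is a smaller cost map that
    only holds the cost values relevant to one preset driving operation,
    for example the current lane and the lane to its right.
    """
    return [
        [row[col_start:col_end] for row in layer[row_start:row_end]]
        for layer in st_map
    ]
```

For a right lane change, the region would plausibly cover the current lane and the right-hand lane over the planning horizon, so cells of the left lane never enter later calculations.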

Through the above steps, the sampling sample cost map can be accurately determined from the space-time cost map, which lays a foundation for obtaining an accurate and reasonable target driving track.
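The cropping of an image area per preset driving operation can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the space-time cost map is a nested list indexed as cost_map[t][row][col], with lateral positions (lanes) along the columns and the mobile device in the middle column. The names OPERATION_COLUMNS and crop_sample_cost_map and the column ranges are hypothetical and do not come from the application.

```python
from typing import Dict, List, Tuple

# Assumed layout: cost_map[t][row][col], one 2D layer per time step.
CostMap = List[List[List[float]]]

# Hypothetical mapping from a preset driving operation to the half-open
# column range it may occupy, for a device currently in column 1.
OPERATION_COLUMNS: Dict[str, Tuple[int, int]] = {
    "left_lane_change": (0, 2),   # columns 0 and 1
    "lane_keeping": (1, 2),       # column 1 only
    "right_lane_change": (1, 3),  # columns 1 and 2
}


def crop_sample_cost_map(cost_map: CostMap, operation: str, t_start: int) -> CostMap:
    """Crop the region relevant to `operation` from every layer at or after t_start."""
    col_lo, col_hi = OPERATION_COLUMNS[operation]
    return [
        [row[col_lo:col_hi] for row in layer]
        for layer in cost_map[t_start:]
    ]
```

Cropping keeps only the cells an operation can reach, so later steps store and scan a smaller map than the full space-time cost map.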

Step 302, generating a plurality of tracks according to the sampled sample cost map.

The obtained sampling sample cost maps are combined in the time domain and the space domain to generate a plurality of tracks. Taking the sampling sample cost maps for the right lane change as an example, combining them with one another yields driving tracks such as an accelerating right lane change, a decelerating right lane change, and a right lane change at the current speed.
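As a rough illustration of how one lane-change operation can yield several candidate tracks, the following Python sketch enumerates an accelerating, a decelerating and a constant-speed variant. Note that it does not combine cost maps as the embodiment describes; it substitutes a simple constant-acceleration model as a stand-in, and the function name, parameters and units are all assumptions made for this example.

```python
from typing import Dict, List, Tuple

# A track point as (time step, longitudinal position x, lateral position y).
TrackPoint = Tuple[int, float, float]


def generate_trajectories(
    x0: float,
    y0: float,
    y_target: float,
    speed: float,
    accel_options: Dict[str, float],
    steps: int,
    dt: float = 1.0,
) -> Dict[str, List[TrackPoint]]:
    """Build one candidate track per acceleration option (hypothetical model)."""
    trajectories: Dict[str, List[TrackPoint]] = {}
    for name, accel in accel_options.items():
        points: List[TrackPoint] = []
        x, v = x0, speed
        for k in range(1, steps + 1):
            v = max(0.0, v + accel * dt)             # speed never goes negative
            x += v * dt                              # advance along the lane
            y = y0 + (y_target - y0) * k / steps     # linear lateral shift
            points.append((k, x, y))
        trajectories[name] = points
    return trajectories
```

Each returned track shares the same lateral motion (the lane change) and differs only in its longitudinal speed profile, mirroring the accelerate, decelerate and keep-speed variants named in the text.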

Step 303, determining the target driving trajectory from the plurality of trajectories according to the space-time cost map.

Specifically, each track has its own track points, so a specific cost value can be determined in the spatio-temporal cost map for each track point, and the cost value of each track is then calculated.

With this embodiment, an accurate and reasonable target driving track can be determined according to the target driving strategy on the basis of the space-time cost map. The space-time cost map reflects the cost value of each driving operation accurately, comprehensively and objectively, and screening the corresponding sampling sample cost maps out of a large number of space-time cost maps avoids interference from irrelevant cost maps. In addition, the plurality of tracks generated from the sampling sample cost maps are guaranteed to be tracks related to the target driving strategy. The target driving track is then determined on the basis of the cost values in the space-time cost map, so that an accurate, reasonable and safe target driving track can be determined.

Referring to fig. 4, based on the embodiment shown in fig. 3, as an optional implementation of this embodiment, each trajectory has its own trajectory points, so a specific cost value can be determined in the spatio-temporal cost map for each trajectory point, and the cost value of each trajectory can then be calculated. The specific implementation of step 303 is described as follows:

Step 401, determining respective track points of the plurality of tracks.

Specifically, each track comprises its own track points, and each track point has two dimensions: position and time. The specific position of each track point can be represented by three-dimensional coordinates (x, y, z).

Step 402, determining respective cost values of the plurality of tracks from the spatio-temporal cost map according to respective track points of the plurality of tracks.

Specifically, the cost value characterizes the cost to the mobile device of performing the driving operation, and is typically expressed as a score, a percentage, or the like. Each track comprises one or more track points, and each track point corresponds to a respective cost value. Therefore, in a specific implementation, the cost value of each track point in the plurality of tracks is determined from the spatio-temporal cost map according to the respective track points of the plurality of tracks, and the cost values of the track points corresponding to each track are added to obtain the respective cost value of each track. Because each track is split into one or more track points, and the cost value of the track is obtained from the cost values of its track points, the accuracy of the cost value of the track can be ensured.

Furthermore, each track point has its own time, position and other parameters, so the cost value of the track point can be determined from the space-time cost map according to the time and the position of the track point. For example, the corresponding space-time cost map can first be determined according to the time of the track point, and the cost value at the corresponding position in that space-time cost map is then obtained according to the position of the track point. Alternatively, the cost values at the corresponding position in one or more space-time cost maps may first be determined according to the position, and the cost value in the specific space-time cost map is then selected according to the time.
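The time-then-position lookup can be sketched in Python as follows. The sketch assumes a nested-list cost map indexed as cost_map[t][row][col] and a uniform grid whose cell size is given by a hypothetical resolution parameter; neither the layout nor the function name lookup_cost is specified by the application.

```python
import math
from typing import List

# Assumed layout: cost_map[t][row][col], one 2D layer per time step.
CostMap = List[List[List[float]]]


def lookup_cost(cost_map: CostMap, t: int, x: float, y: float,
                resolution: float = 1.0) -> float:
    """Select the layer by time, then read the cell containing (x, y)."""
    if not 0 <= t < len(cost_map):
        raise IndexError("time step outside the space-time cost map")
    layer = cost_map[t]                      # step 1: pick the layer by time
    row = math.floor(x / resolution)         # step 2: map the position to a cell
    col = math.floor(y / resolution)
    if not (0 <= row < len(layer) and 0 <= col < len(layer[0])):
        raise IndexError("position outside the space-time cost map")
    return layer[row][col]
```

Reversing the order (position first, then time) would read the same cell; it only changes which index is resolved first.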

As an alternative embodiment, since the tracks are combined from the sampling sample cost maps, each track point can be associated with a respective sampling sample cost map. Therefore, the cost value of each track point can be determined in the sampling sample cost map according to the time and the position of the track point.

Each track comprises a plurality of track points, and the cost value of each track point in the track is obtained as described above. The cost values of all track points of the track are then added to obtain the cost value of the track. In the adding process, the cost values of all track points can be added directly to obtain the cost value of the track; alternatively, a weight is set for each track point, the weight of each track point is multiplied by its cost value to obtain a weighted cost value, and the weighted cost values of all track points are added to obtain the cost value of the track. Specifically, the preset driving operation spans a time period from the start of the operation to its end, and each track within this period is formed by the track points at each moment. The closer a track point is to the start of the operation, the greater the influence of its cost value on the preset driving operation, so its weight is higher, and the cost value calculated with the weights is more accurate.
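Both ways of adding track point cost values can be sketched in a few lines of Python. The application only states that points closer to the start of the operation receive higher weights; the exponential decay used in decaying_weights below is an assumption chosen for illustration, as are the function names.

```python
from typing import List, Optional, Sequence


def decaying_weights(n: int, decay: float = 0.5) -> List[float]:
    """Hypothetical weighting: the first point gets 1.0, each later point less."""
    return [decay ** k for k in range(n)]


def trajectory_cost(point_costs: Sequence[float],
                    weights: Optional[Sequence[float]] = None) -> float:
    """Sum the point costs directly, or as a weighted sum if weights are given."""
    if weights is None:
        return sum(point_costs)
    if len(weights) != len(point_costs):
        raise ValueError("one weight is required per track point")
    return sum(w * c for w, c in zip(weights, point_costs))
```

With point costs ordered from the start of the operation to its end, passing decaying_weights(len(point_costs)) makes early points dominate the total, while omitting the weights reproduces the plain sum.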

Step 403, determining the target driving track according to the respective cost values of the plurality of tracks.

Specifically, the target driving track can be determined in various ways.

As an optional embodiment, the respective cost values of the plurality of tracks are sorted, for example from low to high. The lower the cost value of a track, the earlier its ranking and the lower the cost of performing the preset driving operation; the higher the cost value, the later its ranking and the higher the cost of performing the preset driving operation. A preset number of top-ranked tracks are taken as the target driving tracks, the preset number being one or more. Sorting makes the cost values of all tracks directly comparable, which reduces errors in determining the target driving track and improves the accuracy of the determination.
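The ranking-based selection can be sketched as follows, assuming the cost value of each track is already known and keyed by a track name. The function name select_lowest_cost and the dictionary representation are illustrative assumptions.

```python
from typing import Dict, List


def select_lowest_cost(costs: Dict[str, float], preset_number: int = 1) -> List[str]:
    """Sort tracks by cost from low to high and keep the first preset_number."""
    if preset_number < 1:
        raise ValueError("preset_number must be one or more")
    ranked = sorted(costs.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:preset_number]]
```

Because Python's sorted is stable, tracks with equal cost keep their original relative order, so the selection is deterministic.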

As an optional embodiment, a preset cost threshold is set, and the respective cost values of the plurality of tracks are compared with the preset cost threshold; each track whose cost value is lower than the preset cost threshold is determined as a target driving track.
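The threshold-based selection is equally short in Python. As in the text, only tracks whose cost value is strictly lower than the threshold are kept; the function name and the dictionary of named track costs are assumptions made for this sketch.

```python
from typing import Dict, List


def select_below_threshold(costs: Dict[str, float], threshold: float) -> List[str]:
    """Keep every track whose cost value is strictly lower than the threshold."""
    return [name for name, cost in costs.items() if cost < threshold]
```

Unlike ranking, this variant may return no track at all when every cost value reaches the threshold, so a caller would need to handle an empty result.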

With this method, each track is split into track points, and the cost value of each track is obtained from the cost values of its track points, which ensures the accuracy of the cost value of the track. On the basis of accurate cost values, an accurate and reasonable target driving track can be determined from the cost values of all tracks, so that wrong tracks and the resulting accidents are avoided and driving safety is further ensured.

Exemplary devices

Fig. 5 illustrates a block diagram of a trajectory planning apparatus 500 according to an embodiment of the present application.

As shown in fig. 5, the trajectory planning apparatus 500 according to an embodiment of the present application includes: a processing module 501, configured to invoke a preset model to process driving data corresponding to the mobile device, so as to obtain a first cost map; an obtaining module 502, configured to map relevant scene data corresponding to the mobile device to obtain a second cost map; a first determining module 503, configured to determine a spatiotemporal cost map based on the first cost map and the second cost map; and a second determining module 504, configured to determine a target driving trajectory corresponding to the mobile device according to the target driving strategy corresponding to the mobile device and the spatio-temporal cost map.
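The data flow between the four modules can be sketched as a small Python class. This is only a structural illustration: each module is modelled as a plain callable, and the class name TrajectoryPlanner, its field names and the plan method are hypothetical rather than part of the application.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class TrajectoryPlanner:
    """Hypothetical composition of modules 501 to 504 as callables."""
    processing: Callable[[Any], Any]               # 501: driving data -> first cost map
    obtaining: Callable[[Any], Any]                # 502: scene data -> second cost map
    first_determining: Callable[[Any, Any], Any]   # 503: both maps -> space-time cost map
    second_determining: Callable[[Any, Any], Any]  # 504: strategy + map -> target track

    def plan(self, driving_data: Any, scene_data: Any, strategy: Any) -> Any:
        first_map = self.processing(driving_data)
        second_map = self.obtaining(scene_data)
        spacetime_map = self.first_determining(first_map, second_map)
        return self.second_determining(strategy, spacetime_map)
```

Keeping each module behind a callable makes it possible to swap, for example, the preset model in module 501 without touching the rest of the pipeline.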

In one example, the preset model includes: a maximum entropy nonlinear deep inverse reinforcement learning model; the processing module 501 is specifically configured to process the driving data of the mobile device according to the maximum entropy nonlinear deep inverse reinforcement learning model to obtain an inverse reinforcement learning cost map.

In one example, the first determining module 503 is specifically configured to render the boundary and/or the obstacle in the second cost map into the first cost map, so as to obtain the spatio-temporal cost map.
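One possible reading of "rendering" the second cost map into the first is sketched below in Python. It assumes that the first cost map has one 2D layer per moment, that the second cost map is a single static 2D layer of the same size, and that rendering means taking the cell-wise maximum so that boundaries and obstacles override lower learned costs. The application does not specify the merge rule, so the use of max and the function name are assumptions.

```python
from typing import List

Layer = List[List[float]]


def render_static_into_layers(first: List[Layer], second: Layer) -> List[Layer]:
    """Merge the static layer into every time layer by cell-wise maximum (assumed rule)."""
    merged: List[Layer] = []
    for layer in first:
        if len(layer) != len(second) or any(
            len(row) != len(static_row) for row, static_row in zip(layer, second)
        ):
            raise ValueError("cost maps must have the same spatial size")
        merged.append([
            [max(cost, static_cost) for cost, static_cost in zip(row, static_row)]
            for row, static_row in zip(layer, second)
        ])
    return merged
```

The function returns new layers and leaves both inputs unchanged, so the first cost map can still be inspected separately after merging.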

Fig. 6 illustrates an example block diagram of the second determination module 504 according to an embodiment of this application. As shown in fig. 6, in one example, the second determining module 504 includes: the sampling module 601 is configured to sample the space-time cost map according to the target driving strategy to obtain a sampled sample cost map; a generating module 602, configured to generate a plurality of tracks according to the cost map of the sampled samples; a third determining module 603, configured to determine the target driving trajectory from the plurality of trajectories according to the spatiotemporal cost map.

Fig. 7 illustrates an example block diagram of the third determining module 603 according to an embodiment of this application. As shown in fig. 7, in one example, the third determining module 603 includes: a fourth determining module 701, configured to determine track points of the multiple tracks; a fifth determining module 702, configured to determine respective cost values of the multiple trajectories from the spatio-temporal cost map according to respective trajectory points of the multiple trajectories; a sixth determining module 703, configured to determine the target driving trajectory according to respective cost values of the multiple trajectories.

In one example, the fifth determining module 702 includes: a seventh determining module, configured to determine, according to respective trajectory points of the multiple trajectories, a cost value of each trajectory point in the multiple trajectories from the spatio-temporal cost map; and an adding module, configured to add the cost values of the trajectory points corresponding to each trajectory in the multiple trajectories to obtain the respective cost value of each trajectory.

In one example, the sixth determining module 703 includes: a sorting module, configured to sort the respective cost values of the multiple trajectories; and a fifth determining module, configured to take a preset number of top-ranked trajectories as the target driving trajectories.

In one example, the sixth determining module 703 includes: a comparison module, configured to compare the respective cost values of the multiple trajectories with a preset cost threshold; and a sixth determining module, configured to determine each trajectory whose cost value is lower than the preset cost threshold as a target driving trajectory.

Exemplary Mobile device

FIG. 8 illustrates a block diagram of a mobile device according to an embodiment of the present application.

As shown in fig. 8, the mobile device (electronic device 10) includes one or more processors 11 and memory 12.

The processor 11 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the mobile device to perform desired functions.

Memory 12 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 11 to implement the trajectory planning method of the various embodiments of the present application described above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.

In one example, the mobile device may further include: an input device 13 and an output device 14, which are interconnected by a bus system and/or other form of connection mechanism (not shown).

For example, when the mobile device is a first device or a second device, the input device 13 may be a microphone or a microphone array for capturing an input signal of a sound source. When the mobile device is a stand-alone device, the input device 13 may be a communication network connector for receiving the acquired input signals from the first device and the second device.

The input device 13 may also include, for example, a keyboard, a mouse, and the like.

The output device 14 may output various information including determined distance information, direction information, and the like to the outside. The output devices 14 may include, for example, a display, speakers, a printer, and a communication network and its connected remote output devices, among others.

Of course, for simplicity, only some of the components of the mobile device relevant to the present application are shown in fig. 8, omitting components such as buses, input/output interfaces, and the like. In addition, the mobile device may include any other suitable components, depending on the particular application.

Exemplary computer program product and computer-readable storage Medium

In addition to the above-described methods and apparatus, embodiments of the present application may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the trajectory planning method according to various embodiments of the present application described in the "exemplary methods" section of this specification, supra.

The computer program product may be written with program code for performing the operations of embodiments of the present application in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.

Furthermore, embodiments of the present application may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform the steps in the trajectory planning method according to various embodiments of the present application described in the "exemplary methods" section above in this specification.

The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The foregoing describes the general principles of the present application in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present application are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present application. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the foregoing disclosure is not intended to be exhaustive or to limit the disclosure to the precise details disclosed.

The block diagrams of devices, apparatuses, and systems referred to in this application are only given as illustrative examples and are not intended to require or imply that the connections, arrangements, configurations, etc. must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the phrase "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".

It should also be noted that in the devices, apparatuses, and methods of the present application, the components or steps may be decomposed and/or recombined. These decompositions and/or recombinations are to be considered as equivalents of the present application.

The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the application to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims

1. A trajectory planning method, the method comprising:

processing driving data corresponding to the mobile device according to a preset model to obtain a first cost map;

mapping relevant scene data corresponding to the mobile device to obtain a second cost map;

determining a spatiotemporal cost map based on the first cost map and the second cost map;

determining a target driving track corresponding to the mobile device according to a target driving strategy corresponding to the mobile device and the spatiotemporal cost map;

the first cost map comprises a plurality of layers, and each layer represents the cost map at a respective moment;

the first cost map is used for objectively and accurately predicting the cost to be paid for each track;

the second cost map is used for accurately reflecting the static cost of the obstacle corresponding to each track;

the spatiotemporal cost map reflects the cost of each track accurately and comprehensively in both the spatial and the temporal aspects.

2. The method of claim 1, wherein the preset model comprises: a maximum entropy nonlinear deep inverse reinforcement learning model;

the processing of the driving data corresponding to the mobile device according to the preset model to obtain a first cost map comprises:

processing the driving data of the mobile device according to the maximum entropy nonlinear deep inverse reinforcement learning model to obtain an inverse reinforcement learning cost map.

3. The method of claim 1, wherein the determining a spatiotemporal cost map based on the first cost map and the second cost map comprises:

and rendering the boundary and/or the obstacle in the second cost map into the first cost map to obtain the space-time cost map.

4. The method of claim 1, wherein the determining the target driving trajectory for the mobile device according to the target driving strategy for the mobile device and the spatiotemporal cost map comprises:

sampling from the space-time cost map according to the target driving strategy to obtain a sampling sample cost map;

generating a plurality of tracks according to the sampling sample cost map;

and determining the target driving track from the plurality of tracks according to the space-time cost map.

5. The method of claim 4, wherein the determining the target driving trajectory from the plurality of trajectories according to the spatiotemporal cost map comprises:

determining respective track points of the plurality of tracks;

determining respective cost values of the plurality of tracks from the spatio-temporal cost map according to respective track points of the plurality of tracks;

and determining the target driving track according to the respective cost values of the plurality of tracks.

6. The method of claim 5, wherein determining the cost value for each of the plurality of trajectories from the spatiotemporal cost map based on the trajectory point for each of the plurality of trajectories comprises:

determining a cost value of each track point in the plurality of tracks from the space-time cost map according to the respective track point of the plurality of tracks;

and adding the cost values of each track point corresponding to each track in the plurality of tracks to obtain the respective cost value of each track in the plurality of tracks.

7. The method of claim 5, wherein the determining the target driving trajectory from the respective cost values of the plurality of trajectories comprises:

sorting the respective cost values of the plurality of tracks;

and taking the tracks with the preset number which are ranked in the front as the target driving tracks.

8. A trajectory planning apparatus comprising:

a processing module, configured to invoke a preset model to process driving data corresponding to the mobile device to obtain a first cost map;

an obtaining module, configured to map relevant scene data corresponding to the mobile device to obtain a second cost map;

a first determining module, configured to determine a spatiotemporal cost map based on the first cost map and the second cost map;

and a second determining module, configured to determine a target driving track corresponding to the mobile device according to a target driving strategy corresponding to the mobile device and the spatiotemporal cost map.

9. A mobile device, comprising:

a processor; and a memory having stored therein computer program instructions which, when executed by the processor, cause the processor to perform the method of any one of claims 1-7.

10. A computer-readable storage medium, the storage medium storing a computer program for performing the method of any of the preceding claims 1-7.