CN115471549A - Method, device and equipment for predicting position frame of target in image and storage medium - Google Patents

Method, device and equipment for predicting position frame of target in image and storage medium

Info

Publication number
CN115471549A
Authority
CN
China
Prior art keywords: target, current, frame, vehicle, pixel point
Prior art date
Legal status: Pending
Application number
CN202211029208.7A
Other languages
Chinese (zh)
Inventor
王幸鹏
Current Assignee
China Automotive Innovation Co Ltd
Original Assignee
China Automotive Innovation Co Ltd
Priority date
Filing date
Publication date
Application filed by China Automotive Innovation Co Ltd filed Critical China Automotive Innovation Co Ltd
Priority to CN202211029208.7A
Publication of CN115471549A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/70: Determining position or orientation of objects or cameras
    • G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/30: Subject of image; Context of image processing
    • G06T2207/30248: Vehicle exterior or interior
    • G06T2207/30252: Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method, a device, equipment and a storage medium for predicting the position frame of a target in an image, relates to the technical field of automatic driving, and can improve the prediction accuracy of the position frame of a target in an image. The scheme comprises the following steps: acquiring a first target pixel point from a first position frame of a first reference frame, wherein the first reference frame is the image of the frame immediately preceding the current moment; and obtaining the corresponding prediction frame of the target object in the current frame image according to the first target pixel point, the acquired current vehicle position and current vehicle yaw angle at the current moment, the first vehicle position and first vehicle yaw angle at the first moment corresponding to the first reference frame, the first target device position, the first height and the first width of the first position frame, the preset calibration parameters of the acquisition device, and the preset second target world position. The second target pixel point is the pixel point corresponding to the first target pixel point in the previous frame image of the first reference frame.

Description

Method, device and equipment for predicting position frame of target in image and storage medium
Technical Field
The present application relates to the field of automatic driving technologies, and in particular, to a method, an apparatus, a device, and a storage medium for predicting a position frame of a target in an image.
Background
With progress in artificial intelligence, electronic information, automatic control, intelligent manufacturing and other technologies, automobile automatic driving technology has developed rapidly. In the matching and tracking stage of automatic driving, the position of the position frame of a target object, especially a target object such as a pedestrian or a vehicle, in the current image frame needs to be predicted so that the current track information of the target object can be matched.
At present, a prediction model is generally trained with the position frame information of the target object in historical image frames, and the trained model then predicts the position of the position frame of the target object in the current image frame. However, this method suffers from low prediction accuracy in scenes the model was not trained on.
Disclosure of Invention
The application provides a method, a device, equipment and a storage medium for predicting the position frame of a target in an image, which can improve the prediction accuracy of the position frame of the target in the image.
To achieve this purpose, the following technical scheme is adopted:
in a first aspect of the embodiments of the present application, a method for predicting a position frame of a target in an image is provided, where the method includes: acquiring a first target pixel point from a first position frame of a first reference frame, wherein the first reference frame is the image of the frame immediately preceding the current moment, the image is acquired by an acquisition device on a vehicle, the first position frame is the bounding box of the target object obtained after target detection is performed on the first reference frame, and the first target pixel point is any point on the bottom edge of the first position frame;
obtaining the corresponding prediction frame of the target object in the current frame image according to the first target pixel point, the acquired current vehicle position and current vehicle yaw angle at the current moment, the first vehicle position and first vehicle yaw angle at the first moment corresponding to the first reference frame, the first target device position, the first height and the first width of the first position frame, the preset calibration parameters of the acquisition device, and the preset second target world position; the first target device position is the position of the first target pixel point in the device coordinate system of the acquisition device, the second target world position is the position of the second target pixel point in the world coordinate system, and the second target pixel point is the pixel point corresponding to the first target pixel point in the previous frame image of the first reference frame.
In one embodiment, obtaining a corresponding prediction frame of a target object in a current frame image according to a first target pixel point, an obtained current vehicle position and a current vehicle yaw angle at a current time, a first vehicle position and a first vehicle yaw angle at a first time corresponding to a first reference frame, a first target device position, a first height of a first position frame, and a first width of the first position frame, and according to a preset calibration parameter of an acquisition device and a preset second target world position, includes:
determining the position of a first target pixel point under a world coordinate system at the current moment according to the calibration parameter, the first vehicle position, the first vehicle yaw angle and the second target world position to obtain the current target world position;
and determining a corresponding prediction frame of the target object in the current frame image according to the current target world position, the current vehicle yaw angle, the first vehicle position, the first vehicle yaw angle, the first target device position, the calibration parameter, the first height and the first width.
In one embodiment, determining the position of the first target pixel point under the world coordinate system at the current moment according to the calibration parameter, the first vehicle position, the first vehicle yaw angle and the second target world position to obtain the current target world position includes:
determining the position of a first target pixel point under a world coordinate system according to the calibration parameter, the first vehicle position and the first vehicle yaw angle to obtain a first target world position;
and predicting the position of the first target pixel point under the world coordinate system at the current moment according to the first target world position and the second target world position to obtain the current target world position.
In one embodiment, determining the position of the first target pixel point in the world coordinate system according to the calibration parameter, the first vehicle position, and the first vehicle yaw angle, to obtain a first target world position, includes:
determining a conversion relation between a vehicle body coordinate system and a world coordinate system of the vehicle at a first moment according to the first vehicle position and the first vehicle yaw angle;
and determining the position of the first target pixel point under the world coordinate system according to the calibration parameters and the conversion relation to obtain the first target world position.
In one embodiment, determining a corresponding prediction frame of the target object in the current frame image according to the current target world position, the current vehicle yaw angle, the first vehicle position, the first vehicle yaw angle, the first target device position, the calibration parameter, the first height and the first width includes:
determining a conversion relation between a vehicle body coordinate system and a world coordinate system of the vehicle at a first moment according to the first vehicle position and the first vehicle yaw angle;
according to the conversion relation, determining the position of the current target world position under the vehicle body coordinate system to obtain the current target vehicle body position, and determining the position of the current vehicle position under the vehicle body coordinate system to obtain the current vehicle body position;
and determining a corresponding prediction frame of the target object in the current frame image according to the current target vehicle body position, the current vehicle yaw angle, the first target device position, the calibration parameter, the first height and the first width.
In one embodiment, determining a corresponding prediction frame of the target object in the current frame image according to the current target vehicle body position, the current vehicle yaw angle, the first target device position, the calibration parameter, the first height and the first width comprises:
determining the position of a first target pixel point relative to a vehicle at the current moment according to the current target vehicle body position and the current vehicle body position to obtain a current target vehicle reference position;
determining a yaw angle difference value according to the current vehicle yaw angle and the first vehicle yaw angle;
correcting the reference position of the current target vehicle according to the yaw angle difference value to obtain the position of the current target vehicle;
determining the current target equipment position of the current target vehicle position under the equipment coordinate system according to the calibration parameters;
and determining a corresponding prediction frame of the target object in the current frame image according to the current target device position, the calibration parameter, the first target device position, the first height and the first width.
In one embodiment, determining a corresponding prediction frame of a target object in a current frame image according to a current target device position, a calibration parameter, a first target device position, a first height and a first width includes:
determining the position of the current target equipment position in the image according to the calibration parameters to obtain the target position of the first target pixel point in the current frame image corresponding to the current moment;
determining the current height and the current width of the first position frame in the current frame image according to the position of the first target device, the first height and the first width;
and obtaining a corresponding prediction frame of the target object in the current frame image according to the current height, the current width and the target position.
In a second aspect of the embodiments of the present application, there is provided an apparatus for predicting a position frame of an object in an image, the apparatus including:
the acquisition module is used for acquiring a first target pixel point from a first position frame of a first reference frame, wherein the first reference frame is the image of the frame immediately preceding the current moment, the image is acquired by an acquisition device on a vehicle, the first position frame is the bounding box of the target object obtained after target detection is performed on the first reference frame, and the first target pixel point is any point on the bottom edge of the first position frame;
the determining module is used for obtaining the corresponding prediction frame of the target object in the current frame image according to the first target pixel point, the acquired current vehicle position and current vehicle yaw angle at the current moment, the first vehicle position and first vehicle yaw angle at the first moment corresponding to the first reference frame, the first target device position, the first height and the first width of the first position frame, the preset calibration parameters of the acquisition device, and the preset second target world position; the first target device position is the position of the first target pixel point in the device coordinate system of the acquisition device, the second target world position is the position of the second target pixel point in the world coordinate system, and the second target pixel point is the pixel point corresponding to the first target pixel point in the previous frame image of the first reference frame.
In a third aspect of the embodiments of the present application, there is provided an electronic device, where the electronic device includes a memory and a processor, and the memory stores a computer program, and the computer program, when executed by the processor, implements the method for predicting a position frame of an object in an image in the first aspect of the embodiments of the present application.
In a fourth aspect of the embodiments of the present application, there is provided a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the method for predicting a position frame of an object in an image in the first aspect of the embodiments of the present application.
The beneficial effects brought by the technical scheme provided by the embodiment of the application at least comprise:
according to the method for predicting the position frame of the target in the image, the corresponding prediction frame of the target object in the current frame image is obtained according to the first target pixel point, the current vehicle position and the current vehicle yaw angle at the current moment, the first vehicle position and the first vehicle yaw angle at the first moment corresponding to the first reference frame, the first target device position, the first height of the first position frame and the first width of the first position frame, the preset calibration parameter of the acquisition device and the preset second target world position. The first reference frame is a last frame image at the current moment, the image is an image acquired by acquisition equipment on a vehicle, the first position frame is a boundary frame of a target object of the first reference frame after target detection, the first target pixel point is any point on the bottom edge of the first position frame, the position of the first target equipment is a position of the first target pixel point under an equipment coordinate system of the acquisition equipment, the position of the second target world is a position of the second target pixel point under a world coordinate system, and the second target pixel point is a pixel point corresponding to the first target pixel point in the last frame image of the first reference frame. According to the prediction method, the position frame of the target in the image is predicted according to the target pixel points in the historical reference image of the target image, the position information of the current vehicle and the data of the yaw angle.
Drawings
Fig. 1 is a schematic structural diagram of a vehicle-mounted terminal according to an embodiment of the present disclosure;
FIG. 2 is a first flowchart of a method for predicting a location frame of a target in an image according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a position frame of a predicted target in an image according to an embodiment of the present disclosure;
FIG. 4 is a flowchart illustrating a second method for predicting a location frame of a target in an image according to an embodiment of the present disclosure;
fig. 5 is a block diagram of an apparatus for predicting a position frame of an object in an image according to an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the following, the terms "first", "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present disclosure, "a plurality" means two or more unless otherwise specified.
In addition, the use of "based on" or "according to" means open and inclusive, as a process, step, calculation, or other action that is "based on" or "according to" one or more conditions or values may in practice be based on additional conditions or values beyond those that are present.
With progress in artificial intelligence, electronic information, automatic control, intelligent manufacturing and other technologies, automobile automatic driving technology has developed rapidly. In the matching and tracking stage of automatic driving, the position of the position frame of the target object in the current image frame needs to be predicted so that the current track information of the target object can be matched.
At present, a prediction model is usually trained with the position frame information of the target object in historical image frames, and the position of the position frame of the target object in the current image frame is then predicted by the model. However, this method suffers from low prediction accuracy in scenes the model was not trained on.
To solve the above problem, an embodiment of the present application provides a method for predicting the position frame of a target in an image: a first target pixel point is acquired from a first position frame of a first reference frame, and the corresponding prediction frame of the target object in the current frame image is then obtained according to the first target pixel point, the acquired current vehicle position and current vehicle yaw angle at the current moment, the first vehicle position and first vehicle yaw angle at the first moment corresponding to the first reference frame, the first target device position, the first height and the first width of the first position frame, the preset calibration parameters of the acquisition device, and the preset second target world position. The first reference frame is the image of the frame immediately preceding the current moment, the image is acquired by an acquisition device on a vehicle, the first position frame is the bounding box of the target object obtained after target detection is performed on the first reference frame, and the first target pixel point is any point on the bottom edge of the first position frame; the first target device position is the position of the first target pixel point in the device coordinate system of the acquisition device, the second target world position is the position of the second target pixel point in the world coordinate system, and the second target pixel point is the pixel point corresponding to the first target pixel point in the previous frame image of the first reference frame. This prediction method predicts the position frame of the target in the image from the target pixel points in the historical reference images together with the position information and yaw angle data of the current vehicle, rather than from a trained prediction model, so the prediction accuracy does not degrade in untrained scenes.
The execution subject of the method for predicting the position frame of a target in an image provided by the embodiments of the application may be an electronic device. Specifically, the electronic device may be a computer device, a terminal device, or a server, where the terminal device may be a vehicle-mounted terminal, any of various personal computers, a notebook computer, a smart phone, a tablet computer, a portable wearable device, or the like; this application does not specifically limit the choice.
Fig. 1 takes a vehicle-mounted terminal as an example of the execution subject and shows a schematic diagram of its internal structure. As shown in fig. 1, the vehicle-mounted terminal includes a processor and a memory connected by a system bus. The processor provides computing and control capabilities. The memory may include a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program, and the computer program can be executed by the processor to implement the steps of the method for predicting the position frame of a target in an image provided by the above embodiments. The internal memory provides a cached operating environment for the operating system and the computer program in the non-volatile storage medium.
Those skilled in the art will appreciate that the structure shown in fig. 1 is only a block diagram of the part of the structure related to the present application and does not limit the vehicle-mounted terminal to which the present application is applied; a specific vehicle-mounted terminal may include more or fewer components than those shown in the figure, combine some components, or arrange the components differently.
Based on the execution subject, the embodiment of the application provides a method for predicting a position frame of a target in an image. As shown in fig. 2, the method comprises the steps of:
step 201, a first target pixel point is obtained from a first position frame of a first reference frame.
The first reference frame is the image of the frame immediately preceding the current moment, the image is acquired by the acquisition device on the vehicle, the first position frame is the bounding box of the target object obtained after target detection is performed on the first reference frame, and the first target pixel point is any point on the bottom edge of the first position frame.
Optionally, the first target pixel point may be a middle point of a bottom edge of the first position frame.
It should be noted that the first target pixel point needs to be converted into the world coordinate system in the present application, and, as shown in fig. 3, the bottom edge of the detection frame of the target object lies on the ground. The selected target pixel point is therefore taken on the bottom edge of the first position frame, so that the ground plane anchors the conversion.
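For concreteness, a minimal Python sketch of this choice of pixel point, assuming detection boxes are given as (x1, y1, x2, y2) pixel corners with y growing downward (a common convention, but an assumption here, not taken from the patent):

```python
def bottom_mid_pixel(box):
    """Midpoint of the bottom edge of a detection box.

    box: (x1, y1, x2, y2) pixel corners, with y increasing downward,
    so y2 is the bottom edge that rests on the ground plane.
    """
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, y2)

# Example: u, v = bottom_mid_pixel((100, 50, 180, 120))  # -> (140.0, 120)
```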
Step 202, obtaining a corresponding prediction frame of the target object in the current frame image according to the first target pixel point, the obtained current vehicle position and the current vehicle yaw angle at the current moment, the first vehicle position and the first vehicle yaw angle at the first moment corresponding to the first reference frame, the first target device position, the first height of the first position frame and the first width of the first position frame, and the preset calibration parameter of the acquisition device and the preset second target world position.
The first target device position is the position of a first target pixel point under the device coordinate system of the acquisition device, the second target world position is the position of a second target pixel point under the world coordinate system, and the second target pixel point is a pixel point corresponding to the first target pixel point in the previous frame image of the first reference frame.
For example, if the first target pixel point is the middle point of the bottom edge of the first position frame, the second target pixel point is the middle point of the bottom edge of the second position frame, and the second position frame is the boundary frame of the target object obtained after the previous frame image of the first reference frame is subjected to target detection.
The current vehicle position at the current moment and the first vehicle position at the first moment are position information transmitted by a Global Positioning System (GPS) on board the vehicle. The current vehicle yaw angle at the current moment and the first vehicle yaw angle at the first moment are yaw angles of the vehicle transmitted by an Inertial Measurement Unit (IMU) of the vehicle.
The calibration parameters of the acquisition device comprise internal parameters and external parameters: the internal parameters describe the relation between the acquisition device and the images it captures, and the external parameters describe the relation between the acquisition device and the vehicle.
Optionally, the position of the first target pixel point in the current frame image can first be predicted according to the acquired current vehicle position, current vehicle yaw angle, first vehicle position and first vehicle yaw angle, together with the preset calibration parameters and the preset second target world position, to obtain the current target pixel point. The height and width of the first position frame in the current frame image are then determined according to the first height, the first width and the first target device position. Finally, the corresponding prediction frame of the target object in the current frame image is obtained from the current target pixel point and the height and width of the position frame in the current frame image.
According to the method for predicting the position frame of a target in an image provided by the embodiments of the application, the first target pixel point is acquired from the first position frame of the first reference frame, and the corresponding prediction frame of the target object in the current frame image is then obtained according to the first target pixel point, the acquired current vehicle position and current vehicle yaw angle at the current moment, the first vehicle position and first vehicle yaw angle at the first moment corresponding to the first reference frame, the first target device position, the first height and the first width of the first position frame, the preset calibration parameters of the acquisition device, and the preset second target world position. The first reference frame is the image of the frame immediately preceding the current moment, the image is acquired by an acquisition device on a vehicle, the first position frame is the bounding box of the target object obtained after target detection is performed on the first reference frame, the first target pixel point is any point on the bottom edge of the first position frame, the first target device position is the position of the first target pixel point in the device coordinate system of the acquisition device, the second target world position is the position of the second target pixel point in the world coordinate system, and the second target pixel point is the pixel point corresponding to the first target pixel point in the previous frame image of the first reference frame. This prediction method predicts the position frame of the target in the image from the target pixel points in the historical reference images together with the position information and yaw angle data of the current vehicle, rather than from a trained prediction model, so the prediction accuracy does not degrade in untrained scenes.
Optionally, as shown in fig. 4, in step 202, according to the first target pixel point, and according to the obtained current vehicle position and current vehicle yaw angle at the current time, the first vehicle position and first vehicle yaw angle at the first time corresponding to the first reference frame, the first target device position, the first height of the first position frame, and the first width of the first position frame, and according to the preset calibration parameter of the acquisition device and the preset second target world position, the process of obtaining the corresponding prediction frame of the target object in the current frame image may be:
step 401, determining the position of the first target pixel point under the world coordinate system at the current moment according to the calibration parameter, the first vehicle position, the first vehicle yaw angle and the second target world position, so as to obtain the current target world position.
Step 402, determining a corresponding prediction frame of the target object in the current frame image according to the current target world position, the current vehicle yaw angle, the first vehicle position, the first vehicle yaw angle, the first target device position, the calibration parameter, the first height and the first width.
It can be understood that, since the current position and yaw angle of the vehicle in world coordinates can be obtained in real time, the position of the target pixel point in the world coordinate system at the current moment needs to be predicted first, and the corresponding prediction frame of the target object in the current frame image is then determined according to the current target world position, the current vehicle yaw angle, the first vehicle position, the first vehicle yaw angle, the first target device position, the calibration parameter, the first height and the first width.
Optionally, in step 401, the position of the first target pixel point at the current time in the world coordinate system is determined according to the calibration parameter, the first vehicle position, the first vehicle yaw angle, and the second target world position, and the process of obtaining the current target world position may be:
and then predicting the position of the first target pixel point under the world coordinate system at the current moment according to the first target world position and the second target world position to obtain the current target world position.
Specifically, the determining the position of the first target pixel point in the world coordinate system according to the calibration parameter, the first vehicle position, and the first vehicle yaw angle may include: and determining a conversion relation between a vehicle body coordinate system of the vehicle and a world coordinate system at a first moment according to the first vehicle position and the first vehicle yaw angle, and then determining the position of a first target pixel point under the world coordinate system according to the calibration parameter and the conversion relation to obtain a first target world position.
Optionally, in step 402, according to the current target world position, the current vehicle yaw angle, the first vehicle position, the first vehicle yaw angle, the first target device position, the calibration parameter, the first height, and the first width, the process of determining the prediction frame corresponding to the target object in the current frame image may be:
according to the first vehicle position and the first vehicle yaw angle, determining a conversion relation between a vehicle body coordinate system and a world coordinate system of the vehicle at a first moment, then determining the position of a current target world position under the vehicle body coordinate system according to the conversion relation to obtain a current target vehicle body position, determining the position of the current vehicle position under the vehicle body coordinate system to obtain a current vehicle body position, and finally determining a corresponding prediction frame of the target object in the current frame image according to the current target vehicle body position, the current vehicle yaw angle, the first target equipment position, the calibration parameters, the first height and the first width.
Specifically, according to the current target vehicle body position, the current vehicle yaw angle, the first target device position, the calibration parameter, the first height, and the first width, the process of determining the prediction frame corresponding to the target object in the current frame image may be: determining the position of a first target pixel point relative to a vehicle at the current moment according to the current target vehicle body position and the current vehicle body position to obtain a current target vehicle reference position, determining a yaw angle difference value according to the current vehicle yaw angle and the first vehicle yaw angle, correcting the current target vehicle reference position according to the yaw angle difference value to obtain the current target vehicle position, determining the current target equipment position of the current target vehicle position under an equipment coordinate system according to calibration parameters, and finally determining a corresponding prediction frame of a target object in a current frame image according to the current target equipment position, the calibration parameters, the first target equipment position, the first height and the first width.
Further, the process of determining the corresponding prediction frame of the target object in the current frame image according to the current target device position, the calibration parameter, the first target device position, the first height, and the first width may be: and determining the position of the current target equipment position in the image according to the calibration parameters to obtain the target position of the first target pixel point in the current frame image corresponding to the current moment, then determining the current height and the current width of the first position frame in the current frame image according to the first target equipment position, the first height and the first width, and finally obtaining a corresponding prediction frame of the target object in the current frame image according to the current height, the current width and the target position.
To facilitate understanding by those skilled in the art, the method for predicting the position frame of a target in an image provided by the present application is described below taking a vehicle-mounted terminal as the execution subject. Specifically, the method includes:
(1) Acquire a first target pixel point from the first position frame of the first reference frame.
The first reference frame is the image of the frame immediately preceding the current moment, the image is acquired by the acquisition device on the vehicle, the first position frame is the bounding box of the target object obtained after target detection is performed on the first reference frame, and the first target pixel point is any point on the bottom edge of the first position frame.
In actual implementation, for a target object M with existing history track information, the first target pixel point cb is obtained from the first reference frame, and its pixel coordinates are (u, v).
(2) Determine the conversion relation between the vehicle body coordinate system of the vehicle and the world coordinate system at the first moment according to the first vehicle position and the first vehicle yaw angle.
The first vehicle position is the position of the vehicle in the world coordinate system at the first moment corresponding to the first reference frame, and the first vehicle yaw angle is the yaw angle of the vehicle at the first moment. The first vehicle position is position information transmitted by the vehicle-mounted Global Positioning System (GPS).
Optionally, the conversion relation includes a translation matrix $T_{t-1}$ and a rotation matrix $R_{t-1}$ between the vehicle body coordinate system and the world coordinate system. For example, the rear axle center of the vehicle may be used as the origin of the vehicle body coordinate system (VCS); the translation matrix $T_{t-1}$ between the vehicle body coordinate system and the world coordinate system at the first moment is then determined from the first vehicle position, and the rotation matrix $R_{t-1}$ is determined from the first vehicle yaw angle.
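As a rough illustration of this conversion relation, the sketch below assembles $R_{t-1}$ and $T_{t-1}$ from a GPS position and an IMU yaw angle. It assumes planar motion with a Z-up world frame and yaw measured about the Z axis; the function and variable names are illustrative, not taken from the patent:

```python
import numpy as np

def pose_to_rt(position_xyz, yaw):
    """Conversion relation between the vehicle body coordinate system (VCS,
    origin at the rear-axle center) and the world coordinate system:
    P_world = R @ P_vcs + T.  Planar motion is assumed, so yaw is a
    rotation about the world Z axis."""
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    T = np.asarray(position_xyz, dtype=float)
    return R, T
```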
(3) Determine the position of the first target pixel point in the world coordinate system according to the calibration parameters and the conversion relation to obtain the first target world position.
The calibration parameters are those of the acquisition device. Specifically, they comprise an internal parameter $K$ and external parameters $R_{cam}$, $T_{cam}$: the internal parameter relates the acquisition device to the images it captures, and the external parameters relate the acquisition device to the vehicle. The first target world position is the position of the first target pixel point in the world coordinate system.
Optionally, the coordinates of the first target pixel point in the vehicle body coordinate system, denoted $P^{vcs}_{t-1}$, may be obtained by calculation according to formula (1):

$$s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \left( R_{cam} P^{vcs}_{t-1} + T_{cam} \right) \qquad (1)$$

Since the first target pixel point lies on the ground, the ground-plane constraint makes formula (1) solvable, and the value $s$ of the first target pixel point $(u, v)$ along the Z axis of the camera coordinate system is obtained from the same calculation.
Then, through the rotation matrix $R_{t-1}$ and the translation matrix $T_{t-1}$ at the first moment, formula (2) converts the point from the VCS coordinate system to the world coordinate system to obtain the first target world position $P^{w}_{t-1}$:

$$P^{w}_{t-1} = R_{t-1} P^{vcs}_{t-1} + T_{t-1} \qquad (2)$$
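A sketch of formulas (1) and (2) under the same assumptions: a pinhole model with intrinsics $K$, extrinsics $(R_{cam}, T_{cam})$ mapping VCS points into the camera frame, and the bottom-edge point constrained to the ground plane $z = 0$ in the VCS. The solving step for the depth $s$ is an assumption consistent with, but not spelled out in, the text:

```python
import numpy as np

def pixel_to_world(u, v, K, R_cam, T_cam, R_t1, T_t1):
    """Back-project the bottom-edge pixel (u, v) to the vehicle body frame
    with the ground-plane constraint (formula (1)), then map it to the
    world frame (formula (2))."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray direction in camera coords
    Rt = R_cam.T
    # Choose the camera-frame depth s so that the VCS z-coordinate is zero:
    # (Rt @ (s * ray - T_cam))[2] == 0
    s = (Rt @ T_cam)[2] / (Rt @ ray)[2]
    P_vcs = Rt @ (s * ray - T_cam)                  # formula (1) solved for P_vcs
    P_world = R_t1 @ P_vcs + T_t1                   # formula (2)
    return P_vcs, P_world, s
```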
(4) Predict the position of the first target pixel point in the world coordinate system at the current moment according to the first target world position and the second target world position to obtain the current target world position.
The second target world position is the position of the second target pixel point in the world coordinate system, and the second target pixel point is the pixel point corresponding to the first target pixel point in the previous frame image of the first reference frame.
For example, if the first target pixel point is the middle point of the bottom edge of the first position frame, the second target pixel point is the middle point of the bottom edge of the second position frame, and the second position frame is the boundary frame of the target object obtained after the previous frame image of the first reference frame is subjected to target detection.
It can be understood that, since the present application performs continuous prediction over a plurality of image frames of a video, the second target world position has already been calculated during the prediction of the previous frame; specifically, its determination follows steps (1) to (3) above.
Optionally, the above process may be: calculate the displacement of the first target pixel point from the first target world position $P^{w}_{t-1}$ and the second target world position $P^{w}_{t-2}$, and divide this displacement by the time interval between the first moment corresponding to the first reference frame and the second moment corresponding to the previous frame image of the first reference frame to obtain the speed of the first target pixel point. Then, from this speed and the time interval between the current moment and the first moment, obtain the displacement of the first target pixel point from the first moment to the current moment, and add it to the first target world position to obtain the current target world position.
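A minimal sketch of this constant-velocity extrapolation; the timestamps and variable names are illustrative assumptions:

```python
import numpy as np

def predict_current_world(P_w_t1, P_w_t2, t1, t2, t_now):
    """Predict the current target world position from the first target
    world position (at t1) and the second target world position
    (at t2 < t1) by constant-velocity extrapolation."""
    velocity = (np.asarray(P_w_t1) - np.asarray(P_w_t2)) / (t1 - t2)
    return np.asarray(P_w_t1) + velocity * (t_now - t1)
```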
(5) Determine the position of the current target world position in the vehicle body coordinate system according to the conversion relation to obtain the current target vehicle body position, and determine the position of the current vehicle position in the vehicle body coordinate system to obtain the current vehicle body position.
Let the current vehicle position, i.e. the position of the vehicle in the world coordinate system at the current moment, be $E^{w}_{t}$. Through the rotation matrix $R_{t-1}$ and the translation matrix $T_{t-1}$ at the first moment, the current vehicle position $E^{w}_{t}$ and the current target world position $P^{w}_{t}$ are respectively transferred into the VCS coordinate system, yielding the current target vehicle body position $P^{vcs}_{t}$ and the current vehicle body position $E^{vcs}_{t}$. The specific conversion is the inverse of formula (2).
(6) Determine the position of the first target pixel point relative to the vehicle at the current moment according to the current target vehicle body position and the current vehicle body position to obtain the current target vehicle reference position.
Optionally, the current target vehicle reference position is obtained by calculating $\Delta P_{t} = P^{vcs}_{t} - E^{vcs}_{t}$.
(7) Determine the yaw angle difference value according to the current vehicle yaw angle and the first vehicle yaw angle.
The difference between the current vehicle yaw angle $\mathrm{Yaw}_{t}$ and the first vehicle yaw angle $\mathrm{Yaw}_{t-1}$ is calculated to obtain the yaw angle difference $\Delta \mathrm{Yaw}$.
(8) Correct the current target vehicle reference position according to the yaw angle difference value to obtain the current target vehicle position.
Optionally, the current target vehicle position $P'_{t}$ is obtained by rotating the current target vehicle reference position by the yaw angle difference $\Delta \mathrm{Yaw}$ according to formula (3):

$$P'_{t} = \begin{bmatrix} \cos\Delta\mathrm{Yaw} & -\sin\Delta\mathrm{Yaw} & 0 \\ \sin\Delta\mathrm{Yaw} & \cos\Delta\mathrm{Yaw} & 0 \\ 0 & 0 & 1 \end{bmatrix} \Delta P_{t} \qquad (3)$$
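Steps (5) to (8) can be sketched together as below, reusing pose_to_rt from above. The sign of the yaw correction is an assumption, since the original formula (3) is rendered only as an image:

```python
import numpy as np

def target_relative_to_current_vehicle(P_w_now, E_w_now, R_t1, T_t1,
                                       yaw_now, yaw_t1):
    """Steps (5)-(8): take the predicted target point and the current
    vehicle position into the first-moment VCS (inverse of formula (2)),
    difference them, and rotate by the yaw change (formula (3))."""
    P_vcs = R_t1.T @ (P_w_now - T_t1)     # current target vehicle body position
    E_vcs = R_t1.T @ (E_w_now - T_t1)     # current vehicle body position
    delta = P_vcs - E_vcs                 # current target vehicle reference position
    d_yaw = yaw_t1 - yaw_now              # sign convention assumed
    c, s = np.cos(d_yaw), np.sin(d_yaw)
    Rz = np.array([[c, -s, 0.0],
                   [s,  c, 0.0],
                   [0.0, 0.0, 1.0]])
    return Rz @ delta                     # current target vehicle position
```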
(9) Determine the current target device position of the current target vehicle position in the device coordinate system according to the calibration parameters.
Optionally, through the external parameters $R_{cam}$, $T_{cam}$ of the acquisition device, the current target vehicle position $P'_{t}$ is transferred into the camera coordinate system to obtain the current target device position $P^{cam}_{t} = R_{cam} P'_{t} + T_{cam}$.
(10) Determine the position of the current target device position in the image according to the calibration parameters to obtain the target position of the first target pixel point in the current frame image at the current moment.
Optionally, the current target device position $P^{cam}_{t}$ may be converted into the image coordinate system through the internal parameter $K$ of the acquisition device, obtaining the target position new_cb of the first target pixel point in the current frame image at the current moment.
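Steps (9) and (10) in the same sketch style, again assuming the pinhole conventions used for formula (1):

```python
import numpy as np

def vehicle_to_pixel(P_vehicle, K, R_cam, T_cam):
    """Steps (9)-(10): map the corrected target position into the camera
    (device) coordinate system with the extrinsics, then project it with
    the intrinsics to obtain new_cb = (u_t, v_t)."""
    P_cam = R_cam @ P_vehicle + T_cam    # current target device position
    uvw = K @ P_cam
    u_t, v_t = uvw[0] / uvw[2], uvw[1] / uvw[2]
    return (u_t, v_t), P_cam[2]          # pixel, plus depth z_t for box scaling
```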
(11) Determine the current height and the current width of the first position frame in the current frame image according to the first target device position, the first height and the first width.
The first target device position is the position of the first target pixel point in the device coordinate system of the acquisition device. Optionally, the first target device position may be obtained by converting the first target pixel point through the internal parameter $K$.
It can be understood that, because the distance from the target object to the acquisition device differs at different moments, the size of the target object in the image, and hence the size of its position frame, differs at different moments. The height and width of the position frame are inversely proportional to the distance between the target object and the acquisition device, so the current height $H_{t}$ and current width $W_{t}$ of the position frame of the target object in the current frame image can be calculated from the current target device position together with the first height $H_{t-1}$ and the first width $W_{t-1}$ of the first position frame:

$$H_{t} = H_{t-1} \cdot \frac{z_{t-1}}{z_{t}} \qquad (4)$$
$$W_{t} = W_{t-1} \cdot \frac{z_{t-1}}{z_{t}} \qquad (5)$$

where $z_{t-1}$ and $z_{t}$ are the depths (Z components) of the first target device position and the current target device position respectively.
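A sketch of formulas (4) and (5), plus the final box assembly described in step (12) below. Taking $z_{t-1}$ and $z_{t}$ as the depth components of the first and current target device positions is an assumption consistent with the proportionality argument above:

```python
def predict_box(new_cb, H_t1, W_t1, z_t1, z_t):
    """Scale the previous box size by the depth ratio (formulas (4), (5))
    and place the predicted box so its bottom-edge midpoint is new_cb."""
    H_t = H_t1 * z_t1 / z_t
    W_t = W_t1 * z_t1 / z_t
    u_t, v_t = new_cb
    # (x1, y1, x2, y2) with y increasing downward
    return (u_t - W_t / 2.0, v_t - H_t, u_t + W_t / 2.0, v_t)
```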
(12) Obtain the corresponding prediction frame of the target object in the current frame image according to the current height, the current width and the target position.
Since the pixel coordinates of the first target pixel point in the current frame image have been predicted, and the current height and current width of the first position frame in the current frame image have been calculated, the corresponding prediction frame of the target object in the current frame image can be obtained.
Steps (1) to (12) illustrate the method for predicting the position frame of a target in an image with a single target object; the position frames of the other target objects contained in the image can be predicted in the same way. Fig. 3 is a schematic diagram of a position frame predicted by the method provided by the present application: the frame indicated by the dotted line is the position frame in the first reference frame, and the frame indicated by the solid line is the predicted position frame in the current frame image.
It should be understood that, although the steps in the flowcharts of the above embodiments are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the order of execution is not strictly limited and the steps may be performed in other orders. Moreover, at least some of the steps in the flowcharts may comprise multiple sub-steps or stages, which need not be completed at the same time but may be performed at different times, and whose order of execution need not be sequential; they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
As shown in fig. 5, an embodiment of the present application further provides an apparatus for predicting a position frame of an object in an image, where the apparatus includes:
the acquiring module 11 is configured to acquire a first target pixel point from a first position frame of a first reference frame, where the first reference frame is a previous frame image at a current time, the image is an image acquired by an acquisition device on a vehicle, the first position frame is a boundary frame of a target object after the first reference frame passes through target detection, and the first target pixel point is any point on a bottom edge of the first position frame;
the determining module 12 is configured to obtain a corresponding prediction frame of the target object in the current frame image according to the first target pixel point, the obtained current vehicle position and the current vehicle yaw angle at the current time, the first vehicle position and the first vehicle yaw angle at the first time corresponding to the first reference frame, the first target device position, the first height of the first position frame, and the first width of the first position frame, and according to a preset calibration parameter of the acquisition device and a preset second target world position; the first target device position is the position of a first target pixel point under the device coordinate system of the acquisition device, the second target world position is the position of a second target pixel point under the world coordinate system, and the second target pixel point is a pixel point corresponding to the first target pixel point in the previous frame image of the first reference frame.
In one embodiment, the determining module 12 is specifically configured to:
determining the position of a first target pixel point under a world coordinate system at the current moment according to the calibration parameter, the first vehicle position, the first vehicle yaw angle and the second target world position to obtain the current target world position;
and determining a corresponding prediction frame of the target object in the current frame image according to the current target world position, the current vehicle yaw angle, the first vehicle position, the first vehicle yaw angle, the first target device position, the calibration parameter, the first height and the first width.
In one embodiment, the determining module 12 is specifically configured to:
determining the position of a first target pixel point under a world coordinate system according to the calibration parameter, the first vehicle position and the first vehicle yaw angle to obtain a first target world position;
and predicting the position of the first target pixel point under the world coordinate system at the current moment according to the first target world position and the second target world position to obtain the current target world position.
In one embodiment, the determining module 12 is specifically configured to:
determining a conversion relation between a vehicle body coordinate system and a world coordinate system of the vehicle at a first moment according to the first vehicle position and the first vehicle yaw angle;
and determining the position of the first target pixel point under the world coordinate system according to the calibration parameters and the conversion relation to obtain the first target world position.
In one embodiment, the determining module 12 is specifically configured to:
determining a conversion relation between a vehicle body coordinate system and a world coordinate system of the vehicle at a first moment according to the first vehicle position and the first vehicle yaw angle;
according to the conversion relation, determining the position of the current target world position under the vehicle body coordinate system to obtain the current target vehicle body position, and determining the position of the current vehicle position under the vehicle body coordinate system to obtain the current vehicle body position;
and determining a corresponding prediction frame of the target object in the current frame image according to the current target vehicle body position, the current vehicle yaw angle, the first target device position, the calibration parameter, the first height and the first width.
In one embodiment, the determining module 12 is specifically configured to:
determining the position of a first target pixel point relative to a vehicle at the current moment according to the current target vehicle body position and the current vehicle body position to obtain a current target vehicle reference position;
determining a yaw angle difference value according to the current vehicle yaw angle and the first vehicle yaw angle;
correcting the reference position of the current target vehicle according to the yaw angle difference value to obtain the position of the current target vehicle;
determining the current target equipment position of the current target vehicle position under the equipment coordinate system according to the calibration parameters;
and determining a corresponding prediction frame of the target object in the current frame image according to the current target device position, the calibration parameter, the first target device position, the first height and the first width.
In one embodiment, the determining module 12 is specifically configured to:
determining the position of the current target equipment position in the image according to the calibration parameters to obtain the target position of the first target pixel point in the current frame image corresponding to the current moment;
determining the current height and the current width of the first position frame in the current frame image according to the position of the first target device, the first height and the first width;
and obtaining a corresponding prediction frame of the target object in the current frame image according to the current height, the current width and the target position.
The device for predicting a position frame of a target in an image according to this embodiment may implement the method embodiments described above, and the implementation principle and the technical effect are similar, which are not described herein again.
For the specific definition of the prediction device of the position frame of the target in the image, reference may be made to the above definition of the prediction method of the position frame of the target in the image, and details are not described here. The above-mentioned modules in the prediction apparatus of the location frame of the target in the image can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or be independent of a processor of the electronic device, and can also be stored in a memory of the electronic device in a software form, so that the processor can call and execute operations corresponding to the modules.
In another embodiment of the present application, there is also provided an electronic device, including a memory and a processor, where the memory stores a computer program, and the computer program is executed by the processor to implement the steps of the method for predicting a position frame of an object in an image according to the embodiment of the present application.
In another embodiment of the present application, a computer-readable storage medium is further provided, on which a computer program is stored, and the computer program, when executed by a processor, implements the steps of the method for predicting the position frame of the target in the image according to the embodiment of the present application.
In another embodiment of the present application, a computer program product is also provided, where the computer program product includes computer instructions, when the computer instructions are executed on a device for predicting a position frame of an object in an image, the device for predicting a position frame of an object in an image is caused to perform the steps performed by the method for predicting a position frame of an object in an image in the method flow shown in the above method embodiment.
In the above embodiments, the implementation may be realized wholly or partially by software, hardware, firmware, or any combination thereof. When software is used, the embodiments may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another by wire (e.g., coaxial cable, optical fiber, Digital Subscriber Line (DSL)) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a server or data center, integrating one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Drive (SSD)), among others.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; nevertheless, as long as a combination of technical features contains no contradiction, it should be considered within the scope of this specification.
The above examples express only several embodiments of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A method for predicting a position frame of a target in an image, the method comprising:
acquiring a first target pixel point from a first position frame of a first reference frame, wherein the first reference frame is the frame image immediately preceding the current moment, the image is acquired by an acquisition device on a vehicle, the first position frame is the bounding box of a target object obtained by target detection on the first reference frame, and the first target pixel point is any point on the bottom edge of the first position frame;
obtaining a prediction frame corresponding to the target object in the current frame image according to the first target pixel point, the obtained current vehicle position and current vehicle yaw angle at the current moment, a first vehicle position and a first vehicle yaw angle at a first moment corresponding to the first reference frame, a first target device position, a first height of the first position frame, a first width of the first position frame, a preset calibration parameter of the acquisition device, and a preset second target world position; wherein the first target device position is the position of the first target pixel point in the device coordinate system of the acquisition device, the second target world position is the position of a second target pixel point in the world coordinate system, and the second target pixel point is the pixel point corresponding to the first target pixel point in the frame image immediately preceding the first reference frame.
2. The method according to claim 1, wherein the obtaining a prediction frame corresponding to the target object in the current frame image according to the first target pixel point, the obtained current vehicle position and current vehicle yaw angle at the current moment, the first vehicle position and first vehicle yaw angle at the first moment corresponding to the first reference frame, the first target device position, the first height of the first position frame, the first width of the first position frame, the preset calibration parameter of the acquisition device, and the preset second target world position comprises:
determining the position of the first target pixel point in the world coordinate system at the current moment according to the calibration parameter, the first vehicle position, the first vehicle yaw angle and the second target world position to obtain a current target world position;
and determining the prediction frame corresponding to the target object in the current frame image according to the current target world position, the current vehicle yaw angle, the first vehicle position, the first vehicle yaw angle, the first target device position, the calibration parameter, the first height and the first width.
3. The method according to claim 2, wherein the determining the position of the first target pixel point in the world coordinate system at the current moment according to the calibration parameter, the first vehicle position, the first vehicle yaw angle and the second target world position to obtain the current target world position comprises:
determining the position of the first target pixel point in the world coordinate system according to the calibration parameter, the first vehicle position and the first vehicle yaw angle to obtain a first target world position;
and predicting the position of the first target pixel point in the world coordinate system at the current moment according to the first target world position and the second target world position to obtain the current target world position.
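One way to realise the prediction step of claim 3 is linear extrapolation from the two known world positions. The uniform-motion assumption in the sketch below is an illustrative reading, not stated in the claim.

import numpy as np

def predict_current_world(first_world, second_world):
    # first_world: the pixel point's world position at the first moment
    # (previous frame); second_world: its world position one frame earlier.
    # Linear extrapolation assuming uniform motion over one frame interval:
    # p_t = p_{t-1} + (p_{t-1} - p_{t-2}) = 2 * p_{t-1} - p_{t-2}.
    p1 = np.asarray(first_world, dtype=float)
    p2 = np.asarray(second_world, dtype=float)
    return 2.0 * p1 - p2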
4. The method according to claim 3, wherein the determining the position of the first target pixel point in the world coordinate system according to the calibration parameter, the first vehicle position and the first vehicle yaw angle to obtain a first target world position comprises:
determining a conversion relation between the vehicle body coordinate system of the vehicle at the first moment and the world coordinate system according to the first vehicle position and the first vehicle yaw angle;
and determining the position of the first target pixel point in the world coordinate system according to the calibration parameter and the conversion relation to obtain the first target world position.
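The conversion relation in claim 4 can be illustrated as a planar rigid transform built from the vehicle position and yaw angle. Treating the pose as planar (SE(2)) and the matrix layout below are assumptions of this sketch.

import numpy as np

def body_to_world_transform(vehicle_xy, yaw):
    # Homogeneous 3x3 transform taking body-frame (x, y) points into the
    # world frame, built from the first vehicle position and yaw angle.
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, vehicle_xy[0]],
                     [s,  c, vehicle_xy[1]],
                     [0.0, 0.0, 1.0]])

def world_to_body_transform(vehicle_xy, yaw):
    # Inverse relation, taking world-frame points into the body frame
    # (the direction used in claim 5 below).
    return np.linalg.inv(body_to_world_transform(vehicle_xy, yaw))

# Usage: the world position of a point 5 m ahead of a vehicle at (10, 2)
# heading along the world y-axis.
T = body_to_world_transform((10.0, 2.0), np.pi / 2.0)
p_world = T @ np.array([5.0, 0.0, 1.0])   # -> approximately (10, 7)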
5. The method according to claim 2, wherein the determining the prediction frame corresponding to the target object in the current frame image according to the current target world position, the current vehicle yaw angle, the first vehicle position, the first vehicle yaw angle, the first target device position, the calibration parameter, the first height and the first width comprises:
determining a conversion relation between the vehicle body coordinate system of the vehicle at the first moment and the world coordinate system according to the first vehicle position and the first vehicle yaw angle;
determining, according to the conversion relation, the position of the current target world position in the vehicle body coordinate system to obtain a current target vehicle body position, and determining the position of the current vehicle position in the vehicle body coordinate system to obtain a current vehicle body position;
and determining the prediction frame corresponding to the target object in the current frame image according to the current target vehicle body position, the current vehicle yaw angle, the first target device position, the calibration parameter, the first height and the first width.
6. The method according to claim 5, wherein the determining the prediction frame corresponding to the target object in the current frame image according to the current target vehicle body position, the current vehicle yaw angle, the first target device position, the calibration parameter, the first height and the first width comprises:
determining the position of the first target pixel point relative to the vehicle at the current moment according to the current target vehicle body position and the current vehicle body position to obtain a current target vehicle reference position;
determining a yaw angle difference value according to the current vehicle yaw angle and the first vehicle yaw angle;
correcting the current target vehicle reference position according to the yaw angle difference value to obtain a current target vehicle position;
determining, according to the calibration parameter, the position of the current target vehicle position in the device coordinate system to obtain a current target device position;
and determining the prediction frame corresponding to the target object in the current frame image according to the current target device position, the calibration parameter, the first target device position, the first height and the first width.
7. The method according to claim 6, wherein the determining the prediction frame corresponding to the target object in the current frame image according to the current target device position, the calibration parameter, the first target device position, the first height and the first width comprises:
projecting the current target device position into the image according to the calibration parameter to obtain a target position of the first target pixel point in the current frame image at the current moment;
determining the current height and the current width of the first position frame in the current frame image according to the first target device position, the first height and the first width;
and obtaining the prediction frame corresponding to the target object in the current frame image according to the current height, the current width and the target position.
8. An apparatus for predicting a position frame of a target in an image, the apparatus comprising:
an acquisition module, configured to acquire a first target pixel point from a first position frame of a first reference frame, wherein the first reference frame is the frame image immediately preceding the current moment, the image is acquired by an acquisition device on a vehicle, the first position frame is the bounding box of a target object obtained by target detection on the first reference frame, and the first target pixel point is any point on the bottom edge of the first position frame;
a determining module, configured to obtain a prediction frame corresponding to the target object in the current frame image according to the first target pixel point, the obtained current vehicle position and current vehicle yaw angle at the current moment, the first vehicle position and first vehicle yaw angle at the first moment corresponding to the first reference frame, the first target device position, the first height of the first position frame, the first width of the first position frame, the preset calibration parameter of the acquisition device, and the preset second target world position; wherein the first target device position is the position of the first target pixel point in the device coordinate system of the acquisition device, the second target world position is the position of a second target pixel point in the world coordinate system, and the second target pixel point is the pixel point corresponding to the first target pixel point in the frame image immediately preceding the first reference frame.
9. An electronic device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, implements the method for predicting a position frame of a target in an image according to any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method for predicting a position frame of a target in an image according to any one of claims 1 to 7.
CN202211029208.7A 2022-08-25 2022-08-25 Method, device and equipment for predicting position frame of target in image and storage medium Pending CN115471549A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211029208.7A CN115471549A (en) 2022-08-25 2022-08-25 Method, device and equipment for predicting position frame of target in image and storage medium


Publications (1)

Publication Number Publication Date
CN115471549A 2022-12-13

Family

ID=84369095

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211029208.7A Pending CN115471549A (en) 2022-08-25 2022-08-25 Method, device and equipment for predicting position frame of target in image and storage medium

Country Status (1)

Country Link
CN (1) CN115471549A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination