WO2021218693A1 - Image processing method, network training method, and related device - Google Patents

Image processing method, network training method, and related device

Info

Publication number
WO2021218693A1
WO2021218693A1 (PCT application PCT/CN2021/088263, CN2021088263W)
Authority
WO
WIPO (PCT)
Prior art keywords
vehicle
coordinates
image
angle
point
Prior art date
Application number
PCT/CN2021/088263
Other languages
English (en)
French (fr)
Inventor
赵昕海
杨臻
张维
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to EP21797363.5A priority Critical patent/EP4137990A4/en
Publication of WO2021218693A1 publication Critical patent/WO2021218693A1/zh
Priority to US17/975,922 priority patent/US20230047094A1/en

Classifications

    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06V20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V20/64 Three-dimensional objects
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G06V10/82 Arrangements for image or video recognition or understanding using neural networks
    • G06V2201/08 Detecting or categorising vehicles
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N3/045 Combinations of networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/048 Activation functions
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10016 Video; Image sequence
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30252 Vehicle exterior; Vicinity of vehicle
    • G06T2207/30261 Obstacle

Definitions

  • This application relates to the field of artificial intelligence, and in particular to an image processing method, a network training method, and related equipment.
  • Artificial intelligence (AI) is a theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain the best results.
  • Artificial intelligence is a branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning, and decision-making.
  • Research in the field of artificial intelligence includes robotics, natural language processing, computer vision, decision-making and reasoning, human-computer interaction, recommendation and search, and basic AI theory.
  • Autonomous driving is a mainstream application in the field of artificial intelligence.
  • In current solutions, after collecting an image that contains a complete surrounding vehicle, the self-driving car feeds the complete vehicle in the image into a neural network, which outputs information such as the heading angle and size of that vehicle in the vehicle body coordinate system, and the 3D envelope box of the vehicle can then be located.
  • The embodiments of the present application provide an image processing method, a network training method, and related devices. The position information of the three-dimensional 3D envelope box of a vehicle is generated according to the position information of the two-dimensional envelope box of the vehicle, the coordinates of its wheels, and a first angle, which improves the accuracy of the acquired 3D envelope box.
  • an embodiment of the present application provides an image processing method, which can be used in the image processing field of the artificial intelligence field.
  • the method includes: an execution device acquires a first image, the first image includes a first vehicle, and the execution device inputs the first image into an image processing network to obtain a first result output by the image processing network.
  • The first result includes the position information of the two-dimensional 2D envelope box of the first vehicle, the coordinates of the wheels of the first vehicle, and the first angle of the first vehicle.
  • the position information of the 2D envelope frame of the first vehicle may include the coordinates of the center point of the 2D envelope frame and the side length of the 2D envelope frame.
  • the coordinates of the wheels of the first vehicle may refer to the coordinates of the location on the outside of the wheel, the coordinates of the location on the inside of the wheel, or the coordinates of the location in the middle of the thickness of the wheel.
  • The first angle of the first vehicle indicates the angle between the side line of the first vehicle and the first axis of the first image, and the side line of the first vehicle is the line of intersection between the exposed side surface of the first vehicle and the plane on which the first vehicle is located.
  • the first axis of the first image is parallel to one side of the first image.
  • the first axis can be parallel to the U axis of the first image or parallel to the V axis of the first image.
  • The value range of the first angle can be 0 degrees to 360 degrees, or negative 180 degrees to positive 180 degrees.
  • The execution device generates the position information of the three-dimensional 3D envelope box of the first vehicle according to the position information of the 2D envelope box of the first vehicle, the coordinates of the wheels, and the first angle. The position information of the 3D envelope box of the first vehicle includes the coordinates of at least two first points; the at least two first points are located on edges of the 3D envelope box of the first vehicle, two of the at least two first points locate one edge of the 3D envelope box of the first vehicle, and the coordinates of the at least two first points are used to locate the 3D envelope box of the first vehicle.
  • In this implementation, the acquired image is input into the image processing network, which outputs the position information of the two-dimensional envelope box of the vehicle, the coordinates of the wheels, and the first angle; the position information of the three-dimensional 3D envelope box of the first vehicle is then generated from these three parameters, and the 3D envelope box of the vehicle is located. The accuracy of these three parameters does not depend on whether the vehicle in the image is complete, so the coordinates of the first points remain accurate even when the vehicle is only partially visible, and the accuracy of the acquired 3D envelope box is improved; furthermore, the driving intention of surrounding vehicles can be determined more accurately, which improves the driving safety of the autonomous vehicle.
  • In a possible implementation, the at least two first points include the two intersections between the side line of the first vehicle and the 2D envelope box of the first vehicle; further, the at least two first points include the intersection between the side line of the first vehicle and the left boundary of the 2D envelope box of the first vehicle, and the intersection between the side line of the first vehicle and the right boundary of the 2D envelope box of the first vehicle.
  • In the case where only the side surface of the first vehicle is visible in the first image, the first points are the intersections between the side line of the first vehicle and the 2D envelope box, which refines the specific form of the first points in this scenario and improves how well the solution fits the application scenario.
  • In a possible implementation, the execution device generating the position information of the 3D envelope box of the first vehicle according to the position information of the 2D envelope box, the coordinates of the wheels, and the first angle may include: the execution device generates the position information of the side line of the first vehicle according to the coordinates of the wheels of the first vehicle and the first angle of the first vehicle, where the position information of the side line may be the straight-line equation of the side line; the execution device then performs a coordinate generation operation according to the position information of the side line of the first vehicle and the position information of the 2D envelope box of the first vehicle to obtain the coordinates of the at least two first points. Specifically, the execution device can determine the positions of the left and right boundaries of the 2D envelope box from its position information and, using the straight-line equation of the side line, generate the coordinates of the intersection between the side line and the left boundary and the coordinates of the intersection between the side line and the right boundary.
  • In this implementation, the ego vehicle can generate the position information of the side line of the first vehicle from the coordinates of the wheels and the first angle, which is simple to perform, easy to implement, and accurate (a sketch of this computation is given below).
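A minimal sketch of this coordinate generation step, assuming image (U, V) pixel coordinates, a single wheel point, and a 2D envelope box given as center coordinates plus side lengths; all names and numbers are illustrative, and a near-vertical side line would need the general line form rather than a slope.

```python
import math

def side_line_from_wheel(wheel_uv, first_angle_deg):
    """Straight-line equation (slope, intercept) of the vehicle side line: a line through
    the wheel point whose angle to the image U axis equals the first angle."""
    u0, v0 = wheel_uv
    k = math.tan(math.radians(first_angle_deg))   # slope in image coordinates
    b = v0 - k * u0                               # intercept of v = k*u + b
    return k, b

def first_points_from_2d_box(box, side_line):
    """Intersections of the side line with the left and right boundaries of the 2D box."""
    cu, cv, w, h = box                            # center point and side lengths of the box
    k, b = side_line
    u_left, u_right = cu - w / 2.0, cu + w / 2.0
    return [(u_left, k * u_left + b), (u_right, k * u_right + b)]

# Usage with made-up numbers: wheel at (420, 310), side line at 15 degrees to the U axis.
line = side_line_from_wheel((420.0, 310.0), 15.0)
print(first_points_from_2d_box((400.0, 300.0, 180.0, 120.0), line))
```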
  • In a possible implementation, the first result also includes the position information of the dividing line of the first vehicle and the second angle of the first vehicle. The dividing line is the dividing line between the side surface and the main surface, and the main surface of the first vehicle is the front or the back of the first vehicle. The second angle of the first vehicle indicates the angle between the main side line of the first vehicle and the first axis of the first image, where the main side line of the first vehicle is the line of intersection between the exposed main surface of the first vehicle and the ground plane on which the first vehicle is located. The value range of the second angle can be 0 degrees to 360 degrees, or negative 180 degrees to positive 180 degrees.
  • In a possible implementation, the at least two first points include a first intersection, a second intersection, and a third intersection. The first intersection is the intersection of the side line of the first vehicle and the dividing line of the first vehicle, and is a vertex of the 3D envelope box of the first vehicle; the second intersection is the intersection of the side line of the first vehicle and the 2D envelope box of the first vehicle; and the third intersection is the intersection of the main side line of the first vehicle and the 2D envelope box of the first vehicle. This specific form of the first points enriches the application scenarios of this solution and improves the flexibility of implementation.
  • In a possible implementation, the dividing line of the first vehicle passes through the contour of a lamp of the first vehicle, or the dividing line of the first vehicle passes through the center point of a lamp of the first vehicle, or the dividing line of the first vehicle passes through the intersection of the side line of the first vehicle and the main side line of the first vehicle.
  • In a possible implementation, the execution device generating the position information of the three-dimensional 3D envelope box of the first vehicle according to the position information of the 2D envelope box of the first vehicle, the coordinates of the wheels, and the first angle may include: the execution device generates the position information of the side line of the first vehicle according to the coordinates of the wheels of the first vehicle and the first angle of the first vehicle; the execution device generates the coordinates of the first intersection according to the position information of the side line of the first vehicle and the position information of the dividing line of the first vehicle; the execution device generates the coordinates of the second intersection according to the position information of the side line of the first vehicle and the position information of the 2D envelope box of the first vehicle; the execution device generates the position information of the main side line of the first vehicle according to the coordinates of the first intersection and the second angle of the first vehicle, where the position information of the main side line may specifically be the straight-line equation of the main side line; and the execution device generates the coordinates of the third intersection according to the position information of the main side line of the first vehicle and the position information of the 2D envelope box of the first vehicle (see the sketch below).
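Extending the earlier sketch to the case where both a side surface and the main surface are visible. The dividing line is approximated here as a vertical image line at u = u_div, and whether the left or right boundary of the 2D box is used depends on which surfaces are visible; both are simplifying assumptions, and all names are illustrative.

```python
import math

def intersect_vertical(line, u):
    """Intersection of a line v = k*u + b with the vertical line at the given u."""
    k, b = line
    return (u, k * u + b)

def three_first_points(wheel_uv, first_angle_deg, second_angle_deg, u_div, box):
    """First, second, and third intersections described above, with the dividing line
    modelled as the vertical image line u = u_div (a simplification)."""
    cu, cv, w, h = box
    k1 = math.tan(math.radians(first_angle_deg))
    side = (k1, wheel_uv[1] - k1 * wheel_uv[0])         # side line through the wheel point
    p1 = intersect_vertical(side, u_div)                # side line ∩ dividing line (a 3D box vertex)
    p2 = intersect_vertical(side, cu - w / 2.0)         # side line ∩ left boundary of the 2D box
    k2 = math.tan(math.radians(second_angle_deg))
    main = (k2, p1[1] - k2 * p1[0])                     # main side line through the first intersection
    p3 = intersect_vertical(main, cu + w / 2.0)         # main side line ∩ right boundary of the 2D box
    return p1, p2, p3

print(three_first_points((420.0, 310.0), 15.0, -70.0, 520.0, (400.0, 300.0, 280.0, 140.0)))
```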
  • In a possible implementation, the first result includes the position information of the 2D envelope box of the first vehicle, and the main surface includes the front surface.
  • In a possible implementation, the position information of the 2D envelope box includes the coordinates of the center point of the 2D envelope box.
  • In a possible implementation, the first result may further include indication information of the exposed surfaces of the first vehicle in the first image, where the exposed surfaces include one or more of the following: the side surface, the front, and the back; the side surface includes the left side and the right side. The indication information of the exposed surfaces can be expressed as a number sequence or a character string.
  • In a possible implementation, the method may further include: the execution device generates three-dimensional feature information of the first vehicle according to the coordinates of the at least two first points, where the three-dimensional feature information of the first vehicle includes one or more of the following: the heading angle of the first vehicle relative to the ego vehicle, the position information of the centroid of the first vehicle, and the size of the first vehicle.
  • In a possible implementation, the method further includes: in the case where the side surface of the first vehicle is visible in the first image, the execution device generates the heading angle of the first vehicle relative to the ego vehicle according to the coordinates of the first points, where this case includes the case in which only the side surface is visible in the first image and the case in which both the side surface and the main surface are visible in the first image.
  • In this implementation, the heading angle of the first vehicle relative to the ego vehicle can also be generated from the coordinates of the first points, which improves the accuracy of the obtained heading angle.
  • In a possible implementation, before generating the heading angle, the method may further include: the execution device generates the distance between the first point and the ego vehicle according to the coordinates of the first point and the ground plane assumption.
  • The execution device generating the heading angle according to the coordinates of the first point may include: when it is determined, from the distance between the first point and the ego vehicle, that the distance between the first vehicle and the ego vehicle does not exceed a preset threshold, the heading angle is generated from the coordinates of the first point through a first calculation rule; the preset threshold can be 10 meters, 15 meters, 25 meters, or 30 meters. When it is determined, from the distance between the first point and the ego vehicle, that the distance between the first vehicle and the ego vehicle exceeds the preset threshold, the heading angle is generated from the coordinates of the first point through a second calculation rule, where the second calculation rule and the first calculation rule are different calculation rules.
  • In this implementation, different calculation rules are used to generate the heading angle of the first vehicle depending on the distance, which further improves the accuracy of the generated heading angle.
  • In a possible implementation, when the distance between any one of the at least two first points and the ego vehicle does not exceed the preset threshold, the distance between the first vehicle and the ego vehicle is regarded as not exceeding the preset threshold; or, when the distance between any one of the at least two first points and the ego vehicle exceeds the preset threshold, the distance between the first vehicle and the ego vehicle is regarded as exceeding the preset threshold. These two specific implementations for judging whether the distance between the first vehicle and the ego vehicle exceeds the preset threshold improve the implementation flexibility of the solution (a small decision sketch follows).
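A tiny sketch of the rule selection described above, with an illustrative 25-meter threshold; the function name and the returned labels are hypothetical.

```python
def select_rule(point_distances_m, threshold_m=25.0):
    """Pick the near-range (first) rule only if every first point is within the threshold,
    otherwise fall back to the far-range (second) rule."""
    near = all(d <= threshold_m for d in point_distances_m)
    return "first_rule" if near else "second_rule"

print(select_rule([18.4, 22.1]))   # -> first_rule
print(select_rule([18.4, 31.0]))   # -> second_rule
```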
  • In a possible implementation, the execution device generating the heading angle from the coordinates of the first point through the first calculation rule may include: the execution device generates the three-dimensional coordinates of the first point in the vehicle body coordinate system according to the coordinates of the first point and the ground plane assumption, where the origin of the vehicle body coordinate system is located in the ego vehicle; the origin can be the midpoint of the line connecting the two rear wheels of the ego vehicle, or the centroid of the ego vehicle. The execution device then generates the heading angle according to the three-dimensional coordinates of the first point.
  • Because accurate coordinates of the first point can be obtained, and the heading angle is generated from the coordinates of the first point and the ground plane assumption, the accuracy of the generated heading angle is ensured (a sketch of this back-projection is given below).
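A minimal sketch of the ground plane assumption used by the first calculation rule, assuming a pinhole camera with intrinsics K mounted at a known height above a flat ground and aligned with it (no pitch or roll); the transform from camera coordinates to the vehicle body coordinate system is omitted, and all names and numbers are illustrative.

```python
import numpy as np

def backproject_to_ground(point_uv, K, cam_height_m):
    """Ground plane assumption: the pixel is the image of a point on flat ground, so the
    back-projected ray is scaled until its height equals the camera height.
    Assumes a pinhole camera whose Y axis points down, perpendicular to the ground."""
    ray = np.linalg.inv(K) @ np.array([point_uv[0], point_uv[1], 1.0])
    scale = cam_height_m / ray[1]              # reach the plane located cam_height_m below the camera
    return ray * scale                          # 3D point in camera coordinates

def heading_from_edge(p_rear_uv, p_front_uv, K, cam_height_m):
    """Heading of the vehicle in the camera frame, from two first points on its side edge."""
    a = backproject_to_ground(p_rear_uv, K, cam_height_m)
    b = backproject_to_ground(p_front_uv, K, cam_height_m)
    return np.arctan2(b[0] - a[0], b[2] - a[2])  # angle measured in the ground (X-Z) plane

K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])                  # illustrative intrinsics
print(heading_from_edge((420.0, 410.0), (760.0, 395.0), K, cam_height_m=1.5))
```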
  • In a possible implementation, the execution device generating the heading angle from the coordinates of the first point through the second calculation rule may include: the execution device generates the position information of the side line of the first vehicle according to the coordinates of the first point and the first angle of the first vehicle, and generates the coordinates of the vanishing point from the position information of the side line of the first vehicle and the vanishing line of the first image, where the vanishing point is the intersection between the side line of the first vehicle and the vanishing line of the first image. The execution device then generates the heading angle according to the coordinates of the vanishing point and the principle of two-point perspective.
  • Specifically, after obtaining the coordinates of the vanishing point, the ego vehicle generates the heading angle of the first vehicle in the camera coordinate system from the vanishing-point coordinates and the principle of two-point perspective, and then generates the heading angle of the first vehicle in the vehicle body coordinate system of the ego vehicle from the heading angle in the camera coordinate system and a second transformation relationship, where the second transformation relationship is the transformation between the camera coordinate system and the vehicle body coordinate system and can also be called the extrinsic parameters of the camera.
  • This provides a specific way to generate the heading angle of the first vehicle when the distance between the first vehicle and the ego vehicle exceeds the preset threshold and the side surface of the first vehicle is visible in the first image; the operation is simple and efficient (see the sketch below).
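A minimal sketch of the second calculation rule, assuming the vanishing line of the first image is the horizontal line through the principal point; back-projecting the vanishing point through the intrinsics gives the 3D direction of the side line, from which the heading in the camera frame follows. Names and numbers are illustrative.

```python
import numpy as np

def vanishing_point(side_line, horizon_v):
    """Intersection of the vehicle side line v = k*u + b with the vanishing line,
    approximated here as the horizontal image line v = horizon_v."""
    k, b = side_line
    u = (horizon_v - b) / k
    return np.array([u, horizon_v, 1.0])

def heading_from_vanishing_point(vp_homog, K):
    """Two-point perspective: the back-projected vanishing point gives the 3D direction of
    the side line, whose angle in the ground plane is the heading in the camera frame."""
    d = np.linalg.inv(K) @ vp_homog
    return np.arctan2(d[0], d[2])

K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
vp = vanishing_point((0.05, 350.0), horizon_v=360.0)
print(heading_from_vanishing_point(vp, K))
```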
  • In a possible implementation, the execution device generating the heading angle from the coordinates of the first point through the second calculation rule may include: the execution device generates a mapping relationship between the first angle and the heading angle of the first vehicle according to the coordinates of the first point, the first angle of the first vehicle, and the pinhole imaging principle, and then generates the heading angle according to the mapping relationship and the first angle of the first vehicle.
  • In a possible implementation, the method may further include: in the case where only the main surface of the first vehicle is visible in the first image, the execution device generates the heading angle according to the coordinates of the center point of the 2D envelope box of the first vehicle and the pinhole imaging principle.
  • In a possible implementation, the method may further include: the execution device obtains the coordinates of the vertices of the 3D envelope box of the first vehicle from the coordinates of the at least two first points, and generates the three-dimensional coordinates of the centroid of the first vehicle in the vehicle body coordinate system according to the coordinates of those vertices and the ground plane assumption, where the origin of the vehicle body coordinate system is located in the ego vehicle.
  • In this implementation, not only the heading angle of the first vehicle but also the three-dimensional coordinates of the centroid of the first vehicle in the vehicle body coordinate system can be generated from the coordinates of the first points, which expands the application scenarios of this solution and improves the accuracy of the generated centroid coordinates.
  • In a possible implementation, the method may further include: a first value range is preset on the execution device for the U-axis direction of the first image, and a second value range is preset for the V-axis direction of the first image. The execution device determines whether the U-axis value in the coordinates of a first point is within the first value range, and whether the V-axis value in the coordinates of that first point is within the second value range; when both values fall within their ranges, the first point is determined to be a vertex of the 3D envelope box of the first vehicle (a small check of this kind is sketched below).
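A minimal sketch of this vertex check, assuming the preset value ranges are simply the pixel extent of the image; names are illustrative.

```python
def is_box_vertex(point_uv, u_range, v_range):
    """A first point is treated as a vertex of the 3D envelope box only when both of its
    coordinates fall inside the preset value ranges (for example, inside the image)."""
    (u_min, u_max), (v_min, v_max) = u_range, v_range
    u, v = point_uv
    return u_min <= u <= u_max and v_min <= v <= v_max

# Usage: value ranges matching a 1280 x 720 image.
print(is_box_vertex((420.0, 310.0), (0.0, 1280.0), (0.0, 720.0)))
```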
  • In a possible implementation, the method may further include: the execution device obtains the coordinates of a first vertex from the coordinates of the at least two first points, where the first vertex is a vertex of the 3D envelope box of the first vehicle. The execution device generates the three-dimensional coordinates of the first vertex in the vehicle body coordinate system according to the coordinates of the first vertex and the ground plane assumption. If the at least two first points include at least two first vertices, the execution device generates one or more of the following from the three-dimensional coordinates of the first vertices in the vehicle body coordinate system: the length of the first vehicle, the width of the first vehicle, and the height of the first vehicle. The origin of the vehicle body coordinate system is located in the ego vehicle.
  • In this implementation, the size of the first vehicle can also be generated, which further expands the application scenarios of the solution and improves the accuracy of the generated size (a dimension sketch is given below).
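A minimal sketch of deriving the size from envelope-box vertices already expressed in the vehicle body coordinate system; the axis convention (x forward, y left, z up) is an assumption, not stated by the patent, and the numbers are illustrative.

```python
import numpy as np

def vehicle_dimensions(vertices_body):
    """Length / width / height estimates from 3D envelope-box vertices given in the
    vehicle body coordinate system (x forward, y left, z up assumed)."""
    v = np.asarray(vertices_body, dtype=float)
    extent = v.max(axis=0) - v.min(axis=0)       # per-axis span of the available vertices
    return {"length": extent[0], "width": extent[1], "height": extent[2]}

# Two diagonally opposite bottom vertices plus one top vertex (illustrative values, metres).
print(vehicle_dimensions([[10.2, 1.0, 0.0], [14.6, 2.8, 0.0], [10.2, 1.0, 1.5]]))
```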
  • In a possible implementation, the method may further include: if only one first vertex is included in the at least two first points, the execution device acquires a second image, where the second image includes the first vehicle and is captured from a different acquisition angle than the first image. The execution device obtains the coordinates of at least two second points through the image processing network according to the second image; the at least two second points are located on edges of the three-dimensional 3D envelope box of the first vehicle, two of the at least two second points locate one edge of the 3D envelope box of the first vehicle, and the coordinates of the at least two second points are used to locate the 3D envelope box of the first vehicle. The execution device generates the three-dimensional coordinates of a second vertex in the vehicle body coordinate system according to the coordinates of the second points and the ground plane assumption, where the second vertex is a vertex of the 3D envelope box of the first vehicle and is different from the first vertex. The execution device then generates one or more of the following from the three-dimensional coordinates of the first vertex and the three-dimensional coordinates of the second vertex: the length of the first vehicle, the width of the first vehicle, and the height of the first vehicle.
  • In this implementation, another image of the first vehicle is used to jointly generate the size of the first vehicle, which ensures that the size of the first vehicle can be generated under various conditions and improves the comprehensiveness of this solution.
  • The second aspect of the embodiments of the present application provides an image processing method, which can be used in the image processing field of the artificial intelligence field. The method includes: the execution device obtains a first image, where the first image includes a first vehicle; the execution device obtains the position information of the three-dimensional 3D envelope box of the first vehicle from the first image through an image processing network; and the execution device generates three-dimensional feature information of the first vehicle according to the position information of the 3D envelope box of the first vehicle.
  • The three-dimensional feature information of the first vehicle includes one or more of the following: the heading angle of the first vehicle relative to the ego vehicle, the position information of the centroid of the first vehicle, and the size of the first vehicle.
  • The position information of the 3D envelope box of the first vehicle includes the coordinates of at least two first points; the at least two first points are located on edges of the 3D envelope box of the first vehicle, two of the at least two first points locate one edge of the 3D envelope box of the first vehicle, and the coordinates of the at least two first points are used to locate the 3D envelope box of the first vehicle.
  • In a possible implementation, the execution device obtaining the position information of the three-dimensional 3D envelope box of the first vehicle from the first image through the image processing network includes: the execution device inputs the first image into the image processing network to obtain a first result output by the image processing network, where the first result includes the position information of the two-dimensional 2D envelope box of the first vehicle, the coordinates of the wheels of the first vehicle, and the first angle of the first vehicle; the first angle of the first vehicle indicates the angle between the side line of the first vehicle and the first axis of the first image, the side line of the first vehicle is the line of intersection between the exposed side surface of the first vehicle and the plane on which the first vehicle is located, and the first axis of the first image is parallel to one side of the first image. The execution device performs a coordinate generation operation according to the position information of the 2D envelope box of the first vehicle, the coordinates of the wheels, and the first angle to obtain the coordinates of the at least two first points.
  • In a possible implementation, the at least two first points include the two intersections of the side line of the first vehicle and the 2D envelope box of the first vehicle.
  • In a possible implementation, the execution device generating the position information of the 3D envelope box of the first vehicle according to the position information of the 2D envelope box, the coordinates of the wheels, and the first angle includes: the execution device generates the position information of the side line of the first vehicle according to the coordinates of the wheels of the first vehicle and the first angle of the first vehicle, and performs a coordinate generation operation according to the position information of the side line of the first vehicle and the position information of the 2D envelope box of the first vehicle to obtain the coordinates of the at least two first points.
  • In a possible implementation, the first result also includes the position information of the dividing line of the first vehicle and the second angle of the first vehicle. The dividing line is the dividing line between the side surface and the main surface, and the main surface of the first vehicle is the front or the back of the first vehicle. The second angle of the first vehicle indicates the angle between the main side line of the first vehicle and the first axis of the first image, where the main side line of the first vehicle is the line of intersection between the exposed main surface of the first vehicle and the ground plane on which the first vehicle is located.
  • In a possible implementation, the at least two first points include a first intersection, a second intersection, and a third intersection; the first intersection is the intersection of the side line of the first vehicle and the dividing line of the first vehicle and is a vertex of the 3D envelope box of the first vehicle; the second intersection is the intersection of the side line of the first vehicle and the 2D envelope box of the first vehicle; and the third intersection is the intersection of the main side line of the first vehicle and the 2D envelope box of the first vehicle.
  • In a possible implementation, the execution device generating the position information of the three-dimensional 3D envelope box of the first vehicle according to the position information of the 2D envelope box of the first vehicle, the coordinates of the wheels, and the first angle includes: the execution device generates the position information of the side line of the first vehicle according to the coordinates of the wheels of the first vehicle and the first angle of the first vehicle; the execution device generates the coordinates of the first intersection according to the position information of the side line of the first vehicle and the position information of the dividing line of the first vehicle; the execution device generates the coordinates of the second intersection according to the position information of the side line of the first vehicle and the position information of the 2D envelope box of the first vehicle; the execution device generates the position information of the main side line of the first vehicle according to the coordinates of the first intersection and the second angle of the first vehicle; and the execution device generates the coordinates of the third intersection according to the position information of the main side line of the first vehicle and the position information of the 2D envelope box of the first vehicle.
  • In a possible implementation, the method further includes: in the case where the side surface of the first vehicle is visible in the first image, the execution device generates the heading angle of the first vehicle relative to the ego vehicle according to the coordinates of the first points.
  • In a possible implementation, before the execution device generates the heading angle according to the coordinates of the first point, the method further includes: the execution device generates the distance between the first point and the ego vehicle according to the coordinates of the first point and the ground plane assumption.
  • The execution device generating the heading angle according to the coordinates of the first point includes: when the execution device determines, from the distance between the first point and the ego vehicle, that the distance between the first vehicle and the ego vehicle does not exceed a preset threshold, the heading angle is generated from the coordinates of the first point through the first calculation rule; when the execution device determines, from that distance, that the distance between the first vehicle and the ego vehicle exceeds the preset threshold, the heading angle is generated from the coordinates of the first point through the second calculation rule, where the second calculation rule and the first calculation rule are different calculation rules.
  • In a possible implementation, when the distance between any one of the at least two first points and the ego vehicle does not exceed the preset threshold, the distance between the first vehicle and the ego vehicle is regarded as not exceeding the preset threshold; or, when the distance between any one of the at least two first points and the ego vehicle exceeds the preset threshold, the distance between the first vehicle and the ego vehicle is regarded as exceeding the preset threshold.
  • In a possible implementation, the execution device generating the heading angle from the coordinates of the first point through the first calculation rule includes: generating the three-dimensional coordinates of the first point in the vehicle body coordinate system according to the coordinates of the first point and the ground plane assumption, where the origin of the vehicle body coordinate system is located in the ego vehicle, and generating the heading angle according to the three-dimensional coordinates of the first point.
  • In a possible implementation, the execution device generating the heading angle from the coordinates of the first point through the second calculation rule includes: generating the position information of the side line of the first vehicle according to the coordinates of the first point and the first angle of the first vehicle; generating the coordinates of the vanishing point from the position information of the side line of the first vehicle and the vanishing line of the first image, where the vanishing point is the intersection between the side line of the first vehicle and the vanishing line of the first image; and generating the heading angle according to the coordinates of the vanishing point and the principle of two-point perspective.
  • In a possible implementation, the execution device generating the heading angle from the coordinates of the first point through the second calculation rule includes: generating a mapping relationship between the first angle and the heading angle of the first vehicle according to the coordinates of the first point, the first angle of the first vehicle, and the pinhole imaging principle, and generating the heading angle according to the mapping relationship and the first angle of the first vehicle.
  • In a possible implementation, the method further includes: the execution device obtains the coordinates of the vertices of the 3D envelope box of the first vehicle from the coordinates of the at least two first points, and generates the three-dimensional coordinates of the centroid of the first vehicle in the vehicle body coordinate system according to the coordinates of those vertices and the ground plane assumption, where the origin of the vehicle body coordinate system is located in the ego vehicle.
  • In a possible implementation, the method further includes: the execution device obtains the coordinates of a first vertex from the coordinates of the at least two first points, where the first vertex is a vertex of the 3D envelope box of the first vehicle; generates the three-dimensional coordinates of the first vertex in the vehicle body coordinate system according to the coordinates of the first vertex and the ground plane assumption; and, if the at least two first points include at least two first vertices, generates one or more of the following from the three-dimensional coordinates of the first vertices in the vehicle body coordinate system: the length of the first vehicle, the width of the first vehicle, and the height of the first vehicle. The origin of the vehicle body coordinate system is located in the ego vehicle.
  • In a possible implementation, the method further includes: if only one first vertex is included in the at least two first points, the execution device acquires a second image, where the second image includes the first vehicle and is captured from a different acquisition angle than the first image; the execution device obtains the coordinates of at least two second points through the image processing network according to the second image, where the at least two second points are located on edges of the three-dimensional 3D envelope box of the first vehicle, two of the at least two second points locate one edge of the 3D envelope box of the first vehicle, and the coordinates of the at least two second points are used to locate the 3D envelope box of the first vehicle; the execution device generates the three-dimensional coordinates of a second vertex in the vehicle body coordinate system according to the coordinates of the second points and the ground plane assumption, where the second vertex is a vertex of the 3D envelope box of the first vehicle and is different from the first vertex; and the execution device generates one or more of the following from the three-dimensional coordinates of the first vertex and the three-dimensional coordinates of the second vertex: the length of the first vehicle, the width of the first vehicle, and the height of the first vehicle.
  • The third aspect of the embodiments of the present application provides an image processing method, which can be used in the image processing field of the artificial intelligence field. The method includes: the execution device obtains a third image, where the third image includes a first rigid body and the first rigid body is a cube; the execution device inputs the third image into the image processing network to obtain a second result output by the image processing network, where the second result includes the position information of the 2D envelope box of the first rigid body and the first angle of the first rigid body; the first angle of the first rigid body indicates the angle between the side line of the first rigid body and the first axis of the third image, the side line of the first rigid body is the line of intersection between the exposed side surface of the first rigid body and the plane on which the first rigid body is located, and the first axis of the third image is parallel to one side of the third image.
  • The execution device generates the position information of the three-dimensional 3D envelope box of the first rigid body according to the position information of the 2D envelope box of the first rigid body and the first angle. The position information of the 3D envelope box of the first rigid body includes the coordinates of at least two third points; the at least two third points are located on edges of the 3D envelope box of the first rigid body, two of the at least two third points locate one edge of the 3D envelope box of the first rigid body, and the coordinates of the at least two third points are used to locate the 3D envelope box of the first rigid body.
  • In a possible implementation, the execution device generating the position information of the three-dimensional 3D envelope box of the first rigid body according to the position information of the 2D envelope box of the first rigid body and the first angle may include: the ego vehicle generates the coordinates of the lower-left vertex and/or the lower-right vertex of the 2D envelope box of the first rigid body to take the place of the wheel coordinates in the first result of the first aspect, and then generates the position information of the three-dimensional 3D envelope box of the first rigid body according to those coordinates, the position information of the 2D envelope box of the first rigid body, and the first angle.
  • In a possible implementation, the second result may also include the position information of the dividing line of the first rigid body. The ego vehicle can generate one or more of the following items of coordinate information according to the position information of the 2D envelope box of the first rigid body and the position information of the dividing line of the first rigid body: the coordinates of the intersection of the dividing line of the first rigid body and the bottom edge of the 2D envelope box of the first rigid body, and the coordinates of the lower-left vertex and the lower-right vertex of the 2D envelope box of the first rigid body, which take the place of the wheel coordinates in the first result of the first aspect. The ego vehicle then generates the position information of the three-dimensional 3D envelope box of the first rigid body according to the coordinate information generated above, the position information of the 2D envelope box of the first rigid body, and the first angle.
  • In a possible implementation, the second result may also directly include the coordinates of the lower-left vertex and/or the lower-right vertex of the 2D envelope box of the first rigid body in place of the wheel coordinates in the first result of the first aspect; the second result may also include one or more of the following: the coordinates of the intersection of the dividing line of the first rigid body and the bottom edge of the 2D envelope box, and the coordinates of the lower-left vertex and the lower-right vertex of the 2D envelope box of the first rigid body, in place of the wheel coordinates in the first result of the first aspect.
  • In the third aspect, the execution device may also execute the steps in each possible implementation of the first aspect. For the specific implementations of the steps of the third aspect and its possible implementations, and for the beneficial effects they bring, reference may be made to the descriptions of the possible implementations in the first aspect, which are not repeated here.
  • an embodiment of the present application provides a network training method, which can be used in the image processing field of the artificial intelligence field.
  • The method may include: the training device obtains a training image and annotation data of the training image, where the training image includes a second vehicle; in the case where the side surface of the second vehicle is visible in the training image, the annotation data includes the labeled coordinates of the wheels of the second vehicle and the labeled first angle of the second vehicle; the first angle of the second vehicle indicates the angle between the side line of the second vehicle and the first axis of the training image, the side line of the second vehicle is the line of intersection between the exposed side surface of the second vehicle and the plane on which the second vehicle is located, and the first axis of the training image is parallel to one side of the training image.
  • The training device inputs the training image into the image processing network to obtain a third result output by the image processing network, where the third result includes the generated coordinates of the wheels of the second vehicle and the generated first angle of the second vehicle.
  • The training device trains the image processing network using a loss function until the convergence condition of the loss function is met, and then outputs the trained image processing network. The loss function is used to increase the similarity between the generated coordinates and the labeled coordinates, and to increase the similarity between the generated first angle and the labeled first angle (a sketch of such a loss is given below).
  • In this implementation, the trained image processing network can output accurate information, which helps improve the stability of the image processing network. In addition, the labeling rules for the position information of the 2D envelope box, the coordinates of the wheels, and the first angle are simple, which greatly reduces the difficulty of labeling the training data compared with the current approach of labeling with lidar.
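The patent does not fix a particular loss. The following is a minimal sketch, assuming PyTorch tensors for a single vehicle, smooth-L1 terms, and a sin/cos encoding of the angle to avoid the wrap-around at ±180 degrees; all of these are illustrative choices, not the patent's prescribed loss.

```python
import torch
import torch.nn.functional as F

def three_d_feature_loss(pred_wheel_uv, gt_wheel_uv, pred_angle_deg, gt_angle_deg):
    """Pulls the generated wheel coordinates toward the labeled coordinates and the
    generated first angle toward the labeled first angle."""
    coord_loss = F.smooth_l1_loss(pred_wheel_uv, gt_wheel_uv)
    pred_rad = torch.deg2rad(pred_angle_deg)
    gt_rad = torch.deg2rad(gt_angle_deg)
    # Compare angles through sin/cos so that, e.g., 179 and -179 degrees stay close.
    angle_loss = (F.smooth_l1_loss(torch.sin(pred_rad), torch.sin(gt_rad))
                  + F.smooth_l1_loss(torch.cos(pred_rad), torch.cos(gt_rad)))
    return coord_loss + angle_loss

loss = three_d_feature_loss(torch.tensor([418.0, 312.0]), torch.tensor([420.0, 310.0]),
                            torch.tensor(14.0), torch.tensor(15.0))
print(loss)
```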
  • In a possible implementation, the annotation data further includes the labeled position information of the dividing line of the second vehicle and the labeled second angle of the second vehicle, and the third result also includes the generated position information of the dividing line of the second vehicle and the generated second angle of the second vehicle. The loss function is also used to increase the similarity between the generated position information and the labeled position information, and to increase the similarity between the generated second angle and the labeled second angle.
  • The main surface of the second vehicle is the front or the back of the second vehicle, and the dividing line is the dividing line between the side surface and the main surface. The second angle of the second vehicle indicates the angle between the main side line of the second vehicle and the first axis of the training image, where the main side line of the second vehicle is the line of intersection between the exposed main surface of the second vehicle and the ground plane on which the second vehicle is located.
  • In a possible implementation, the image processing network includes a two-stage object detection network and a three-dimensional feature extraction network, and the two-stage object detection network includes a region proposal network. The training device inputting the training image into the image processing network to obtain the third result output by the image processing network includes: the training device inputs the training image into the two-stage object detection network to obtain the position information of the 2D envelope box of the second vehicle output by the two-stage object detection network; the training device then inputs a first feature map into the three-dimensional feature extraction network to obtain the third result output by the three-dimensional feature extraction network, where the first feature map is the portion of the feature map of the training image that lies inside the 2D envelope box output by the region proposal network. The training device outputting the trained image processing network includes: the training device outputs an image processing network that includes the two-stage object detection network and the three-dimensional feature extraction network.
  • Because the accuracy of the 2D envelope box directly output by the region proposal network is relatively low, the accuracy of the first feature map obtained from that 2D envelope box is also relatively low; training on such feature maps increases the difficulty of the training stage and thereby improves the robustness of the trained image processing network (a sketch of this feature-cropping step follows).
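A minimal sketch of the feature-cropping step described above; torchvision's roi_align is used here as one common way to cut a feature map to a proposal box, which is an assumption rather than the patent's stated operator, and the stride and tensor shapes are illustrative.

```python
import torch
from torchvision.ops import roi_align

def crop_first_feature_map(feature_map, proposal_box, output_size=7):
    """Cuts the portion of the training-image feature map that lies inside the 2D box
    proposed by the region proposal network; the crop is fed to the 3D feature head."""
    boxes = [proposal_box.unsqueeze(0)]               # one (x1, y1, x2, y2) box for this image
    return roi_align(feature_map, boxes, output_size=output_size, spatial_scale=1.0 / 16)

features = torch.randn(1, 256, 45, 80)                # backbone feature map (stride 16, illustrative)
box = torch.tensor([320.0, 180.0, 620.0, 400.0])      # proposal box in input-image pixels
first_feature_map = crop_first_feature_map(features, box)
print(first_feature_map.shape)                        # torch.Size([1, 256, 7, 7])
```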
  • an embodiment of the present application provides an image processing device, which can be used in the image processing field of the artificial intelligence field.
  • The device includes an acquisition module, an input module, and a generation module. The acquisition module is used to obtain a first image, where the first image includes a first vehicle. The input module is used to input the first image into an image processing network to obtain a first result output by the image processing network, where the first result includes the position information of the two-dimensional 2D envelope box of the first vehicle, the coordinates of the wheels of the first vehicle, and the first angle of the first vehicle; the first angle of the first vehicle indicates the angle between the side line of the first vehicle and the first axis of the first image, the side line of the first vehicle is the line of intersection between the exposed side surface of the first vehicle and the plane on which the first vehicle is located, and the first axis of the first image is parallel to one side of the first image.
  • The generation module is used to generate the position information of the three-dimensional 3D envelope box of the first vehicle according to the position information of the 2D envelope box of the first vehicle, the coordinates of the wheels, and the first angle; the position information of the 3D envelope box of the first vehicle includes the coordinates of at least two first points, the at least two first points are located on edges of the 3D envelope box of the first vehicle, two of the at least two first points locate one edge of the 3D envelope box of the first vehicle, and the coordinates of the at least two first points are used to locate the 3D envelope box of the first vehicle.
  • the image processing apparatus including various modules can also be used to implement the steps in the various possible implementation manners of the first aspect.
  • an embodiment of the present application provides an image processing device, which can be used in the image processing field of the artificial intelligence field.
• the device includes an acquisition module and a generation module, wherein the acquisition module is used to obtain a first image, and the first image includes the first vehicle; the generation module is used to obtain the position information of the three-dimensional 3D envelope box of the first vehicle from the first image through an image processing network; the generation module is also used to generate the three-dimensional feature information of the first vehicle according to the position information of the 3D envelope box of the first vehicle.
• the three-dimensional feature information of the first vehicle includes one or more of the following: the heading angle of the first vehicle relative to the own vehicle, the position information of the centroid of the first vehicle, and the size of the first vehicle.
  • the image processing apparatus including various modules can also be used to implement the steps in the various possible implementation manners of the second aspect.
• for the specific implementation manners of some steps in the sixth aspect of the embodiments of the present application and the various possible implementation manners of the sixth aspect, and the beneficial effects brought by each possible implementation manner, please refer to the descriptions in the various possible implementation manners of the second aspect, which will not be repeated here.
  • an embodiment of the present application provides an image processing device, which can be used in the image processing field of the artificial intelligence field.
• the device includes an acquisition module, an input module, and a generation module, wherein the acquisition module is used to acquire a third image, the third image includes a first rigid body, and the first rigid body is a cube; the input module is used to input the third image into the image processing network to obtain the second result output by the image processing network.
• the second result includes the position information of the 2D envelope frame of the first rigid body and the first angle of the first rigid body; the first angle of the first rigid body indicates the angle between the side line of the first rigid body and the first axis of the third image.
• the side line of the first rigid body is the line of intersection between the leaked side surface of the first rigid body and the plane where the first rigid body is located, and the first axis of the third image is parallel to one side of the third image;
• the generation module is used to generate the position information of the three-dimensional 3D envelope box of the first rigid body according to the position information of the 2D envelope frame of the first rigid body and the first angle.
• the position information of the 3D envelope box of the first rigid body includes the coordinates of at least two third points; the at least two third points are located on the edges of the 3D envelope box of the first rigid body, the at least two third points are used to locate the edges of the 3D envelope box of the first rigid body, and the coordinates of the at least two third points are used to locate the 3D envelope box of the first rigid body.
  • the image processing apparatus including various modules can also be used to implement the steps in the various possible implementation manners of the third aspect.
• for the specific implementation manners of some steps in the seventh aspect of the embodiments of the present application and the various possible implementation manners of the seventh aspect, and the beneficial effects brought by each possible implementation manner, please refer to the descriptions in the various possible implementation manners of the third aspect, which will not be repeated here.
  • an embodiment of the present application provides an image processing device, which can be used in the image processing field of the artificial intelligence field.
  • the device includes an acquisition module, an input module, and a training module.
  • the acquisition module is used to acquire training images and annotation data of the training images.
  • the training images include the second vehicle.
  • the labeling data includes the labeling coordinates of the wheels of the second vehicle and the labeling first angle of the second vehicle.
  • the first angle of the second vehicle indicates the angle between the side line of the second vehicle and the first axis of the training image,
  • the side line of the second vehicle is the line of intersection between the leaked side surface of the second vehicle and the plane where the second vehicle is located.
• the first axis of the training image is parallel to one side of the training image; the input module is used to input the training image into the image processing network to obtain the third result output by the image processing network.
• the third result includes the generated coordinates of the wheels of the second vehicle and the generated first angle of the second vehicle; the training module is used to train the image processing network with a loss function according to the annotation data and the third result, until a convergence condition of the loss function is met, and to output the trained image processing network.
• the loss function is used to increase the similarity between the generated coordinates and the labeled coordinates, and between the generated first angle and the labeled first angle.
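• A minimal sketch of such a loss is shown below; it assumes a smooth-L1 term on the wheel coordinates and an L1 term on the wrapped angle difference, since the embodiment does not prescribe a specific loss form (the function name and tensor shapes are illustrative assumptions).

```python
import torch
import torch.nn.functional as F

def training_loss(pred_wheel, gt_wheel, pred_angle, gt_angle):
    """Toy loss pulling generated wheel coordinates and first angle toward the labels.

    pred_wheel, gt_wheel: (N, 2) pixel coordinates of the wheel point.
    pred_angle, gt_angle: (N,) first angle in radians.
    The concrete loss form is an assumption; the text only requires that the loss
    increases the similarity between generated and labeled values.
    """
    coord_loss = F.smooth_l1_loss(pred_wheel, gt_wheel)
    # Wrap the angle difference so that, e.g., -179 deg and +179 deg count as close.
    diff = torch.atan2(torch.sin(pred_angle - gt_angle), torch.cos(pred_angle - gt_angle))
    angle_loss = diff.abs().mean()
    return coord_loss + angle_loss
```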
  • the image processing apparatus includes various modules that can also be used to implement the steps in the various possible implementation manners of the fourth aspect.
• for the specific implementation manners of some steps in the eighth aspect of the embodiments of the present application and the various possible implementation manners of the eighth aspect, and the beneficial effects brought by each possible implementation manner, reference may be made to the descriptions in the various possible implementation manners of the fourth aspect, which will not be repeated here.
• an embodiment of the present application provides an execution device, which may include a processor, the processor is coupled to a memory, and the memory stores program instructions; when the program instructions stored in the memory are executed by the processor, the image processing method described in the first aspect is implemented, or the image processing method described in the second aspect is implemented, or the image processing method described in the third aspect is implemented.
• for the steps executed by the execution device in each possible implementation manner of the first aspect, the second aspect, or the third aspect that are executed by the processor, please refer to the foregoing first aspect, second aspect, or third aspect for details, which will not be repeated here.
  • an embodiment of the present application provides an autonomous driving vehicle, which may include a processor, the processor is coupled to a memory, and the memory stores program instructions.
• when the program instructions stored in the memory are executed by the processor, the image processing method described in the first aspect is implemented, or the image processing method described in the second aspect is implemented, or the image processing method described in the third aspect is implemented.
• for the steps executed by the execution device in each possible implementation manner of the first aspect, the second aspect, or the third aspect that are executed by the processor, please refer to the foregoing first aspect, second aspect, or third aspect for details, which will not be repeated here.
• an embodiment of the present application provides a training device, which may include a processor, the processor is coupled to a memory, and the memory stores program instructions; when the program instructions stored in the memory are executed by the processor, the network training method described in the fourth aspect is implemented.
• for the steps executed by the training device in each possible implementation manner of the fourth aspect that are executed by the processor, please refer to the fourth aspect for details, which will not be repeated here.
• the embodiments of the present application provide a computer-readable storage medium in which a computer program is stored; when the computer program runs on a computer, the computer is caused to execute the image processing method according to the first aspect, the second aspect, or the third aspect, or the computer is caused to execute the network training method according to the fourth aspect.
• an embodiment of the present application provides a circuit system, the circuit system includes a processing circuit, and the processing circuit is configured to execute the image processing method described in the first, second, or third aspect, or the processing circuit is configured to execute the network training method described in the fourth aspect.
• the embodiments of the present application provide a computer program that, when run on a computer, causes the computer to execute the image processing method described in the first, second, or third aspect, or causes the computer to execute the network training method described in the fourth aspect.
• an embodiment of the present application provides a chip system including a processor, which is used to support a server or an image processing device to implement the functions involved in the above aspects, for example, sending or processing the data and/or information involved in the above methods.
  • the chip system further includes a memory, and the memory is used to store necessary program instructions and data for the server or the communication device.
  • the chip system can be composed of chips, and can also include chips and other discrete devices.
  • FIG. 1 is a schematic structural diagram of an artificial intelligence main frame provided by an embodiment of the application.
  • FIG. 2 is a system architecture diagram of an image processing system provided by an embodiment of the application
  • FIG. 3 is a schematic flowchart of an image processing method provided by an embodiment of the application.
  • FIG. 4 is a schematic diagram of the first result in the image processing method provided by the embodiment of the application.
  • FIG. 5 is another schematic diagram of the first result in the image processing method provided by the embodiment of the application.
  • FIG. 6 is a schematic diagram of the first point in the image processing method provided by the embodiment of the application.
  • FIG. 7 is another schematic diagram of the first point in the image processing method provided by the embodiment of the application.
  • FIG. 8 is a schematic flowchart of a network training method provided by an embodiment of this application.
  • FIG. 9 is a schematic diagram of another flow of a network training method provided by an embodiment of this application.
  • FIG. 10 is a schematic flowchart of another network training method provided by an embodiment of this application.
  • FIG. 11 is a schematic diagram of another flow chart of an image processing method provided by an embodiment of the application.
  • FIG. 12 is a schematic diagram of a 3D envelope box in an image processing method provided by an embodiment of the application.
  • FIG. 13 is a schematic structural diagram of an image processing device provided by an embodiment of the application.
  • FIG. 14 is a schematic diagram of another structure of an image processing apparatus provided by an embodiment of the application.
  • FIG. 15 is a schematic structural diagram of a network training device provided by an embodiment of this application.
  • FIG. 16 is a schematic structural diagram of an execution device provided by an embodiment of this application.
  • FIG. 17 is a schematic diagram of a structure of an autonomous vehicle provided by an embodiment of the application.
  • FIG. 18 is a schematic diagram of a structure of a training device provided by an embodiment of the application.
  • FIG. 19 is a schematic diagram of a structure of a chip provided by an embodiment of the application.
• the embodiment of the present application provides an image processing method, a network training method, and related equipment; the position information of the three-dimensional 3D envelope box of the first vehicle is generated according to the position information of the two-dimensional envelope frame of the first vehicle, the coordinates of the wheels, and the first angle, which improves the accuracy of the acquired 3D envelope box.
  • Figure 1 shows a schematic diagram of the main framework of artificial intelligence.
  • the "intelligent information chain” reflects a series of processes from data acquisition to processing. For example, it can be the general process of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision-making, intelligent execution and output. In this process, the data has gone through the condensing process of "data-information-knowledge-wisdom".
  • the "IT value chain” from the underlying infrastructure of human intelligence, information (providing and processing technology realization) to the industrial ecological process of the system, reflects the value that artificial intelligence brings to the information technology industry.
  • the infrastructure provides computing power support for the artificial intelligence system, realizes communication with the outside world, and realizes support through the basic platform.
  • computing power is provided by smart chips
• the aforementioned smart chips include but are not limited to hardware acceleration chips such as the central processing unit (CPU), the embedded neural-network processing unit (NPU), the graphics processing unit (GPU), the application specific integrated circuit (ASIC), and the field programmable gate array (FPGA);
• the basic platform includes platform guarantee and support related to the distributed computing framework and the network, and can include cloud storage and computing, the interconnection network, and so on.
  • sensors communicate with the outside to obtain data, and these data are provided to the smart chip in the distributed computing system provided by the basic platform for calculation.
  • the data in the upper layer of the infrastructure is used to represent the data source in the field of artificial intelligence.
  • the data involves graphics, images, voice, text, and IoT data of traditional devices, including business data of existing systems and sensory data such as force, displacement, liquid level, temperature, and humidity.
  • Data processing usually includes data training, machine learning, deep learning, search, reasoning, decision-making and other methods.
  • machine learning and deep learning can symbolize and formalize data for intelligent information modeling, extraction, preprocessing, training, etc.
  • Reasoning refers to the process of simulating human intelligent reasoning in a computer or intelligent system, using formal information to conduct machine thinking and solving problems based on reasoning control strategies.
  • the typical function is search and matching.
  • Decision-making refers to the process of making decisions after intelligent information is reasoned, and usually provides functions such as classification, ranking, and prediction.
• some general capabilities can be formed based on the results of the data processing, such as an algorithm or a general system, for example, translation, text analysis, computer vision processing, speech recognition, and so on.
• Intelligent products and industry applications refer to the products and applications of artificial intelligence systems in various fields; they are an encapsulation of the overall artificial intelligence solution, productizing intelligent information decision-making and realizing landing applications. The application fields mainly include: intelligent terminals, intelligent manufacturing, intelligent transportation, smart home, smart medical care, smart security, autonomous driving, safe city, and so on.
• a rigid body refers to an object whose shape and size remain unchanged during movement and after being subjected to a force, and in which the relative positions of internal points remain unchanged; the aforementioned rigid body can specifically be a vehicle on the road, a roadblock, or another type of rigid body, and so on.
• the embodiment of the present application can be applied to a scene where the heading angles of vehicles around the own vehicle (that is, the self-driving vehicle where the user is located) are estimated: the 3D envelope boxes of the vehicles around the own vehicle can be located first, and the points on the edges of the 3D envelope boxes can then be used to generate the heading angles of the vehicles around the own vehicle.
• the embodiment of the present application can also be applied to a scene where the positions of roadblocks around the own vehicle are estimated: the 3D envelope box of a roadblock around the own vehicle can be located first, and the points on the edges of the 3D envelope box can then be used to generate the location information of the roadblocks around the own vehicle, and so on.
  • FIG. 2 is a system architecture diagram of the image processing system provided by the embodiment of the application.
  • the image processing The system 200 includes an execution device 210, a training device 220, a database 230, and a data storage system 240.
  • the execution device 210 includes a calculation module 211.
  • a training data set is stored in the database 230.
  • the training data set includes multiple training images and annotation data of each training image.
• the training device 220 generates a target model/rule 201 for image processing, and uses the training data set in the database 230 to perform iterative training on the target model/rule 201 to obtain a mature target model/rule 201.
  • the image processing network obtained by the training device 220 can be applied to different systems or devices, such as autonomous vehicles, mobile phones, tablets, smart home appliances, monitoring systems, and so on.
  • the execution device 210 can call data, codes, etc. in the data storage system 240, and can also store data, instructions, etc. in the data storage system 240.
  • the data storage system 240 may be placed in the execution device 210, or the data storage system 240 may be an external memory relative to the execution device 210.
• the calculation module 211 can process the image collected by the execution device 210 through the image processing network to obtain the position information of the 2D envelope frame of the first rigid body in the image and the first angle.
• the first angle indicates the angle between the side line of the first rigid body and the first axis of the image; the side line of the first rigid body is the line of intersection between the leaked side surface of the first rigid body and the plane where the first rigid body is located, and the first axis of the image is parallel to one side of the image.
• the execution device 210 may then generate the coordinates of points on the edges of the 3D envelope box of the first rigid body according to the position information of the two-dimensional (2D) envelope box of the first rigid body and the first angle, so as to locate the 3D envelope box of the first rigid body. Since the accuracy of the position information of the 2D envelope box and of the first angle is independent of whether the rigid body in the image is complete, the obtained coordinates of the first points are accurate regardless of whether the rigid body in the image is complete, so that the located 3D envelope boxes are all accurate.
  • the "user" can directly interact with the execution device 210, that is, the execution device 210 and the client device are integrated in the same device.
• FIG. 2 is only a schematic architecture diagram of the image processing system provided by the embodiment of the present application, and the positional relationship between the devices, components, modules, etc. shown in the figure does not constitute any limitation.
  • the execution device 210 and the client device may be independent devices.
• the execution device 210 is equipped with an input/output interface for data interaction with the client device; the "user" can input the collected image through the client device to the input/output interface, and the execution device 210 returns the coordinates of the first points to the client device through the input/output interface.
  • an embodiment of the present application provides an image processing method, which can be applied to the execution device 210 shown in FIG. 2.
  • the self-vehicle can be pre-configured with a trained image processing network.
  • the first image is input into the image processing network to obtain the first result output by the image processing network.
• the first result includes the position information of the 2D envelope frame of the first vehicle, the coordinates of the wheels of the first vehicle, and the first angle indicating the angle between the side line of the first vehicle and the first axis of the first image.
• the coordinates of the first points are then generated according to the first result; a first point refers to a point on an edge of the 3D envelope box of the first vehicle, and the coordinates of the first points are used to locate the 3D envelope box of the first vehicle.
  • the embodiment of the present application includes an inference phase and a training phase, and the processes of the inference phase and the training phase are different. The following describes the inference phase and the training phase respectively.
  • the reasoning stage describes how the execution device 210 uses a mature image processing network to locate the 3D envelope box of the first vehicle in the first image.
  • the self-vehicle can estimate the 3D feature information such as the orientation angle, the position of the centroid point, and/or the size of the first vehicle.
  • FIG. 3 is a schematic flowchart of an image processing method provided in an embodiment of this application.
  • the image processing method provided in an embodiment of this application may include:
  • the own vehicle may be equipped with a camera device for image collection, so that the own vehicle can perform image collection through the aforementioned camera device to obtain the first image.
  • the aforementioned camera equipment includes but is not limited to cameras, capture cards, radars or other types of camera equipment, etc.
  • a first image may include one or more first vehicles and the environment where the first vehicles are located.
  • the first image may be an independent image or a video frame in the video.
• if the own vehicle is equipped with a monocular camera system, the first image can be collected through the monocular camera system; if the own vehicle is equipped with a binocular camera system, the first image may be either one of the two images collected by the binocular camera system; if the own vehicle is equipped with a multi-eye camera system, the first image may be any one of the multiple images collected by the multi-eye camera system.
  • the vehicle inputs the first image into the image processing network to obtain the first result output by the image processing network.
• a mature image processing network is pre-configured on the own vehicle. After the first image is obtained, the first image is input into the image processing network to obtain one or more sets of first results output by the image processing network; the number of first results is consistent with the number of first vehicles in the first image, and a set of first results is used to indicate the characteristic information of one first vehicle.
• each set of first results may include the position information of the 2D envelope frame of the first vehicle, the coordinates of the wheels of the first vehicle, and the first angle of the first vehicle; the foregoing main surface refers to the front or the back.
• the position information of the 2D envelope frame of the first vehicle may include the coordinates of the center point of the 2D envelope frame and the side lengths of the 2D envelope frame. Because the wheels of a vehicle have a certain thickness, the coordinates of the wheels of the first vehicle may refer to the coordinates of the ground-contact location on the outer side of the wheel, the coordinates of the ground-contact location on the inner side of the wheel, the coordinates of the ground-contact location in the middle of the wheel's thickness, and so on.
  • the coordinates of the wheels of the first vehicle may include the coordinates of one wheel or two wheels, and the specific situation may be determined by the actual captured image.
  • the coordinates of the aforementioned wheels and the coordinates of the center point may correspond to the same coordinate system, and the origin of the coordinate system may be any vertex of the first image, or the center point of the first image, or Other points in the first image are not limited here.
  • the two coordinate axes of the coordinate system are the U axis and the V axis of the first image.
• the first angle of the first vehicle indicates the angle between the side line of the first vehicle and the first axis of the first image; the side line of the first vehicle is the line of intersection between the leaked side surface of the first vehicle and the plane where the first vehicle is located.
• the first axis of the first image is parallel to one side of the first image, and the first axis can be parallel to the U axis of the first image or parallel to the V axis of the first image.
• furthermore, the value range of the first angle can be 0 degrees to 360 degrees, or negative 180 degrees to positive 180 degrees, which is not limited here.
  • FIG. 4 is a schematic diagram of the first result in the image processing method provided by the embodiment of the application.
  • the coordinates of the wheel are the coordinates of the location on the outside of the wheel
  • the first axis is the U-axis of the first image as an example.
  • A1 represents the 2D envelope of the first vehicle
  • A2 represents the coordinates of the wheels of the first vehicle
  • A3 represents the U axis of the first image
  • A4 represents the V axis of the first image
  • A5 represents the side line of the first vehicle
• A6 represents the first angle of the first vehicle. It should be understood that the example in FIG. 4 is only to facilitate understanding of the solution, and is not used to limit the solution.
  • the first result may also include the position information of the boundary line of the first vehicle and the second angle of the first vehicle.
  • the dividing line is the dividing line between the side surface and the main surface. If the first vehicle leaks the front and side surfaces in the first image, the dividing line of the first vehicle is the dividing line between the leaking front surface and the side surface; if The first vehicle leaks the back and side in the first image, and the boundary of the first vehicle is the boundary between the leaked back and the side.
  • the front of the first vehicle refers to the front of the first vehicle
  • the back of the first vehicle refers to the rear of the first vehicle.
• the dividing line of the first vehicle may be a line passing through the outer contour of the lights of the first vehicle, or the dividing line of the first vehicle may be a line passing through the center points of the lights of the first vehicle, or the dividing line of the first vehicle may be a line passing through the intersection of the side line of the first vehicle and the main side line of the first vehicle.
  • the boundary between the main surface and the side surface of the first vehicle may also be determined based on other information, which is not limited here.
  • the position information of the dividing line can be expressed as a numerical value.
• the numerical value can be the distance between the dividing line of the first vehicle and one side of the 2D envelope frame of the first vehicle, or it can be the U-axis coordinate value of the intersection point between the dividing line and the U axis of the coordinate system, which is not limited here.
  • a specific implementation form of the position information of the 2D envelope frame and several specific implementation methods of the position information of the dividing line are provided, which improves the selection flexibility of the solution.
  • the second angle of the first vehicle indicates the angle between the main sideline of the first vehicle and the first axis of the first image.
• the main sideline of the first vehicle is the line of intersection between the leaked main surface of the first vehicle and the plane where the first vehicle is located.
• if the first vehicle leaks the front in the first image, the main sideline of the first vehicle is the intersection line between the leaked front and the ground plane; if the first vehicle leaks the back in the first image, the main sideline of the first vehicle is the intersection line between the leaked back and the ground plane.
  • the value range of the second angle is consistent with the value range of the first angle, and will not be repeated here.
  • FIG. 5 is a schematic diagram of the first result in the image processing method provided by the embodiment of the application.
  • the coordinates of the wheel are the coordinates of the outer side of the wheel
  • the first axis is the U-axis of the first image
  • the dividing line passes through the outer contour of the car light as an example.
  • B1 represents the 2D envelope of the first vehicle
  • B2 represents the coordinates of the wheels of the first vehicle
  • B3 represents the U axis of the first image
  • B4 represents the V axis of the first image
  • B5 represents the side line of the first vehicle
  • B6 represents the first angle of the first vehicle
  • B7 represents the boundary between the side and front of the first vehicle
  • B8 represents the main sideline of the first vehicle (that is, the front sideline in Figure 5)
• B9 represents the second angle of the first vehicle. It should be understood that the example in FIG. 5 is only to facilitate understanding of the solution, and is not used to limit the solution.
  • the first result may also include indication information of the leakage surface of the first vehicle in the first image, the leakage surface includes one or more of the following: side, front and back, and the aforementioned side includes left and right.
  • the indication information of the leaking surface may be expressed as a digital sequence.
• the digital sequence includes four sets of numbers, which correspond to the front, back, left side, and right side of the first vehicle; one set of numbers indicates whether the surface corresponding to that set of numbers is leaked out in the first image, and a set of numbers includes one or more values.
• as an example, the indication information of the leaking surface is specifically represented as 1010, where the four digits respectively correspond to the front, back, left side, and right side of the first vehicle.
• the indication information of the leaking surface can also be expressed as a string of characters. As an example, the indication information of the leaking surface is specifically expressed as "front and right", indicating that the first vehicle has leaked the front and the right side in the first image. The specific form of the aforementioned indication information can be determined in combination with the actual product form, and it is not limited here.
• the first result also includes the indication information of the leakage surface of the first vehicle in the first image, so that in the subsequent process of using the coordinates of the first points to generate the three-dimensional feature information of the first vehicle, it can be determined, according to the indication information of the leaking surface in the first image, whether the first vehicle leaks only the main surface, only the side, or both the side and the main surface in the first image, which is beneficial to improving the accuracy of the subsequent three-dimensional feature information generation process and thereby the accuracy of the generated three-dimensional feature information.
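• As a concrete illustration of the digit-sequence form described above, the following sketch decodes an indicator such as "1010"; the digit order (front, back, left, right) follows the example in the text, while the function name and return format are illustrative assumptions.

```python
def decode_leak_faces(indication: str) -> dict:
    """Decode a 4-digit leak-surface indicator such as "1010".

    Assumed digit order: (front, back, left, right); '1' means the corresponding
    surface of the first vehicle is leaked (visible) in the image.
    """
    faces = ("front", "back", "left", "right")
    return {face: digit == "1" for face, digit in zip(faces, indication)}

# Example: "1010" marks the front and the left side as leaked.
print(decode_leak_faces("1010"))
```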
  • the own vehicle generates position information of the three-dimensional 3D envelope box of the first vehicle according to the first result.
  • the own vehicle after obtaining the first result, the own vehicle first generates the position information of the three-dimensional 3D envelope box of the first vehicle according to the first result, and then according to the three-dimensional 3D envelope box of the first vehicle The position information of the box generates 3D feature information of the first vehicle.
• the position information of the 3D envelope box of the first vehicle includes the coordinates of at least two first points; the at least two first points are located on the edges of the 3D envelope box of the first vehicle, the at least two first points are used to locate the edges of the 3D envelope box of the first vehicle, and the coordinates of the at least two first points are used to locate the 3D envelope box of the first vehicle.
  • the 3D envelope box of the first vehicle includes 12 edges and 8 vertices.
• the concept of positioning in the embodiment of the present application refers to the ability to determine the positions of some of the aforementioned 12 edges and/or 8 vertices.
• in the embodiment of the present application, the positions of the edges and/or vertices on the bottom surface of the 3D envelope box of the first vehicle are mainly located according to the coordinates of the first points and the first angle.
• the own vehicle can determine, according to the indication information of the leakage surface of the first vehicle, whether the first vehicle leaks only the side or both the side and the main surface in the first image, and deal with the two situations separately; the two situations are introduced below.
  • the first vehicle only leaks the side in the first image
• step 303 may include: the own vehicle generates the coordinates of at least two first points according to the position information of the 2D envelope frame of the first vehicle, the coordinates of the wheels of the first vehicle, and the first angle of the first vehicle, where the at least two first points include the two intersections between the side line of the first vehicle and the 2D envelope frame.
• the first point is a general concept; in the specific scene where the first vehicle only leaks the side in the first image, a first point refers to a point on an edge of the 3D envelope box of the first vehicle.
• in this case, the first point is specifically an intersection between the side line of the first vehicle and the 2D envelope frame, which refines the specific form of the first point in a specific scene and improves the degree of integration of the solution with application scenarios.
  • the own vehicle generates the position information of the side line of the first vehicle according to the coordinates and the first angle of the wheels of the first vehicle, and according to the position information of the side line of the first vehicle and the position information of the 2D envelope frame, Perform a coordinate generation operation to obtain the coordinates of at least two first points.
  • the self-vehicle can generate the position information of the side line of the first vehicle according to the coordinates and the first angle of the wheels of the first vehicle, which is simple to operate, easy to implement, and highly accurate.
• specifically, the self-vehicle can generate the straight line equation of the side line of the first vehicle according to the coordinates of the wheels of the first vehicle and the first angle, that is, obtain the position information of the side line of the first vehicle.
• according to the position information of the 2D envelope box of the first vehicle, the own vehicle can determine the positions of the left boundary and the right boundary of the 2D envelope box, and then, according to the straight line equation of the side line of the first vehicle, generate the coordinates of the intersection point M between the side line and the aforementioned left boundary (that is, one first point) and the coordinates of the intersection point N between the side line and the aforementioned right boundary (that is, another first point).
• the line between the aforementioned intersection M and the intersection N is the side line of the first vehicle, and the side line of the first vehicle is an edge of the bottom surface of the 3D envelope box of the first vehicle.
  • the left boundary and the right boundary of the 2D envelope box are respectively parallel to the two sides of the side surface of the 3D envelope box of the first vehicle. Therefore, using the coordinates of the first point can realize the positioning of the 3D envelope box of the first vehicle.
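• A minimal sketch of this side-only case is given below: it builds the side line from the wheel coordinates and the first angle, and intersects it with the (assumed vertical) left and right boundaries of the 2D envelope box. Function and variable names are illustrative, not taken from the embodiment.

```python
import math

def side_line_box_intersections(wheel_uv, first_angle_rad, box):
    """Intersections M and N of the side line with the 2D envelope box boundaries.

    wheel_uv: (u, v) pixel coordinates of the wheel of the first vehicle.
    first_angle_rad: first angle between the side line and the U axis, in radians.
    box: (u_min, v_min, u_max, v_max) of the 2D envelope box.
    Assumes the left/right boundaries of the box are vertical in the image and the
    side line is not vertical (the first angle is not 90 degrees).
    """
    u0, v0 = wheel_uv
    u_min, _, u_max, _ = box
    slope = math.tan(first_angle_rad)            # side line: v - v0 = slope * (u - u0)
    m = (u_min, v0 + slope * (u_min - u0))       # intersection with the left boundary
    n = (u_max, v0 + slope * (u_max - u0))       # intersection with the right boundary
    return m, n

# Example: wheel at (420, 380), first angle 15 degrees, 2D box (300, 200, 620, 400).
print(side_line_box_intersections((420, 380), math.radians(15), (300, 200, 620, 400)))
```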
• step 303 may further include: the own vehicle determines the coordinates of the vertex O in the upper left corner and the vertex P in the upper right corner of the 2D envelope frame of the first vehicle according to the position information of the 2D envelope frame of the first vehicle, determines the vertex O and the vertex P as two first points, and determines the coordinates of the vertex O and the coordinates of the vertex P as the coordinates of two first points.
• the vertex O and the vertex P are both located on the edges of the top surface of the 3D envelope box of the first vehicle.
  • FIG. 6 is a schematic diagram of the first point in the image processing method provided by the embodiment of this application.
• in FIG. 6, the case where the first vehicle only leaks the side in the first image is taken as an example for illustration.
  • A1 represents the 2D envelope of the first vehicle, A1 is generated based on the position information of the 2D envelope of the first vehicle;
  • A5 represents the side line of the first vehicle, and A5 is based on the coordinates of the wheels of the first vehicle And generated from the first angle.
  • D1 represents an intersection point between the sideline of the first vehicle and the left boundary of the 2D envelope of the first vehicle (that is, the above-mentioned intersection M), and D2 represents the sideline of the first vehicle and the 2D envelope of the first vehicle
  • An intersection point of the right boundary of the first vehicle that is, the above-mentioned intersection point N
  • D3 represents the upper left vertex of the 2D envelope of the first vehicle (also the above vertex O)
  • D4 represents the upper right vertex of the 2D envelope of the first vehicle (That is, the above vertex P)
• D1, D2, D3, and D4 are the four first points located on the edges of the 3D envelope box of the first vehicle.
• D1 and D2 are located on the edges of the bottom surface of the 3D envelope box of the first vehicle.
• D3 and D4 are located on the edges of the top surface of the 3D envelope box of the first vehicle.
• the coordinates of D1, D2, D3, and D4 are the coordinates of the four generated first points. It should be understood that in other embodiments, the coordinates of D3 and D4 may not be generated.
  • the example in FIG. 6 is only to facilitate understanding of the solution, and is not used to limit the solution.
• when the first vehicle leaks the side and the main surface in the first image, the first result also includes the position information of the boundary line of the first vehicle and the second angle of the first vehicle.
  • the specific meanings of the position information and the second angle have been introduced in step 302, and will not be repeated here.
  • the at least two first points include a first intersection, a second intersection, and a third intersection.
• the coordinates of the first intersection, the coordinates of the second intersection, and the coordinates of the third intersection refer to the generated coordinates of points located on the edges of the 3D envelope box of the first vehicle in the specific scenario where the first vehicle leaks the side and the main surface in the first image.
• the first point of intersection is the point of intersection between the side line of the first vehicle and the dividing line, and is also a vertex of the 3D envelope box of the first vehicle.
  • the second point of intersection is the point of intersection between the sideline of the first vehicle and the 2D envelope.
  • the line connecting the first point of intersection and the second point of intersection is the sideline of the first vehicle, and the sideline of the first vehicle is also the first An edge of the bottom surface of the 3D envelope box of the vehicle.
  • the second intersection point is a vertex of the bottom surface of the 3D envelope box of the first vehicle.
  • the third intersection is the intersection of the main sideline of the first vehicle and the 2D envelope.
  • the line connecting the first intersection and the third intersection is the main sideline of the first vehicle.
• the main sideline of the first vehicle is also another edge of the bottom surface of the 3D envelope box of the first vehicle; further, if the first image includes the complete main surface of the first vehicle, the third intersection point is a vertex of the bottom surface of the 3D envelope box of the first vehicle.
  • the left boundary and the right boundary of the 2D envelope box of the first vehicle are respectively parallel to the two sides of the side surface of the 3D envelope box of the first vehicle. Therefore, using the coordinates of the first intersection, the coordinates of the second intersection, and the coordinates of the third intersection can realize the positioning of the 3D outer box of the first vehicle.
• this embodiment not only provides the specific manifestation of the first point in the case where the first vehicle only leaks the side in the first image, but also provides the specific manifestation of the first point in the case where the first vehicle leaks the side and the main surface in the first image, which enriches the application scenarios of this solution and improves the flexibility of implementation.
• step 303 may include: the own vehicle generates the coordinates of the first point of intersection according to the position information of the boundary line, the coordinates of the wheels, and the first angle.
  • the own vehicle generates the coordinates of the second intersection point according to the position information of the 2D envelope frame, the coordinates of the wheels, and the first angle. According to the position information of the 2D envelope frame, the coordinates of the first point of intersection, and the second angle, the coordinates of the third point of intersection are generated.
  • the coordinates of the first point of intersection, the coordinates of the second point of intersection, the coordinates of the third point of intersection, the first angle and the second angle are used to locate the 3D envelope box of the first vehicle.
• this provides an implementation method for generating the coordinates of multiple first points when the first vehicle leaks the side and the main surface in the first image, which is simple to operate, easy to implement, and highly accurate.
• in the case that the main surface is specifically the front, the main sideline is specifically the front sideline.
• the self-vehicle generates the straight line equation of the side line according to the coordinates of the wheel and the first angle, that is, generates the position information of the side line of the first vehicle; and, according to the position information of the dividing line and the straight line equation of the side line, generates the coordinates of the intersection point between the side line and the dividing line, that is, the coordinates of the first intersection point.
• according to the position information of the 2D envelope box of the first vehicle, the self-vehicle can determine the positions of the left and right boundaries of the 2D envelope box; according to the straight line equation of the side line and the position of the right boundary of the 2D envelope box, the coordinates of the intersection point between the side line and the right boundary of the 2D envelope box are generated, that is, the coordinates of the second intersection point are generated.
• the self-vehicle generates the straight line equation of the front line according to the coordinates of the first intersection point and the second angle, that is, generates the position information of the front line of the first vehicle, and, according to the straight line equation of the front line and the position of the left boundary of the 2D envelope box, generates the coordinates of the intersection point between the front line and the left boundary of the 2D envelope box, that is, the coordinates of the third intersection point.
  • the coordinates of the second intersection point may be generated first, and then the coordinates of the first intersection point may be generated, or the coordinates of the third intersection point may be generated first, and then the coordinates of the second intersection point may be generated.
  • the order of generating the coordinates of the first point of intersection, the coordinates of the second point of intersection, and the coordinates of the third point of intersection is not limited here.
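• The front-case computation just described can be sketched as follows, assuming the dividing line and the 2D-box boundaries are vertical lines in the image and the side extends to the right of the dividing line; all names are illustrative.

```python
import math

def side_and_front_intersections(wheel_uv, first_angle_rad, second_angle_rad,
                                 divider_u, box):
    """Coordinates of the first, second, and third intersections (front case).

    wheel_uv: (u, v) wheel coordinates; divider_u: U position of the dividing line;
    box: (u_min, v_min, u_max, v_max) of the 2D envelope box.
    """
    u0, v0 = wheel_uv
    u_min, _, u_max, _ = box
    k_side = math.tan(first_angle_rad)
    # First intersection: side line meets the dividing line (a vertex of the 3D box bottom).
    p1 = (divider_u, v0 + k_side * (divider_u - u0))
    # Second intersection: side line meets the right boundary of the 2D envelope box.
    p2 = (u_max, v0 + k_side * (u_max - u0))
    # Third intersection: front line through p1 (slope from the second angle) meets the left boundary.
    k_front = math.tan(second_angle_rad)
    p3 = (u_min, p1[1] + k_front * (u_min - p1[0]))
    return p1, p2, p3
```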
• the boundary between the side and the front of the first vehicle is an edge of the side surface of the 3D envelope box of the first vehicle. If the first image includes the complete side of the first vehicle, the right boundary of the 2D envelope box of the first vehicle is an edge of the side surface of the 3D envelope box of the first vehicle; if the first image includes the complete front of the first vehicle, the left boundary of the 2D envelope box of the first vehicle is an edge of the 3D envelope box of the first vehicle, thereby realizing the positioning of the side surface of the 3D envelope box of the first vehicle.
• in the case that the main surface is specifically the back, the main sideline is specifically the back sideline.
• the self-vehicle generates the straight line equation of the side line according to the coordinates of the wheel and the first angle, that is, generates the position information of the side line of the first vehicle; and, according to the position information of the dividing line and the straight line equation of the side line, generates the coordinates of the intersection point between the side line and the dividing line, that is, the coordinates of the first intersection point.
• according to the position information of the 2D envelope box of the first vehicle, the self-vehicle can determine the positions of the left and right boundaries of the 2D envelope box; according to the straight line equation of the side line and the position of the left boundary of the 2D envelope box, the coordinates of the intersection point between the side line and the left boundary of the 2D envelope box are generated, that is, the coordinates of the second intersection point are generated.
• the self-vehicle generates the straight line equation of the back line according to the coordinates of the first intersection point and the second angle, that is, generates the position information of the back line of the first vehicle, and, according to the straight line equation of the back line and the position of the right boundary of the 2D envelope box, generates the coordinates of the intersection point between the back line and the right boundary of the 2D envelope box, that is, the coordinates of the third intersection point.
• the boundary between the side and the back of the first vehicle is an edge of the side surface of the 3D envelope box of the first vehicle. If the first image includes the complete side of the first vehicle, the left boundary of the 2D envelope box of the first vehicle is an edge of the side surface of the 3D envelope box of the first vehicle; if the first image includes the complete back of the first vehicle, the right boundary of the 2D envelope box of the first vehicle is an edge of the 3D envelope box of the first vehicle, thereby realizing the positioning of the side surface of the 3D envelope box of the first vehicle.
• step 303 may further include: the own vehicle determines the coordinates of the upper left corner and the upper right corner of the 2D envelope frame of the first vehicle according to the position information of the 2D envelope frame of the first vehicle, and determines the coordinates of the aforementioned two vertices as the coordinates of two first points.
• according to the position information of the 2D envelope frame of the first vehicle and the position information of the boundary line of the first vehicle, the own vehicle generates the coordinates of the intersection point between the boundary line and the 2D envelope frame, that is, generates the coordinates of another first point; the aforementioned intersection point is a vertex of the top surface of the 3D envelope box of the first vehicle.
  • FIG. 7 is a schematic diagram of the first point in the image processing method provided by the embodiment of the application.
• in FIG. 7, the case where the first vehicle leaks the side and the front in the first image is taken as an example for illustration.
  • B1 represents the 2D envelope frame of the first vehicle, the 2D envelope frame of the first vehicle is determined according to the position information of the 2D envelope frame of the first vehicle;
  • B5 represents the side line of the first vehicle, and the first vehicle The side line of is generated based on the coordinates of the wheels of the first vehicle and the first angle;
• B7 represents the dividing line between the side and the front of the first vehicle, which is determined according to the position information of the dividing line of the first vehicle;
  • B8 represents the main sideline of the first vehicle, and the main sideline of the first vehicle is generated based on the coordinates of the first intersection and the second angle.
  • E1 represents the point of intersection between the side line and the dividing line (that is, the first point of intersection), and the coordinates of E1 are the coordinates of the first point of intersection;
  • E2 represents the point of intersection between the side line and the right boundary of the 2D envelope box (also That is, the second point of intersection), the coordinates of E2 are the coordinates of the second point of intersection;
  • E3 represents the point of intersection between the main edge and the left boundary of the 2D envelope box (that is, the third point of intersection), and the coordinates of E3 are the coordinates of the third point of intersection.
  • E4 represents the upper-left vertex of the 2D envelope (that is, a first point), the coordinates of E4 are the coordinates of the first point;
  • E5 represents the upper-right vertex of the 2D envelope (that is, a first point) , The coordinates of E5 are the coordinates of the first point;
  • E6 represents the intersection point (that is, a first point) between the dividing line and the 2D envelope frame, and the coordinates of E6 are the coordinates of the first point.
• E1 to E6 are the first points in this kind of embodiment; E1, E2, and E3 are all located on the bottom surface of the 3D envelope frame of the first vehicle, E1 is a vertex of the bottom surface of the 3D envelope frame of the first vehicle, E4, E5, and E6 are all located on the top surface of the 3D envelope frame of the first vehicle, and E6 is a vertex of the top surface of the 3D envelope frame of the first vehicle.
• it should be understood that in other embodiments, the coordinates of E4, E5, and E6 may not be generated; the example in FIG. 7 is only to facilitate the understanding of this solution, and is not used to limit this solution.
  • the self-vehicle judges whether the distance between the first vehicle and the self-vehicle exceeds a preset threshold, and if it does not exceed the preset threshold, go to step 305; if it exceeds the preset threshold, go to step 314.
  • the self-vehicle generates the distance between the first vehicle and the self-vehicle, and then determines whether the distance between the first vehicle and the self-vehicle exceeds a preset threshold, and if it does not exceed the preset threshold, proceed to step 305; if the preset threshold is exceeded, go to step 314.
  • the value of the preset threshold can be 10 meters, 15 meters, 30 meters, 25 meters, or other values, etc., which can be determined in combination with the actual product form.
  • the self-vehicle generates the distance between each first point in at least one first point and the self-vehicle according to the coordinates of the first point and the ground plane assumption principle, and then according to each first point The distance between the point and the self-vehicle generates the distance between the first vehicle and the self-vehicle.
• specifically, the self-vehicle can generate the three-dimensional coordinates of a first point in the vehicle body coordinate system according to the coordinates of the first point and the ground plane assumption principle, and then generate the first distance between the first point and the own vehicle according to the three-dimensional coordinates of the first point in the vehicle body coordinate system.
• the own vehicle repeats the foregoing operations to generate the first distance between each first point and the own vehicle.
• based on the at least two first distances corresponding to the at least two first points, the self-vehicle can select the smallest first distance from the aforementioned at least two first distances as the distance between the first vehicle and the self-vehicle, and then determine whether the selected smallest first distance exceeds the preset threshold; if it does not exceed the threshold, the distance between the first vehicle and the self-vehicle is deemed not to exceed the preset threshold, that is, when the distance between any one of the at least two first points and the self-vehicle does not exceed the preset threshold, the distance between the first vehicle and the self-vehicle is deemed not to exceed the preset threshold.
  • the own vehicle may also select the largest first distance from the aforementioned at least two first distances as the distance between the first vehicle and the own vehicle, and then determine whether the selected first distance with the largest distance value exceeds a preset threshold. If it exceeds the preset threshold, it is deemed that the distance between the first vehicle and the self-vehicle exceeds the preset threshold, that is, when the distance between any one of the at least two first points and the self-vehicle exceeds the preset threshold, It is considered that the distance between the first vehicle and the self-vehicle exceeds a preset threshold.
  • the self-vehicle uses the average value of the aforementioned at least two first distances as the distance between the first vehicle and the self-vehicle, so as to perform a judgment operation of whether the distance between the first vehicle and the self-vehicle exceeds a preset threshold.
• multiple specific implementation manners for determining whether the distance between the first vehicle and the self-vehicle exceeds a preset threshold are provided, which improves the implementation flexibility of the solution.
• optionally, the self-vehicle can also generate the three-dimensional coordinates of the wheel point of the first vehicle in the body coordinate system according to the coordinates of the wheel point of the first vehicle and the ground plane assumption principle, then generate the distance between the wheel point of the first vehicle and the self-vehicle according to those three-dimensional coordinates, and determine the aforementioned distance as the distance between the first vehicle and the self-vehicle, so as to execute the judgment operation of whether the distance between the first vehicle and the self-vehicle exceeds the preset threshold.
  • the self-vehicle may also use the coordinates of other points on the 3D envelope box of the first vehicle to determine the distance between the first vehicle and the self-vehicle, which is not limited here.
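• The distance check described above can be sketched as follows: each first point is back-projected onto the ground plane with the camera intrinsics (the ground plane assumption), and the minimum point distance is compared against the threshold. The level-camera assumption, the function names, and the aggregation by minimum are illustrative choices, not requirements of the embodiment.

```python
import numpy as np

def ground_point_in_camera(uv, K, camera_height):
    """Back-project a pixel assumed to lie on the ground plane (ground plane assumption).

    uv: pixel coordinates of a first point; K: 3x3 camera intrinsic matrix;
    camera_height: camera height above the ground, in metres.
    Assumes the camera's y axis points straight down toward the ground (level camera),
    which is a simplification for illustration.
    """
    ray = np.linalg.inv(K) @ np.array([uv[0], uv[1], 1.0])
    scale = camera_height / ray[1]     # stretch the viewing ray until it reaches the ground
    return ray * scale                 # 3D point in the camera coordinate system

def vehicle_within_threshold(first_points_uv, K, camera_height, threshold_m):
    """Treat the smallest first-point distance as the vehicle distance and compare it to the threshold."""
    points = [ground_point_in_camera(uv, K, camera_height) for uv in first_points_uv]
    distance = min(np.linalg.norm(p) for p in points)
    return distance <= threshold_m
```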
• the self-vehicle judges whether the first vehicle leaks out a side surface in the first image according to the indication information of the leakage surface of the first vehicle in the first image; if the side surface is not leaked out, go to step 306; if the side surface is leaked out, go to step 307.
• the self-vehicle can determine whether the first vehicle leaks out a side surface in the first image according to the indication information of the leakage surface of the first vehicle in the first image. Specifically, when it is determined according to the indication information of the leakage surface of the first vehicle in the first image that the leaked surfaces of the first vehicle in the first image include the left side or the right side, the self-vehicle regards the first vehicle as having leaked out a side surface in the first image.
• for the indication information of the leakage surface in the first image, which indicates which surfaces are leaked out, please refer to the description in step 302, which will not be repeated here.
• the self-vehicle generates the heading angle of the first vehicle relative to the self-vehicle according to the coordinates of the center point of the 2D envelope frame and the pinhole imaging principle.
• specifically, when it is determined that the first vehicle does not leak out the side surface but only the main surface in the first image, the self-vehicle can regard the projection point of the 3D centroid point of the first vehicle on the image as the center point of the 2D envelope frame of the first vehicle, and then generate the heading angle of the first vehicle relative to the self-vehicle according to the coordinates of the center point of the 2D envelope frame of the first vehicle and the pinhole imaging principle; further, the self-vehicle can determine the driving intention of the first vehicle, for example whether the first vehicle will merge, according to the heading angle of the first vehicle relative to the self-vehicle.
• more specifically, the position information of the 2D envelope frame of the first vehicle includes the coordinates of the center point of the 2D envelope frame of the first vehicle, and the self-vehicle generates, based on the coordinates of the center point of the 2D envelope frame of the first vehicle and the first transformation relationship, the angle γ between the projection of the first ray on the ground plane in the first image and the x-axis of the camera coordinate system; the angle γ is also the heading angle of the first vehicle in the camera coordinate system.
• the first transformation relationship is the transformation relationship between the camera coordinate system and the image coordinate system; the first transformation relationship may also be referred to as the intrinsic parameters of the camera, which are pre-generated based on the pinhole imaging principle and configured on the self-vehicle.
• the first ray is a ray from the optical center of the camera that collects the first image through the 3D centroid point of the first vehicle; the origin of the camera coordinate system is located at the camera configured on the self-vehicle for collecting the first image.
• the self-vehicle then generates the heading angle Ω of the first vehicle in the self-vehicle's body coordinate system according to the angle γ and the second transformation relationship; the second transformation relationship refers to the transformation relationship between the aforementioned camera coordinate system and the aforementioned vehicle body coordinate system, and the second transformation relationship may also be referred to as the extrinsic parameters of the camera. A brief sketch of this computation is given after the coordinate-system description below.
  • the camera coordinate system and the vehicle body coordinate system are both 3D coordinate systems
  • the x-axis of the camera coordinate system can be rightward
  • the y-axis of the camera coordinate system can be downward
  • the z-axis of the camera coordinate system can be forward.
  • the x-axis and z-axis of the camera coordinate system can form a plane parallel to the ground plane.
• the origin of the vehicle body coordinate system can be the midpoint of the line connecting the two rear wheels of the self-vehicle, or the center of mass point of the self-vehicle; the x-axis of the vehicle body coordinate system can point to the left, the y-axis of the vehicle body coordinate system can point forward, the z-axis of the vehicle body coordinate system can point downward, and the x-axis and y-axis of the vehicle body coordinate system can form a plane parallel to the ground plane. It should be understood that the foregoing description of the camera coordinate system and the vehicle body coordinate system is only to facilitate understanding of this solution.
• in practice, the origin of the camera coordinate system and/or the vehicle body coordinate system may also be adjusted, or the orientations of the x-axis, y-axis and/or z-axis of the camera coordinate system and/or the vehicle body coordinate system may be adjusted, which is not limited here.
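• As an illustrative aid, the following Python sketch shows the pinhole-imaging branch described above: the angle γ of the first ray in the camera coordinate system is obtained from the coordinates of the center point of the 2D envelope frame and the intrinsic parameters, and the heading angle Ω in the vehicle body coordinate system is obtained through the extrinsic parameters. It reuses the hypothetical K and R_cam_to_body from the previous sketch and relies on the approximation, described above, that the first vehicle only leaks out its main surface.

```python
import numpy as np

def heading_from_box_center(u_c, v_c, K, R_cam_to_body):
    # First ray: from the optical center through the projection of the 3D centroid,
    # here approximated by the center point of the 2D envelope frame.
    ray = np.linalg.inv(K) @ np.array([u_c, v_c, 1.0])
    # Angle between the ray's projection on the ground plane and the camera x-axis
    # (camera frame: x right, y down, z forward, so the ground projection uses x and z).
    gamma = np.arctan2(ray[2], ray[0])
    # Rotate the ray into the vehicle body frame and read off the heading angle Omega
    # (body frame assumed here: x left, y forward, x-y plane parallel to the ground).
    ray_body = R_cam_to_body @ ray
    omega = np.arctan2(ray_body[1], ray_body[0])
    return gamma, omega
```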
• the self-vehicle generates the heading angle of the first vehicle relative to the self-vehicle according to the coordinates of the first point through the first calculation rule.
• specifically, when it is determined that the distance between the first vehicle and the self-vehicle does not exceed the preset threshold and the first vehicle leaks out a side surface in the first image, the self-vehicle can generate the heading angle of the first vehicle relative to the self-vehicle according to the coordinates of the first point through the first calculation rule.
• the self-vehicle generates the three-dimensional coordinates of the first point in the vehicle body coordinate system based on the coordinates of the first point and the ground plane assumption principle, where the origin of the vehicle body coordinate system is located at the self-vehicle.
  • the concepts of the vehicle body coordinate system and the ground plane assumption principle have been described in step 306, and will not be repeated here.
• according to the three-dimensional coordinates of the first point, the self-vehicle generates the heading angle.
• the heading angle is generated based on the coordinates of the first point and the ground plane assumption principle, thereby ensuring the accuracy of the generated heading angle.
• in one case, the coordinates of two first points may be generated, or the coordinates of four first points may be generated.
• the self-vehicle can obtain, from the aforementioned two first points or the aforementioned four first points, the two first points located on the side line of the first vehicle, that is, the two first points located on the side edge of the bottom surface of the 3D envelope box of the first vehicle.
• in another case, the coordinates of three first points can be generated, or the coordinates of six first points can be generated; for the specific process of generating the coordinates of the first points, refer to the description in step 303.
• the self-vehicle may obtain the coordinates of the two first points located on the side line of the first vehicle from the coordinates of the aforementioned three first points or the aforementioned six first points.
• for the process of generating the heading angle: according to the coordinates of the two first points on the bottom surface of the 3D envelope box of the first vehicle and the ground plane assumption principle, the self-vehicle generates the three-dimensional coordinates of the two first points in the vehicle body coordinate system, and then generates the heading angle of the first vehicle relative to the self-vehicle according to the three-dimensional coordinates of the two first points in the vehicle body coordinate system.
• the self-vehicle can generate the heading angle of the first vehicle relative to the self-vehicle based on the values in the x-axis and y-axis directions in the three-dimensional coordinates of the wheels of the first vehicle and the values in the x-axis and y-axis directions in the three-dimensional coordinates of the target point.
• as an example, if the three-dimensional coordinates of one first point are (x1, y1, z1) and the three-dimensional coordinates of another first point are (x2, y2, z2), then the heading angle of the first vehicle relative to the self-vehicle can be obtained from the differences (x2 − x1) and (y2 − y1), for example as Ω = arctan((y2 − y1)/(x2 − x1)).
• in another implementation, the self-vehicle can generate the three-dimensional coordinates of the first point in the vehicle body coordinate system and the three-dimensional coordinates of the wheels of the first vehicle in the vehicle body coordinate system according to the coordinates of the first point, the coordinates of the wheels of the first vehicle and the ground plane assumption principle, and then generate the heading angle of the first vehicle relative to the self-vehicle according to the three-dimensional coordinates of the first point in the vehicle body coordinate system and the three-dimensional coordinates of the wheels of the first vehicle in the vehicle body coordinate system.
• specifically, the self-vehicle can choose the coordinates of one target point from the coordinates of the aforementioned two first points, generate the three-dimensional coordinates of the target point in the vehicle body coordinate system and the three-dimensional coordinates of the wheel in the vehicle body coordinate system according to the coordinates of the target point, the coordinates of the wheel and the ground plane assumption principle, and then perform the heading angle generation operation.
• the self-vehicle can also generate the three-dimensional coordinates of the two first points in the vehicle body coordinate system and the three-dimensional coordinates of the wheel in the vehicle body coordinate system according to the coordinates of the two first points, the coordinates of the wheel and the ground plane assumption principle, and then perform the heading angle generation operation; a brief sketch of this computation follows below.
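• As an illustrative aid, the following Python sketch shows the first calculation rule in its simplest form: two first points on the side edge of the bottom surface are back-projected with the ground plane assumption (first_point_to_body from the earlier sketch), and the heading angle follows from the difference of their body-frame coordinates. The use of atan2 instead of a plain arctangent is an implementation choice to keep the correct quadrant.

```python
import numpy as np

def heading_from_side_points(p1_uv, p2_uv):
    x1, y1, _ = first_point_to_body(*p1_uv)   # first point -> body-frame coordinates
    x2, y2, _ = first_point_to_body(*p2_uv)   # second point on the same side edge
    # Heading angle of the first vehicle relative to the self-vehicle,
    # Omega = arctan((y2 - y1) / (x2 - x1)) with quadrant handling.
    return np.arctan2(y2 - y1, x2 - x1)
```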
  • step 307 is an optional step. If step 307 is not executed, step 308 can be directly executed after step 305 is executed.
  • the own vehicle obtains the coordinates of the vertices of the 3D envelope box of the first vehicle from the coordinates of the at least two first points.
• specifically, when it is determined that the distance between the first vehicle and the self-vehicle does not exceed the preset threshold and the first vehicle leaks out a side surface in the first image, after the self-vehicle obtains the coordinates of multiple first points in step 303, it selects the coordinates of the vertices of the 3D envelope box of the first vehicle from the coordinates of the multiple first points, so as to generate the three-dimensional coordinates of the centroid point of the first vehicle by using the coordinates of the vertices of the 3D envelope box of the first vehicle. If none of the multiple first points is located on a vertex of the 3D envelope box of the first vehicle, step 309 may not be executed, that is, the three-dimensional coordinates of the centroid point of the first vehicle are no longer generated.
• more specifically, a first value range in the U-axis direction of the first image and a second value range in the V-axis direction of the first image may be preset on the self-vehicle. After obtaining the coordinates of a first point, the self-vehicle determines whether the value in the U-axis direction of the coordinates of the first point is within the first value range and whether the value in the V-axis direction of the coordinates of the first point is within the second value range.
• if the value in the U-axis direction is within the first value range and the value in the V-axis direction is within the second value range, it is determined that the first point is a vertex of the 3D envelope box; if the value in the U-axis direction of the coordinates of the first point is not within the first value range, or the value in the V-axis direction of the coordinates of the first point is not within the second value range, it is determined that the first point is not a vertex of the 3D envelope box.
• according to the coordinates of the vertices of the 3D envelope box of the first vehicle and the ground plane assumption principle, the self-vehicle generates the three-dimensional coordinates of the center of mass of the first vehicle in the vehicle body coordinate system, where the origin of the vehicle body coordinate system is located at the self-vehicle.
• specifically, after the self-vehicle obtains the coordinates of the vertices of the 3D envelope box of the first vehicle from the coordinates of the multiple first points, it generates the three-dimensional coordinates of the vertices of the 3D envelope box of the first vehicle in the vehicle body coordinate system according to the ground plane assumption principle; then, according to the three-dimensional coordinates of the vertices of the 3D envelope box of the first vehicle and the preset size of the first vehicle, it generates the three-dimensional coordinates of the centroid point of the first vehicle in the vehicle body coordinate system.
• in this way, based on the coordinates of the first point, not only can the heading angle of the first vehicle be generated, but also the three-dimensional coordinates of the centroid point of the first vehicle in the vehicle body coordinate system can be generated, which expands the application scenarios of this solution; in addition, the accuracy of the generated three-dimensional coordinates of the centroid point is improved.
• more specifically, after the self-vehicle obtains the coordinates of a first vertex from the coordinates of the multiple first points, it can generate the three-dimensional coordinates of the first vertex in the vehicle body coordinate system based on the coordinates of the first vertex and the ground plane assumption principle; among the multiple first points there may be at least one first point located on a vertex of the 3D envelope box of the first vehicle, and the first vertex is any one of the aforementioned at least one first point.
  • the self-vehicle can determine the position of the 3D envelope box of the first vehicle in the vehicle body coordinate system, and then generate the three-dimensional coordinates of the centroid point of the first vehicle in the vehicle body coordinate system.
• the self-vehicle may first generate the three-dimensional coordinates of the vertices of an initial 3D envelope box of the first vehicle according to the first image, correct the three-dimensional coordinates of the vertices of the initial 3D envelope box after obtaining the three-dimensional coordinates of the first vertex to obtain the three-dimensional coordinates of the vertices of the final 3D envelope box of the first vehicle, and then generate the three-dimensional coordinates of the centroid point of the first vehicle in the vehicle body coordinate system according to the three-dimensional coordinates of the vertices of the final 3D envelope box of the first vehicle.
• the self-vehicle can also directly generate the three-dimensional coordinates of the aforementioned three first vertices in the vehicle body coordinate system according to the coordinates of the aforementioned three first vertices and the ground plane assumption principle, thereby obtaining the position of the bottom surface of the 3D envelope box of the first vehicle in the vehicle body coordinate system, and then generate the three-dimensional coordinates of the centroid point of the first vehicle in the vehicle body coordinate system.
  • the height of the centroid point in the three-dimensional coordinates may not be taken into consideration.
• the multiple first points may include multiple first vertices; as an example, the multiple first points may include six first vertices.
• in this case, the self-vehicle can also directly generate the three-dimensional coordinates of the six first vertices in the vehicle body coordinate system based on the coordinates of the six first vertices and the ground plane assumption principle, thereby obtaining the position of the 3D envelope box of the first vehicle in the vehicle body coordinate system, and then generate the three-dimensional coordinates of the centroid point of the first vehicle in the vehicle body coordinate system; an illustrative sketch of the centroid computation follows below.
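• As an illustrative aid only, the following Python sketch shows how the centroid point could be offset from one known bottom vertex of the 3D envelope box once the heading angle and the preset size of the first vehicle are available. The assumption that the known vertex is the rear-left bottom vertex, and the use of a body frame whose z-axis points upward, are simplifications made here for readability; the signs would change for the z-downward convention described above.

```python
import numpy as np

def centroid_from_vertex(vertex_xyz, omega, length, width, height):
    forward = np.array([np.cos(omega), np.sin(omega), 0.0])  # box length direction (heading)
    left = np.array([-np.sin(omega), np.cos(omega), 0.0])    # box width direction
    up = np.array([0.0, 0.0, 1.0])                           # box height direction (z up assumed)
    # Centroid = rear-left bottom vertex shifted by half of each preset dimension.
    return np.asarray(vertex_xyz) + 0.5 * (length * forward + width * left + height * up)
```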
• steps 308 and 309 are optional steps. If steps 308 and 309 are not executed, step 310 can be directly executed after step 307 is executed. If steps 308 and 309 are executed, the embodiment of the present application does not limit the execution sequence between steps 308 and 309 and step 307; step 307 may be executed first and then steps 308 and 309, or steps 308 and 309 may be executed first and then step 307.
  • the self-vehicle generates the size of the first vehicle according to the coordinates of the first point and the ground plane assumption principle.
• in an implementation manner, the self-vehicle generates only the length and/or width of the first vehicle according to the coordinates of the first point and the ground plane assumption principle.
• specifically, the length and/or width of the first vehicle are generated according to the coordinates of the bottom vertices of the 3D envelope box of the first vehicle and the ground plane assumption principle.
• the third vertex refers to a bottom vertex of the 3D envelope box of the first vehicle.
• more specifically, the self-vehicle can obtain the coordinates of at least one first point located on the bottom edge of the 3D envelope box of the first vehicle and, according to the coordinates of the aforementioned at least one first point, respectively determine whether each of the aforementioned at least one first point is a vertex of the bottom surface of the 3D envelope box, so as to calculate the target number corresponding to the first vehicle in the first image, that is, the number of first points that are vertices of the bottom surface.
  • Figure 6 shows the four first points D1, D2, D3 and D4.
  • D1 and D2 are located on the bottom edge of the 3D envelope box of the first vehicle.
• among these first points, only D2 is a bottom vertex of the 3D envelope box of the first vehicle. It should be understood that the example given here in conjunction with Figure 6 is only for the convenience of understanding the concept of the vertices of the 3D envelope box, and is not used to limit the solution.
• the self-vehicle judges whether the first vehicle has leaked out the main surface in the first image. If the main surface is leaked out and the target number is equal to three, then, according to the coordinates of the three bottom vertices of the 3D envelope box of the first vehicle and the ground plane assumption principle, the self-vehicle respectively generates the three-dimensional coordinates of the aforementioned three bottom vertices in the vehicle body coordinate system, and then generates the length and width of the first vehicle according to the three-dimensional coordinates of the aforementioned three bottom vertices.
• if only two bottom vertices are available, the self-vehicle respectively generates the three-dimensional coordinates of the two bottom vertices in the vehicle body coordinate system, and generates the length of the first vehicle based on the three-dimensional coordinates of the two bottom vertices.
• in another case, the self-vehicle proceeds to step 311, that is, the self-vehicle uses another image including the first vehicle to generate the length and/or width of the first vehicle.
  • the own vehicle terminates the step of generating the length and/or width of the first vehicle.
• the execution sequence between the step of generating the target number and the step of judging whether the first vehicle leaks out the main surface is not limited.
• the generation step may be performed first and then the judgment step, or the judgment step may be performed first and then the generation step.
  • the self-vehicle generates one or more of the following according to the coordinates of the first point and the ground plane assumption principle: the length of the first vehicle, the width of the first vehicle, and the height of the first vehicle.
• in another implementation manner, after acquiring the coordinates of the multiple first points, the self-vehicle determines whether the number of first vertices located on the vertices of the 3D envelope box of the first vehicle among the multiple first points is greater than one. If it is greater than one, the self-vehicle generates the three-dimensional coordinates of the first vertices in the vehicle body coordinate system according to the coordinates of the first vertices and the ground plane assumption principle, and generates one or more of the following according to the three-dimensional coordinates of the first vertices in the vehicle body coordinate system: the length of the first vehicle, the width of the first vehicle, and the height of the first vehicle. If it is equal to one, the self-vehicle proceeds to step 311 to generate the length, width and/or height of the first vehicle by using another image including the first vehicle. If it is equal to zero, the self-vehicle terminates the step of generating the size of the first vehicle.
• in this way, the size of the first vehicle can also be generated according to the coordinates of the first point, which further expands the application scenarios of the solution; in addition, the accuracy of the generated size of the first vehicle is improved. A brief sketch of the size computation follows below.
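• As an illustrative aid, the following Python sketch shows the size computation in its simplest form: with the three-dimensional coordinates of adjacent bottom vertices of the 3D envelope box in the vehicle body coordinate system, the length and width are the edge lengths between them. The vertex ordering (rear vertex, front vertex on the same side edge, vertex across the main-surface edge) is an assumption made for illustration.

```python
import numpy as np

def length_and_width(v_rear, v_front, v_side):
    v_rear, v_front, v_side = map(np.asarray, (v_rear, v_front, v_side))
    length = np.linalg.norm(v_front - v_rear)   # along the side edge of the bottom surface
    width = np.linalg.norm(v_side - v_rear)     # along the main-surface edge of the bottom surface
    return length, width
```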
  • the vehicle acquires a second image, where the second image includes the first vehicle, and the second image and the first image have different image acquisition angles.
• the way in which the self-vehicle acquires the second image is similar to the way in which the self-vehicle acquires the first image in step 301, and reference can be made to the description in step 301.
  • the difference between the second image and the first image is that the second image and the first image have different image acquisition angles for the first vehicle.
• the self-vehicle obtains the coordinates of at least two second points through the image processing network according to the second image, and the at least two second points are all located on the edges of the 3D envelope box of the first vehicle in the second image.
• after acquiring the second image, the self-vehicle inputs the second image into the image processing network to obtain the fourth result output by the image processing network.
• the types of information included in the fourth result are the same as those included in the first result; the difference is that the first result is obtained by inputting the first image into the image processing network, whereas the fourth result is obtained by inputting the second image into the image processing network.
• the self-vehicle generates the position information of the 3D envelope box of the first vehicle in the second image, and the position information of the 3D envelope box of the first vehicle in the second image includes the coordinates of at least two second points.
  • the second point is the same in nature as the first point.
• the first point is located on an edge of the 3D envelope box of the first vehicle in the first image, whereas the second point is located on an edge of the 3D envelope box of the first vehicle in the second image.
• the at least two second points are located on the edges of the 3D envelope box of the first vehicle in the second image, and the coordinates of the at least two second points are used to locate the 3D envelope box of the first vehicle in the second image.
• for the specific implementation of step 312, please refer to the description of steps 302 to 303, which will not be repeated here.
  • the own vehicle generates the size of the first vehicle according to the coordinates of the first point and the coordinates of the second point.
• in an implementation manner, the self-vehicle generates the length and/or width of the first vehicle according to the coordinates of the first point and the coordinates of the second point. Specifically, after obtaining the coordinates of the second point, the self-vehicle selects a fourth vertex from the second points, where the fourth vertex is another vertex of the bottom surface of the 3D envelope box of the first vehicle. According to the coordinates of the fourth vertex and the ground plane assumption principle, the self-vehicle generates the three-dimensional coordinates of the fourth vertex in the vehicle body coordinate system, and then generates the length and/or width of the first vehicle based on the three-dimensional coordinates of the third vertex and the three-dimensional coordinates of the fourth vertex.
• in another implementation manner, the self-vehicle generates one or more of the following according to the coordinates of the first point and the coordinates of the second point: the length of the first vehicle, the width of the first vehicle, and the height of the first vehicle. Specifically, after the self-vehicle obtains the coordinates of the second point, a second vertex is selected from the second points, where the second vertex is another vertex of the 3D envelope box of the first vehicle.
• according to the coordinates of the second vertex and the ground plane assumption principle, the self-vehicle generates the three-dimensional coordinates of the second vertex in the vehicle body coordinate system, and then generates one or more of the following based on the three-dimensional coordinates of the first vertex and the three-dimensional coordinates of the second vertex: the length of the first vehicle, the width of the first vehicle, and the height of the first vehicle.
• in this way, when the size of the first vehicle cannot be generated from one image of the first vehicle, another image of the first vehicle is used to jointly generate the size of the first vehicle, which ensures that the size of the first vehicle can be generated under various conditions and improves the comprehensiveness of the solution.
• steps 310 to 313 are optional steps. If steps 310 to 313 are not executed, the procedure can end after step 309 is executed. If steps 310 to 313 are executed, the embodiment of the present application does not limit the execution order between steps 310 to 313 and steps 308 and 309; steps 308 and 309 may be executed first and then steps 310 to 313, or steps 310 to 313 may be executed first and then steps 308 and 309.
  • the own vehicle judges whether the first vehicle has a side surface in the first image according to the indication information of the leakage surface of the first vehicle in the first image, if the side surface is not leaked, go to step 315; if the side surface is leaked, go to step 316.
• the self-vehicle generates the heading angle of the first vehicle relative to the self-vehicle according to the coordinates of the center point of the 2D envelope frame and the pinhole imaging principle.
  • the specific manner of executing steps 314 and 315 by the own vehicle can refer to the description in steps 305 and 306, which will not be repeated here.
  • the heading angle of the first vehicle can be generated, which enriches the solution.
  • the self-vehicle uses the second calculation rule to generate the heading angle of the first vehicle relative to the self-vehicle according to the coordinates of the first point.
• specifically, when it is determined that the distance between the first vehicle and the self-vehicle exceeds the preset threshold and the first vehicle leaks out a side surface in the first image, the self-vehicle can generate the heading angle of the first vehicle relative to the self-vehicle according to the coordinates of the first point through the second calculation rule.
• in this case, the heading angle of the first vehicle relative to the self-vehicle can also be generated according to the coordinates of the first point, which further improves the accuracy of the generated heading angle.
• specifically, step 316 may include: the self-vehicle generates the position information of the side line of the first vehicle according to the coordinates of the first point, and generates the coordinates of the vanishing point according to the position information of the side line of the first vehicle and the position information of the vanishing line of the first image, where the vanishing point is the intersection between the side line of the first vehicle and the vanishing line of the first image.
• according to the coordinates of the vanishing point and the two-point perspective principle, the self-vehicle generates the heading angle.
  • the two-point perspective principle can also be called the angled perspective principle or the complementary angle perspective principle.
• the two-point perspective principle means that when the side surface of the first vehicle in the first image and the main surface of the first vehicle both obliquely intersect the image plane of the first image, there are two vanishing points in the first image, and the two vanishing points are on the same horizon line.
  • the position information of the vanishing line of the first image may specifically be expressed as a linear equation of the vanishing line of the first image, and the position of the vanishing line of the first image is only related to the image acquisition device that collects the first image.
• in this way, when the distance between the first vehicle and the self-vehicle exceeds the preset threshold and the side surface of the first vehicle leaks out in the first image, a specific implementation manner of generating the heading angle of the first vehicle is provided; the operation is simple and the efficiency is high.
  • the self-vehicle can obtain the coordinates of two, four, three, or six first points through step 303.
• more specifically, after the self-vehicle has obtained the coordinates of multiple first points, it can select the coordinates of the two first points located on the side line of the first vehicle, that is, obtain the coordinates of the two first points located on the side edge of the bottom surface of the 3D envelope box of the first vehicle.
• according to the coordinates of the aforementioned two first points, the self-vehicle generates the straight-line equation of the side line of the first vehicle.
  • the straight line equation of the vanishing line of the first image can be pre-configured in the own car.
• according to the straight-line equation of the side line and the straight-line equation of the vanishing line, the self-vehicle obtains the coordinates of the intersection between the side line of the first vehicle and the vanishing line of the first image (that is, the coordinates of the vanishing point).
• after obtaining the coordinates of the vanishing point, the self-vehicle generates, according to the coordinates of the vanishing point and the two-point perspective principle, the angle γ between the projection of the second ray on the ground plane in the first image and the x-axis of the camera coordinate system; the angle γ is also the heading angle of the first vehicle in the camera coordinate system.
• the second ray is a ray from the optical center of the camera that collects the first image through the aforementioned vanishing point.
• based on the angle γ and the second transformation relationship, the self-vehicle generates the heading angle Ω of the first vehicle in the vehicle body coordinate system of the self-vehicle; the second transformation relationship refers to the transformation relationship between the camera coordinate system and the vehicle body coordinate system, and may also be referred to as the extrinsic parameters of the camera. A brief sketch of this computation follows after this discussion.
  • the concept of the camera coordinate system and the vehicle body coordinate system can refer to the introduction in the previous steps.
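• As an illustrative aid, the following Python sketch shows the vanishing-point branch of step 316 using homogeneous coordinates: the side line is the cross product of the two first points, the vanishing point is the cross product of the side line and the pre-configured vanishing line, and the second ray through the vanishing point gives the heading angle. K and R_cam_to_body are the hypothetical parameters from the earlier sketches; the vanishing line is written as (a, b, c) with a·u + b·v + c = 0.

```python
import numpy as np

def heading_from_vanishing_point(p1_uv, p2_uv, vanishing_line, K, R_cam_to_body):
    p1 = np.array([*p1_uv, 1.0])
    p2 = np.array([*p2_uv, 1.0])
    side_line = np.cross(p1, p2)                 # straight-line equation of the side line
    vp = np.cross(side_line, vanishing_line)     # vanishing point = intersection of the two lines
    vp = vp / vp[2]                              # normalise the homogeneous coordinates
    ray = np.linalg.inv(K) @ vp                  # second ray through the vanishing point
    gamma = np.arctan2(ray[2], ray[0])           # heading angle gamma in the camera frame
    ray_body = R_cam_to_body @ ray
    omega = np.arctan2(ray_body[1], ray_body[0]) # heading angle Omega in the body frame
    return omega
```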
• in another implementation, step 316 may include: generating the mapping relationship between the first angle and the heading angle according to the coordinates of the first point, the first angle and the pinhole imaging principle, and then generating the heading angle according to the mapping relationship and the first angle.
• the meaning of the pinhole imaging principle can refer to the introduction in the foregoing steps.
  • two possible implementation methods for obtaining the heading angle are provided, which improves the implementation flexibility of the solution.
• specifically, the self-vehicle selects a vertex of the 3D envelope box of the first vehicle from the first points; if the multiple first points include multiple vertices of the 3D envelope box, one vertex can be chosen from among those vertices. For the specific implementation of selecting a vertex of the 3D envelope box of the first vehicle from the first points, reference may be made to the description in step 308, which will not be repeated here.
• the self-vehicle can then generate the mapping relationship between the first angle and the heading angle, and solve for the heading angle according to the mapping relationship and the first angle.
• it should be noted that step 304 is used to determine whether the distance between the first vehicle and the self-vehicle exceeds the preset threshold, and then steps 305 and 314 are used to determine whether the first vehicle leaks out a side surface in the first image.
• the step of determining whether the preset threshold is exceeded and the step of determining whether the first vehicle leaks out a side surface in the first image may also be exchanged; that is, it is first determined, through step 304, whether the first vehicle leaks out a side surface in the first image.
• if the side surface is not leaked out, the self-vehicle generates the heading angle of the first vehicle relative to the self-vehicle by using the coordinates of the center point of the 2D envelope frame and the pinhole imaging principle. If the side surface of the first vehicle leaks out in the first image, it is then determined whether the distance between the first vehicle and the self-vehicle exceeds the preset threshold; if the distance between the first vehicle and the self-vehicle does not exceed the preset threshold, the content described in steps 307 to 313 is executed, and if the distance between the first vehicle and the self-vehicle exceeds the preset threshold, the content described in step 316 is executed.
• in the embodiment of this application, the acquired image is input into the image processing network, and the output of the image processing network is the position information of the two-dimensional envelope frame of the vehicle, the coordinates of the wheels and the first angle; according to the position information of the two-dimensional envelope frame, the coordinates of the wheels and the first angle, the position information of the three-dimensional 3D envelope box of the first vehicle is generated, and the 3D envelope box of the vehicle is thereby located.
• since the accuracy of the three parameters (the position information of the 2D envelope frame, the coordinates of the wheels and the first angle) has nothing to do with whether the vehicle in the image is complete, the coordinates of the first point are accurate regardless of whether the vehicle in the image is complete, and the accuracy of the located 3D envelope box is higher; that is, the accuracy of the acquired 3D envelope box is improved. Further, the driving intention of surrounding vehicles can be determined more accurately, thereby improving the driving safety of the autonomous vehicle.
  • FIG. 8 is a schematic flowchart of a network training method provided by an embodiment of the application.
• the network training method provided by the embodiment of this application may include:
  • the training device obtains training images and annotation data of the training images.
  • a training data set is pre-configured on the training device, and the training data set includes multiple training images, and one or more sets of annotation data corresponding to each training image.
  • One training image includes one or more second vehicles, and each of the aforementioned multiple sets of annotation data corresponds to one second vehicle in the training image.
• in one case, a set of annotation data includes the annotated indication information of the leaking surface of the second vehicle, the annotated coordinates of the wheels of the second vehicle and the annotated first angle of the second vehicle; the first angle of the second vehicle indicates the angle between the side line of the second vehicle and the first axis of the training image.
• the side line of the second vehicle is the connection line between the vanishing point of the leaked side surface of the second vehicle in the training image and the wheels of the second vehicle.
  • the first axis of the training image is parallel to one side of the training image.
  • the labeling data may also include labeling position information of the 2D envelope frame of the second vehicle.
• in another case, a set of annotation data includes the annotated indication information of the leaking surface of the second vehicle, the annotated coordinates of the wheels of the second vehicle, the annotated first angle of the second vehicle, the annotated position information of the boundary line of the second vehicle and the annotated second angle of the second vehicle; the main surface of the second vehicle is the front or the back of the second vehicle, and the boundary line is the boundary line between the side surface and the main surface.
• the second angle of the second vehicle indicates the angle between the main side line of the second vehicle and the first axis of the training image; the main side line of the second vehicle is the connection line between the vanishing point of the leaked main surface of the second vehicle in the training image and the target point of the second vehicle, and the target point of the second vehicle is the intersection of the side line of the second vehicle and the boundary line of the second vehicle.
  • the labeling data may also include labeling position information of the 2D envelope frame of the second vehicle.
  • the label data includes label indication information of the leaking surface of the second vehicle and label position information of the 2D envelope frame.
• the training device inputs the training image into the image processing network to obtain a third result output by the image processing network.
• specifically, after obtaining the training image, the training device inputs the training image into the image processing network to obtain one or more sets of third results output by the image processing network; the number of sets of third results is the same as the number of second vehicles in the training image, and one set of third results is used to indicate the characteristic information of one second vehicle.
• in one case, a set of third results includes the generated indication information of the leakage surface of the second vehicle, the generated coordinates of the wheels of the second vehicle, and the generated first angle of the second vehicle.
  • the third result may also include generated position information of the 2D envelope frame of the second vehicle.
• in another case, a set of third results includes the generated indication information of the leakage surface of the second vehicle, the generated coordinates of the wheels of the second vehicle, the generated first angle of the second vehicle, the generated position information of the boundary line of the second vehicle, and the generated second angle of the second vehicle.
  • the third result may also include generated position information of the 2D envelope frame of the second vehicle.
• in another case, the third result includes the generated indication information of the leakage surface of the second vehicle and the generated position information of the 2D envelope frame of the second vehicle.
  • the image processing network may include a target detection network and a three-dimensional feature extraction network.
  • the aforementioned target detection network may be a one-stage target detection network, a two-stage target detection network, or other types of target detection networks.
  • the two-stage target detection network includes a first feature extraction network, a region proposal network (region proposal network, RPN), and a second feature extraction network.
  • the first feature extraction network is used to perform a convolution operation on the training image to obtain a feature map of the training image, and input the feature map of the training image into the RPN.
  • RPN outputs the position information of one or more 2D envelope boxes according to the feature map of the training image.
  • the first feature extraction network is also used to extract the first feature map from the feature map of the training image according to the position information of the 2D envelope box output by the RPN.
• the first feature map is the part of the feature map of the training image that is located in the 2D envelope frame output by the RPN; the first feature extraction network is also used to generate the category corresponding to each first feature map, that is, to generate the category corresponding to each 2D envelope frame.
• the aforementioned categories include but are not limited to vehicles, street lights, roadblocks, road signs, guardrails, pedestrians, etc.
• the second feature extraction network is used to perform convolution according to the first feature map to obtain more accurate position information of the 2D envelope frame.
  • the second feature extraction network is also used to extract a second feature map from the feature map of the training image, where the second feature map is the feature map in the feature map of the training image located in the 2D envelope box output by the second feature extraction network;
  • the second feature extraction network is also used to generate a category corresponding to each second feature map, that is, to generate a category corresponding to each more accurate 2D envelope box.
  • the image processing network includes a two-stage target detection network and a three-dimensional feature extraction network.
• Step 802 may include: the training device inputs the training image into the two-stage target detection network to obtain the position information of the 2D envelope box of the second vehicle output by the RPN in the two-stage target detection network; the training device extracts the first feature map from the feature map of the training image according to the position information of the 2D envelope box output by the RPN, where the first feature map is the part of the feature map of the training image located in the 2D envelope box output by the RPN; the training device then inputs the first feature map into the three-dimensional feature extraction network to obtain the third result output by the three-dimensional feature extraction network.
• the accuracy of the 2D envelope box directly output by the RPN is relatively low, that is, the accuracy of the first feature map obtained based on the 2D envelope box directly output by the RPN is relatively low, which increases the difficulty of the training phase and thereby improves the robustness of the trained image processing network. A schematic sketch of this wiring follows below.
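• As an illustrative aid only, the following PyTorch-style sketch shows the training-time wiring described above: the first feature map is cropped with the 2D envelope boxes output by the RPN itself and fed to a three-dimensional feature extraction head. The backbone, the RPN and the head below are schematic placeholder modules, not the networks of this application; the output dimensions are example values.

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_align

class ThreeDHead(nn.Module):
    """Placeholder three-dimensional feature extraction network."""
    def __init__(self, channels=256, num_faces=4, num_bins=8):
        super().__init__()
        self.fc = nn.Linear(channels * 7 * 7, 256)
        self.face_visibility = nn.Linear(256, num_faces)  # indication info of the leaking surfaces
        self.wheel_point = nn.Linear(256, 2)               # wheel-point coordinates
        self.boundary_line = nn.Linear(256, 1)             # boundary-line position
        self.angles = nn.Linear(256, 2 * num_bins * 2)     # first/second angle: bin scores + offsets

    def forward(self, roi_feats):
        x = torch.relu(self.fc(roi_feats.flatten(1)))
        return (self.face_visibility(x), self.wheel_point(x),
                self.boundary_line(x), self.angles(x))

def forward_training(backbone, rpn, head, image):
    feat = backbone(image)                                  # feature map of the training image
    boxes = rpn(feat)                                       # [N, 5] tensor: (batch_idx, x1, y1, x2, y2)
    first_feature_maps = roi_align(feat, boxes, output_size=(7, 7))
    return head(first_feature_maps)                         # one set of "third results" per box
```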
  • FIG. 9 is a schematic flowchart of a network training method provided in an embodiment of this application.
• as shown in Figure 9, the image processing network includes the first feature extraction network, the RPN, the second feature extraction network, and the three-dimensional feature extraction network.
• the first feature map is extracted from the feature map of the training image, and the category corresponding to the first feature map is generated, that is, the category corresponding to the position information of the 2D envelope box output by the RPN is generated according to the first feature map; the first feature map and the category corresponding to the 2D envelope box output by the RPN are then input into the three-dimensional feature extraction network.
• the three-dimensional feature extraction network generates three-dimensional feature information according to the first feature map and the category, and the second feature extraction network is no longer used to secondarily correct the position information of the 2D envelope box output by the RPN, thereby increasing the difficulty of the training process. It should be understood that the example in Figure 9 is only to facilitate the understanding of this solution; in actual products, the three-dimensional feature extraction network may also output fewer or more types of three-dimensional feature information, which is not limited here.
• in another implementation, step 802 may include: the training device inputs the training image into the target detection network to obtain the second feature map output by the entire target detection network, and inputs the second feature map into the three-dimensional feature extraction network to obtain the third result output by the three-dimensional feature extraction network.
  • FIG. 10 is a schematic flowchart of a network training method provided by an embodiment of this application.
• as shown in Figure 10, the image processing network includes the first feature extraction network, the RPN, the second feature extraction network, and the three-dimensional feature extraction network.
• the second feature extraction network performs convolution again according to the first feature map to obtain more accurate position information of the 2D envelope frame; according to the more accurate position information of the 2D envelope frame, the second feature map is extracted from the feature map of the training image, and the category corresponding to the second feature map is generated, that is, category 2 corresponding to the position information of the 2D envelope frame output by the second feature extraction network is generated according to the second feature map. The three-dimensional feature extraction network then generates three-dimensional feature information based on the second feature map and category 2 output by the second feature extraction network. It should be understood that the example in FIG. 10 is only to facilitate the understanding of the solution, and is not used to limit the solution.
  • the training device uses the loss function to train the image processing network according to the labeled data and the third result until the convergence condition of the loss function is met, and then outputs the trained image processing network.
• specifically, after the training device obtains the annotation data and the third result, it can use the loss function to train the image processing network once; the training device uses the multiple training images in the training data set and the annotation data of each training image to iteratively train the image processing network until the convergence condition of the loss function is met.
  • the loss function may specifically be an L1 loss function, a cross-entropy loss function, and/or other types of loss functions.
• in one case, the loss function is used to increase the similarity between the generated indication information of the leakage surface and the annotated indication information of the leakage surface, to increase the similarity between the generated coordinates of the wheels and the annotated coordinates of the wheels, and to increase the similarity between the generated first angle and the annotated first angle.
• further, the loss function is also used to increase the similarity between the generated position information of the 2D envelope frame and the annotated position information of the 2D envelope frame.
• further, the loss function is also used to increase the similarity between the generated position information of the boundary line and the annotated position information of the boundary line, and to increase the similarity between the generated second angle and the annotated second angle.
• in another case, the loss function is used to increase the similarity between the generated indication information of the leakage surface and the annotated indication information of the leakage surface, and to increase the similarity between the generated position information of the 2D envelope frame and the annotated position information of the 2D envelope frame.
• specifically, the training device can obtain the loss function value of each type of information one by one and then sum them to obtain the final loss function value; alternatively, after obtaining the loss function value of each type of information one by one, it can perform a weighted summation to obtain the final loss function value. According to the final loss function value, a gradient value is generated, and the gradient value is used to update the parameters of the image processing network, thereby completing one training of the image processing network; a brief sketch of such a training step follows below.
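• As an illustrative aid, the following Python sketch shows one such training step under the assumption that each per-type loss is available as a separate callable: the per-type loss values are combined by a weighted sum into the final loss value, and the gradient of that value updates the parameters of the image processing network. The weights are hypothetical hyper-parameters, not values given in this application.

```python
import torch

def training_step(network, optimizer, batch, loss_fns, weights):
    outputs = network(batch["image"])                   # third result for this training image
    losses = {name: fn(outputs, batch) for name, fn in loss_fns.items()}
    total = sum(weights.get(name, 1.0) * value          # weighted sum -> final loss value
                for name, value in losses.items())
    optimizer.zero_grad()
    total.backward()                                    # gradient value
    optimizer.step()                                    # update the network parameters once
    return total.detach()
```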
• the following shows the specific form of the loss functions corresponding to the various types of information. Taking the loss function corresponding to the indication information of the leaking surface as an example, the L1 loss function is used, and the formula can be as follows:
• L_side_visi = Σ_i Σ_j Σ_m x_ij^k · | s_jm^k − ŝ_im^k |, where L_side_visi represents the loss function corresponding to the indication information of the leakage surface, the k-th category refers to vehicles, and m ∈ {front, back, left, right} represents the front, back, left and right surfaces of the second vehicle. s_jm^k represents the annotated indication information of whether the m surface of the second vehicle in the j-th annotated 2D envelope box leaks out: if the value of s_jm^k is 1, the second vehicle has leaked out the m surface in the training image; if the value of s_jm^k is 0, the second vehicle does not leak out the m surface in the training image. ŝ_im^k represents the generated indication information, output by the image processing network, of whether the m surface of the second vehicle in the i-th 2D envelope frame output by the RPN leaks out: if the value of ŝ_im^k is 1, the image processing network predicts that the second vehicle has leaked out the m surface in the training image; if the value of ŝ_im^k is 0, the image processing network predicts that the second vehicle does not leak out the m surface in the training image.
• for the i-th 2D envelope box output by the RPN, the training device calculates the intersection over union (IOU) between the i-th 2D envelope box and each annotated 2D envelope box in the k-th category.
• a threshold for the intersection-over-union is preset on the training device: if the generated intersection-over-union is greater than the aforementioned threshold, x_ij^k takes the value 1, and if it is less than or equal to the aforementioned threshold, x_ij^k takes the value 0. As an example, the intersection-over-union threshold can be 0.5; a brief sketch of this matching computation follows below.
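• As an illustrative aid, the following Python sketch shows the matching indicator described above: x_ij is 1 when the intersection-over-union between the i-th 2D envelope box output by the RPN and the j-th annotated 2D envelope box exceeds the threshold (0.5 in this example), and 0 otherwise. Boxes are written as (x1, y1, x2, y2).

```python
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))          # width of the intersection
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))          # height of the intersection
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def match_indicator(rpn_boxes, labeled_boxes, threshold=0.5):
    """x[i][j] = 1 if the i-th RPN box matches the j-th annotated box, else 0."""
    return [[1 if iou(p, g) > threshold else 0 for g in labeled_boxes]
            for p in rpn_boxes]
```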
• taking the loss function corresponding to the coordinates of the wheels as an example, the formula can be as follows:
• L_wheel = Σ_i Σ_j x_ij^k · ( | û_i^w − (u_j^w − u_i^c) | + | v̂_i^w − (v_j^w − v_i^c) | ), where L_wheel represents the loss function corresponding to the coordinates of the wheels; û_i^w represents the x coordinate (also called the u coordinate) and v̂_i^w the y coordinate (also called the v coordinate) generated by the image processing network for the wheel point of the second vehicle in the i-th 2D envelope frame output by the RPN; u_j^w and v_j^w represent the annotated x coordinate and y coordinate of the wheel point of the second vehicle in the j-th annotated 2D envelope frame; and u_i^c and v_i^c represent the x coordinate and y coordinate of the center point generated by the image processing network for the i-th 2D envelope frame output by the RPN.
• taking the loss function corresponding to the position information of the boundary line as an example, the formula can be as follows:
• L_boundary = Σ_i Σ_j x_ij^k · | b̂_i − (u_j^b − u_i^c) |, where L_boundary represents the loss function corresponding to the position information of the boundary line; b̂_i represents the position information generated by the image processing network for the boundary line of the second vehicle in the i-th 2D envelope frame output by the RPN; u_j^b represents the annotated x coordinate of the boundary line of the second vehicle in the j-th annotated 2D envelope frame; and u_i^c represents the x coordinate of the center point generated by the image processing network for the i-th 2D envelope frame output by the RPN.
• for the position information of the 2D envelope frame, the commonly used loss function is the L1 loss function.
• specifically, the training device can calculate the deviation between the generated center point coordinates of the i-th 2D envelope frame and the annotated center point coordinates of the j-th annotated 2D envelope frame, and calculate the log value of the ratio between the generated length and width of the i-th 2D envelope frame and the annotated length and width of the j-th annotated 2D envelope frame.
  • the loss function corresponding to the category of the 2D envelope box may be a cross-entropy loss function.
• taking the loss function corresponding to the first angle and the second angle as an example, L_degree represents the loss function corresponding to the first angle and the second angle, alpha represents the first angle, and beta represents the second angle.
• the training device divides the 360-degree angle into bin intervals, and the number of degrees occupied by each interval is delta. For each angle m1 (m1 being alpha or beta) and each interval w, an indicator takes the value 1 if the generated m1 is in the interval w, and takes the value 0 if the generated m1 is not in the interval w; the loss also compares the generated angular offset of m1 relative to the center of the interval w with the annotated angular offset. A brief sketch of this bin-plus-offset angle encoding follows below.
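• As an illustrative aid, the following Python sketch shows a bin-plus-offset angle encoding of the kind described above: 360 degrees are divided into bin intervals of delta degrees each, and an angle is represented by the index of the interval it falls in together with its offset from that interval's centre. The exact loss formula of this application is not reproduced here; the number of bins is an example value.

```python
def encode_angle(angle_deg, bins=8):
    delta = 360.0 / bins
    idx = int((angle_deg % 360.0) // delta)        # interval w containing the angle
    centre = idx * delta + delta / 2.0
    offset = (angle_deg % 360.0) - centre          # angular offset relative to the interval centre
    return idx, offset

def decode_angle(idx, offset, bins=8):
    delta = 360.0 / bins
    return (idx * delta + delta / 2.0 + offset) % 360.0

# Example: encode_angle(200.0) -> (4, -2.5); decode_angle(4, -2.5) -> 200.0
```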
  • the training device uses multiple training images in the training data set to iteratively train the image processing network until the convergence condition of the loss function is met, and the training device will output the trained image processing network.
• the aforementioned convergence condition may be that the value of the loss function satisfies a convergence criterion, that the number of iterations reaches a preset number, or another type of convergence condition, which is not limited here.
• the output image processing network includes a target detection network and a three-dimensional feature extraction network. In the case that the aforementioned target detection network is a two-stage target detection network, no matter whether the training device adopts the implementation manner corresponding to FIG. 9 or the implementation manner corresponding to FIG. 10 in step 802, the output image processing network includes the first feature extraction network, the RPN, the second feature extraction network, and the three-dimensional feature extraction network.
• in the embodiment of this application, the acquired image is input into the image processing network, and the output of the image processing network is the position information of the two-dimensional envelope frame of the vehicle, the coordinates of the wheels, and the first angle; the first angle refers to the angle between the side line of the vehicle and the first axis.
  • FIG. 11 is a schematic flowchart of the image processing method provided by the embodiment of the application.
• the image processing method may include:
  • the vehicle acquires a third image.
  • the third image includes a first rigid body, and the first rigid body is a cube.
  • the specific implementation manner of executing step 1101 by the own vehicle is similar to the specific implementation manner of step 301 in the embodiment corresponding to FIG. 3, and will not be repeated here.
  • the difference is that the object that must be included in the first image is the first vehicle, and the object that must be included in the third image is the first rigid body.
  • the shape of the first rigid body is a cube.
• the first rigid body can be a whole object or a part of an object. As an example, the whole object is a guardrail in the middle of a road, and the first rigid body can refer to the cube-shaped part at the bottom of the guardrail.
  • the vehicle inputs the third image into the image processing network to obtain the second result output by the image processing network.
• in one case, each set of second results may include the position information of the two-dimensional 2D envelope frame of the first rigid body and the first angle.
• in another case, each set of second results can include the position information of the 2D envelope frame of the first rigid body; the aforementioned main surface refers to the front or the back.
• optionally, the second result may also include the position information of the boundary line of the first rigid body and the second angle of the first rigid body. Further optionally, the second result may also include the indication information of the leakage surface of the first rigid body in the third image.
• the specific manner in which the self-vehicle executes step 1102 is similar to the specific implementation manner of step 302 in the embodiment corresponding to FIG. 3.
  • the meaning of the second result in the embodiment corresponding to FIG. 11 is similar to the meaning of the second result in the embodiment corresponding to FIG. 3.
  • the difference is that, firstly, the various information included in the first result is used to describe the characteristics of the first vehicle, and the various information included in the second result is used to describe the characteristics of the first rigid body.
  • since the first rigid body is a cube, it has no protruding wheels like the first vehicle, so the second result contains no information such as wheel coordinates.
  • optionally, the second result can also include the coordinates of the lower-left vertex and/or the coordinates of the lower-right vertex of the 2D envelope frame of the first rigid body, which substitute for the coordinates of the wheels in the first result of the embodiment corresponding to FIG. 3.
  • when the second result includes the position information of the boundary line of the first rigid body, the second result may also include one or more of the following: the coordinates of the intersection of the boundary line of the first rigid body with the bottom edge of the 2D envelope frame of the first rigid body, and the coordinates of the lower-left vertex and the lower-right vertex of the 2D envelope frame of the first rigid body, which replace the coordinates of the wheels in the first result of the embodiment corresponding to FIG. 3.
  • the specific meaning of each type of information included in the second result please refer to the description in the embodiment corresponding to FIG. 3, which will not be repeated here.
  • the vehicle generates position information of the three-dimensional 3D outer envelope box of the first rigid body according to the second result.
  • when the first rigid body only leaks the main surface or only the side surface in the third image, if the second result does not include the coordinates of the lower-left vertex and/or the lower-right vertex of the 2D envelope frame of the first rigid body, the vehicle can also generate those coordinates from the position information of the 2D envelope frame of the first rigid body and use them instead of the coordinates of the wheels in the first result of the embodiment corresponding to FIG. 3.
  • the vehicle can generate, based on the position information of the 2D envelope frame of the first rigid body and the position information of the boundary line of the first rigid body, one or more of the following: the coordinates of the intersection of the boundary line of the first rigid body with the bottom edge of the 2D envelope frame of the first rigid body, and the coordinates of the lower-left vertex and the lower-right vertex of the 2D envelope frame of the first rigid body, and use them to replace the coordinates of the wheels in the first result of the embodiment corresponding to FIG. 3.
  • the self-vehicle judges whether the distance between the first rigid body and the self-vehicle exceeds a preset threshold, if it does not exceed the preset threshold, go to step 1105; if it exceeds the preset threshold, go to step 1114.
  • the vehicle judges whether the first rigid body has a side surface in the third image according to the indication information of the leakage surface of the first rigid body in the third image. If the side surface is not leaked, go to step 1106; if the side surface is leaked, go to step 1107.
  • the self-vehicle generates an orientation angle of the first rigid body relative to the self-vehicle according to the coordinates of the center point of the 2D envelope frame of the first rigid body and the principle of pinhole imaging.
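As a rough illustration of the pinhole-imaging reasoning in this step, the direction toward the object can be recovered from the horizontal pixel coordinate of the 2D envelope frame's center and the camera intrinsics. The intrinsic values below are hypothetical placeholders, not values from the patent.

```python
import math

def orientation_from_box_center(u_center, fx, cx):
    """Angle of the ray from the camera to the object, derived from the
    horizontal image coordinate of the 2D box center (pinhole model)."""
    # For a pinhole camera, pixel u maps to a ray with tan(theta) = (u - cx) / fx.
    return math.atan2(u_center - cx, fx)

# Hypothetical intrinsics: focal length 1000 px, principal point at u = 960.
theta = orientation_from_box_center(u_center=1200.0, fx=1000.0, cx=960.0)
print(math.degrees(theta))
```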
  • the self-vehicle generates the orientation angle of the first rigid body relative to the self-vehicle from the coordinates of the third points through the first calculation rule.
  • the vehicle obtains the coordinates of the vertices of the 3D outer envelope box of the first rigid body from the coordinates of at least two third points.
  • according to the coordinates of the vertices of the 3D outer envelope box of the first rigid body and the ground plane assumption principle, the self-vehicle generates the three-dimensional coordinates of the centroid of the first rigid body in the vehicle body coordinate system.
  • the origin of the vehicle body coordinate system is located in the self-vehicle.
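The ground plane assumption used here can be sketched as back-projecting an image point onto the plane at the known camera height. The intrinsics, camera height, and the simple camera-frame result below are illustrative assumptions; an additional (assumed) rigid transform would map the point into the vehicle body coordinate system.

```python
import numpy as np

def backproject_to_ground(u, v, K, camera_height):
    """Back-project pixel (u, v) onto the ground plane y = camera_height
    (camera coordinates: x right, y down, z forward)."""
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])   # direction of the viewing ray
    scale = camera_height / ray[1]                   # intersect the ray with the ground plane
    return ray * scale                               # 3D point in camera coordinates

# Hypothetical intrinsics and a camera mounted 1.5 m above the ground.
K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
point_cam = backproject_to_ground(1100.0, 700.0, K, camera_height=1.5)
print(point_cam)
```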
  • the self-vehicle generates the size of the first rigid body according to the coordinates of the third point and the ground plane assumption principle.
  • the vehicle acquires a fourth image.
  • the fourth image includes a first rigid body.
  • the fourth image and the third image have different image acquisition angles.
  • the vehicle obtains the coordinates of at least two fourth points through the image processing network, and the at least two fourth points are all located on the side of the 3D outer envelope box of the first rigid body.
  • the vehicle generates the size of the first rigid body according to the coordinates of the third point and the fourth point.
  • the vehicle judges whether the first rigid body leaks the side surface in the third image. If the side surface is not leaked, go to step 1116; if the side surface is leaked, go to step 1117.
  • according to the coordinates of the center point of the 2D envelope frame of the first rigid body and the principle of pinhole imaging, the self-vehicle generates the orientation angle of the first rigid body relative to the self-vehicle.
  • the self-vehicle generates the orientation angle of the first rigid body relative to the self-vehicle from the coordinates of the third points through the second calculation rule.
  • the specific manner of executing steps 1103 to 1116 by the vehicle is similar to the specific implementation manner of steps 303 to 316 in the embodiment corresponding to FIG. 3; refer to the description of steps 303 to 316 in the embodiment corresponding to FIG. 3, which will not be repeated here.
  • not only can the 3D envelope box of a vehicle be positioned, but the 3D envelope box of a general rigid body can also be positioned, which greatly expands the application scenarios of this solution.
  • FIG. 12 is a schematic diagram of the 3D envelope box in the image processing method provided by the embodiments of the present application.
  • Figure 12 includes two sub-schematics (a) and (b).
  • sub-schematic (a) of Figure 12 shows one side of the 3D envelope box of the vehicle positioned using the current industry solution, and sub-schematic (b) of Figure 12 shows one side of the 3D envelope box positioned using the solution provided in the embodiment of this application.
  • as can be seen, the 3D envelope box positioned by the solution in the embodiment of this application is more accurate.
  • the solution provided by the embodiments of this application can be used to obtain more accurate 3D feature information.
  • the beneficial effects brought by the embodiments of this application are further illustrated below with data, for the case where the complete vehicle is not included in the image.
  • for the heading angle, the error generated by the current industry solution is 22.5 degrees, while the error generated by the solution provided by the embodiment of this application is 6.7 degrees, an improvement of about 70%;
  • the error rate of the centroid position generated by the current industry solution is 18.4%, while the error rate of the centroid position generated by the solution provided by the embodiment of this application is 6.2%, an improvement of about 66%.
  • FIG. 13 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the application.
  • the image processing apparatus 1300 may include an acquisition module 1301, an input module 1302, and a generation module 1303.
  • the acquisition module 1301 is used to obtain a first image, which includes the first vehicle;
  • the input module 1302 is used to input the first image into the image processing network to obtain the first result output by the image processing network.
  • the first result includes the position information of the two-dimensional 2D envelope frame of the first vehicle, the coordinates of the wheels of the first vehicle, and the first angle of the first vehicle.
  • the first angle indicates the angle between the side line of the first vehicle and the first axis of the first image.
  • the side line of the first vehicle is the line of intersection between the leaked side surface of the first vehicle and the ground plane where the first vehicle is located, and the first axis of the first image is parallel to one side of the first image;
  • the generating module 1303 is used to generate the position information of the three-dimensional 3D outer envelope box of the first vehicle according to the position information of the 2D envelope frame of the first vehicle, the coordinates of the wheels, and the first angle.
  • the position information of the 3D outer envelope box of the first vehicle includes the coordinates of at least two first points, the at least two first points are all located on the edges of the 3D outer envelope box of the first vehicle, two of the at least two first points locate an edge of the 3D outer envelope box of the first vehicle, and the coordinates of the at least two first points are used to locate the 3D outer envelope box of the first vehicle.
  • the at least two first points include two intersections of the side line of the first vehicle with the 2D envelope frame of the first vehicle.
  • the generating module 1303 is specifically configured to: generate the position information of the side line of the first vehicle according to the coordinates of the wheels of the first vehicle and the first angle of the first vehicle; and perform a coordinate generation operation according to the position information of the side line of the first vehicle and the position information of the 2D envelope frame of the first vehicle to obtain the coordinates of the at least two first points.
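A minimal geometric sketch of this coordinate generation operation: the side line is taken as the line through the wheel point with slope given by the first angle, and its intersections with the left and right borders of the 2D envelope frame give two first points. Function and variable names, and the numeric values, are illustrative assumptions.

```python
import math

def first_points_from_side_line(wheel_uv, first_angle_rad, box):
    """Intersect the side line (through the wheel point, at the first angle
    relative to the image U axis) with the left/right borders of the 2D box.

    box = (u_min, v_min, u_max, v_max) in image coordinates."""
    u0, v0 = wheel_uv
    u_min, v_min, u_max, v_max = box
    k = math.tan(first_angle_rad)            # slope of the side line in the image
    p_left = (u_min, v0 + k * (u_min - u0))  # intersection with the left border
    p_right = (u_max, v0 + k * (u_max - u0)) # intersection with the right border
    return p_left, p_right

left_pt, right_pt = first_points_from_side_line(
    wheel_uv=(620.0, 700.0), first_angle_rad=math.radians(15.0),
    box=(500.0, 520.0, 900.0, 760.0))
```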
  • the first result also includes the position information of the boundary line of the first vehicle and the second angle of the first vehicle,
  • the dividing line is the dividing line between the side and the main surface.
  • the main surface of the first vehicle is the front or back of the first vehicle.
  • the second angle of the first vehicle indicates the angle between the main sideline of the first vehicle and the first axis of the first image.
  • the main sideline of the first vehicle is the line of intersection between the leaked main surface of the first vehicle and the ground plane where the first vehicle is located; the at least two first points include a first intersection, a second intersection, and a third intersection, where the first intersection is the intersection of the side line of the first vehicle with the boundary line of the first vehicle and is a vertex of the 3D envelope box of the first vehicle, the second intersection is the intersection of the side line of the first vehicle with the 2D envelope frame of the first vehicle, and the third intersection is the intersection of the main sideline of the first vehicle with the 2D envelope frame of the first vehicle.
  • the generating module 1303 is specifically configured to: generate the position information of the side line of the first vehicle according to the coordinates of the wheels of the first vehicle and the first angle of the first vehicle; generate the coordinates of the first intersection according to the position information of the side line of the first vehicle and the position information of the boundary line of the first vehicle; generate the coordinates of the second intersection according to the position information of the side line of the first vehicle and the position information of the 2D envelope frame of the first vehicle; generate the position information of the main sideline of the first vehicle according to the coordinates of the first intersection and the second angle of the first vehicle; and generate the coordinates of the third intersection according to the position information of the main sideline of the first vehicle and the position information of the 2D envelope frame of the first vehicle.
  • the generating module 1303 is also used to generate the heading angle of the first vehicle relative to the self-vehicle according to the coordinates of the first point when the side of the first vehicle leaks from the first image.
  • the generating module 1303 is also used to generate the distance between the first point and the self-vehicle based on the coordinates of the first point and the ground plane assumption principle; the generating module 1303 is specifically used to: when it is determined, according to the distance between the first point and the self-vehicle, that the distance between the first vehicle and the self-vehicle does not exceed the preset threshold, generate the heading angle according to the coordinates of the first point through the first calculation rule; and when it is determined, according to the distance between the first point and the self-vehicle, that the distance between the first vehicle and the self-vehicle exceeds the preset threshold, generate the heading angle according to the coordinates of the first point through the second calculation rule, where the second calculation rule and the first calculation rule are different calculation rules.
  • the generating module 1303 is specifically used to generate the three-dimensional coordinates of the first point in the vehicle body coordinate system according to the coordinates of the first point and the ground plane assumption principle, where the origin of the vehicle body coordinate system is located in the self-vehicle, and to generate the heading angle according to the three-dimensional coordinates of the first point.
  • the generating module 1303 is specifically used to: generate the position information of the side line of the first vehicle according to the coordinates of the first point and the first angle of the first vehicle; generate the coordinates of the vanishing point according to the position information of the side line of the first vehicle and the position information of the vanishing line of the first image, where the vanishing point is the intersection between the side line of the first vehicle and the vanishing line of the first image; and generate the heading angle according to the coordinates of the vanishing point and the principle of two-point perspective.
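A sketch, under assumed intrinsics, of this two-point-perspective step: the vanishing point is the intersection of the side line with the image's vanishing line (horizon), and the 3D direction of the corresponding ray gives the heading of the side edge. All numeric values and names here are assumptions for illustration.

```python
import numpy as np

def heading_from_vanishing_point(side_line, vanishing_line, K):
    """side_line / vanishing_line are homogeneous line vectors (a, b, c) with
    a*u + b*v + c = 0; their cross product is the homogeneous vanishing point."""
    vp = np.cross(side_line, vanishing_line)     # homogeneous vanishing point
    direction = np.linalg.inv(K) @ vp            # 3D direction of the side edge
    # Heading in the horizontal (camera x-z) plane, per two-point perspective.
    return float(np.arctan2(direction[0], direction[2]))

K = np.array([[1000.0, 0.0, 960.0],
              [0.0, 1000.0, 540.0],
              [0.0, 0.0, 1.0]])
# Hypothetical lines: side line v = 0.2*u + 100, horizon v = 540.
side = np.array([0.2, -1.0, 100.0])
horizon = np.array([0.0, 1.0, -540.0])
print(heading_from_vanishing_point(side, horizon, K))
```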
  • the generating module 1303 is specifically used to: generate the mapping relationship between the first angle of the first vehicle and the heading angle according to the coordinates of the first point, the first angle of the first vehicle, and the principle of pinhole imaging; and generate the heading angle according to the mapping relationship and the first angle of the first vehicle.
  • the acquiring module 1301 is also used to acquire the coordinates of the vertices of the 3D envelope box of the first vehicle from the coordinates of the at least two first points; the generating module 1303 is also used to generate the three-dimensional coordinates of the centroid point of the first vehicle in the vehicle body coordinate system according to the coordinates of the vertices of the 3D envelope box of the first vehicle and the ground plane assumption principle, where the origin of the vehicle body coordinate system is located in the self-vehicle.
  • the obtaining module 1301 is further configured to obtain the coordinates of a first vertex from the coordinates of at least two first points, where the first vertex is a vertex of the 3D envelope box of the first vehicle;
  • the generating module 1303 is also used to generate the three-dimensional coordinates of the first vertex in the vehicle body coordinate system according to the coordinates of the first vertex and the ground plane assumption principle;
  • the generating module 1303 is also used to, if the at least two first points include at least two first vertices, generate one or more of the following according to the three-dimensional coordinates of the first vertices in the vehicle body coordinate system: the length of the first vehicle, the width of the first vehicle, and the height of the first vehicle, where the origin of the vehicle body coordinate system is located in the self-vehicle.
  • the obtaining module 1301 is also used to obtain a second image if only one first vertex is included in the at least two first points, where the second image includes the first vehicle and the image acquisition angles of the second image and the first image are different;
  • the generating module 1303 is also used to obtain the coordinates of at least two second points through the image processing network according to the second image, where the at least two second points are all located on the edges of the three-dimensional 3D outer envelope box of the first vehicle, two of the at least two second points locate an edge of the 3D outer envelope box of the first vehicle, and the coordinates of the at least two second points are used to locate the 3D outer envelope box of the first vehicle;
  • the generating module 1303 is also used to generate the three-dimensional coordinates of the second vertex in the vehicle body coordinate system according to the coordinates of the second points and the ground plane assumption principle, where the second vertex is a vertex of the 3D envelope box of the first vehicle and the second vertex and the first vertex are different vertices; the generating module 1303 is also used to generate one or more of the following according to the three-dimensional coordinates of the first vertex and the second vertex: the length of the first vehicle, the width of the first vehicle, and the height of the first vehicle.
  • FIG. 14 is a schematic structural diagram of an image processing device provided by an embodiment of the application.
  • the image processing apparatus 1400 may include an acquisition module 1401, an input module 1402, and a generation module 1403.
  • the acquisition module 1401 is used to acquire a third image, where the third image includes a first rigid body and the first rigid body is a cube; the input module 1402 is used to input the third image into the image processing network to obtain a second result output by the image processing network;
  • in the case where the first rigid body leaks a side surface in the third image, the second result includes the position information of the 2D envelope frame of the first rigid body and the first angle of the first rigid body, where the first angle of the first rigid body indicates the angle between the side line of the first rigid body and the first axis of the third image;
  • the side line of the first rigid body is the line of intersection between the leaked side surface of the first rigid body and the ground plane where the first rigid body is located, and the first axis of the third image is parallel to one side of the third image;
  • the generating module 1403 is used to generate the position information of the three-dimensional 3D outer envelope box of the first rigid body according to the position information of the 2D envelope frame of the first rigid body and the first angle.
  • the position information of the 3D outer envelope box of the first rigid body includes the coordinates of at least two third points.
  • the at least two third points are located on the edge of the 3D outer envelope box of the first rigid body.
  • two of the at least two third points locate an edge of the 3D outer envelope box of the first rigid body, and the coordinates of the at least two third points are used to locate the 3D outer envelope box of the first rigid body.
  • the at least two third points include two intersections of the side line of the first rigid body with the 2D envelope frame of the first rigid body.
  • in the case where the first rigid body leaks the side surface and the main surface in the third image, the second result also includes the position information of the boundary line of the first rigid body and the second angle of the first rigid body,
  • the dividing line is the dividing line between the side and the main surface.
  • the main surface of the first rigid body is the front or back of the first rigid body.
  • the second angle of the first rigid body indicates the angle between the main edge of the first rigid body and the first axis of the third image.
  • the main edge of the first rigid body is the line of intersection between the leaked main surface of the first rigid body and the ground plane where the first rigid body is located; the at least two third points include a first intersection, a second intersection, and a third intersection, where the first intersection is the intersection between the side line of the first rigid body and the boundary line of the first rigid body and is a vertex of the 3D envelope box of the first rigid body, the second intersection is the intersection of the side line of the first rigid body with the 2D envelope frame of the first rigid body, and the third intersection is the intersection of the main edge of the first rigid body with the 2D envelope frame of the first rigid body.
  • the generating module 1403 is further configured to generate three-dimensional feature information of the first rigid body according to the coordinates of at least two third points, and the three-dimensional feature information of the first rigid body includes one or more of the following :
  • the orientation angle of the first rigid body relative to the self-vehicle, the position information of the centroid of the first rigid body, and the size of the first rigid body.
  • FIG. 15 is a schematic structural diagram of the network training device provided by an embodiment of the application.
  • the network training apparatus 1500 may include an acquisition module 1501, an input module 1502, and a training module 1503.
  • the acquisition module 1501 is used to acquire training images and annotation data of the training images.
  • the training images include the second vehicle.
  • the annotation data includes the labeled coordinates of the wheels of the second vehicle and the labeled first angle of the second vehicle.
  • the first angle of the second vehicle indicates the angle between the side line of the second vehicle and the first axis of the training image.
  • the side line of the second vehicle is the line of intersection between the leaked side surface of the second vehicle and the ground plane where the second vehicle is located, and the first axis of the training image is parallel to one side of the training image;
  • the input module 1502 is used to input the training image into the image processing network to obtain a third result output by the image processing network, where the third result includes the generated coordinates of the wheels of the second vehicle and the generated first angle of the second vehicle;
  • the training module 1503 is used to train the image processing network with the loss function according to the annotation data and the third result until the convergence condition of the loss function is met, and to output the trained image processing network, where the loss function is used to increase the similarity between the generated coordinates and the labeled coordinates and the similarity between the generated first angle and the labeled first angle.
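A minimal sketch of how such a loss could be written is given below; the smooth-L1 choice, weighting, and tensor names are illustrative assumptions rather than the patent's exact loss.

```python
import torch
import torch.nn.functional as F

def third_result_loss(pred_wheel, gt_wheel, pred_angle, gt_angle,
                      pred_boundary=None, gt_boundary=None,
                      pred_angle2=None, gt_angle2=None, w=1.0):
    """Regression loss pulling the generated wheel coordinates and first angle
    toward the labeled values; optional terms cover the boundary line and the
    second angle when the main surface is also labeled."""
    loss = F.smooth_l1_loss(pred_wheel, gt_wheel) + w * F.smooth_l1_loss(pred_angle, gt_angle)
    if pred_boundary is not None:
        loss = loss + F.smooth_l1_loss(pred_boundary, gt_boundary)
    if pred_angle2 is not None:
        loss = loss + w * F.smooth_l1_loss(pred_angle2, gt_angle2)
    return loss
```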
  • the annotation data also includes the labeled position information of the boundary line of the second vehicle and the labeled second angle of the second vehicle.
  • the third result also includes the generated position information of the boundary line of the second vehicle and the generated second angle of the second vehicle.
  • the loss function is also used to increase the similarity between the generated position information and the labeled position information, and the similarity between the generated second angle and the labeled second angle;
  • the main surface of the second vehicle is the front or back of the second vehicle
  • the dividing line is the dividing line between the side and the main surface
  • the second angle of the second vehicle indicates the angle between the main sideline of the second vehicle and the first axis of the training image, and the main sideline of the second vehicle is the line of intersection between the leaked main surface of the second vehicle and the ground plane where the second vehicle is located.
  • the image processing network includes a two-stage target detection network and a three-dimensional feature extraction network
  • the two-stage target detection network includes a region generation network RPN.
  • the input module 1502 is specifically configured to: input the training image into the two-stage target detection network to obtain the position information of the 2D envelope frame of the second vehicle output by the RPN in the two-stage target detection network; and input a first feature map into the three-dimensional feature extraction network to obtain the third result output by the three-dimensional feature extraction network, where the first feature map is the feature map of the training image within the 2D envelope frame output by the RPN;
  • the training module 1503 is specifically used to output an image processing network including the two-stage target detection network and the three-dimensional feature extraction network.
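A rough sketch of the data flow described above (backbone, RPN, per-ROI features, then a 3D feature head) is shown below; the module names and the use of torchvision's roi_align are assumptions for illustration, not the patent's implementation.

```python
import torch
import torchvision.ops as ops

def forward_two_stage(backbone, rpn, head_3d, image):
    """image: (1, 3, H, W). Returns 2D boxes from the RPN and the third result
    (wheel coordinates, first angle, ...) from the 3D feature extraction head."""
    feat = backbone(image)                       # first feature extraction network
    boxes = rpn(feat)                            # (N, 4) 2D envelope frames
    rois = torch.cat([torch.zeros(len(boxes), 1), boxes], dim=1)  # prepend batch index
    # Feature map of the image within each 2D envelope frame (spatial_scale assumed 1.0).
    roi_feat = ops.roi_align(feat, rois, output_size=(7, 7))
    return boxes, head_3d(roi_feat)              # third result per 2D envelope frame
```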
  • FIG. 16 is a schematic structural diagram of the execution device provided in an embodiment of this application; the execution device 1600 may be embodied as a processing chip or in other product forms, etc.
  • the image processing device 1300 described in the embodiment corresponding to FIG. 13 may be deployed on the execution device 1600 to implement the function of the self-car in the embodiment corresponding to FIG. 3 to FIG. 7.
  • the image processing device 1400 described in the embodiment corresponding to FIG. 14 may be deployed on the execution device 1600 to implement the function of the self-car in the embodiment corresponding to FIG. 11.
  • the execution device 1600 includes: a receiver 1601, a transmitter 1602, a processor 1603, and a memory 1604 (the number of processors 1603 in the execution device 1600 may be one or more, and one processor is taken as an example in FIG. 16), where the processor 1603 may include an application processor 16031 and a communication processor 16032. In some embodiments of the present application, the receiver 1601, the transmitter 1602, the processor 1603, and the memory 1604 may be connected by a bus or in other ways.
  • the memory 1604 may include a read-only memory and a random access memory, and provides instructions and data to the processor 1603. A part of the memory 1604 may also include a non-volatile random access memory (NVRAM).
  • the memory 1604 stores a processor and operating instructions, executable modules or data structures, or a subset of them, or an extended set of them.
  • the operating instructions may include various operating instructions for implementing various operations.
  • the processor 1603 controls the operation of the data generating device.
  • the various components of the data generating device are coupled together through a bus system, where the bus system may include a power bus, a control bus, a status signal bus, etc., in addition to a data bus.
  • various buses are referred to as bus systems in the figure.
  • the method disclosed in the foregoing embodiments of the present application may be applied to the processor 1603 or implemented by the processor 1603.
  • the processor 1603 may be an integrated circuit chip with signal processing capabilities. In the implementation process, the steps of the foregoing method can be completed by an integrated logic circuit of hardware in the processor 1603 or instructions in the form of software.
  • the aforementioned processor 1603 may be a general-purpose processor, a digital signal processor (DSP), a microprocessor, or a microcontroller, and may further include an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • the processor 1603 can implement or execute the methods, steps, and logical block diagrams disclosed in the embodiments of the present application.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware decoding processor, or executed and completed by a combination of hardware and software modules in the decoding processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers.
  • the storage medium is located in the memory 1604, and the processor 1603 reads the information in the memory 1604, and completes the steps of the foregoing method in combination with its hardware.
  • the receiver 1601 can be used to receive input digital or character information, and to generate signal input related to the related settings and function control of the data generating device.
  • the transmitter 1602 can be used to output digital or character information through the first interface.
  • the transmitter 1602 can also be used to send instructions to the disk group through the first interface to modify the data in the disk group.
  • the transmitter 1602 can also include display devices such as a display screen.
  • FIG. 17 is a schematic structural diagram of an autonomous driving vehicle provided in an embodiment of the application.
  • the system of the autonomous driving vehicle is further described through FIG. 17.
  • the self-driving vehicle is configured in a fully or partially automatic driving mode.
  • while in the automatic driving mode, the self-driving vehicle can control itself, can determine the current state of the vehicle and its surrounding environment through human operation, can determine at least one possible behavior of other vehicles in the surrounding environment, can determine a confidence level corresponding to the possibility of the other vehicles performing that behavior, and can control the autonomous driving vehicle based on the determined information.
  • the self-driving vehicle can also be set to operate without human interaction.
  • the autonomous vehicle may include various subsystems, such as a travel system 102, a sensor system 104, a control system 106, one or more peripheral devices 108 and a power supply 110, a computer system 112, and a user interface 116.
  • the autonomous vehicle may include more or fewer subsystems, and each subsystem may include multiple components.
  • each subsystem and component of an autonomous vehicle can be interconnected by wire or wirelessly.
  • the travel system 102 may include components that provide power movement for the autonomous vehicle.
  • the travel system 102 may include an engine 118, an energy source 119, a transmission 120, and wheels/tires 121.
  • the engine 118 may be an internal combustion engine, an electric motor, an air compression engine, or other types of engine combinations, for example, a hybrid engine composed of a gasoline engine and an electric motor, or a hybrid engine composed of an internal combustion engine and an air compression engine.
  • the engine 118 converts the energy source 119 into mechanical energy. Examples of energy sources 119 include gasoline, diesel, other petroleum-based fuels, propane, other compressed gas-based fuels, ethanol, solar panels, batteries, and other sources of electricity.
  • the energy source 119 may also provide energy for other systems of the autonomous vehicle.
  • the transmission device 120 can transmit mechanical power from the engine 118 to the wheels 121.
  • the transmission device 120 may include a gearbox, a differential, and a drive shaft. In an embodiment, the transmission device 120 may also include other devices, such as a clutch.
  • the drive shaft may include one or more shafts that can be coupled to one or more wheels 121.
  • the sensor system 104 may include several sensors that sense information about the environment around the autonomous vehicle.
  • the sensor system 104 may include a positioning system 122 (the positioning system may be a global positioning GPS system, a Beidou system or other positioning systems), an inertial measurement unit (IMU) 124, a radar 126, a laser rangefinder 128 and camera 130.
  • the sensor system 104 may also include sensors that monitor the internal systems of the self-driving vehicle (for example, an in-vehicle air quality monitor, a fuel gauge, an oil temperature gauge, etc.). Sensing data from one or more of these sensors can be used to detect objects and their corresponding characteristics (position, shape, direction, speed, etc.). Such detection and recognition is a key function for the safe operation of autonomous vehicles.
  • the positioning system 122 can be used to estimate the geographic location of the autonomous vehicle.
  • the IMU 124 is used to sense changes in the position and orientation of the autonomous vehicle based on inertial acceleration.
  • the IMU 124 may be a combination of an accelerometer and a gyroscope.
  • the radar 126 may use radio signals to perceive objects in the surrounding environment of the autonomous vehicle, and may specifically be expressed as millimeter wave radar or lidar. In some embodiments, in addition to sensing an object, the radar 126 may also be used to sense the speed and/or direction of the object.
  • the laser rangefinder 128 can use lasers to perceive objects in the environment where the autonomous vehicle is located.
  • the laser rangefinder 128 may include one or more laser sources, laser scanners, and one or more detectors, as well as other system components.
  • the camera 130 may be used to capture multiple images of the surrounding environment of the autonomous vehicle.
  • the camera 130 may be a still camera or a video camera.
  • the control system 106 controls the operation of the self-driving vehicle and its components.
  • the control system 106 may include various components, including a steering system 132, a throttle 134, a braking unit 136, a computer vision system 140, a route control system 142, and an obstacle avoidance system 144.
  • the steering system 132 is operable to adjust the forward direction of the self-driving vehicle.
  • it may be a steering wheel system.
  • the throttle 134 is used to control the operating speed of the engine 118 and thereby control the speed of the self-driving vehicle.
  • the braking unit 136 is used to control the deceleration of the self-driving vehicle.
  • the braking unit 136 may use friction to slow down the wheels 121.
  • the braking unit 136 may convert the kinetic energy of the wheels 121 into electric current.
  • the braking unit 136 may also take other forms to slow down the rotation speed of the wheels 121 so as to control the speed of the autonomous vehicle.
  • the computer vision system 140 may be operable to process and analyze the images captured by the camera 130 to identify objects and/or features in the surrounding environment of the autonomous vehicle.
  • the objects and/or features may include traffic signals, road boundaries, and obstacles.
  • the computer vision system 140 may use object recognition algorithms, Structure from Motion (SFM) algorithms, video tracking, and other computer vision technologies.
  • SFM Structure from Motion
  • the computer vision system 140 may be used to map the environment, track objects, estimate the speed of objects, and so on.
  • the route control system 142 is used to determine the route and speed of the autonomous vehicle.
  • the route control system 142 may include a horizontal planning module 1421 and a vertical planning module 1422.
  • the horizontal planning module 1421 and the vertical planning module 1422 are respectively used to combine data from the obstacle avoidance system 144, the GPS 122, and one or more predetermined maps to determine the driving route and driving speed for the autonomous vehicle.
  • the obstacle avoidance system 144 is used for identifying, evaluating and avoiding or otherwise surpassing obstacles in the environment of the autonomous driving vehicle.
  • the aforementioned obstacles may specifically be represented as actual obstacles and virtual moving objects that may collide with the autonomous driving vehicle.
  • the control system 106 may additionally or alternatively include components other than those shown and described. Alternatively, a part of the components shown above may be reduced.
  • the autonomous vehicle interacts with external sensors, other vehicles, other computer systems, or users through the peripheral device 108.
  • the peripheral device 108 may include a wireless communication system 146, an onboard computer 148, a microphone 150, and/or a speaker 152.
  • the peripheral device 108 provides a means for the user of the self-driving vehicle to interact with the user interface 116.
  • the on-board computer 148 may provide information to users of autonomous vehicles.
  • the user interface 116 can also operate the on-board computer 148 to receive user input.
  • the on-board computer 148 can be operated through a touch screen.
  • the peripheral device 108 may provide a means for the autonomous vehicle to communicate with other devices located in the vehicle.
  • the wireless communication system 146 may wirelessly communicate with one or more devices directly or via a communication network.
  • the wireless communication system 146 may use 3G cellular communication such as CDMA, EVDO, or GSM/GPRS, 4G cellular communication such as LTE, or 5G cellular communication.
  • the wireless communication system 146 may use a wireless local area network (WLAN) to communicate.
  • the wireless communication system 146 may directly communicate with the device using an infrared link, Bluetooth, or ZigBee.
  • Other wireless protocols such as various vehicle communication systems.
  • the wireless communication system 146 may include one or more dedicated short-range communications (DSRC) devices, which may include public and/or private data communications between vehicles and/or roadside stations.
  • the power supply 110 may provide power to various components of the autonomous vehicle.
  • the power source 110 may be a rechargeable lithium ion or lead-acid battery.
  • One or more battery packs of such batteries may be configured as a power source to provide power to various components of the autonomous vehicle.
  • the power source 110 and the energy source 119 may be implemented together, such as in some all-electric vehicles.
  • the computer system 112 may include at least one processor 1603 that executes instructions 115 stored in a non-transitory computer-readable medium such as the memory 1604.
  • the computer system 112 may also be multiple computing devices that control individual components or subsystems of the autonomous vehicle in a distributed manner. The specific forms of the processor 1603 and the memory 1604 will not be repeated here.
  • the processor 1603 may be located away from the autonomous vehicle and wirelessly communicate with the autonomous vehicle. In other aspects, some of the processes described herein are executed on the processor 1603 arranged in the autonomous vehicle and others are executed by the remote processor 1603, including taking the necessary steps to perform a single manipulation.
  • the memory 1604 may include instructions 115 (eg, program logic), which may be executed by the processor 1603 to perform various functions of the autonomous vehicle, including those described above.
  • the memory 1604 may also contain additional instructions, including those for sending data to, receiving data from, interacting with, and/or controlling one or more of the traveling system 102, the sensor system 104, the control system 106, and the peripheral device 108. instruction.
  • the memory 1604 may also store data, such as road maps, route information, the location, direction, and speed of the vehicle, and other such vehicle data, as well as other information. Such information may be used by the autonomous vehicle and computer system 112 during operation of the autonomous vehicle in autonomous, semi-autonomous, and/or manual modes.
  • the user interface 116 is used to provide information to or receive information from the user of the self-driving vehicle.
  • the user interface 116 may include one or more input/output devices in the set of peripheral devices 108, such as a wireless communication system 146, a car computer 148, a microphone 150, and a speaker 152.
  • the computer system 112 may control the functions of the autonomous vehicle based on inputs received from various subsystems (for example, the travel system 102, the sensor system 104, and the control system 106) and from the user interface 116. For example, the computer system 112 may use input from the control system 106 to control the steering system 132 to avoid obstacles detected by the sensor system 104 and the obstacle avoidance system 144. In some embodiments, the computer system 112 is operable to provide control over many aspects of the autonomous vehicle and its subsystems.
  • one or more of these components may be installed or associated with the autonomous vehicle separately.
  • the memory 1604 may exist partially or completely separately from the autonomous vehicle.
  • the above-mentioned components may be communicatively coupled together in a wired and/or wireless manner.
  • FIG. 17 should not be construed as a limitation to the embodiments of the present application.
  • a self-driving vehicle traveling on a road, such as the self-driving vehicle above, can recognize objects in its surrounding environment to determine an adjustment to its current speed.
  • the object may be other vehicles, traffic control equipment, or other types of objects.
  • each recognized object can be considered independently, and its respective characteristics, such as its current speed, acceleration, and distance from the vehicle, can be used to determine the speed to which the autonomous vehicle should adjust.
  • the self-driving vehicle or the computing device associated with the self-driving vehicle may predict the behavior of the identified object based on the characteristics of the recognized object and the state of the surrounding environment (for example, traffic, rain, ice on the road, etc.).
  • each recognized object depends on each other's behavior, so all recognized objects can also be considered together to predict the behavior of a single recognized object.
  • the autonomous vehicle can adjust its speed based on the predicted behavior of the recognized object.
  • the self-driving vehicle can determine what stable state the vehicle will need to adjust to (for example, accelerate, decelerate, or stop) based on the predicted behavior of the object.
  • the computing device can also provide instructions to modify the steering angle of the autonomous vehicle, so that the autonomous vehicle follows a given trajectory and/or maintains safe horizontal and vertical distances from objects near the autonomous vehicle (for example, cars in adjacent lanes on the road).
  • the aforementioned autonomous vehicles can be cars, trucks, motorcycles, buses, boats, airplanes, helicopters, lawn mowers, recreational vehicles, playground vehicles, construction equipment, trams, golf carts, trains, and trolleys, etc.
  • the embodiments of this application do not make any special limitations.
  • the processor 1603 executes the image processing method executed by the own vehicle in the embodiment corresponding to FIG. 3 to FIG. 7 through the application processor 16031, or executes the image processing method executed by the own vehicle in the embodiment corresponding to FIG. 11.
  • for the specific manner in which the application processor 16031 executes the image processing method and the beneficial effects brought about, refer to the descriptions in the foregoing method embodiments, which will not be repeated one by one here.
  • FIG. 18 is a schematic structural diagram of a training device provided by an embodiment of the present application.
  • the training device 1800 may be deployed with the network training device 1500 described in the embodiment corresponding to FIG. 15 to implement the functions of the training device in the embodiment corresponding to FIG. 8 to FIG. 10.
  • the training device 1800 is composed of one or more servers.
  • the training device 1800 may vary greatly due to different configurations or performance, and may include one or more central processing units (CPU) 1822 (for example, one or more processors), a memory 1832, and one or more storage media 1830 (for example, one or more mass storage devices) for storing application programs 1842 or data 1844.
  • the memory 1832 and the storage medium 1830 may be short-term storage or persistent storage.
  • the program stored in the storage medium 1830 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations on the training device.
  • the central processing unit 1822 may be configured to communicate with the storage medium 1830, and execute a series of instruction operations in the storage medium 1830 on the training device 1800.
  • the training device 1800 may also include one or more power supplies 1826, one or more wired or wireless network interfaces 1850, one or more input and output interfaces 1858, and/or one or more operating systems 1841, such as Windows ServerTM, Mac OS XTM, UnixTM, LinuxTM, FreeBSDTM, etc.
  • the central processing unit 1822 is configured to execute the network training method executed by the training device in the embodiment corresponding to FIG. 8 to FIG. 10.
  • An embodiment of the present application also provides a computer-readable storage medium that stores a program which, when run on a computer, causes the computer to execute the steps performed by the self-vehicle in the method described in the foregoing embodiments shown in FIGS. 3 to 7, or causes the computer to execute the steps performed by the self-vehicle in the method described in the embodiment shown in FIG. 11, or causes the computer to execute the steps performed by the training device in the method described in the embodiments shown in FIGS. 8 to 10.
  • An embodiment of the present application also provides a computer program product which, when run on a computer, causes the computer to execute the steps performed by the self-vehicle in the method described in the embodiments shown in FIGS. 3 to 7, or causes the computer to execute the steps performed by the self-vehicle in the method described in the foregoing embodiment shown in FIG. 11, or causes the computer to execute the steps performed by the training device in the method described in the foregoing embodiments shown in FIGS. 8 to 10.
  • An embodiment of the present application also provides a circuit system, the circuit system includes a processing circuit configured to execute the steps performed by the vehicle in the method described in the embodiments shown in FIGS. 3 to 7, or Execute the steps performed by the vehicle in the method described in the embodiment shown in FIG. 11, or perform the steps performed by the training device in the method described in the embodiment shown in FIG. 8 to FIG. 10.
  • the self-vehicle, training equipment, image processing device, or network training device provided in the embodiments of the present application may specifically be a chip.
  • the chip includes a processing unit and a communication unit.
  • the processing unit may be a processor, and the communication unit may be, for example, an input/output interface, a pin, or a circuit.
  • the processing unit can execute the computer-executable instructions stored in the storage unit, so that the chip executes the steps performed by the vehicle in the method described in the foregoing embodiments shown in FIGS. 3 to 7, or executes the foregoing embodiment shown in FIG. 11
  • the steps performed by the self-vehicle in the described method, or the steps performed by the training device in the method described in the embodiments shown in FIGS. 8 to 10 are performed.
  • the storage unit is a storage unit in the chip, such as a register, a cache, etc.
  • the storage unit may also be a storage unit located outside the chip in the wireless access device, such as Read-only memory (ROM) or other types of static storage devices that can store static information and instructions, random access memory (RAM), etc.
  • Figure 19 is a schematic diagram of a structure of a chip provided by an embodiment of the application.
  • the chip may be a neural-network processing unit (NPU) mounted on a host CPU (Host CPU) as a coprocessor, and the Host CPU assigns tasks to it.
  • the core part of the NPU is the arithmetic circuit 1903.
  • the arithmetic circuit 1903 is controlled by the controller 1904 to extract matrix data from the memory and perform multiplication operations.
  • the arithmetic circuit 1903 includes multiple processing units (Process Engine, PE). In some implementations, the arithmetic circuit 1903 is a two-dimensional systolic array. The arithmetic circuit 1903 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuit 1903 is a general-purpose matrix processor.
  • the arithmetic circuit fetches the corresponding data of matrix B from the weight memory 1902 and caches it on each PE in the arithmetic circuit.
  • the arithmetic circuit takes the data of matrix A from the input memory 1901, performs matrix operations with matrix B, and stores the partial or final result of the obtained matrix in an accumulator 1908.
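The matrix flow described above can be mimicked in a few lines; the following is only a functional sketch of "fetch B as weights, stream A, accumulate partial results", not a model of the actual hardware, and all names are illustrative.

```python
import numpy as np

def npu_like_matmul(A, B, tile=4):
    """Multiply A (M x K) by B (K x N) by streaming K-dimension tiles and
    accumulating partial results, loosely mirroring a weight-stationary dataflow."""
    M, K = A.shape
    _, N = B.shape
    acc = np.zeros((M, N))                 # plays the role of the accumulator 1908
    for k0 in range(0, K, tile):
        a_tile = A[:, k0:k0 + tile]        # data of matrix A from the input memory
        b_tile = B[k0:k0 + tile, :]        # corresponding data of matrix B (weights)
        acc += a_tile @ b_tile             # accumulate the partial result
    return acc

A = np.random.rand(8, 16)
B = np.random.rand(16, 8)
assert np.allclose(npu_like_matmul(A, B), A @ B)
```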
  • the unified memory 1906 is used to store input data and output data.
  • the weight data is transferred directly to the weight memory 1902 through the direct memory access controller (DMAC) 1905.
  • the input data is also transferred to the unified memory 1906 through the DMAC.
  • the bus interface unit (BIU) 1910 is used for the interaction between the AXI bus and the DMAC and the instruction fetch buffer (IFB) 1909.
  • the bus interface unit 1910 (Bus Interface Unit, BIU for short) is used for the instruction fetch memory 1909 to obtain instructions from the external memory, and is also used for the storage unit access controller 1905 to obtain the original data of the input matrix A or the weight matrix B from the external memory.
  • the DMAC is mainly used to transfer the input data in the external memory DDR to the unified memory 1906, or to transfer the weight data to the weight memory 1902, or to transfer the input data to the input memory 1901.
  • the vector calculation unit 1907 includes multiple arithmetic processing units, and further processes the output of the arithmetic circuit if necessary, such as vector multiplication, vector addition, exponential operation, logarithmic operation, size comparison, and so on. It is mainly used in the calculation of non-convolutional/fully connected layer networks in neural networks, such as Batch Normalization, pixel-level summation, and upsampling of feature planes.
  • the vector calculation unit 1907 can store the processed output vector to the unified memory 1906.
  • the vector calculation unit 1907 may apply a linear function and/or a non-linear function to the output of the arithmetic circuit 1903, for example performing linear interpolation on the feature plane extracted by the convolutional layer, or applying the function to a vector of accumulated values to generate the activation value.
  • the vector calculation unit 1907 generates normalized values, pixel-level summed values, or both.
  • the processed output vector can be used as an activation input to the arithmetic circuit 1903, for example for use in subsequent layers in a neural network.
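A small sketch of the kind of post-processing the vector calculation unit applies to the accumulator output (normalization plus a non-linear activation); the specific operations and names chosen here are illustrative assumptions.

```python
import numpy as np

def vector_unit_postprocess(acc_out, gamma, beta, eps=1e-5):
    """Normalize the accumulated output and apply a ReLU activation, as a
    stand-in for batch normalization followed by a non-linear function."""
    mean = acc_out.mean(axis=0)
    var = acc_out.var(axis=0)
    normalized = (acc_out - mean) / np.sqrt(var + eps)
    return np.maximum(gamma * normalized + beta, 0.0)   # ReLU activation

out = vector_unit_postprocess(np.random.rand(8, 8), gamma=1.0, beta=0.0)
```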
  • the instruction fetch buffer 1909 connected to the controller 1904 is used to store instructions used by the controller 1904;
  • the unified memory 1906, the input memory 1901, the weight memory 1902, and the fetch memory 1909 are all On-Chip memories.
  • the external memory is private to the NPU hardware architecture.
  • the calculation of each layer in the recurrent neural network can be executed by the arithmetic circuit 1903 or the vector calculation unit 1907.
  • processor mentioned in any of the foregoing may be a general-purpose central processing unit, a microprocessor, an ASIC, or one or more integrated circuits used to control the execution of the program of the method in the first aspect.
  • the device embodiments described above are only illustrative; the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units.
  • the physical unit can be located in one place or distributed across multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the connection relationship between the modules indicates that they have a communication connection between them, which may be specifically implemented as one or more communication buses or signal lines.
  • this application can be implemented by software plus the necessary general-purpose hardware, and it can of course also be implemented by dedicated hardware, including dedicated integrated circuits, dedicated CPUs, dedicated memories, dedicated components, and so on.
  • in general, all functions completed by a computer program can be easily implemented with corresponding hardware, and the specific hardware structures used to achieve the same function can be diverse, such as analog circuits, digital circuits, or dedicated circuits.
  • however, for this application, a software program implementation is the better implementation in most cases.
  • the technical solution of this application, in essence or the part that contributes to the prior art, can be embodied in the form of a software product; the computer software product is stored in a readable storage medium, such as a computer floppy disk, USB flash drive, removable hard disk, ROM, RAM, magnetic disk, or optical disc, and includes several instructions that cause a computer device (which can be a personal computer, a server, or a network device, etc.) to execute the methods described in each embodiment of this application.
  • the computer program product includes one or more computer instructions.
  • the computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • the computer instructions may be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center that integrates one or more available media.
  • the usable medium may be a magnetic medium (for example, a floppy disk, a hard disk, and a magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (SSD)).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

一种图像的处理方法、网络的训练方法以及相关设备,涉及人工智能领域中的图像处理技术,包括:将包括第一车辆的第一图像输入图像处理网络中,得到图像处理网络输出的第一结果,第一结果包括第一车辆的二维2D包络框的位置信息、第一车辆的车轮的坐标和第一车辆的第一角度,第一车辆的第一角度指示第一车辆的侧边线与第一图像的第一轴线之间夹角的角度;根据第一结果生成第一车辆的三维3D外包络盒的位置信息。根据第一车辆的二维包络框的位置信息、车轮的坐标和第一角度这三种参数,生成第一车辆的三维3D外包络盒的位置信息,提高了获取到的3D外包络盒的准确度。

Description

一种图像的处理方法、网络的训练方法以及相关设备
本申请要求于2020年4月30日提交中国专利局、申请号为202010366441.9、发明名称为“一种图像的处理方法、网络的训练方法以及相关设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及人工智能领域,尤其涉及一种图像的处理方法、网络的训练方法以及相关设备。
背景技术
人工智能(Artificial Intelligence,AI)是利用数字计算机或者数字计算机控制的机器模拟、延伸和扩展人的智能,感知环境、获取知识并使用知识获得最佳结果的理论、方法、技术及应用***。换句话说,人工智能是计算机科学的一个分支,它企图了解智能的实质,并生产出一种新的能以人类智能相似的方式作出反应的智能机器。人工智能也就是研究各种智能机器的设计原理与实现方法,使机器具有感知、推理与决策的功能。人工智能领域的研究包括机器人,自然语言处理,计算机视觉,决策与推理,人机交互,推荐与搜索,AI基础理论等。
自动驾驶是人工智能领域的一种主流应用。目前,在自动驾驶领域中,自车在采集到周围车辆的完整图像之后,可以通过神经网络根据图像中完整的车辆,输出车辆在车体坐标系下的朝向角和尺寸等信息,进而可以定位到车辆的3D外包络盒。
但若自车采集到周围车辆的图像中的车辆是不完整的,则神经网络输出的信息会存在较大误差,导致定位到的3D外包络盒准确率较低,因此,一种提高获取到的3D外包络盒的准确度的方案亟待推出。
发明内容
本申请实施例提供了一种图像的处理方法、网络的训练方法以及相关设备,根据第一车辆的二维包络框的位置信息、车轮的坐标和第一角度这三种参数,生成第一车辆的三维3D外包络盒的位置信息,提高了获取到的3D外包络盒的准确度。
为解决上述技术问题,本申请实施例提供以下技术方案:
第一方面,本申请实施例提供一种图像的处理方法,可用于人工智能领域的图像处理领域中。方法包括:执行设备获取第一图像,第一图像中包括第一车辆,执行设备将第一图像输入图像处理网络中,得到图像处理网络输出的第一结果。其中,在第一车辆在第一图像中漏出侧面的情况下,第一结果包括第一车辆的二维2D包络框的位置信息、第一车辆的车轮的坐标和第一车辆的第一角度。进一步地,第一车辆的2D包络框的位置信息可以包括2D包络框的中心点的坐标和2D包络框的边长。第一车辆的车轮的坐标指的可以为车轮的外侧找地点的坐标,也可以为车轮的内侧找地点的坐标,还可以为车轮厚度中间的找地点的坐标。第一车辆的第一角度指示第一车辆的侧边线与第一图像的第一轴线之间夹 角的角度,第一车辆的侧边线为第一车辆漏出的侧面与第一车辆所在地平面之间的交线,第一图像的第一轴线与第一图像的一个边平行,第一轴线可以为与第一图像的U轴平行,也可以与第一图像的V轴平行,第一角度的取值范围可以为0度到360度,也可以为负180度到正180度。执行设备根据第一车辆的2D包络框的位置信息、车轮的坐标和第一角度,生成第一车辆的三维3D外包络盒的位置信息,第一车辆的3D外包络盒的位置信息包括至少两个第一点的坐标,至少两个第一点均位于第一车辆的3D外包络盒的边上,至少两个第一点中两个第一点定位第一车辆的3D外包络盒的边,至少两个第一点的坐标用于定位第一车辆的3D外包络盒。
本实现方式中,将获取到的图像输入到图像处理网络中,图像处理网络输出的为车辆的二维包络框的位置信息、车轮的坐标和第一角度,根据二维包络框的位置信息、车轮的坐标和第一角度,生成第一车辆的三维3D外包络盒的位置信息,进而定位车辆的3D外包络盒,由于二维包络框的位置信息、车轮的坐标和第一角度这三种参数的准确度与图像中车辆是否完整无关,所以无论图像中的车辆是否完整,得到的第一点的坐标是准确的,从而定位出的3D外包络盒的准确率较高,也即提高了获取到的3D外包络盒的准确度;进一步地,也即能够更为准确的判断周围车辆行驶意图,进而提高自动驾驶车辆的行驶安全度。
在第一方面的一种可能实现方式中,在第一车辆在第一图像中仅漏出侧面的情况下,至少两个第一点包括第一车辆的侧边线与第一车辆的2D包络框的两个交点;进一步地,至少两个第一点包括第一车辆的侧边线与第一车轮的2D包络框的左边界之间的交点,和,第一车辆的侧边线与第一车轮的2D包络框的右边界之间的交点。本实现方式中,在第一车辆在第一图像中仅漏出侧面的情况下,第一点为第一车辆的侧边线与2D包络框之间的交点,细化了在特定场景下,第一点的具体表现形态,提高了与应用场景的结合度。
在第一方面的一种可能实现方式中,执行设备根据2D包络框的位置信息、车轮的坐标和第一角度,生成第一车辆的3D外包络盒的位置信息,可以包括:执行设备根据第一车辆的车轮的坐标和第一车辆的第一角度,生成第一车辆的侧边线的位置信息,第一车辆的侧边线的位置信息可以为第一车辆的侧边线的直线方程。执行设备根据第一车辆的侧边线的位置信息和第一车辆的2D包络框的位置信息,执行坐标生成操作,以得到至少两个第一点的坐标;具体的,执行设备根据第一车辆的2D包络框的位置信息,可以确定第一车辆的2D包络框的左边界和右边界的位置,根据第一车辆的侧边线的直线方程,生成侧边线与前述左边界的交点的坐标,生成侧边线与前述右边界的交点的坐标。本实现方式中,自车根据第一车辆的车轮的坐标和第一角度,就可以生成第一车辆的侧边线的位置信息,操作简单,易于实现,且准确度高。
在第一方面的一种可能实现方式中,在第一车辆在第一图像中漏出侧面和主面的情况下,第一结果中还包括第一车辆的分界线的位置信息和第一车辆的第二角度,分界线为侧面与主面之间的分界线,第一车辆的主面为第一车辆的前面或后面,第一车辆的第二角度指示第一车辆的主边线与第一图像的第一轴线之间夹角的角度,第一车辆的主边线为第一车辆漏出的主面与第一车辆所在的地平面之间的交线,第二角度的取值范围可以为0度到360度,也可以为负180度到正180度。至少两个第一点包括第一交点、第二交点和第三 交点,第一交点为第一车辆的侧边线与第一车辆的分界线的交点,第一交点为第一车辆的3D外包络盒的一个顶点,第二交点为第一车辆的侧边线与第一车辆的2D包络框的交点,第三交点为第一车辆的主边线与第一车辆的2D包络框的交点。
本实现方式中,不仅提供了在第一车辆在第一图像中仅漏出侧面的情况下,第一点的具体表现形式,还提供了在第一车辆在第一图像中漏出侧面和主面的情况下,第一点的具体表现形式,丰富了本方案的应用场景,提高了实现灵活性。
在第一方面的一种可能实现方式中,第一车辆的分界线穿过第一车辆的车灯的轮廓,或者,第一车辆的分界线穿过第一车辆的车灯的中心点,或者,第一车辆的分界线穿过第一车辆的侧边线和第一车辆的主边线的交点。本实现方式中,提供了分界线的位置信息的几种具体实现方式,提高了本方案的选择灵活性。
在第一方面的一种可能实现方式中,执行设备根据第一车辆的2D包络框的位置信息、车轮的坐标和第一角度,生成第一车辆的三维3D外包络盒的位置信息,可以包括:执行设备根据第一车辆的车轮的坐标和第一车辆的第一角度,生成第一车辆的侧边线的位置信息。执行设备根据第一车辆的侧边线的位置信息和第一车辆的分界线的位置信息,生成第一交点的坐标;根据第一车辆的侧边线的位置信息和第一车辆的2D包络框的位置信息,生成第二交点的坐标;根据第一交点的坐标和第一车辆的第二角度,生成第一车辆的主边线的位置信息,第一车辆的主边线的位置信息具体可以为第一车辆的主边线的直线方程,根据第一车辆的主边线的位置信息和第一车辆的2D包络框的位置信息,生成第三交点的坐标。本实现方式中,提供了当第一车辆在第一图像中漏出侧面和主面时,生成多个第一点的坐标的实现方式,操作简单,易于实现,且准确度高。
在第一方面的一种可能实现方式中,在第一车辆在第一图像中仅漏出主面的情况下,第一结果中包括第一车辆的2D包络框的位置信息,主面包括前面或后面,2D包络框的位置信息中包括2D包络框的中心点的坐标。
在第一方面的一种可能实现方式中,第一结果还可以包括第一车辆在第一图像中的漏出面的指示信息,漏出面包括以下中的一项或多项:侧面、前面和后面,前述侧面包括左面和右面。漏出面的指示信息具体可以表现为数字序列或字符串。
在第一方面的一种可能实现方式中,方法还可以包括:执行设备根据至少两个第一点的坐标,生成第一车辆的三维特征信息,第一车辆的三维特征信息包括以下中的一项或多项:第一车辆相对于自车的朝向角、第一车辆的质心点的位置信息和第一车辆的尺寸。
在第一方面的一种可能实现方式中,方法还包括:执行设备在第一车辆在第一图像中漏出侧面的情况下,根据第一点的坐标,生成第一车辆相对于自车的朝向角,其中,第一车辆在第一图像中漏出侧面的情况包括第一车辆在第一图像中仅漏出侧面的情况,和第一车辆在第一图像中同时漏出侧面和主面的情况。本实现方式中,在得到第一点的坐标之后,还可以根据第一点的坐标,生成第一车辆相对于自车的朝向角,以提高得到的朝向角的准确度。
在第一方面的一种可能实现方式中,执行设备根据第一点的坐标,生成朝向角之前,方法还可以包括:执行设备根据第一点的坐标和地平面假设原理,生成第一点与自车之间 的距离。执行设备根据第一点的坐标,生成朝向角,可以包括:在根据第一点与自车之间的距离确定第一车辆与自车之间的距离未超过预设阈值的情况下,通过第一计算规则,根据第一点的坐标,生成朝向角,预设阈值的取值可以为10米、15米、30米或25米;在根据第一点与自车之间的距离确定第一车辆与自车之间的距离超过预设阈值的情况下,通过第二计算规则,根据第一点的坐标,生成朝向角,第二计算规则和第一计算规则为不同的计算规则。本实现方式中,针对第一车辆与自车之间的距离超过预设阈值,和,第一车辆与自车之间的距离未超过预设阈值这两种情况,分别采用不同的计算规则,生成第一车辆的朝向角,进一步提高生成的朝向角的准确度。
在第一方面的一种可能实现方式中,当至少两个第一点中任一个第一点与自车之间的距离未超过预设阈值时,视为第一车辆与自车之间的距离未超过预设阈值;或者,当至少两个第一点中任一个第一点与自车之间的距离超过预设阈值时,视为第一车辆与自车之间的距离超过预设阈值。本实现方式中,提供了判断第一车辆与自车之间距离是否超过预设预置的两种具体实现方式,提高了本方案的实现灵活性。
在第一方面的一种可能实现方式中,执行设备通过第一计算规则,根据第一点的坐标,生成朝向角,可以包括:执行设备根据第一点的坐标和地平面假设原理,生成第一点在车体坐标系下的三维坐标;其中,车体坐标系的坐标系原点位于自车内,车体坐标系的坐标系原点可以为自车的两个后车轮连线的中点,车体坐标系的坐标系原点也可以为自车的质心点。执行设备根据第一点的三维坐标,生成朝向角。本实现方式中,无论第一车辆在第一图像中是否为完整的图像,都可以得到准确的第一点的坐标,由于朝向角是基于第一点的坐标和地平面假设原理生成的,从而保证生成的朝向角的准确性。
在第一方面的一种可能实现方式中,执行设备通过第二计算规则,根据第一点的坐标,生成朝向角,可以包括:执行设备根据第一点的坐标和第一车辆的第一角度,生成第一车辆的侧边线的位置信息,根据第一车辆的侧边线的位置信息和第一图像的消失线的位置信息,生成消失点的坐标,消失点为第一车辆的侧边线与第一图像的消失线之间的交点。执行设备根据消失点的坐标和两点透视原理,生成朝向角。具体的,自车在得到消失点的坐标之后,根据消失点的坐标和两点透视原理,生成第一车辆在相机坐标系下的朝向角,再根据第一车辆在相机坐标系下的朝向角和第二变换关系,生成第一车辆在自车的车体坐标系下的朝向角,第二变换关系指的是相机坐标系与车体坐标系之间的转换关系,第二变换关系也可以称为相机的外参。本实现方式中,提供了当第一车辆与自车之间距离超过预设阈值,且第一车辆在第一图像中漏出侧面的情况下,生成第一车辆的朝向角的一种具体实现方式,操作简单,且效率较高。
在第一方面的一种可能实现方式中,执行设备通过第二计算规则,根据第一点的坐标,生成朝向角,可以包括:执行设备根据第一点的坐标、第一车辆的第一角度和小孔成像原理,生成第一车辆的第一角度和朝向角之间的映射关系;根据映射关系和第一车辆的第一角度,生成朝向角。本实现方式中,针对第一车辆与自车之间的距离超过预设阈值的情况,提供了求得朝向角的两种可实现方式,提高本方案实现灵活性。
在第一方面的一种可能实现方式中,方法还可以包括:在第一车辆在第一图像中仅漏 出主面的情况下,执行设备根据第一车辆的2D包络框的中心点的坐标和小孔成像原理,生成朝向角。
在第一方面的一种可能实现方式中,方法还可以包括:执行设备从至少两个第一点的坐标中获取第一车辆的3D外包络盒的顶点的坐标,根据第一车辆的3D外包络盒的顶点的坐标和地平面假设原理,生成第一车辆的质心点在车体坐标系下的三维坐标,车体坐标系的坐标系原点位于自车内。本实现方式中,根据第一点的坐标,不仅能够生成第一车辆的朝向角,还可以生成第一车辆的质心点在车体坐标系下的三维坐标,扩展了本方案的应用场景;此外,提高了生成的质心点的三维坐标的准确性。
在第一方面的一种可能实现方式中,方法还可以包括:执行设备上预先设置有第一图像的U轴方向的第一取值范围,和,第一图像的V轴方向的第二取值范围。在得到一个第一点的坐标之后,执行设备判断第一点的坐标中U轴方向的取值是否在第一取值范围内,判断第一点的坐标中V轴方向的取值是否在第二取值范围内;若第一点的坐标中U轴方向的取值在第一取值范围内,且,第一点的坐标中V轴方向的取值在第二取值范围内,则确定该第一点为第一车辆的3D外包络盒的顶点。
在第一方面的一种可能实现方式中,方法还可以包括:执行设备从至少两个第一点的坐标中获取第一顶点的坐标,第一顶点为第一车辆的3D外包络盒的一个顶点。执行设备根据第一顶点的坐标和地平面假设原理,生成第一顶点在车体坐标系下的三维坐标,若至少两个第一点中包括至少两个第一顶点,根据第一顶点在车体坐标系下的三维坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高,车体坐标系的坐标系原点位于自车内。本实现方式中,根据第一点的坐标,还可以生成第一车辆的尺寸,进一步扩展了本方案的应用场景;此外,提高了生成的第一车辆的尺寸的准确性。
在第一方面的一种可能实现方式中,方法还可以包括:若至少两个第一点中包括一个第一顶点,执行设备获取第二图像,第二图像中包括第一车辆,第二图像和第一图像的图像采集角度不同。执行设备根据第二图像,通过图像处理网络,得到至少两个第二点的坐标,至少两个第二点均位于第一车辆的三维3D外包络盒的边上,至少两个第二点中两个第二点定位第一车辆的3D外包络盒的边,至少两个第二点的坐标用于定位第一车辆的3D外包络盒。执行设备根据第二点的坐标和地平面假设原理,生成第二顶点在车体坐标系下的三维坐标,第二顶点为第一车辆的3D外包络盒的一个顶点,第二顶点与第一顶点为不同的顶点,根据第一顶点的三维坐标和第二顶点的三维坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高。本实现方式中,在通过第一车辆的一个图像无法生成第一车辆的尺寸的情况下,利用第一车辆的另一个图像共同生成第一车辆的尺寸,保证了在各种情况下均能生成第一车辆的尺寸,提高了本方案的全面性。
本申请实施例第二方面提供了一种图像的处理方法,可用于人工智能领域的图像处理领域中。方法包括:执行设备获取第一图像,第一图像中包括第一车辆;通过图像处理网络,根据第一图像,得到第一车辆的三维3D外包络盒的位置信息;根据第一车辆的三维3D外包络盒的位置信息,生成第一车辆的三维特征信息,第一车辆的三维特征信息包括以下中的一项或多项:第一车辆相对于自车的朝向角、第一车辆的质心点的位置信息和第一车辆的尺寸。
在第二方面的一种可能实现方式中,第一车辆的3D外包络盒的位置信息包括至少两个第一点的坐标,至少两个第一点均位于第一车辆的三维3D外包络盒的边上,至少两个第一点中两个第一点定位第一车辆的3D外包络盒的边,至少两个第一点的坐标用于定位第一车辆的3D外包络盒。
在第二方面的一种可能实现方式中,执行设备通过图像处理网络,根据第一图像,得到第一车辆的三维3D外包络盒的位置信息,包括:执行设备将第一图像输入图像处理网络中,得到图像处理网络输出的第一结果,在第一车辆在第一图像中漏出侧面的情况下,第一结果包括第一车辆的二维2D包络框的位置信息、第一车辆的车轮的坐标和第一车辆的第一角度,第一车辆的第一角度指示第一车辆的侧边线与第一图像的第一轴线之间夹角的角度,第一车辆的侧边线为第一车辆漏出的侧面与第一车辆所在地平面之间的交线,第一图像的第一轴线与第一图像的一个边平行。执行设备根据第一车辆的2D包络框的位置信息、车轮的坐标和第一角度,执行坐标生成操作,以得到至少两个第一点的坐标。
在第二方面的一种可能实现方式中,在第一车辆在第一图像中仅漏出侧面的情况下,至少两个第一点包括第一车辆的侧边线与第一车辆的2D包络框的两个交点。
在第二方面的一种可能实现方式中,执行设备根据2D包络框的位置信息、车轮的坐标和第一角度,生成第一车辆的3D外包络盒的位置信息,包括:执行设备根据第一车辆的车轮的坐标和第一车辆的第一角度,生成第一车辆的侧边线的位置信息;根据第一车辆的侧边线的位置信息和第一车辆的2D包络框的位置信息,执行坐标生成操作,以得到至少两个第一点的坐标。
在第二方面的一种可能实现方式中,在第一车辆在第一图像中漏出侧面和主面的情况下,第一结果中还包括第一车辆的分界线的位置信息和第一车辆的第二角度,分界线为侧面与主面之间的分界线,第一车辆的主面为第一车辆的前面或后面,第一车辆的第二角度指示第一车辆的主边线与第一图像的第一轴线之间夹角的角度,第一车辆的主边线为第一车辆漏出的主面与第一车辆所在的地平面之间的交线。至少两个第一点包括第一交点、第二交点和第三交点,第一交点为第一车辆的侧边线与第一车辆的分界线的交点,第一交点为第一车辆的3D外包络盒的一个顶点,第二交点为第一车辆的侧边线与第一车辆的2D包络框的交点,第三交点为第一车辆的主边线与第一车辆的2D包络框的交点。
在第二方面的一种可能实现方式中,执行设备根据第一车辆的2D包络框的位置信息、车轮的坐标和第一角度,生成第一车辆的三维3D外包络盒的位置信息,包括:执行设备根据第一车辆的车轮的坐标和第一车辆的第一角度,生成第一车辆的侧边线的位置信息;执行设备根据第一车辆的侧边线的位置信息和第一车辆的分界线的位置信息,生成第一交点的坐标。执行设备根据第一车辆的侧边线的位置信息和第一车辆的2D包络框的位置信息,生成第二交点的坐标。执行设备根据第一交点的坐标和第一车辆的第二角度,生成第一车辆的主边线的位置信息;根据第一车辆的主边线的位置信息和第一车辆的2D包络框的位置信息,生成第三交点的坐标。
在第二方面的一种可能实现方式中,方法还包括:执行设备在第一车辆在第一图像中 漏出侧面的情况下,根据第一点的坐标,生成第一车辆相对于自车的朝向角。
在第二方面的一种可能实现方式中,执行设备根据第一点的坐标,生成朝向角之前,方法还包括:执行设备根据第一点的坐标和地平面假设原理,生成第一点与自车之间的距离。执行设备根据第一点的坐标,生成朝向角,包括:执行设备在根据第一点与自车之间的距离确定第一车辆与自车之间的距离未超过预设阈值的情况下,通过第一计算规则,根据第一点的坐标,生成朝向角;在根据第一点与自车之间的距离确定第一车辆与自车之间的距离超过预设阈值的情况下,通过第二计算规则,根据第一点的坐标,生成朝向角,第二计算规则和第一计算规则为不同的计算规则。
在第二方面的一种可能实现方式中,当至少两个第一点中任一个第一点与自车之间的距离未超过预设阈值时,视为第一车辆与自车之间的距离未超过预设阈值;或者,当至少两个第一点中任一个第一点与自车之间的距离超过预设阈值时,视为第一车辆与自车之间的距离超过预设阈值。
在第二方面的一种可能实现方式中,执行设备通过第一计算规则,根据第一点的坐标,生成朝向角,包括:根据第一点的坐标和地平面假设原理,生成第一点在车体坐标系下的三维坐标,车体坐标系的坐标系原点位于自车内;根据第一点的三维坐标,生成朝向角。
在第二方面的一种可能实现方式中,执行设备通过第二计算规则,根据第一点的坐标,生成朝向角,包括:根据第一点的坐标和第一车辆的第一角度,生成第一车辆的侧边线的位置信息,根据第一车辆的侧边线的位置信息和第一图像的消失线的位置信息,生成消失点的坐标,消失点为第一车辆的侧边线与第一图像的消失线之间的交点;根据消失点的坐标和两点透视原理,生成朝向角。
在第二方面的一种可能实现方式中,执行设备通过第二计算规则,根据第一点的坐标,生成朝向角,包括:根据第一点的坐标、第一车辆的第一角度和小孔成像原理,生成第一车辆的第一角度和朝向角之间的映射关系;根据映射关系和第一车辆的第一角度,生成朝向角。
在第二方面的一种可能实现方式中,方法还包括:执行设备从至少两个第一点的坐标中获取第一车辆的3D外包络盒的顶点的坐标;根据第一车辆的3D外包络盒的顶点的坐标和地平面假设原理,生成第一车辆的质心点在车体坐标系下的三维坐标,车体坐标系的坐标系原点位于自车内。
在第二方面的一种可能实现方式中,方法还包括:执行设备从至少两个第一点的坐标中获取第一顶点的坐标,第一顶点为第一车辆的3D外包络盒的一个顶点;根据第一顶点的坐标和地平面假设原理,生成第一顶点在车体坐标系下的三维坐标;若至少两个第一点中包括至少两个第一顶点,根据第一顶点在车体坐标系下的三维坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高,车体坐标系的坐标系原点位于自车内。
在第二方面的一种可能实现方式中,方法还包括:若至少两个第一点中包括一个第一顶点,执行设备获取第二图像,第二图像中包括第一车辆,第二图像和第一图像的图像采集角度不同;执行设备根据第二图像,通过图像处理网络,得到至少两个第二点的坐标, 至少两个第二点均位于第一车辆的三维3D外包络盒的边上,至少两个第二点中两个第二点定位第一车辆的3D外包络盒的边,至少两个第二点的坐标用于定位第一车辆的3D外包络盒。执行设备根据第二点的坐标和地平面假设原理,生成第二顶点在车体坐标系下的三维坐标,第二顶点为第一车辆的3D外包络盒的一个顶点,第二顶点与第一顶点为不同的顶点。执行设备根据第一顶点的三维坐标和第二顶点的三维坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高。
对于本申请实施例第二方面以及第二方面的各种可能实现方式的具体实现步骤,以及每种可能实现方式所带来的有益效果,均可以参考第一方面中各种可能的实现方式中的描述,此处不再一一赘述。
第三方面,本申请实施例提供一种图像处理方法,可用于人工智能领域的图像处理领域中。方法包括:执行设备获取第三图像,第三图像中包括第一刚体,第一刚体为立方体;执行设备将第三图像输入图像处理网络中,得到图像处理网络输出的第二结果,在第一刚体在第三图像中漏出侧面的情况下,第二结果包括第一刚体的2D包络框的位置信息和第一刚体的第一角度,第一刚体的第一角度指示第一刚体的侧边线与第三图像的第一轴线之间夹角的角度,第一刚体的侧边线为第一刚体漏出的侧面与第一刚体所在平面之间的交线,第三图像的第一轴线与第三图像的一个边平行。执行设备根据第一刚体的2D包络框的位置信息和第一角度,生成第一刚体的三维3D外包络盒的位置信息,第一刚体的3D外包络盒的位置信息包括至少两个第三点的坐标,至少两个第三点均位于第一刚体的3D外包络盒的边上,至少两个第三点中两个第三点定位第一刚体的3D外包络盒的边,至少两个第三点的坐标用于定位第一刚体的3D外包络盒。
本申请第三方面的一种可能的实现方式中,执行设备根据第一刚体的2D包络框的位置信息和第一角度,生成第一刚体的三维3D外包络盒的位置信息,可以包括:在第一刚体在第三图像中仅漏出侧面的情况下,自车可以根据第一刚体的2D包络框的位置信息,生成第一刚体的2D包络框的左下角顶点的坐标和/或右下角顶点的坐标,以替代第一方面的第一结果中的车轮的坐标。自车根据第一刚体的2D包络框的左下角顶点的坐标和/或右下角顶点的坐标、第一刚体的2D包络框的位置信息和第一角度,生成第一刚体的三维3D外包络盒的位置信息。
本申请第三方面的一种可能的实现方式中,在第一刚体在第三图像中漏出主面和侧面的情况下,第二结果中还可以包括第一刚体的分界线的位置信息,自车可以根据第一刚体的2D包络框的位置信息和第一刚体的分界线的位置信息生成以下中的一项或多项坐标信息:第一刚体的分界线与第一刚体的2D包络框的底边的交点的坐标、第一刚体的2D包络框的左下角顶点的坐标和右下角顶点的坐标,以替代第一方面的第一结果中的车轮的坐标。自车根据前述生成的坐标信息、第一刚体的2D包络框的位置信息和第一角度,生成第一刚体的三维3D外包络盒的位置信息。
本申请第三方面的一种可能的实现方式中,在第一刚体在第三图像中或者只漏出侧面的情况下,第二结果中还可以包括第一刚体的2D包络框的左下角顶点的坐标和/或右下角顶点的坐标,以替代第一方面的第一结果中的车轮的坐标。在第一刚体在第三图像中漏出 主面和侧面的情况下,第二结果中还可以包括以下中的一项或多项:第一刚体的分界线与2D包络框的底边的交点的坐标、第一刚体的2D包络框的左下角顶点的坐标和右下角顶点的坐标,以替代第一方面的第一结果中的车轮的坐标。
本申请第三方面中,执行设备还可以执行第一方面的各个可能实现方式中的步骤,对于本申请实施例第三方面以及第三方面的各种可能实现方式的具体实现步骤,以及每种可能实现方式所带来的有益效果,均可以参考第一方面中各种可能的实现方式中的描述,此处不再一一赘述。
第四方面,本申请实施例提供一种网络的训练方法,可用于人工智能领域的图像处理领域中。方法可以包括:训练设备获取训练图像和训练图像的标注数据,训练图像中包括第二车辆,在第二车辆在训练图像中漏出侧面的情况下,标注数据包括第二车辆的车轮的标注坐标和第二车辆的标注第一角度,第二车辆的第一角度指示第二车辆的侧边线与训练图像的第一轴线之间夹角的角度,第二车辆的侧边线为第二车辆漏出的侧面与第二车辆所在地平面之间的交线,训练图像的第一轴线与训练图像的一个边平行。训练设备将训练图像输入图像处理网络中,得到图像输入网络输出的第三结果,第三结果包括第二车辆的车轮的生成坐标和第二车辆的生成第一角度。训练设备根据标注数据和第三结果,利用损失函数对图像处理网络进行训练,直至满足损失函数的收敛条件,输出训练后的图像处理网络,损失函数用于拉近生成坐标与标注坐标之间的相似度,且拉近生成第一角度和标注第一角度之间的相似度。本实现方式中,由于二维包络框的位置信息、车轮的坐标和第一角度这三种参数的准确度与图像中车辆是否完整无关,所以无论图像中的车辆是否完整,训练后的图像处理网络都能够输出准确的信息,有利于提高图像处理网络的稳定性;此外,二维包络框的位置信息、车轮的坐标和第一角度的标注规则简单,相对于目前利用激光雷达进行训练数据标注的方式,大大降低了训练数据标注过程的难度。
在第四方面的一种可能实现方式中,在第二车辆在训练图像中漏出侧面和主面的情况下,标注数据还包括第二车辆的分界线的标注位置信息和第二车辆的标注第二角度,第三结果还包括第二车辆的分界线的生成位置信息和第二车辆的生成第二角度,损失函数还用于拉近生成位置信息和标注位置信息之间的相似度,且拉近生成第二角度与标注第二角度之间的相似度。其中,第二车辆的主面为第二车辆的前面或后面,分界线为侧面与主面之间的分界线,第二车辆的第二角度指示第二车辆的主边线与训练图像的第一轴线之间夹角的角度,第二车辆的主边线为第二车辆漏出的主面与第二车辆所在的地平面之间的交线。
在第四方面的一种可能实现方式中,图像处理网络包括二阶段目标检测网络和三维特征提取网络,二阶段目标检测网络包括区域生成网络。训练设备将训练图像输入图像处理网络中,得到图像输入网络输出的第三结果,包括:训练设备将训练图像输入二阶段目标检测网络中,得到二阶段目标检测网络中的区域生成网络输出的第二车辆的2D包络框的位置信息;训练设备将第一特征图输入三维特征提取网络,得到三维特征提取网络输出的第三结果,第一特征图为训练图像的特征图中位于区域生成网络输出的2D包络框内的特征图。训练设备输出训练后的图像处理网络,包括:训练设备输出包括二阶段目标检测网络和三维特征提取网络的图像处理网络。
本实现方式中,由于区域生成网络直接输出的2D包络框的准确度较低,也即基于区域生成网络直接输出的2D包络框得到的第一特征图的精度较低,有利于提高训练阶段的难度,进而提高训练后图像处理网络的鲁棒性。
对于本申请实施例第四方面以及第四方面的各种可能实现方式中名词的具体含义,均可以参考第一方面中各种可能的实现方式中的描述,此处不再一一赘述。
第五方面,本申请实施例提供一种图像处理装置,可用于人工智能领域的图像处理领域中。装置包括获取模块、输入模块和生成模块,其中,获取模块,用于获取第一图像,第一图像中包括第一车辆;输入模块,用于将第一图像输入图像处理网络中,得到图像处理网络输出的第一结果,在第一车辆在第一图像中漏出侧面的情况下,第一结果包括第一车辆的二维2D包络框的位置信息、第一车辆的车轮的坐标和第一车辆的第一角度,第一车辆的第一角度指示第一车辆的侧边线与第一图像的第一轴线之间夹角的角度,第一车辆的侧边线为第一车辆漏出的侧面与第一车辆所在地平面之间的交线,第一图像的第一轴线与第一图像的一个边平行;生成模块,用于根据第一车辆的2D包络框的位置信息、车轮的坐标和第一角度,生成第一车辆的三维3D外包络盒的位置信息,第一车辆的3D外包络盒的位置信息包括至少两个第一点的坐标,至少两个第一点均位于第一车辆的3D外包络盒的边上,至少两个第一点中两个第一点定位第一车辆的3D外包络盒的边,至少两个第一点的坐标用于定位第一车辆的3D外包络盒。
本申请实施例第五方面中,图像处理装置包括各个模块还可以用于实现第一方面各种可能实现方式中的步骤,对于本申请实施例第五方面以及第五方面的各种可能实现方式中某些步骤的具体实现方式,以及每种可能实现方式所带来的有益效果,均可以参考第一方面中各种可能的实现方式中的描述,此处不再一一赘述。
第六方面,本申请实施例提供一种图像处理装置,可用于人工智能领域的图像处理领域中。装置包括:获取模块和生成模块,其中,获取模块,用于获取第一图像,第一图像中包括第一车辆;生成模块,用于通过图像处理网络,根据第一图像,得到第一车辆的三维3D外包络盒的位置信息;生成模块,还用于根据第一车辆的三维3D外包络盒的位置信息,生成第一车辆的三维特征信息,第一车辆的三维特征信息包括以下中的一项或多项:第一车辆相对于自车的朝向角、第一车辆的质心点的位置信息和第一车辆的尺寸。
本申请实施例第六方面中,图像处理装置包括各个模块还可以用于实现第二方面各种可能实现方式中的步骤,对于本申请实施例第六方面以及第六方面的各种可能实现方式中某些步骤的具体实现方式,以及每种可能实现方式所带来的有益效果,均可以参考第二方面中各种可能的实现方式中的描述,此处不再一一赘述。
第七方面,本申请实施例提供一种图像处理装置,可用于人工智能领域的图像处理领域中。装置包括获取模块、输入模块和生成模块,其中,获取模块,用于获取第三图像,第三图像中包括第一刚体,第一刚体为立方体;输入模块,用于将第三图像输入图像处理网络中,得到图像处理网络输出的第二结果,在第一刚体在第三图像中漏出侧面的情况下,第二结果包括第一刚体的2D包络框的位置信息和第一刚体的第一角度,第一刚体的第一角度指示第一刚体的侧边线与第三图像的第一轴线之间夹角的角度,第一刚体的侧边线为 第一刚体漏出的侧面与第一刚体所在平面之间的交线,第三图像的第一轴线与第三图像的一个边平行;生成模块,用于根据第一刚体的2D包络框的位置信息和第一角度,生成第一刚体的三维3D外包络盒的位置信息,第一刚体的3D外包络盒的位置信息包括至少两个第三点的坐标,至少两个第三点均位于第一刚体的3D外包络盒的边上,至少两个第三点中两个第三点定位第一刚体的3D外包络盒的边,至少两个第三点的坐标用于定位第一刚体的3D外包络盒。
本申请实施例第七方面中,图像处理装置包括各个模块还可以用于实现第三方面各种可能实现方式中的步骤,对于本申请实施例第七方面以及第七方面的各种可能实现方式中某些步骤的具体实现方式,以及每种可能实现方式所带来的有益效果,均可以参考第三方面中各种可能的实现方式中的描述,此处不再一一赘述。
第八方面,本申请实施例提供一种图像处理装置,可用于人工智能领域的图像处理领域中。装置包括获取模块、输入模块和训练模块,其中,获取模块,用于获取训练图像和训练图像的标注数据,训练图像中包括第二车辆,在第二车辆在训练图像中漏出侧面的情况下,标注数据包括第二车辆的车轮的标注坐标和第二车辆的标注第一角度,第二车辆的第一角度指示第二车辆的侧边线与训练图像的第一轴线之间夹角的角度,第二车辆的侧边线为第二车辆漏出的侧面与第二车辆所在地平面之间的交线,训练图像的第一轴线与训练图像的一个边平行;输入模块,用于将训练图像输入图像处理网络中,得到图像输入网络输出的第三结果,第三结果包括第二车辆的车轮的生成坐标和第二车辆的生成第一角度;训练模块,用于根据标注数据和第三结果,利用损失函数对图像处理网络进行训练,直至满足损失函数的收敛条件,输出训练后的图像处理网络,损失函数用于拉近生成坐标与标注坐标之间的相似度,且拉近生成第一角度和标注第一角度之间的相似度。
本申请实施例第八方面中,图像处理装置包括各个模块还可以用于实现第四方面各种可能实现方式中的步骤,对于本申请实施例第八方面以及第八方面的各种可能实现方式中某些步骤的具体实现方式,以及每种可能实现方式所带来的有益效果,均可以参考第四方面中各种可能的实现方式中的描述,此处不再一一赘述。
第九方面,本申请实施例提供了一种执行设备,可以包括处理器,处理器和存储器耦合,存储器存储有程序指令,当存储器存储的程序指令被处理器执行时实现上述第一方面所述的图像的处理方法,或者,当存储器存储的程序指令被处理器执行时实现上述第二方面所述的图像的处理方法,或者,当存储器存储的程序指令被处理器执行时实现上述第三方面所述的图像的处理方法。对于处理器执行第一方面、第二方面或第三方面的各个可能实现方式中执行设备执行的步骤,具体均可以参阅上述第一方面、第二方面或第三方面,此处不再赘述。
第十方面,本申请实施例提供了一种自动驾驶车辆,可以包括处理器,处理器和存储器耦合,存储器存储有程序指令,当存储器存储的程序指令被处理器执行时实现上述第一方面所述的图像的处理方法,或者,当存储器存储的程序指令被处理器执行时实现上述第二方面所述的图像的处理方法,或者,当存储器存储的程序指令被处理器执行时实现上述第三方面所述的图像的处理方法。对于处理器执行第一方面、第二方面或第三方面的各个 可能实现方式中执行设备执行的步骤,具体均可以参阅上述第一方面、第二方面或第三方面,此处不再赘述。
第十一方面,本申请实施例提供了一种训练设备,可以包括处理器,处理器和存储器耦合,存储器存储有程序指令,当存储器存储的程序指令被处理器执行时实现上述第四方面所述的网络的训练方法。对于处理器执行第四方面的各个可能实现方式中训练设备执行的步骤,具体均可以参阅第四方面,此处不再赘述。
第十二方面,本申请实施例提供了一种计算机可读存储介质,所述计算机可读存储介质中存储有计算机程序,当其在计算机上运行时,使得计算机执行上述第一方面、第二方面或第三方面所述的图像的处理方法,或者,使得计算机执行上述第四方面所述的网络的训练方法。
第十三方面,本申请实施例提供了一种电路***,所述电路***包括处理电路,所述处理电路配置为执行上述第一方面、第二方面或第三方面所述的图像的处理方法,或者,所述处理电路配置为执行上述第四方面所述的网络的训练方法。
第十四方面,本申请实施例提供了一种计算机程序,当其在计算机上运行时,使得计算机执行上述第一方面、第二方面或第三方面所述的图像的处理方法,或者,使得计算机执行上述第四方面所述的网络的训练方法。
第十五方面,本申请实施例提供了一种芯片***,该芯片***包括处理器,用于支持服务器或图像处理装置实现上述方面中所涉及的功能,例如,发送或处理上述方法中所涉及的数据和/或信息。在一种可能的设计中,所述芯片***还包括存储器,所述存储器,用于保存服务器或通信设备必要的程序指令和数据。该芯片***,可以由芯片构成,也可以包括芯片和其他分立器件。
附图说明
图1为本申请实施例提供的为人工智能主体框架的一种结构示意图;
图2为本申请实施例提供的图像处理***的一种***架构图;
图3为本申请实施例提供的图像的处理方法的一种流程示意图;
图4为本申请实施例提供的图像的处理方法中第一结果的一种示意图;
图5为本申请实施例提供的图像的处理方法中第一结果的另一种示意图;
图6为本申请实施例提供的图像的处理方法中第一点的一种示意图;
图7为本申请实施例提供的图像的处理方法中第一点的另一种示意图;
图8为本申请实施例提供的网络的训练方法的一种流程示意图;
图9为本申请实施例提供的网络的训练方法的另一种流程示意图;
图10为本申请实施例提供的网络的训练方法的又一种流程示意图;
图11为本申请实施例提供的图像的处理方法的另一种流程示意图;
图12为本申请实施例提供的图像处理方法中3D包络盒的一种示意图;
图13为本申请实施例提供的图像处理装置的一种结构示意图;
图14为本申请实施例提供的图像处理装置的另一种结构示意图;
图15为本申请实施例提供的网络训练装置的一种结构示意图;
图16为本申请实施例提供的执行设备的一种结构示意图;
图17为本申请实施例提供的自动驾驶车辆的一种结构示意图;
图18为本申请实施例提供的训练设备的一种结构示意图;
图19为本申请实施例提供的芯片的一种结构示意图。
具体实施方式
本申请实施例提供了一种图像的处理方法、网络的训练方法以及相关设备,根据第一车辆的二维包络框的位置信息、车轮的坐标和第一角度这三种参数,生成第一车辆的三维3D外包络盒的位置信息,提高了获取到的3D外包络盒的准确度。
下面结合附图,对本申请的实施例进行描述。本领域普通技术人员可知,随着技术的发展和新场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”等是用于区别类似的对象,而不必用于描述特定的顺序或先后次序。应该理解这样使用的术语在适当情况下可以互换,这仅仅是描述本申请的实施例中对相同属性的对象在描述时所采用的区分方式。此外,术语“包括”和“具有”以及他们的任何变形,意图在于覆盖不排他的包含,以便包含一系列单元的过程、方法、***、产品或设备不必限于那些单元,而是可包括没有清楚地列出的或对于这些过程、方法、产品或设备固有的其它单元。
首先对人工智能***总体工作流程进行描述,请参见图1,图1示出的为人工智能主体框架的一种结构示意图,下面从“智能信息链”(水平轴)和“IT价值链”(垂直轴)两个维度对上述人工智能主题框架进行阐述。其中,“智能信息链”反映从数据的获取到处理的一列过程。举例来说,可以是智能信息感知、智能信息表示与形成、智能推理、智能决策、智能执行与输出的一般过程。在这个过程中,数据经历了“数据—信息—知识—智慧”的凝练过程。“IT价值链”从人智能的底层基础设施、信息(提供和处理技术实现)到***的产业生态过程,反映人工智能为信息技术产业带来的价值。
(1)基础设施
基础设施为人工智能***提供计算能力支持,实现与外部世界的沟通,并通过基础平台实现支撑。通过传感器与外部沟通;计算能力由智能芯片提供,前述智能芯片包括但不限于中央处理器(central processing unit,CPU)、嵌入式神经网络处理器(neural-network processing unit,NPU)、图形处理器(graphics processing unit,GPU)、专用集成电路(application specific integrated circuit,ASIC)和现场可编程逻辑门阵列(field programmable gate array,FPGA)等硬件加速芯片;基础平台包括分布式计算框架及网络等相关的平台保障和支持,可以包括云存储和计算、互联互通网络等。举例来说,传感器和外部沟通获取数据,这些数据提供给基础平台提供的分布式计算***中的智能芯片进行计算。
(2)数据
基础设施的上一层的数据用于表示人工智能领域的数据来源。数据涉及到图形、图像、语音、文本,还涉及到传统设备的物联网数据,包括已有***的业务数据以及力、位移、 液位、温度、湿度等感知数据。
(3)数据处理
数据处理通常包括数据训练,机器学习,深度学习,搜索,推理,决策等方式。
其中,机器学习和深度学习可以对数据进行符号化和形式化的智能信息建模、抽取、预处理、训练等。
推理是指在计算机或智能***中,模拟人类的智能推理方式,依据推理控制策略,利用形式化的信息进行机器思维和求解问题的过程,典型的功能是搜索与匹配。
决策是指智能信息经过推理后进行决策的过程,通常提供分类、排序、预测等功能。
(4)通用能力
对数据经过上面提到的数据处理后,进一步基于数据处理的结果可以形成一些通用的能力,比如可以是算法或者一个通用***,例如,翻译,文本的分析,计算机视觉的处理,语音识别等等。
(5)智能产品及行业应用
智能产品及行业应用指人工智能***在各领域的产品和应用,是对人工智能整体解决方案的封装,将智能信息决策产品化、实现落地应用,其应用领域主要包括:智能终端、智能制造、智能交通、智能家居、智能医疗、智能安防、自动驾驶、平安城市等。
本申请可以应用于人工智能领域的各种领域中,具体可以应用于各种需要对周围环境中的刚体进行3D外包络盒定位的场景中。其中,刚体是指在运动中和受力作用后,形状和大小不变,且内部各点的相对位置不变的物体;前述刚体具体可以表现为道路中的车辆、路障或其他类型的刚体等。作为示例,例如本申请实施例可以应用于对自车(也即用户所在的自动驾驶车辆)周围车辆的朝向角进行估计的场景中,可以先对自车周围车辆的3D外包络盒进行定位,并利用3D外包络盒的边上的点,生成自车周围车辆的朝向角。作为另一示例,例如本申请实施例可以应用于对自车周围的路障的位置进行估计的场景中,可以先对自车周围的路障的3D包络盒进行定位,进而利用3D外包络盒的边上的点,生成自车周围的路障的位置信息等。应当理解,此处介绍仅为方便理解本申请实施例的应用场景,不对本申请实施例的应用场景进行穷举。以下均以本申请实施例应用于自动驾驶领域为例进行说明。
为了便于理解本方案,先对本申请实施例提供的图像处理***进行介绍,请参阅图2,图2为本申请实施例提供的图像处理***的一种***架构图,在图2中,图像处理***200包括执行设备210、训练设备220、数据库230和数据存储***240,执行设备210中包括计算模块211。
其中,数据库230中存储有训练数据集合,训练数据集合中包括多个训练图像以及每个训练图像的标注数据,训练设备220生成用于图像的目标模型/规则201,并利用数据库中的训练数据集合对目标模型/规则201进行迭代训练,得到成熟的目标模型/规则201。
训练设备220得到的图像处理网络可以应用不同的***或设备中,例如自动驾驶车辆、手机、平板、智能家电、监控***等等。其中,执行设备210可以调用数据存储***240中的数据、代码等,也可以将数据、指令等存入数据存储***240中。数据存储***240 可以置于执行设备210中,也可以为数据存储***240相对执行设备210是外部存储器。
计算模块211可以通过图像处理网络对执行设备210采集到的图像进行处理,得到图像中第一刚体的2D包络框的位置信息和第一角度,第一角度指示侧边线与第一轴线之间夹角的角度,第一刚体的侧边线为第一刚体漏出的侧面与第一刚体所在地平面之间的交线,第一图像的第一轴线与第一图像的一个边平行,进而执行设备210可以根据第一刚体的二维(2 dimension,2D)包络框的位置信息和第一角度,生成第一刚体的3D包络盒的边上的点的坐标,以对第一刚体的3D包络盒进行定位,由于2D包络框的位置信息和第一角度这两种参数的准确度与图像中刚体是否完整无关,所以无论图像中的刚体是否完整,得到的第一点的坐标是准确的,从而定位出的3D外包络盒都是准确的。
本申请的一些实施例中,例如图2中,“用户”可以直接与执行设备210进行交互,也即执行设备210与客户设备集成于同一设备中。但图2仅是本发明实施例提供的两种图像处理***的架构示意图,图中所示设备、器件、模块等之间的位置关系不构成任何限制。在本申请的另一些实施例中,执行设备210和客户设备可以为分别独立的设备,执行设备210配置有输入/输出接口,与客户设备进行数据交互,“用户”可以通过客户设备向输入/输出接口输入采集到的图像,执行设备210通过输入/输出接口将第一点的坐标返回给客户设备。
结合上述描述,本申请实施例提供了一种图像的处理方法,可应用于图2中示出的执行设备210中。自车上可以预先配置有训练后的图像处理网络,在获取到包括第一车辆的第一图像之后,将第一图像输入到图像处理网络中,得到图像处理网络输出的第一结果,第一结果中包括第一车辆的2D包络框、第一车辆的车轮的坐标和第一车辆的侧边线与第一图像的一个轴线之间夹角的第一角度。根据图像处理网络输出的第一结果,生成第一点的坐标,第一点指的是第一车辆的3D外包络盒的边上的点,并利用第一点的坐标对第一车辆的3D外包络盒进行定位,由于2D包络框的位置信息、车轮的坐标和第一角度这三种参数的准确度与图像中车辆是否完整无关,所以无论图像中的车辆是否完整,得到的第一点的坐标是准确的,从而定位出的3D外包络盒都是准确的,也即提高了获取到的3D外包络盒的准确度。由图2中的描述可知,本申请实施例包括推理阶段和训练阶段,而推理阶段和训练阶段的流程有所不同,以下分别对推理阶段和训练阶段进行描述。
一、推理阶段
本申请实施例中,推理阶段描述的是执行设备210如何利用成熟的图像处理网络对第一图像中的第一车辆的3D外包络盒进行定位的过程。自车在对第一车辆的3D外包络盒进行定位后,可以对第一车辆的朝向角、质心点位置和/或尺寸等3D特征信息进行预估。本申请实施例中,请参阅图3,图3为本申请实施例提供的图像的处理方法的一种流程示意图,本申请实施例提供的图像的处理方法可以包括:
301、自车获取第一图像,第一图像中包括第一车辆。
本申请实施例中,自车上可以配置有进行图像采集的摄像设备,从而自车可以通过前述摄像设备进行图像采集,以获取到第一图像。其中,前述摄像设备包括但不限于相机、采集卡、雷达或其他类型的摄像设备等,一个第一图像中可以包括一个或多个第一车辆以 及第一车辆所在的环境。第一图像可以为一张独立的图像,也可以为视频中的一帧视频帧。
进一步地,若自车上配置的为单目摄像***,则第一图像可以为通过单目摄像***采集到的;若自车上配置的为双目摄像***,则第一图像可以为通过双目摄像***采集到的两个图像中的任一个图像;若自车上配置的为多目摄像***,则第一图像可以为通过多目摄像***采集到的多个图像中的任一个图像。
302、自车将第一图像输入图像处理网络中,得到图像处理网络输出的第一结果。
本申请实施例中,自车上预先配置有成熟的图像处理网络,在获取到第一图像之后,将第一图像输入到图像处理网络中,得到图像处理网络输出的一组或多组第一结果,第一结果的数量与第一图像中第一车辆的数量一致,一组第一结果用于指示一个第一车辆的特征信息。其中,在第一车辆在第一图像中漏出侧面的情况下,每组第一结果中可以包括第一车辆的2D包络框(bounding frame)的位置信息、第一车辆的车轮的坐标和第一角度。在第一车辆在第一图像中仅漏出主面且未漏出侧面的情况下,每组第一结果中可以包括第一车辆的2D包络框的位置信息,前述主面指的是前面或后面。
进一步地,第一车辆的2D包络框的位置信息可以包括2D包络框的中心点的坐标和2D包络框的边长。由于车辆的车轮存在一定的厚度,第一车辆的车轮的坐标指的可以为车轮的外侧着地点的坐标,也可以为车轮的内侧着地点的坐标,还可以为车轮厚度中间的着地点的坐标等等。第一车辆的车轮的坐标可以包括一个车轮或两个车轮的坐标,具体情况可由实际拍摄到的图像决定。更进一步地,前述车轮的坐标和中心点的坐标所对应的可以为同一坐标系,该坐标系的原点可以为第一图像的任一个顶点,也可以为第一图像的中心点,还可以为第一图像中的其他位置点等,此处不做限定。该坐标系的两条坐标轴分别为第一图像的U轴和V轴。第一车辆的第一角度指示第一车辆的侧边线与第一图像的第一轴线之间夹角的角度,第一车辆的侧边线为第一车辆漏出的侧面与第一车辆所在地平面之间的交线,第一图像的第一轴线与第一图像的一个边平行,第一轴线可以为与第一图像的U轴平行,也可以与第一图像的V轴平行;更进一步地,第一角度的取值范围可以为0度到360度,也可以为负180度到正180度,此处不做限定。
为了更为直观的理解本方案,请参阅图4,图4为本申请实施例提供的图像的处理方法中第一结果的一种示意图,图4中以第一车辆在第一图像中仅漏出侧面,车轮的坐标采用的为车轮的外侧找地点的坐标,第一轴线采用的为第一图像的U轴为例。其中,A1代表第一车辆的2D包络框,A2代表第一车辆的车轮的坐标,A3代表第一图像的U轴,A4代表第一图像的V轴,A5代表第一车辆的侧边线,A6代表第一车辆的第一角度,应理解,图5中的示例仅为方便理解本方案,不用于限定本方案。
可选地,在第一车辆在第一图像中漏出主面和侧面的情况下,第一结果中还可以包括第一车辆的分界线的位置信息和第一车辆的第二角度。
进一步地,分界线为侧面与主面之间的分界线,若第一车辆在第一图像中漏出前面和侧面,则第一车辆的分界线为漏出的前面和侧面之间的分界线;若第一车辆在第一图像中漏出后面和侧面,则第一车辆的分界线为漏出的后面和侧面之间的分界线。其中,第一车辆的前面指的是第一车辆的车头所在面,第一车辆的后面指的是第一车辆的车尾所在面。 第一车辆的分界线可以为穿过第一车辆的车灯的轮廓,或者,第一车辆的分界线也可以为穿过第一车辆的车灯的中心点,或者,第一车辆的分界线还可以为穿过第一车辆的侧边线和第一车辆的主边线的交点,第一车辆的主面和侧面之间的分界线还可以依据其他信息确定,此处不做限定。分界线的位置信息具体可以表现为一个数值,该数值可以为第一车辆的分界线与第一车辆的2D包络框的一个边之间的距离值,也可以为分界线与坐标系下的U轴之间交点的U轴坐标值等,此处不做限定。本申请实施例中,提供了2D包络框的位置信息的具体实现形式,以及,分界线的位置信息的几种具体实现方式,提高了本方案的选择灵活性。
第一车辆的第二角度指示第一车辆的主边线与第一图像的第一轴线之间夹角的角度,第一车辆的主边线为第一车辆漏出的主面与第一车辆所在的地平面之间的交线,若第一车辆在第一图像中漏出前面和地平面,则第一车辆的主边线为漏出的前面和地平面之间的交线;若第一车辆在第一图像中漏出后面和地平面,则第一车辆的主边线为漏出的后面和地平面之间的交线。第二角度的取值范围与第一角度的取值范围一致,此处不再赘述。
为了更为直观的理解本方案,请参阅图5,图5为本申请实施例提供的图像的处理方法中第一结果的一种示意图,图5中以第一车辆在第一图像中漏出侧面和前面,车轮的坐标采用的为车轮的外侧着地点的坐标,第一轴线采用的为第一图像的U轴,分界线穿过车灯的外轮廓为例。其中,B1代表第一车辆的2D包络框,B2代表第一车辆的车轮的坐标,B3代表第一图像的U轴,B4代表第一图像的V轴,B5代表第一车辆的侧边线,B6代表第一车辆的第一角度,B7代表第一车辆的侧面和前面之间的分界线,B8代表第一车辆的主边线(也即图5中的前边线),B9代表第一车辆的第二角度,应理解,图5中的示例仅为方便理解本方案,不用于限定本方案。
进一步可选地,第一结果还可以包括第一车辆在第一图像中的漏出面的指示信息,漏出面包括以下中的一项或多项:侧面、前面和后面,前述侧面包括左面和右面。具体的,漏出面的指示信息具体可以表现为数字序列,作为示例,例如该数字序列包括四组数字,分别与第一车辆的前面、后面、左面和右面对应,一组数字指示第一图像中是否漏出与该一组数字对应的面,一组数字包括一个或多个数值。作为示例,漏出面的指示信息具体表现为1010,分别对应第一车辆的前面、后面、左面和右面,1指示第一图像中存在对应的面,0代表第一图像中不存在对应的面,则“1010”指示第一车辆在第一图像中漏出了前面和左面。漏出面的指示信息具体也可以表现为一串字符,作为示例,例如漏出面的指示信息具体表现为“前面和右面”,指示第一车辆在第一图像中漏出了前面和右面等,对于前述指示信息的具体表现形式,可以结合实际的产品形态确定,此处不做限定。本申请实施例中,第一结果中还包括第一车辆在第一图像中的漏出面的指示信息,从而后续在利用第一点的坐标,生成第一车辆的三维特征信息的过程中,可以根据第一图像中的漏出面的指示信息,确定第一车辆在第一图像中是仅漏出主面,还是仅漏出侧面,还是同时漏出侧面和主面,有利于提高后续三维特征信息生成过程的精度,以提高生成的三维特征信息的准确度。
303、自车根据第一结果,生成第一车辆的三维3D外包络盒的位置信息。
本申请的一些实施例中,自车在得到第一结果之后,会先根据第一结果,生成第一车辆的三维3D外包络盒的位置信息,再根据第一车辆的三维3D外包络盒的位置信息,生成第一车辆的3D特征信息。其中,第一车辆的3D外包络盒的位置信息包括至少两个第一点的坐标,至少两个第一点均位于第一车辆的3D外包络盒的边上,至少两个第一点中两个第一点定位第一车辆的3D外包络盒的边,至少两个第一点的坐标用于定位第一车辆的3D外包络盒。进一步地,第一车辆的3D外包络盒包括12条边和8个顶点,本申请实施例中定位的概念指的是能够确定前述12条边和8个顶点中的部分边和/或顶点的位置。可选地,由于在生成第一车辆的3D特征信息的过程中,主要利用的是3D外包络盒的底面信息,则根据第一点的坐标和第一角度主要定位的为第一车辆的3D外包络盒的底面上的边和/或顶点的位置。
具体的,第一车辆在第一图像中同时漏出侧面和主面,和,第一车辆在第一图像中仅漏出侧面这两种情况下,生成的具体的第一点的位置不同。自车在得到第一结果之后,可以根据第一车辆的漏出面的指示信息,判断第一车辆在第一图像中是仅漏出侧面,还是同时漏出了侧面和主面,进而对前述两种情况分别进行处理,以下分别介绍这两种情况。
A、第一车辆在第一图像中仅漏出侧面
本实施例中,自车在根据第一车辆的漏出面的指示信息,确定第一车辆在第一图像中仅漏出侧面的情况下,步骤303可以包括:自车根据第一车辆的2D包络框的位置信息、第一车辆的车轮的坐标和第一车辆的第一角度,生成至少两个第一点的坐标,至少两个第一点包括第一车辆的侧边线与2D包络框的两个交点。其中,第一点是一个泛指的概念,第一点指的是在第一车辆在第一图像中仅漏出侧面这一具体场景下,生成的位于第一车辆的3D外包络盒的边上的点的坐标。本申请实施例中,在第一车辆在第一图像中仅漏出侧面的情况下,第一点为第一车辆的侧边线与2D包络框之间的交点,细化了在特定场景下,第一点的具体表现形态,提高了与应用场景的结合度。
具体的,自车根据第一车辆的车轮的坐标和第一角度,生成第一车辆的侧边线的位置信息,根据第一车辆的侧边线的位置信息和2D包络框的位置信息,执行坐标生成操作,以得到至少两个第一点的坐标。本申请实施例中,自车根据第一车辆的车轮的坐标和第一角度,就可以生成第一车辆的侧边线的位置信息,操作简单,易于实现,且准确度高。
更具体的,自车根据第一车辆的车轮的坐标和第一角度,可以生成第一车辆的侧边线的直线方程,也即得到了第一车辆的侧边线的位置信息。自车根据第一车辆的2D包络框的位置信息,可以确定第一车辆的2D包络框的左边界和右边界的位置,根据第一车辆的侧边线的直线方程,生成侧边线与前述左边界的交点M(也即一个第一点)的坐标,生成侧边线与前述右边界的交点N(也即另一个第一点)的坐标。其中,前述交点M和交点N之间的连线也即第一车辆的侧边线,第一车辆的侧边线即为第一车辆的3D外包络盒的底面的一条边。第一车辆的侧面存在前后两个边界,若第一图像中漏出了第一车辆的侧面的任一个边界,则两个第一点中可以存在第一车辆的3D外包络盒的顶点;若第一图像中未漏出第一车辆的侧面的任一个边界,则两个第一点中不存在第一车辆的3D外包络盒的顶点。2D包络框的左边界和右边界分别与第一车辆的3D包络盒的侧面的两条边平行。因此, 利用第一点的坐标能够实现对第一车辆的3D包络盒的定位。
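为便于理解上述"由车轮坐标与第一角度生成侧边线,再与2D包络框左右边界求交"的过程,下面给出一段示意性的Python代码草图。该代码仅为一种可能实现的示例,其中的函数名、变量以及"第一角度以图像U轴为基准、单位为度"等约定均为本示例的假设,并非本申请限定的实现方式:

```python
import math

def side_line_box_intersections(wheel_uv, first_angle_deg, box):
    """根据车轮着地点坐标与第一角度构造侧边线, 并求其与2D包络框左右边界的交点。

    wheel_uv:        (u, v) 车轮着地点在图像坐标系下的坐标
    first_angle_deg: 第一角度, 即侧边线与图像U轴之间的夹角(度)
    box:             (u_min, v_min, u_max, v_max) 2D包络框
    返回: 两个第一点的坐标 [(u_left, v_left), (u_right, v_right)]
    """
    u_w, v_w = wheel_uv
    u_min, _, u_max, _ = box
    k = math.tan(math.radians(first_angle_deg))   # 侧边线相对U轴的斜率(角度接近90度时需特殊处理)

    # 侧边线直线方程: v = v_w + k * (u - u_w)
    def v_at(u):
        return v_w + k * (u - u_w)

    # 与2D包络框左边界(u = u_min)和右边界(u = u_max)的交点, 即两个第一点
    return [(u_min, v_at(u_min)), (u_max, v_at(u_max))]

# 示例(数值为假设): 车轮着地点(620, 410), 第一角度8度, 2D包络框(500, 300, 900, 430)
points = side_line_box_intersections((620.0, 410.0), 8.0, (500.0, 300.0, 900.0, 430.0))
```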
可选地,步骤303还可以包括:自车根据第一车辆的2D包络框的位置信息,确定第一车辆的2D包络框的左上角的顶点的坐标和右上角的顶点的坐标,并将前述左上角的顶点O和右上角的顶点P分别确定为两个第一点,将顶点O的坐标和顶点P的坐标分别确定为个第一点的坐标,顶点O和顶点P均位于第一车辆的3D外包络盒的顶面的边上。
为进一步理解本方案,结合图4进行举例,请参阅图6,图6为本申请实施例提供的图像的处理方法中第一点的一种示意图,图6中以第一车辆在第一图像中仅漏出侧面为例。其中,A1代表第一车辆的2D包络框,A1是基于第一车辆的2D包络框的位置信息生成的;A5代表第一车辆的侧边线,A5是基于第一车辆的车轮的坐标和第一角度生成的。D1代表第一车辆的侧边线与第一车辆的2D包络框的左边界的一个交点(也即上述交点M),D2代表第一车辆的侧边线与第一车辆的2D包络框的右边界的一个交点(也即上述交点N),D3代表第一车辆的2D包络框的左上角顶点(也即上述顶点O),D4代表第一车辆的2D包络框的右上角顶点(也即上述顶点P),D1、D2、D3和D4分别为位于第一车辆的3D外包络盒的边上的四个第一点,D1和D2位于第一车辆的3D外包络盒的底面的边上,D3和D4位于第一车辆的3D外包络盒的顶面的边上,D1、D2、D3和D4的坐标即为生成的四个第一点的坐标,应理解,在其他实施例中,也可以不生成D3和D4的坐标,图6中的示例仅为方便理解本方案,不用于限定本方案。
B、第一车辆在第一图像中漏出侧面和主面
本实施例中,在第一车辆在第一图像中漏出侧面和主面的情况下,第一结果中还包括第一车辆的分界线的位置信息和第一车辆的第二角度,分界线的位置信息和第二角度的具体含义均已在步骤302中进行了介绍,此处不再赘述。至少两个第一点包括第一交点、第二交点和第三交点,第一交点的坐标、第二交点的坐标和第三交点的坐标指的是在第一车辆在第一图像中漏出侧面和主面这一具体场景下,生成的位于第一车辆的3D外包络盒的边上的点的坐标。第一交点为第一车辆的侧边线与分界线的交点,也是一车辆的3D外包络盒的一个顶点。第二交点为第一车辆的侧边线与2D包络框的交点,第一交点与第二交点的连线为第一车辆的侧边线,该第一车辆的侧边线也即第一车辆的3D包络盒的底面的一条边,进一步地,若第一图像包括第一车辆的完整的侧面,则第二交点为第一车辆的3D包络盒的底面的一个顶点。第三交点为第一车辆的主边线与2D包络框的交点,第一交点与第三交点的连线为第一车辆的主边线,该第一车辆的主边线也即第一车辆的3D包络盒的底面的另一条边;进一步地,若第一图像包络第一车辆的完整的主面,则第三交点为第一车辆的3D包络盒的底面的一个顶点。第一车辆的2D包络框的左边界和右边界,分别与第一车辆的3D包络盒的侧面的两条边平行。因此,利用第一交点的坐标、第二交点的坐标和第三交点的坐标能够实现对第一车辆的3D外包络盒的定位。本申请实施例中,不仅提供了在第一车辆在第一图像中仅漏出侧面的情况下,第一点的具体表现形式,还提供了在第一车辆在第一图像中漏出侧面和主面的情况下,第一点的具体表现形式,丰富了本方案的应用场景,提高了实现灵活性。
具体的,在自车根据第一车辆的漏出面的指示信息,确定第一车辆在第一图像中漏出 侧面和主面的情况下,步骤303可以包括:自车根据分界线的位置信息、车轮的坐标和第一角度,生成第一交点的坐标。自车根据2D包络框的位置信息、车轮的坐标和第一角度,生成第二交点的坐标。根据2D包络框的位置信息、第一交点的坐标和第二角度,生成第三交点的坐标。其中,第一交点的坐标、第二交点的坐标、第三交点的坐标、第一角度和第二角度用于定位第一车辆的3D外包络盒。本申请实施例中,提供了当第一车辆在第一图像中漏出侧面和主面时,生成多个第一点的坐标的实现方式,操作简单,易于实现,且准确度高。
更具体的,在一种情况下,若自车根据第一车辆的漏出面的指示信息,确定第一车辆在第一图像中漏出侧面和前面,则主面具体为前面,主边线具体为前边线。自车根据车轮的坐标和第一角度生成侧边线的直线方程,也即生成第一车辆的侧边线的位置信息;进而根据分界线的位置信息和侧边线的直线方程,生成侧边线与分界线之间交点的坐标,也即生成了第一交点的坐标。自车根据第一车辆的2D包络框的位置信息,可以确定第一车辆的2D包络框的左边界和右边界的位置,根据侧边线的直线方程和2D包络框的右边界的位置,生成侧边线与2D包络框的右边界之间交点的坐标,也即生成了第二交点的坐标。自车根据第一交点的坐标和第二角度生成前边线的直线方程,也即生成第一车辆的前边线的位置信息,根据前边线的直线方程和2D包络框的左边界的位置,生成前边线与2D包络框的左边界之间交点的坐标,也即生成了第三交点的坐标。需要说明的是,在其他实施例中,也可以为先生成第二交点的坐标,再生成第一交点的坐标,或者,还可以为先生成第三交点的坐标,再生成第二交点的坐标等,此处不限定第一交点的坐标、第二交点的坐标和第三交点的坐标的生成顺序。
进一步地,在第一图像中包括第一车辆的侧面和前面的情况下,第一车辆的侧面和前面的分界线为第一车辆的3D包络盒的侧面的一个边。若第一图像包括第一车辆的完整的侧面,则第一车辆的2D包络框的右边界为第一车辆的3D包络盒的侧面的一个边;若第一图像包括第一车辆的完整的前面,则第一车辆的2D包络框的左边界为第一车辆的3D包络盒的侧面的一个边,从而实现了对第一车辆的3D包络盒的侧面的定位。
在另一种情况下,若自车根据第一车辆的漏出面的指示信息,确定第一车辆在第一图像中漏出侧面和后面,则主面具体为后面,主边线具体为后边线。自车根据车轮的坐标和第一角度生成侧边线的直线方程,也即生成第一车辆的侧边线的位置信息;进而根据分界线的位置信息和侧边线的直线方程,生成侧边线与分界线之间交点的坐标,也即生成了第一交点的坐标。自车根据第一车辆的2D包络框的位置信息,可以确定第一车辆的2D包络框的左边界和右边界的位置,根据侧边线的直线方程和2D包络框的左边界的位置,生成侧边线与2D包络框的左边界之间交点的坐标,也即生成了第二交点的坐标。自车根据第一交点的坐标和第二角度生成后边线的直线方程,也即生成第一车辆的后边线的位置信息,根据后边线的直线方程和2D包络框的右边界的位置,生成后边线与2D包络框的右边界之间交点的坐标,也即生成了第三交点的坐标。
进一步地,在第一图像中包括第一车辆的侧面和后面的情况下,第一车辆的侧面和后面的分界线为第一车辆的3D包络盒的侧面的一个边。若第一图像包括第一车辆的完整的 侧面,则第一车辆的2D包络框的左边界为第一车辆的3D包络盒的侧面的一个边;若第一图像包括第一车辆的完整的后面,则第一车辆的2D包络框的右边界为第一车辆的3D包络盒的侧面的一个边,从而实现了对第一车辆的3D包络盒的侧面的定位。
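对于第一车辆在第一图像中同时漏出侧面和主面的情况,可按上文步骤依次求出第一交点、第二交点和第三交点。下面是一段示意性的Python代码草图(以漏出侧面和前面为例;此处近似将分界线视为竖直线并以其U轴坐标表示,函数与变量命名均为示例性假设,并非本申请限定的实现方式):

```python
import math

def line_from_point_angle(point, angle_deg):
    """由一点和其与U轴的夹角确定直线, 返回 (k, b), 直线方程为 v = k*u + b。"""
    u0, v0 = point
    k = math.tan(math.radians(angle_deg))
    return k, v0 - k * u0

def three_intersections(wheel_uv, first_angle_deg, second_angle_deg, boundary_u, box):
    """以漏出侧面和前面为例, 生成第一交点(侧边线x分界线)、
    第二交点(侧边线x2D框右边界)、第三交点(主边线x2D框左边界)的坐标。"""
    u_min, _, u_max, _ = box

    # 1) 侧边线: 由车轮着地点和第一角度确定
    k_s, b_s = line_from_point_angle(wheel_uv, first_angle_deg)

    # 2) 第一交点: 侧边线与分界线(近似竖直线 u = boundary_u)的交点
    p1 = (boundary_u, k_s * boundary_u + b_s)

    # 3) 第二交点: 侧边线与2D包络框右边界(u = u_max)的交点
    p2 = (u_max, k_s * u_max + b_s)

    # 4) 主边线: 由第一交点和第二角度确定; 第三交点为主边线与2D框左边界的交点
    k_m, b_m = line_from_point_angle(p1, second_angle_deg)
    p3 = (u_min, k_m * u_min + b_m)
    return p1, p2, p3
```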
可选地,步骤303还可以包括:自车根据第一车辆的2D包络框的位置信息,确定第一车辆的2D包络框的左上角的顶点的坐标和右上角的顶点的坐标,并将前述两个顶点的坐标确定为两个第一点的坐标。自车根据第一车辆的2D包络框的位置信息和第一车辆的分界线的位置信息,生成分界线与2D包括框之间交点的坐标,也即生成一个第一点的坐标,前述交点为第一车辆的3D包络盒的顶面的一个顶点。
为进一步理解本方案,结合图5进行举例,请参阅图7,图7为本申请实施例提供的图像的处理方法中第一点的一种示意图,图7中以第一车辆在第一图像中漏出侧面和前面为例进行说明。其中,B1代表第一车辆的2D包络框,第一车辆的2D包络框为根据第一车辆的2D包络框的位置信息确定的;B5代表第一车辆的侧边线,第一车辆的侧边线是根据第一车辆的车轮的坐标和第一角度生成的;B7代表第一车辆的侧面和前面之间的分界线,该分界线为根据第一车辆的分界线的位置信息确定的;B8代表第一车辆的主边线,第一车辆的主边线是根据第一交点的坐标和第二角度生成的。E1代表侧边线与分界线之间的交点(也即第一交点),E1的坐标即为第一交点的坐标;E2代表侧边线与2D包络框的右边界之间的交点(也即第二交点),E2的坐标即为第二交点的坐标;E3代表主边线与2D包络框的左边界之间的交点(也即第三交点),E3的坐标即为第三交点的坐标;E4代表2D包络框的左上角顶点(也即一个第一点),E4的坐标即为第一点的坐标;E5代表2D包络框的右上角顶点(也即一个第一点),E5的坐标即为第一点的坐标;E6代表分界线与2D包络框之间的交点(也即一个第一点),E6的坐标即为第一点的坐标。其中,E1至E6均为一种具体化的第一点,E1、E2和E3均位于第一车辆的3D包络框的底面,E1为第一车辆的3D包络框的底面的一个顶点,E4、E5和E6均位于第一车辆的3D包络框的顶面,E6为第一车辆的3D包络框的顶面的一个顶点,应理解,在其他实施例中,也可以不生成E4、E5和E6的坐标,图7中的示例仅为方便理解本方案,不用于限定本方案。
304、自车判断第一车辆与自车的距离是否超过预设阈值,若未超过预设阈值,则进入步骤305;若超过预设阈值,则进入步骤314。
本申请的一些实施例中,自车生成第一车辆与自车之间的距离,进而判断第一车辆与自车之间的距离是否超过预设阈值,若未超过预设阈值,则进入步骤305;若超过预设阈值,则进入步骤314。其中,预设阈值的取值可以为10米、15米、30米、25米或其他数值等,具体可以结合实际产品形态确定。
具体的,在一种实现方式中,自车根据第一点的坐标和地平面假设原理,生成至少一个第一点中每个第一点与自车之间的距离,进而根据每个第一点与自车之间的距离,生成第一车辆与自车之间的距离。
更具体的,针对至少两个第一点中的任一个第一点,自车可以根据第一点的坐标和地平面假设原理,生成第一点在车体坐标系下的三维坐标,根据第一点在车体坐标系下的三维坐标生成第一点与自车之间的第一距离。自车重复执行前述操作,以生成每个第一点与 自车之间的第一距离。自车可以根据与至少两个第一点对应的至少两个第一距离,从前述至少两个第一距离中选取最小的第一距离,作为第一车辆与自车之间的距离,进而判断选取出的距离值最小的第一距离是否超过预设阈值,若未超过阈值,则视为第一车辆与自车的距离未超过预设阈值,也即当至少两个第一点中任一个第一点与自车之间的距离未超过预设阈值时,视为第一车辆与自车之间的距离未超过预设阈值。自车也可以从前述至少两个第一距离中选取最大的第一距离,作为第一车辆与自车之间的距离,进而判断选取出的距离值最大的第一距离是否超过预设阈值,若超过预设阈值,则视为第一车辆与自车的距离超过预设阈值,也即当至少两个第一点中任一个第一点与自车之间的距离超过预设阈值时,视为第一车辆与自车之间的距离超过预设阈值。还可以为自车将前述至少两个第一距离的平均值作为第一车辆与自车之间的距离,以执行第一车辆与自车之间的距离是否超过预设阈值的判断操作。本申请实施例中,提供了判断第一车辆与自车之间距离是否超过预设预置的两种具体实现方式,提高了本方案的实现灵活性。
在另一种实现方式中,自车也可以根据第一车辆的车轮点的坐标和地平面假设原理,生成第一车辆的车轮点在车体坐标系下的三维坐标,进而根据第一车辆的车轮点的三维坐标,生成第一车辆的车轮点与自车之间的距离,并将前述距离确定为第一车辆与自车之间的距离,以执行第一车辆与自车之间的距离是否超过预设阈值的判断操作。
应理解,在其他实施例中,自车也可以利用第一车辆的3D包络盒上的其他点的坐标来确定第一车辆与自车之间的距离,此处不做限定。
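上述"根据第一点的坐标和地平面假设原理生成其在车体坐标系下的三维坐标,进而得到第一点与自车之间的距离"的过程,可以理解为将过相机光心与该像素的射线和地平面求交。下面是一段示意性的Python代码草图,其中相机内参K、相机到车体坐标系的外参R与t、以及"车体坐标系中地平面为z=0"等设定均为本示例的假设:

```python
import numpy as np

def pixel_to_ground_point(uv, K, R_cam2body, t_cam2body):
    """地平面假设原理: 将图像上位于地面的像素反投影为车体坐标系下的三维点。

    uv:          (u, v) 图像坐标
    K:           3x3 相机内参矩阵
    R_cam2body:  3x3 相机坐标系到车体坐标系的旋转
    t_cam2body:  长度为3的数组, 相机光心在车体坐标系下的位置
    假设车体坐标系中地平面方程为 z = 0。
    """
    u, v = uv
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # 相机坐标系下的射线方向
    ray_body = R_cam2body @ ray_cam                       # 转到车体坐标系
    s = -t_cam2body[2] / ray_body[2]                      # 射线 P = t + s*ray 与 z=0 求交
    return t_cam2body + s * ray_body

def distance_to_ego(uv, K, R_cam2body, t_cam2body):
    """第一点与自车(车体坐标系原点)之间的水平距离。"""
    p = pixel_to_ground_point(uv, K, R_cam2body, t_cam2body)
    return float(np.hypot(p[0], p[1]))
```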
305、自车根据第一车辆在第一图像中的漏出面的指示信息,判断第一车辆是否在第一图像中漏出侧面,若未漏出侧面,则进入步骤306;若漏出侧面,则进入步骤307。
本申请实施例中,由于第一车辆在第一图像中可以漏出侧面,也可以未漏出侧面且仅漏出主面,而前述两种情况的处理方式有所不同。则自车在得到第一结果之后,可以根据第一车辆在第一图像中的漏出面的指示信息,来判断第一车辆在第一图像中是否漏出侧面。具体的,自车在根据第一车辆在第一图像中的漏出面的指示信息,确定第一车辆在第一图像中的漏出面中存在左面或者存在右面的情况下,均视为第一车俩在第一图像中漏出了侧面。对于第一图像中的漏出面的指示信息如何指示漏出了哪些面,可参阅步骤302中的描述,此处不做赘述。
306、自车根据2D包络框的中心点的坐标和小孔成像原理,生成第一车辆相对于自车的朝向角。
本申请的一些实施例中,自车在确定第一车辆在第一图像中未漏出侧面且仅漏出主面的情况下,自车可以认为第一车辆的3D质心点在图像上的投影点为第一车辆的2D包括框的中心点,进而根据第一车辆的2D包络框的中心点的坐标和小孔成像原理,生成第一车辆相对于自车的朝向角;进一步地,自车可以根据第一车辆相对于自车的朝向角,确定第一车辆的行驶意图,例如第一车辆是否会并道。
具体的,在一种实现方式中,第一车辆的2D包络框的位置信息中包括第一车辆的2D包络框的中心点的坐标,自车根据第一车辆的2D包络框的中心点的坐标和第一变换关系,生成第一射线在第一图像中的地平面上的投影与相机坐标系的x轴之间夹角的角度γ,角 度γ也即第一车辆在相机坐标系下的朝向角。其中,第一变换关系为相机坐标系与坐标系之间的变换关系,该第一变换关系也可以称为相机的内参,是基于小孔成像原理预先生成并配置于自车上的。第一射线为采集第一图像的相机的光心穿过第一车辆的3D质心点的射线。相机坐标系的原点为配置于自车上用于采集第一图像的相机。自车再根据角度γ和第二变换关系,生成第一车辆在自车的车体坐标系下的朝向角θ,第二变换关系指的是前述相机坐标系与前述车体坐标系之间的转换关系,该第二变换关系也可以称为相机的外参。
进一步地,相机坐标系和车体坐标系均为3D坐标系,相机坐标系的x轴可以为向右,相机坐标系的y轴可以为向下,相机坐标系的z轴可以为向前,相机坐标系的x轴和z轴可以构成与地平面平行的平面。车体坐标系的坐标系原点可以为自车的两个后车轮连线的中点,车体坐标系的坐标系原点也可以为自车的质心点,车体坐标系的x轴可以是向左,车体坐标系的y轴可以是向前,车体坐标系的z轴可以是向下,车体坐标系的x轴和y轴可以构成与地平面平行的平面。应理解,前述对于相机坐标系和车体坐标系的描述仅为方便理解本方案,在其他实施例中,也可以调整相机坐标系和/或车体坐标系的坐标系原点,或者调整相机坐标系的x轴、y轴和/或z轴的朝向等,此处均不做限定。
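下面用一段示意性的Python代码草图说明"在第一车辆仅漏出主面时,根据2D包络框中心点坐标、第一变换关系(相机内参)与第二变换关系(相机外参)得到朝向角"的大致计算流程。其中坐标系的朝向约定沿用上文假设,符号与函数名均为示例,并非本申请限定的实现方式:

```python
import numpy as np

def heading_from_box_center(center_uv, K, R_cam2body):
    """小孔成像原理: 以2D包络框中心点近似3D质心的投影, 估计朝向角。

    center_uv:   2D包络框中心点的图像坐标 (u, v)
    K:           3x3 相机内参(第一变换关系)
    R_cam2body:  3x3 相机坐标系到车体坐标系的旋转(第二变换关系的旋转部分)
    返回: (gamma, theta), 分别为相机坐标系与车体坐标系下的朝向角(弧度)
    """
    u, v = center_uv
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])  # 穿过光心与质心投影点的射线方向

    # 射线在相机坐标系地平面(x-z平面)上的投影与x轴的夹角 gamma
    gamma = np.arctan2(d_cam[2], d_cam[0])

    # 转到车体坐标系后, 在车体坐标系地平面(x-y平面)上的投影与x轴的夹角 theta
    d_body = R_cam2body @ d_cam
    theta = np.arctan2(d_body[1], d_body[0])
    return gamma, theta
```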
307、自车通过第一计算规则,根据第一点的坐标,生成第一车辆相对于自车的朝向角。
本申请的一些实施例中,自车在确定第一车辆与自车之间的距离未超过预设阈值,且自车在第一车辆中漏出侧面的情况下,可以通过第一计算规则,根据第一点的坐标,生成第一车辆相对于自车的朝向角。
具体的,在一种实现方式中,自车根据第一点的坐标和地平面假设原理,生成第一点在车体坐标系下的三维坐标,车体坐标系的坐标系原点位于自车内。其中,车体坐标系和地平面假设原理的概念都已经在步骤306中进行了描述,此处不做赘述。自车根据第一点的三维坐标,生成朝向角。本申请实施例中,无论第一车辆在第一图像中是否为完整的图像,都可以得到准确的第一点的坐标,由于朝向角是基于第一点的坐标和地平面假设原理生成的,从而保证生成的朝向角的准确性。
更具体的,针对第一点的坐标的获取过程。在一种情况下,若第一车辆的漏出面的指示信息指示第一车辆在第一图像中仅漏出侧面,则可以生成两个第一点的坐标,或者,可以生成四个第一点的坐标,具体第一点的坐标的生成过程可以参阅步骤303中的描述。自车可以从前述两个第一点或者前述四个第一点中获取位于第一车辆的侧边线上的两个第一点,也即获取位于第一车辆的3D包络盒的底面的侧边上的两个第一点。在另一种情况下,若第一车辆在第一图像中漏出了侧面和主面,则可以生成三个第一点的坐标,或者,可以生成六个第一点的坐标,具体第一点的坐标的生成过程可以参阅步骤303中的描述。自车可以从前述三个第一点或者前述六个第一点的坐标中,获取位于第一车辆的侧边线上的两个第一点的坐标。
针对生成朝向角的过程。自车根据位于第一车辆的外包络盒的底面两个第一点的坐标和地平面假设原理,分别生成该两个第一点在车体坐标系下的三维坐标,进而根据该两个第一点在车体坐标系下的三维坐标,生成第一车辆相对于自车的朝向角。进一步地,自车在得到一个第一点在车体坐标系下的三维坐标和另一个第一点在车体坐标系下的三维坐标之后,若车体坐标系的x轴和y轴构成的为与地平面平行的平面,则自车可以根据前述两个第一点的三维坐标中x轴和y轴方向的值,生成第一车辆相对于自车的朝向角θ。作为示例,例如一个第一点的三维坐标为(x₁, y₁, z₁),另一个第一点的三维坐标为(x₂, y₂, z₂),则第一车辆相对于自车的朝向角

$$\theta=\arctan\left(\frac{y_{2}-y_{1}}{x_{2}-x_{1}}\right)$$
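上式的计算可以用如下示意性代码表示(仅为示例,使用atan2以正确处理各象限):

```python
import math

def heading_from_two_points(p1, p2):
    """由侧边线上两个第一点在车体坐标系下的三维坐标计算朝向角(弧度)。"""
    x1, y1, _ = p1
    x2, y2, _ = p2
    return math.atan2(y2 - y1, x2 - x1)
```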
在另一种实现方式中,自车可以根据第一点的坐标、第一车辆的车轮的坐标和地平面假设原理,生成第一点在车体坐标系下的三维坐标和第一车辆的车轮在车体坐标系下的三维坐标,根据第一点在车体坐标系下的三维坐标和第一车辆的车轮在车体坐标系下的三维坐标,生成第一车辆相对于自车的朝向角。
更具体的,针对第一点的坐标的获取过程,在通过步骤303得到多个第一点的坐标之后,可以从中获取到位于第一车辆的侧边线上的两个第一点的坐标。进而自车可以从前述两个第一点的坐标中任选一个目标点的坐标,并根据目标点的坐标、车轮的坐标和地平面假设原理,生成目标点在车体坐标系下的三维坐标和车轮在车体坐标系下的三维坐标,进而执行朝向角的生成操作。自车也可以根据两个第一点的坐标、车轮的坐标和地平面假设原理,分别生成两个第一点在车体坐标系下的三维坐标和车轮在车体坐标系下的三维坐标,进而执行朝向角的生成操作。
需要说明的是,步骤307为可选步骤,若不执行步骤307,则在执行完步骤305之后,可以直接执行步骤308。
308、自车从至少两个第一点的坐标中获取第一车辆的3D外包络盒的顶点的坐标。
本申请的一些实施例中,在确定第一车辆与自车之间的距离未超过预设阈值,且自车在第一车辆中漏出侧面的情况下,自车在通过步骤303得到多个第一点的坐标之后,从多个第一点的坐标中选取第一车辆的3D外包络盒的顶点的坐标,以利用第一车辆的3D外包络盒的顶点的坐标来生成第一车辆的质心点的三维坐标。若多个第一点中不存在位于第一车辆的3D外包络盒的顶点,则可以不再执行步骤309,也即不再生成第一车辆的质心点的三维坐标。
具体的,针对一个第一点是否为3D外包络盒的顶点的判断过程。自车上可以预先设置有第一图像的U轴方向的第一取值范围,和,第一图像的V轴方向的第二取值范围。在得到一个第一点的坐标之后,判断第一点的坐标中U轴方向的取值是否在第一取值范围内,判断第一点的坐标中V轴方向的取值是否在第二取值范围内;若第一点的坐标中U轴方向的取值在第一取值范围内,且,第一点的坐标中V轴方向的取值在第二取值范围内,则确定第一点为3D外包络盒的顶点;若第一点的坐标中U轴方向的取值不在第一取值范围内,或者,第一点的坐标中V轴方向的取值在第二取值范围内,则确定第一点不是3D外包络盒的顶点。
309、自车根据第一车辆的3D外包络盒的顶点的坐标和地平面假设原理,生成第一车辆的质心点在车体坐标系下的三维坐标,车体坐标系的坐标系原点位于自车内。
本申请的一些实施例中,自车在从多个第一点的坐标中获取到第一车辆的3D外包络 盒的顶点的坐标之后,根据地平面假设原理,生成第一车辆的3D外包络盒的顶点在车体坐标系下的三维坐标。进而根据第一车辆的3D外包络盒的顶点的三维坐标和第一车辆的预设尺寸,生成第一车辆的质心点在在车体坐标系下的三维坐标。本申请实施例中,根据第一点的坐标,不仅能够生成第一车辆的朝向角,还可以生成第一车辆的质心点在车体坐标系下的三维坐标,扩展了本方案的应用场景;此外,提高了生成的质心点的三维坐标的准确性。
具体的,在一种实现方式中,自车在从多个第一点的坐标中获取到一个第一顶点的坐标之后,可以根据第一顶点的坐标和地平面假设原理,生成第一顶点在车体坐标系下的三维坐标。其中,多个第一点中可以存在至少一个第一点位于第一车辆的3D外包络盒的顶点上,第一顶点为前述至少一个第一点中的任一个顶点。自车根据第一顶点的三维坐标,可以确定第一车辆的3D外包络盒在车体坐标系下的位置,进而生成第一车辆的质心点在车体坐标系下的三维坐标。
进一步地,自车可以先根据第一图像,生成第一车辆的初始3D外包络盒的顶点的三维坐标,在得到第一顶点的三维坐标之后,通过第一顶点的三维坐标对初始3D外包络盒的顶点的三维坐标进行校正,以得到第一车辆的最终3D外包络盒的顶点的三维坐标,进而根据第一车辆的最终3D外包络盒的顶点的三维坐标生成第一车辆的质心点在车体坐标系下的三维坐标。
在另一种实现方式中,若第一车辆在第一图像中漏出侧面和主面,多个第一点中包括位于第一车辆的3D外包络盒的底面的三个第一顶点,则自车也可以直接根据前述三个第一顶点的坐标和地平面假设原理,分别生成前述三个第一顶点在车体坐标系下的三维坐标,进而得到第一车辆的3D外包络盒的底面在车体坐标系下的位置,并生成第一车辆的质心点在车体坐标系下的三维坐标,本实现方式中质心点的三维坐标中的高可以不纳入考虑。
在另一种实现方式中,由于多个第一点中可以包括多个第一顶点,若第一车辆在第一图像中漏出侧面和主面,多个第一点中包括六个第一顶点,则自车也可以直接根据六个第一顶点的坐标和地平面假设原理,分别生成六个第一顶点在车体坐标系下的三维坐标,进而得到第一车辆的3D外包络盒在车体坐标系下的位置,并生成第一车辆的质心点在车体坐标系下的三维坐标。
需要说明的是,步骤308和309为可选步骤,若不执行步骤308和309,则在执行完步骤307之后,可以直接执行步骤310。若执行步骤308和309,则本申请实施例不限定步骤308和309与步骤307之间的执行顺序,可以为先执行步骤307,再执行步骤308和309,也可以先执行步骤308和309,再执行步骤307。
310、自车根据第一点的坐标和地平面假设原理,生成第一车辆的尺寸。
本申请的一些实施例中,在一种实现方式中,自车根据第一点的坐标和地平面假设原理,只生成第一车辆的长和/或宽。
具体的,自车在第三顶点的目标个数大于或等于二的情况下,根据第一车辆的3D包络盒的底面顶点的坐标和地平面假设原理,生成第一车辆的长和/或宽,第三顶点指的是第一车辆的3D外包络盒的底面顶点。
更具体的,自车在获取到多个第一点的坐标之后,可以从中获取位于第一车辆的3D包络盒的底面边上的至少一个第一点的坐标,根据前述至少一个第一点的坐标,分别判断前述至少一个第一点中每个第一点是否为3D外包络盒的底面顶点,以计算得到与第一图像中的第一车辆对应的该目标个数。为进一步理解本方案,结合图6进行举例,图6中示出了D1、D2、D3和D4这四个第一点,D1和D2为位于第一车辆的3D包络盒的底面边上的第一点,只有D2为第一车辆的3D外包络盒的底面顶点,应当理解,此处结合图6进行举例仅为方便理解3D外包络盒的顶点这一概念,不用于限定本方案。
自车根据第一车辆的漏出面的指示信息,判断第一车辆是否在第一图像中漏出主面,若漏出主面且该目标个数等于三,则根据三个自车的3D外包络盒的底面顶点的坐标和地平面假设原理,分别生成前述三个底面顶点在自车坐标系下的三维坐标,进而根据前述三个底面顶点的三维坐标,生成第一车辆的长和宽。
若漏出主面且该目标个数等于二,则根据二个自车的3D外包络盒的底面顶点的坐标和地平面假设原理,分别生成前述二个底面顶点在自车坐标系下的三维坐标,进而根据前述二个底面顶点的三维坐标,生成第一车辆的长或宽。
若未漏出主面仅漏出侧面且该目标个数等于二,则根据二个自车的3D外包络盒的底面顶点的坐标和地平面假设原理,分别生成前述二个底面顶点在自车坐标系下的三维坐标,进而根据前述二个底面顶点的三维坐标,生成第一车辆的长。
若目标个数等于一,无论第一车辆是否在第一图像中漏出主面,均进入步骤311,也即自车利用包括第一车辆的另一张图像生成第一车辆的长和/或宽。
若目标个数等于零,自车终止执行第一车辆的长和/或宽的生成步骤。
需要说明的是,本申请实施例中不限定上述目标个数的生成步骤和第一车辆是否漏出主面的判断步骤之间的执行顺序,可以先执行生成步骤,再执行判断步骤,也可以先执行判断步骤,再执行生成步骤。
在另一种实现方式中,自车根据第一点的坐标和地平面假设原理,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高。
具体的,自车在获取到多个第一点的坐标之后,会判断多个第一点中位于第一车辆的3D包络盒的顶点上的第一顶点的数量是否大于一,若大于一,则自车根据第一顶点的坐标和地平面假设原理,生成第一顶点在车体坐标系下的三维坐标,并根据第一顶点在车体坐标系下的三维坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高。若等于一,则进入步骤311,以利用包括第一车辆的另一张图像生成第一车辆的长、宽和/或高。若等于零,则自车终止执行第一车辆的尺寸的生成步骤。
本申请实施例中,根据第一点的坐标,还可以生成第一车辆的尺寸,进一步扩展了本方案的应用场景;此外,提高了生成的第一车辆的尺寸的准确性。
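作为示意,上文步骤310中"由3D外包络盒底面顶点在车体坐标系下的三维坐标计算第一车辆的长和宽"的过程可以用如下Python代码表示。其中假设三个底面顶点分别为侧边线与主边线的公共顶点及其沿两条边线方向的另两个顶点,变量命名均为示例性假设:

```python
import math

def edge_length(p, q):
    """底面相邻两个顶点之间的水平距离(忽略高度方向)。"""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def length_and_width(corner, side_end, main_end):
    """corner: 侧边线与主边线的公共底面顶点; side_end/main_end: 两条边线上另两个底面顶点。"""
    vehicle_length = edge_length(corner, side_end)   # 沿侧边线方向, 对应车长
    vehicle_width = edge_length(corner, main_end)    # 沿主边线方向, 对应车宽
    return vehicle_length, vehicle_width
```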
311、自车获取第二图像,第二图像中包括第一车辆,第二图像和第一图像的图像采集角度不同。
本申请的一些实施例中,自车在目标个数等于一的情况下,或者,在多个第一点中第一顶点的数量等于一的情况下,获取第二图像。自车获取第二图像的方式与步骤301中自 车获取第一图像的方式类似,可以参阅步骤301中的描述。第二图像与第一图像的区别在于,第二图像与第一图像对第一车辆的图像采集角度不同。
312、自车根据第二图像,通过图像处理网络,得到至少两个第二点的坐标,至少两个第二点均位于第一车辆在第二图像中的3D外包络盒的边上。
本申请的一些实施例中,自车在获取到第二图像之后,将第二图像输入到图像处理网络中,得到图像处理网络输出的第四结果,第四结果与第一结果包括的信息类型相同,区别在于第一结果是将第一图像输入到图像处理网络中得到的,第四结果是将第二图像输入到图像处理网络中得到的。自车根据第四结果,生成第一车辆在第二图像中的3D外包络盒的位置信息,第一车辆在第二图像中的3D外包络盒的位置信息包括至少两个第二点的坐标。第二点与第一点的性质相同,区别在于第一点位于第一车辆在第一图像中的3D外包络盒的边上,第二点位于第一车辆在第二图像中的3D包络盒的边上。至少两个第二点中两个第二点定位第一车辆在第二图像中的3D外包络盒的边,至少两个第二点的坐标用于定位第一车辆在第二图像中的3D外包络盒。步骤312的具体实现方式可以参阅步骤302至303的描述,此处不做赘述。
313、自车根据第一点的坐标和第二点的坐标,生成第一车辆的尺寸。
本申请的一些实施例中,在一种实现方式中,自车根据第一点的坐标和第二点的坐标,生成第一车辆的长和/或宽。具体的,自车在得到第二点的坐标之后,从第二点中选取第四顶点,第四顶点为所述第一车辆的3D外包络盒的底面的另一个顶点。自车根据第四顶点的坐标和地平面假设原理,生成第四顶点在车体坐标系下的三维坐标,进而根据第三顶点的三维坐标和第四顶点的三维坐标,生成第一车辆的长和/或宽。
在另一种实现方式中,自车根据第一点的坐标和第二点的坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高。具体的,自车在得到第二点的坐标之后,从第二点中选取第二顶点,第二顶点为所述第一车辆的3D外包络盒的另一个顶点。自车根据第二顶点的坐标和地平面假设原理,生成第二顶点在车体坐标系下的三维坐标,进而根据第一顶点的三维坐标和第二顶点的三维坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高。本申请实施例中,在通过第一车辆的一个图像无法生成第一车辆的尺寸的情况下,利用第一车辆的另一个图像共同生成第一车辆的尺寸,保证了在各种情况下均能生成第一车辆的尺寸,提高了本方案的全面性。
需要说明的是,步骤310至313为可选步骤,若不执行步骤310至313,则在执行完步骤309之后,可以执行结束。若执行步骤310至313,则本申请实施例不限定步骤310至313与步骤308和309的执行顺序,可以先执行步骤308和309,再执行步骤310至313;也可以先执行步骤310至313,再执行步骤308和309。
314、自车根据第一车辆在第一图像中的漏出面的指示信息,判断第一车辆是否在第一图像中漏出侧面,若未漏出侧面,则进入步骤315;若漏出侧面,则进入步骤316。
315、自车根据2D包络框的中心点的坐标和小孔成像原理,生成第一车辆相对于自车的朝向角。
本申请实施例中,自车执行步骤314和315的具体方式可以参阅步骤305和306中的 描述,此处不做赘述。本申请实施例中,在第一车辆仅漏出主面的情况下,无论第一车辆与自车之间的距离是否超过预设阈值,都能够生成第一车辆的朝向角,丰富了本方案的应用场景。
316、自车通过第二计算规则,根据第一点的坐标,生成第一车辆相对于自车的朝向角。
本申请的一些实施例中,自车在确定第一车辆与自车之间的距离超过预设阈值,且,第一车辆在第一图像中漏出侧面的情况下,可以通过第二计算规则,根据第一点的坐标,生成第一车辆相对于自车的朝向角。本申请实施例中,在得到第一点的坐标之后,还可以根据第一点的坐标,生成第一车辆相对于自车的朝向角,以提高得到的朝向角的准确度。针对第一车辆与自车之间的距离超过预设阈值,和,第一车辆与自车之间的距离未超过预设阈值这两种情况,分别采用不同的计算规则,生成第一车辆的朝向角,进一步提高生成的朝向角的准确度。
具体的,在一种实现方式中,步骤316可以包括:自车根据第一点的坐标,生成第一车辆的侧边线的位置信息,根据第一车辆的侧边线的位置信息和第一图像的消失线的位置信息,生成消失点的坐标,其中,前述消失点为第一车辆的侧边线与第一图像的消失线之间的交点。自车根据消失点的坐标和两点透视原理,生成朝向角。其中,两点透视原理也可以称为成角透视原理或余角透视原理,两点透视原理指的是第一图像中第一车辆的侧面和第一车辆的主面均与第一图像斜交,在第一图像中存在两个消失点,两个消失点在同一视平线上。第一图像的消失线的位置信息具体可以表现为第一图像的消失线的直线方程,第一图像的消失线的位置仅与采集第一图像的图像采集装置有关。
本申请实施例中,提供了当第一车辆与自车之间距离超过预设阈值,且第一车辆在第一图像中漏出侧面的情况下,生成第一车辆的朝向角的一种具体实现方式,操作简单,且效率较高。
更具体的,针对消失点的坐标的生成过程。自车通过步骤303可以得到两个、四个、三个或六个第一点的坐标,具体实现方式参阅步骤303中的描述。自车在获取到多个第一点的坐标之后,可以从中选取位于第一车辆的侧边线上的两个第一点的坐标,也即获取位于第一车辆的3D包络盒的底面的侧边上的两个第一点的坐标。自车根据前述两个第一点的坐标,生成第一车辆的侧边线的直线方程,自车中可以预先配置有第一图像的消失线的直线方程,根据第一车辆的侧边线的直线方程和消失线的直线方程,得到第一车辆的侧边线与第一图像的消失线之间的交点的坐标(也即得到消失点的坐标)。
针对朝向角的生成过程。自车在得到消失点的坐标之后,根据消失点的坐标和两点透视原理,生成第二射线在第一图像中的地平面上的投影与相机坐标系x轴之间夹角的角度δ,角度δ也即第一车辆在相机坐标系下的朝向角。其中,第二射线为采集第一图像的相机的光心穿过前述消失点的射线。自车再根据角度δ和第二变换关系,生成第一车辆在自车的车体坐标系下的朝向角θ,第二变换关系指的是相机坐标系与车体坐标系之间的转换关系,该第二变换关系也可以称为相机的外参。相机坐标系和车体坐标系的概念可以参阅前述步骤中的介绍。
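下面给出"侧边线与消失线求交得到消失点,再由两点透视原理得到朝向角"的一段示意性Python代码草图。其中侧边线与消失线均以直线方程 v = k*u + b 表示,相机内参K与外参旋转R等均为示例性假设,并非本申请限定的实现方式:

```python
import numpy as np

def intersect_lines(k1, b1, k2, b2):
    """两条直线 v = k*u + b 的交点(假设两直线不平行)。"""
    u = (b2 - b1) / (k1 - k2)
    return u, k1 * u + b1

def heading_from_vanishing_point(side_line, vanish_line, K, R_cam2body):
    """side_line / vanish_line: 直线参数 (k, b); K: 相机内参; R_cam2body: 相机到车体的旋转。"""
    u_vp, v_vp = intersect_lines(*side_line, *vanish_line)   # 消失点坐标

    # 两点透视原理: 光心穿过消失点的射线方向与车辆侧边线的3D方向平行
    d_cam = np.linalg.inv(K) @ np.array([u_vp, v_vp, 1.0])
    delta = np.arctan2(d_cam[2], d_cam[0])    # 相机坐标系下的朝向角

    d_body = R_cam2body @ d_cam
    theta = np.arctan2(d_body[1], d_body[0])  # 车体坐标系下的朝向角
    return delta, theta
```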
在另一种实现方式中,步骤316可以包括:自车根据第一点的坐标、第一角度和小孔 成像原理,生成第一角度和朝向角之间的映射关系;根据映射关系和第一角度,生成朝向角。其中,小孔成像原理的含义可以参阅前述步骤中的介绍。本申请实施例中,针对第一车辆与自车之间的距离超过预设阈值的情况,提供了求得朝向角的两种可实现方式,提高本方案实现灵活性。
具体的,自车在得到多个第一点的坐标之后,从中选取第一车辆的3D包络盒的一个顶点,若多个第一点中包括3D包络盒的多个顶点,则可以从多个顶点中任选一个顶点。对于从第一点中选取第一车辆的3D包络盒的顶点的具体实现方式可以参阅步骤308中的描述,此处不做赘述。自车根据第一车辆的3D包络盒的一个顶点的坐标、第一车辆的预设尺寸和小孔成像原理,可以生成第一角度和朝向角之间的映射关系,进而可以根据第一角度,求解出朝向角。
需要说明的是,图3示出的实施例中为先通过步骤304判断第一车辆与自车的距离是否超过预设阈值,再分别通过步骤305和步骤314判断第一车辆是否在第一图像中漏出侧面。在其他实施例中,也可以将是否超过预设阈值的判断步骤,和,第一车辆是否在第一图像中漏出侧面的判断步骤交换。也即通过步骤304判断第一车辆是否在第一图像中漏出侧面,若第一车辆在第一图像中未漏出侧面,则自车根据2D包络框的中心点的坐标和小孔成像原理,生成第一车辆相对于自车的朝向角。若第一车辆在第一图像中漏出侧面,再判断第一车辆与自车的距离是否超过预设阈值,若第一车辆与自车的距离未超过预设阈值,则执行上述步骤307至313中描述的内容;若第一车辆与自车的距离超过预设阈值,则执行步骤316中描述的内容。
本申请实施例中,将获取到的图像输入到图像处理网络中,图像处理网络输出的为车辆的二维包络框的位置信息、车轮的坐标和第一角度,根据二维包络框的位置信息、车轮的坐标和第一角度,生成第一车辆的三维3D外包络盒的位置信息,进而定位车辆的3D外包络盒,由于二维包络框的位置信息、车轮的坐标和第一角度这三种参数的准确度与图像中车辆是否完整无关,所以无论图像中的车辆是否完整,得到的第一点的坐标是准确的,从而定位出的3D外包络盒的准确率较高,也即提高了获取到的3D外包络盒的准确度;进一步地,也即能够更为准确的判断周围车辆行驶意图,进而提高自动驾驶车辆的行驶安全度。
二、训练阶段
本申请实施例中,训练阶段描述的是训练设备220如何训练得到成熟的图像处理***的过程。请参阅图8,图8为本申请实施例提供的网络的训练方法的一种流程示意图,本申请实施例提供的图像的处理方法可以包括:
801、训练设备获取训练图像和训练图像的标注数据。
本申请实施例中,训练设备上预先配置有训练数据集合,训练数据集合中包括多个训练图像,以及,与每个训练图像对应的一组或多组标注数据。一个训练图像中包括一个或多个第二车辆,前述多组标注数据中每组标注数据对应该训练图像中的一个第二车辆。
在第二车辆在训练图像中仅漏出侧面的情况下,一组标注数据包括第二车辆的漏出面的标注指示信息、车轮的标注坐标和第二车辆的标注第一角度,第二车辆的第一角度指示 侧边线与训练图像的第一轴线之间夹角的角度,第二车辆的侧边线为漏出的侧面在训练图像中的消失点与第二车辆的车轮之间的连线,训练图像的第一轴线与训练图像的一个边平行。可选地,标注数据中还可以包括第二车辆的2D包络框的标注位置信息。
在第二车辆在训练图像中漏出侧面和主面的情况下,一组标注数据包括第二车辆的漏出面的标注指示信息、第二车辆的车轮的标注坐标、第二车辆的标注第一角度、第二车辆的分界线的标注位置信息和第二车辆的标注第二角度,第二车辆的主面为第二车辆的前面或后面,分界线为侧面与主面之间的分界线,第二车辆的第二角度指示第二车辆的主边线与训练图像的第一轴线之间夹角的角度,第二车辆的主边线为第二车辆漏出的主面在训练图像中的消失点与第二车辆的目标点之间的连线,第二车辆的目标点为第二车辆的侧边线与第二车辆的分界线的交点。可选地,标注数据中还可以包括第二车辆的2D包络框的标注位置信息。
在第二车辆在训练图像中仅漏出主面的情况下,标注数据包括第二车辆的漏出面的标注指示信息和2D包络框的标注位置信息。
对于上述各种概念的含义和具体表现形式均已在图3对应的实施例中进行了介绍,此处不做赘述。
802、训练设备将训练图像输入图像处理网络中,得到图像输入网络输出的第三结果。
本申请实施例中,训练设备在得到训练图像之后,将训练图像输入图像处理网络中,得到图像输入网络输出的一组或多组第三结果,第三结果的数量与训练图像中第二车辆的数量一致,一组第三结果用于指示一个第二车辆的特征信息。
其中,在第二车辆在训练图像中仅漏出侧面的情况下,第三结果包括所述第二车辆的漏出面的生成指示信息、第二车辆的车轮的生成坐标和第二车辆的生成第一角度。可选地,第三结果中还可以包括第二车辆的2D包络框的生成位置信息。在第二车辆在训练图像中漏出侧面和主面的情况下,一组第三结果包括第二车辆的漏出面的生成指示信息、第二车辆的车轮的生成坐标、第二车辆的生成第一角度、第二车辆的分界线的生成位置信息和第二车辆的生成第二角度。可选地,第三结果中还可以包括第二车辆的2D包络框的生成位置信息。在第二车辆在训练图像中仅漏出主面的情况下,第三结果包括第二车辆的漏出面的生成指示信息和第二车辆的2D包络框的生成位置信息。
具体的,图像处理网络可以包括目标检测网络和三维特征提取网络,前述目标检测网络可以为一阶段目标检测网络、二阶段目标检测网络或其他类型的目标检测网络等。二阶段目标检测网络包括第一特征提取网络、区域生成网络(region proposal network,RPN)和第二特征提取网络。其中,第一特征提取网络用于对训练图像执行卷积操作,以得到训练图像的特征图,将训练图像的特征图输入到RPN中。RPN根据训练图像的特征图,输出一个或多个2D包络框的位置信息。第一特征提取网络还用于根据RPN输出的2D包络框的位置信息,从训练图像的特征图中扣出第一特征图,第一特征图为训练图像的特征图中位于RPN输出的2D包络框内的特征图;第一特征提取网络还用于生成与每个第一特征图对应的类别,也即生成与每个2D包络框对应的类别,前述类别包括但不限于车辆、路灯、路障、路标、护栏和行人等。第二特征提取网络用于根据第一特征图再进行卷积,以得到 一个更为精准的2D包络框的位置信息。第二特征提取网络还用于从训练图像的特征图中扣出第二特征图,第二特征图为训练图像的特征图中位于第二特征提取网络输出的2D包络框内的特征图;第二特征提取网络还用于生成与每个第二特征图对应的类别,也即生成与每个更精准的2D包络框对应的类别。
在一种实现方式中,图像处理网络包括二阶段目标检测网络和三维特征提取网络。步骤802可以包括:训练设备将训练图像输入二阶段目标检测网络中,得到二阶段目标检测网络中的RPN输出的第二车辆的2D包络框的位置信息;训练设备通过第一特征提取网络从训练图像的特征图中扣出第一特征图,第一特征图为训练图像的特征图中位于RPN输出的2D包络框内的特征图;训练设备将第一特征图输入三维特征提取网络,得到三维特征提取网络输出的第三结果。本申请实施例中,由于RPN直接输出的2D包络框的准确度较低,也即基于RPN直接输出的2D包络框得到的第一特征图的精度较低,有利于提高训练阶段的难度,进而提高训练后图像处理网络的鲁棒性。
为进一步理解本方案,请参阅图9,图9为本申请实施例提供的网络的训练方法的一种流程示意图。如图9所示,图像处理网络中包络第一特征提取网络、RPN、第二特征提取网络和三维特征提取网络,在训练过程中,根据RPN输出的2D包络框的位置信息,从训练图像的特征图中扣出的第一特征图,并生成与第一特征图对应的类别,也即根据第一特征图,生成与RPN输出的2D包络框的位置信息对应的类别,将第一特征图和与RPN输出的2D包括框对应的类别输入到三维特征提取网络中,由三维特征提取网络根据第一特征图和类别,生成三维特征信息,不再利用第二特征提取网络对RPN输出的2D包络框的位置信息进行二次校正,从而提高了训练过程的难度,应理解,图9中的示例仅为方便理解本方案,在实际产品中,三维特征提取网络还可以输出更少或更多种类的三维特征信息,此处不做限定。
在另一种实现方式中,步骤802可以包括:训练设备将训练图像输入目标检测网络中,得到整个目标检测网络输出的第二特征图,将第二特征图输出到三维特征提取网络中,得到三维特征提取网络输出的第三结果。
为进一步理解本方案,请参阅图10,图10为本申请实施例提供的网络的训练方法的一种流程示意图。图像处理网络中包络第一特征提取网络、RPN、第二特征提取网络和三维特征提取网络,在训练过程中,第二特征提取网络会根据第一特征图进行再卷积,以得到更为精确的2D包络框的位置信息,并根据更为精确的2D包络框的位置信息,从训练图像的特征图中扣出第二特征图,生成与第二特征图对应的类别,也即根据第二特征图,生成与第二特征提取网络输出的2D包络框的位置信息对应的类别2,由三维特征提取网络根据第二特征图和第二特征提取网络输出的类别2,生成三维特征信息,应理解,图10中的示例仅为方便理解本方案,不用于限定本方案。
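结合图9所示的训练方式,下面给出一段示意性的PyTorch代码草图,说明"由RPN输出的2D包络框在训练图像的特征图上抠出第一特征图,并送入三维特征提取网络得到第三结果"的数据流。其中网络结构、通道数、下采样倍数等均为示例性假设,并非本申请限定的实现:

```python
import torch
import torch.nn as nn
from torchvision.ops import roi_align

class ThreeDHead(nn.Module):
    """三维特征提取网络的一个极简示例: 输出漏出面指示、车轮坐标、第一/第二角度、分界线位置。"""
    def __init__(self, in_channels=256, roi_size=7):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_channels * roi_size * roi_size, 1024), nn.ReLU(),
        )
        self.visible_faces = nn.Linear(1024, 4)   # 前/后/左/右 漏出面指示信息
        self.wheel_uv = nn.Linear(1024, 2)        # 车轮着地点坐标(相对2D包络框)
        self.angles = nn.Linear(1024, 2)          # 第一角度与第二角度
        self.boundary = nn.Linear(1024, 1)        # 分界线位置信息

    def forward(self, roi_feat):
        h = self.fc(roi_feat)
        return (self.visible_faces(h), self.wheel_uv(h), self.angles(h), self.boundary(h))

# 训练时的示意数据流: feature_map 来自第一特征提取网络, rois 来自RPN
feature_map = torch.randn(1, 256, 100, 168)          # (N, C, H, W), 假设特征图相对原图下采样8倍
rois = torch.tensor([[0, 500., 300., 900., 430.]])   # (batch_idx, u_min, v_min, u_max, v_max)
roi_feat = roi_align(feature_map, rois, output_size=(7, 7), spatial_scale=1.0 / 8)
outputs = ThreeDHead()(roi_feat)                      # 第三结果, 用于与标注数据计算损失
```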
803、训练设备根据标注数据和第三结果,利用损失函数对图像处理网络进行训练,直至满足损失函数的收敛条件,输出训练后的图像处理网络。
本申请实施例中,训练设备获取到标注数据和第三结果之后,可以利用损失函数对图像处理网络进行一次训练,训练设置利用训练数据集合中的多个训练图像和每个训练图像 的标注数据对图像处理网络进行迭代训练,直至满足损失函数的收敛条件。其中,损失函数具体可以为L1损失函数、交叉熵损失函数和/或其他类型的损失函数等。
在第二车辆在训练图像中仅漏出侧面的情况下,损失函数用于拉近漏出面的生成指示信息和漏出面的标注指示信息之间的相似度,且拉近车轮的生成坐标与车轮的标注坐标之间的相似度,且拉近生成第一角度和标注第一角度之间的相似度。可选地,损失函数还用于拉近2D包络框的生成位置信息和2D包络框的标注位置信息之间的相似度。
在第二车辆在训练图像中漏出侧面和主面的情况下,除了上述作用外,损失函数还用于拉近分界线的生成位置信息和分界线的标注位置信息之间的相似度,且拉近生成第二角度与标注第二角度之间的相似度。
在第二车辆在训练图像中仅漏出主面的情况下,损失函数用于拉近漏出面的生成指示信息和漏出面的标注指示信息之间的相似度,且拉近2D包络框的生成位置信息和2D包络框的标注位置信息之间的相似度。
具体的,由于标注数据和第三结果中可以包括多种类型的信息,不同类型的信息可以采用相同类型的损失函数,也可以采用不同类型的损失函数。在一次训练的过程中,训练设备可以逐个获取到每个类型的信息的损失函数值,进而求和,以得到最终的损失函数值。训练设备在逐个获取到每个类型的信息的损失函数值之后,也可以进行加权求和,以得到最终的损失函数值。根据最终的损失函数值,生成梯度值,并利用该梯度值,梯度更新图像处理网络的参数,以完成了对图像处理网络的一次训练。
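上述"逐项求取各类型信息的损失函数值、(加权)求和得到最终损失、再据此生成梯度并更新图像处理网络参数"的一次训练过程,可以用如下示意性的PyTorch代码表示(损失项与权重仅为示例):

```python
import torch

def train_step(losses, weights, optimizer):
    """losses: {名称: 标量损失张量}; weights: {名称: 权重}; optimizer: 图像处理网络的优化器。"""
    total_loss = sum(weights.get(name, 1.0) * value for name, value in losses.items())
    optimizer.zero_grad()
    total_loss.backward()     # 根据最终的损失函数值生成梯度
    optimizer.step()          # 利用梯度更新图像处理网络的参数
    return float(total_loss)

# 示例: 各类型信息对应的损失项(漏出面指示、车轮坐标、角度、分界线、2D包络框)
# losses = {"side_visi": l1, "wheel": l2, "degree": l3, "boundary": l4, "box2d": l5}
# train_step(losses, {"wheel": 2.0}, optimizer)
```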
为进一步理解本方案,以下对与各种类型的信息对应的损失函数的具体公式进行展示,以与漏出面的指示信息对应的损失函数采用L1损失函数为例,公式可以如下:

$$L_{side\_visi}=\sum_{i}\sum_{m}x_{ij}^{k}\left|\hat{v}_{m}^{\,i}-v_{m}^{\,j}\right| \qquad (1)$$

其中,$L_{side\_visi}$代表与漏出面的指示信息对应的损失函数,$x_{ij}^{k}$代表RPN输出的第i个2D包络框是否与第k类的第j个标注的2D包络框匹配,若$x_{ij}^{k}$取1,则证明匹配成功,若$x_{ij}^{k}$取0,则证明匹配失败,由于训练图像中可以不仅包括车辆,还可以包括路灯、路障、路标、护栏、行人或其他种类等,第k类指的是车辆;m∈{front,back,left,right}分别代表第二车辆的前面、后面、左面和右面,$v_{m}^{\,j}$代表针对第j个标注的2D包络框中的第二车辆的m面是否漏出的标注指示信息,如果$v_{m}^{\,j}$的值取1,则指示第二车辆在训练图像中漏出了m面,如果$v_{m}^{\,j}$的值取0,则指示第二车辆在训练图像中没有漏出m面;$\hat{v}_{m}^{\,i}$代表图像处理网络针对RPN输出的第i个2D包络框中的第二车辆的m面是否漏出的生成指示信息,如果$\hat{v}_{m}^{\,i}$的值取1,则代表图像处理网络预测第二车辆在训练图像中漏出了m面,如果$\hat{v}_{m}^{\,i}$的值取0,则代表图像处理网络预测第二车辆在训练图像中没有漏出m面。
对于RPN输出的第i个2D包络框,训练设备会计算第i个2D包络框与第k类中每个标注的2D包络框之间的交并比(Intersection over Union,IOU),训练设备上预设一个交并比的阈值,若生成的交并比大于前述阈值,则x_ij^k取1,若小于或等于前述阈值,则x_ij^k取0,作为示例,例如交并比的阈值可以取0.5。
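交并比与匹配指示 $x_{ij}^{k}$ 的计算可以示意如下(阈值0.5沿用上文示例,边界情况与多目标匹配策略从略):

```python
def iou(box_a, box_b):
    """两个2D包络框 (u_min, v_min, u_max, v_max) 之间的交并比。"""
    iu_min, iv_min = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    iu_max, iv_max = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, iu_max - iu_min) * max(0.0, iv_max - iv_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def match_indicator(pred_box, gt_box, threshold=0.5):
    """x_ij^k: 交并比大于阈值取1, 否则取0。"""
    return 1 if iou(pred_box, gt_box) > threshold else 0
```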
以与车轮的坐标对应的损失函数采用L1损失函数为例,公式可以如下:

$$L_{wheel}=\sum_{i}x_{ij}^{k}\left(\left|\hat{t}_{u}^{\,i}-t_{u}^{\,j}\right|+\left|\hat{t}_{v}^{\,i}-t_{v}^{\,j}\right|\right) \qquad (2)$$

其中,$L_{wheel}$代表与车轮的坐标对应的损失函数,$t_{u}^{\,j}$代表$u_{wheel}^{\,j}-u_{c}^{\,i}$,$t_{v}^{\,j}$代表$v_{wheel}^{\,j}-v_{c}^{\,i}$,$\hat{t}_{u}^{\,i}$代表$\hat{u}_{wheel}^{\,i}-u_{c}^{\,i}$,$\hat{t}_{v}^{\,i}$代表$\hat{v}_{wheel}^{\,i}-v_{c}^{\,i}$;$\hat{u}_{wheel}^{\,i}$代表图像处理网络针对RPN输出的第i个2D包络框中的第二车辆的车轮点生成的x坐标(也可以称为u坐标),$\hat{v}_{wheel}^{\,i}$代表图像处理网络针对RPN输出的第i个2D包络框中的第二车辆的车轮点生成的y坐标(也可以称为v坐标),$u_{wheel}^{\,j}$代表针对第j个标注的2D包络框中第二车辆的车轮点标注的x坐标,$u_{c}^{\,i}$代表图像处理网络针对RPN输出的第i个2D包络框生成的中心点的x坐标,$v_{wheel}^{\,j}$代表针对第j个标注的2D包络框中第二车辆的车轮点标注的y坐标,$v_{c}^{\,i}$代表图像处理网络针对RPN输出的第i个2D包络框生成的中心点的y坐标。
以与分界线的位置信息对应的损失函数采用L1损失函数为例,公式可以如下:

$$L_{boundary}=\sum_{i}x_{ij}^{k}\left|\hat{u}_{b}^{\,i}-\left(u_{b}^{\,j}-u_{c}^{\,i}\right)\right| \qquad (3)$$

其中,$L_{boundary}$代表与分界线的位置信息对应的损失函数,$\hat{u}_{b}^{\,i}$代表图像处理网络针对RPN输出的第i个2D包络框中的第二车辆的分界线的生成位置信息,$u_{b}^{\,j}$代表针对第j个标注的2D包络框中第二车辆的分界线标注的x坐标,$u_{c}^{\,i}$代表图像处理网络针对RPN输出的第i个2D包络框生成的中心点的x坐标。
对于与2D包络框的位置信息对应的损失函数,常用的损失函数为L1损失函数,训练设备可以计算第i个2D包络框的生成中心点坐标与第j个标注的2D包络框的标注中心点坐标之间的偏差,计算第i个2D包络框的生成的长宽与第j个标注的2D包络框的标注的长宽之间比值的log值。可选地,若图像处理网络还输出每个2D包络框的类别,与2D包络框的类别对应的损失函数可以为交叉熵损失函数。
以与第一角度和第二角度对应的损失函数采用L1损失函数为例,公式可以如下:

$$L_{degree}=\sum_{i}\sum_{m1\in\{alpha,beta\}}\sum_{w=1}^{bin}x_{ij}^{k}\left(\left|\hat{c}_{w}^{\,i,m1}-c_{w}^{\,j,m1}\right|+c_{w}^{\,j,m1}\left|\hat{\Delta}_{w}^{\,i,m1}-\Delta_{w}^{\,j,m1}\right|\right) \qquad (4)$$

其中,$L_{degree}$代表与第一角度和第二角度对应的损失函数,alpha代表第一角度,beta代表第二角度,训练设备将360度角分为bin个区间,每个区间中所占的角度数为delta,$\hat{c}_{w}^{\,i,m1}$代表图像处理网络针对RPN输出的第i个2D包络框中的第二车辆生成的m1是否在区间w内,如果$\hat{c}_{w}^{\,i,m1}$的值取1,则代表生成的m1在区间w内,如果$\hat{c}_{w}^{\,i,m1}$的值取0,则代表生成的m1不在区间w内;$c_{w}^{\,j,m1}$代表第j个标注的2D包络框中的第二车辆的标注的m1是否在区间w内,如果$c_{w}^{\,j,m1}$的值取1,则代表标注的m1在区间w内,如果$c_{w}^{\,j,m1}$的值取0,则代表标注的m1不在区间w内;$\hat{\Delta}_{w}^{\,i,m1}$代表生成的m1相对于区间w的区间中段的角度偏移量,$\Delta_{w}^{\,j,m1}$代表标注的m1相对于区间w的区间中段的角度偏移量,$m1^{\,j}$代表标注的m1。
需要说明的是,式(1)至式(4)中的举例仅为方便理解本方案,不用于限定本方案。
训练设备在利用训练数据集合中的多个训练图像,对图像处理网络进行迭代训练,直至满足损失函数的收敛条件,训练设备会输出训练后的图像处理网络。其中,前述收敛条件指的可以为满足损失函数的收敛条件,也可以为迭代次数达到预设次数,或其他类型的收敛条件等。输出的图像处理网络包括目标检测网络和三维特征提取网络,在前述目标检测网络为二阶段目标检测网络的情况下,无论训练设备在步骤802中采用的是与图9对应的实现方式,还是采用与图10对应的实现方式,输出的图像处理网络均为包括第一特征提取网络、RPN、第二特征提取网络和三维特征提取网络。
本申请实施例中,将获取到的图像输入到图像处理网络中,图像处理网络输出的为车辆的二维包络框的位置信息、车轮的坐标和第一角度,第一角度指的是车辆的侧边线与第一轴线之间的夹角,由于二维包络框的位置信息、车轮的坐标和第一角度这三种参数的准确度与图像中车辆是否完整无关,所以无论图像中的车辆是否完整,训练后的图像处理网络都能够输出准确的信息,有利于提高图像处理网络的稳定性;此外,二维包络框的位置信息、车轮的坐标和第一角度的标注规则简单,相对于目前利用激光雷达进行训练数据标注的方式,大大降低了训练数据标注过程的难度。
本申请实施例还提供一种图像处理的方法,前述方法应用于推理阶段,请参阅图11,图11为本申请实施例提供的图像的处理方法的一种流程示意图,本申请实施例提供的图像的处理方法可以包括:
1101、自车获取第三图像,第三图像中包括第一刚体,第一刚体为立方体。
本申请实施例中,自车执行步骤1101的具体实现方式与图3对应实施例中步骤301的具体实现方式类似,此处不做赘述。区别在于第一图像中一定会包括的对象为第一车辆,第三图像中一定会包括的对象为第一刚体,第一刚体的形状为立方体,该第一刚体具体可以表现为一个物体整体,也可以表现为一个物体的一部分,作为示例,例如物体整体为道路中间的护栏,则第一刚体指的可以为护栏底部的方体部分。
1102、自车将第三图像输入图像处理网络中,得到图像处理网络输出的第二结果。
本申请实施例中,在第一刚体在第三图像中漏出侧面的情况下,每组第二结果中可以包括第一刚体的二维2D包络框的位置信息和第一角度。在第一刚体在第一图像中仅漏出主面且未漏出侧面的情况下,每组第二结果中可以包括第一刚体的2D包络框的位置信息,前述主面指的是前面或后面。可选地,在第一刚体在第一图像中漏出主面和侧面的情况下,第二结果中还可以包括第一刚体的分界线的位置信息和第一刚体的第二角度。进一步可选地,第二结果还可以包括第一刚体在第一图像中的漏出面的指示信息。
自车执行步骤1102的具体实现方式与图3对应实施例中步骤302的具体实现方式类似,图11对应实施例中第二结果的含义与图3对应实施例中第二结果的含义类似。区别在于,第一,第一结果包括的各种信息均为用于描述第一车辆的特征,第二结果包括的各种信息均为描述第一刚体的特征。第二,由于第一刚体是立方体,不像第一车辆一样存在车轮凸起,第二结果中不存在车轮的坐标这类信息。对应的,在第一刚体在第三图像中只漏出主面或者只漏出侧面的情况下,也即第二结果中不包括分界线的位置信息,第二结果中还可以包括2D包络框的左下角顶点的坐标和/或右下角顶点的坐标,以替代图3对应实施例中第一结果中的车轮的坐标;在第一刚体在第三图像中漏出主面和侧面的情况下,也即第二结果中包括第一刚体的分界线的位置信息,第二结果中还可以包括一下中的一项或多项:第一刚体的分界线与第一刚体的2D包络框的底边的交点的坐标、第一刚体的2D包络框的左下角顶点的坐标和右下角顶点的坐标,以替换图3对应实施例中第一结果中车轮的坐标。对于第二结果中包括的每种信息的具体含义,均可以参阅图3对应实施例中的描述,此处不做赘述。
1103、自车根据第二结果,生成第一刚体的三维3D外包络盒的位置信息。
本申请实施例中,在第一刚体在第三图像中只漏出主面或者只漏出侧面的情况下,若第二结果中不包括第一刚体的2D包络框的左下角顶点的坐标和/或右下角顶点的坐标,自车也可以根据第一刚体的2D包络框的位置信息,生成第一刚体的2D包络框的左下角顶点的坐标和/或右下角顶点的坐标,以替代图3对应实施例中第一结果中的车轮的坐标。在第一刚体在第三图像中漏出主面和侧面的情况下,也即第二结果中包括第一刚体的分界线的位置信息,自车可以根据第一刚体的2D包络框的位置信息和第一刚体的分界线的位置信息生成以下中的一项或多项:第一刚体的分界线与第一刚体的2D包络框的底边的交点的坐标、第一刚体的2D包络框的左下角顶点的坐标和右下角顶点的坐标,以替代图3对应实施例中第一结果中的车轮的坐标。
1104、自车判断第一刚体与自车的距离是否超过预设阈值,若未超过预设阈值,则进入步骤1105;若超过预设阈值,则进入步骤1114。
1105、自车根据第一刚体在第三图像中的漏出面的指示信息,判断第一刚体是否在第三图像中漏出侧面,若未漏出侧面,则进入步骤1106;若漏出侧面,则进入步骤1107。
1106、自车根据第一刚体的2D包络框的中心点的坐标和小孔成像原理,生成第一刚体相对于自车的朝向角。
1107、自车通过第一计算规则,根据第三点的坐标,生成第一刚体相对于自车的朝向角。
1108、自车从至少两个第三点的坐标中获取第一刚体的3D外包络盒的顶点的坐标。
1109、自车根据第一刚体的3D外包络盒的顶点的坐标和地平面假设原理,生成第一刚体的质心点在车体坐标系下的三维坐标,车体坐标系的坐标系原点位于自车内。
1110、自车根据第三点的坐标和地平面假设原理,生成第一刚体的尺寸。
1111、自车获取第四图像,第四图像中包括第一刚体,第四图像和第三图像的图像采集角度不同。
1112、自车根据第四图像,通过图像处理网络,得到至少两个第四点的坐标,至少两个第四点均位于第一刚体的3D外包络盒的边上。
1113、自车根据第三点的坐标和第四点的坐标,生成第一刚体的尺寸。
1114、自车根据第一刚体在第三图像中的漏出面的指示信息,判断第一刚体是否在第三图像中漏出侧面,若未漏出侧面,则进入步骤1116;若漏出侧面,则进入步骤1117。
1115、自车根据第一刚体的2D包络框的中心点的坐标和小孔成像原理,生成第一刚体相对于自车的朝向角。
1116、自车通过第二计算规则,根据第三点的坐标,生成第一刚体相对于自车的朝向角。
本申请实施例中,自车执行步骤1103至1116的具体方式与图3对应实施例中步骤303至316的具体实现方式类似,均可以参阅图3对应实施例中步骤303和316中的描述,此处不做赘述。
本申请实施例中,不仅可以对车辆的3D包络盒进行定位,还可以对一般刚体的3D包络盒进行定位,大大扩展了本方案的应用场景。
为更为直观的体现本申请实施例带来的有益效果,请参阅图12,图12为本申请实施例提供的图像处理方法中3D包络盒的一种示意图。图12包括(a)和(b)两个子示意图,图12的(a)子示意图中是采用目前业界的方案定位到的车辆的3D包络盒,图12的(b)子示意图是采用本申请实施例提供的方案定位到的3D包络盒的一个侧面,很明显,采用本申请实施例中的方案定位到的3D包络盒更为准确。
此外,采用本申请实施例提供的方案可以求得更为准确的3D特征信息,以下结合数据对本申请实施例所带来的有益效果作进一步的介绍,针对图像中不包括完整的车辆这种情况,采用目前业界的方案生成的朝向角的误差在22.5度,采用本申请实施例提供的方案生成的朝向角的误差在6.7度,性能提升了约70%;采用目前业界的方案生成的质心点位置的出错率在18.4%,采用本申请实施例提供的方案生成的质心点位置的出错率在6.2%,性能提升了约66%。
在图1至图12所对应的实施例的基础上,为了更好的实施本申请实施例的上述方案,下面还提供用于实施上述方案的相关设备。具体参阅图13,图13为本申请实施例提供的图像处理装置的一种结构示意图。图像处理装置1300包括可以包括获取模块1301、输入模块1302和生成模块1303。其中,获取模块1301,用于获取第一图像,第一图像中包括第一车辆;输入模块1302,用于将第一图像输入图像处理网络中,得到图像处理网络输出的第一结果,在第一车辆在第一图像中漏出侧面的情况下,第一结果包括第一车辆的二维2D包络框的位置信息、第一车辆的车轮的坐标和第一车辆的第一角度,第一车辆的第一角度指示第一车辆的侧边线与第一图像的第一轴线之间夹角的角度,第一车辆的侧边线为第一车辆漏出的侧面与第一车辆所在地平面之间的交线,第一图像的第一轴线与第一图像的一个边平行;生成模块1303,用于根据第一车辆的2D包络框的位置信息、车轮的坐标和第一角度,生成第一车辆的三维3D外包络盒的位置信息,第一车辆的3D外包络盒的位置信息包括至少两个第一点的坐标,至少两个第一点均位于第一车辆的3D外包络盒的边上,至少两个第一点中两个第一点定位第一车辆的3D外包络盒的边,至少两个第一点的坐标用于定位第一车辆的3D外包络盒。
在一种可能的设计中,在第一车辆在第一图像中仅漏出侧面的情况下,至少两个第一点包括第一车辆的侧边线与第一车辆的2D包络框的两个交点。
在一种可能的设计中,生成模块1303,具体用于:根据第一车辆的车轮的坐标和第一车辆的第一角度,生成第一车辆的侧边线的位置信息;根据第一车辆的侧边线的位置信息和第一车辆的2D包络框的位置信息,执行坐标生成操作,以得到至少两个第一点的坐标。
在一种可能的设计中,在第一车辆在第一图像中漏出侧面和主面的情况下,第一结果中还包括第一车辆的分界线的位置信息和第一车辆的第二角度,分界线为侧面与主面之间的分界线,第一车辆的主面为第一车辆的前面或后面,第一车辆的第二角度指示第一车辆的主边线与第一图像的第一轴线之间夹角的角度,第一车辆的主边线为第一车辆漏出的主面与第一车辆所在的地平面之间的交线;至少两个第一点包括第一交点、第二交点和第三交点,第一交点为第一车辆的侧边线与第一车辆的分界线的交点,第一交点为第一车辆的3D外包络盒的一个顶点,第二交点为第一车辆的侧边线与第一车辆的2D包络框的交点,第三交点为第一车辆的主边线与第一车辆的2D包络框的交点。
在一种可能的设计中,生成模块1303,具体用于:根据第一车辆的车轮的坐标和第一车辆的第一角度,生成第一车辆的侧边线的位置信息;根据第一车辆的侧边线的位置信息和第一车辆的分界线的位置信息,生成第一交点的坐标;根据第一车辆的侧边线的位置信息和第一车辆的2D包络框的位置信息,生成第二交点的坐标;根据第一交点的坐标和第一车辆的第二角度,生成第一车辆的主边线的位置信息;根据第一车辆的主边线的位置信息和第一车辆的2D包络框的位置信息,生成第三交点的坐标。
在一种可能的设计中,生成模块1303,还用于在第一车辆在第一图像中漏出侧面的情况下,根据第一点的坐标,生成第一车辆相对于自车的朝向角。
在一种可能的设计中,生成模块1303,还用于根据第一点的坐标和地平面假设原理,生成第一点与自车之间的距离;生成模块1303,具体用于:在根据第一点与自车之间的距 离确定第一车辆与自车之间的距离未超过预设阈值的情况下,通过第一计算规则,根据第一点的坐标,生成朝向角;在根据第一点与自车之间的距离确定第一车辆与自车之间的距离超过预设阈值的情况下,通过第二计算规则,根据第一点的坐标,生成朝向角,第二计算规则和第一计算规则为不同的计算规则。
在一种可能的设计中,当至少两个第一点中任一个第一点与自车之间的距离未超过预设阈值时,视为第一车辆与自车之间的距离未超过预设阈值;或者,当至少两个第一点中任一个第一点与自车之间的距离超过预设阈值时,视为第一车辆与自车之间的距离超过预设阈值。
在一种可能的设计中,生成模块1303,具体用于:根据第一点的坐标和地平面假设原理,生成第一点在车体坐标系下的三维坐标,车体坐标系的坐标系原点位于自车内;根据第一点的三维坐标,生成朝向角。
在一种可能的设计中,生成模块1303,具体用于:根据第一点的坐标和第一车辆的第一角度,生成第一车辆的侧边线的位置信息,根据第一车辆的侧边线的位置信息和第一图像的消失线的位置信息,生成消失点的坐标,消失点为第一车辆的侧边线与第一图像的消失线之间的交点;根据消失点的坐标和两点透视原理,生成朝向角。
在一种可能的设计中,生成模块1303,具体用于:根据第一点的坐标、第一车辆的第一角度和小孔成像原理,生成第一车辆的第一角度和朝向角之间的映射关系;根据映射关系和第一车辆的第一角度,生成朝向角。
在一种可能的设计中,获取模块1301,还用于从至少两个第一点的坐标中获取第一车辆的3D外包络盒的顶点的坐标;生成模块1303,还用于根据第一车辆的3D外包络盒的顶点的坐标和地平面假设原理,生成第一车辆的质心点在车体坐标系下的三维坐标,车体坐标系的坐标系原点位于自车内。
在一种可能的设计中,获取模块1301,还用于从至少两个第一点的坐标中获取第一顶点的坐标,第一顶点为第一车辆的3D外包络盒的一个顶点;生成模块1303,还用于根据第一顶点的坐标和地平面假设原理,生成第一顶点在车体坐标系下的三维坐标;生成模块1303,还用于若至少两个第一点中包括至少两个第一顶点,根据第一顶点在车体坐标系下的三维坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高,车体坐标系的坐标系原点位于自车内。
在一种可能的设计中,获取模块1301,还用于若至少两个第一点中包括一个第一顶点,获取第二图像,第二图像中包括第一车辆,第二图像和第一图像的图像采集角度不同;生成模块1303,还用于根据第二图像,通过图像处理网络,得到至少两个第二点的坐标,至少两个第二点均位于第一车辆的三维3D外包络盒的边上,至少两个第二点中两个第二点定位第一车辆的3D外包络盒的边,至少两个第二点的坐标用于定位第一车辆的3D外包络盒;生成模块1303,还用于根据第二点的坐标和地平面假设原理,生成第二顶点在车体坐标系下的三维坐标,第二顶点为第一车辆的3D外包络盒的一个顶点,第二顶点与第一顶点为不同的顶点;生成模块1303,还用于根据第一顶点的三维坐标和第二顶点的三维坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高。
需要说明的是,图像处理装置1300中各模块/单元之间的信息交互、执行过程以及所带来的有益效果等内容,与本申请中图3至图7对应的各个方法实施例基于同一构思,具体内容可参见本申请前述所示的方法实施例中的叙述,此处不再赘述。
本申请实施例还提供一种图像处理装置,具体参阅图14,图14为本申请实施例提供的图像处理装置的一种结构示意图。图像处理装置1400包括可以包括获取模块1401、输入模块1402和生成模块1403。其中,获取模块1401,用于获取第三图像,第三图像中包括第一刚体,第一刚体为立方体;输入模块1402,用于将第三图像输入图像处理网络中,得到图像处理网络输出的第二结果,在第一刚体在第三图像中漏出侧面的情况下,第二结果包括第一刚体的2D包络框的位置信息和第一刚体的第一角度,第一刚体的第一角度指示第一刚体的侧边线与第三图像的第一轴线之间夹角的角度,第一刚体的侧边线为第一刚体漏出的侧面与第一刚体所在平面之间的交线,第三图像的第一轴线与第三图像的一个边平行;生成模块1403,用于根据第一刚体的2D包络框的位置信息和第一角度,生成第一刚体的三维3D外包络盒的位置信息,第一刚体的3D外包络盒的位置信息包括至少两个第三点的坐标,至少两个第三点均位于第一刚体的3D外包络盒的边上,至少两个第三点中两个第三点定位第一刚体的3D外包络盒的边,至少两个第三点的坐标用于定位第一刚体的3D外包络盒。
在一种可能的设计中,在第一刚体在第三图像中仅漏出侧面的情况下,至少两个第三点中包括第一刚体的侧边线与第一刚体的2D包络框的两个交点。
在一种可能的设计中,在第一刚体在第三图像中漏出侧面和主面的情况下,第一结果中还包括第一刚体的分界线的位置信息和第一刚体的第二角度,分界线为侧面与主面之间的分界线,第一刚体的主面为第一刚体的前面或后面,第一刚体的第二角度指示第一刚体的主边线与第三图像的第一轴线之间夹角的角度,第一刚体的主边线为第一刚体漏出的主面与第一刚体所在的地平面之间的交线;至少两个第三点包括第一交点、第二交点和第三交点,第一交点为第一刚体的侧边线与第一刚体的分界线的交点,第一交点为第一刚体的3D外包络盒的一个顶点,第二交点为第一刚体的侧边线与第一刚体的2D包络框的交点,第三交点为第一刚体的主边线与第一刚体的2D包络框的交点。
在一种可能的设计中,生成模块1403,还用于根据至少两个第三点的坐标,生成第一刚体的三维特征信息,第一刚体的三维特征信息包括以下中的一项或多项:第一刚体相对于自车的朝向角、第一刚体的质心点的位置信息和第一刚体的尺寸。
需要说明的是,图像处理装置1400中各模块/单元之间的信息交互、执行过程以及所带来的有益效果等内容,与本申请中图11对应的各个方法实施例基于同一构思,具体内容可参见本申请前述所示的方法实施例中的叙述,此处不再赘述。
本申请实施例还提供一种网络训练装置,具体参阅图15,图15为本申请实施例提供的网络训练装置的一种结构示意图。网络训练装置1500包括可以包括获取模块1501、输入模块1502和训练模块1503。其中,获取模块1501,用于获取训练图像和训练图像的标注数据,训练图像中包括第二车辆,在第二车辆在训练图像中漏出侧面的情况下,标注数据包括第二车辆的车轮的标注坐标和第二车辆的标注第一角度,第二车辆的第一角度指示 第二车辆的侧边线与训练图像的第一轴线之间夹角的角度,第二车辆的侧边线为第二车辆漏出的侧面与第二车辆所在地平面之间的交线,训练图像的第一轴线与训练图像的一个边平行;输入模块1502,用于将训练图像输入图像处理网络中,得到图像输入网络输出的第三结果,第三结果包括第二车辆的车轮的生成坐标和第二车辆的生成第一角度;训练模块1503,用于根据标注数据和第三结果,利用损失函数对图像处理网络进行训练,直至满足损失函数的收敛条件,输出训练后的图像处理网络,损失函数用于拉近生成坐标与标注坐标之间的相似度,且拉近生成第一角度和标注第一角度之间的相似度。
在一种可能的设计中,在第二车辆在训练图像中漏出侧面和主面的情况下,标注数据还包括第二车辆的分界线的标注位置信息和第二车辆的标注第二角度,第三结果还包括第二车辆的分界线的生成位置信息和第二车辆的生成第二角度,损失函数还用于拉近生成位置信息和标注位置信息之间的相似度,且拉近生成第二角度与标注第二角度之间的相似度;
其中,第二车辆的主面为第二车辆的前面或后面,分界线为侧面与主面之间的分界线,第二车辆的第二角度指示第二车辆的主边线与训练图像的第一轴线之间夹角的角度,第二车辆的主边线为第二车辆漏出的主面与第二车辆所在的地平面之间的交线。
在一种可能的设计中,图像处理网络包括二阶段目标检测网络和三维特征提取网络,二阶段目标检测网络包括区域生成网络RPN。输入模块1502,具体用于:将训练图像输入二阶段目标检测网络中,得到二阶段目标检测网络中的RPN输出的第二车辆的2D包络框的位置信息;将第一特征图输入三维特征提取网络,得到三维特征提取网络输出的第三结果,第一特征图为训练图像的特征图中位于RPN输出的2D包络框内的特征图;训练模块1503,具体用于输出包括二阶段目标检测网络和三维特征提取网络的图像处理网络。
需要说明的是,网络训练装置1500中各模块/单元之间的信息交互、执行过程以及所带来的有益效果等内容,与本申请中图8至图10对应的各个方法实施例基于同一构思,具体内容可参见本申请前述所示的方法实施例中的叙述,此处不再赘述。
本申请实施例还提供了一种执行设备,请参阅图16,图16为本申请实施例提供的执行设备的一种结构示意图,执行设备1600具体可以表现为自动驾驶车辆、自动驾驶车辆中的处理芯片或其他产品形态等。其中,执行设备1600上可以部署有图13对应实施例中所描述的图像处理装置1300,用于实现图3至图7对应实施例中自车的功能。或者,执行设备1600上可以部署有图14对应实施例中所描述的图像处理装置1400,用于实现图11对应实施例中自车的功能。执行设备1600包括:接收器1601、发射器1602、处理器1603和存储器1604(其中执行设备1600中的处理器1603的数量可以一个或多个,图16中以一个处理器为例),其中,处理器1603可以包括应用处理器16031和通信处理器16032。在本申请实施例的一些实施例中,接收器1601、发射器1602、处理器1603和存储器1604可通过总线或其它方式连接。
存储器1604可以包括只读存储器和随机存取存储器,并向处理器1603提供指令和数据。存储器1604的一部分还可以包括非易失性随机存取存储器(non-volatile random access memory,NVRAM)。存储器1604存储有处理器和操作指令、可执行模块或者数据结构,或者它们的子集,或者它们的扩展集,其中,操作指令可包括各种操作指令,用于实现各 种操作。
处理器1603控制数据生成装置的操作。具体的应用中,数据生成装置的各个组件通过总线***耦合在一起,其中总线***除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,在图中将各种总线都称为总线***。
上述本申请实施例揭示的方法可以应用于处理器1603中,或者由处理器1603实现。处理器1603可以是一种集成电路芯片,具有信号的处理能力。在实现过程中,上述方法的各步骤可以通过处理器1603中的硬件的集成逻辑电路或者软件形式的指令完成。上述的处理器1603可以是通用处理器、数字信号处理器(digital signal processing,DSP)、微处理器或微控制器,还可进一步包括专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field-programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。该处理器1603可以实现或者执行本申请实施例中的公开的各方法、步骤及逻辑框图。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。结合本申请实施例所公开的方法的步骤可以直接体现为硬件译码处理器执行完成,或者用译码处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器1604,处理器1603读取存储器1604中的信息,结合其硬件完成上述方法的步骤。
接收器1601可用于接收输入的数字或字符信息,以及产生与数据生成装置的相关设置以及功能控制有关的信号输入。发射器1602可用于通过第一接口输出数字或字符信息,发射器1602还可用于通过第一接口向磁盘组发送指令,以修改磁盘组中的数据,发射器1602还可以包括显示屏等显示设备。
若执行设备具体表现为自动驾驶车辆,则请参阅图17,图17为本申请实施例提供的自动驾驶车辆的一种结构示意图,通过图17对自动驾驶车辆的***做进一步描述。自动驾驶车辆配置为完全或部分地自动驾驶模式,例如,自动驾驶车辆可以在处于自动驾驶模式中的同时控制自身,并且可通过人为操作来确定车辆及其周边环境的当前状态,确定周边环境中的至少一个其他车辆的可能行为,并确定其他车辆执行可能行为的可能性相对应的置信水平,基于所确定的信息来控制自动驾驶车辆。在自动驾驶车辆处于自动驾驶模式中时,也可以将自动驾驶车辆置为在没有和人交互的情况下操作。
自动驾驶车辆可包括各种子***,例如行进***102、传感器***104、控制***106、一个或多个***设备108以及电源110、计算机***112和用户接口116。可选地,自动驾驶车辆可包括更多或更少的子***,并且每个子***可包括多个部件。另外,自动驾驶车辆的每个子***和部件可以通过有线或者无线互连。
行进***102可包括为自动驾驶车辆提供动力运动的组件。在一个实施例中,行进***102可包括引擎118、能量源119、传动装置120和车轮/轮胎121。
其中,引擎118可以是内燃引擎、电动机、空气压缩引擎或其他类型的引擎组合,例如,汽油发动机和电动机组成的混动引擎,内燃引擎和空气压缩引擎组成的混动引擎。引擎118将能量源119转换成机械能量。能量源119的示例包括汽油、柴油、其他基于石油 的燃料、丙烷、其他基于压缩气体的燃料、乙醇、太阳能电池板、电池和其他电力来源。能量源119也可以为自动驾驶车辆的其他***提供能量。传动装置120可以将来自引擎118的机械动力传送到车轮121。传动装置120可包括变速箱、差速器和驱动轴。在一个实施例中,传动装置120还可以包括其他器件,比如离合器。其中,驱动轴可包括可耦合到一个或多个车轮121的一个或多个轴。
传感器***104可包括感测关于自动驾驶车辆周边的环境的信息的若干个传感器。例如,传感器***104可包括定位***122(定位***可以是全球定位GPS***,也可以是北斗***或者其他定位***)、惯性测量单元(inertial measurement unit,IMU)124、雷达126、激光测距仪128以及相机130。传感器***104还可包括被监视自动驾驶车辆的内部***的传感器(例如,车内空气质量监测器、燃油量表、机油温度表等)。来自这些传感器中的一个或多个的传感数据可用于检测对象及其相应特性(位置、形状、方向、速度等)。这种检测和识别是自主自动驾驶车辆的安全操作的关键功能。
其中,定位系统122可用于估计自动驾驶车辆的地理位置。IMU 124用于基于惯性加速度来感知自动驾驶车辆的位置和朝向变化。在一个实施例中,IMU 124可以是加速度计和陀螺仪的组合。雷达126可利用无线电信号来感知自动驾驶车辆的周边环境内的物体,具体可以表现为毫米波雷达或激光雷达。在一些实施例中,除了感知物体以外,雷达126还可用于感知物体的速度和/或前进方向。激光测距仪128可利用激光来感知自动驾驶车辆所位于的环境中的物体。在一些实施例中,激光测距仪128可包括一个或多个激光源、激光扫描器以及一个或多个检测器,以及其他系统组件。相机130可用于捕捉自动驾驶车辆的周边环境的多个图像。相机130可以是静态相机或视频相机。
控制系统106用于控制自动驾驶车辆及其组件的操作。控制系统106可包括各种部件,其中包括转向系统132、油门134、制动单元136、计算机视觉系统140、线路控制系统142以及障碍避免系统144。
其中,转向系统132可操作来调整自动驾驶车辆的前进方向。例如在一个实施例中可以为方向盘系统。油门134用于控制引擎118的操作速度并进而控制自动驾驶车辆的速度。制动单元136用于控制自动驾驶车辆减速。制动单元136可使用摩擦力来减慢车轮121。在其他实施例中,制动单元136可将车轮121的动能转换为电流。制动单元136也可采取其他形式来减慢车轮121转速从而控制自动驾驶车辆的速度。计算机视觉系统140可以操作来处理和分析由相机130捕捉的图像以便识别自动驾驶车辆周边环境中的物体和/或特征。所述物体和/或特征可包括交通信号、道路边界和障碍体。计算机视觉系统140可使用物体识别算法、运动中恢复结构(Structure from Motion,SFM)算法、视频跟踪和其他计算机视觉技术。在一些实施例中,计算机视觉系统140可以用于为环境绘制地图、跟踪物体、估计物体的速度等等。线路控制系统142用于确定自动驾驶车辆的行驶路线以及行驶速度。在一些实施例中,线路控制系统142可以包括横向规划模块1421和纵向规划模块1422,横向规划模块1421和纵向规划模块1422分别用于结合来自障碍避免系统144、GPS 122和一个或多个预定地图的数据为自动驾驶车辆确定行驶路线和行驶速度。障碍避免系统144用于识别、评估和避免或者以其他方式越过自动驾驶车辆的环境中的障碍体,前述障碍体具体可以表现为实际障碍体和可能与自动驾驶车辆发生碰撞的虚拟移动体。在一个实例中,控制系统106可以增加或替换地包括除了所示出和描述的那些以外的组件。或者也可以减少一部分上述示出的组件。
自动驾驶车辆通过外围设备108与外部传感器、其他车辆、其他计算机系统或用户之间进行交互。外围设备108可包括无线通信系统146、车载电脑148、麦克风150和/或扬声器152。在一些实施例中,外围设备108为自动驾驶车辆的用户提供与用户接口116交互的手段。例如,车载电脑148可向自动驾驶车辆的用户提供信息。用户接口116还可操作车载电脑148来接收用户的输入。车载电脑148可以通过触摸屏进行操作。在其他情况中,外围设备108可提供用于自动驾驶车辆与位于车内的其它设备通信的手段。例如,麦克风150可从自动驾驶车辆的用户接收音频(例如,语音命令或其他音频输入)。类似地,扬声器152可向自动驾驶车辆的用户输出音频。无线通信系统146可以直接地或者经由通信网络来与一个或多个设备无线通信。例如,无线通信系统146可使用3G蜂窝通信,例如CDMA、EVDO、GSM/GPRS,或者4G蜂窝通信,例如LTE,或者5G蜂窝通信。无线通信系统146可利用无线局域网(wireless local area network,WLAN)通信。在一些实施例中,无线通信系统146可利用红外链路、蓝牙或ZigBee与设备直接通信。无线通信系统146也可以使用其他无线协议,例如各种车辆通信系统;例如,无线通信系统146可包括一个或多个专用短程通信(dedicated short range communications,DSRC)设备,这些设备可包括车辆和/或路边台站之间的公共和/或私有数据通信。
电源110可向自动驾驶车辆的各种组件提供电力。在一个实施例中,电源110可以为可再充电锂离子或铅酸电池。这种电池的一个或多个电池组可被配置为电源为自动驾驶车辆的各种组件提供电力。在一些实施例中,电源110和能量源119可一起实现,例如一些全电动车中那样。
自动驾驶车辆的部分或所有功能受计算机系统112控制。计算机系统112可包括至少一个处理器1603,处理器1603执行存储在例如存储器1604这样的非暂态计算机可读介质中的指令115。计算机系统112还可以是采用分布式方式控制自动驾驶车辆的个体组件或子系统的多个计算设备。处理器1603和存储器1604的具体形态在前文已有描述,此处不再赘述。在此处所描述的各个方面中,处理器1603可以位于远离自动驾驶车辆的位置并与自动驾驶车辆进行无线通信。在其它方面中,此处所描述的过程中的一些在布置于自动驾驶车辆内的处理器1603上执行而其它则由远程处理器1603执行,包括采取执行单一操纵的必要步骤。
在一些实施例中,存储器1604可包含指令115(例如,程序逻辑),指令115可被处理器1603执行来执行自动驾驶车辆的各种功能,包括以上描述的那些功能。存储器1604也可包含额外的指令,包括向行进系统102、传感器系统104、控制系统106和外围设备108中的一个或多个发送数据、从其接收数据、与其交互和/或对其进行控制的指令。除了指令115以外,存储器1604还可存储数据,例如道路地图、路线信息,车辆的位置、方向、速度以及其它这样的车辆数据,以及其他信息。这种信息可在自动驾驶车辆在自主、半自主和/或手动模式中操作期间被自动驾驶车辆和计算机系统112使用。用户接口116,用于向自动驾驶车辆的用户提供信息或从其接收信息。可选地,用户接口116可包括在外围设备108的集合内的一个或多个输入/输出设备,例如无线通信系统146、车载电脑148、麦克风150和扬声器152。
计算机系统112可基于从各种子系统(例如,行进系统102、传感器系统104和控制系统106)以及从用户接口116接收的输入来控制自动驾驶车辆的功能。例如,计算机系统112可利用来自控制系统106的输入以便控制转向系统132来避免由传感器系统104和障碍避免系统144检测到的障碍体。在一些实施例中,计算机系统112可操作来对自动驾驶车辆及其子系统的许多方面提供控制。
可选地,上述这些组件中的一个或多个可与自动驾驶车辆分开安装或关联。例如,存储器1604可以部分或完全地与自动驾驶车辆分开存在。上述组件可以按有线和/或无线方式来通信地耦合在一起。
可选地,上述组件只是一个示例,实际应用中,上述各个模块中的组件有可能根据实际需要增添或者删除,图17不应理解为对本申请实施例的限制。在道路行进的自动驾驶车辆,如上面的自动驾驶车辆,可以识别其周围环境内的物体以确定对当前速度的调整。所述物体可以是其它车辆、交通控制设备、或者其它类型的物体。在一些示例中,可以独立地考虑每个识别的物体,并且可以基于物体各自的特性,诸如它的当前速度、加速度、与车辆的间距等,来确定自动驾驶车辆所要调整的速度。
可选地,自动驾驶车辆或者与自动驾驶车辆相关联的计算设备(如图17的计算机系统112、计算机视觉系统140、存储器1604)可以基于所识别的物体的特性和周围环境的状态(例如,交通、雨、道路上的冰、等等)来预测所识别的物体的行为。可选地,每一个所识别的物体都依赖于彼此的行为,因此还可以将所识别的所有物体全部一起考虑来预测单个识别的物体的行为。自动驾驶车辆能够基于预测的所识别的物体的行为来调整它的速度。换句话说,自动驾驶车辆能够基于所预测的物体的行为来确定车辆将需要调整到(例如,加速、减速、或者停止)什么稳定状态。在这个过程中,也可以考虑其它因素来确定自动驾驶车辆的速度,诸如,自动驾驶车辆在行驶的道路中的横向位置、道路的曲率、静态和动态物体的接近度等等。除了提供调整自动驾驶车辆的速度的指令之外,计算设备还可以提供修改自动驾驶车辆的转向角的指令,以使得自动驾驶车辆遵循给定的轨迹和/或维持与自动驾驶车辆附近的物体(例如,道路上的相邻车道中的轿车)的安全横向和纵向距离。
上述自动驾驶车辆可以为轿车、卡车、摩托车、公共汽车、船、飞机、直升飞机、割草机、娱乐车、游乐场车辆、施工设备、电车、高尔夫球车、火车、和手推车等,本申请实施例不做特别的限定。
本申请实施例中,处理器1603通过应用处理器16031执行图3至图7对应实施例中自车执行的图像处理方法,或者,执行图11对应实施例中自车执行的图像处理方法。对于应用处理器16031执行图像的处理方法的具体实现方式以及带来的有益效果,可以参考图3至图7对应的各个方法实施例中的叙述,或者,可以参考图11对应的各个方法实施例中的叙述,此处不再一一赘述。
本申请实施例还提供一种训练设备,请参阅图18,图18为本申请实施例提供的训练设备的一种结构示意图。训练设备1800上可以部署有图15对应实施例中所描述的网络训练装置1500,用于实现图8至图10对应实施例中训练设备的功能,具体的,训练设备1800由一个或多个服务器实现,训练设备1800可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上中央处理器(central processing units,CPU)1822(例如,一个或一个以上处理器)和存储器1832,一个或一个以上存储应用程序1842或数据1844的存储介质1830(例如一个或一个以上海量存储设备)。其中,存储器1832和存储介质1830可以是短暂存储或持久存储。存储在存储介质1830的程序可以包括一个或一个以上模块(图示没标出),每个模块可以包括对训练设备中的一系列指令操作。更进一步地,中央处理器1822可以设置为与存储介质1830通信,在训练设备1800上执行存储介质1830中的一系列指令操作。
训练设备1800还可以包括一个或一个以上电源1826,一个或一个以上有线或无线网络接口1850,一个或一个以上输入输出接口1858,和/或,一个或一个以上操作系统1841,例如Windows ServerTM,Mac OS XTM,UnixTM,LinuxTM,FreeBSDTM等等。
本申请实施例中,中央处理器1822,用于执行图8至图10对应实施例中的训练设备执行的网络的训练方法。对于中央处理器1822执行网络的训练方法的具体实现方式以及带来的有益效果,均可以参考图8至图10对应的各个方法实施例中的叙述,此处不再一一赘述。
本申请实施例中还提供一种计算机可读存储介质,该计算机可读存储介质中存储有程序,当其在计算机上运行时,使得计算机执行如前述图3至图7所示实施例描述的方法中自车所执行的步骤,或者,使得计算机执行如前述图11所示实施例描述的方法中自车所执行的步骤,或者,使得计算机执行如前述图8至图10所示实施例描述的方法中训练设备所执行的步骤。
本申请实施例中还提供一种计算机程序产品,当其在计算机上运行时,使得计算机执行如前述图3至图7所示实施例描述的方法中自车所执行的步骤,或者,使得计算机执行如前述图11所示实施例描述的方法中自车所执行的步骤,或者,使得计算机执行如前述图8至图10所示实施例描述的方法中训练设备所执行的步骤。
本申请实施例中还提供一种电路系统,所述电路系统包括处理电路,所述处理电路配置为执行如前述图3至图7所示实施例描述的方法中自车所执行的步骤,或者,执行如前述图11所示实施例描述的方法中自车所执行的步骤,或者,执行如前述图8至图10所示实施例描述的方法中训练设备所执行的步骤。
本申请实施例提供的自车、训练设备、图像处理装置或网络训练装置具体可以为芯片,芯片包括:处理单元和通信单元,所述处理单元例如可以是处理器,所述通信单元例如可以是输入/输出接口、管脚或电路等。该处理单元可执行存储单元存储的计算机执行指令,以使芯片执行如前述图3至图7所示实施例描述的方法中自车所执行的步骤,或者,执行如前述图11所示实施例描述的方法中自车所执行的步骤,或者,执行如前述图8至图10所示实施例描述的方法中训练设备所执行的步骤。可选地,所述存储单元为所述芯片内的存储单元,如寄存器、缓存等,所述存储单元还可以是所述无线接入设备端内的位于所述芯片外部的存储单元,如只读存储器(read-only memory,ROM)或可存储静态信息和指令 的其他类型的静态存储设备,随机存取存储器(random access memory,RAM)等。
具体的,请参阅图19,图19为本申请实施例提供的芯片的一种结构示意图,所述芯片可以表现为神经网络处理器NPU 190,NPU 190作为协处理器挂载到主CPU(Host CPU)上,由Host CPU分配任务。NPU的核心部分为运算电路1903,通过控制器1904控制运算电路1903提取存储器中的矩阵数据并进行乘法运算。
在一些实现中,运算电路1903内部包括多个处理单元(Process Engine,PE)。在一些实现中,运算电路1903是二维脉动阵列。运算电路1903还可以是一维脉动阵列或者能够执行例如乘法和加法这样的数学运算的其它电子线路。在一些实现中,运算电路1903是通用的矩阵处理器。
举例来说,假设有输入矩阵A,权重矩阵B,输出矩阵C。运算电路从权重存储器1902中取矩阵B相应的数据,并缓存在运算电路中每一个PE上。运算电路从输入存储器1901中取矩阵A数据与矩阵B进行矩阵运算,得到的矩阵的部分结果或最终结果,保存在累加器(accumulator)1908中。
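下面以一段NumPy代码概念性地示意上述“矩阵B缓存在各PE、分块读取矩阵A并将部分结果保存在累加器中”的计算流程;该示例仅用于说明分块累加的思路,tile_k等参数为示例假设,与NPU 190的实际微架构实现无关:

```python
import numpy as np

def matmul_with_accumulator(A, B, tile_k=16):
    """按K维分块计算C = A @ B,部分结果逐块累加,示意累加器1908的作用。"""
    M, K = A.shape
    K2, N = B.shape
    assert K == K2
    accumulator = np.zeros((M, N), dtype=np.float32)  # 对应累加器1908中保存的部分/最终结果
    for k0 in range(0, K, tile_k):
        a_tile = A[:, k0:k0 + tile_k]   # 对应从输入存储器1901取出的矩阵A数据块
        b_tile = B[k0:k0 + tile_k, :]   # 对应缓存在各PE上的矩阵B数据块
        accumulator += a_tile @ b_tile  # 部分结果累加
    return accumulator
```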
统一存储器1906用于存放输入数据以及输出数据。权重数据直接通过存储单元访问控制器(Direct Memory Access Controller,DMAC)1905被搬运到权重存储器1902中。输入数据也通过DMAC被搬运到统一存储器1906中。
BIU为Bus Interface Unit,即总线接口单元1910,用于AXI总线与DMAC和取指存储器(Instruction Fetch Buffer,IFB)1909的交互。
总线接口单元1910(Bus Interface Unit,简称BIU),用于取指存储器1909从外部存储器获取指令,还用于存储单元访问控制器1905从外部存储器获取输入矩阵A或者权重矩阵B的原数据。
DMAC主要用于将外部存储器DDR中的输入数据搬运到统一存储器1906中,或将权重数据搬运到权重存储器1902中,或将输入数据搬运到输入存储器1901中。
向量计算单元1907包括多个运算处理单元,在需要的情况下,对运算电路的输出做进一步处理,如向量乘,向量加,指数运算,对数运算,大小比较等等。主要用于神经网络中非卷积/全连接层网络计算,如Batch Normalization(批归一化),像素级求和,对特征平面进行上采样等。
在一些实现中,向量计算单元1907能将经处理的输出的向量存储到统一存储器1906。例如,向量计算单元1907可以将线性函数和/或非线性函数应用到运算电路1903的输出,例如对卷积层提取的特征平面进行线性插值,再例如累加值的向量,用以生成激活值。在一些实现中,向量计算单元1907生成归一化的值、像素级求和的值,或二者均有。在一些实现中,处理过的输出的向量能够用作到运算电路1903的激活输入,例如用于在神经网络中的后续层中的使用。
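作为向量计算单元1907上述后处理功能的一个概念性示意(以批归一化加ReLU激活为例,函数与变量名均为本示例的假设,并非芯片内部的实际实现),可参考如下NumPy片段:

```python
import numpy as np

def vector_unit_postprocess(conv_out, gamma, beta, eps=1e-5):
    """对运算电路输出做批归一化与ReLU激活,示意向量计算单元的典型后处理。"""
    mean = conv_out.mean(axis=0)
    var = conv_out.var(axis=0)
    normalized = (conv_out - mean) / np.sqrt(var + eps)     # Batch Normalization(批归一化)
    activated = np.maximum(gamma * normalized + beta, 0.0)  # 线性变换后取ReLU,得到激活值
    return activated  # 可写回统一存储器,或作为后续层的激活输入
```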
控制器1904连接的取指存储器(instruction fetch buffer)1909,用于存储控制器1904使用的指令;
统一存储器1906,输入存储器1901,权重存储器1902以及取指存储器1909均为On-Chip存储器。外部存储器私有于该NPU硬件架构。
其中,上述图像处理网络中各层的运算可以由运算电路1903或向量计算单元1907执行。
其中,上述任一处提到的处理器,可以是一个通用中央处理器,微处理器,ASIC,或一个或多个用于控制上述第一方面方法的程序执行的集成电路。
另外需说明的是,以上所描述的装置实施例仅仅是示意性的,其中所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部模块来实现本实施例方案的目的。另外,本申请提供的装置实施例附图中,模块之间的连接关系表示它们之间具有通信连接,具体可以实现为一条或多条通信总线或信号线。
通过以上的实施方式的描述,所属领域的技术人员可以清楚地了解到本申请可借助软件加必需的通用硬件的方式来实现,当然也可以通过专用硬件包括专用集成电路、专用CPU、专用存储器、专用元器件等来实现。一般情况下,凡由计算机程序完成的功能都可以很容易地用相应的硬件来实现,而且,用来实现同一功能的具体硬件结构也可以是多种多样的,例如模拟电路、数字电路或专用电路等。但是,对本申请而言更多情况下软件程序实现是更佳的实施方式。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在可读取的存储介质中,如计算机的软盘、U盘、移动硬盘、ROM、RAM、磁碟或者光盘等,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述的方法。
在上述实施例中,可以全部或部分地通过软件、硬件、固件或者其任意组合来实现。当使用软件实现时,可以全部或部分地以计算机程序产品的形式实现。
所述计算机程序产品包括一个或多个计算机指令。在计算机上加载和执行所述计算机程序指令时,全部或部分地产生按照本申请实施例所述的流程或功能。所述计算机可以是通用计算机、专用计算机、计算机网络、或者其他可编程装置。所述计算机指令可以存储在计算机可读存储介质中,或者从一个计算机可读存储介质向另一计算机可读存储介质传输,例如,所述计算机指令可以从一个网站站点、计算机、服务器或数据中心通过有线(例如同轴电缆、光纤、数字用户线(DSL))或无线(例如红外、无线、微波等)方式向另一个网站站点、计算机、服务器或数据中心进行传输。所述计算机可读存储介质可以是计算机能够存储的任何可用介质或者是包含一个或多个可用介质集成的服务器、数据中心等数据存储设备。所述可用介质可以是磁性介质,(例如,软盘、硬盘、磁带)、光介质(例如,DVD)、或者半导体介质(例如固态硬盘(Solid State Disk,SSD))等。

Claims (47)

  1. 一种图像的处理方法,其特征在于,所述方法包括:
    获取第一图像,所述第一图像中包括第一车辆;
    将所述第一图像输入图像处理网络中,得到所述图像处理网络输出的第一结果,在所述第一车辆在所述第一图像中漏出侧面的情况下,所述第一结果包括所述第一车辆的二维2D包络框的位置信息、所述第一车辆的车轮的坐标和所述第一车辆的第一角度,所述第一车辆的第一角度指示所述第一车辆的侧边线与所述第一图像的第一轴线之间夹角的角度,所述第一车辆的侧边线为所述第一车辆漏出的侧面与所述第一车辆所在地平面之间的交线,所述第一图像的第一轴线与所述第一图像的一个边平行;
    根据所述第一车辆的2D包络框的位置信息、所述车轮的坐标和所述第一角度,生成所述第一车辆的三维3D外包络盒的位置信息,所述第一车辆的3D外包络盒的位置信息包括至少两个第一点的坐标,所述至少两个第一点均位于所述第一车辆的3D外包络盒的边上,所述至少两个第一点中两个第一点定位所述第一车辆的3D外包络盒的边,所述至少两个第一点的坐标用于定位所述第一车辆的3D外包络盒。
  2. 根据权利要求1所述的方法,其特征在于,在所述第一车辆在所述第一图像中仅漏出侧面的情况下,所述至少两个第一点包括所述第一车辆的侧边线与所述第一车辆的2D包络框的两个交点。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述2D包络框的位置信息、所述车轮的坐标和所述第一角度,生成所述第一车辆的3D外包络盒的位置信息,包括:
    根据所述第一车辆的车轮的坐标和所述第一车辆的第一角度,生成所述第一车辆的侧边线的位置信息;
    根据所述第一车辆的侧边线的位置信息和所述第一车辆的2D包络框的位置信息,执行坐标生成操作,以得到所述至少两个第一点的坐标。
  4. 根据权利要求1所述的方法,其特征在于,在所述第一车辆在所述第一图像中漏出侧面和主面的情况下,所述第一结果中还包括所述第一车辆的分界线的位置信息和所述第一车辆的第二角度,所述分界线为侧面与主面之间的分界线,所述第一车辆的主面为所述第一车辆的前面或后面,所述第一车辆的第二角度指示所述第一车辆的主边线与所述第一图像的第一轴线之间夹角的角度,所述第一车辆的主边线为所述第一车辆漏出的主面与所述第一车辆所在的地平面之间的交线;
    所述至少两个第一点包括第一交点、第二交点和第三交点,所述第一交点为所述第一车辆的侧边线与所述第一车辆的分界线的交点,所述第一交点为所述第一车辆的3D外包络盒的一个顶点,所述第二交点为所述第一车辆的侧边线与所述第一车辆的2D包络框的交点,所述第三交点为所述第一车辆的主边线与所述第一车辆的2D包络框的交点。
  5. 根据权利要求4所述的方法,其特征在于,所述根据所述第一车辆的2D包络框的位置信息、所述车轮的坐标和所述第一角度,生成所述第一车辆的三维3D外包络盒的位置信息,包括:
    根据所述第一车辆的车轮的坐标和所述第一车辆的第一角度,生成所述第一车辆的侧 边线的位置信息;
    根据所述第一车辆的侧边线的位置信息和所述第一车辆的分界线的位置信息,生成所述第一交点的坐标;
    根据所述第一车辆的侧边线的位置信息和所述第一车辆的2D包络框的位置信息,生成所述第二交点的坐标;
    根据所述第一交点的坐标和所述第一车辆的第二角度,生成所述第一车辆的主边线的位置信息;
    根据所述第一车辆的主边线的位置信息和所述第一车辆的2D包络框的位置信息,生成所述第三交点的坐标。
  6. 根据权利要求1至5任一项所述的方法,其特征在于,所述方法还包括:
    在所述第一车辆在所述第一图像中漏出侧面的情况下,根据所述第一点的坐标,生成所述第一车辆相对于自车的朝向角。
  7. 根据权利要求6所述的方法,其特征在于,所述根据所述第一点的坐标,生成所述朝向角之前,所述方法还包括:
    根据所述第一点的坐标和地平面假设原理,生成所述第一点与自车之间的距离;
    所述根据所述第一点的坐标,生成所述朝向角,包括:
    在根据所述第一点与自车之间的距离确定所述第一车辆与自车之间的距离未超过预设阈值的情况下,通过第一计算规则,根据所述第一点的坐标,生成所述朝向角;
    在根据所述第一点与自车之间的距离确定所述第一车辆与自车之间的距离超过预设阈值的情况下,通过第二计算规则,根据所述第一点的坐标,生成所述朝向角,所述第二计算规则和所述第一计算规则为不同的计算规则。
  8. 根据权利要求7所述的方法,其特征在于,
    当至少两个第一点中任一个第一点与自车之间的距离未超过所述预设阈值时,视为所述第一车辆与自车之间的距离未超过所述预设阈值;或者,
    当至少两个第一点中任一个第一点与自车之间的距离超过所述预设阈值时,视为所述第一车辆与自车之间的距离超过所述预设阈值。
  9. 根据权利要求7所述的方法,其特征在于,所述通过第一计算规则,根据所述第一点的坐标,生成所述朝向角,包括:
    根据所述第一点的坐标和地平面假设原理,生成所述第一点在所述车体坐标系下的三维坐标,所述车体坐标系的坐标系原点位于自车内;
    根据所述第一点的三维坐标,生成所述朝向角。
  10. 根据权利要求7所述的方法,其特征在于,所述通过第二计算规则,根据所述第一点的坐标,生成所述朝向角,包括:
    根据所述第一点的坐标和所述第一车辆的第一角度,生成所述第一车辆的侧边线的位置信息,根据所述第一车辆的侧边线的位置信息和所述第一图像的消失线的位置信息,生成消失点的坐标,所述消失点为所述第一车辆的侧边线与所述第一图像的消失线之间的交点;
    根据所述消失点的坐标和两点透视原理,生成所述朝向角。
  11. 根据权利要求7所述的方法,其特征在于,所述通过第二计算规则,根据所述第一点的坐标,生成所述朝向角,包括:
    根据所述第一点的坐标、所述第一车辆的第一角度和小孔成像原理,生成所述第一车辆的第一角度和所述朝向角之间的映射关系;
    根据所述映射关系和所述第一车辆的第一角度,生成所述朝向角。
  12. 根据权利要求1至5任一项所述的方法,其特征在于,所述方法还包括:
    从所述至少两个第一点的坐标中获取所述第一车辆的3D外包络盒的顶点的坐标;
    根据所述第一车辆的3D外包络盒的顶点的坐标和地平面假设原理,生成所述第一车辆的质心点在车体坐标系下的三维坐标,所述车体坐标系的坐标系原点位于自车内。
  13. 根据权利要求1至5任一项所述的方法,其特征在于,所述方法还包括:
    从所述至少两个第一点的坐标中获取第一顶点的坐标,所述第一顶点为所述第一车辆的3D外包络盒的一个顶点;
    根据所述第一顶点的坐标和地平面假设原理,生成所述第一顶点在车体坐标系下的三维坐标;
    若所述至少两个第一点中包括至少两个第一顶点,根据第一顶点在车体坐标系下的三维坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高,所述车体坐标系的坐标系原点位于自车内。
  14. 根据权利要求13所述的方法,其特征在于,所述方法还包括:
    若所述至少两个第一点中包括一个第一顶点,获取第二图像,所述第二图像中包括所述第一车辆,所述第二图像和所述第一图像的图像采集角度不同;
    根据所述第二图像,通过所述图像处理网络,得到至少两个第二点的坐标,所述至少两个第二点均位于所述第一车辆的三维3D外包络盒的边上,所述至少两个第二点中两个第二点定位所述第一车辆的3D外包络盒的边,所述至少两个第二点的坐标用于定位所述第一车辆的3D外包络盒;
    根据所述第二点的坐标和地平面假设原理,生成第二顶点在所述车体坐标系下的三维坐标,所述第二顶点为所述第一车辆的3D外包络盒的一个顶点,所述第二顶点与所述第一顶点为不同的顶点;
    根据所述第一顶点的三维坐标和所述第二顶点的三维坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高。
  15. 一种图像的处理方法,其特征在于,所述方法包括:
    获取第三图像,所述第三图像中包括第一刚体,所述第一刚体为立方体;
    将所述第三图像输入图像处理网络中,得到所述图像处理网络输出的第二结果,在所述第一刚体在所述第三图像中漏出侧面的情况下,所述第二结果包括所述第一刚体的2D包络框的位置信息和所述第一刚体的第一角度,所述第一刚体的第一角度指示所述第一刚体的侧边线与所述第三图像的第一轴线之间夹角的角度,所述第一刚体的侧边线为所述第一刚体漏出的侧面与所述第一刚体所在平面之间的交线,所述第三图像的第一轴线与所述 第三图像的一个边平行;
    根据所述第一刚体的2D包络框的位置信息和所述第一角度,生成所述第一刚体的三维3D外包络盒的位置信息,所述第一刚体的3D外包络盒的位置信息包括至少两个第三点的坐标,所述至少两个第三点均位于所述第一刚体的3D外包络盒的边上,所述至少两个第三点中两个第三点定位所述第一刚体的3D外包络盒的边,所述至少两个第三点的坐标用于定位所述第一刚体的3D外包络盒。
  16. 根据权利要求15中所述的方法,其特征在于,在所述第一刚体在所述第三图像中仅漏出侧面的情况下,所述至少两个第三点中包括所述第一刚体的侧边线与所述第一刚体的2D包络框的两个交点。
  17. 根据权利要求15中所述的方法,其特征在于,在所述第一刚体在所述第三图像中漏出侧面和主面的情况下,所述第二结果中还包括所述第一刚体的分界线的位置信息和所述第一刚体的第二角度,所述分界线为侧面与主面之间的分界线,所述第一刚体的主面为所述第一刚体的前面或后面,所述第一刚体的第二角度指示所述第一刚体的主边线与所述第三图像的第一轴线之间夹角的角度,所述第一刚体的主边线为所述第一刚体漏出的主面与所述第一刚体所在的地平面之间的交线;
    所述至少两个第三点包括第一交点、第二交点和第三交点,所述第一交点为所述第一刚体的侧边线与所述第一刚体的分界线的交点,所述第一交点为所述第一刚体的3D外包络盒的一个顶点,所述第二交点为所述第一刚体的侧边线与所述第一刚体的2D包络框的交点,所述第三交点为所述第一刚体的主边线与所述第一刚体的2D包络框的交点。
  18. 根据权利要求15至17任一项所述的方法,其特征在于,所述方法还包括:
    根据所述至少两个第三点的坐标,生成所述第一刚体的三维特征信息,所述第一刚体的三维特征信息包括以下中的一项或多项:所述第一刚体相对于自车的朝向角、所述第一刚体的质心点的位置信息和所述第一刚体的尺寸。
  19. 一种网络的训练方法,其特征在于,所述方法包括:
    获取训练图像和所述训练图像的标注数据,所述训练图像中包括第二车辆,在所述第二车辆在所述训练图像中漏出侧面的情况下,所述标注数据包括所述第二车辆的车轮的标注坐标和所述第二车辆的标注第一角度,所述第二车辆的第一角度指示所述第二车辆的侧边线与所述训练图像的第一轴线之间夹角的角度,所述第二车辆的侧边线为所述第二车辆漏出的侧面与所述第二车辆所在地平面之间的交线,所述训练图像的第一轴线与所述训练图像的一个边平行;
    将所述训练图像输入图像处理网络中,得到所述图像处理网络输出的第三结果,所述第三结果包括所述第二车辆的车轮的生成坐标和所述第二车辆的生成第一角度;
    根据所述标注数据和所述第三结果,利用损失函数对所述图像处理网络进行训练,直至满足所述损失函数的收敛条件,输出训练后的所述图像处理网络,所述损失函数用于拉近所述生成坐标与所述标注坐标之间的相似度,且拉近所述生成第一角度和所述标注第一角度之间的相似度。
  20. 根据权利要求19所述的方法,其特征在于,在所述第二车辆在所述训练图像中漏 出侧面和主面的情况下,所述标注数据还包括所述第二车辆的分界线的标注位置信息和所述第二车辆的标注第二角度,所述第三结果还包括所述第二车辆的分界线的生成位置信息和所述第二车辆的生成第二角度,所述损失函数还用于拉近所述生成位置信息和所述标注位置信息之间的相似度,且拉近所述生成第二角度与所述标注第二角度之间的相似度;
    其中,所述第二车辆的主面为所述第二车辆的前面或后面,所述分界线为侧面与主面之间的分界线,所述第二车辆的第二角度指示所述第二车辆的主边线与所述训练图像的第一轴线之间夹角的角度,所述第二车辆的主边线为所述第二车辆漏出的主面与所述第二车辆所在的地平面之间的交线。
  21. 根据权利要求19或20所述的方法,其特征在于,所述图像处理网络包括二阶段目标检测网络和三维特征提取网络,所述二阶段目标检测网络包括区域生成网络RPN;
    所述将所述训练图像输入图像处理网络中,得到所述图像处理网络输出的第三结果,包括:
    将所述训练图像输入所述二阶段目标检测网络中,得到所述二阶段目标检测网络中的所述RPN输出的所述第二车辆的2D包络框的位置信息;
    将第一特征图输入所述三维特征提取网络,得到所述三维特征提取网络输出的所述第三结果,所述第一特征图为所述训练图像的特征图中位于所述RPN输出的所述2D包络框内的特征图;
    所述输出训练后的所述图像处理网络,包括:
    输出包括所述二阶段目标检测网络和三维特征提取网络的图像处理网络。
  22. 一种图像处理装置,其特征在于,所述装置包括:
    获取模块,用于获取第一图像,所述第一图像中包括第一车辆;
    输入模块,用于将所述第一图像输入图像处理网络中,得到所述图像处理网络输出的第一结果,在所述第一车辆在所述第一图像中漏出侧面的情况下,所述第一结果包括所述第一车辆的二维2D包络框的位置信息、所述第一车辆的车轮的坐标和所述第一车辆的第一角度,所述第一车辆的第一角度指示所述第一车辆的侧边线与所述第一图像的第一轴线之间夹角的角度,所述第一车辆的侧边线为所述第一车辆漏出的侧面与所述第一车辆所在地平面之间的交线,所述第一图像的第一轴线与所述第一图像的一个边平行;
    生成模块,用于根据所述第一车辆的2D包络框的位置信息、所述车轮的坐标和所述第一角度,生成所述第一车辆的三维3D外包络盒的位置信息,所述第一车辆的3D外包络盒的位置信息包括至少两个第一点的坐标,所述至少两个第一点均位于所述第一车辆的3D外包络盒的边上,所述至少两个第一点中两个第一点定位所述第一车辆的3D外包络盒的边,所述至少两个第一点的坐标用于定位所述第一车辆的3D外包络盒。
  23. 根据权利要求22所述的装置,其特征在于,在所述第一车辆在所述第一图像中仅漏出侧面的情况下,所述至少两个第一点包括所述第一车辆的侧边线与所述第一车辆的2D包络框的两个交点。
  24. 根据权利要求23所述的装置,其特征在于,所述生成模块,具体用于:
    根据所述第一车辆的车轮的坐标和所述第一车辆的第一角度,生成所述第一车辆的侧 边线的位置信息;
    根据所述第一车辆的侧边线的位置信息和所述第一车辆的2D包络框的位置信息,执行坐标生成操作,以得到所述至少两个第一点的坐标。
  25. 根据权利要求22所述的装置,其特征在于,在所述第一车辆在所述第一图像中漏出侧面和主面的情况下,所述第一结果中还包括所述第一车辆的分界线的位置信息和所述第一车辆的第二角度,所述分界线为侧面与主面之间的分界线,所述第一车辆的主面为所述第一车辆的前面或后面,所述第一车辆的第二角度指示所述第一车辆的主边线与所述第一图像的第一轴线之间夹角的角度,所述第一车辆的主边线为所述第一车辆漏出的主面与所述第一车辆所在的地平面之间的交线;
    所述至少两个第一点包括第一交点、第二交点和第三交点,所述第一交点为所述第一车辆的侧边线与所述第一车辆的分界线的交点,所述第一交点为所述第一车辆的3D外包络盒的一个顶点,所述第二交点为所述第一车辆的侧边线与所述第一车辆的2D包络框的交点,所述第三交点为所述第一车辆的主边线与所述第一车辆的2D包络框的交点。
  26. 根据权利要求25所述的装置,其特征在于,所述生成模块,具体用于:
    根据所述第一车辆的车轮的坐标和所述第一车辆的第一角度,生成所述第一车辆的侧边线的位置信息;
    根据所述第一车辆的侧边线的位置信息和所述第一车辆的分界线的位置信息,生成所述第一交点的坐标;
    根据所述第一车辆的侧边线的位置信息和所述第一车辆的2D包络框的位置信息,生成所述第二交点的坐标;
    根据所述第一交点的坐标和所述第一车辆的第二角度,生成所述第一车辆的主边线的位置信息;
    根据所述第一车辆的主边线的位置信息和所述第一车辆的2D包络框的位置信息,生成所述第三交点的坐标。
  27. 根据权利要求22至26任一项所述的装置,其特征在于,
    所述生成模块,还用于在所述第一车辆在所述第一图像中漏出侧面的情况下,根据所述第一点的坐标,生成所述第一车辆相对于自车的朝向角。
  28. 根据权利要求27所述的装置,其特征在于,所述生成模块,还用于根据所述第一点的坐标和地平面假设原理,生成所述第一点与自车之间的距离;
    所述生成模块,具体用于:
    在根据所述第一点与自车之间的距离确定所述第一车辆与自车之间的距离未超过预设阈值的情况下,通过第一计算规则,根据所述第一点的坐标,生成所述朝向角;
    在根据所述第一点与自车之间的距离确定所述第一车辆与自车之间的距离超过预设阈值的情况下,通过第二计算规则,根据所述第一点的坐标,生成所述朝向角,所述第二计算规则和所述第一计算规则为不同的计算规则。
  29. 根据权利要求28所述的装置,其特征在于,
    当至少两个第一点中任一个第一点与自车之间的距离未超过所述预设阈值时,视为所 述第一车辆与自车之间的距离未超过所述预设阈值;或者,
    当至少两个第一点中任一个第一点与自车之间的距离超过所述预设阈值时,视为所述第一车辆与自车之间的距离超过所述预设阈值。
  30. 根据权利要求28所述的装置,其特征在于,所述生成模块,具体用于:
    根据所述第一点的坐标和地平面假设原理,生成所述第一点在所述车体坐标系下的三维坐标,所述车体坐标系的坐标系原点位于自车内;
    根据所述第一点的三维坐标,生成所述朝向角。
  31. 根据权利要求28所述的装置,其特征在于,所述生成模块,具体用于:
    根据所述第一点的坐标和所述第一车辆的第一角度,生成所述第一车辆的侧边线的位置信息,根据所述第一车辆的侧边线的位置信息和所述第一图像的消失线的位置信息,生成消失点的坐标,所述消失点为所述第一车辆的侧边线与所述第一图像的消失线之间的交点;
    根据所述消失点的坐标和两点透视原理,生成所述朝向角。
  32. 根据权利要求28所述的装置,其特征在于,所述生成模块,具体用于:
    根据所述第一点的坐标、所述第一车辆的第一角度和小孔成像原理,生成所述第一车辆的第一角度和所述朝向角之间的映射关系;
    根据所述映射关系和所述第一车辆的第一角度,生成所述朝向角。
  33. 根据权利要求22至26任一项所述的装置,其特征在于,
    所述获取模块,还用于从所述至少两个第一点的坐标中获取所述第一车辆的3D外包络盒的顶点的坐标;
    所述生成模块,还用于根据所述第一车辆的3D外包络盒的顶点的坐标和地平面假设原理,生成所述第一车辆的质心点在车体坐标系下的三维坐标,所述车体坐标系的坐标系原点位于自车内。
  34. 根据权利要求22至26任一项所述的装置,其特征在于,
    所述获取模块,还用于从所述至少两个第一点的坐标中获取第一顶点的坐标,所述第一顶点为所述第一车辆的3D外包络盒的一个顶点;
    所述生成模块,还用于根据所述第一顶点的坐标和地平面假设原理,生成所述第一顶点在车体坐标系下的三维坐标;
    所述生成模块,还用于若所述至少两个第一点中包括至少两个第一顶点,根据第一顶点在车体坐标系下的三维坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高,所述车体坐标系的坐标系原点位于自车内。
  35. 根据权利要求34所述的装置,其特征在于,
    所述获取模块,还用于若所述至少两个第一点中包括一个第一顶点,获取第二图像,所述第二图像中包括所述第一车辆,所述第二图像和所述第一图像的图像采集角度不同;
    所述生成模块,还用于根据所述第二图像,通过所述图像处理网络,得到至少两个第二点的坐标,所述至少两个第二点均位于所述第一车辆的三维3D外包络盒的边上,所述至少两个第二点中两个第二点定位所述第一车辆的3D外包络盒的边,所述至少两个第二 点的坐标用于定位所述第一车辆的3D外包络盒;
    所述生成模块,还用于根据所述第二点的坐标和地平面假设原理,生成第二顶点在所述车体坐标系下的三维坐标,所述第二顶点为所述第一车辆的3D外包络盒的一个顶点,所述第二顶点与所述第一顶点为不同的顶点;
    所述生成模块,还用于根据所述第一顶点的三维坐标和所述第二顶点的三维坐标,生成以下中的一项或多项:第一车辆的长、第一车辆的宽和第一车辆的高。
  36. 一种图像处理装置,其特征在于,所述装置包括:
    获取模块,用于获取第三图像,所述第三图像中包括第一刚体,所述第一刚体为立方体;
    输入模块,用于将所述第三图像输入图像处理网络中,得到所述图像处理网络输出的第二结果,在所述第一刚体在所述第三图像中漏出侧面的情况下,所述第二结果包括所述第一刚体的2D包络框的位置信息和所述第一刚体的第一角度,所述第一刚体的第一角度指示所述第一刚体的侧边线与所述第三图像的第一轴线之间夹角的角度,所述第一刚体的侧边线为所述第一刚体漏出的侧面与所述第一刚体所在平面之间的交线,所述第三图像的第一轴线与所述第三图像的一个边平行;
    生成模块,用于根据所述第一刚体的2D包络框的位置信息和所述第一角度,生成所述第一刚体的三维3D外包络盒的位置信息,所述第一刚体的3D外包络盒的位置信息包括至少两个第三点的坐标,所述至少两个第三点均位于所述第一刚体的3D外包络盒的边上,所述至少两个第三点中两个第三点定位所述第一刚体的3D外包络盒的边,所述至少两个第三点的坐标用于定位所述第一刚体的3D外包络盒。
  37. 根据权利要求36所述的装置,其特征在于,在所述第一刚体在所述第三图像中仅漏出侧面的情况下,所述至少两个第三点中包括所述第一刚体的侧边线与所述第一刚体的2D包络框的两个交点。
  38. 根据权利要求36所述的装置,其特征在于,在所述第一刚体在所述第三图像中漏出侧面和主面的情况下,所述第二结果中还包括所述第一刚体的分界线的位置信息和所述第一刚体的第二角度,所述分界线为侧面与主面之间的分界线,所述第一刚体的主面为所述第一刚体的前面或后面,所述第一刚体的第二角度指示所述第一刚体的主边线与所述第三图像的第一轴线之间夹角的角度,所述第一刚体的主边线为所述第一刚体漏出的主面与所述第一刚体所在的地平面之间的交线;
    所述至少两个第三点包括第一交点、第二交点和第三交点,所述第一交点为所述第一刚体的侧边线与所述第一刚体的分界线的交点,所述第一交点为所述第一刚体的3D外包络盒的一个顶点,所述第二交点为所述第一刚体的侧边线与所述第一刚体的2D包络框的交点,所述第三交点为所述第一刚体的主边线与所述第一刚体的2D包络框的交点。
  39. 根据权利要求36至38任一项所述的装置,其特征在于,
    所述生成模块,还用于根据所述至少两个第三点的坐标,生成所述第一刚体的三维特征信息,所述第一刚体的三维特征信息包括以下中的一项或多项:所述第一刚体相对于自车的朝向角、所述第一刚体的质心点的位置信息和所述第一刚体的尺寸。
  40. 一种网络训练装置,其特征在于,所述装置包括:
    获取模块,用于获取训练图像和所述训练图像的标注数据,所述训练图像中包括第二车辆,在所述第二车辆在所述训练图像中漏出侧面的情况下,所述标注数据包括所述第二车辆的车轮的标注坐标和所述第二车辆的标注第一角度,所述第二车辆的第一角度指示所述第二车辆的侧边线与所述训练图像的第一轴线之间夹角的角度,所述第二车辆的侧边线为所述第二车辆漏出的侧面与所述第二车辆所在地平面之间的交线,所述训练图像的第一轴线与所述训练图像的一个边平行;
    输入模块,用于将所述训练图像输入图像处理网络中,得到所述图像处理网络输出的第三结果,所述第三结果包括所述第二车辆的车轮的生成坐标和所述第二车辆的生成第一角度;
    训练模块,用于根据所述标注数据和所述第三结果,利用损失函数对所述图像处理网络进行训练,直至满足所述损失函数的收敛条件,输出训练后的所述图像处理网络,所述损失函数用于拉近所述生成坐标与所述标注坐标之间的相似度,且拉近所述生成第一角度和所述标注第一角度之间的相似度。
  41. 根据权利要求40所述的装置,其特征在于,在所述第二车辆在所述训练图像中漏出侧面和主面的情况下,所述标注数据还包括所述第二车辆的分界线的标注位置信息和所述第二车辆的标注第二角度,所述第三结果还包括所述第二车辆的分界线的生成位置信息和所述第二车辆的生成第二角度,所述损失函数还用于拉近所述生成位置信息和所述标注位置信息之间的相似度,且拉近所述生成第二角度与所述标注第二角度之间的相似度;
    其中,所述第二车辆的主面为所述第二车辆的前面或后面,所述分界线为侧面与主面之间的分界线,所述第二车辆的第二角度指示所述第二车辆的主边线与所述训练图像的第一轴线之间夹角的角度,所述第二车辆的主边线为所述第二车辆漏出的主面与所述第二车辆所在的地平面之间的交线。
  42. 根据权利要求40或41所述的装置,其特征在于,所述图像处理网络包括二阶段目标检测网络和三维特征提取网络,所述二阶段目标检测网络包括区域生成网络RPN;
    所述输入模块,具体用于:
    将所述训练图像输入所述二阶段目标检测网络中,得到所述二阶段目标检测网络中的所述RPN输出的所述第二车辆的2D包络框的位置信息;
    将第一特征图输入所述三维特征提取网络,得到所述三维特征提取网络输出的所述第三结果,所述第一特征图为所述训练图像的特征图中位于所述RPN输出的所述2D包络框内的特征图;
    所述训练模块,具体用于输出包括所述二阶段目标检测网络和三维特征提取网络的图像处理网络。
  43. 一种执行设备,其特征在于,包括处理器,所述处理器和存储器耦合,所述存储器存储有程序指令,当所述存储器存储的程序指令被所述处理器执行时实现权利要求1至14中任一项所述的方法,或者,实现权利要求15至18中任一项所述的方法。
  44. 一种自动驾驶车辆,其特征在于,包括处理器,所述处理器和存储器耦合,所述 存储器存储有程序指令,当所述存储器存储的程序指令被所述处理器执行时实现权利要求1至14中任一项所述的方法,或者,实现权利要求15至18中任一项所述的方法。
  45. 一种训练设备,其特征在于,包括处理器,所述处理器和存储器耦合,所述存储器存储有程序指令,当所述存储器存储的程序指令被所述处理器执行时实现权利要求19至21中任一项所述的方法。
  46. 一种计算机可读存储介质,包括程序,当其在计算机上运行时,使得计算机执行如权利要求1至14中任一项所述的方法,或者,使得计算机执行如权利要求15至18中任一项所述的方法,或者,使得计算机执行如权利要求19至21中任一项所述的方法。
  47. 一种电路系统,其特征在于,所述电路系统包括处理电路,所述处理电路配置为执行如权利要求1至14中任一项所述的方法,或者,执行如权利要求15至18中任一项所述的方法,或者,执行如权利要求19至21中任一项所述的方法。
PCT/CN2021/088263 2020-04-30 2021-04-20 一种图像的处理方法、网络的训练方法以及相关设备 WO2021218693A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP21797363.5A EP4137990A4 (en) 2020-04-30 2021-04-20 IMAGE PROCESSING METHOD, NETWORK TRAINING METHOD AND ASSOCIATED APPARATUS
US17/975,922 US20230047094A1 (en) 2020-04-30 2022-10-28 Image processing method, network training method, and related device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010366441.9 2020-04-30
CN202010366441.9A CN113591518B (zh) 2020-04-30 2020-04-30 一种图像的处理方法、网络的训练方法以及相关设备

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/975,922 Continuation US20230047094A1 (en) 2020-04-30 2022-10-28 Image processing method, network training method, and related device

Publications (1)

Publication Number Publication Date
WO2021218693A1 true WO2021218693A1 (zh) 2021-11-04

Family

ID=78237489

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/088263 WO2021218693A1 (zh) 2020-04-30 2021-04-20 一种图像的处理方法、网络的训练方法以及相关设备

Country Status (4)

Country Link
US (1) US20230047094A1 (zh)
EP (1) EP4137990A4 (zh)
CN (1) CN113591518B (zh)
WO (1) WO2021218693A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114091521A (zh) * 2021-12-09 2022-02-25 深圳佑驾创新科技有限公司 车辆航向角的检测方法、装置、设备及存储介质
CN115470742A (zh) * 2022-10-31 2022-12-13 中南大学 一种锂离子电池建模方法、***、设备及存储介质
CN117315035A (zh) * 2023-11-30 2023-12-29 武汉未来幻影科技有限公司 一种车辆朝向的处理方法、装置以及处理设备

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112614359B (zh) * 2020-12-21 2022-06-28 阿波罗智联(北京)科技有限公司 控制交通的方法、装置、路侧设备和云控平台

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109242903A (zh) * 2018-09-07 2019-01-18 百度在线网络技术(北京)有限公司 三维数据的生成方法、装置、设备及存储介质
CN109829421A (zh) * 2019-01-29 2019-05-31 西安邮电大学 车辆检测的方法、装置及计算机可读存储介质
US20190340432A1 (en) * 2016-10-11 2019-11-07 Zoox, Inc. Three dimensional bounding box estimation from two dimensional images
CN111008557A (zh) * 2019-10-30 2020-04-14 长安大学 一种基于几何约束的车辆细粒度识别方法

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10312993B2 (en) * 2015-10-30 2019-06-04 The Florida International University Board Of Trustees Cooperative clustering for enhancing MU-massive-MISO-based UAV communication
US10816992B2 (en) * 2018-04-17 2020-10-27 Baidu Usa Llc Method for transforming 2D bounding boxes of objects into 3D positions for autonomous driving vehicles (ADVs)
US10586456B2 (en) * 2018-04-27 2020-03-10 TuSimple System and method for determining car to lane distance
CN110148169B (zh) * 2019-03-19 2022-09-27 长安大学 一种基于ptz云台相机的车辆目标三维信息获取方法
CN110555407B (zh) * 2019-09-02 2022-03-08 东风汽车有限公司 路面车辆空间识别方法及电子设备

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190340432A1 (en) * 2016-10-11 2019-11-07 Zoox, Inc. Three dimensional bounding box estimation from two dimensional images
CN109242903A (zh) * 2018-09-07 2019-01-18 百度在线网络技术(北京)有限公司 三维数据的生成方法、装置、设备及存储介质
CN109829421A (zh) * 2019-01-29 2019-05-31 西安邮电大学 车辆检测的方法、装置及计算机可读存储介质
CN111008557A (zh) * 2019-10-30 2020-04-14 长安大学 一种基于几何约束的车辆细粒度识别方法

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114091521A (zh) * 2021-12-09 2022-02-25 深圳佑驾创新科技有限公司 车辆航向角的检测方法、装置、设备及存储介质
CN114091521B (zh) * 2021-12-09 2022-04-26 深圳佑驾创新科技有限公司 车辆航向角的检测方法、装置、设备及存储介质
CN115470742A (zh) * 2022-10-31 2022-12-13 中南大学 一种锂离子电池建模方法、***、设备及存储介质
CN115470742B (zh) * 2022-10-31 2023-03-14 中南大学 一种锂离子电池建模方法、***、设备及存储介质
CN117315035A (zh) * 2023-11-30 2023-12-29 武汉未来幻影科技有限公司 一种车辆朝向的处理方法、装置以及处理设备
CN117315035B (zh) * 2023-11-30 2024-03-22 武汉未来幻影科技有限公司 一种车辆朝向的处理方法、装置以及处理设备

Also Published As

Publication number Publication date
EP4137990A4 (en) 2023-09-27
US20230047094A1 (en) 2023-02-16
CN113591518B (zh) 2023-11-03
CN113591518A (zh) 2021-11-02
EP4137990A1 (en) 2023-02-22

Similar Documents

Publication Publication Date Title
CN110543814B (zh) 一种交通灯的识别方法及装置
WO2021160184A1 (en) Target detection method, training method, electronic device, and computer-readable medium
WO2021218693A1 (zh) 一种图像的处理方法、网络的训练方法以及相关设备
WO2022001773A1 (zh) 轨迹预测方法及装置
WO2021000800A1 (zh) 道路可行驶区域推理方法及装置
WO2021238306A1 (zh) 一种激光点云的处理方法及相关设备
CN112639882B (zh) 定位方法、装置及***
CN110930323B (zh) 图像去反光的方法、装置
WO2022104774A1 (zh) 目标检测方法和装置
WO2021244207A1 (zh) 训练驾驶行为决策模型的方法及装置
US20220080972A1 (en) Autonomous lane change method and apparatus, and storage medium
WO2022142839A1 (zh) 一种图像处理方法、装置以及智能汽车
CN112512887B (zh) 一种行驶决策选择方法以及装置
CN114494158A (zh) 一种图像处理方法、一种车道线检测方法及相关设备
EP4206731A1 (en) Target tracking method and device
WO2022051951A1 (zh) 车道线检测方法、相关设备及计算机可读存储介质
US20230048680A1 (en) Method and apparatus for passing through barrier gate crossbar by vehicle
CN115546781A (zh) 一种点云数据的聚类方法以及装置
WO2022052881A1 (zh) 一种构建地图的方法及计算设备
CN114332845A (zh) 一种3d目标检测的方法及设备
CN116261649A (zh) 一种车辆行驶意图预测方法、装置、终端及存储介质
WO2022033089A1 (zh) 确定检测对象的三维信息的方法及装置
CN113066124A (zh) 一种神经网络的训练方法以及相关设备
WO2021159397A1 (zh) 车辆可行驶区域的检测方法以及检测装置
WO2022022284A1 (zh) 目标物的感知方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21797363

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021797363

Country of ref document: EP

Effective date: 20221116

NENP Non-entry into the national phase

Ref country code: DE