CN112364793A - Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment - Google Patents

Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment

Info

Publication number
CN112364793A
CN112364793A (Application No. CN202011288888.5A)
Authority
CN
China
Prior art keywords
focus
camera
short
long
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011288888.5A
Other languages
Chinese (zh)
Inventor
冯明驰
王鑫
孙博望
刘景林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications filed Critical Chongqing University of Post and Telecommunications
Priority to CN202011288888.5A priority Critical patent/CN112364793A/en
Publication of CN112364793A publication Critical patent/CN112364793A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Studio Devices (AREA)

Abstract

The invention claims a target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment. The method comprises the following steps: 1. Target detection is performed with a convolutional neural network on the images acquired by the long-focus and short-focus binocular cameras, obtaining the target frame positions in the images acquired at the same moment by the cameras with different focal lengths. 2. From the camera imaging principle and the intrinsic and extrinsic parameters K, R, T obtained by camera calibration, the mapping relationship f of a spatial target point P between the long-focus and short-focus camera pixel coordinate systems is obtained. 3. From the target frame positions in the long-focus camera image, the positions of the corresponding target frames in the short-focus camera image are obtained through the mapping relationship f and fused with the targets detected in the original short-focus camera image, thereby realizing target detection at different distances. The invention overcomes the limitation that a single-focal-length camera cannot adapt to target detection tasks at different distances, and improves target detection accuracy in the vehicle environment. The method is also simple to use, low in cost, and runs in real time.

Description

Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment
Technical Field
The invention belongs to the technical field of intelligent automobile environment sensing, and particularly relates to a target detection and fusion method in a long-and-short-focal-length multi-camera vehicle environment.
Background
In recent years, with the rapid development of fields such as artificial intelligence and machine vision, autonomous driving has become an important research area for both academia and industry. Environment perception is one of the key technologies in an autonomous driving system and its most basic module: it acts as the eyes of the vehicle and informs it about the surrounding environment. Target detection, positioning and motion state estimation are the most basic functions of the environment perception module.
With the wide application of deep learning and the great improvement in the computing capability of computing devices, deep-learning-based environment perception has become an important support for the environment perception module. Vision-based environment perception mainly realizes pedestrian detection, obstacle detection, lane line detection, drivable area detection, traffic sign recognition and similar functions, and can locate targets when combined with stereo vision. At present, researchers at home and abroad have mainly focused on improving the target detection performance of a single-focal-length camera. In a complex working environment, however, the information acquired by a single-focal-length camera is limited; such a camera alone cannot correctly detect targets at different distances, so missed detections often occur. Cameras with different focal lengths can compensate for each other's shortcomings and combine their advantages to detect targets in the vehicle environment accurately. For example, a short-focus camera has a wide field of view: distant targets appear small in the image and are difficult to detect by deep learning, while nearby targets appear large and are easy to detect. A long-focus camera has a narrow field of view: distant targets appear large and are easy to detect, but nearby targets may fall outside the camera's field of view and not be captured at all. Therefore, by combining the respective advantages of the short-focus and long-focus camera images, target detection at different distances can be realized, targets in the vehicle environment can be detected more accurately, and missed detections can be effectively avoided.
Disclosure of Invention
The present invention is directed to solving the above problems of the prior art. A target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment is provided. The technical scheme of the invention is as follows:
a target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment comprises the following steps:
Step 1, installing a long-focus camera and a short-focus camera respectively to form a binocular system, and calibrating the binocular system; inputting the image sequences acquired by the long-focus and short-focus cameras into a deep learning convolutional neural network, and obtaining through target detection the target frame positions in the wide field of view of the short-focus camera and the narrow field of view of the long-focus camera at the same moment;
Step 2, using the camera imaging principle, obtaining the mapping relation f of a spatial target point P between the long-focus and short-focus camera pixel coordinate systems from the target position in the narrow field of view of the long-focus camera and the binocular-calibrated intrinsic and extrinsic parameters, and obtaining, for a target position p1 in the narrow field of view of the long-focus camera, the corresponding target position p2 in the wide field of view of the short-focus camera;
Step 3, fusing the target frames in the long-focus and short-focus images by comparing the target positions detected in the wide field of view of the short-focus camera with the positions, obtained in step 2, that the targets in the narrow field of view of the long-focus camera correspond to in the wide field of view of the short-focus camera.
Further, the step 1) specifically comprises the following steps:
2-1, setting different focal lengths of the long and short-focus cameras, installing a binocular camera system to the same height above the vehicle, and reserving a certain baseline distance between the binocular cameras;
Step 2-2, calibrating with the Zhang Zhengyou calibration method to obtain the intrinsic parameter K and the extrinsic parameters R, T of the binocular system; where K is the intrinsic parameter matrix containing the focal length, optical center and other information of the camera, and R and T are the rotation matrix and translation vector of the long-focus camera relative to the short-focus camera, respectively.
Step 2-3, deep learning target detection: the lightweight convolutional neural network YOLOv3-Tiny is used to perform target detection on the images collected by the long-focus and short-focus binocular cameras at the same moment, specifically comprising data set making, transfer learning, and network inference and target detection, to obtain the target frame positions under the cameras with different focal lengths.
Further, in step 2-1 the camera focal lengths are set and the camera system is installed: two cameras with different focal lengths are adopted, the short-focus camera is placed on the left and the long-focus camera on the right, with a baseline of length b between the two cameras, forming a long-and-short-focus binocular vision system that is placed at the front of the vehicle roof.
Further, in step 2-2) the long-focus and short-focus binocular cameras are calibrated: a checkerboard calibration plate is placed in front of the binocular cameras, and the checkerboard is required to appear simultaneously in the fields of view of the long-focus and short-focus cameras; the corner points of the checkerboard calibration plate are captured with the binocular cameras, and the intrinsic parameters K1, K2 of the cameras and the extrinsic parameters R and T between the binocular cameras are calculated with the Zhang Zhengyou calibration method.
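By way of illustration only, the binocular calibration described above could be carried out with OpenCV, whose calibrateCamera/stereoCalibrate functions implement the Zhang Zhengyou method. The folder layout, board geometry, square size and the assumption that both cameras share the same image resolution are illustrative choices, not part of the invention.

```python
import glob
import cv2
import numpy as np

# Checkerboard geometry (illustrative values; the patent does not specify them).
BOARD_COLS, BOARD_ROWS = 9, 6          # inner corners per row / column
SQUARE_SIZE = 0.025                    # metres, assumed

# 3D object points of the checkerboard corners (board plane Z = 0).
objp = np.zeros((BOARD_ROWS * BOARD_COLS, 3), np.float32)
objp[:, :2] = np.mgrid[0:BOARD_COLS, 0:BOARD_ROWS].T.reshape(-1, 2) * SQUARE_SIZE

obj_pts, long_pts, short_pts = [], [], []
# Hypothetical folder layout: paired frames captured at the same instant.
for f_long, f_short in zip(sorted(glob.glob("long/*.png")),
                           sorted(glob.glob("short/*.png"))):
    g_long = cv2.imread(f_long, cv2.IMREAD_GRAYSCALE)
    g_short = cv2.imread(f_short, cv2.IMREAD_GRAYSCALE)
    ok1, c1 = cv2.findChessboardCorners(g_long, (BOARD_COLS, BOARD_ROWS))
    ok2, c2 = cv2.findChessboardCorners(g_short, (BOARD_COLS, BOARD_ROWS))
    if ok1 and ok2:                    # the board must be visible in both views
        obj_pts.append(objp)
        long_pts.append(c1)
        short_pts.append(c2)

size = g_short.shape[::-1]             # (width, height); both cameras assumed equal
# Per-camera intrinsics (Zhang's method as implemented by OpenCV).
_, K1, D1, _, _ = cv2.calibrateCamera(obj_pts, long_pts, size, None, None)
_, K2, D2, _, _ = cv2.calibrateCamera(obj_pts, short_pts, size, None, None)
# Extrinsics between the two cameras, keeping the intrinsics fixed.
_, K1, D1, K2, D2, R, T, _, _ = cv2.stereoCalibrate(
    obj_pts, long_pts, short_pts, K1, D1, K2, D2, size,
    flags=cv2.CALIB_FIX_INTRINSIC)
print("K1:\n", K1, "\nK2:\n", K2, "\nR:\n", R, "\nT:\n", T)
```

Because the long-focus image points are passed first, the returned R, T map a point expressed in the long-focus camera frame into the short-focus camera frame, which matches the convention used in step 2 of the method.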
Further, the specific process of data set making in step 2-3 is: a self-collected and labeled Chongqing urban traffic data set is merged with the open-source Pascal VOC 2012 data set, and the merged data set is then augmented to obtain more training samples;
the specific process of transfer learning is: the merged data set is loaded and the YOLOv3-Tiny network is trained on the basis of an existing pre-trained model;
network inference and target detection means that, during normal operation of the intelligent vehicle, the YOLOv3-Tiny network loads the trained model weights and performs forward inference to complete the target detection task.
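As a minimal sketch of the data enhancement mentioned in the data set making step above, two common augmentations (horizontal flip and brightness jitter) are shown below; the patent does not enumerate the exact operations used, so these are assumptions.

```python
import random
import cv2
import numpy as np

def augment(img, boxes):
    """Illustrative augmentation for the merged data set.
    `boxes` are (cx, cy, w, h) in pixels; only the flip changes them."""
    h, w = img.shape[:2]
    if random.random() < 0.5:                      # horizontal flip
        img = cv2.flip(img, 1)
        boxes = [(w - cx, cy, bw, bh) for cx, cy, bw, bh in boxes]
    if random.random() < 0.5:                      # brightness jitter
        gain = random.uniform(0.7, 1.3)
        img = np.clip(img.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    return img, boxes
```

For the inference step, forward computation with YOLOv3-Tiny can be run through OpenCV's DNN module as sketched below; the configuration and weight file names, the 416x416 input size and the thresholds are assumptions made for illustration.

```python
import cv2
import numpy as np

CFG, WEIGHTS = "yolov3-tiny.cfg", "yolov3-tiny_traffic.weights"  # assumed file names
CONF_THR, NMS_THR = 0.4, 0.45

net = cv2.dnn.readNetFromDarknet(CFG, WEIGHTS)
out_names = net.getUnconnectedOutLayersNames()

def detect(img):
    """Return a list of (x, y, w, h, class_id, score) boxes for one image."""
    h, w = img.shape[:2]
    blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    boxes, scores, class_ids = [], [], []
    for out in net.forward(out_names):
        for det in out:                       # det = [cx, cy, bw, bh, obj, cls...]
            cls_scores = det[5:]
            cid = int(np.argmax(cls_scores))
            score = float(cls_scores[cid] * det[4])
            if score < CONF_THR:
                continue
            cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
            boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
            scores.append(score)
            class_ids.append(cid)
    keep = cv2.dnn.NMSBoxes(boxes, scores, CONF_THR, NMS_THR)
    return [(*boxes[i], class_ids[i], scores[i]) for i in np.array(keep).flatten()]

# Run detection on the synchronized long-focus / short-focus frame pair.
dets_long = detect(cv2.imread("frame_long.png"))
dets_short = detect(cv2.imread("frame_short.png"))
```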
Further, in step 2, a corresponding relationship between the long focus camera pixel coordinate system and the short focus camera pixel coordinate system is established by a camera imaging principle, and can be calculated by the following formula:
s1·p1 = K1·P,  s2·p2 = K2·(R·P + T)
where P represents a point in real space; p1 and p2 denote the pixel points corresponding to P in the long-focus and short-focus camera pixel coordinate systems, respectively; K1 and K2 denote the intrinsic parameters of the long-focus and short-focus cameras, respectively; R, T denote the extrinsic parameters between the long-focus and short-focus binocular cameras; and s1, s2 denote the depth of P in the long-focus and short-focus camera coordinate systems, respectively.
When using homogeneous coordinates, the above equation is written as follows:
p1 = K1·P,  p2 = K2·(R·P + T)
From the above formulas, the mapping relationship f between p1 and p2 can be obtained:
p2 = K2·R·K1⁻¹·p1 + K2·T
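The mapping relation f can be applied per pixel as sketched below. Written with the depths made explicit, the translation term depends on the distance of the point, so the sketch takes an assumed or estimated depth s1 as input; mapping a target frame through its two opposite corners is likewise an illustrative choice not spelled out in the patent.

```python
import numpy as np

def map_long_to_short(p1_xy, K1, K2, R, T, depth):
    """Map a pixel (x, y) of the long-focus image into the short-focus image.

    Implements p2 = K2*(R*P + T) with P = depth * K1^(-1) * p1 (homogeneous),
    i.e. the mapping relation f above; `depth` is the distance s1 of the point
    along the long-focus optical axis and must be known or assumed.
    """
    p1 = np.array([p1_xy[0], p1_xy[1], 1.0])
    P = depth * (np.linalg.inv(K1) @ p1)      # back-project into the long-focus frame
    p2 = K2 @ (R @ P + T.reshape(3))          # project into the short-focus camera
    return p2[:2] / p2[2]                     # normalise homogeneous coordinates

def map_box_long_to_short(box, K1, K2, R, T, depth):
    """Map a target frame (cx, cy, w, h) by mapping its two opposite corners."""
    x, y, w, h = box
    x1, y1 = map_long_to_short((x - w / 2, y - h / 2), K1, K2, R, T, depth)
    x2, y2 = map_long_to_short((x + w / 2, y + h / 2), K1, K2, R, T, depth)
    return ((x1 + x2) / 2, (y1 + y2) / 2, abs(x2 - x1), abs(y2 - y1))
```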
further, the step 3 specifically includes the following steps:
Step 3-1, for the i-th target frame Bl detected in the long-focus camera, with position (xl, yl, wl, hl), the position (x's, y's, w's, h's) of the corresponding target frame B's in the short-focus camera image can be obtained according to the mapping relation f; where xl, yl, wl, hl respectively denote the horizontal and vertical coordinates of the target center and the width and height of the target frame, and x's, y's, w's, h's respectively denote the horizontal and vertical coordinates of the mapped target center and the width and height of the mapped target frame.
Step 3-2, calculating the mapped target frame B's(x′s,y′s,w′s,h′s) Target frame B detected from short-focus cameras(xs,ys,ws,hs) The cross-over ratio between IOU when IOU>When the threshold value t is reached, the long and short focal cameras detect the target frame; otherwise, at least one camera does not detect the target, and the calculation formula of the IOU is as follows:
Figure BDA0002783267790000041
Step 3-3, when IOU > the threshold t, the target is detected in both the long-focus and the short-focus camera; considering the deviation of the actual mapping result, the scaling ratios Δw, Δh and the offsets Δx, Δy between B's and Bs need to be calculated as follows:
Δw = ws / w's
Δh = hs / h's
Δx = xs - x's
Δy = ys - y's
Step 3-4, when IOU < the threshold t, the short-focus camera has not detected the target; in this case B's needs to be restored to Bs, i.e. the position of Bs in the short-focus camera is calculated, using the following restoration formulas:
ws = w's * Δw
hs = h's * Δh
xs = x's + Δx
ys = y's + Δy
and 3-5, repeating all the steps from 3-1 to 3-4, and completing target fusion according to the target positions and types in the long-focus camera and the short-focus camera.
The invention has the following advantages and beneficial effects:
the invention provides a target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment. In the field of unmanned driving, the target detection technology based on monocular vision is widely applied. The methods often have the problems of good detection effect of near targets and poor detection effect of far targets. This is because a single camera with a fixed focal length does not adapt well to the detection of objects at different positions, for example, short focus cameras have a wide field of view and distant objects have a small image, and are therefore difficult to detect by depth learning. The long-focus camera has narrow visual field, and a far target is imaged clearly, so that the detection is facilitated for deep learning, but a near target may not be in the long-focus visual field and therefore cannot be detected.
Therefore, the invention detects and fuses targets in the vehicle environment with long-focus and short-focus multiple cameras, which is an effective way of solving these problems. Its advantages are reflected in the following aspects:
(1) The invention uses a short-focus camera and a long-focus camera as sensors for target detection and fusion; compared with target detection based on a single-focal-length camera, it achieves higher accuracy and better results in practical application. The method combines the advantages of the short-focus and long-focus cameras, makes up for the shortcomings of a single-focal-length camera, and improves the accuracy of target detection in the vehicle environment.
(2) Compared with a monocular camera, the long-and-short-focus binocular camera obtains richer visual information of the vehicle environment and better realizes detection of targets at different distances.
(3) On the basis of a self-made Chongqing traffic data set, the method uses the lightweight convolutional neural network YOLOv3-Tiny to detect targets in the images. Compared with the standard YOLOv3 algorithm, it detects faster, can run in real time on embedded edge devices, and still achieves good detection accuracy.
(4) The IOU is commonly used in deep learning target detection to measure the confidence of a target frame; this method innovatively adopts the IOU as the criterion for target matching, which greatly improves the accuracy of target matching while keeping the computational complexity low, making it faster than traditional methods.
Drawings
FIG. 1 is a simplified flowchart of a target detection and fusion method based on a long and short-focus multi-camera vehicle environment according to a preferred embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described in detail and clearly with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present invention.
The technical scheme for solving the technical problems is as follows:
the invention aims to provide a target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment. Through the independent camera of two different focus of installation at intelligent roof portion (keep certain baseline distance between), utilize target detection and fusion technique based on degree of depth learning, overcome the limitation of target detection task under the different distances, the effectual emergence of avoiding the target condition of louing examining, propose this technical scheme, as shown in fig. 1, include following step:
step 1, installing a long-focus and short-focus binocular camera and calibrating a binocular system. And inputting the image sequence acquired by the long-focus and short-focus cameras into the deep learning convolutional neural network, and obtaining the positions of target frames of the long-focus and short-focus binocular cameras in the wide view field and the narrow view field at the same moment through target detection. The method comprises the following specific steps:
the short-focus camera and the long-focus camera are arranged on the top of a vehicle, the focal length of the cameras is set, a camera system is arranged, two cameras with different focal lengths are adopted, the short-focus camera is arranged on the left side, the long-focus camera is arranged on the right side, the length of a base line between the two cameras is b, a long-short-focus binocular vision system is formed, and the binocular vision system is arranged in front of the top of the vehicle.
(2) Calibrate the long-focus and short-focus binocular cameras: a checkerboard calibration plate is placed in front of the binocular cameras, and the checkerboard is required to appear simultaneously in the fields of view of the long-focus and short-focus cameras; the corner points of the checkerboard calibration plate are captured with the binocular cameras, and the intrinsic parameters K1, K2 of the cameras and the extrinsic parameters R and T between the binocular cameras are calculated with the Zhang Zhengyou calibration method. Here K1, K2 are intrinsic parameter matrices containing the focal length, optical center and other information of the cameras, and R and T are the rotation matrix and translation vector of the long-focus camera relative to the short-focus camera, respectively.
(3) Deep learning target detection: the lightweight convolutional neural network YOLOv3-Tiny is used to perform target detection on the images collected by the long-focus and short-focus binocular cameras at the same moment, obtaining the target frame positions under the cameras with different focal lengths.
Step 2, using the camera imaging principle, obtain the mapping relation f of a spatial target point P between the long-focus and short-focus camera pixel coordinate systems from the target position in the narrow field of view of the long-focus camera and the binocular-calibrated intrinsic and extrinsic parameters, and further obtain, for a target position p1 in the narrow field of view of the long-focus camera, the corresponding target position p2 in the wide field of view of the short-focus camera. The specific steps are as follows:
(1) According to the camera imaging principle, establish the relationship between the pixel coordinates p1 and p2 of a spatial point P in the long-focus and short-focus camera pixel coordinate systems, so that target positions not detected in the short-focus camera pixel coordinate system can be restored from the target positions in the long-focus camera pixel coordinate system.
(2) The corresponding relation between the long-focus camera pixel coordinate system and the short-focus camera pixel coordinate system is established by the camera imaging principle, and can be calculated by the following formula:
s1·p1 = K1·P,  s2·p2 = K2·(R·P + T)
where P represents a point in real space; p1 and p2 denote the pixel points corresponding to P in the long-focus and short-focus camera pixel coordinate systems, respectively; K1 and K2 denote the intrinsic parameters of the long-focus and short-focus cameras, respectively; R, T denote the extrinsic parameters between the long-focus and short-focus binocular cameras; and s1, s2 denote the depth of P in the long-focus and short-focus camera coordinate systems, respectively.
If homogeneous coordinates are used, the above equation can be written as follows:
p1 = K1·P,  p2 = K2·(R·P + T)
(3) From the above formulas, the mapping relationship f between p1 and p2 can be obtained:
p2 = K2·R·K1⁻¹·p1 + K2·T
and 3, analyzing the target position detected in the wide field of view of the short-focus camera and the target position corresponding to the target position in the narrow field of view of the long-focus camera in the wide field of view of the short-focus camera obtained in the step 2, and further performing fusion processing on the target frames in the long-focus and short-focus images. The method comprises the following specific steps:
(1) For the i-th target frame Bl detected in the long-focus camera, with position (xl, yl, wl, hl), the position (x's, y's, w's, h's) of the corresponding target frame B's in the short-focus camera image can be obtained according to the mapping relation f; where xl, yl, wl, hl respectively denote the horizontal and vertical coordinates of the target center and the width and height of the target frame, and x's, y's, w's, h's respectively denote the horizontal and vertical coordinates of the mapped target center and the width and height of the mapped target frame.
(2) Calculate the intersection over union IOU between the mapped target frame B's(x's, y's, w's, h's) and the target frame Bs(xs, ys, ws, hs) detected by the short-focus camera. When IOU > the threshold t, both the long-focus and short-focus cameras have detected the target frame; otherwise, at least one camera has not detected the target. The IOU is calculated as follows:
IOU = area(B's ∩ Bs) / area(B's ∪ Bs)
(3) When IOU > the threshold t, the target is detected in both the long-focus and the short-focus camera; considering the deviation of the actual mapping result, the scaling ratios Δw, Δh and the offsets Δx, Δy between B's and Bs need to be calculated. The calculation formulas are as follows:
Δw = ws / w's
Δh = hs / h's
Δx = xs - x's
Δy = ys - y's
(4) When IOU < the threshold t, the short-focus camera has not detected the target; in this case B's needs to be restored to Bs, i.e. the position of Bs in the short-focus camera is calculated, using the following restoration formulas:
ws = w's * Δw
hs = h's * Δh
xs = x's + Δx
ys = y's + Δy
(5) Repeat steps (1) to (4) for all targets to complete target fusion according to the target positions and types in the long-focus and short-focus cameras.
The method illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above examples are to be construed as merely illustrative and not limitative of the remainder of the disclosure. After reading the description of the invention, the skilled person can make various changes or modifications to the invention, and these equivalent changes and modifications also fall into the scope of the invention defined by the claims.

Claims (7)

1. A target detection and fusion method based on a long-focus and short-focus multi-camera vehicle environment is characterized by comprising the following steps:
step 1, installing a long-focus camera and a short-focus camera respectively to form a binocular system, and calibrating the binocular system; inputting the image sequences acquired by the long-focus and short-focus cameras into a deep learning convolutional neural network, and obtaining through target detection the target frame positions in the wide field of view of the short-focus camera and the narrow field of view of the long-focus camera at the same moment;
step 2, using the camera imaging principle, obtaining the mapping relation f of a spatial target point P between the long-focus and short-focus camera pixel coordinate systems from the target position in the narrow field of view of the long-focus camera and the binocular-calibrated intrinsic and extrinsic parameters, and obtaining, for a target position p1 in the narrow field of view of the long-focus camera, the corresponding target position p2 in the wide field of view of the short-focus camera;
and step 3, fusing the target frames in the long-focus and short-focus images by comparing the target positions detected in the wide field of view of the short-focus camera with the positions, obtained in step 2, that the targets in the narrow field of view of the long-focus camera correspond to in the wide field of view of the short-focus camera.
2. The method for detecting and fusing the targets in the long-and-short-focus-based multi-camera vehicle environment according to claim 1, wherein the step 1) specifically comprises the following steps:
2-1, setting different focal lengths of the long and short-focus cameras, installing a binocular camera system to the same height above the vehicle, and reserving a certain baseline distance between the binocular cameras;
step 2-2, calibrating with the Zhang Zhengyou calibration method to obtain the intrinsic parameter K and the extrinsic parameters R, T of the binocular system; wherein K is the intrinsic parameter matrix containing the focal length, optical center and other information of the camera, and R and T are the rotation matrix and translation vector of the long-focus camera relative to the short-focus camera, respectively;
and 2-3, performing deep learning target detection, and performing target detection on images collected by the long-focus binocular camera and the short-focus binocular camera at the same moment by adopting a lightweight convolutional neural network YOLOv3-Tiny, wherein the method specifically comprises the following steps: and (4) data set making, transfer learning, network reasoning and target detection are carried out to obtain the positions of target frames under cameras with different focal lengths.
3. The method for detecting and fusing targets under a vehicle environment based on long-focus and short-focus multiple cameras as claimed in claim 2, wherein in step 2-1 the camera focal lengths are set and the camera system is installed: two cameras with different focal lengths are adopted, the short-focus camera is placed on the left and the long-focus camera on the right, with a baseline of length b between the two cameras, forming a long-and-short-focus binocular vision system that is placed at the front of the vehicle roof.
4. The method for target detection and fusion based on a long-and-short-focus multi-camera vehicle environment according to claim 2, wherein in step 2-2) the long-focus and short-focus binocular cameras are calibrated: a checkerboard calibration plate is placed in front of the binocular cameras, and the checkerboard is required to appear simultaneously in the fields of view of the long-focus and short-focus cameras; the corner points of the checkerboard calibration plate are captured with the binocular cameras, and the intrinsic parameters K1, K2 of the cameras and the extrinsic parameters R and T between the binocular cameras are calculated with the Zhang Zhengyou calibration method.
5. The method for detecting and fusing targets under the long-and-short-focus multi-camera vehicle environment as claimed in claim 2, wherein the specific process of data set making in step 2-3 is: a self-collected and labeled Chongqing urban traffic data set is merged with the open-source Pascal VOC 2012 data set, and the merged data set is then augmented to obtain more training samples;
the specific process of transfer learning is: the merged data set is loaded and the YOLOv3-Tiny network is trained on the basis of an existing pre-trained model;
network inference and target detection means that, during normal operation of the intelligent vehicle, the YOLOv3-Tiny network loads the trained model weights and performs forward inference to complete the target detection task.
6. The method for detecting and fusing targets in the long-focus and short-focus multi-camera vehicle environment according to one of claims 1 to 5, wherein the step 2 establishes the correspondence relationship between the long-focus camera pixel coordinate system and the short-focus camera pixel coordinate system according to the camera imaging principle, and can be calculated according to the following formula:
s1·p1 = K1·P,  s2·p2 = K2·(R·P + T)
where P represents a point in real space; p1 and p2 denote the pixel points corresponding to P in the long-focus and short-focus camera pixel coordinate systems, respectively; K1 and K2 denote the intrinsic parameters of the long-focus and short-focus cameras, respectively; R, T denote the extrinsic parameters between the long-focus and short-focus binocular cameras; and s1, s2 denote the depth of P in the long-focus and short-focus camera coordinate systems, respectively;
when using homogeneous coordinates, the above equation is written as follows:
p1 = K1·P,  p2 = K2·(R·P + T)
from the above formulas, the mapping relationship f between p1 and p2 can be obtained:
p2 = K2·R·K1⁻¹·p1 + K2·T
7. the method for detecting and fusing targets in the long-and-short-focus-based multi-camera vehicle environment according to claim 6, wherein the step 3 specifically comprises the following steps:
step 3-1, for the i-th target frame Bl detected in the long-focus camera, with position (xl, yl, wl, hl), the position (x's, y's, w's, h's) of the corresponding target frame B's in the short-focus camera image can be obtained according to the mapping relation f; where xl, yl, wl, hl respectively denote the horizontal and vertical coordinates of the target center and the width and height of the target frame, and x's, y's, w's, h's respectively denote the horizontal and vertical coordinates of the mapped target center and the width and height of the mapped target frame;
step 3-2, calculate the intersection over union IOU between the mapped target frame B's(x's, y's, w's, h's) and the target frame Bs(xs, ys, ws, hs) detected by the short-focus camera; when IOU > the threshold t, both the long-focus and short-focus cameras have detected the target frame; otherwise, at least one camera has not detected the target, the IOU being calculated as follows:
IOU = area(B's ∩ Bs) / area(B's ∪ Bs)
step 3-3, when IOU > the threshold t, the target is detected in both the long-focus and the short-focus camera; considering the deviation of the actual mapping result, the scaling ratios Δw, Δh and the offsets Δx, Δy between B's and Bs need to be calculated as follows:
Δw = ws / w's
Δh = hs / h's
Δx = xs - x's
Δy = ys - y's
step 3-4, when IOU < the threshold t, the short-focus camera has not detected the target; in this case B's needs to be restored to Bs, i.e. the position of Bs in the short-focus camera is calculated, using the following restoration formulas:
ws = w's * Δw
hs = h's * Δh
xs = x's + Δx
ys = y's + Δy
and step 3-5, repeat steps 3-1 to 3-4 for all targets, and complete target fusion according to the target positions and types in the long-focus and short-focus cameras.
CN202011288888.5A 2020-11-17 2020-11-17 Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment Pending CN112364793A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011288888.5A CN112364793A (en) 2020-11-17 2020-11-17 Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011288888.5A CN112364793A (en) 2020-11-17 2020-11-17 Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment

Publications (1)

Publication Number Publication Date
CN112364793A true CN112364793A (en) 2021-02-12

Family

ID=74532438

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011288888.5A Pending CN112364793A (en) 2020-11-17 2020-11-17 Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment

Country Status (1)

Country Link
CN (1) CN112364793A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113223094A (en) * 2021-05-24 2021-08-06 深圳市智像科技有限公司 Binocular imaging system, control method and device thereof, and storage medium
CN116778360A (en) * 2023-06-09 2023-09-19 北京科技大学 Ground target positioning method and device for flapping-wing flying robot

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171758A (en) * 2018-01-16 2018-06-15 重庆邮电大学 Polyphaser scaling method based on minimum time principle and transparent glass scaling board
CN108257161A (en) * 2018-01-16 2018-07-06 重庆邮电大学 Vehicle environmental three-dimensionalreconstruction and movement estimation system and method based on polyphaser
CN109163657A (en) * 2018-06-26 2019-01-08 浙江大学 A kind of circular target position and posture detection method rebuild based on binocular vision 3 D
CN109165629A (en) * 2018-09-13 2019-01-08 百度在线网络技术(北京)有限公司 It is multifocal away from visual barrier cognitive method, device, equipment and storage medium
CN109448054A (en) * 2018-09-17 2019-03-08 深圳大学 The target Locate step by step method of view-based access control model fusion, application, apparatus and system
CN109815886A (en) * 2019-01-21 2019-05-28 南京邮电大学 A kind of pedestrian and vehicle checking method and system based on improvement YOLOv3
WO2019198381A1 (en) * 2018-04-13 2019-10-17 ソニー株式会社 Information processing device, information processing method, and program
CN110378210A (en) * 2019-06-11 2019-10-25 江苏大学 A kind of vehicle and car plate detection based on lightweight YOLOv3 and long short focus merge distance measuring method
CN110532937A (en) * 2019-08-26 2019-12-03 北京航空航天大学 Method for distinguishing is known to targeting accuracy with before disaggregated model progress train based on identification model
CN111210478A (en) * 2019-12-31 2020-05-29 重庆邮电大学 Method, medium and system for calibrating external parameters of common-view-free multi-camera system

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171758A (en) * 2018-01-16 2018-06-15 重庆邮电大学 Polyphaser scaling method based on minimum time principle and transparent glass scaling board
CN108257161A (en) * 2018-01-16 2018-07-06 重庆邮电大学 Vehicle environmental three-dimensionalreconstruction and movement estimation system and method based on polyphaser
WO2019198381A1 (en) * 2018-04-13 2019-10-17 ソニー株式会社 Information processing device, information processing method, and program
CN109163657A (en) * 2018-06-26 2019-01-08 浙江大学 A kind of circular target position and posture detection method rebuild based on binocular vision 3 D
CN109165629A (en) * 2018-09-13 2019-01-08 百度在线网络技术(北京)有限公司 It is multifocal away from visual barrier cognitive method, device, equipment and storage medium
US20200089976A1 (en) * 2018-09-13 2020-03-19 Baidu Online Network Technology (Beijing) Co., Ltd. Method and device of multi-focal sensing of an obstacle and non-volatile computer-readable storage medium
CN109448054A (en) * 2018-09-17 2019-03-08 深圳大学 The target Locate step by step method of view-based access control model fusion, application, apparatus and system
CN109815886A (en) * 2019-01-21 2019-05-28 南京邮电大学 A kind of pedestrian and vehicle checking method and system based on improvement YOLOv3
CN110378210A (en) * 2019-06-11 2019-10-25 江苏大学 A kind of vehicle and car plate detection based on lightweight YOLOv3 and long short focus merge distance measuring method
CN110532937A (en) * 2019-08-26 2019-12-03 北京航空航天大学 Method for distinguishing is known to targeting accuracy with before disaggregated model progress train based on identification model
CN111210478A (en) * 2019-12-31 2020-05-29 重庆邮电大学 Method, medium and system for calibrating external parameters of common-view-free multi-camera system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
SAFWAN WSHAH: "Deep Learning for Model Parameter Calibration in Power Systems", 2020 IEEE International Conference on Power Systems Technology (POWERCON) *
李星辰: "Multi-target tracking algorithm fusing YOLO detection", Computer Engineering and Science (计算机工程与科学) *
贾祥: "Research on vehicle three-dimensional environment reconstruction method based on binocular vision", China Masters' Theses Full-text Database (中国优秀硕士论文全文数据库) *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113223094A (en) * 2021-05-24 2021-08-06 深圳市智像科技有限公司 Binocular imaging system, control method and device thereof, and storage medium
CN116778360A (en) * 2023-06-09 2023-09-19 北京科技大学 Ground target positioning method and device for flapping-wing flying robot
CN116778360B (en) * 2023-06-09 2024-03-19 北京科技大学 Ground target positioning method and device for flapping-wing flying robot

Similar Documents

Publication Publication Date Title
CN109308693B (en) Single-binocular vision system for target detection and pose measurement constructed by one PTZ camera
CN110889829B (en) Monocular distance measurement method based on fish eye lens
CN106529587B (en) Vision course recognition methods based on object detection
CN105758426A (en) Combined calibration method for multiple sensors of mobile robot
CN111738071B (en) Inverse perspective transformation method based on motion change of monocular camera
CN114089329A (en) Target detection method based on fusion of long and short focus cameras and millimeter wave radar
CN106920247A (en) A kind of method for tracking target and device based on comparison network
CN111260539A (en) Fisheye pattern target identification method and system
CN110779491A (en) Method, device and equipment for measuring distance of target on horizontal plane and storage medium
CN111932627B (en) Marker drawing method and system
CN112364793A (en) Target detection and fusion method based on long-focus and short-focus multi-camera vehicle environment
Cvišić et al. Recalibrating the KITTI dataset camera setup for improved odometry accuracy
CN111967396A (en) Processing method, device and equipment for obstacle detection and storage medium
CN111950370B (en) Dynamic environment offline visual milemeter expansion method
CN112115913B (en) Image processing method, device and equipment and storage medium
CN111239684A (en) Binocular fast distance measurement method based on YoloV3 deep learning
CN110992424A (en) Positioning method and system based on binocular vision
CN111161305A (en) Intelligent unmanned aerial vehicle identification tracking method and system
CN104463240A (en) Method and device for controlling list interface
CN110197104B (en) Distance measurement method and device based on vehicle
CN111724432B (en) Object three-dimensional detection method and device
CN111598956A (en) Calibration method, device and system
CN110864670A (en) Method and system for acquiring position of target obstacle
CN114140659B (en) Social distance monitoring method based on human body detection under unmanned aerial vehicle visual angle
CN115790568A (en) Map generation method based on semantic information and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210212