CN117635683A - Trolley indoor positioning method based on multiple cameras - Google Patents

Trolley indoor positioning method based on multiple cameras

Info

Publication number
CN117635683A
CN117635683A
Authority
CN
China
Prior art keywords
camera
trolley
image
coordinates
points
Prior art date
Legal status
Pending
Application number
CN202311706393.3A
Other languages
Chinese (zh)
Inventor
李倩迪
姚焙继
高桓
吴冶
Current Assignee
Nanjing Yingqi Intelligent Technology Co ltd
Original Assignee
Nanjing Yingqi Intelligent Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Nanjing Yingqi Intelligent Technology Co ltd
Priority to CN202311706393.3A
Publication of CN117635683A

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00: Road transport of goods or passengers
    • Y02T 10/10: Internal combustion engine [ICE] based vehicles
    • Y02T 10/40: Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a multi-camera-based trolley indoor positioning method, which comprises the following steps: step 1, calibrating the intrinsic and extrinsic parameters of multiple cameras and acquiring three-dimensional spatial information from two-dimensional images, where camera calibration yields the mapping between image pixels and objects in space so that spatial coordinates can be computed from pixel coordinates; step 2, detecting trolley feature points based on an improved YOLOv5 model; and step 3, triangulating the pixel coordinates of the feature points detected in step 2 to obtain the depth information of the trolley and complete the indoor positioning objective. In the invention, the multi-camera system captures information from multiple angles and viewpoints simultaneously, reducing errors and providing higher positioning accuracy; moreover, in complex indoor environments the multi-camera system copes with occlusions and multipath effects through its multiple viewpoints and thus provides stable position estimates.

Description

Trolley indoor positioning method based on multiple cameras
Technical Field
The invention relates to the technical field of indoor positioning, in particular to a trolley indoor positioning method based on multiple cameras.
Background
Against the broad background of automotive intelligence, autonomous driving is regarded as a key technology for the future of transportation: it can improve traffic safety, reduce congestion, and provide more travel options. The Baidu Apollo trolley is a test vehicle carrying autonomous driving technology, used to test and verify the autonomous driving system on real roads. The Apollo trolley's positioning module depends on an IMU, GPS, lidar, radar, and a high-precision map; these sensors support GNSS positioning and LiDAR positioning simultaneously, with GNSS positioning outputting position and velocity information and LiDAR positioning outputting position and heading information. The lack of GPS information indoors, however, poses a challenge for research on indoor autonomous driving of Apollo trolleys.
Existing indoor positioning technology covers a variety of methods and application fields, such as Wi-Fi, Bluetooth, ultrasonic, and inertial navigation approaches, and is often applied to indoor navigation, indoor positioning, indoor remote control, and similar fields, offering some solutions to positioning and navigation problems in indoor environments. However, current indoor positioning technologies suffer various shortcomings in practice: accuracy in complex environments is limited, and signal strength may be affected by obstacles, interference, or multipath propagation, increasing positioning error. Inertial navigation is susceptible to accumulated error, leading to drift. Some high-precision indoor positioning techniques, such as ultrasound- or laser-based systems, are costly: deploying them requires expensive hardware and infrastructure investment, limiting their feasibility in large-scale applications.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a multi-camera-based trolley indoor positioning method in which the multi-camera system captures information from multiple angles and viewpoints simultaneously, reducing errors and providing higher positioning accuracy, while in complex indoor environments the multi-camera system copes with occlusions and multipath effects through its multiple viewpoints to provide stable position estimation.
To solve the above technical problem, the invention provides a multi-camera-based trolley indoor positioning method comprising the following steps:
step 1, calibrating the intrinsic and extrinsic parameters of the multiple cameras and acquiring three-dimensional spatial information from two-dimensional images: camera calibration yields the mapping between image pixels and objects in space, allowing spatial coordinates to be computed from pixel coordinates;
step 2, detecting trolley feature points based on an improved YOLOv5 model;
and step 3, triangulating the pixel coordinates of the feature points detected in step 2 to obtain the depth information of the trolley and complete the indoor positioning objective.
Preferably, in step 1, calibrating the intrinsic and extrinsic camera parameters and acquiring three-dimensional spatial information from the two-dimensional image specifically comprises the following steps:
step 11, acquiring the camera intrinsic parameters and converting the camera coordinate system to the image coordinate system;
step 12, acquiring the camera extrinsic parameters and converting the world coordinate system to the camera coordinate system;
step 13, constructing the perspective projection matrix from the intrinsic and extrinsic matrices to complete camera calibration; the perspective projection matrix associates pixel coordinates on the image with actual coordinates in three-dimensional space, so that points in the image are mapped into three-dimensional space, realizing the camera's measurement and computation functions.
Preferably, in step 11, acquiring the camera intrinsic parameters and implementing the conversion from the camera coordinate system to the image coordinate system specifically comprises the following steps:
step 111, fixedly mounting four Kinect cameras at the four corner positions of the indoor positioning site to obtain position information of the trolley from multiple angles;
and step 112, calling the pyk4a interface to obtain the Kinect camera intrinsic matrix K.
Preferably, in step 12, acquiring the camera extrinsic parameters and implementing the conversion from the world coordinate system to the camera coordinate system specifically comprises the following steps:
step 121, recording a video of the indoor positioning site with the multiple cameras and splitting the video file into individual checkerboard images;
step 122, loading the images, selecting four vertex positions in each image, and calling the cv2.findHomography() function to compute the camera homography matrix, which maps points on the image to points in the real world;
step 123, decomposing the camera homography matrix to obtain the camera extrinsic parameters, comprising the rotation matrix R and the translation vector T, where:
$T = (t_x, t_y, t_z)^T$ (2).
Preferably, in step 13, constructing the perspective projection matrix from the intrinsic and extrinsic matrices to complete camera calibration specifically comprises the following steps:
step 131, let the coordinates of a point in the world coordinate system be $P_w = (x_w, y_w, z_w)^T$ and its coordinates in the camera coordinate system be $P_c = (x_c, y_c, z_c)^T$; then:
$P_c = R\,P_w + T$
step 132, if the pixel coordinates in the image coordinate system are $(u, v)$, then:
$z_c\,(u, v, 1)^T = K\,P_c$
where $K$ is the camera intrinsic matrix, $c_x$ and $c_y$ denote the offset of the camera optical axis in the pixel coordinate system, and $f_x$ and $f_y$ are the normalized focal lengths along the u-axis and v-axis.
Preferably, in step 2, detecting trolley feature points based on the improved YOLOv5 model specifically comprises the following steps:
step 21, setting the trolley feature points: four balls of different colors are bound at the front and rear corners of the trolley to serve as trolley feature points; the forward and backward directions of the trolley are determined by the colors of the feature points and the position of the trolley by the positions of the feature points; the trolley itself is also added as a feature point to impose a position constraint on the other feature points;
step 22, constructing a small-target detection deep learning model based on improved YOLOv5: the YOLOv5 model is improved in three respects, namely the feature extraction model, the loss function module, and the non-maximum suppression (NMS) module, effectively enhancing its detection accuracy for small targets;
step 23, constructing the trolley detection data set;
and step 24, training the YOLOv5-based target detection deep learning model constructed in step 22.
Preferably, in step 22, the feature extraction model is improved: because the trolley feature points occupy a small proportion of the whole image in the camera view, a 4× downsampling step for the original input picture is added to the YOLOv5 backbone network, and the 4×-downsampled picture is fed into the feature fusion network to obtain a feature map of a new size; this feature map has a smaller receptive field and relatively rich positional information, which improves small-target detection;
the loss function module is improved: the EIoU loss function is adopted; on the basis of CIoU the aspect-ratio influence factor is split apart so that the width and the height of the target box and the anchor box are regressed separately, and Focal loss is added to focus on high-quality anchor boxes; the EIoU loss consists of three parts, namely overlap loss, center-distance loss, and width-height loss; the first two follow the CIoU approach, while the width-height loss directly minimizes the difference between the widths and heights of the target box and the anchor box, giving faster convergence; the EIoU loss function formula is as follows:
$L_{EIoU} = 1 - IoU(A,B) + \frac{\rho^2(b,\,b^{gt})}{c^2} + \frac{\rho^2(w,\,w^{gt})}{w_c^2} + \frac{\rho^2(h,\,h^{gt})}{h_c^2}$
where $IoU(A,B)$ denotes the ratio of the intersection to the union of the two rectangular boxes A and B, the commonly used IoU value, a standard measure of the similarity between a predicted bounding box and a ground-truth bounding box; $\rho^2(b, b^{gt})$ is the squared distance between the center point $b$ of the predicted bounding box and the center point $b^{gt}$ of the ground-truth bounding box; $\rho^2(h, h^{gt})$ and $\rho^2(w, w^{gt})$ are the squared differences between the height $h$ and width $w$ of the predicted bounding box and the height $h^{gt}$ and width $w^{gt}$ of the ground-truth bounding box, respectively; and $c$, $w_c$, and $h_c$ are the diagonal length, width, and height of the smallest box enclosing the predicted and ground-truth boxes.
The non-maximum suppression (NMS) module is improved: the NMS module is used in the prediction stage of target detection to merge overlapping bounding boxes of the same target, and DIoU, which additionally considers the distance between the center points of the two boxes, replaces IoU as the NMS criterion:
$s_i = \begin{cases} s_i, & DIoU(M, B_i) < \varepsilon \\ 0, & DIoU(M, B_i) \ge \varepsilon \end{cases}$
where $s_i$ is the classification confidence, $\varepsilon$ is the NMS threshold, and $M$ is the box with the highest confidence; under DIoU-NMS, bounding boxes whose predicted center points are far apart are kept as different feature points, reducing missed detections.
Preferably, in step 23, constructing the trolley detection data set specifically comprises the following steps:
step 231, acquiring trolley videos from multiple angles with the four Kinect cameras installed in step 1 and splitting the videos into images;
step 232, semi-automatically labeling the feature points in the acquired trolley images with the labeling tool LabelImg;
step 233, converting the labeled data set into the YOLO label format used by the YOLOv5 model;
step 234, constructing the trolley detection training, validation, and test sets by randomly selecting 80% of the trolley detection data set as the training set, 10% as the validation set, and 10% as the test set.
Preferably, in step 24, training the YOLOv5-based target detection deep learning model constructed in step 22 specifically comprises the following steps:
step 241, setting the training parameters: training uses the stochastic optimization algorithm Adam, with the training batch size set to Batch = 64, Momentum = 0.9, the learning rate initially set to lr = 0.001, and the number of training iterations epoch = 300;
step 242, feeding the trolley detection data set constructed in step 23 into the YOLOv5-based target detection deep learning model constructed in step 22;
step 243, training the YOLOv5-based target detection deep learning model, adjusting the learning rate and the number of iterations according to the trends of the average precision and the loss under cross-validation on the training and validation sets until both gradually stabilize, thereby determining the final learning rate and number of iterations;
and step 244, completing the training of the YOLOv5-based target detection deep learning model with the determined learning rate and number of iterations, obtaining a well-converged YOLOv5-based target detection deep learning model.
Preferably, in step 3, triangulating the pixel coordinates of the feature points detected in step 2 to obtain the depth information of the trolley and complete the indoor positioning objective specifically comprises the following steps:
step 31, for each frame of image, detecting the trolley feature points with the trained target detection model and acquiring the position and category information of the feature points in the image;
step 32, triangulating the pixel positions of the same feature point in different cameras, calculating the coordinates of the feature point in three-dimensional space, and obtaining the depth information of the feature point: a group of images captured by the multiple cameras at the same moment is acquired, and a point on one image is designated as the point to be reconstructed; the coordinates of the feature point in each image are determined by category consistency, and the corresponding three-dimensional coordinates of each feature point are calculated using the triangulation principle;
and step 33, continuously detecting the trolley feature points in the real-time images and outputting the three-dimensional coordinates of the trolley to the trolley in real time via gRPC, realizing indoor autonomous driving of the trolley.
The beneficial effects of the invention are as follows: the depth images obtained from the four Kinect cameras and the multi-camera system provide high-quality indoor positioning; the multi-camera system reduces errors and improves positioning accuracy without requiring additional base stations or beacons. Binding balls of different colors to the trolley as target feature points improves the reliability of target recognition and positioning, and this diversity helps achieve reliable positioning under various lighting and scene conditions. Feature point detection is performed with an improved YOLOv5 model, whose object detection capability can be used directly to label the ball feature points, simplifying the preparation of training data; since YOLOv5 has multi-target detection capability, it can detect several differently colored balls on the trolley at the same time and accurately position the trolley from the relative positions and depth information of the feature points. The improved YOLOv5 model is not limited to the specific ball feature points and can easily be extended to feature points of other shapes, sizes, and colors, providing greater flexibility; and because no additional base stations or beacons are relied upon, the deployment and maintenance costs of the system are greatly reduced while its portability and scalability are increased. The detection of trolley feature points combined with the improved YOLOv5 model therefore brings improvements in real-time performance, accuracy, and robustness to the indoor positioning method while simplifying system deployment and maintenance. This multi-camera indoor positioning technology combines the advantages of high precision, stability, low cost, and autonomous navigation, offering a more feasible and reliable solution to the indoor positioning problem and a new indoor positioning solution for the fields of autonomous driving and robot navigation.
Drawings
FIG. 1 is a schematic flow chart of the trolley indoor positioning method of the present invention.
FIG. 2 is a schematic diagram of the improved feature extraction module of the present invention.
FIG. 3 is a schematic diagram of the LabelImg labeling interface of the present invention.
FIG. 4 is a schematic diagram of the model evaluation results of the present invention.
FIG. 5 is a schematic diagram of the trolley indoor positioning error analysis of the present invention.
FIG. 6 shows an example of trolley target detection according to the present invention.
FIG. 7 is a schematic diagram of the feature-point-constrained multi-image three-dimensional reconstruction method of the present invention.
Detailed Description
As shown in FIG. 1, a multi-camera-based trolley indoor positioning method comprises the following steps:
step 1, calibrating the intrinsic and extrinsic parameters of the multiple cameras and acquiring three-dimensional spatial information from two-dimensional images: camera calibration yields the mapping between image pixels and objects in space, allowing spatial coordinates to be computed from pixel coordinates;
calibrating the intrinsic and extrinsic camera parameters and acquiring three-dimensional spatial information from the two-dimensional image specifically comprises the following steps:
step 11, acquiring the camera intrinsic parameters and converting the camera coordinate system to the image coordinate system, specifically comprising the following steps:
step 111, fixedly mounting four Kinect cameras at the four corner positions of the indoor positioning site to obtain position information of the trolley from multiple angles;
and step 112, calling the pyk4a interface to obtain the Kinect camera intrinsic matrix K.
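By way of illustration, the following minimal Python sketch shows how step 112 might look with the pyk4a bindings for the Kinect; the Calibration.get_camera_matrix() call is an assumption about the pyk4a API, not part of the original disclosure.

```python
from pyk4a import PyK4A, CalibrationType

k4a = PyK4A()    # one PyK4A instance per connected Kinect camera
k4a.start()

# 3x3 intrinsic matrix [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] of the color camera
K = k4a.calibration.get_camera_matrix(CalibrationType.COLOR)
print(K)

k4a.stop()
```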
Step 12, acquiring the camera extrinsic parameters and converting the world coordinate system to the camera coordinate system, specifically comprising the following steps:
step 121, recording a video of the indoor positioning site with the multiple cameras and splitting the video file into individual checkerboard images;
step 122, loading the images, selecting four vertex positions in each image, and calling the cv2.findHomography() function to compute the camera homography matrix, which maps points on the image to points in the real world;
step 123, decomposing the camera homography matrix to obtain the camera extrinsic parameters, comprising the rotation matrix R and the translation vector T, where:
$T = (t_x, t_y, t_z)^T$ (2).
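As an illustrative sketch of steps 122 and 123 (not the original code), the homography can be computed with cv2.findHomography() and decomposed into R and T by the standard plane-based method, assuming the selected vertices lie on the ground plane (Z = 0); the pixel and world coordinates and the intrinsic matrix K below are placeholder values.

```python
import numpy as np
import cv2

# Placeholder inputs: four vertices selected in the image and their assumed
# ground-plane (Z = 0) world coordinates, plus the intrinsic matrix K from step 112.
img_pts = np.array([[102, 88], [532, 91], [540, 410], [95, 405]], dtype=np.float32)
world_pts = np.array([[0, 0], [4, 0], [4, 3], [0, 3]], dtype=np.float32)
K = np.array([[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]])

H, _ = cv2.findHomography(world_pts, img_pts)   # maps world-plane points to pixels

# Plane-based decomposition: for Z = 0, H ~ K [r1 r2 t]
B = np.linalg.inv(K) @ H
lam = 1.0 / np.linalg.norm(B[:, 0])             # scale fixed by the unit norm of r1
r1, r2 = lam * B[:, 0], lam * B[:, 1]
r3 = np.cross(r1, r2)                           # complete the rotation matrix
R = np.column_stack([r1, r2, r3])
T = lam * B[:, 2]                               # translation vector (tx, ty, tz)^T
```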
Step 13, constructing the perspective projection matrix from the intrinsic and extrinsic matrices to complete camera calibration; the perspective projection matrix associates pixel coordinates on the image with actual coordinates in three-dimensional space, so that points in the image are mapped into three-dimensional space, realizing the camera's measurement and computation functions. This specifically comprises the following steps:
step 131, let the coordinates of a point in the world coordinate system be $P_w = (x_w, y_w, z_w)^T$ and its coordinates in the camera coordinate system be $P_c = (x_c, y_c, z_c)^T$; then:
$P_c = R\,P_w + T$
step 132, if the pixel coordinates in the image coordinate system are $(u, v)$, then:
$z_c\,(u, v, 1)^T = K\,P_c$
where $K$ is the camera intrinsic matrix, $c_x$ and $c_y$ denote the offset of the camera optical axis in the pixel coordinate system, and $f_x$ and $f_y$ are the normalized focal lengths along the u-axis and v-axis.
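A short sketch of the projection chain of steps 131 and 132 follows, assuming R, T, and K obtained from the calibration above:

```python
import numpy as np

def world_to_pixel(P_w, R, T, K):
    """Project a world point to pixel coordinates: z_c * (u, v, 1)^T = K (R P_w + T)."""
    P_c = R @ P_w + T           # world coordinate system -> camera coordinate system
    uvw = K @ P_c               # camera coordinates -> homogeneous pixel coordinates
    return uvw[:2] / uvw[2]     # divide by z_c to obtain (u, v)
```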
Step 2, detecting trolley feature points based on the improved YOLOv5 model, specifically comprising the following steps:
step 21, setting the trolley feature points: four balls of different colors are bound at the front and rear corners of the trolley to serve as trolley feature points; the forward and backward directions of the trolley are determined by the colors of the feature points and the position of the trolley by the positions of the feature points; the trolley itself is also added as a feature point to impose a position constraint on the other feature points;
step 22, constructing a small-target detection deep learning model based on improved YOLOv5: the YOLOv5 model is improved in three respects, namely the feature extraction model, the loss function module, and the non-maximum suppression (NMS) module, effectively enhancing its detection accuracy for small targets;
the feature extraction model is improved: because the trolley feature points occupy a small proportion of the whole image in the camera view, a 4× downsampling step for the original input picture is added to the YOLOv5 backbone network, and the 4×-downsampled picture is fed into the feature fusion network to obtain a feature map of a new size; this feature map has a smaller receptive field and relatively rich positional information, which improves small-target detection, as shown in FIG. 2;
the loss function module is improved: the EIoU loss function is adopted; on the basis of CIoU the aspect-ratio influence factor is split apart so that the width and the height of the target box and the anchor box are regressed separately, and Focal loss is added to focus on high-quality anchor boxes; the EIoU loss consists of three parts, namely overlap loss, center-distance loss, and width-height loss; the first two follow the CIoU approach, while the width-height loss directly minimizes the difference between the widths and heights of the target box and the anchor box, giving faster convergence; the EIoU loss function formula is as follows:
$L_{EIoU} = 1 - IoU(A,B) + \frac{\rho^2(b,\,b^{gt})}{c^2} + \frac{\rho^2(w,\,w^{gt})}{w_c^2} + \frac{\rho^2(h,\,h^{gt})}{h_c^2}$
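For illustration, a PyTorch sketch of the EIoU loss implementing the formula above for boxes in (x1, y1, x2, y2) format; this is an assumed implementation, not the original training code.

```python
import torch

def eiou_loss(pred, target, eps=1e-7):
    """EIoU loss for boxes in (x1, y1, x2, y2) format, following the formula above."""
    # overlap (IoU) term
    iw = (torch.min(pred[..., 2], target[..., 2]) - torch.max(pred[..., 0], target[..., 0])).clamp(0)
    ih = (torch.min(pred[..., 3], target[..., 3]) - torch.max(pred[..., 1], target[..., 1])).clamp(0)
    inter = iw * ih
    area_p = (pred[..., 2] - pred[..., 0]) * (pred[..., 3] - pred[..., 1])
    area_t = (target[..., 2] - target[..., 0]) * (target[..., 3] - target[..., 1])
    iou = inter / (area_p + area_t - inter + eps)

    # smallest enclosing box: width w_c, height h_c, squared diagonal c^2
    cw = torch.max(pred[..., 2], target[..., 2]) - torch.min(pred[..., 0], target[..., 0])
    ch = torch.max(pred[..., 3], target[..., 3]) - torch.min(pred[..., 1], target[..., 1])
    c2 = cw ** 2 + ch ** 2 + eps

    # center-distance term rho^2(b, b_gt) / c^2
    dx = (pred[..., 0] + pred[..., 2] - target[..., 0] - target[..., 2]) / 2
    dy = (pred[..., 1] + pred[..., 3] - target[..., 1] - target[..., 3]) / 2
    rho2 = dx ** 2 + dy ** 2

    # width-height terms rho^2(w, w_gt) / w_c^2 and rho^2(h, h_gt) / h_c^2
    dw = (pred[..., 2] - pred[..., 0]) - (target[..., 2] - target[..., 0])
    dh = (pred[..., 3] - pred[..., 1]) - (target[..., 3] - target[..., 1])

    return 1 - iou + rho2 / c2 + dw ** 2 / (cw ** 2 + eps) + dh ** 2 / (ch ** 2 + eps)
```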
The non-maximum suppression (NMS) module is improved: the NMS module is used in the prediction stage of target detection to merge overlapping bounding boxes of the same target, and DIoU, which additionally considers the distance between the center points of the two boxes, replaces IoU as the NMS criterion:
$s_i = \begin{cases} s_i, & DIoU(M, B_i) < \varepsilon \\ 0, & DIoU(M, B_i) \ge \varepsilon \end{cases}$
where $s_i$ is the classification confidence, $\varepsilon$ is the NMS threshold, and $M$ is the box with the highest confidence; under DIoU-NMS, bounding boxes whose predicted center points are far apart are kept as different feature points, reducing missed detections.
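Similarly, a hedged sketch of DIoU-NMS implementing the decision rule above, using torchvision's box_iou; boxes whose DIoU with the highest-confidence box M reaches the threshold ε are suppressed.

```python
import torch
from torchvision.ops import box_iou

def diou_nms(boxes, scores, eps=0.45):
    """Greedy NMS with DIoU as the criterion; boxes are (N, 4) in (x1, y1, x2, y2)."""
    order = scores.argsort(descending=True)
    keep = []
    while order.numel() > 0:
        m = order[0]                        # M: current highest-confidence box
        keep.append(int(m))
        if order.numel() == 1:
            break
        rest = order[1:]
        iou = box_iou(boxes[m].unsqueeze(0), boxes[rest]).squeeze(0)
        # squared center distance over squared diagonal of the enclosing box
        cm = (boxes[m, :2] + boxes[m, 2:]) / 2
        cr = (boxes[rest, :2] + boxes[rest, 2:]) / 2
        rho2 = ((cm - cr) ** 2).sum(dim=1)
        cw = torch.max(boxes[m, 2], boxes[rest, 2]) - torch.min(boxes[m, 0], boxes[rest, 0])
        ch = torch.max(boxes[m, 3], boxes[rest, 3]) - torch.min(boxes[m, 1], boxes[rest, 1])
        diou = iou - rho2 / (cw ** 2 + ch ** 2 + 1e-7)
        order = rest[diou < eps]            # s_i survives only if DIoU(M, B_i) < epsilon
    return keep
```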
Step 23, constructing the trolley detection data set, specifically comprising the following steps:
step 231, acquiring trolley videos from multiple angles with the four Kinect cameras installed in step 1 and splitting the videos into images;
step 232, semi-automatically labeling the feature points in the acquired trolley images with the labeling tool LabelImg; the LabelImg labeling interface is shown in FIG. 3;
step 233, converting the labeled data set into the YOLO label format used by the YOLOv5 model;
step 234, constructing the trolley detection training, validation, and test sets by randomly selecting 80% of the trolley detection data set as the training set, 10% as the validation set, and 10% as the test set.
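A minimal sketch of the random 80/10/10 split of step 234; the dataset/images directory layout is a placeholder assumption.

```python
import random
from pathlib import Path

images = sorted(Path("dataset/images").glob("*.jpg"))  # placeholder layout
random.seed(0)          # fixed seed so the split is reproducible
random.shuffle(images)

n = len(images)
n_train, n_val = int(0.8 * n), int(0.1 * n)
splits = {
    "train": images[:n_train],                 # 80% training set
    "val":   images[n_train:n_train + n_val],  # 10% validation set
    "test":  images[n_train + n_val:],         # 10% test set
}
for name, files in splits.items():
    # YOLOv5 accepts plain-text lists of image paths in its dataset YAML
    Path(f"{name}.txt").write_text("\n".join(str(f) for f in files))
```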
Step 24, training the YOLOv5-based target detection deep learning model constructed in step 22, specifically comprising the following steps:
step 241, setting the training parameters: training uses the stochastic optimization algorithm Adam, with the training batch size set to Batch = 64, Momentum = 0.9, the learning rate initially set to lr = 0.001, and the number of training iterations epoch = 300;
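A hedged sketch of launching training with the parameters of step 241, assuming the Ultralytics YOLOv5 repository (whose train.py exposes a run() entry point and an Adam optimizer option); the data and weights file names are placeholders, and the learning rate and momentum are assumed to be set through the hyperparameter file.

```python
import train  # train.py at the root of the YOLOv5 repository

train.run(
    data="trolley.yaml",    # placeholder dataset config listing train/val/test images
    weights="yolov5s.pt",   # pretrained checkpoint to fine-tune
    batch_size=64,          # Batch = 64
    epochs=300,             # epoch = 300
    optimizer="Adam",       # Adam; lr0 = 0.001 and momentum = 0.9 via the hyp file
)
```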
step 242, feeding the trolley detection data set constructed in step 23 into the YOLOv5-based target detection deep learning model constructed in step 22;
step 243, training the YOLOv5-based target detection deep learning model, adjusting the learning rate and the number of iterations according to the trends of the average precision and the loss under cross-validation on the training and validation sets until both gradually stabilize, thereby determining the final learning rate and number of iterations;
and step 244, completing the training of the YOLOv5-based target detection deep learning model with the determined learning rate and number of iterations, obtaining a well-converged YOLOv5-based target detection deep learning model. Model evaluation: the effect of the model is evaluated with five metrics, including box_loss and obj_loss; the evaluation results are shown in FIG. 4, where the mAP@50 metric reaches more than 99%, fully demonstrating that the accuracy of the trained target detection model is reliable. FIG. 5 shows the error between the three-dimensional coordinates of the trolley positioned with the model and the true spatial coordinates; the error ranges between 0.01 m and 0.10 m. An example of trolley target detection is shown in FIG. 6.
Step 3, triangulating the pixel coordinates of the feature points detected in step 2 to obtain the depth information of the trolley and complete the indoor positioning objective; as shown in FIG. 7, this specifically comprises the following steps:
step 31, for each frame of image, detecting the trolley feature points with the trained target detection model and acquiring the position and category information of the feature points in the image;
step 32, triangulating the pixel positions of the same feature point in different cameras, calculating the coordinates of the feature point in three-dimensional space, and obtaining the depth information of the feature point: a group of images captured by the multiple cameras at the same moment is acquired, and a point on one image is designated as the point to be reconstructed; the coordinates of the feature point in each image are determined by category consistency, and the corresponding three-dimensional coordinates of each feature point are calculated using the triangulation principle;
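For illustration, a linear (DLT) triangulation sketch consistent with step 32, assuming each camera's 3×4 perspective projection matrix P = K [R | T] from step 13 and the matched pixel detections of one feature point:

```python
import numpy as np

def triangulate(proj_mats, pixels):
    """Linear (DLT) triangulation of one feature point seen by several cameras.
    proj_mats: list of 3x4 perspective projection matrices P = K [R | T]
    pixels:    list of matched (u, v) detections of the point, one per camera."""
    rows = []
    for P, (u, v) in zip(proj_mats, pixels):
        rows.append(u * P[2] - P[0])   # each view contributes two linear constraints
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)
    _, _, vt = np.linalg.svd(A)        # least-squares solution: last right singular vector
    X = vt[-1]
    return X[:3] / X[3]                # homogeneous -> Euclidean 3D coordinates

# Usage: P_i = K_i @ np.hstack([R_i, T_i.reshape(3, 1)]) for each of the four Kinects,
# then X = triangulate([P1, P2, P3, P4], [uv1, uv2, uv3, uv4]).
```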
and step 33, continuously detecting the trolley feature points in the real-time images and outputting the three-dimensional coordinates of the trolley to the trolley in real time via gRPC, realizing indoor autonomous driving of the trolley.
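Finally, a hedged sketch of the real-time output of step 33 over gRPC; the service and message names (TrolleyPose, Pose, UpdatePose) belong to a hypothetical trolley_pose.proto and, like the address, are assumptions rather than part of the disclosure.

```python
import grpc
import trolley_pose_pb2 as pb            # generated from the hypothetical trolley_pose.proto
import trolley_pose_pb2_grpc as pb_grpc

channel = grpc.insecure_channel("192.168.1.50:50051")  # placeholder trolley address
stub = pb_grpc.TrolleyPoseStub(channel)

def publish_pose(x, y, z):
    # one RPC per processed frame; the trolley's navigation stack consumes the stream
    stub.UpdatePose(pb.Pose(x=x, y=y, z=z))
```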

Claims (10)

1. A multi-camera-based trolley indoor positioning method, characterized by comprising the following steps:
step 1, calibrating the intrinsic and extrinsic parameters of the multiple cameras and acquiring three-dimensional spatial information from two-dimensional images: camera calibration yields the mapping between image pixels and objects in space, allowing spatial coordinates to be computed from pixel coordinates;
step 2, detecting trolley feature points based on an improved YOLOv5 model;
and step 3, triangulating the pixel coordinates of the feature points detected in step 2 to obtain the depth information of the trolley and complete the indoor positioning objective.
2. The multi-camera-based trolley indoor positioning method according to claim 1, characterized in that in step 1, calibrating the intrinsic and extrinsic parameters of the multiple cameras and acquiring three-dimensional spatial information from the two-dimensional image specifically comprises the following steps:
step 11, acquiring the camera intrinsic parameters and converting the camera coordinate system to the image coordinate system;
step 12, acquiring the camera extrinsic parameters and converting the world coordinate system to the camera coordinate system;
step 13, constructing the perspective projection matrix from the intrinsic and extrinsic matrices to complete camera calibration; the perspective projection matrix associates pixel coordinates on the image with actual coordinates in three-dimensional space, so that points in the image are mapped into three-dimensional space, realizing the camera's measurement and computation functions.
3. The multi-camera-based trolley indoor positioning method according to claim 2, characterized in that in step 11, acquiring the camera intrinsic parameters and implementing the conversion from the camera coordinate system to the image coordinate system specifically comprises the following steps:
step 111, fixedly mounting four Kinect cameras at the four corner positions of the indoor positioning site to obtain position information of the trolley from multiple angles;
and step 112, calling the pyk4a interface to obtain the Kinect camera intrinsic matrix K.
4. The multi-camera-based trolley indoor positioning method according to claim 2, characterized in that in step 12, acquiring the camera extrinsic parameters and implementing the conversion from the world coordinate system to the camera coordinate system specifically comprises the following steps:
step 121, recording a video of the indoor positioning site with the multiple cameras and splitting the video file into individual checkerboard images;
step 122, loading the images, selecting four vertex positions in each image, and calling the cv2.findHomography() function to compute the camera homography matrix, which maps points on the image to points in the real world;
step 123, decomposing the camera homography matrix to obtain the camera extrinsic parameters, comprising the rotation matrix R and the translation vector T, where:
$T = (t_x, t_y, t_z)^T$ (2).
5. The multi-camera-based trolley indoor positioning method according to claim 2, characterized in that in step 13, constructing the perspective projection matrix from the intrinsic and extrinsic matrices to complete camera calibration specifically comprises the following steps:
step 131, let the coordinates of a point in the world coordinate system be $P_w = (x_w, y_w, z_w)^T$ and its coordinates in the camera coordinate system be $P_c = (x_c, y_c, z_c)^T$; then:
$P_c = R\,P_w + T$
step 132, if the pixel coordinates in the image coordinate system are $(u, v)$, then:
$z_c\,(u, v, 1)^T = K\,P_c$
where $K$ is the camera intrinsic matrix, $c_x$ and $c_y$ denote the offset of the camera optical axis in the pixel coordinate system, and $f_x$ and $f_y$ are the normalized focal lengths along the u-axis and v-axis.
6. The multi-camera-based trolley indoor positioning method according to claim 1, characterized in that in step 2, detecting trolley feature points based on the improved YOLOv5 model specifically comprises the following steps:
step 21, setting the trolley feature points: four balls of different colors are bound at the front and rear corners of the trolley to serve as trolley feature points; the forward and backward directions of the trolley are determined by the colors of the feature points and the position of the trolley by the positions of the feature points; the trolley itself is also added as a feature point to impose a position constraint on the other feature points;
step 22, constructing a small-target detection deep learning model based on improved YOLOv5: the YOLOv5 model is improved in three respects, namely the feature extraction model, the loss function module, and the non-maximum suppression (NMS) module, effectively enhancing its detection accuracy for small targets;
step 23, constructing the trolley detection data set;
and step 24, training the YOLOv5-based target detection deep learning model constructed in step 22.
7. The multi-camera-based trolley indoor positioning method according to claim 6, characterized in that in step 22, the feature extraction model is improved: because the trolley feature points occupy a small proportion of the whole image in the camera view, a 4× downsampling step for the original input picture is added to the YOLOv5 backbone network, and the 4×-downsampled picture is fed into the feature fusion network to obtain a feature map of a new size; this feature map has a smaller receptive field and relatively rich positional information, which improves small-target detection;
the loss function module is improved: the EIoU loss function is adopted; on the basis of CIoU the aspect-ratio influence factor is split apart so that the width and the height of the target box and the anchor box are regressed separately, and Focal loss is added to focus on high-quality anchor boxes; the EIoU loss consists of three parts, namely overlap loss, center-distance loss, and width-height loss; the first two follow the CIoU approach, while the width-height loss directly minimizes the difference between the widths and heights of the target box and the anchor box, giving faster convergence; the EIoU loss function formula is as follows:
$L_{EIoU} = 1 - IoU(A,B) + \frac{\rho^2(b,\,b^{gt})}{c^2} + \frac{\rho^2(w,\,w^{gt})}{w_c^2} + \frac{\rho^2(h,\,h^{gt})}{h_c^2}$
where $IoU(A,B)$ denotes the ratio of the intersection to the union of the two rectangular boxes A and B, the commonly used IoU value, a standard measure of the similarity between a predicted bounding box and a ground-truth bounding box; $\rho^2(b, b^{gt})$ is the squared distance between the center point $b$ of the predicted bounding box and the center point $b^{gt}$ of the ground-truth bounding box; $\rho^2(h, h^{gt})$ and $\rho^2(w, w^{gt})$ are the squared differences between the height $h$ and width $w$ of the predicted bounding box and the height $h^{gt}$ and width $w^{gt}$ of the ground-truth bounding box, respectively; and $c$, $w_c$, and $h_c$ are the diagonal length, width, and height of the smallest box enclosing the predicted and ground-truth boxes;
the non-maximum suppression (NMS) module is improved: the NMS module is used in the prediction stage of target detection to merge overlapping bounding boxes of the same target, and DIoU, which additionally considers the distance between the center points of the two boxes, replaces IoU as the NMS criterion:
$s_i = \begin{cases} s_i, & DIoU(M, B_i) < \varepsilon \\ 0, & DIoU(M, B_i) \ge \varepsilon \end{cases}$
where $s_i$ is the classification confidence, $\varepsilon$ is the NMS threshold, and $M$ is the box with the highest confidence; under DIoU-NMS, bounding boxes whose predicted center points are far apart are kept as different feature points, reducing missed detections.
8. The multi-camera-based trolley indoor positioning method according to claim 6, characterized in that in step 23, constructing the trolley detection data set specifically comprises the following steps:
step 231, acquiring trolley videos from multiple angles with the four Kinect cameras installed in step 1 and splitting the videos into images;
step 232, semi-automatically labeling the feature points in the acquired trolley images with the labeling tool LabelImg;
step 233, converting the labeled data set into the YOLO label format used by the YOLOv5 model;
step 234, constructing the trolley detection training, validation, and test sets by randomly selecting 80% of the trolley detection data set as the training set, 10% as the validation set, and 10% as the test set.
9. The multi-camera-based trolley indoor positioning method according to claim 6, characterized in that in step 24, training the YOLOv5-based target detection deep learning model constructed in step 22 specifically comprises the following steps:
step 241, setting the training parameters: training uses the stochastic optimization algorithm Adam, with the training batch size set to Batch = 64, Momentum = 0.9, the learning rate initially set to lr = 0.001, and the number of training iterations epoch = 300;
step 242, feeding the trolley detection data set constructed in step 23 into the YOLOv5-based target detection deep learning model constructed in step 22;
step 243, training the YOLOv5-based target detection deep learning model, adjusting the learning rate and the number of iterations according to the trends of the average precision and the loss under cross-validation on the training and validation sets until both gradually stabilize, thereby determining the final learning rate and number of iterations;
and step 244, completing the training of the YOLOv5-based target detection deep learning model with the determined learning rate and number of iterations, obtaining a well-converged YOLOv5-based target detection deep learning model.
10. The multi-camera-based trolley indoor positioning method according to claim 1, characterized in that in step 3, triangulating the pixel coordinates of the feature points detected in step 2 to obtain the depth information of the trolley and complete the indoor positioning objective specifically comprises the following steps:
step 31, for each frame of image, detecting the trolley feature points with the trained target detection model and acquiring the position and category information of the feature points in the image;
step 32, triangulating the pixel positions of the same feature point in different cameras, calculating the coordinates of the feature point in three-dimensional space, and obtaining the depth information of the feature point: a group of images captured by the multiple cameras at the same moment is acquired, and a point on one image is designated as the point to be reconstructed; the coordinates of the feature point in each image are determined by category consistency, and the corresponding three-dimensional coordinates of each feature point are calculated using the triangulation principle;
and step 33, continuously detecting the trolley feature points in the real-time images and outputting the three-dimensional coordinates of the trolley to the trolley in real time via gRPC, realizing indoor autonomous driving of the trolley.
CN202311706393.3A, filed 2023-12-13, published 2024-03-01 as CN117635683A (pending): Trolley indoor positioning method based on multiple cameras.



Legal Events

PB01 Publication
SE01 Entry into force of request for substantive examination