WO2022077863A1

WO2022077863A1 - Visual positioning method, and method for training related model, related apparatus, and device

Info

Publication number: WO2022077863A1
Application number: PCT/CN2021/082198
Authority: WO
Inventors: 鲍虎军; 章国锋; 余海林; 冯友计
Original assignee: 浙江商汤科技开发有限公司
Priority date: 2020-10-16
Filing date: 2021-03-22
Publication date: 2022-04-21
Also published as: TW202217662A; JP7280393B2; KR20220051162A; JP2023502819A; CN112328715A; CN112328715B

Abstract

A visual positioning method, and a method for training a related model, a related apparatus, and a device. A method for training a matching prediction model comprises: constructing sample matching data by using a sample image and map data (S11), wherein the sample matching data comprises several point pair groups and an actual matching value of each point pair group, and the two points of each point pair group are respectively from the sample image and the map data; performing prediction processing on the several point pair groups by using the matching prediction model to obtain prediction matching values of the point pairs (S12); determining a loss value of the matching prediction model by using the actual matching values and the prediction matching values (S13); and adjusting parameters of the matching prediction model by using the loss value (S14). The present solution can improve visual positioning accuracy and instantaneity.

Description

视觉定位方法及相关模型的训练方法及相关装置、设备Visual positioning method and related model training method and related devices and equipment

相关申请的交叉引用CROSS-REFERENCE TO RELATED APPLICATIONS

本公开基于申请号为202011110569.5、申请日为2020年10月16日的中国专利申请提出，并要求该中国专利申请的优先权，该中国专利申请的全部内容在此引入本公开作为参考。The present disclosure is based on a Chinese patent application with application number 202011110569.5 and an application date of October 16, 2020, and claims the priority of the Chinese patent application, the entire contents of which are incorporated herein by reference.

技术领域technical field

本公开涉及计算机视觉技术领域，特别是涉及一种视觉定位方法及相关模型的训练方法及相关装置、设备。The present disclosure relates to the technical field of computer vision, and in particular, to a visual positioning method, a training method for a related model, and related devices and equipment.

背景技术Background technique

视觉定位根据地图数据的表达方式不同，可分为多种方式。其中，尤以基于结构的方式，又称为基于特征(feature-based)的方式以其精度高、泛化性能优而受到广泛关注。Visual positioning can be divided into various ways according to the expression of map data. Among them, the structure-based method, also known as the feature-based method, has received extensive attention due to its high accuracy and excellent generalization performance.

目前，在利用基于特征的方式进行视觉定位时，需要匹配得到图像数据与地图数据之间的多个点对。然而，采用局部相似度建立匹配关系，其可靠性较弱，特别是在大规模场景或具有重复结构/重复纹理的场景中，极易产生错误匹配，从而影响视觉定位的准确性。使用随机采样一致性(Random Sample Consensus，RANSAC)虽然可以剔除错误匹配，但是由于RANSAC对每个样本点进行等概率采样，当初始匹配中的外点过多时，RANSAC存在耗时久且精度低的问题，从而影响视觉定位的即时性和准确性。有鉴于此，如何提高视觉定位的准确性和即时性成为亟待解决的问题。At present, when using a feature-based approach for visual positioning, it is necessary to obtain multiple point pairs between image data and map data by matching. However, the reliability of using local similarity to establish a matching relationship is weak, especially in large-scale scenes or scenes with repeated structures/textures, which are prone to false matching, thus affecting the accuracy of visual localization. Although the use of random sampling consistency (Random Sample Consensus, RANSAC) can eliminate false matching, but because RANSAC performs equal probability sampling on each sample point, when there are too many outliers in the initial matching, RANSAC has a long time and low precision. problems, thereby affecting the immediacy and accuracy of visual positioning. In view of this, how to improve the accuracy and immediacy of visual positioning has become an urgent problem to be solved.

发明内容SUMMARY OF THE INVENTION

本公开提供一种视觉定位方法及相关模型的训练方法及相关装置、设备。The present disclosure provides a visual positioning method, a training method for a related model, and related devices and equipment.

本公开第一方面提供了一种匹配预测模型的训练方法，包括：利用样本图像和地图数据，构建样本匹配数据，其中，样本匹配数据包括若干组点对以及每组点对的实际匹配值，每组点对的两个点分别来自样本图像和地图数据；利用匹配预测模型对若干组点对进行预测处理，得到点对的预测匹配值；利用实际匹配值和预测匹配值，确定匹配预测模型的损失值；利用损失值，调整匹配预测模型的参数。A first aspect of the present disclosure provides a training method for a matching prediction model, including: using sample images and map data to construct sample matching data, wherein the sample matching data includes several groups of point pairs and an actual matching value of each group of point pairs, The two points of each group of point pairs come from the sample image and map data respectively; use the matching prediction model to perform prediction processing on several groups of point pairs to obtain the predicted matching value of the point pair; use the actual matching value and the predicted matching value to determine the matching prediction model loss value; use the loss value to adjust the parameters of the matching prediction model.

因此，通过利用样本图像和地图数据构建得到样本匹配数据，且样本匹配数据包括若干组点对以及每组点对的实际匹配值，每组点对的两个点分别来自样本图像和地图数据，从而利用匹配预测模型对若干组点对进行预测处理，得到点对的预测匹配值，进而利用实际匹配值和预测匹配值，确定匹配预测模型的损失值，利用损失值来对匹配预测模型的参数进行调整，故能够利用匹配预测模型建立匹配关系，从而能够在视觉定位中利用匹配预测模型预测点对之间的匹配值，因而能够基于预测得到的匹配值优先采样高匹配值的点对，确定待定位图像的摄像的位姿参数，进而能够有利于提高视觉定位的准确性和即时性。Therefore, the sample matching data is obtained by constructing the sample image and map data, and the sample matching data includes several groups of point pairs and the actual matching value of each group of point pairs. The two points of each group of point pairs come from the sample image and map data respectively, Therefore, the matching prediction model is used to perform prediction processing on several groups of point pairs, and the predicted matching value of the point pair is obtained, and then the actual matching value and the predicted matching value are used to determine the loss value of the matching prediction model, and the loss value is used to match the parameters of the prediction model. Therefore, the matching prediction model can be used to establish a matching relationship, so that the matching prediction model can be used to predict the matching value between point pairs in visual positioning, so the point pair with high matching value can be preferentially sampled based on the predicted matching value. The camera pose parameters of the image to be positioned can further improve the accuracy and immediacy of visual positioning.

其中，利用样本图像和地图数据，构建样本匹配数据包括：从样本图像中获取若干图像点，以及从地图数据中获取若干地图点，以组成若干组点对；其中，若干组点对包括至少一组所包含的图像点和地图点之间匹配的匹配点对；对于每组匹配点对：利用样本图像的位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点；并基于图像点和投影点之间的差异，确定匹配点对的实际匹配值。Wherein, using the sample image and the map data to construct the sample matching data includes: obtaining several image points from the sample image, and obtaining several map points from the map data to form several groups of point pairs; wherein, the several groups of point pairs include at least one Matching point pairs that match between the image points and map points contained in the group; for each group of matching point pairs: use the pose parameters of the sample image to project the map points into the dimension to which the sample image belongs to obtain the projected points of the map points; And based on the difference between the image point and the projected point, the actual matching value of the matching point pair is determined.

因此，通过从样本图像中获取若干图像点，以及从地图数据中获取若干地图点，以组成若干组点对，且若干组点对中包括至少一组所包含的图像点和地图点之间匹配的匹配点对，故能够生成用于训练匹配预测模型的样本，并对于每组匹配点对，利用样本图像的位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点，从而基于图像点和投影点之间的差异，确定匹配点对的实际匹配值，故能够使匹配预测模型在训练过程中学习到匹配点对几何特征，有利于提高匹配预测模型的准确性。Therefore, by acquiring several image points from the sample image and acquiring several map points from the map data, several sets of point pairs are formed, and the several sets of point pairs include at least one set of matching between the included image points and the map points. Therefore, it is possible to generate samples for training the matching prediction model, and for each set of matching point pairs, use the pose parameters of the sample image to project the map points into the dimension to which the sample image belongs, and obtain the projected points of the map points. , so as to determine the actual matching value of the matching point pair based on the difference between the image point and the projection point, so that the matching prediction model can learn the geometric features of the matching point pair during the training process, which is beneficial to improve the accuracy of the matching prediction model.

其中，若干组点对包括至少一组所包含的图像点和地图点之间不匹配的非匹配点，利用样本图像和地图数据，构建样本匹配数据还包括：将非匹配点对的实际匹配值设置为预设数值The several sets of point pairs include at least one set of non-matching points that do not match between the included image points and map points, and using sample images and map data to construct sample matching data further includes: converting the actual matching values of the non-matching point pairs into Set to default value

因此，若干组点对包括至少一组所包含的图像点和地图点之间不匹配的非匹配点对，并区别于匹配点对，将非匹配点对的实际匹配值设置预设数值，从而能够有利于提高匹配预测模型的鲁棒性。Therefore, several groups of point pairs include at least one group of non-matching point pairs that do not match between the included image points and map points, and different from the matching point pairs, the actual matching value of the non-matching point pairs is set to a preset value, thereby It can help to improve the robustness of the matching prediction model.

其中，从样本图像中获取若干图像点，以及从地图数据中获取若干地图点，以组成若干组点对，包括：将样本图像中的图像点划分为第一图像点和第二图像点，其中，第一图像点在地图数据中存在与其匹配的地图点，第二图像点在地图数据中不存在与其匹配的地图点；对于每一第一图像点，从地图数据中分配若干第一地图点，并分别将第一图像点与每一第一地图点作为一第一点对，其中，第一地图点中包含与第一图像点匹配的地图点；以及，对于每一第二图像点，从地图数据中分配若干第二地图点，并分别将第二图像点与每一第二地图点作为一第二点对；从第一点对和第二点对中抽取得到若干组点对。Wherein, acquiring several image points from the sample image and acquiring several map points from the map data to form several groups of point pairs, including: dividing the image points in the sample image into a first image point and a second image point, wherein , the first image point has a matching map point in the map data, and the second image point does not have a matching map point in the map data; for each first image point, assign a number of first map points from the map data , and take the first image point and each first map point as a first point pair, wherein the first map point includes a map point matching the first image point; and, for each second image point, A number of second map points are allocated from the map data, and the second image point and each second map point are respectively used as a second point pair; several groups of point pairs are extracted from the first point pair and the second point pair.

因此，通过将样本图像中的图像点划分为第一图像点和第二图像点，且第一图像点在地图中存在与其匹配的地图点，第二图像点在图像数据中不存在与其匹配的图像点，并对第一图像点，从地图数据中分配若干第一地图点，分别将第一图像点与每一第一地图点作为一第一点对，且第一地图点中包含与第一图像点匹配的地图点，而对于每一第二图像点，从地图数据中分配若干第二地图点，分别将第二图像点与每一第二地图点作为一第二点对，并从第一点对和第二点对中抽取得到若干组点对，从而能够构建得到数量丰富且包含非匹配点对和匹配点对的若干组点对，以用于训练匹配预测模型，故能够有利于提高匹配预测模型的准确性。Therefore, by dividing the image points in the sample image into a first image point and a second image point, and the first image point has a matching map point in the map, the second image point does not have a matching map point in the image data. image point, and for the first image point, assign a number of first map points from the map data, respectively take the first image point and each first map point as a first point pair, and the first map point includes and the first map point. A map point matched with an image point, and for each second image point, a number of second map points are allocated from the map data, and the second image point and each second map point are respectively regarded as a second point pair, and the Several groups of point pairs are extracted from the first point pair and the second point pair, so that a number of groups of point pairs including non-matching point pairs and matching point pairs can be constructed to be used for training the matching prediction model. It is beneficial to improve the accuracy of the matching prediction model.

其中，利用样本图像的位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点包括：基于匹配点对，计算样本图像的位姿参数；利用位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点。Wherein, using the pose parameter of the sample image to project the map point into the dimension to which the sample image belongs, and obtaining the projected point of the map point includes: calculating the pose parameter of the sample image based on the matching point pair; using the pose parameter to project the map point To the dimension to which the sample image belongs, the projected point of the map point is obtained.

因此，通过利用匹配点对，计算样本图像的位姿参数，并利用位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点，从而能够有利于提高投影点与图像点之间差异的准确性，进而能够有利于提高匹配预测模型的准确性。Therefore, by using the matching point pairs to calculate the pose parameters of the sample image, and using the pose parameters to project the map points into the dimension to which the sample image belongs, the projected points of the map points can be obtained, which can help to improve the relationship between the projected points and the image points. The accuracy of the difference between them can be beneficial to improve the accuracy of the matching prediction model.

其中，基于图像点和投影点之间的差异，确定匹配点对的实际匹配值包括：利用预设概率分布函数将差异转换为概率密度值，作为匹配点对的实际匹配值。Wherein, determining the actual matching value of the matching point pair based on the difference between the image point and the projection point includes: using a preset probability distribution function to convert the difference into a probability density value as the actual matching value of the matching point pair.

因此，通过利用预设概率分布函数将差异转换为概率密度值，作为匹配点对的实际匹配值，故能够有利于准确地描述投影点与图像点之间的差异，从而能够有利于提高匹配预测模型的准确性。Therefore, by using the preset probability distribution function to convert the difference into a probability density value as the actual matching value of the matching point pair, it can help to accurately describe the difference between the projection point and the image point, which can help improve the matching prediction. accuracy of the model.

其中，样本匹配数据为二分图，二分图包括若干组点对和连接每组点对的连接边，且连接边标注有对应点对的实际匹配值；匹配预测模型包括与样本图像所属的维度对应的第一点特征提取子模型、与地图数据所属的维度对应的第二点特征提取子模型以及边特征提取子模型；利用匹配预测模型对若干组点对进行预测处理，得到点对的预测匹配值包括：分别利用第一点特征提取子模型和第二点特征提取子模型对二分图进行特征提取，得到第一特征和第二特征；利用边特征提取子模型对第一特征和第二特征进行特征提取，得到第三特征；利用第三特征，得到连接边对应的点对的预测匹配值。Among them, the sample matching data is a bipartite graph, and the bipartite graph includes several groups of point pairs and connecting edges connecting each group of point pairs, and the connecting edges are marked with the actual matching values of the corresponding point pairs; the matching prediction model includes the dimension corresponding to the sample image. The first point feature extraction sub-model, the second point feature extraction sub-model corresponding to the dimension to which the map data belongs, and the edge feature extraction sub-model; the matching prediction model is used to perform prediction processing on several groups of point pairs, and the prediction matching of point pairs is obtained. The value includes: using the first point feature extraction sub-model and the second point feature extraction sub-model to perform feature extraction on the bipartite graph to obtain the first feature and the second feature; using the edge feature extraction sub-model to extract the first feature and the second feature. Perform feature extraction to obtain a third feature; use the third feature to obtain the predicted matching value of the point pair corresponding to the connecting edge.

因此，通过对二分图分别进行点特征抽取以及边特征抽取，从而能够使匹配预测模型更加有效地感知匹配的空间几何结构，进而能够有利于提高匹配预测模型的准确性。Therefore, by performing point feature extraction and edge feature extraction respectively on the bipartite graph, the matching prediction model can more effectively perceive the spatial geometric structure of the matching, which can help to improve the accuracy of the matching prediction model.

其中，第一点特征提取子模型和第二点特征提取子模型的结构为以下任一种：包括至少一个残差块，包括至少一个残差块和至少一个空间变换网络；和/或，边特征提取子模型包括至少一个残差块。Wherein, the structure of the first point feature extraction sub-model and the second point feature extraction sub-model is any of the following: including at least one residual block, including at least one residual block and at least one spatial transformation network; and/or, edge The feature extraction submodel includes at least one residual block.

因此，通过将第一点特征提取子模型和第二点特征提取子模型的结构设置为以下任一者：包括至少一个残差块，包括至少一个残差块和至少一个空间变换网络，且将边特征提取子模型设置为包括至少一个残差块，故能够有利于匹配预测模型的优化，并提高匹配预测模型的准确性。Therefore, by setting the structure of the first point feature extraction sub-model and the second point feature extraction sub-model to any one of the following: including at least one residual block, including at least one residual block and at least one spatial transformation network, and The edge feature extraction sub-model is set to include at least one residual block, so it can facilitate the optimization of the matching prediction model and improve the accuracy of the matching prediction model.

其中，若干组点对包括至少一组所包含的图像点和地图点之间匹配的匹配点对和至少一组所包含的图像点和地图点之间不匹配的非匹配点对；利用实际匹配值和预测匹配值，确定匹配预测模型的损失值包括：利用匹配点对的预测匹配值和实际匹配值，确定匹配预测模型的第一损失值；并利用非匹配点对的预测匹配值和实际匹配值，确定匹配预测模型的第二损失值；对第一损失值和第二损失值进行加权处理，得到匹配预测模型的损失值。Wherein, several sets of point pairs include at least one set of matching point pairs that match between the included image points and map points and at least one set of non-matching point pairs that do not match between the included image points and map points; using actual matching value and predicted matching value, and determining the loss value of the matching prediction model includes: using the predicted matching value and the actual matching value of the matching point pair to determine the first loss value of the matching prediction model; and using the predicted matching value and actual matching value of the non-matching point pair. The matching value is used to determine the second loss value of the matching prediction model; the first loss value and the second loss value are weighted to obtain the loss value of the matching prediction model.

因此，通过利用匹配点对的预测匹配值和实际匹配值，确定匹配预测模型的第一损失值，并利用非匹配点对的预测匹配值和实际损失值，确定匹配预测模型的第二损失值，从而对第一损失值和第二损失值进行加权处理，得到匹配预测模型的损失值，故能够有利于使匹配预测模型有效感知匹配的空间几何结构，从而提高匹配预测模型的准确性。Therefore, the first loss value of the matching prediction model is determined by using the predicted matching value and the actual matching value of the matching point pair, and the second loss value of the matching prediction model is determined using the predicted matching value and the actual loss value of the non-matching point pair. , so that the first loss value and the second loss value are weighted to obtain the loss value of the matching prediction model, which can help the matching prediction model to effectively perceive the spatial geometric structure of the matching, thereby improving the accuracy of the matching prediction model.

其中，利用匹配点对的预测匹配值和实际匹配值，确定匹配预测模型的第一损失值之前，方法还包括：分别统计匹配点对的第一数量，以及非匹配点对的第二数量；利用匹配点对的预测匹配值和实际匹配值，确定匹配预测模型的第一损失值包括：利用匹配点对的预测匹配值和实际匹配值之间的差值，以及第一数量，确定第一损失值；利用非匹配点对的预测匹配值和实际匹配值，确定匹配预测模型的第二损失值包括：利用非匹配点对的预测匹配值和实际匹配值之间的差值，以及第二数量，确定第二损失值。Wherein, before determining the first loss value of the matching prediction model by using the predicted matching value and the actual matching value of the matching point pair, the method further includes: separately counting the first number of matching point pairs and the second number of non-matching point pairs; Using the predicted matching value and the actual matching value of the matching point pair to determine the first loss value of the matching prediction model includes: using the difference between the predicted matching value and the actual matching value of the matching point pair, and the first number, determining the first loss value of the matching prediction model. Loss value; using the predicted matching value and the actual matching value of the unmatched point pair to determine the second loss value of the matching prediction model includes: using the difference between the predicted matching value and the actual matching value of the unmatched point pair, and the second loss value of the matching prediction model. quantity to determine the second loss value.

因此，通过统计匹配点对的第一数量，以及非匹配点对的第二数量，从而利用匹配点对的预测匹配值和实际匹配值之间的差值，以及第一数量，确定第一损失值，并利用非匹配点对的预测匹配值和实际匹配值之间的差异，以及第二数量，确定第二损失值，能够有利于提高匹配预测模型的损失值的准确性，从而能够有利于提高匹配预测模型的准确性。Therefore, by counting the first number of matching point pairs and the second number of non-matching point pairs, the first loss is determined by using the difference between the predicted matching value and the actual matching value of the matching point pair, and the first number value, and use the difference between the predicted matching value and the actual matching value of the non-matching point pair, and the second quantity, to determine the second loss value, which can help to improve the accuracy of the loss value of the matching prediction model, which can be beneficial to Improve the accuracy of matching prediction models.

其中，样本图像所属的维度为2维或3维，地图数据所属的维度为2维或3维。The dimension to which the sample image belongs is 2-dimensional or 3-dimensional, and the dimension to which the map data belongs is 2-dimensional or 3-dimensional.

因此，通过设置样本图像和地图数据所属的维度，能够训练得到用于2维-2维的匹配预测模型，或者能够训练得到用于2维-3维的匹配预测模型，或者能够训练得到用于3维-3维的匹配预测模型，从而能够提高匹配预测模型的适用范围。Therefore, by setting the dimensions to which the sample images and map data belong, it is possible to train a matching prediction model for 2D-2D, or train a matching prediction model for 2D-3D, or train a matching prediction model for 2D-3D. The 3-3-dimensional matching prediction model can improve the applicable scope of the matching prediction model.

本公开第二方面提供了一种视觉定位方法，包括：利用待定位图像和地图数据，构建待识别匹配数据，其中，待识别匹配数据包括若干组点对，每组点对的两个点分别来自待定位图像和地图数据；利用匹配预测模型对若干组点对进行预测处理，得到点对的预测匹配值；基于点对的预测匹配值，确定待定位图像的摄像器件的位姿参数。A second aspect of the present disclosure provides a visual positioning method, comprising: constructing matching data to be identified by using an image to be positioned and map data, wherein the matching data to be identified includes several sets of point pairs, and two points of each set of point pairs are respectively From the image and map data to be positioned; use the matching prediction model to perform prediction processing on several groups of point pairs to obtain the predicted matching value of the point pair; based on the predicted matching value of the point pair, determine the pose parameters of the camera device of the image to be positioned.

因此，通过利用待定位图像和地图数据，构建待识别匹配数据，且待识别匹配数据包括若干组点对，每组点对的两个点分别来自待定位图像和地图数据，从而利用匹配预测模型对若干组点对进行预测处理，从而得到点对的预测匹配值，进而基于点对的预测匹配值，确定待定位图像的摄像器件的位姿参数，提高了视觉定位的准确性和即时性。Therefore, by using the to-be-located image and map data, the to-be-identified matching data is constructed, and the to-be-identified matching data includes several sets of point pairs, and the two points of each set of point pairs are respectively from the to-be-located image and the map data, thereby using the matching prediction model. Predictive processing is performed on several groups of point pairs to obtain the predicted matching values of the point pairs, and then based on the predicted matching values of the point pairs, the pose parameters of the camera device of the image to be positioned are determined, which improves the accuracy and immediacy of visual positioning.

其中，基于点对的预测匹配值，确定待定位图像的摄像器件的位姿参数，包括：将若干组点对按照预测匹配值从高到低的顺序进行排序；利用前预设数量组点对，确定待定位图像的摄像器件的位姿参数。Wherein, determining the pose parameters of the imaging device of the image to be positioned based on the predicted matching values of the point pairs includes: sorting several groups of point pairs in descending order of the predicted matching values; using the previously preset number of groups of point pairs , and determine the pose parameters of the imaging device of the image to be positioned.

因此，通过将若干组点对按照预测匹配值从高到低的顺序进行排序，并利用前预设数量组点对，确定待定位图像的摄像器件的位姿参数，从而能够有利于利用排序后的点对进行增量式采样，优先采样匹配值高的点对，故能够通过几何先验引导位姿参数的求解，从而能够提高视觉定位的准确性和即时性。Therefore, by sorting several groups of point pairs in descending order of the predicted matching values, and using the first preset number of point pairs to determine the pose parameters of the camera device of the image to be positioned, it is beneficial to use the sorted The point pairs are incrementally sampled, and the point pairs with high matching values are preferentially sampled, so the geometric prior can guide the solution of the pose parameters, thereby improving the accuracy and immediacy of visual positioning.

其中，匹配预测模型是利用上述第一方面中的匹配预测模型的训练方法得到的。Wherein, the matching prediction model is obtained by using the training method of the matching prediction model in the first aspect.

因此，通过上述匹配预测模型的训练方法得到的匹配预测模型进行视觉定位，能够提高视觉定位的准确性和即时性。Therefore, performing visual positioning with the matching prediction model obtained by the training method for the matching prediction model can improve the accuracy and immediacy of the visual positioning.

本公开第三方面提供了一种匹配预测模型的训练装置，包括：样本构建模块、预测处理模块、损失确定模块和参数调整模块，样本构建模块用于利用样本图像和地图数据，构建样本匹配数据，其中，样本匹配数据包括若干组点对以及每组点对的实际匹配值，每组点对的两个点分别来自样本图像和地图数据；预测处理模块用于利用匹配预测模型对若干组点对进行预测处理，得到点对的预测匹配值；损失确定模块用于利用实际匹配值和预测匹配值，确定匹配预测模型的损失值；参数调整模块用于利用损失值，调整匹配预测模型的参数。A third aspect of the present disclosure provides a training device for matching prediction models, including: a sample construction module, a prediction processing module, a loss determination module, and a parameter adjustment module. The sample construction module is used for using sample images and map data to construct sample matching data. , wherein the sample matching data includes several groups of point pairs and the actual matching values of each group of point pairs, and the two points of each group of point pairs come from the sample image and map data respectively; the prediction processing module is used to use the matching prediction model to match several groups of points. Perform prediction processing on the pair to obtain the predicted matching value of the point pair; the loss determination module is used to use the actual matching value and the predicted matching value to determine the loss value of the matching prediction model; the parameter adjustment module is used to use the loss value to adjust the parameters of the matching prediction model .

本公开第四方面提供了一种视觉定位装置，包括：数据构建模块、预测处理模块和参数确定模块，数据构建模块用于利用待定位图像和地图数据，构建待识别匹配数据，其中，待识别匹配数据包括若干组点对，每组点对的两个点分别来自待定位图像和地图数据；预测处理模块用于利用匹配预测模型对若干组点对进行预测处理，得到点对的预测匹配值；参数确定模块用于基于点对的预测匹配值，确定待定位图像的摄像器件的位姿参数。A fourth aspect of the present disclosure provides a visual positioning device, comprising: a data construction module, a prediction processing module, and a parameter determination module, where the data construction module is used to construct matching data to be identified by using images to be positioned and map data, wherein the to-be-identified matching data is The matching data includes several groups of point pairs, and the two points of each group of point pairs are respectively from the image to be located and the map data; the prediction processing module is used to perform prediction processing on several groups of point pairs by using the matching prediction model to obtain the predicted matching value of the point pair. ; The parameter determination module is used to determine the pose parameters of the camera device of the image to be positioned based on the predicted matching value of the point pair.

本公开第五方面提供了一种电子设备，包括相互耦接的存储器和处理器，处理器用于执行存储器中存储的程序指令，以实现上述第一方面中的匹配预测模型的训练方法，或者实现上述第二方面中的视觉定位方法。A fifth aspect of the present disclosure provides an electronic device, including a memory and a processor coupled to each other, where the processor is configured to execute program instructions stored in the memory, so as to implement the training method for a matching prediction model in the first aspect, or to implement The visual positioning method in the above second aspect.

本公开第六方面提供了一种计算机可读存储介质，其上存储有程序指令，程序指令被处理器执行时实现上述第一方面中的匹配预测模型的训练方法，或者实现上述第二方面中的视觉定位方法。A sixth aspect of the present disclosure provides a computer-readable storage medium on which program instructions are stored, and when the program instructions are executed by a processor, implement the training method for a matching prediction model in the first aspect above, or implement the second aspect above. visual positioning method.

本公开第六方面提供了一种计算机程序，包括计算机可读代码，在所述计算机可读代码在电子设备中运行，被所述电子设备中的处理器执行的情况下，实现上述第一方面中的匹配预测模型的训练方法，或者实现上述第二方面中的视觉定位方法。A sixth aspect of the present disclosure provides a computer program, comprising computer-readable codes, which, when the computer-readable codes are executed in an electronic device and executed by a processor in the electronic device, realize the above-mentioned first aspect The training method of the matching prediction model in the above, or the visual positioning method in the second aspect above.

上述方案，能够利用匹配预测模型建立匹配关系，从而能够在视觉定位中利用匹配预测模型预测点对之间的匹配值，因而能够基于预测得到的匹配值优先采样高匹配值的点对，而建立匹配关系，进而能够有利于提高视觉定位的准确性和即时性。The above scheme can use the matching prediction model to establish a matching relationship, so that the matching prediction model can be used to predict the matching value between point pairs in visual positioning, so the point pair with high matching value can be preferentially sampled based on the predicted matching value, and establish The matching relationship can be beneficial to improve the accuracy and immediacy of visual positioning.

附图说明Description of drawings

图1是本公开匹配预测模型的训练方法一实施例的流程示意图；1 is a schematic flowchart of an embodiment of a training method for a matching prediction model of the present disclosure;

图2是本公开匹配预测模型的训练方法一实施例的状态示意图；2 is a state schematic diagram of an embodiment of a training method for a matching prediction model of the present disclosure;

图3是图1中步骤S11一实施例的流程示意图；3 is a schematic flowchart of an embodiment of step S11 in FIG. 1;

图4是图3中步骤S111一实施例的流程示意图；FIG. 4 is a schematic flowchart of an embodiment of step S111 in FIG. 3;

图5是本公开视觉定位方法一实施例的流程示意图；5 is a schematic flowchart of an embodiment of the visual positioning method of the present disclosure;

图6是本公开匹配预测模型的训练装置一实施例的框架示意图；6 is a schematic diagram of a framework of an embodiment of a training device for matching prediction models of the present disclosure;

图7是本公开视觉定位装置一实施例的框架示意图；FIG. 7 is a schematic frame diagram of an embodiment of the visual positioning device of the present disclosure;

图8是本公开电子设备一实施例的框架示意图；8 is a schematic diagram of a framework of an embodiment of an electronic device of the present disclosure;

图9是本公开计算机可读存储介质一实施例的框架示意图。FIG. 9 is a schematic diagram of a framework of an embodiment of a computer-readable storage medium of the present disclosure.

具体实施方式Detailed ways

下面结合说明书附图，对本公开实施例的方案进行详细说明。The solutions of the embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.

以下描述中，为了说明而不是为了限定，提出了诸如特定***结构、接口、技术之类的具体细节，以便透彻理解本公开。In the following description, for purposes of explanation and not limitation, specific details are set forth, such as specific system structures, interfaces, techniques, etc., in order to provide a thorough understanding of the present disclosure.

本文中术语“***”和“网络”在本文中常被可互换使用。本文中术语“和/或”，仅仅是一种描述关联对象的关联关系，表示可以存在三种关系，例如，A和/或B，可以表示：单独存在A，同时存在A和B，单独存在B这三种情况。另外，本文中字符“/”，一般表示前后关联对象是一种“或”的关系。此外，本文中的“多”表示两个或者多于两个。The terms "system" and "network" are often used interchangeably herein. The term "and/or" in this article is only an association relationship to describe the associated objects, indicating that there can be three kinds of relationships, for example, A and/or B, it can mean that A exists alone, A and B exist at the same time, and A and B exist independently B these three cases. In addition, the character "/" in this document generally indicates that the related objects are an "or" relationship. Also, "multiple" herein means two or more than two.

请参阅图1，图1是本公开匹配预测模型的训练方法一实施例的流程示意图。匹配预测模型的训练方法可以包括如下步骤：Please refer to FIG. 1 , which is a schematic flowchart of an embodiment of a training method for a matching prediction model of the present disclosure. The training method of the matching prediction model may include the following steps:

步骤S11：利用样本图像和地图数据，构建样本匹配数据。Step S11: Using the sample image and map data to construct sample matching data.

在本公开实施例中，样本匹配数据包括若干组点对以及每组点对的实际匹配值，每组点对的两个点分别来自样本图像和地图数据。In the embodiment of the present disclosure, the sample matching data includes several groups of point pairs and actual matching values of each group of point pairs, and the two points of each group of point pairs are respectively from the sample image and the map data.

在一个实施场景中，地图数据可以是通过样本图像而构建得到。其中，样本图像所属的维度可以为2维或3维，地图数据所属的维度可以为2维或3维，在此不做限定。例如，样本图像为二维图像，则可以通过诸如SFM(Structure From Motion)等三维重建方式对二维图像进行处理，得到诸如稀疏点云模型的地图数据，此外，样本图像还可以包括三维信息，例如，样本图像还可以为RGB-D图像(即色彩图像与深度图像)，在此不做限定。地图数据可以是由单纯的二维图像组成，也可以是由三维点云地图组成，也可以是二维图像和三维点云的结合，在此不做限定。In one implementation scenario, the map data may be constructed from sample images. The dimension to which the sample image belongs may be 2-dimensional or 3-dimensional, and the dimension to which the map data belongs may be 2-dimensional or 3-dimensional, which is not limited herein. For example, if the sample image is a two-dimensional image, the two-dimensional image can be processed by three-dimensional reconstruction methods such as SFM (Structure From Motion) to obtain map data such as a sparse point cloud model. In addition, the sample image can also include three-dimensional information. For example, the sample image may also be an RGB-D image (ie, a color image and a depth image), which is not limited herein. The map data may be composed of a simple two-dimensional image, a three-dimensional point cloud map, or a combination of a two-dimensional image and a three-dimensional point cloud, which is not limited here.

在本公开实施例中，匹配预测模型的训练方法的执行主体可以是匹配预测模型的训练装置，下文中描述为训练装置；例如，匹配预测模型的训练方法可以由终端设备或服务器或其它处理设备执行，其中，终端设备可以为用户设备(User Equipment，UE)、移动设备、用户终端、终端、蜂窝电话、无绳电话、个人数字处理(Personal Digital Assistant，PDA)、手持设备、计算设备、车载设备、可穿戴设备等。在一些可能的实现方式中，该匹配预测模型的训练方法可以通过处理器调用存储器中存储的计算机可读指令的方式来实现。In the embodiment of the present disclosure, the execution body of the training method for matching prediction models may be a training device for matching prediction models, which is described as a training device hereinafter; for example, the training method for matching prediction models may be performed by a terminal device or a server or other processing device. Execute, wherein, the terminal device may be a user equipment (User Equipment, UE), a mobile device, a user terminal, a terminal, a cellular phone, a cordless phone, a personal digital assistant (Personal Digital Assistant, PDA), a handheld device, a computing device, a vehicle-mounted device , wearable devices, etc. In some possible implementations, the training method of the matching prediction model may be implemented by the processor calling computer-readable instructions stored in the memory.

在一个实施场景中，样本匹配数据可以为二分图，二分图又称为二部图，是由点集和边集所构成的无向图，且点集可以分为两个互不相交的子集，边集中的每条边所关联的两个点分别属于这两个互不相交的子集。其中，样本匹配数据为二分图时，其包括若干组点对和连接每组点对的连接边，且连接边标注有对应点对的实际匹配值，用于描述对应点对的匹配程度，例如，实际匹配值可以为0～1之间的数值；这里，实际匹配值为0.1时，可以表明对应点对之间匹配程度较低，点对中来自样本图像的点与来自地图数据中的点对应于空间中同一点的概率较低；实际匹配值为0.98时，可以表明对应点对之间的匹配程度较高，点对中来自样本图像的点与来自地图数据中的点对应于空间中同一点的概率较高。请结合参阅图2，图2是本公开匹配预测模型的训练方法一实施例的状态示意图，如图2所示，左侧为由二分图表示的样本匹配数据，二分图的上侧和下侧两个为互不相交的点集，连接两个点集中的点为连接边，连接边标注有实际匹配值(未图示)。In an implementation scenario, the sample matching data may be a bipartite graph. A bipartite graph, also known as a bipartite graph, is an undirected graph composed of a point set and an edge set, and the point set can be divided into two mutually disjoint sub-graphs. The two points associated with each edge in the edge set belong to the two disjoint subsets. Among them, when the sample matching data is a bipartite graph, it includes several groups of point pairs and connecting edges connecting each group of point pairs, and the connecting edges are marked with the actual matching value of the corresponding point pair, which is used to describe the matching degree of the corresponding point pair, for example , the actual matching value can be a value between 0 and 1; here, when the actual matching value is 0.1, it can indicate that the matching degree between the corresponding point pairs is low, and the points from the sample image in the point pairs are the same as the points from the map data. The probability corresponding to the same point in the space is low; when the actual matching value is 0.98, it can indicate that the matching degree between the corresponding point pairs is high, and the points from the sample image in the point pairs and the points from the map data correspond to the points in the space. The probability of the same point is high. Please refer to FIG. 2. FIG. 2 is a schematic state diagram of an embodiment of the training method of the matching prediction model of the present disclosure. As shown in FIG. 2, the left side is the sample matching data represented by the bipartite graph, and the upper side and the lower side of the bipartite graph The two are mutually disjoint point sets, and the points connecting the two point sets are connected edges, and the connected edges are marked with actual matching values (not shown).

在一个实施场景中，为了提高样本匹配数据多样化，训练装置还可以对样本匹配数据进行数据增强。例如，训练装置可以将样本匹配数据中的三维点的坐标分别对三个轴进行随机旋转；或者，还可以对样本匹配数据中的三维点进行归一化处理，在此不做限定。In an implementation scenario, in order to improve the diversity of sample matching data, the training device may further perform data enhancement on the sample matching data. For example, the training device may randomly rotate the coordinates of the three-dimensional points in the sample matching data to three axes; or, it may also perform normalization processing on the three-dimensional points in the sample matching data, which is not limited here.

步骤S12：利用匹配预测模型对若干组点对进行预测处理，得到点对的预测匹配值。Step S12: Use the matching prediction model to perform prediction processing on several groups of point pairs to obtain the predicted matching values of the point pairs.

请继续结合参阅图2，仍以样本匹配数据以二分图表示为例，匹配预测模型可以包括与样本图像所属的维度对应的第一点特征提取子模型、与地图数据所属的维度对应的第二点特征提取子模型，以及边特征提取子模型。例如，样本图像为二维图像、地图数据包括二维图像时，第一点特征提取子模型和第二点特征提取子模型为二维点特征提取子模型，则训练得到的匹配预测模型可以用于二维-三维的匹配预测；或者，样本图像为三维图像、地图数据包括三维点云时，第一点特征提取子模型和第二点特征提取子模型为三维点特征提取子模型，则训练得到的匹配预测模型可以用于三维-三维的匹配预测；或者，样本图像为二维图像、地图数据包括三维点云时，第一点特征提取子模型为二维点特征提取子模型、第二点特征提取子模型为三维点特征提取子模型，则训练得到的匹配预测模型可以用于二维-三维的匹配预测；这里，对于匹配预测模型可以根据实际应用进行设置，在此不做限定。Please continue to refer to FIG. 2, still taking the sample matching data represented by a bipartite graph as an example, the matching prediction model may include a first point feature extraction sub-model corresponding to the dimension to which the sample image belongs, and a second sub-model corresponding to the dimension to which the map data belongs. Point feature extraction sub-model, and edge feature extraction sub-model. For example, when the sample image is a two-dimensional image and the map data includes a two-dimensional image, the first point feature extraction sub-model and the second point feature extraction sub-model are two-dimensional point feature extraction sub-models, then the matching prediction model obtained by training can be used For two-dimensional-three-dimensional matching prediction; or, when the sample image is a three-dimensional image and the map data includes a three-dimensional point cloud, the first point feature extraction sub-model and the second point feature extraction sub-model are three-dimensional point feature extraction sub-models, then training The obtained matching prediction model can be used for 3D-3D matching prediction; or, when the sample image is a 2D image and the map data includes a 3D point cloud, the first point feature extraction sub-model is a 2D point feature extraction sub-model, the second The point feature extraction sub-model is a three-dimensional point feature extraction sub-model, and the matching prediction model obtained by training can be used for 2D-3D matching prediction; here, the matching prediction model can be set according to the actual application, which is not limited here.

在一个实施场景中，训练装置可以利用第一点特征提取子模型和第二点特征提取子模型对二分图进行特征提取，得到第一特征和第二特征；再利用边特征提取子模型对第一特征和第二特征进行特征提取，得到第三特征；利用第三特征，得到连接边对应点的预测匹配值；如图2中的表示二分图中各连接边对应的预测匹配值。In an implementation scenario, the training device may use the first point feature extraction sub-model and the second point feature extraction sub-model to perform feature extraction on the bipartite graph to obtain the first feature and the second feature; and then use the edge feature extraction sub-model to perform feature extraction on the second feature The first feature and the second feature are extracted to obtain the third feature; the third feature is used to obtain the predicted matching value of the corresponding point of the connecting edge; as shown in Figure 2, the predicted matching value corresponding to each connecting edge in the bipartite graph is shown.

在一个的实施场景中，当第一点特征提取子模型和第二点特征提取子模型为二维点特征提取子模型时，可以包括至少一个残差块(resblock)，例如，1个残差块、2个残差块、3个残差块等等，每个残差块(resblock)由多个基本块(base block)组成，而每个基本块(base block)由一层1*1的卷积层、批标准化层(batch normalization)、上下文标准化层(context normalization)组成。当第一点特征提取子模型和第二点特征提取子模型为三维点特征提取子模型时，可以包括至少一个残差块(resblock)和至少一个空间变换网络(如，t-net)，例如，1个残差块、2个残差块、3个残差块等等，在此不做限定。空间变换网络可以为1个、2个，这里，空间变换网络可以位于模型的首尾位置，在此不做限定。残差块(resblock)的结构可以参考前述实施场景中的结构，在此不再赘述。边特征提取子模型可以包括至少一个残差块，例如，1个残差块、2个残差块、3个残差块等等，在此不做限定，残差块(resblock)的结构可以参考前述实施场景中的结构，在此不再赘述。In one implementation scenario, when the first point feature extraction sub-model and the second point feature extraction sub-model are two-dimensional point feature extraction sub-models, at least one residual block (resblock) may be included, for example, one residual block, 2 residual blocks, 3 residual blocks, etc., each residual block (resblock) consists of multiple basic blocks (base blocks), and each basic block (base block) consists of a layer of 1*1 It consists of a convolutional layer, a batch normalization layer, and a context normalization layer. When the first point feature extraction sub-model and the second point feature extraction sub-model are three-dimensional point feature extraction sub-models, at least one residual block (resblock) and at least one spatial transformation network (eg, t-net) may be included, for example , 1 residual block, 2 residual blocks, 3 residual blocks, etc., which are not limited here. The number of spatial transformation networks can be one or two. Here, the spatial transformation networks can be located at the beginning and end of the model, which is not limited here. For the structure of the residual block (resblock), reference may be made to the structure in the foregoing implementation scenario, and details are not described herein again. The edge feature extraction sub-model may include at least one residual block, for example, one residual block, two residual blocks, three residual blocks, etc., which are not limited here, and the structure of the residual block (resblock) can be Referring to the structure in the foregoing implementation scenario, details are not repeated here.

步骤S13：利用实际匹配值和预测匹配值，确定匹配预测模型的损失值。Step S13: Determine the loss value of the matching prediction model by using the actual matching value and the predicted matching value.

在一个实施场景中，训练装置可以统计实际匹配值和预测匹配值之间的差异，从而确定匹配预测模型的损失值。这里，训练装置可以统计所有点对的预测匹配值和其实际匹配值之间差值的总和，再利用该总和和所有点对的数量，求取所有点对的预测匹配值的均值，作为匹配预测模型的损失值。In an implementation scenario, the training device may count the difference between the actual matching value and the predicted matching value, so as to determine the loss value of the matching prediction model. Here, the training device can count the sum of the differences between the predicted matching values of all point pairs and their actual matching values, and then use the sum and the number of all point pairs to obtain the average of the predicted matching values of all point pairs, as matching The loss value of the prediction model.

在另一个实施场景中，若干组匹配点对可以包括至少一组所包含的图像点和地图点之间匹配的匹配点对，即匹配点对所包含的图像点与地图点为空间中的同一点，若干组匹配点对还可以包括至少一组所包含的图像点和地图点之间不匹配的非匹配点对，即非匹配点对所包含的图像点与地图点为空间中的不同点，则训练装置可以利用匹配点对的预测匹配值w ^*和实际匹配值w，确定匹配预测模型的第一损失值L _pos(w,w ^*)，并利用非匹配点对的预测匹配值w ^*和实际匹配值w，确定匹配预测模型的第二损失值L _neg(w,w ^*)，从而通过对第一损失值L _pos(w,w ^*)和第二损失值L _neg(w,w ^*)进行加权处理，得到匹配预测模型的损失值L(w,w ^*)，参见公式(1)： In another implementation scenario, several sets of matching point pairs may include at least one set of matching point pairs that match between the included image points and map points, that is, the image points and map points included in the matching point pairs are the same in space. One point, several sets of matching point pairs may also include at least one set of non-matching point pairs that do not match between the image points and map points contained in the non-matching point pairs, that is, the image points and map points contained in the non-matching point pairs are different points in space , then the training device can use the predicted matching value w ^* and the actual matching value w of the matching point pair to determine the first loss value L _pos (w, w ^* ) of the matching prediction model, and use the predicted matching value w of the non-matching point pair ^* and the actual matching value w, determine the second loss value L _neg (w, w ^* ) of the matching prediction model, so that by comparing the first loss value L _pos (w, w ^* ) and the second loss value L _neg (w, w ^* ) is weighted to obtain the loss value L(w, w ^* ) that matches the prediction model, see formula (1):

L(w,w ^*)＝αL _pos(w,w ^*)+βL _neg(w,w ^*)……(1) L(w,w ^* )= _αLpos (w,w ^* )+ _βLneg (w,w ^* )......(1)

上述公式(1)中，L(w,w ^*)表示匹配预测模型的损失值，L _pos(w,w ^*)表示匹配点对所对应的第一损失值，L _neg(w,w ^*)表示非匹配点对所对应的第二损失值，α和β分别表示第一损失值L _pos(w,w ^*)的权重、第二损失值L _neg(w,w ^*)的权重。 In the above formula (1), L(w, w ^* ) represents the loss value of the matching prediction model, L _pos (w, w ^* ) represents the first loss value corresponding to the matching point pair, and L _neg (w, w ^* ) represents the second loss value corresponding to the non-matching point pair, and α and β represent the weight of the first loss value L _pos (w,w ^* ) and the weight of the second loss value L _neg (w,w ^* ), respectively.

在一个实施场景中，训练装置还可以分别统计匹配点对的第一数量|ε _pos|和非匹配点对的第二数量|ε _neg|，从而可以利用匹配点对的预测匹配值和实际匹配值之间的差值，以及第一数量，确定第一损失值，参见公式(2)： In an implementation scenario, the training device may also count the first number |ε _pos | of matching point pairs and the second number |ε _neg | of non-matching point pairs, so that the predicted matching value and actual matching of matching point pairs can be used. The difference between the values, and the first quantity, determines the first loss value, see formula (2):

上述公式(2)中，L _pos(w,w ^*)表示第一损失值，|ε _pos|表示第一数量，w,w ^*分别表示匹配点对的实际匹配值和预测匹配值。 In the above formula (2), L _pos (w, w ^* ) represents the first loss value, |ε _pos | represents the first quantity, and w, w ^* represent the actual matching value and the predicted matching value of the matching point pair, respectively.

训练装置还可以利用非匹配点对的预测匹配值和实际匹配值之间的差值，以及第二数量，确定第二损失值，参见公式(3)：The training device can also use the difference between the predicted matching value and the actual matching value of the non-matching point pair, as well as the second quantity, to determine the second loss value, see formula (3):

上述公式(3)中，L _neg(w,w ^*)表示第二损失值，|ε _neg|表示第二数量，w,w ^*分别表示非匹配点对的实际匹配值和预测匹配值；此外，非匹配点对的实际匹配值w还可以统一设置为一预设数值(例如，0)。 In the above formula (3), L _neg (w, w ^* ) represents the second loss value, |ε _neg | represents the second quantity, and w, w ^* represent the actual matching value and the predicted matching value of the non-matching point pair, respectively; in addition, , the actual matching value w of the non-matching point pair can also be uniformly set to a preset value (for example, 0).

步骤S14：利用损失值，调整匹配预测模型的参数。Step S14: Using the loss value, adjust the parameters of the matching prediction model.

在本公开实施例中，训练装置可以采用随机梯度下降(Stochastic Gradient Descent，SGD)、批量梯度下降(Batch Gradient Descent，BGD)、小批量梯度下降(Mini-Batch Gradient Descent，MBGD)等方式，利用损失值对匹配预测模型的参数进行调整；其中，批量梯度下降是指在每一次迭代时，使用所有样本来进行参数更新；随机梯度下降是指在每一次迭代时，使用一个样本来进行参数更新；小批量梯度下降是指在每一次迭代时，使用一批样本来进行参数更新，在此不再赘述。In the embodiment of the present disclosure, the training device may adopt methods such as Stochastic Gradient Descent (SGD), Batch Gradient Descent (BGD), Mini-Batch Gradient Descent (MBGD), etc. The loss value adjusts the parameters of the matching prediction model; among them, batch gradient descent refers to using all samples for parameter update in each iteration; stochastic gradient descent refers to using one sample for parameter update in each iteration ; Mini-batch gradient descent refers to using a batch of samples to update parameters in each iteration, which will not be repeated here.

在一个实施场景中，还可以设置一训练结束条件，当满足训练结束条件时，训练装置可以结束对匹配预测模型的训练。其中，训练结束条件可以包括：损失值小于一预设损失阈值，且损失值不再减小；当前训练次数达到预设次数阈值(例如，500次、1000次等)，在此不做限定。In an implementation scenario, a training end condition may also be set, and when the training end condition is satisfied, the training device may end the training of the matching prediction model. The training end condition may include: the loss value is less than a preset loss threshold, and the loss value is no longer reduced; the current training times reaches a preset times threshold (eg, 500 times, 1000 times, etc.), which is not limited here.

上述方案，通过利用样本图像和地图数据构建得到样本匹配数据，且样本匹配数据包括若干组点对以及每组点对的实际匹配值，每组点对的两个点分别来自样本图像和地图数据，从而利用匹配预测模型对若干组点对进行预测处理，得到点对的预测匹配值，进而利用实际匹配值和预测匹配值，确定匹配预测模型的损失值，利用损失值来对匹配预测模型的参数进行调整，故能够利用匹配预测模型建立匹配关系，从而能够在视觉定位中利用匹配预测模型预测点对之间的匹配值，因而能够基于预测得到的匹配值优先采样高匹配值的点对，进而能够有利于提高视觉定位的准确性和即时性。In the above scheme, the sample matching data is obtained by constructing the sample image and the map data, and the sample matching data includes several groups of point pairs and the actual matching value of each group of point pairs, and the two points of each group of point pairs come from the sample image and the map data respectively. , so that the matching prediction model is used to perform prediction processing on several groups of point pairs, and the predicted matching value of the point pair is obtained, and then the actual matching value and the predicted matching value are used to determine the loss value of the matching prediction model, and the loss value is used to determine the matching prediction model. By adjusting the parameters, the matching prediction model can be used to establish a matching relationship, so that the matching prediction model can be used to predict the matching value between point pairs in visual positioning, so the point pair with high matching value can be preferentially sampled based on the predicted matching value. This can help to improve the accuracy and immediacy of visual positioning.

请参阅图3，图3是图1中步骤S11一实施例的流程示意图。其中，训练装置可以通过如下步骤构建得到样本匹配数据：Please refer to FIG. 3 , which is a schematic flowchart of an embodiment of step S11 in FIG. 1 . Among them, the training device can construct the sample matching data through the following steps:

步骤S111：从样本图像中获取若干图像点，以及从地图数据中获取若干地图点，以组成若干组点对。Step S111: Acquire several image points from the sample image, and acquire several map points from the map data to form several groups of point pairs.

若干组点对包括至少一组所包含的图像点和地图点之间匹配的匹配点对；也就是说，若干组点对中至少包含一组所包含的图像点和地图点对应于空间中同一点的匹配点对。以样本图像为二维图像，地图数据为通SFM重建方式得到的稀疏点云模型为例，若干组点对中至少包含1个三角化的点以及该三角化的点对应于稀疏点云模型中的三维点。Several sets of point pairs include at least one set of matched point pairs that match between the included image points and map points; that is, at least one set of the included image points and map points in the several sets of point pairs corresponds to the same in space. A matching pair of dots. Taking the sample image as a two-dimensional image and the map data as a sparse point cloud model obtained by SFM reconstruction as an example, several groups of point pairs contain at least one triangulated point and the triangulated point corresponds to the sparse point cloud model. three-dimensional point.

在一个实施场景中，若干组点对中还可以包括至少一组所包含的图像点与地图点之间不匹配的非匹配点对，也就是说，若干组点对中还可以包括至少一组所包含的图像点和地图点对应于空间中不同点的非匹配点对。仍以样本图像为二维图像，地图数据为通SFM重建方式得到的稀疏点云模型为例，若干组点对中还可以包括未三角化的点以及稀疏点云模型中的任一点，以构成一组非匹配点对，从而能够在样本匹配数据中加入噪声，进而能够提高匹配预测模型的鲁棒性。In an implementation scenario, the several groups of point pairs may further include at least one group of non-matching point pairs that do not match between the included image points and map points, that is, the several groups of point pairs may further include at least one group of non-matching point pairs. The included image points and map points correspond to unmatched pairs of points at different points in space. Still taking the sample image as a two-dimensional image and the map data as a sparse point cloud model obtained by SFM reconstruction as an example, several groups of point pairs can also include untriangulated points and any point in the sparse point cloud model to form a sparse point cloud model. A set of non-matching point pairs can add noise to the sample matching data, thereby improving the robustness of the matching prediction model.

在一个实施场景中，请结合参阅图4，图4是图3中步骤S111一实施例的流程示意图。其中，训练装置可以通过如下步骤，得到若干组点对：In an implementation scenario, please refer to FIG. 4 , which is a schematic flowchart of an embodiment of step S111 in FIG. 3 . Wherein, the training device can obtain several sets of point pairs through the following steps:

步骤S41：将样本图像中的图像点划分为第一图像点和第二图像点。Step S41: Divide the image points in the sample image into a first image point and a second image point.

在本公开实施例中，第一图像点在地图数据中存在与其匹配的地图点，第二图像点在地图数据中不存在与其匹配的地图点。仍以样本图像为二维图像，地图数据为通SFM重建方式得到的稀疏点云模型为例，第一图像点可以为样本图像中三角化的特征点，第二图像点可以为样本图像中未三角化的特征点；在其他应用场景中，可以以此类推，在此不做限定。In the embodiment of the present disclosure, the first image point has a matching map point in the map data, and the second image point does not have a matching map point in the map data. Still taking the sample image as a two-dimensional image and the map data as a sparse point cloud model obtained by SFM reconstruction as an example, the first image point can be a triangulated feature point in the sample image, and the second image point can be a non-triangulated point in the sample image. Triangulated feature points; in other application scenarios, it can be deduced by analogy, which is not limited here.

在一个实施场景中，样本图像中的图像点为样本图像的特征点。在另一个实施场景中，还可以将特征点的坐标转换到归一化平面上。In one implementation scenario, the image points in the sample image are feature points of the sample image. In another implementation scenario, the coordinates of the feature points can also be converted to a normalized plane.

步骤S42：对于每一第一图像点，从地图数据中分配若干第一地图点，并分别将第一图像点与每一第一地图点作为一第一点对，其中，第一地图点中包含与第一图像点匹配的地图点。Step S42: For each first image point, assign a number of first map points from the map data, and use the first image point and each first map point as a first point pair, wherein, among the first map points Contains map points that match the first image point.

对于每一第一图像点，从地图数据中分配若干第一地图点，并分别将第一图像点与每一第一地图点作为一第一点对，且第一地图点中包含与第一图像点匹配的地图点。在一个实施场景中，对于每一第一图像点分配的第一地图点的数量可以相同，也可以不同。在另一个实施场景中，在分配第一地图点之前，还可以从划分得到的第一图像点中随机抽取若干第一图像点，并对抽取得到的第一图像点，执行从地图数据中分配若干第一地图点，并分别将第一图像点与每一第一地图点作为一第一点对的步骤，在此不做限定。在一个实施场景中，可以从划分得到的第一图像点中随机抽取N个点，并对抽取得到的N个第一图像点中的每一个第一图像点，从地图数据中随机分配K个第一地图点，且随机分配的K个第一地图点中包含与第一图像点匹配的地图点。For each first image point, a number of first map points are allocated from the map data, and the first image point and each first map point are respectively regarded as a first point pair, and the first map point includes a The map point to which the image point matches. In an implementation scenario, the number of first map points allocated to each first image point may be the same or different. In another implementation scenario, before assigning the first map points, a number of first image points may be randomly selected from the first image points obtained by division, and the first image points obtained by extraction may be assigned from map data. The steps of using a plurality of first map points and respectively using the first image point and each first map point as a first point pair are not limited herein. In an implementation scenario, N points may be randomly selected from the first image points obtained by division, and for each of the N first image points obtained by the extraction, K points are randomly allocated from the map data The first map point, and the randomly assigned K first map points include map points that match the first image point.

步骤S43：对于每一第二图像点，从地图数据中分配若干第二地图点，并分别将第二图像点与每一第二地图点作为一第二点对。Step S43: For each second image point, assign a number of second map points from the map data, and use the second image point and each second map point as a second point pair.

对于每一第二图像点，从地图数据中分配若干第二地图点，并分别将第二图像点与每一第二地图点作为一第二点对。在一个实施场景中，对于每一第二图像点分配的第二地图点的数量可以相同，也可以不同。在另一个实施场景中，在分配第二地图点之前，还可以从划分得到的第二图像点中随机抽取若干第二图像点，并对抽取得到的第二图像点，执行从地图数据中分配若干第二地图点，并分别将第二图像点与每一第二地图点作为一第二点对的步骤，在此不做限定。在一个实施场景中，可以从划分得到的第二图像点中随机抽取M个点，并对抽取得到的M个第二图像点中的每一个第二图像点，从地图数据中随机分配K个第二地图点。For each second image point, a number of second map points are allocated from the map data, and the second image point and each second map point are respectively regarded as a second point pair. In an implementation scenario, the number of second map points allocated to each second image point may be the same or different. In another implementation scenario, before assigning the second map points, a number of second image points may be randomly selected from the second image points obtained by division, and the second image points obtained by extraction may be assigned from map data. The steps of using a plurality of second map points and using the second image point and each second map point as a second point pair are not limited herein. In an implementation scenario, M points may be randomly selected from the divided second image points, and K points may be randomly allocated from the map data for each of the M second image points obtained by the extraction Second map point.

在一个实施场景中，为了便于明确每一第一点对和每一第二点对是否为匹配点对，还可以遍历每一第一点对和每一第二点对，并利用第一标识符(例如，1)对匹配点对进行标记，利用第二标识符(例如，0)对非匹配点对进行标记。In an implementation scenario, in order to clarify whether each first point pair and each second point pair are matching point pairs, it is also possible to traverse each first point pair and each second point pair, and use the first identifier An identifier (eg, 1) marks matching point pairs, and a second identifier (eg, 0) marks non-matching point pairs.

上述步骤S42和步骤S43可以按照先后顺序执行，例如，先执行步骤S42，后执行步骤S43；或者，先执行步骤S43，后执行步骤S42；此外，上述步骤S42和步骤S43也可以同步执行，在此不做限定。The above-mentioned steps S42 and S43 can be performed in sequence, for example, step S42 is performed first, and then step S43 is performed; or, step S43 is performed first, and then step S42 is performed; This is not limited.

步骤S44：从第一点对和第二点对中抽取得到若干组点对。Step S44: Extracting several groups of point pairs from the first point pair and the second point pair.

在本公开实施例中，可以从第一点对和第二点对中随机抽取，得到若干组点对，作为一样本匹配数据。在一个实施场景中，还可以对第一点对和第二点对随机抽取若干次，从而得到若干个样本匹配数据。在另一个实施场景中，还可以获取多个样本图像和地图数据，并对每一样本图像和地图数据，重复执行上述步骤，得到多个样本匹配数据，从而能够提高样本数量，有利于提高匹配预测模型的准确性。In the embodiment of the present disclosure, several groups of point pairs may be obtained by randomly extracting from the first point pair and the second point pair, as a sample matching data. In an implementation scenario, the first point pair and the second point pair may also be randomly selected several times to obtain several sample matching data. In another implementation scenario, multiple sample images and map data can also be obtained, and the above steps can be repeatedly performed for each sample image and map data to obtain multiple sample matching data, which can increase the number of samples and help improve matching. Predictive model accuracy.

步骤S112：对于每组匹配点对：利用样本图像的位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点；并基于图像点和投影点之间的差异，确定匹配点对的实际匹配值。Step S112: for each set of matching point pairs: using the pose parameters of the sample image to project the map point into the dimension to which the sample image belongs to obtain the projected point of the map point; and determine the matching based on the difference between the image point and the projected point The actual match value of the point pair.

对于每组匹配点对，可以利用与其对应的样本图像的位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点。仍以样本图像为二维图像，地图数据为通过SFM 重建方式得到的稀疏点云模型为例，训练装置可以利用位姿参数将三维点重投影，从而得到其投影点。For each set of matched point pairs, the map point can be projected into the dimension to which the sample image belongs by using the pose parameters of the corresponding sample image to obtain the projected point of the map point. Taking the sample image as a two-dimensional image and the map data as a sparse point cloud model obtained by SFM reconstruction as an example, the training device can use the pose parameters to reproject the three-dimensional points to obtain their projected points.

在一个实施场景中，可以利用预设概率分布函数将图像点和其投影点之间的差异转换为概率密度值，作为匹配点对的实际匹配值。在一个实施场景中，预设概率分布函数可以是标准高斯分布函数，从而能够将取值范围在负无穷至正无穷之间的差异转换为对应的概率密度值，且差异的绝对值越大，对应的概率密度值越小，相应表示点对的匹配程度越低，差异的绝对值越小，对应的概率密度值越小，相应表示点对的匹配程度越高，当差异的绝对值为0时，其对应的概率密度值最大。In one implementation scenario, a preset probability distribution function can be used to convert the difference between the image point and its projected point into a probability density value, which is used as the actual matching value of the matching point pair. In an implementation scenario, the preset probability distribution function may be a standard Gaussian distribution function, so that the difference in the value range from negative infinity to positive infinity can be converted into a corresponding probability density value, and the greater the absolute value of the difference, The smaller the corresponding probability density value, the lower the matching degree of the corresponding point pair, the smaller the absolute value of the difference, the smaller the corresponding probability density value, the higher the matching degree of the corresponding point pair, when the absolute value of the difference is 0 , the corresponding probability density value is the largest.

在一个实施场景中，在利用位姿参数将地图点投影至样本图像所属的维度之前，训练装置还可以基于匹配点对，计算样本图像的位姿参数；这里，可以采用BA(Bundle Adjustment)计算位姿参数，从而利用位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点。In an implementation scenario, before using the pose parameters to project the map points to the dimension to which the sample images belong, the training device can also calculate the pose parameters of the sample images based on the matched point pairs; here, BA (Bundle Adjustment) can be used to calculate The pose parameter is used to project the map point into the dimension to which the sample image belongs to obtain the projected point of the map point.

在一个实施场景中，还可以将非匹配点对的实际匹配值设置为预设数值，例如，将非匹配点对的实际匹配值设置为0。In an implementation scenario, the actual matching value of the non-matching point pair may also be set to a preset value, for example, the actual matching value of the non-matching point pair is set to 0.

区别于前述实施例，通过从样本图像中获取若干图像点，以及从地图数据中获取若干地图点，以组成若干组点对，且若干组点对中包括至少一组所包含的图像点和地图点之间匹配的匹配点对，故能够生成用于训练匹配预测模型的样本，并对于每组匹配点对，利用样本图像的位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点，从而基于图像点和投影点之间的差异，确定匹配点对的实际匹配值，故能够使匹配预测模型在训练过程中学习到匹配点对几何特征，有利于提高匹配预测模型的准确性。Different from the foregoing embodiments, several sets of point pairs are formed by acquiring several image points from the sample image and several map points from the map data, and the several sets of point pairs include at least one set of the image points and the map. The matching point pairs matched between the points can generate samples for training the matching prediction model, and for each set of matching point pairs, the map points are projected to the dimension of the sample image by using the pose parameters of the sample image to obtain a map. The actual matching value of the matching point pair is determined based on the difference between the image point and the projected point, so the matching prediction model can learn the geometric features of the matching point pair during the training process, which is conducive to improving the matching prediction model. accuracy.

请参阅图5，图5是本公开视觉定位方法一实施例的流程示意图。视觉定位方法可以包括如下步骤：Please refer to FIG. 5 , which is a schematic flowchart of an embodiment of the visual positioning method of the present disclosure. The visual positioning method may include the following steps:

步骤S51：利用待定位图像和地图数据，构建待识别匹配数据。Step S51: Using the image to be located and the map data to construct matching data to be identified.

在本公开实施例中，待识别匹配数据包括若干组点对，每组点对的两个点分别来自待定位图像和地图数据。其中，待定位图像和地图数据所属的维度可以为2维或3维，在此不做限定。例如，待定位图像可以为二维图像，或者，待定位图像还可以还是RGB-D图像，在此不做限定；地图数据可以为单纯的二维图像组成，也可以是由三维点云地图组成，也可以是二维图像和三维点云的结合，在此不做限定。In the embodiment of the present disclosure, the matching data to be identified includes several groups of point pairs, and the two points of each group of point pairs respectively come from the image to be located and the map data. The dimension to which the image to be located and the map data belong may be 2-dimensional or 3-dimensional, which is not limited herein. For example, the image to be positioned may be a two-dimensional image, or the image to be positioned may also be an RGB-D image, which is not limited here; the map data may be composed of a simple two-dimensional image, or may be composed of a three-dimensional point cloud map , or a combination of a two-dimensional image and a three-dimensional point cloud, which is not limited here.

步骤S52：利用匹配预测模型对若干组点对进行预测处理，得到点对的预测匹配值。Step S52: Use the matching prediction model to perform prediction processing on several groups of point pairs to obtain the predicted matching values of the point pairs.

匹配预测模型为预先通过样本匹配数据训练得到的神经网络模型。在一个实施场景中，匹配预测模型可以是通过前述任一匹配预测模型的训练方法实施例中的步骤训练得到的，其中，训练步骤可以参考前述实施例中的步骤，在此不再赘述。The matching prediction model is a neural network model trained in advance through sample matching data. In an implementation scenario, the matching prediction model may be obtained by training through the steps in any of the foregoing embodiments of the matching prediction model training method, wherein the training steps may refer to the steps in the foregoing embodiments, which will not be repeated here.

通过利用匹配预测模型对若干组点对进行预测处理，可以得到待识别匹配数据中点对的预测匹配值。在一个实施场景中，待识别匹配数据为二分图，二分图中包括若干组点对和连接每组点对的连接边，匹配预测模型包括与待定位图像所属的维度对应的第一点特征提取子模型、与地图数据所属的维度对应的第二点特征提取子模型，以及边特征提取子模型，从而可以利用第一点特征提取子模型和第二点特征提取子模型对二分图进行特征提取，得到第一特征和第二特征，并利用边特征提取子模型对第一特征和第二特征进行特征提取，得到第三特征，进而利用第三特征，得到连接边对应的点对的预测匹配值。这里，可以参阅前述实施例中的步骤，在此不再赘述。By using the matching prediction model to perform prediction processing on several groups of point pairs, the predicted matching values of the point pairs in the matching data to be identified can be obtained. In an implementation scenario, the matching data to be identified is a bipartite graph, and the bipartite graph includes several groups of point pairs and connecting edges connecting each group of point pairs, and the matching prediction model includes a first point feature extraction corresponding to the dimension to which the image to be located belongs. A sub-model, a second point feature extraction sub-model corresponding to the dimension to which the map data belongs, and an edge feature extraction sub-model, so that the first point feature extraction sub-model and the second point feature extraction sub-model can be used to perform feature extraction on the bipartite graph , obtain the first feature and the second feature, and use the edge feature extraction sub-model to perform feature extraction on the first feature and the second feature to obtain the third feature, and then use the third feature to obtain the predicted matching of the point pair corresponding to the connecting edge value. Here, reference may be made to the steps in the foregoing embodiments, which will not be repeated here.

步骤S53：基于点对的预测匹配值，确定待定位图像的摄像器件的位姿参数。Step S53: Determine the pose parameters of the imaging device of the image to be positioned based on the predicted matching value of the point pair.

通过待识别匹配数据中点对的预测匹配值，可以优先利用预测匹配值比较高的点对，确定待定位图像的摄像器件的位姿参数。在一个实施场景中，可以利用预测匹配值比较高的n个点对，构建PnP(Perspective-n-Point)问题，从而采用诸如EPnP(Efficient PnP)等方式对PnP问题进行求解，进而得到待定位图像的摄像器件的位姿参数。在另一个实施场景中，还可以将若干组点对按照预测匹配值从高到低的顺序进行排序，并利用前预设数量组点对，确定待定位图像的摄像器件的位姿参数。其中，前预设数量可以根据实际情况进行设置，例如，将排序后的若干组点对中预测匹配值不为0的点对作为前预设数量组点对；或者，还可以将排序后的若组点对中预测匹配值大于一下限值的点对作为前预设数量组点对；对于前预设数量，可以根据实际应用而设置，在此不做限定。这里，还可以采用诸如PROSAC(PROgressive SAmple Consensus，渐进一致采样)的方式，对排序后的点对进行处理，得到待定位图像的摄像器件的位姿参数。在一个实施场景中，待定位图像的摄像器件的位姿参数可以包括摄像器件在地图数据所属的地图坐标系中的6个自由度(Degree of freedom,DoF)，包括：位姿(pose)，即坐标，以及环绕x轴的偏转yaw(俯仰角)、绕y轴的偏转pitch(偏航角)、绕z轴的偏转roll(翻滚角)。According to the predicted matching value of the point pair in the matching data to be identified, the point pair with a relatively high predicted matching value can be preferentially used to determine the pose parameters of the imaging device of the image to be positioned. In an implementation scenario, the PnP (Perspective-n-Point) problem can be constructed by using n point pairs with relatively high predicted matching values, so as to solve the PnP problem by means such as EPnP (Efficient PnP), and then obtain the to-be-located problem. The pose parameters of the camera device of the image. In another implementation scenario, several sets of point pairs may also be sorted in descending order of predicted matching values, and the previously preset number of sets of point pairs may be used to determine the pose parameters of the camera device of the image to be positioned. The first preset number can be set according to the actual situation, for example, point pairs whose predicted matching value is not 0 among the sorted groups of point pairs are used as the first preset number of point pairs; If the predicted matching value in the group point pair is greater than the lower limit value, the point pair is used as the first preset number of group point pairs; the first preset number can be set according to the actual application, which is not limited here. Here, a method such as PROSAC (PROgressive SAmple Consensus, Progressive Consistent Sampling) can also be used to process the sorted point pairs to obtain the pose parameters of the camera device of the image to be positioned. In an implementation scenario, the pose parameters of the camera device of the image to be positioned may include 6 degrees of freedom (DoF) of the camera device in the map coordinate system to which the map data belongs, including: pose, That is, the coordinates, and the deflection yaw (pitch angle) around the x-axis, the deflection pitch (yaw angle) around the y-axis, and the deflection roll (roll angle) around the z-axis.

上述方案，通过利用待定位图像和地图数据，构建待识别匹配数据，且待识别匹配数据包括若干组点对，每组点对的两个点分别来自待定位图像和地图数据，从而利用匹配预测模型对若干组点对进行预测处理，得到点对的预测匹配值，进而基于点对的预测匹配值，确定待定位图像的摄像器件的位姿参数，故能够在视觉定位中利用匹配预测模型预测点对之间的匹配值而建立匹配关系，能够有利于提高视觉定位的准确性和即时性。In the above scheme, the matching data to be identified is constructed by using the image to be located and the map data, and the matching data to be identified includes several groups of point pairs, and the two points of each group of point pairs are respectively from the image to be located and the map data, so as to use the matching prediction The model performs prediction processing on several groups of point pairs to obtain the predicted matching value of the point pair, and then determines the pose parameters of the camera device of the image to be positioned based on the predicted matching value of the point pair, so the matching prediction model can be used in visual positioning to predict Establishing a matching relationship based on matching values between point pairs can help improve the accuracy and immediacy of visual positioning.

请参阅图6，图6是本公开匹配预测模型的训练装置60一实施例的框架示意图。匹配预测模型的训练装置60包括样本构建部分61、预测处理部分62、损失确定部分63和参数调整部分64，样本构建部分61配置为利用样本图像和地图数据，构建样本匹配数据，其中，样本匹配数据包括若干组点对以及每组点对的实际匹配值，每组点对的两个点分别来自样本图像和地图数据；预测处理部分62配置为利用匹配预测模型对若干组点对进行预测处理，得到点对的预测匹配值；损失确定部分63配置为利用实际匹配值和预测匹配值，确定匹配预测模型的损失值；参数调整部分64配置为利用损失值，调整匹配预测模型的参数。Please refer to FIG. 6 , which is a schematic diagram of a framework of an embodiment of a training apparatus 60 for matching prediction models of the present disclosure. The training device 60 for matching the prediction model includes a sample construction part 61, a prediction processing part 62, a loss determination part 63 and a parameter adjustment part 64, and the sample construction part 61 is configured to use the sample image and map data to construct sample matching data, wherein the sample matching The data includes several groups of point pairs and the actual matching value of each group of point pairs, and the two points of each group of point pairs are respectively from the sample image and the map data; the prediction processing part 62 is configured to use the matching prediction model to perform prediction processing on the several groups of point pairs , to obtain the predicted matching value of the point pair; the loss determination part 63 is configured to use the actual matching value and the predicted matching value to determine the loss value of the matching prediction model; the parameter adjustment part 64 is configured to use the loss value to adjust the parameters of the matching prediction model.

上述方案，能够利用匹配预测模型建立匹配关系，从而能够在视觉定位中利用匹配预测模型预测点对之间的匹配值，因而能够基于预测得到的匹配值优先采样高匹配值的点对，进而能够有利于提高视觉定位的准确性和即时性。The above solution can use the matching prediction model to establish a matching relationship, so that the matching prediction model can be used to predict the matching value between the point pairs in the visual positioning, so the point pair with high matching value can be preferentially sampled based on the predicted matching value, and then can It is beneficial to improve the accuracy and immediacy of visual positioning.

在一些实施例中，样本构建部分61包括点对获取子部分，配置为从样本图像中获取若干图像点，以及从地图数据中获取若干地图点，以组成若干组点对；其中，若干组点对包括至少一组所包含的图像点和地图点之间匹配的匹配点对，样本构建部分61包括第一匹配值确定子部分，配置为对于每组匹配点对：利用样本图像的位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点；并基于图像点和投影点之间的差异，确定匹配点对的实际匹配值。In some embodiments, the sample construction section 61 includes a point pair acquisition subsection configured to acquire several image points from the sample image and several map points from the map data to form several sets of point pairs; wherein the several sets of points The pair includes at least one set of matching point pairs that match between the included image points and map points, and the sample construction section 61 includes a first matching value determination subsection configured to, for each set of matching point pairs: use the pose parameters of the sample image Project the map point into the dimension to which the sample image belongs to obtain the projected point of the map point; and determine the actual matching value of the matching point pair based on the difference between the image point and the projected point.

区别于前述实施例，通过从样本图像中获取若干图像点，以及从地图数据中获取若干地图点，以组成若干组点对，且若干组点对中包括至少一组所包含的图像点和地图点之间匹配的匹配点对，故能够生成用于训练匹配预测模型的样本，并对于每组匹配点对，利用样本图像的位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点，从而基于图像点和投影点之间的差异，确定匹配点对的实际匹配值，故能够使匹配预测模型在训练过程中学习到匹配点对几何特征，有利于提高匹配预测模型的准确性。Different from the foregoing embodiments, several groups of point pairs are formed by acquiring several image points from the sample image and several map points from the map data, and the several groups of point pairs include at least one group of the image points and the map. The matching point pairs matched between the points can generate samples for training the matching prediction model, and for each group of matching point pairs, the map points are projected to the dimension of the sample image by using the pose parameters of the sample image to obtain a map. The actual matching value of the matching point pair is determined based on the difference between the image point and the projected point, so the matching prediction model can learn the geometric features of the matching point pair during the training process, which is beneficial to improve the matching prediction model. accuracy.

在一些实施例中，若干组点对包括至少一组所包含的图像点和地图点之间不匹配的非匹配点对，样本构建部分61包括第二匹配值确定子部分，配置为将非匹配点对的实际匹配值设置为预设数值。In some embodiments, the sets of point pairs include at least one set of non-matching point pairs that do not match between the included image points and map points, and the sample construction section 61 includes a second matching value determination subsection configured to convert the non-matching points The actual match value of the point pair is set to the preset value.

区别于前述实施例，若干组点对包括至少一组所包含的图像点和地图点之间不匹配的非匹配点对，并区别于匹配点对，将非匹配点对的实际匹配值设置预设数值，从而能够有利于提高匹配预测模型的鲁棒性。Different from the foregoing embodiments, several groups of point pairs include at least one group of non-matching point pairs that do not match between the included image points and map points, and different from the matching point pairs, the actual matching value of the non-matching point pairs is set to a preset value. Set the value, which can help to improve the robustness of the matching prediction model.

在一些实施例中，点对获取子部分包括图像点划分部分，配置为将样本图像中的图像点划分为第一图像点和第二图像点，其中，第一图像点在地图数据中存在与其匹配的地图点，第二图像点在地图数据中不存在与其匹配的地图点，点对获取子部分包括第一点对获取部分，配置为对于每一第一图像点，从地图数据中分配若干第一地图点，并分别将第一图像点与每一第一地图点作为一第一点对，其中，第一地图点中包含与第一图像点匹配的地图点，点对获取子部分包括第二点对获取部分，配置为对于每一第二图像点，从地图数据中分配若干第二地图点，并分别将第二图像点与每一第二地图点作为一第二点对，点对获取子部分包括点对抽取部分，配置为从第一点对和第二点对中抽取得到若干组点对。In some embodiments, the point pair acquisition subsection includes an image point division section configured to divide the image points in the sample image into a first image point and a second image point, wherein the first image point exists in the map data with The matching map point, the second image point does not have a matching map point in the map data, the point pair acquisition subsection includes a first point pair acquisition section, and is configured to allocate a number of points from the map data for each first image point. The first map point, and the first image point and each first map point are respectively regarded as a first point pair, wherein the first map point includes a map point matching the first image point, and the point pair acquisition subsection includes The second point pair acquisition part is configured to allocate a number of second map points from the map data for each second image point, and respectively use the second image point and each second map point as a second point pair, point The pair acquisition subsection includes a point pair extraction section configured to extract several groups of point pairs from the first point pair and the second point pair.

区别于前述实施例，通过将样本图像中的图像点划分为第一图像点和第二图像点，且第一图像点在地图中存在与其匹配的地图点，第二图像点在图像数据中不存在与其匹配的图像点，并对第一图像点，从地图数据中分配若干第一地图点，分别将第一图像点与每一第一地图点作为一第一点对，且第一地图点中包含与第一图像点匹配的地图点，而对于每一第二图像点，从地图数据中分配若干第二地图点，分别将第二图像点与每一第二地图点作为一第二点对，并从第一点对和第二点对中抽取得到若干组点对，从而能够构建得到数量丰富且包含非匹配点对和匹配点对的若干组点对，以用于训练匹配预测模型，故能够有利于提高匹配预测模型的准确性。Different from the foregoing embodiments, the image points in the sample image are divided into first image points and second image points, and the first image point has a matching map point in the map, and the second image point does not exist in the image data. There is an image point that matches it, and for the first image point, a number of first map points are allocated from the map data, and the first image point and each first map point are respectively regarded as a first point pair, and the first map point contains map points matching the first image point, and for each second image point, assigns a number of second map points from the map data, and takes the second image point and each second map point as a second point respectively pair, and extract several groups of point pairs from the first point pair and the second point pair, so that several groups of point pairs that are abundant and include non-matching point pairs and matching point pairs can be constructed to be used for training matching prediction models. , so it can help to improve the accuracy of the matching prediction model.

在一些实施例中，第一匹配值确定子部分包括位姿计算部分，配置为基于匹配点对，计算样本图像的位姿参数，第一匹配值确定子部分包括投影部分，配置为利用位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点。In some embodiments, the first matching value determination subsection includes a pose calculation section configured to calculate pose parameters of the sample image based on the matching point pairs, and the first matching value determination subsection includes a projection section configured to utilize the pose The parameter projects the map point into the dimension to which the sample image belongs to obtain the projected point of the map point.

区别于前述实施例，通过利用匹配点对，计算样本图像的位姿参数，并利用位姿参数将地图点投影至样本图像所属的维度中，得到地图点的投影点，从而能够有利于提高投影点与图像点之间差异的准确性，进而能够有利于提高匹配预测模型的准确性。Different from the previous embodiment, by using the matching point pair to calculate the pose parameters of the sample image, and using the pose parameters to project the map points into the dimension to which the sample image belongs, the projection points of the map points are obtained, which can help to improve the projection. The accuracy of the difference between the point and the image point can be beneficial to improve the accuracy of the matching prediction model.

在一些实施例中，第一匹配值确定子部分包括概率密度转换部分，配置为利用预设概率分布函数将差异转换为概率密度值，作为匹配点对的实际匹配值In some embodiments, the first matching value determination subsection includes a probability density conversion section configured to convert the difference into a probability density value using a preset probability distribution function as the actual matching value of the matching point pair

区别于前述实施例，通过利用预设概率分布函数将差异转换为概率密度值，作为匹配点对的实际匹配值，故能够有利于准确地描述投影点与图像点之间的差异，从而能够有利于提高匹配预测模型的准确性。Different from the foregoing embodiments, by using a preset probability distribution function to convert the difference into a probability density value as the actual matching value of the matching point pair, it is beneficial to accurately describe the difference between the projection point and the image point, so that there is a It is beneficial to improve the accuracy of the matching prediction model.

在一些实施例中，样本匹配数据为二分图，二分图包括若干组点对和连接每组点对的连接边，且连接边标注有对应点对的实际匹配值；匹配预测模型包括与样本图像所属的维度对应的第一点特征提取子模型、与地图数据所属的维度对应的第二点特征提取子模型以及边特征提取子模型，预测处理部分62包括点特征提取子部分，配置为分别利用第一点特征提取子模型和第二点特征提取子模型对二分图进行特征提取，得到第一特征和第二特征，预测处理部分62包括边特征提取子部分，配置为利用边特征提取子模型对第一特征和第二特征进行特征提取，得到第三特征，预测处理部分62包括预测子部分，配置为利用第三特征，得到连接边对应的点对的预测匹配值。In some embodiments, the sample matching data is a bipartite graph, and the bipartite graph includes several groups of point pairs and connecting edges connecting each group of point pairs, and the connecting edges are marked with the actual matching values of the corresponding point pairs; the matching prediction model includes matching with the sample image. The first point feature extraction sub-model corresponding to the dimension to which the map data belongs, the second point feature extraction sub-model and the edge feature extraction sub-model corresponding to the dimension to which the map data belongs, the prediction processing part 62 includes a point feature extraction sub-part, which is configured to use the The first point feature extraction sub-model and the second point feature extraction sub-model perform feature extraction on the bipartite graph to obtain the first feature and the second feature. The prediction processing part 62 includes an edge feature extraction sub-part, which is configured to extract the sub-model by using the edge feature. Feature extraction is performed on the first feature and the second feature to obtain a third feature. The prediction processing part 62 includes a prediction sub-part configured to use the third feature to obtain the predicted matching value of the point pair corresponding to the connecting edge.

区别于前述实施例，通过对二分图分别进行点特征抽取以及边特征抽取，从而能够使匹配预测模型更加有效地感知匹配的空间几何结构，进而能够有利于提高匹配预测模型的准确性。Different from the foregoing embodiments, by performing point feature extraction and edge feature extraction on the bipartite graph respectively, the matching prediction model can more effectively perceive the spatial geometric structure of the matching, thereby improving the accuracy of the matching prediction model.

在一些实施例中，第一点特征提取子模型和第二点特征提取子模型的结构为以下任一种：包括至少一个残差块，包括至少一个残差块和至少一个空间变换网络；和/或，边特征提取子模型包括至少一个残差块。In some embodiments, the structure of the first point feature extraction sub-model and the second point feature extraction sub-model is any of the following: including at least one residual block, including at least one residual block and at least one spatial transformation network; and /or, the edge feature extraction sub-model includes at least one residual block.

区别于前述实施例，通过将第一点特征提取子模型和第二点特征提取子模型的结构设置为以下任一者：包括至少一个残差块，包括至少一个残差块和至少一个空间变换网络，且将边特征提取子模型设置为包括至少一个残差块，故能够有利于匹配预测模型的优化，并提高匹配预测模型的准确性。Different from the foregoing embodiments, the structures of the first point feature extraction sub-model and the second point feature extraction sub-model are set to any one of the following: including at least one residual block, including at least one residual block and at least one spatial transformation network, and the edge feature extraction sub-model is set to include at least one residual block, so it can facilitate the optimization of the matching prediction model and improve the accuracy of the matching prediction model.

在一些实施例中，若干组点对包括至少一组所包含的图像点和地图点之间匹配的匹配点对和至少一组所包含的图像点和地图点之间不匹配的非匹配点对，损失确定部分63包括第一损失确定子部分，配置为利用匹配点对的预测匹配值和实际匹配值，确定匹配预测模型的第一损失值，损失确定部分63包括第二损失确定子部分，配置为利用非匹配点对的预测匹配值和实际匹配值，确定匹配预测模型的第二损失值，损失确定部分63包括损失加权子部分，配置为对第一损失值和第二损失值进行加权处理，得到匹配预测模型的损失值。In some embodiments, the sets of point pairs include at least one set of matched point pairs that match between the included image points and map points and at least one set of non-matched point pairs that do not match between the included image points and map points , the loss determination section 63 includes a first loss determination subsection configured to use the predicted matching value and the actual matching value of the matching point pair to determine a first loss value matching the prediction model, the loss determination section 63 includes a second loss determination subsection, is configured to use the predicted matching value and the actual matching value of the non-matching point pair to determine the second loss value of the matching prediction model, the loss determination section 63 includes a loss weighting subsection configured to weight the first loss value and the second loss value Process to get the loss value that matches the prediction model.

区别于前述实施例，通过利用匹配点对的预测匹配值和实际匹配值，确定匹配预测模型的第一损失值，并利用非匹配点对的预测匹配值和实际损失值，确定匹配预测模型的第二损失值，从而对第一损失值和第二损失值进行加权处理，得到匹配预测模型的损失值，故能够有利于使匹配预测模型有效感知匹配的空间几何结构，从而提高匹配预测模型的准确性。Different from the previous embodiment, the first loss value of the matching prediction model is determined by using the predicted matching value and the actual matching value of the matching point pair, and the predicted matching value and the actual loss value of the non-matching point pair are used to determine the matching prediction model. The second loss value, so that the first loss value and the second loss value are weighted to obtain the loss value of the matching prediction model, so it can help the matching prediction model to effectively perceive the matching spatial geometry, thereby improving the matching prediction model. accuracy.

在一些实施例中，损失确定部分63还包括数量统计子部分，配置为分别统计匹配点对的第一数量，以及非匹配点对的第二数量；第一损失确定子部分，配置为利用匹配点对的预测匹配值和实际匹配值之间的差值，以及第一数量，确定第一损失值；第二损失确定子部分具，配置为利用非匹配点对的预测匹配值和实际匹配值之间的差值，以及第二数量，确定第二损失值。In some embodiments, the loss determination part 63 further includes a quantity statistics subsection, configured to count the first number of matched point pairs and the second number of non-matched point pairs, respectively; the first loss determination subsection is configured to use matching The difference between the predicted matching value and the actual matching value of the point pair, and the first number, determine the first loss value; the second loss determining subsection is configured to use the predicted matching value and the actual matching value of the non-matching point pair The difference between , and the second quantity, determines the second loss value.

区别于前述实施例，通过统计匹配点对的第一数量，以及非匹配点对的第二数量，从而利用匹配点对的预测匹配值和实际匹配值之间的差值，以及第一数量，确定第一损失值，并利用非匹配点对的预测匹配值和实际匹配值之间的差异，以及第二数量，确定第二损失值，能够有利于提高匹配预测模型的损失值的准确性，从而能够有利于提高匹配预测模型的准确性。Different from the foregoing embodiments, by counting the first number of matching point pairs and the second number of non-matching point pairs, the difference between the predicted matching value and the actual matching value of the matching point pair, and the first number, are used, Determining the first loss value, and using the difference between the predicted matching value and the actual matching value of the unmatched point pair, and the second quantity, determining the second loss value can help to improve the accuracy of the loss value of the matching prediction model, Thus, the accuracy of the matching prediction model can be improved.

在一些实施例中，样本图像所属的维度为2维或3维，地图数据所属的维度为2维或3维。In some embodiments, the dimension to which the sample image belongs is 2D or 3D, and the dimension to which the map data belongs is 2D or 3D.

区别于前述实施例，通过设置样本图像和地图数据所属的维度，能够训练得到用于2维-2维的匹配预测模型，或者能够训练得到用于2维-3维的匹配预测模型，或者能够训练得到用于3维-3维的匹配预测模型，从而能够提高匹配预测模型的适用范围。Different from the foregoing embodiments, by setting the dimension to which the sample image and map data belong, a matching prediction model for 2-2D can be trained, or a matching prediction model for 2-3D can be trained, or a matching prediction model can be trained. The matching prediction model for 3D-3D is obtained by training, so that the applicable range of the matching prediction model can be improved.

请参阅图7，图7是本公开视觉定位装置70一实施例的框架示意图。视觉定位装置70包括数据构建部分71、预测处理部分72和参数确定部分73，数据构建部分71配置为利用待定位图像和地图数据，构建待识别匹配数据，其中，待识别匹配数据包括若干组点对，每组点对的两个点分别来自待定位图像和地图数据；预测处理部分72配置为利用匹配预测模型对若干组点对进行预测处理，得到点对的预测匹配值；参数确定部分73配置为基于点对的预测匹配值，确定待定位图像的摄像器件的位姿参数。Please refer to FIG. 7 , which is a schematic diagram of a frame of an embodiment of a visual positioning device 70 of the present disclosure. The visual positioning device 70 includes a data construction part 71, a prediction processing part 72 and a parameter determination part 73. The data construction part 71 is configured to use the image to be located and the map data to construct matching data to be identified, wherein the matching data to be identified includes several groups of points Right, the two points of each group of point pairs are respectively from the image to be located and the map data; the prediction processing part 72 is configured to use the matching prediction model to perform prediction processing on several groups of point pairs to obtain the predicted matching values of the point pairs; the parameter determination part 73 It is configured to determine the pose parameters of the imaging device of the image to be positioned based on the predicted matching value of the point pair.

上述方案，能够利用匹配预测模型建立匹配关系，从而能够在视觉定位中利用匹配预测模型预测点对之间的匹配值而建立匹配关系，能够有利于提高视觉定位的准确性和即时性。In the above solution, the matching prediction model can be used to establish the matching relationship, so that the matching prediction model can be used to predict the matching value between the point pairs in the visual positioning to establish the matching relationship, which can help to improve the accuracy and immediacy of the visual positioning.

在一些实施例中，参数确定部分73包括点对排序子部分，配置为将若干组点对按照预测匹配值从高到低的顺序进行排序，参数确定部分73还包括参数确定子部分，配置为利用前预设数量组点对，确定待定位图像的摄像器件的位姿参数。In some embodiments, the parameter determination section 73 includes a point pair sorting subsection configured to sort several groups of point pairs in descending order of predicted matching values, and the parameter determination section 73 further includes a parameter determination subsection configured as Using the previously preset number of point pairs, the pose parameters of the imaging device of the image to be positioned are determined.

在本公开实施例以及其他的实施例中，“部分”可以是部分电路、部分处理器、部分程序或软件等等，当然也可以是单元，还可以是模块也可以是非模块化的。In the embodiments of the present disclosure and other embodiments, a "part" may be a part of a circuit, a part of a processor, a part of a program or software, etc., of course, a unit, a module or a non-modularity.

区别于前述实施例，通过将若干组点对按照预测匹配值从高到低的顺序进行排序，并利用前预设数量组点对，确定待定位图像的摄像器件的位姿参数，从而能够有利于利用排序后的点对进行增量式采样，优先采样匹配值高的点对，故能够通过几何先验引导位姿参数的求解，从而能够提高视觉定位的准确性和即时性。Different from the foregoing embodiments, by sorting several groups of point pairs in descending order of predicted matching values, and using the previously preset number of groups of point pairs to determine the pose parameters of the camera device of the image to be positioned, it is possible to have It is beneficial to use the sorted point pairs for incremental sampling, and preferentially sample the point pairs with high matching values, so the solution of pose parameters can be guided by geometric prior, thereby improving the accuracy and immediacy of visual positioning.

在一些实施例中，匹配预测模型是利用上述任一匹配预测模型的训练装置实施例中的匹配预测模型的训练装置训练得到的。In some embodiments, the matching prediction model is obtained by training the matching prediction model training device in any of the above-mentioned embodiments of the matching prediction model training device.

区别于前述实施例，通过上述任一匹配预测模型的训练装置实施例中的匹配预测模型的训练装置得到的匹配预测模型进行视觉定位，能够提高视觉定位的准确性和即时性。Different from the foregoing embodiments, performing visual positioning through the matching prediction model obtained by the training device for matching prediction models in any of the above-mentioned embodiments of the training device for matching prediction models can improve the accuracy and immediacy of visual positioning.

请参阅图8，图8是本公开电子设备80一实施例的框架示意图。电子设备80包括相互耦接的存储器81和处理器82，处理器82用于执行存储器81中存储的程序指令，以实现上述任一匹配预测模型的训练方法实施例中的步骤，或实现上述任一视觉定位方法实施例中的步骤。在一个实施场景中，电子设备80可以包括但不限于：手机、匹配电脑等移动设备，在此不做限定。Please refer to FIG. 8 , which is a schematic diagram of a framework of an embodiment of an electronic device 80 of the present disclosure. The electronic device 80 includes a mutually coupled memory 81 and a processor 82, and the processor 82 is configured to execute program instructions stored in the memory 81 to implement the steps in any of the above-mentioned embodiments of the training method for matching the prediction model, or to implement any of the above-mentioned training methods. Steps in an embodiment of a visual positioning method. In an implementation scenario, the electronic device 80 may include, but is not limited to, mobile devices such as a mobile phone and a matching computer, which are not limited herein.

其中，处理器82用于控制其自身以及存储器81以实现上述任一匹配预测模型的训练方法实施例中的步骤，或实现上述任一视觉定位方法实施例中的步骤。处理器82还可以称为CPU(Central Processing Unit，中央处理单元)。处理器82可能是一种集成电路芯片，具有信号的处理能力。处理器82还可以是通用处理器、数字信号处理器(Digital Signal Processor,DSP)、专用集成电路(Application Specific Integrated Circuit,ASIC)、现场可编程门阵列(Field-Programmable Gate Array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。另外，处理器82可以由集成电路芯片共同实现。The processor 82 is configured to control itself and the memory 81 to implement the steps in any of the above-mentioned embodiments of the matching prediction model training method, or to implement the steps in any of the above-mentioned embodiments of the visual positioning method. The processor 82 may also be referred to as a CPU (Central Processing Unit, central processing unit). The processor 82 may be an integrated circuit chip with signal processing capability. The processor 82 may also be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field-Programmable Gate Array, FPGA) or other Programmable logic devices, discrete gate or transistor logic devices, discrete hardware components. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. In addition, the processor 82 may be jointly implemented by an integrated circuit chip.

上述方案，能够利用匹配预测模型建立匹配关系，从而能够在视觉定位中利用匹配预测模型预测点对之间的匹配值，因而能够基于预测得到的匹配值优先采样高匹配值的点对，进而能够有利于提高视觉定位的准确性和即时性。The above solution can use the matching prediction model to establish a matching relationship, so that the matching prediction model can be used in the visual positioning to predict the matching value between the point pairs, so the point pair with the high matching value can be preferentially sampled based on the predicted matching value, and then the matching value can be sampled. It is beneficial to improve the accuracy and immediacy of visual positioning.

请参阅图9，图9为本公开计算机可读存储介质90一实施例的框架示意图。计算机可读存储介质90存储有能够被处理器运行的程序指令901，程序指令901用于实现上述任一匹配预测模型的训练方法实施例中的步骤，或实现上述任一视觉定位方法实施例中的步骤。Please refer to FIG. 9 , which is a schematic diagram of a framework of an embodiment of the disclosed computer-readable storage medium 90 . The computer-readable storage medium 90 stores program instructions 901 that can be executed by the processor, and the program instructions 901 are used to implement the steps in any of the above-mentioned embodiments of the training method for matching prediction models, or to implement any of the above-mentioned embodiments of the visual positioning method. A step of.

在本公开所提供的几个实施例中，应该理解到，所揭露的方法和装置，可以通过其它的方式实现。例如，以上所描述的装置实施方式仅仅是示意性的，例如，模块或单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，例如单元或组件可以结合或者可以集成到另一个***，或一些特征可以忽略，或不执行。另一点，所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口，装置或单元的间接耦合或通信连接，可以是电性、机械或其它的形式。In the several embodiments provided in the present disclosure, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the device implementations described above are only illustrative. For example, the division of modules or units is only a logical function division. In actual implementation, there may be other divisions. For example, units or components may be combined or integrated. to another system, or some features can be ignored, or not implemented. On the other hand, the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, which may be in electrical, mechanical or other forms.

作为分离部件说明的单元可以是或者也可以不是物理上分开的，作为单元显示的部件可以是或者也可以不是物理单元，即可以位于一个地方，或者也可以分布到网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施方式方案的目的。Units described as separate components may or may not be physically separated, and components shown as units may or may not be physical units, that is, may be located in one place, or may be distributed over network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this implementation manner.

另外，在本公开各个实施例中的各功能单元可以集成在一个处理单元中，也可以是各个单元单独物理存在，也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现，也可以采用软件功能单元的形式实现。In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.

集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时，可以存储在一个计算机可读取存储介质中。基于这样的理解，本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来，该计算机软件产品存储在一个存储介质中，包括若干指令用以使得一台计算机设备(可以是个人计算机，服务器，或者网络设备等)或处理器(processor)执行本公开各个实施方式方法的全部或部分步骤。而前述的存储介质包括：U盘、移动硬盘、只读存储器(ROM，Read-Only Memory)、随机存取存储器(RAM，Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。The integrated unit, if implemented as a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the present disclosure can be embodied in the form of software products in essence, or the part that contributes to the prior art, or all or part of the technical solutions, and the computer software product is stored in a storage medium , including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor (processor) to execute all or part of the steps of the methods of the various embodiments of the present disclosure. The aforementioned storage medium includes: U disk, mobile hard disk, Read-Only Memory (ROM, Read-Only Memory), Random Access Memory (RAM, Random Access Memory), magnetic disk or optical disk and other media that can store program codes .

工业实用性Industrial Applicability

本公开实施例中，能够利用匹配预测模型建立匹配关系，从而能够在视觉定位中利用匹配预测模型预测点对之间的匹配值，因而能够基于预测得到的匹配值优先采样高匹配值的点对，而建立匹配关系，进而能够有利于提高视觉定位的准确性和即时性。In the embodiment of the present disclosure, the matching prediction model can be used to establish a matching relationship, so that the matching prediction model can be used to predict the matching value between point pairs in visual positioning, so that the point pair with high matching value can be preferentially sampled based on the predicted matching value. , and establish a matching relationship, which can help to improve the accuracy and immediacy of visual positioning.

Claims

一种匹配预测模型的训练方法，包括：A training method for matching predictive models, including:

利用样本图像和地图数据，构建样本匹配数据，其中，所述样本匹配数据包括若干组点对以及每组点对的实际匹配值，每组点对的两个点分别来自所述样本图像和所述地图数据；Using the sample image and map data, construct sample matching data, wherein the sample matching data includes several groups of point pairs and the actual matching value of each group of point pairs, and the two points of each group of point pairs come from the sample image and the the map data;

利用匹配预测模型对所述若干组点对进行预测处理，得到所述点对的预测匹配值；Use a matching prediction model to perform prediction processing on the several groups of point pairs to obtain the predicted matching values of the point pairs;

利用所述实际匹配值和所述预测匹配值，确定所述匹配预测模型的损失值；Using the actual matching value and the predicted matching value, determining the loss value of the matching prediction model;

利用所述损失值，调整所述匹配预测模型的参数。Using the loss value, the parameters of the matching prediction model are adjusted.
根据权利要求1所述的训练方法，其中，所述利用样本图像和地图数据，构建样本匹配数据包括：The training method according to claim 1, wherein, using sample images and map data to construct sample matching data comprises:

从所述样本图像中获取若干图像点，以及从所述地图数据中获取若干地图点，以组成若干组点对；其中，所述若干组点对包括至少一组所包含的图像点和地图点之间匹配的匹配点对；Several image points are obtained from the sample image, and several map points are obtained from the map data to form several sets of point pairs; wherein the several sets of point pairs include at least one set of included image points and map points Matching point pairs that match between;

对于每组所述匹配点对：利用所述样本图像的位姿参数将所述地图点投影至所述样本图像所属的维度中，得到所述地图点的投影点；并基于所述图像点和所述投影点之间的差异，确定所述匹配点对的实际匹配值。For each set of matched point pairs: project the map point into the dimension to which the sample image belongs by using the pose parameters of the sample image to obtain the projected point of the map point; and based on the image point and The difference between the projected points determines the actual matching value of the matching point pair.
根据权利要求2所述的训练方法，其中，所述若干组点对包括至少一组所包含的图像点和地图点之间不匹配的非匹配点对，所述利用样本图像和地图数据，构建样本匹配数据还包括：The training method according to claim 2, wherein the several groups of point pairs include at least one group of non-matching point pairs that do not match between the included image points and map points, and the sample image and map data are used to construct the Sample match data also includes:

将所述非匹配点对的实际匹配值设置为预设数值。The actual matching value of the non-matching point pair is set as a preset value.
根据权利要求2或3所述的训练方法，其中，所述从所述样本图像中获取若干图像点，以及从所述地图数据中获取若干地图点，以组成若干组点对，包括：The training method according to claim 2 or 3, wherein the acquiring several image points from the sample image and acquiring several map points from the map data to form several groups of point pairs, including:

将所述样本图像中的图像点划分为第一图像点和第二图像点，其中，所述第一图像点在所述地图数据中存在与其匹配的所述地图点，所述第二图像点在所述地图数据中不存在与其匹配的所述地图点；Dividing the image points in the sample image into a first image point and a second image point, wherein the first image point has the map point matching it in the map data, and the second image point there is no matching said map point in said map data;

对于每一所述第一图像点，从所述地图数据中分配若干第一地图点，并分别将所述第一图像点与每一所述第一地图点作为一第一点对，其中，所述第一地图点中包含与所述第一图像点匹配的所述地图点；以及，For each of the first image points, a number of first map points are allocated from the map data, and the first image point and each of the first map points are respectively used as a first point pair, wherein, The first map point includes the map point that matches the first image point; and,

对于每一所述第二图像点，从所述地图数据中分配若干第二地图点，并分别将所述第二图像点与每一所述第二地图点作为一第二点对；for each of the second image points, assigning a plurality of second map points from the map data, and respectively using the second image point and each of the second map points as a second point pair;

从所述第一点对和所述第二点对中抽取得到若干组点对。Several sets of point pairs are extracted from the first point pair and the second point pair.
根据权利要求2至4任一项所述的训练方法，其中，所述利用所述样本图像的位姿参数将所述地图点投影至所述样本图像所属的维度中，得到所述地图点的投影点包括：The training method according to any one of claims 2 to 4, wherein the map point is projected into the dimension to which the sample image belongs by using the pose parameters of the sample image to obtain the map point's Projection points include:

基于所述匹配点对，计算所述样本图像的位姿参数；Based on the matched point pair, calculate the pose parameter of the sample image;

利用所述位姿参数将所述地图点投影至所述样本图像所属的维度中，得到所述地图点的投影点；Using the pose parameter to project the map point into the dimension to which the sample image belongs, to obtain the projected point of the map point;

和/或，所述基于所述图像点和所述投影点之间的差异，确定所述匹配点对的实际匹配值包括：And/or, determining the actual matching value of the matching point pair based on the difference between the image point and the projection point includes:

利用预设概率分布函数将所述差异转换为概率密度值，作为所述匹配点对的实际匹配值。The difference is converted into a probability density value using a preset probability distribution function as the actual matching value of the matching point pair.
根据权利要求1至5任一项所述的训练方法，其中，所述样本匹配数据为二分图，所述二分图包括若干组点对和连接每组点对的连接边，且所述连接边标注有对应所述点对的实际匹配值；所述匹配预测模型包括与所述样本图像所属的维度对应的第一点特征提取子模型、与所述地图数据所属的维度对应的第二点特征提取子模型以及边特征提取子模型；The training method according to any one of claims 1 to 5, wherein the sample matching data is a bipartite graph, and the bipartite graph includes several groups of point pairs and connecting edges connecting each group of point pairs, and the connecting edges The actual matching value corresponding to the point pair is marked; the matching prediction model includes a first point feature extraction sub-model corresponding to the dimension to which the sample image belongs, and a second point feature corresponding to the dimension to which the map data belongs Extract sub-models and edge feature extraction sub-models;

所述利用匹配预测模型对所述若干组点对进行预测处理，得到所述点对的预测匹配值包括：The performing prediction processing on the several groups of point pairs by using the matching prediction model, and obtaining the predicted matching values of the point pairs includes:

分别利用所述第一点特征提取子模型和所述第二点特征提取子模型对所述二分图进行特征提取，得到第一特征和第二特征；Using the first point feature extraction sub-model and the second point feature extraction sub-model to perform feature extraction on the bipartite graph, respectively, to obtain the first feature and the second feature;

利用所述边特征提取子模型对所述第一特征和所述第二特征进行特征提取，得到第三特征；Using the edge feature extraction sub-model to perform feature extraction on the first feature and the second feature to obtain a third feature;

利用所述第三特征，得到所述连接边对应的点对的预测匹配值。Using the third feature, the predicted matching value of the point pair corresponding to the connecting edge is obtained.
根据权利要求6所述的训练方法，其中，所述第一点特征提取子模型和所述第二点特征提取子模型的结构为以下任一种：包括至少一个残差块，包括至少一个残差块和至少一个空间变换网络；The training method according to claim 6, wherein the structure of the first point feature extraction sub-model and the second point feature extraction sub-model is any of the following: including at least one residual block, including at least one residual block difference blocks and at least one spatial transformation network;

和/或，所述边特征提取子模型包括至少一个残差块。And/or, the edge feature extraction sub-model includes at least one residual block.
根据权利要求1至7任一项所述的训练方法，其中，所述若干组点对包括至少一组所包含的图像点和地图点之间匹配的匹配点对和至少一组所包含的图像点和地图点之间不匹配的非匹配点对；The training method according to any one of claims 1 to 7, wherein the several sets of point pairs include at least one set of matched point pairs matched between the included image points and map points and at least one set of included images Non-matching point pairs that do not match between points and map points;

所述利用所述实际匹配值和所述预测匹配值，确定所述匹配预测模型的损失值包括：The determining the loss value of the matching prediction model by using the actual matching value and the predicted matching value includes:

利用所述匹配点对的所述预测匹配值和所述实际匹配值，确定所述匹配预测模型的第一损失值；Using the predicted matching value and the actual matching value of the matching point pair to determine a first loss value of the matching prediction model;

并利用所述非匹配点对的所述预测匹配值和所述实际匹配值，确定所述匹配预测模型的第二损失值；and using the predicted matching value and the actual matching value of the non-matching point pair to determine the second loss value of the matching prediction model;

对所述第一损失值和所述第二损失值进行加权处理，得到所述匹配预测模型的损失值。The first loss value and the second loss value are weighted to obtain the loss value of the matching prediction model.
根据权利要求8所述的训练方法，其中，所述利用所述匹配点对的所述预测匹配值和所述实际匹配值，确定所述匹配预测模型的第一损失值之前，所述方法还包括：The training method according to claim 8, wherein before the first loss value of the matching prediction model is determined by using the predicted matching value and the actual matching value of the matching point pair, the method further include:

分别统计所述匹配点对的第一数量，以及所述非匹配点对的第二数量；respectively count the first number of the matching point pairs and the second number of the non-matching point pairs;

所述利用所述匹配点对的所述预测匹配值和所述实际匹配值，确定所述匹配预测模型的第一损失值包括：The determining the first loss value of the matching prediction model by using the predicted matching value and the actual matching value of the matching point pair includes:

利用所述匹配点对的所述预测匹配值和所述实际匹配值之间的差值，以及所述第一数量，确定所述第一损失值；Using the difference between the predicted matching value and the actual matching value of the matching point pair, and the first number, determining the first loss value;

所述利用所述非匹配点对的所述预测匹配值和所述实际匹配值，确定所述匹配预测模型的第二损失值包括：Using the predicted matching value and the actual matching value of the non-matching point pair to determine the second loss value of the matching prediction model includes:

利用所述非匹配点对的所述预测匹配值和所述实际匹配值之间的差值，以及所述第二数量，确定所述第二损失值。The second loss value is determined using the difference between the predicted match value and the actual match value for the pair of non-matching points, and the second number.
根据权利要求1至9任一项所述的训练方法，其中，所述样本图像所属的维度为2维或3维，所述地图数据所属的维度为2维或3维。The training method according to any one of claims 1 to 9, wherein the dimension to which the sample image belongs is 2-dimensional or 3-dimensional, and the dimension to which the map data belongs is 2-dimensional or 3-dimensional.
一种视觉定位方法，包括：A visual positioning method comprising:

利用待定位图像和地图数据，构建待识别匹配数据，其中，所述待识别匹配数据包括若干组点对，每组点对的两个点分别来自所述待定位图像和所述地图数据；Using the to-be-located image and the map data, the to-be-identified matching data is constructed, wherein the to-be-identified matching data includes several groups of point pairs, and the two points of each group of point pairs are respectively from the to-be-located image and the map data;

利用匹配预测模型对所述若干组点对进行预测处理，得到所述点对的预测匹配值；Use a matching prediction model to perform prediction processing on the several groups of point pairs to obtain the predicted matching values of the point pairs;

基于所述点对的预测匹配值，确定所述待定位图像的摄像器件的位姿参数。Based on the predicted matching value of the point pair, the pose parameter of the imaging device of the image to be positioned is determined.
根据权利要求11所述的视觉定位方法，其中，所述基于所述点对的预测匹配值，确定所述待定位图像的摄像器件的位姿参数，包括：The visual positioning method according to claim 11, wherein the determining the pose parameters of the camera device of the image to be positioned based on the predicted matching value of the point pair comprises:

将所述若干组点对按照所述预测匹配值从高到低的顺序进行排序；sorting the groups of point pairs in descending order of the predicted matching values;

利用前预设数量组所述点对，确定所述待定位图像的摄像器件的位姿参数。The pose parameters of the imaging device of the to-be-located image are determined by using the previously preset number of sets of the point pairs.
根据权利要求11或12所述的视觉定位方法，其中，所述匹配预测模型是利用权利要求1至10任一项所述的匹配预测模型的训练方法得到的。The visual positioning method according to claim 11 or 12, wherein the matching prediction model is obtained by using the training method of the matching prediction model according to any one of claims 1 to 10.
一种匹配预测模型的训练装置，包括：A training device for matching a prediction model, comprising:

样本构建模块，用于利用样本图像和地图数据，构建样本匹配数据，其中，所述样本匹配数据包括若干组点对以及每组点对的实际匹配值，每组点对的两个点分别来自所述样本图像和所述地图数据；The sample building module is used to construct sample matching data by using sample images and map data, wherein the sample matching data includes several groups of point pairs and the actual matching value of each group of point pairs, and the two points of each group of point pairs come from the sample image and the map data;

预测处理部分，配置为利用匹配预测模型对所述若干组点对进行预测处理，得到所述点对的预测匹配值；a prediction processing part, configured to perform prediction processing on the several groups of point pairs by using a matching prediction model to obtain the predicted matching values of the point pairs;

损失确定部分，配置为利用所述实际匹配值和所述预测匹配值，确定所述匹配预测模型的损失值；a loss determination part configured to use the actual matching value and the predicted matching value to determine the loss value of the matching prediction model;

参数调整部分，配置为利用所述损失值，调整所述匹配预测模型的参数。The parameter adjustment part is configured to use the loss value to adjust the parameters of the matching prediction model.
一种视觉定位装置，其中，A visual positioning device, wherein,

待定位图像和地图数据，构建待识别匹配数据，其中，所述待识别匹配数据包括若干组点对，每组点对的两个点分别来自所述待定位图像和所述地图数据；The to-be-located image and map data, and the to-be-identified matching data is constructed, wherein the to-be-identified matching data includes several sets of point pairs, and the two points of each set of point pairs are respectively from the to-be-located image and the map data;

预测处理部分，配置为利用匹配预测模型对所述若干组点对进行预测处理，得到所述点对的预测匹配值；a prediction processing part, configured to perform prediction processing on the several groups of point pairs by using a matching prediction model to obtain the predicted matching values of the point pairs;

参数确定部分，配置为基于所述点对的预测匹配值，确定所述待定位图像的摄像器件的位姿参数。The parameter determination part is configured to determine the pose parameter of the imaging device of the to-be-positioned image based on the predicted matching value of the point pair.
一种电子设备，包括相互耦接的存储器和处理器，所述处理器用于执行所述存储器中存储的程序指令，以实现权利要求1至10任一项所述的匹配预测模型的训练方法，或权利要求11至13任一项所述的视觉定位方法。An electronic device, comprising a mutually coupled memory and a processor, the processor is configured to execute program instructions stored in the memory to implement the training method for a matching prediction model according to any one of claims 1 to 10, Or the visual positioning method according to any one of claims 11 to 13.
一种计算机可读存储介质，其上存储有程序指令，所述程序指令被处理器执行时实现权利要求1至10任一项所述的匹配预测模型的训练方法，或权利要求11至13任一项所述的视觉定位方法。A computer-readable storage medium on which program instructions are stored, and when the program instructions are executed by a processor, the training method of the matching prediction model according to any one of claims 1 to 10, or any one of claims 11 to 13 is realized. A described visual localization method.
一种计算机程序，包括计算机可读代码，在所述计算机可读代码在电子设备中运行，被所述电子设备中的处理器执行的情况下，实现权利要求1至10任一项所述的匹配预测模型的训练方法，或权利要求11至13任一项所述的视觉定位方法。A computer program, comprising computer-readable codes, in the case that the computer-readable codes are executed in an electronic device and executed by a processor in the electronic device, to implement the method described in any one of claims 1 to 10 The training method of the matching prediction model, or the visual positioning method according to any one of claims 11 to 13.