WO2017128934A1 - Method, server, terminal and system for implementing augmented reality - Google Patents

Method, server, terminal and system for implementing augmented reality

Info

Publication number
WO2017128934A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
environment scene
pose
frame image
scene
Prior art date
Application number
PCT/CN2017/070138
Other languages
French (fr)
Chinese (zh)
Inventor
柳寅秋
Original Assignee
成都理想境界科技有限公司
Priority date
Filing date
Publication date
Application filed by 成都理想境界科技有限公司 filed Critical 成都理想境界科技有限公司
Publication of WO2017128934A1 publication Critical patent/WO2017128934A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis

Definitions

  • the present invention relates to the field of artificial intelligence and computer vision technology, and in particular to a method, a server, a terminal and a system for realizing augmented reality.
  • Simultaneous Localization and Mapping (SLAM) is a technique that combines autonomous map creation with self-localization in an unknown environment.
  • this technology is a research hotspot in the field of autonomous navigation; its goal is to solve the problem of how a device that has entered an unknown environment perceives its surroundings and constructs an incremental map.
  • the technology is currently mainly applied in technical directions such as unmanned driving, robotics and scene reconstruction.
  • Augmented Reality (AR) uses computer graphics and visualization technology to generate virtual objects that do not exist in the real environment, accurately integrates these virtual objects into the real environment through image recognition and positioning technology, and presents the result to the user with a realistic sensory experience.
  • the primary technical problem to be solved by augmented reality technology is how to accurately integrate virtual objects into the real world, that is, to make a virtual object appear at the correct position in the real scene with the correct angle, so that the user has a strong sense of visual realism.
  • a common existing augmented reality scheme is based on a planar template.
  • this scheme determines the display position of the virtual object by training on a planar template rich in texture features.
  • however, the scheme applies poorly to non-planar objects and to objects with few texture features; therefore, how to accurately determine the position of a virtual object in different types of real scenes, and to realize the superposition and fusion of virtual objects in a real scene, is one of the technical problems to be solved in the development of augmented reality technology.
  • the object of the present invention is to provide a method, a server, a terminal and a system for realizing augmented reality, which combine augmented reality technology with simultaneous localization and mapping, and realize the fused display of a virtual object in a real scene through offline scene map construction and online initialization.
  • an aspect of the present invention provides a method for implementing augmented reality, comprising the steps of: acquiring an image of an environment scene in real time; extracting image features from the environment scene image; performing initial positioning of the environment scene according to the image features to establish a local map of the environment scene; tracking image frames in the environment scene image; and displaying the virtual object in the current frame image of the environment scene image according to the pose of the virtual object to be displayed.
  • before the step of acquiring an image of the environment scene in real time, the method further includes: acquiring a video image of the environment scene; performing three-dimensional scene reconstruction of the environment scene according to the environment scene video image; constructing an image retrieval database of the environment scene according to the environment scene video image; and presetting the pose of the virtual object to be displayed in the environment scene according to the environment scene video image.
  • the step of performing three-dimensional scene reconstruction of the environment scene according to the environment scene video image is specifically: performing image feature extraction on the image frames of the environment scene video image; performing inter-frame image registration of the environment scene video image according to the extracted image features, and calculating the relative pose between image frames; selecting an offline initial frame image in the environment scene video image, establishing a world coordinate system of the environment scene according to the offline initial frame image, and determining the coordinates of the 3D points in the environment scene; determining the offline key frame images in the environment scene video image according to the offline initial frame image; and constructing a pose map according to the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image, then optimizing the pose map to update the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image.
  • the step of constructing an image retrieval database of the environment scene according to the environment scene video image is specifically: establishing a search tree or bag of words according to the image features in the offline initial frame image and the offline key frame images of the environment scene video image.
  • the method for implementing augmented reality further includes: acquiring an image retrieval database of the environment scene according to location information of the environment scene.
  • the step of performing initial positioning of the environment scene according to the image features and establishing a local map of the environment scene is specifically: parsing the image features in the current frame image, and retrieving an offline key frame image that meets a first preset condition from the image retrieval database; performing initial positioning of the environment scene according to the relative pose between the current frame image and the offline key frame image; and establishing a local map of the environment scene according to the 3D points in the current frame image that meet a second preset condition.
  • the step of performing initial positioning of the environment scene according to the image features and establishing a local map of the environment scene further comprises: adding the current frame image, the offline key frame image that meets the first preset condition, and the 3D points of the current frame image that meet the second preset condition to the local map of the environment scene.
  • the step of tracking image frames in the environment scene image is specifically: detecting, in the current frame image of the environment scene image, image features matching the previous frame image according to the previous frame image of the environment scene image; determining whether the number of matched image features is greater than or equal to a preset threshold; if the number of matched image features is greater than or equal to the preset threshold, determining the pose and 3D point coordinates of the current frame image of the environment scene image according to the pose and 3D point coordinates of the previous frame image; if the number of matched image features is less than the preset threshold, retrieving an offline key frame image matching the current frame image of the environment scene image from the image retrieval database of the environment scene, and determining the pose and 3D point coordinates of the current frame image according to the pose and 3D point coordinates of that offline key frame image.
  • the step of tracking image frames in the environment scene image further includes: determining whether the pose of the current frame image satisfies a third preset condition, and if so, adding the current frame image to the local map of the environment scene and to the image retrieval database of the environment scene; constructing a pose map according to the poses and 3D point coordinates of all image frames in the local map of the environment scene, and optimizing the pose map to update the poses and 3D point coordinates of all image frames in the local map.
  • the step of tracking image frames in the environment scene image further comprises: performing loop closure detection on the image frames added to the image retrieval database of the environment scene, and updating the image retrieval database of the environment scene if a loop closure is detected.
  • the step of displaying the virtual object in the current frame image of the environment scene image according to the pose of the virtual object to be displayed is specifically: acquiring the pose of the virtual object to be displayed, and displaying the virtual object in the current frame image of the environment scene image according to the relative pose between the current frame image of the environment scene image and the virtual object to be displayed.
  • a server for implementing augmented reality comprises: a video acquisition module, for acquiring a video image of an environment scene; a scene reconstruction module, for performing three-dimensional scene reconstruction of the environment scene according to the environment scene video image acquired by the video acquisition module; and a database construction module, for constructing an image retrieval database of the environment scene according to the environment scene video image acquired by the video acquisition module.
  • the scene reconstruction module includes: a feature extraction unit, configured to perform image feature extraction on the image frames of the environment scene video image; a pose calculation unit, configured to perform inter-frame image registration of the environment scene video image according to the image features extracted by the feature extraction unit and to calculate the relative pose between image frames; a coordinate establishing unit, configured to select an offline initial frame image in the environment scene video image, establish a world coordinate system of the environment scene according to the offline initial frame image, and determine the coordinates of the 3D points in the environment scene; a key frame selection unit, configured to determine the offline key frame images in the environment scene video image according to the offline initial frame image; and a pose map construction unit, configured to construct a pose map according to the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image in the environment scene video image, and to optimize the pose map so as to update the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image.
  • the database construction module is specifically configured to establish a search tree or bag of words according to the image features in the offline initial frame image and the offline key frame images of the environment scene video image.
  • the server for implementing augmented reality further includes a pose setting module, configured to set the pose of the virtual object to be displayed in the environment scene.
  • the server for implementing augmented reality further includes a retrieval module, configured to receive a request, sent by a terminal, for acquiring the image retrieval database of an environment scene, and to send the image retrieval database of the environment scene corresponding to the request to the terminal.
  • the invention also provides a terminal for realizing augmented reality, comprising: an image acquisition module, for collecting an image of the environment scene in real time; a feature extraction module, for extracting image features from the environment scene image collected by the image acquisition module; a map creation module, for performing initial positioning of the environment scene according to the image features extracted by the feature extraction module and establishing a local map of the environment scene; an image tracking module, for tracking the image frames in the environment scene image collected by the image acquisition module; a data acquisition module, for acquiring the pose of the virtual object to be displayed; and a display module, for displaying the virtual object in the current frame image of the environment scene image according to the pose of the virtual object to be displayed acquired by the data acquisition module.
  • the terminal that implements augmented reality further includes a positioning module, configured to determine the location information of the environment scene; the data acquisition module is further configured to acquire the image retrieval database of the environment scene according to the location information of the environment scene.
  • the map creation module includes: an image parsing unit, configured to parse the image features in the current frame image and retrieve an offline key frame image meeting the first preset condition from the image retrieval database; an initial positioning unit, configured to perform initial positioning of the environment scene according to the relative pose between the current frame image and the offline key frame image; and a map establishing unit, configured to establish a local map of the environment scene according to the 3D points in the current frame image that meet the second preset condition.
  • the image tracking module includes: a detecting unit, configured to detect, in the current frame image of the environment scene image, image features matching the previous frame image according to the previous frame image of the environment scene image; a determining unit, configured to determine whether the number of matched image features is greater than or equal to a preset threshold; and a pose calculating unit, configured to calculate the pose and 3D point coordinates of the current frame image of the environment scene image according to the pose and 3D point coordinates of the previous frame image when the determining unit determines that the number of matched image features is greater than or equal to the preset threshold.
  • the data acquisition module is further configured to retrieve, from the image retrieval database of the environment scene, an offline key frame image matching the current frame image of the environment scene image when the determining unit determines that the number of matched image features is less than the preset threshold; and the pose calculation unit is further configured to calculate the pose and 3D point coordinates of the current frame image of the environment scene image according to the pose and 3D point coordinates of the offline key frame image retrieved by the data acquisition module.
  • the pose calculation unit is further configured to calculate the relative pose between the current frame image of the environment scene image and the virtual object to be displayed; and the display module is further configured to display the virtual object in the current frame image of the environment scene image according to the relative pose, calculated by the pose calculation unit, between the current frame image of the environment scene image and the virtual object to be displayed.
  • the present invention also provides a system for realizing augmented reality, including the above-described server for realizing augmented reality, and the above-described terminal for realizing augmented reality.
  • the method, server, terminal and system for realizing augmented reality of the present invention perform image tracking according to the image features in the environment scene image, determine the relative pose between the virtual object to be displayed and the environment scene, and display the virtual object in the environment scene image.
  • the invention realizes the superimposed display of virtual objects on an environment scene or target object without a template, effectively reduces the dependence of existing augmented reality technology on planar templates, improves the accuracy of real-time matching between the virtual object and the real environment scene, and significantly enhances the coordination and consistency of virtual objects overlaid into environment scene images.
  • FIG. 1 is a flow chart of a method for realizing augmented reality according to a first embodiment of the present invention.
  • FIG. 2 is a schematic structural diagram of a server for realizing augmented reality according to a second embodiment of the present invention.
  • FIG. 3 is a schematic structural diagram of a scene reconstruction module of a server for realizing augmented reality according to a second embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a terminal for realizing augmented reality according to a third embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a map creation module of a terminal for realizing augmented reality according to a third embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of an image tracking module of a terminal implementing augmented reality according to a third embodiment of the present invention.
  • FIG. 7 is a block diagram showing the structure of a system for realizing augmented reality according to a fourth embodiment of the present invention.
  • the method for implementing augmented reality in the embodiment of the present invention mainly includes the following steps: step S101, acquiring an image of an environment scene in real time; step S102, extracting image features from the environment scene image; step S103, performing initial positioning of the environment scene according to the image features to establish a local map of the environment scene; step S104, tracking the image frames in the environment scene image; and step S105, displaying the virtual object in the current frame image of the environment scene image according to the pose of the virtual object to be displayed.
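A minimal sketch of how these five steps fit together; every function and data structure below is a hypothetical stand-in for illustration, not the patent's implementation:

```python
# Illustrative sketch only: the functions below stand in for steps S101-S105.

def acquire_frame(source):                 # S101: real-time image acquisition
    return next(source)

def extract_features(frame):               # S102: image feature extraction
    return set(frame)                      # stand-in "features"

def initialize_scene(features):            # S103: initial positioning + local map
    return {"map_points": set(features)}

def track_frame(local_map, features, threshold=3):   # S104: image-frame tracking
    return len(local_map["map_points"] & features) >= threshold

def render_virtual_object(tracked):        # S105: display the virtual object
    return "overlay drawn" if tracked else "tracking lost"

frames = iter([[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]])
local_map = initialize_scene(extract_features(acquire_frame(frames)))
tracked = track_frame(local_map, extract_features(acquire_frame(frames)))
status = render_virtual_object(tracked)
print(status)  # the two frames share 4 "features" (>= 3), so the overlay is drawn
```

A real system would replace the set intersections with descriptor matching and pose estimation, as the following sections of the description detail.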
  • real-time image acquisition of the environment scene is performed, and image features, such as ORB (Oriented FAST and Rotated BRIEF) features, are extracted from the real-time image of the environment scene.
  • initial positioning of the environment scene is performed, a local map of the environment scene is established, and the image frames of the environment scene image collected in real time are tracked to determine the pose of each image frame and the position coordinates of the 3D points in it; the virtual object is then displayed in the current frame image of the environment scene image according to the pose of the virtual object to be displayed.
  • the method further includes: acquiring a video image of the environment scene; performing a three-dimensional scene reconstruction on the environment scene according to the environment scene video image; and constructing an image of the environment scene according to the environment scene video image Retrieve the database; preset the pose of the virtual object to be displayed in the environment scene according to the video image of the environment scene.
  • the image details in the environmental scene are recorded by taking a video image of the environmental scene in advance, and the environmental scene is three-dimensionally reconstructed according to the video image of the environmental scene.
  • image feature extraction is performed on each frame image in the environment scene video image, and inter-frame image registration of the environment scene video image is performed according to the extracted image feature to determine a relative pose between the image frames.
  • inter-frame image registration obtains a set of 2D point pairs from the image features of two image frames by feature matching or a direct method, and the relative pose between the two image frames is calculated from this set of 2D point pairs by the five-point method.
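For illustration, the relative pose between two frames rests on the essential matrix estimated from matched 2D point pairs. The sketch below uses the classical eight-point algorithm in NumPy as a stand-in for the five-point method named above (a five-point solver is considerably more involved); the camera motion and 3D points are synthetic:

```python
import numpy as np

def essential_8pt(x1, x2):
    """Estimate the essential matrix from >= 8 normalized 2D point pairs
    (a classical stand-in for the five-point method named in the text)."""
    A = np.column_stack([
        x2[:, 0] * x1[:, 0], x2[:, 0] * x1[:, 1], x2[:, 0],
        x2[:, 1] * x1[:, 0], x2[:, 1] * x1[:, 1], x2[:, 1],
        x1[:, 0],            x1[:, 1],            np.ones(len(x1)),
    ])
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    U, _, Vt = np.linalg.svd(E)              # project onto the essential manifold
    E = U @ np.diag([1.0, 1.0, 0.0]) @ Vt    # (singular values s, s, 0)
    return E / np.linalg.norm(E)

# synthetic two-view check: random 3D points seen under a known motion
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (20, 3)) + np.array([0.0, 0.0, 4.0])
th = 0.1                                      # small rotation about the y axis
R = np.array([[np.cos(th), 0.0, np.sin(th)],
              [0.0,        1.0, 0.0       ],
              [-np.sin(th), 0.0, np.cos(th)]])
t = np.array([0.5, 0.1, 0.0])
x1 = X[:, :2] / X[:, 2:]                      # normalized projections in view 1
Xc2 = X @ R.T + t
x2 = Xc2[:, :2] / Xc2[:, 2:]                  # normalized projections in view 2

E = essential_8pt(x1, x2)
hom = lambda p: np.column_stack([p, np.ones(len(p))])
residual = np.abs(np.einsum('ij,jk,ik->i', hom(x2), E, hom(x1))).max()
print(f"max epipolar residual: {residual:.1e}")  # ~0 for noise-free data
```

The recovered E satisfies the epipolar constraint x2ᵀ E x1 ≈ 0 for all pairs; decomposing E into the rotation and translation (as OpenCV's pose-recovery routines do) yields the relative pose up to scale.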
  • the image frames after the offline initial frame image in the environment scene video image are parsed in sequence according to the offline initial frame image; if the recurrence rate in the current frame image of the 3D points in the offline initial frame image is less than a preset threshold, the current frame image is determined to be an offline key frame image of the environment scene video image.
  • similarly, if the recurrence rate in the current frame image of the 3D points in the previous offline key frame image is less than the preset threshold, the current frame image is determined to be the next offline key frame image of the environment scene video image.
  • this continues until the parsing of all image frames in the environment scene video image is completed, yielding all offline key frame images in the environment scene video image.
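The key-frame rule described above (a new key frame whenever the recurrence rate of the previous key frame's 3D points falls below a threshold) can be sketched as follows; representing a frame as the set of 3D-point ids visible in it, and the threshold value, are illustrative assumptions:

```python
def select_keyframes(frames, threshold=0.5):
    """frames: list of sets of 3D-point ids visible in each frame.
    The first frame is the (offline) initial frame; a frame becomes a new
    key frame when the fraction of the previous key frame's points still
    visible in it (the "recurrence rate") falls below `threshold`."""
    keyframes = [0]                      # index of the initial frame
    for i in range(1, len(frames)):
        ref = frames[keyframes[-1]]
        recurrence = len(ref & frames[i]) / len(ref) if ref else 0.0
        if recurrence < threshold:
            keyframes.append(i)
    return keyframes

# points drift out of view as the camera moves through the scene
frames = [{1, 2, 3, 4}, {1, 2, 3, 5}, {4, 5, 6, 7}, {5, 6, 7, 8}]
print(select_keyframes(frames, threshold=0.5))  # → [0, 2]
```

Frame 1 still shows 3 of the initial frame's 4 points (rate 0.75), so only frame 2, where the rate drops to 0.25, becomes the next key frame.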
  • the environment scene video image is parsed as a continuous sequence of image frames, and the image frames are registered in sequence, yielding a group of image frames, the pose of each image frame, and the coordinates of the 3D points corresponding to the 2D points in each image frame, which together complete the construction of the pose map.
  • the cumulative error is corrected by loop closure detection, and the pose map is optimized using the poses and constraints so as to correct the pose data of each image frame in the environment scene video image.
  • a search tree or bag of words is established according to the image features in the offline initial frame image and the offline key frame images of the environment scene video image.
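A bag of words reduces each image to a histogram over a vocabulary of quantized descriptors, which turns key-frame retrieval into a nearest-histogram lookup. The following is a toy sketch under strong simplifying assumptions (2-D "descriptors", a fixed three-word vocabulary, no TF-IDF weighting or search tree):

```python
import numpy as np

def bow_histogram(descriptors, vocab):
    """Quantize each descriptor to its nearest visual word and build a
    normalized word histogram (a minimal bag of words)."""
    d = np.linalg.norm(descriptors[:, None, :] - vocab[None, :, :], axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocab)).astype(float)
    return hist / (np.linalg.norm(hist) or 1.0)

def retrieve(query_desc, db, vocab):
    """Return database keys ranked by cosine similarity of BoW histograms."""
    q = bow_histogram(query_desc, vocab)
    scores = {k: float(q @ bow_histogram(v, vocab)) for k, v in db.items()}
    return sorted(scores, key=scores.get, reverse=True)

# toy 2-D "descriptors" and a three-word vocabulary
vocab = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
db = {
    "keyframe_A": np.array([[0.1, 0.2], [0.2, 0.1], [9.8, 0.1]]),
    "keyframe_B": np.array([[9.9, 0.2], [10.1, -0.1], [0.2, 9.9]]),
}
query = np.array([[0.0, 0.3], [0.1, 0.0], [10.2, 0.3]])
print(retrieve(query, db, vocab))  # keyframe_A matches the query's word pattern best
```

Production systems use a vocabulary tree learned from real binary descriptors (e.g. ORB) so that quantization and lookup stay logarithmic in vocabulary size.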
  • the image retrieval database corresponding to the environment scene is acquired according to the location information of the environment scene.
  • the image retrieval database corresponding to the environment scene may be obtained according to the location information of the environment scene, for example according to a location information label.
  • step S103 is specifically: parsing image features in the current frame image, and retrieving an offline key frame image conforming to the first preset condition in the image retrieval database; according to the current frame image and The relative pose of the offline key frame image is used to perform initial positioning of the environment scene; and the local map of the environment scene is established according to the 3D point of the current frame image that meets the second preset condition.
  • an image of the current environment scene is acquired in real time, and the image features in the current frame image, such as ORB features, are parsed.
  • image feature extraction is performed on the current frame image of the environment scene image collected in real time: the ORB features of the current frame image are extracted and dimension-reduced to obtain a retrieval tag for image retrieval, and a preliminary retrieval is performed with this retrieval tag in the image retrieval database, acquired in advance, that corresponds to the environment scene, obtaining a set of candidate offline key frame images.
  • image feature registration is then performed between the current frame image of the environment scene image and the candidate offline key frame images to determine the offline key frame image with the highest matching degree to the current frame image (i.e., the first preset condition); the relative pose between the current frame image and this best-matching offline key frame image is calculated, the world coordinate system of the environment scene is established, the initial positioning of the environment scene is completed, and the current frame image of the environment scene image is marked as the initial key frame image.
  • a local map of the environment scene is established according to the 3D points visible in the current frame image of the environment scene image (i.e., the second preset condition).
  • alternatively, a pre-established offline map of the environment scene may be obtained directly, and the current frame image of the environment scene image matched against the offline map to calculate the relative pose, establish the coordinate system of the environment scene, complete the initial positioning of the environment scene and establish the local map of the environment scene, which is not described in detail here.
  • the offline key frame image with the highest matching degree to the current frame image of the environment scene image, the initial key frame image of the environment scene image (i.e., the current frame image), and the 3D points in the initial key frame image that meet the second preset condition are added to the local map of the environment scene.
  • step S104 is specifically: detecting, in the current frame image of the environment scene image, image features matching the previous frame image according to the previous frame image of the environment scene image; determining whether the number of matched image features is greater than or equal to a preset threshold; if the number of matched image features is greater than or equal to the preset threshold, determining the pose and 3D point coordinates of the current frame image of the environment scene image according to the pose and 3D point coordinates of the previous frame image; if the number of matched image features is less than the preset threshold, retrieving an offline key frame image matching the current frame image of the environment scene image from the image retrieval database of the environment scene, and determining the pose and 3D point coordinates of the current frame image according to the pose and 3D point coordinates of that offline key frame image.
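The decision logic of this step — propagate the pose from the previous frame when enough features match, otherwise relocalize against the offline key-frame database — can be sketched as follows (all data structures and the threshold are illustrative, and pose refinement from the matches is omitted):

```python
def track(prev_frame, curr_features, database, min_matches=20):
    """Decide between frame-to-frame tracking and relocalization.
    prev_frame: {'features': set, 'pose': ...}; database: keyframe id ->
    {'features': set, 'pose': ...}. Returns (pose, mode)."""
    matches = prev_frame["features"] & curr_features
    if len(matches) >= min_matches:
        # enough matched features: propagate from the previous frame
        # (a real system would refine this pose from the matches)
        return prev_frame["pose"], "frame-to-frame"
    # too few matches: fall back to the offline key-frame database
    best = max(database, key=lambda k: len(database[k]["features"] & curr_features))
    return database[best]["pose"], "relocalized"

db = {"kf0": {"features": set(range(30)), "pose": "pose_kf0"},
      "kf1": {"features": set(range(100, 130)), "pose": "pose_kf1"}}
prev = {"features": set(range(50, 60)), "pose": "pose_prev"}
pose, mode = track(prev, set(range(100, 125)), db)
print(pose, mode)  # no overlap with the previous frame, so kf1 is retrieved
```

The threshold trades robustness against cost: a higher `min_matches` relocalizes more often but tolerates less inter-frame motion.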
  • the third preset condition means that the recurrence rate, in the current frame image M, of the 3D points in the previous key frame image P of the current frame image M is less than a preset threshold; in that case the current frame image M is a key frame image of the environment scene image. Accordingly, in the image tracking process here, each image frame of the environment scene image is analyzed in sequence.
  • starting from the initial key frame image of the environment scene image, the image frames after it are parsed; if the recurrence rate in the current frame image of the 3D points in the previous key frame image is less than the preset threshold, the current frame image is determined to be a key frame image of the environment scene image, and the current frame image is added to the local map of the environment scene and to the image retrieval database corresponding to the environment scene.
  • a pose map is then constructed according to the poses and 3D point coordinates of all key frame images in the local map of the environment scene, the pose map is optimized, and the poses and 3D point coordinates of all key frame images in the local map of the environment scene are updated.
  • loop closure detection is performed on the image frames added to the image retrieval database of the environment scene; if a loop closure is detected, the poses and 3D point coordinates of the key frame images in the image retrieval database of the environment scene are updated to correct the cumulative error.
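Correcting cumulative error after a loop closure is, in full, a pose-map optimization over poses and constraints; as a minimal illustration under heavy simplifying assumptions, the sketch below distributes the closure error linearly along a 1-D pose chain:

```python
def correct_loop(poses, closure_value):
    """Distribute accumulated drift linearly along a pose chain once a loop
    closure fixes the true pose of the last frame. 1-D poses for brevity;
    a real system would run nonlinear least squares on the full pose map."""
    drift = poses[-1] - closure_value
    n = len(poses) - 1
    return [p - drift * i / n for i, p in enumerate(poses)]

# odometry drifts to 0.5 although the camera returned to its start (pose 0.0)
odometry = [0.0, 1.1, 2.1, 1.0, 0.5]
corrected = correct_loop(odometry, closure_value=0.0)
print(corrected)  # the last pose now coincides with the loop-closure constraint
```

Intermediate poses absorb a proportional share of the drift, which is the intuition behind the graph optimization the description refers to.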
  • step S105 is specifically: acquiring the preset pose of the virtual object to be displayed, calculating the relative pose between the current frame image of the environment scene image and the virtual object to be displayed according to the pose of the current frame image of the environment scene image, and displaying the virtual object to be displayed in the current frame image of the environment scene image according to this relative pose.
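Step S105 reduces to composing transforms and projecting: given the camera pose of the current frame and the preset object pose, both expressed in the world coordinate system established at initialization, the object is drawn at the projected pixel. A sketch with toy poses and an assumed pinhole intrinsic matrix (all numbers illustrative):

```python
import numpy as np

def se3(R, t):
    """4x4 homogeneous transform from a rotation matrix and a translation."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# toy poses in the world coordinate system established at initialization
T_world_cam = se3(np.eye(3), np.array([0.0, 0.0, -2.0]))   # current frame's camera
T_world_obj = se3(np.eye(3), np.array([0.5, 0.0, 1.0]))    # virtual object

# relative pose of the object with respect to the current frame's camera
T_cam_obj = np.linalg.inv(T_world_cam) @ T_world_obj

# project the object's origin with an assumed pinhole model (fx=fy=500, cx=cy=320)
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 320.0],
              [0.0,   0.0,   1.0]])
p_cam = T_cam_obj[:3, 3]                  # object origin in camera coordinates
uv = (K @ p_cam) / p_cam[2]
print(uv[:2])  # pixel at which the virtual object's origin is drawn
```

Because the object sits to the camera's right, the projected pixel lands right of the principal point; rendering the full model repeats this projection for every vertex.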
  • the server 200 implementing the augmented reality embodiment of the present invention includes a video acquisition module 201, a scene reconstruction module 202, and a database construction module 203.
  • the video obtaining module 201 is configured to acquire a video image of an environment scene.
  • the scene reconstruction module 202 is configured to perform three-dimensional scene reconstruction on the environment scene according to the environment scene video image acquired by the video acquisition module 201.
  • the database construction module 203 is configured to construct an image retrieval database of the environment scene according to the environment scene video image acquired by the video acquisition module 201.
  • the video acquisition module 201 captures a video image of the environment scene, or acquires a pre-captured one, recording the image details of the environment scene.
  • the scene reconstruction module 202 performs three-dimensional reconstruction on the environment scene according to the video image of the environment scene acquired by the video acquisition module 201.
  • the database construction module 203 constructs an image retrieval database of the environment scene, used for environment scene image retrieval, according to the environment scene video image acquired by the video acquisition module 201.
  • the scene reconstruction module 202 includes a feature extraction unit 2021, a pose calculation unit 2022, a coordinate establishing unit 2023, a key frame selection unit 2024, and a pose graph construction unit 2025.
  • the feature extraction unit 2021 is configured to perform image feature extraction on the image frame in the environment scene video image.
  • the pose calculation unit 2022 is configured to perform inter-frame image registration on the environment scene video image according to the image features extracted by the feature extraction unit 2021, and calculate a relative pose between the image frames.
  • a coordinate establishing unit 2023 configured to select an offline initial frame image in the environment scene video image, establish a world coordinate system of the environment scene according to the offline initial frame image, and determine a coordinate of a 3D point in the environment scene .
  • the key frame selection unit 2024 is configured to determine an offline key frame image in the environment scene video image according to the offline initial frame image in the environment scene video image.
  • the pose graph construction unit 2025 is configured to construct a pose graph according to the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image in the environment scene video image, to optimize the pose graph, and to update the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image.
  • the feature extraction unit 2021 performs image feature extraction on each image frame in the environment scene video image, and the pose calculation unit 2022 performs inter-frame image registration on the environment scene video image according to the image features extracted by the feature extraction unit 2021, calculating the relative poses between the image frames.
  • the inter-frame image registration obtains a set of 2D point pairs from the image features of two image frames, based on feature matching or a direct method, and calculates the relative pose between the two image frames from the set of 2D point pairs by the five-point method. The coordinate establishing unit 2023 selects an offline initial frame image in the environment scene video image, establishes the world coordinate system of the environment scene according to the offline initial frame image, and determines the coordinates of the 3D points in the environment scene.
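The patent specifies the five-point method for this step. As a compact, self-contained illustration of recovering the relative-pose constraint from a set of matched 2D point pairs, the sketch below instead uses the simpler linear eight-point estimate of the essential matrix on noise-free normalized image coordinates (all data synthetic; a production system would run the five-point algorithm inside RANSAC):

```python
import numpy as np

def essential_eight_point(x1, x2):
    """Linear (eight-point) estimate of the essential matrix E with x2^T E x1 = 0.

    x1, x2: Nx2 matched points in normalized image coordinates
    (pixel coordinates premultiplied by K^-1), N >= 8.
    """
    ones = np.ones((x1.shape[0], 1))
    h1 = np.hstack([x1, ones])  # homogeneous points in frame 1
    h2 = np.hstack([x2, ones])  # homogeneous points in frame 2
    # Each correspondence contributes one row of the linear system A e = 0
    A = np.einsum('ni,nj->nij', h2, h1).reshape(-1, 9)
    _, _, Vt = np.linalg.svd(A)
    E = Vt[-1].reshape(3, 3)
    # Project onto the essential manifold: singular values (1, 1, 0)
    U, _, Vt = np.linalg.svd(E)
    return U @ np.diag([1.0, 1.0, 0.0]) @ Vt

# Synthetic two-view geometry with a known relative pose (R, t)
rng = np.random.default_rng(0)
P1 = rng.uniform([-1, -1, 4], [1, 1, 8], size=(20, 3))  # 3D points, camera-1 frame
th = 0.1
R = np.array([[np.cos(th), 0, np.sin(th)], [0, 1, 0], [-np.sin(th), 0, np.cos(th)]])
t = np.array([0.5, 0.0, 0.0])
P2 = P1 @ R.T + t                                       # same points, camera-2 frame

x1 = P1[:, :2] / P1[:, 2:]                              # normalized image coordinates
x2 = P2[:, :2] / P2[:, 2:]
E = essential_eight_point(x1, x2)

# Every matched pair should satisfy the epipolar constraint x2^T E x1 = 0
h1 = np.hstack([x1, np.ones((20, 1))])
h2 = np.hstack([x2, np.ones((20, 1))])
residual = np.abs(np.einsum('ni,ij,nj->n', h2, E, h1)).max()
print(residual)
```

Decomposing the recovered `E` (e.g. with `cv2.recoverPose` in OpenCV) then yields the rotation and a scale-free translation between the two frames, which is what the registration step needs.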
  • the key frame selection unit 2024 parses the image frames after the offline initial frame image of the environment scene video image; if the recurrence rate of the 3D points of the offline initial frame image in the current frame image is less than a preset threshold, the current frame image is determined to be an offline key frame image of the environment scene video image. Thereafter, whenever the recurrence rate of the 3D points of the most recent offline key frame image in the current frame image is less than the preset threshold, the current frame image is determined to be the next offline key frame image of the environment scene video image.
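This keyframe rule (declare a new keyframe whenever too few of the last keyframe's 3D points reappear) can be sketched as follows; the point-ID sets and the 0.5 threshold are illustrative assumptions, not values from the patent:

```python
def select_keyframes(frames, threshold=0.5):
    """frames: list of sets of 3D-point IDs visible in each frame
    (frame 0 is the offline initial frame). Returns indices chosen as keyframes.
    A frame becomes the next keyframe when the recurrence rate -- the fraction
    of the last keyframe's points still visible -- drops below the threshold."""
    keyframes = [0]
    ref = frames[0]
    for i, visible in enumerate(frames[1:], start=1):
        recurrence = len(ref & visible) / len(ref) if ref else 0.0
        if recurrence < threshold:
            keyframes.append(i)
            ref = visible  # later frames are compared against this new keyframe
    return keyframes

# Points gradually drift out of view as the camera moves through the scene
frames = [
    {1, 2, 3, 4, 5, 6, 7, 8},  # initial frame
    {1, 2, 3, 4, 5, 6},        # recurrence 6/8 = 0.75 -> not a keyframe
    {4, 5, 6},                 # recurrence 3/8 = 0.375 -> new keyframe
    {4, 5, 9},                 # 2/3 of keyframe-2's points recur -> no
    {9, 10, 11},               # 0/3 recur -> new keyframe
]
print(select_keyframes(frames))  # [0, 2, 4]
```

The comparison reference advances with each accepted keyframe, which is what makes the selection incremental rather than always anchored to the initial frame.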
  • the pose graph construction unit 2025 obtains the pose sequence of the set of image frames, and the coordinates of the 3D points corresponding to the 2D points of each image frame, from the sequential inter-frame registration that the pose calculation unit 2022 performs on the environment scene video image, thereby completing the construction of the pose graph. Loop closure detection corrects the accumulated error, and at the same time the pose graph is optimized under the pose constraints, correcting the pose data of each image frame in the environment scene video image.
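To illustrate how pose-graph optimization with a loop-closure constraint corrects accumulated drift, here is a deliberately minimal one-dimensional least-squares example (real systems optimize 6-DoF poses with a nonlinear solver such as g2o or Ceres; the measurements below are made up). Odometry claims the camera advanced 1.0 per step, but the loop closure reports that pose 4 coincides with pose 0:

```python
import numpy as np

# Five 1-D poses; each odometry edge (i, j, measurement) says x_j - x_i = measurement
odometry = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0)]
loop_closure = (0, 4, 0.0)          # the camera returned to its starting point
constraints = odometry + [loop_closure]

# Linear least squares over x1..x4 (x0 is fixed at 0 to anchor the gauge)
A = np.zeros((len(constraints), 4))
b = np.zeros(len(constraints))
for row, (i, j, meas) in enumerate(constraints):
    if j > 0:
        A[row, j - 1] += 1.0
    if i > 0:
        A[row, i - 1] -= 1.0
    b[row] = meas

x = np.concatenate([[0.0], np.linalg.lstsq(A, b, rcond=None)[0]])
print(x)  # [0.0, 0.2, 0.4, 0.6, 0.8] -- drift redistributed along the loop
```

Without the loop edge the solution would be [0, 1, 2, 3, 4]; adding it pulls every pose back so the residual error is spread evenly over the trajectory instead of accumulating at the end, which is exactly the role loop closure plays in the unit described above.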
  • the database construction module 203 establishes a search tree or a bag of words according to the image features in the offline initial frame image and the offline key frame images of the environment scene video image.
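A toy version of such a retrieval database might look like the following: a flat bag of words with an inverted index, rather than the vocabulary tree a real system (e.g. DBoW2 over ORB descriptors) would use. The two-dimensional "descriptors" and three-word vocabulary are hypothetical stand-ins:

```python
import numpy as np
from collections import defaultdict

class BoWDatabase:
    """Minimal bag-of-words image retrieval: each descriptor is quantized to
    its nearest visual word; keyframes are ranked by histogram intersection."""

    def __init__(self, vocabulary):
        self.vocab = np.asarray(vocabulary, dtype=float)  # k x d word centers
        self.inverted = defaultdict(set)                  # word id -> image ids
        self.histograms = {}

    def _quantize(self, descriptors):
        d = np.linalg.norm(descriptors[:, None] - self.vocab[None], axis=2)
        words = d.argmin(axis=1)
        hist = np.bincount(words, minlength=len(self.vocab)).astype(float)
        return words, hist / hist.sum()

    def add(self, image_id, descriptors):
        words, hist = self._quantize(np.asarray(descriptors, dtype=float))
        self.histograms[image_id] = hist
        for w in set(words.tolist()):
            self.inverted[w].add(image_id)

    def query(self, descriptors):
        words, hist = self._quantize(np.asarray(descriptors, dtype=float))
        candidates = set().union(*(self.inverted[w] for w in set(words.tolist())))
        return max(candidates,
                   key=lambda i: np.minimum(hist, self.histograms[i]).sum())

vocab = [[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]]
db = BoWDatabase(vocab)
db.add('kf0', [[0.1, 0.2], [0.2, 0.1], [9.8, 0.3]])   # mostly word 0, some word 1
db.add('kf1', [[0.2, 9.9], [0.1, 10.2], [9.9, 0.1]])  # mostly word 2
best = db.query([[0.0, 0.1], [0.3, 0.0], [10.1, 0.2]])
print(best)  # 'kf0'
```

The inverted index restricts scoring to keyframes that share at least one visual word with the query, which is what makes bag-of-words retrieval fast enough for relocalization.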
  • the server 200 for implementing augmented reality further includes a pose setting module 204, configured to set the pose of the virtual object to be displayed in the environment scene.
  • the server 200 implementing the augmented reality further includes a retrieval module 205.
  • the retrieval module 205 is configured to receive a request for acquiring an image retrieval database of an environment scene sent by the terminal, and send an image retrieval database of the corresponding environment scene in the request to the terminal.
  • the server stores image retrieval databases of one or more environment scenes. The server receives a request, sent by the terminal, for acquiring the image retrieval database of an environment scene, where the request includes a location label of the environment scene in which the terminal is located, for example the location name of the environment scene, GPS information, and the like. The retrieval module 205 retrieves the image retrieval database of the corresponding environment scene according to the location label of the environment scene, and sends the image retrieval database of that environment scene to the terminal.
  • the terminal 300 for implementing augmented reality in the embodiment of the present invention includes an image acquisition module 301, a feature extraction module 302, a map creation module 303, an image tracking module 304, a data acquisition module 305, and a display module 306.
  • the image acquisition module 301 is configured to collect an image of an environment scene in real time.
  • the feature extraction module 302 is configured to extract image features in the environment scene image collected by the image acquisition module 301.
  • the map creation module 303 is configured to perform initial positioning of the environment scene according to the image features extracted by the feature extraction module 302, and to establish a local map of the environment scene.
  • the image tracking module 304 is configured to track image frames in the environment scene image collected by the image acquisition module 301.
  • the data acquisition module 305 is configured to acquire a pose of the virtual object to be displayed.
  • the display module 306 is configured to display the virtual object in a current frame image of the environment scene image according to the pose of the virtual object to be displayed acquired by the data acquisition module 305.
  • the image acquisition module 301 performs real-time image acquisition on the environmental scene.
  • the feature extraction module 302 performs image feature extraction, for example ORB features, on the real-time image of the environment scene collected by the image acquisition module 301.
  • the map creation module 303 performs initial positioning of the environment scene according to the image features extracted by the feature extraction module 302, and establishes a local map of the environment scene.
  • the image tracking module 304 performs image tracking on the image frames in the environment scene image according to the image features extracted by the feature extraction module 302, and determines the pose of the image frame and the position coordinates of the 3D points in the image frame.
  • the data acquisition module 305 acquires the pose of the virtual object to be displayed.
  • the display module 306 displays the virtual object in the current frame image of the environment scene image according to the pose of the virtual object to be displayed acquired by the data acquisition module 305.
  • the terminal 300 implementing the augmented reality further includes a positioning module 307.
  • the positioning module 307 acquires, from the server 200 for implementing augmented reality, the image retrieval database corresponding to the environment scene, according to a location label of the environment scene in which the terminal 300 is located, such as its location name or GPS information.
  • the map creation module 303 includes an image parsing unit 3031, an initial positioning unit 3032, and a map establishing unit 3033.
  • the image parsing unit 3031 is configured to parse the image features in the current frame image, and retrieve an offline key frame image that meets the first preset condition in the image retrieval database.
  • the initial positioning unit 3032 is configured to perform initial positioning of the environment scene according to the relative pose of the current frame image and the offline key frame image.
  • the map establishing unit 3033 establishes a local map of the environment scene according to the 3D points in the current frame image that meet the second preset condition.
  • the image parsing unit 3031 parses the image features, such as ORB features, in the current frame image of the environment scene collected in real time by the image acquisition module 301, and retrieves, according to those image features, an offline key frame image meeting the first preset condition in the image retrieval database of the environment scene acquired in advance. Specifically, image feature extraction is performed on the current frame image of the environment scene image collected in real time: the ORB features of the current frame image are extracted and dimension-reduced to obtain a search tag for image retrieval; a preliminary retrieval is then performed with the search tag in the image retrieval database corresponding to the environment scene acquired in advance, yielding a set of candidate offline key frame images.
  • image feature registration is performed between the current frame image of the environment scene image and each candidate offline key frame image, to determine the offline key frame image among the candidates that has the highest matching degree with the current frame image of the environment scene image.
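The second, fine-grained stage of this retrieval (scoring each candidate offline keyframe by feature registration against the current frame) can be sketched with toy binary descriptors compared by Hamming distance, as ORB descriptors are; the descriptor sizes, noise level, and distance threshold are all illustrative assumptions:

```python
import numpy as np

def hamming(a, b):
    return int(np.count_nonzero(a != b))

def match_score(query_desc, cand_desc, max_dist=10):
    """Count query descriptors whose nearest candidate descriptor is close
    enough -- a crude stand-in for ORB feature registration."""
    score = 0
    for q in query_desc:
        if min(hamming(q, c) for c in cand_desc) <= max_dist:
            score += 1
    return score

def best_candidate(query_desc, candidates):
    """candidates: dict of keyframe id -> descriptor array. Returns the offline
    keyframe with the highest matching degree to the current frame."""
    return max(candidates, key=lambda k: match_score(query_desc, candidates[k]))

rng = np.random.default_rng(1)
kf_a = rng.integers(0, 2, size=(30, 256), dtype=np.uint8)  # toy 256-bit descriptors
kf_b = rng.integers(0, 2, size=(30, 256), dtype=np.uint8)
query = kf_a.copy()
flip = rng.random(query.shape) < 0.02                      # ~2% bit noise vs kf_a
query[flip] ^= 1

best = best_candidate(query, {'kf_a': kf_a, 'kf_b': kf_b})
print(best)  # 'kf_a'
```

Since the query was derived from `kf_a` with light noise, nearly all of its descriptors find a close match there, while random descriptors from `kf_b` sit around 128 bits away; the candidate with the most registered features wins, mirroring the "highest matching degree" criterion above.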
  • the initial positioning unit 3032 calculates the relative pose between the current frame image of the environment scene image and the offline key frame image with the highest matching degree, establishes the coordinate system of the environment scene, completes the initial positioning of the environment scene, and labels the current frame image of the environment scene image as the initial key frame image.
  • the map establishing unit 3033 establishes an initial local map based on the 3D points visible in the current frame image of the environment scene image. The offline key frame image with the highest matching degree with the current frame image of the environment scene image, the initial key frame image of the environment scene image, and the 3D points of the initial key frame image that meet the second preset condition are added to the local map of the environment scene.
  • the image tracking module 304 includes a detecting unit 3041, a determining unit 3042, and a pose calculating unit 3043.
  • the detecting unit 3041 is configured to detect, according to the previous frame image of the environment scene image, an image feature that matches the previous frame image in the current frame image of the environment scene image.
  • the determining unit 3042 is configured to determine whether the matched image feature number is greater than or equal to a preset threshold.
  • the pose calculating unit 3043 is configured to, when the determining unit 3042 determines that the number of matched image features is greater than or equal to the preset threshold, calculate the pose and 3D point coordinates of the current frame image of the environment scene image according to the pose and 3D point coordinates of the previous frame image of the environment scene image.
  • the data acquisition module 305 is further configured to, when the determining unit 3042 determines that the number of matched image features is less than the preset threshold, retrieve in the image retrieval database of the environment scene an offline key frame image that matches the current frame image of the environment scene image.
  • the pose calculation unit 3043 is further configured to calculate the pose and the 3D point coordinates of the current frame image of the environment scene image according to the pose and the 3D point coordinates of the offline key frame image retrieved by the data acquisition module 305.
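The tracking logic of units 3041–3043 (count feature matches against the previous frame, compare against a threshold, then either propagate the pose or fall back to retrieval-based relocalization) reduces to a small decision routine. The threshold value, return shapes, and the stand-in retrieval function are illustrative, not from the patent:

```python
def track_frame(matches_to_prev, prev_pose, retrieve_keyframe, threshold=30):
    """Decide how the current frame's pose is obtained.

    matches_to_prev: number of image features matched against the previous frame.
    prev_pose: pose of the previous frame (any representation).
    retrieve_keyframe: fallback returning the pose of the best offline keyframe
    from the image retrieval database.
    Returns (source, pose), recording which path produced the pose.
    """
    if matches_to_prev >= threshold:
        # Enough matches: derive the pose from the previous frame's
        # pose and 3D point coordinates
        return 'previous_frame', prev_pose
    # Tracking lost: relocalize against the offline keyframe database
    return 'retrieval_database', retrieve_keyframe()

def fake_retrieval():
    return 'keyframe_pose'  # stand-in for a database lookup

ok = track_frame(80, 'prev_pose', fake_retrieval)
lost = track_frame(5, 'prev_pose', fake_retrieval)
print(ok)    # ('previous_frame', 'prev_pose')
print(lost)  # ('retrieval_database', 'keyframe_pose')
```

The key design point is that relocalization reuses the same offline retrieval database built by the server, so a tracking failure never requires re-mapping the scene from scratch.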
  • the pose calculation unit 3043 is further configured to calculate the relative pose between the current frame image of the environment scene image and the virtual object to be displayed, and the display module 306 is further configured to display the virtual object in the current frame image of the environment scene image according to the relative pose calculated by the pose calculation unit 3043.
  • the system 400 for implementing augmented reality includes: at least one server 200 implementing augmented reality, and at least one terminal 300 implementing augmented reality.
  • the method, server, terminal, and system for implementing augmented reality according to the embodiments of the present invention collect environment scene images in real time, perform image tracking according to the image features in the environment scene images, determine the relative pose between the virtual object to be displayed and the environment scene, and display the virtual object in the environment scene image. The embodiments of the invention realize the superimposed display of virtual objects on an environment scene or target object without a template, effectively reducing the dependence of existing augmented reality technology on planar templates, improving the accuracy of real-time registration between virtual objects and the real environment scene, and significantly enhancing the coordination and consistency of superimposing virtual objects into environment scene images.
  • the invention is not limited to the specific embodiments described above.
  • the invention can be extended to any new feature or any new combination disclosed in this specification, as well as to the steps of any new method or process disclosed, or any new combination thereof.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A method for implementing augmented reality comprises the following steps: acquiring, in real time, images of an environment scene; extracting image characteristics from the environment scene images; according to the image characteristics, conducting an initial positioning for the environment scene and creating a local map of the environment scene; performing tracking for image frames in the environment scene images; and displaying, in a current frame image of the environment scene images, a virtual object according to the position and posture of the virtual object to be displayed. Also provided are a server, terminal and system for implementing augmented reality. The method, server, terminal and system for implementing augmented reality are capable of displaying an overlay image of the virtual object onto the environment scene or target object without using a template, effectively reducing the reliance on a planar template found in conventional augmented reality techniques, improving the precision of real-time matching of the virtual object to a real environment scene, and significantly enhancing the coordination and consistency of overlaying the virtual object onto an environment scene image.

Description

Method, server, terminal and system for realizing augmented reality

This application claims priority to Chinese Patent Application No. CN201610070926.7, entitled "Method, server, terminal and system for realizing augmented reality", filed on January 29, 2016, the entire contents of which are incorporated herein by reference.

Technical Field

The present invention relates to the field of artificial intelligence and computer vision technology, and in particular, to a method, a server, a terminal and a system for realizing augmented reality.
Background Art

Simultaneous Localization and Mapping (SLAM) is a technology that combines autonomous map creation with self-localization in an unknown environment. It is a current research hotspot in the field of autonomous navigation, whose goal is, after entering an unknown environment, to perceive the surroundings, construct an incremental map, and localize oneself within it. The technology is currently applied mainly in technical directions such as unmanned driving, robotics, and three-dimensional scene reconstruction.

Augmented Reality (AR) uses computer graphics and visualization technology to generate virtual objects that do not exist in the real environment, accurately fuses the virtual objects into the real environment by means of image recognition and localization technology, integrates them with the real environment through a display device, and presents a realistic sensory experience to the user. The primary technical problem to be solved by augmented reality is how to fuse virtual objects into the real world accurately, that is, to make a virtual object appear at the correct position in the real scene with the correct pose, so as to give the user a strong sense of visual realism.

In the prior art, augmented reality schemes based on planar templates are currently the most common. Such a scheme determines the display position of the virtual object by training on a planar template rich in texture features. However, limited by the planar template, this approach is poorly applicable to non-planar objects and to objects with few texture features. Therefore, how to accurately determine the position of a virtual object in different types of real scenes, and realize the superimposed fusion of virtual objects in the real scene, is one of the technical problems that urgently needs to be solved in the development of augmented reality technology.
Summary of the Invention

The object of the present invention is to provide a method, server, terminal and system for realizing augmented reality that combine augmented reality technology with simultaneous localization and mapping, and realize the fused display of virtual objects in a real scene through offline scene map construction and online initialization.

In view of this, one aspect of the present invention provides a method for implementing augmented reality, comprising the following steps: collecting images of an environment scene in real time; extracting image features from the environment scene images; performing initial positioning of the environment scene according to the image features, and establishing a local map of the environment scene; tracking the image frames in the environment scene images; and displaying the virtual object in the current frame image of the environment scene images according to the pose of the virtual object to be displayed.
Preferably, before the step of collecting images of the environment scene in real time, the method further includes: acquiring a video image of the environment scene; performing three-dimensional scene reconstruction of the environment scene according to the environment scene video image; constructing an image retrieval database of the environment scene according to the environment scene video image; and presetting, according to the environment scene video image, the pose of the virtual object to be displayed in the environment scene.

Preferably, the step of performing three-dimensional scene reconstruction of the environment scene according to the environment scene video image is specifically: performing image feature extraction on the image frames in the environment scene video image; performing inter-frame image registration on the environment scene video image according to the image features, and calculating the relative poses between the image frames; selecting an offline initial frame image in the environment scene video image, establishing the world coordinate system of the environment scene according to the offline initial frame image, and determining the coordinates of the 3D points in the environment scene; determining the offline key frame images in the environment scene video image according to the offline initial frame image in the environment scene video image; and constructing a pose graph according to the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image in the environment scene video image, optimizing the pose graph, and updating the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image.

Preferably, the step of constructing the image retrieval database of the environment scene according to the environment scene video image is specifically: establishing a search tree or a bag of words according to the image features in the offline initial frame image and the offline key frame images of the environment scene video image.

Preferably, the method for implementing augmented reality further includes: acquiring the image retrieval database of the environment scene according to the location information of the environment scene.

Preferably, the step of performing initial positioning of the environment scene according to the image features and establishing a local map of the environment scene is specifically: parsing the image features in the current frame image, and retrieving in the image retrieval database an offline key frame image that meets a first preset condition; performing initial positioning of the environment scene according to the relative pose between the current frame image and the offline key frame image; and establishing a local map of the environment scene according to the 3D points in the current frame image that meet a second preset condition.

Preferably, the step of performing initial positioning of the environment scene according to the image features and establishing a local map of the environment scene further includes: adding the current frame image, the offline key frame image meeting the first preset condition, and the 3D points in the current frame image meeting the second preset condition to the local map of the environment scene.

Preferably, the step of tracking the image frames in the environment scene image is specifically: detecting, according to the previous frame image of the environment scene image, the image features in the current frame image of the environment scene image that match the previous frame image; determining whether the number of matched image features is greater than or equal to a preset threshold; if the number of matched image features is greater than or equal to the preset threshold, determining the pose and 3D point coordinates of the current frame image of the environment scene image according to the pose and 3D point coordinates of the previous frame image of the environment scene image; and if the number of matched image features is less than the preset threshold, retrieving in the image retrieval database of the environment scene an offline key frame image that matches the current frame image of the environment scene image, and determining the pose and 3D point coordinates of the current frame image according to the pose and 3D point coordinates of the offline key frame image.

Preferably, the step of tracking the image frames in the environment scene image further includes: determining whether the pose of the current frame image satisfies a third preset condition, and if so, adding the current frame image to the local map of the environment scene and to the image retrieval database of the environment scene; and constructing a pose graph according to the poses and 3D point coordinates of all image frames in the local map of the environment scene, optimizing the pose graph, and updating the poses and 3D point coordinates of all image frames in the local map.

Preferably, the step of tracking the image frames in the environment scene image further includes: performing loop closure detection on the image frames added to the image retrieval database of the environment scene, and updating the image retrieval database of the environment scene if a loop closure is detected.

Preferably, the step of displaying the virtual object in the current frame image of the environment scene image according to the pose of the virtual object to be displayed is specifically: acquiring the pose of the virtual object to be displayed, and displaying the virtual object in the current frame image of the environment scene image according to the relative pose between the current frame image of the environment scene image and the virtual object to be displayed.
Another aspect of the present invention provides a server for implementing augmented reality, including: a video acquisition module, configured to acquire a video image of an environment scene; a scene reconstruction module, configured to perform three-dimensional scene reconstruction of the environment scene according to the environment scene video image acquired by the video acquisition module; and a database construction module, configured to construct an image retrieval database of the environment scene according to the environment scene video image acquired by the video acquisition module.

Preferably, the scene reconstruction module includes: a feature extraction unit, configured to perform image feature extraction on each image frame in the environment scene video image; a pose calculation unit, configured to perform inter-frame image registration on the environment scene video image according to the image features extracted by the feature extraction unit, and to calculate the relative poses between the image frames; a coordinate establishing unit, configured to select an offline initial frame image in the environment scene video image, establish the world coordinate system of the environment scene according to the offline initial frame image, and determine the coordinates of the 3D points in the environment scene; a key frame selection unit, configured to determine the offline key frame images in the environment scene video image according to the offline initial frame image in the environment scene video image; and a pose graph construction unit, configured to construct a pose graph according to the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image in the environment scene video image, to optimize the pose graph, and to update the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image.

Preferably, the database construction module is specifically configured to establish a search tree or a bag of words according to the image features in the offline initial frame image and the offline key frame images of the environment scene video image.

Preferably, the server further includes a pose setting module, configured to set the pose of the virtual object to be displayed in the environment scene.

Preferably, the server further includes a retrieval module, configured to receive a request sent by the terminal for acquiring the image retrieval database of an environment scene, and to send the image retrieval database of the corresponding environment scene in the request to the terminal.
本发明同时提供一种实现增强现实的终端,包括:图像采集模块:用于实时采集环境场景的图像;特征提取模块:用于提取图像采集模块采集的环境场景图像中的图像特征;地图创建模块:用于根据特征提取模块提取的图像特征,对所述环境场景图像进行初始化定位,建立所述环境场景的局部地图;图像跟踪模块:用于对图像采集模块采集的环境场景图像中的图像帧进行跟踪;数据获取模块:用于获取待显示的虚拟对象的位姿;显示模块:用于根据数据获取模块获取的待显示的虚拟对象的位姿,在所述环境场景图像的当前帧图像中显示所述虚拟对象。The invention also provides a terminal for realizing augmented reality, comprising: an image acquisition module: an image for collecting an environment scene in real time; a feature extraction module: for extracting an image feature in an environment scene image collected by the image acquisition module; and a map creation module And being used for initializing and positioning the environment scene image according to the image feature extracted by the feature extraction module to establish a local map of the environment scene; and the image tracking module: the image frame used in the environment scene image collected by the image acquisition module Tracking; data acquisition module: for acquiring the pose of the virtual object to be displayed; display module: for the pose of the virtual object to be displayed acquired according to the data acquisition module, in the current frame image of the environment scene image The virtual object is displayed.
Preferably, the terminal for implementing augmented reality further includes a positioning module, configured to determine the location information of the environment scene; the data acquisition module is further configured to acquire the image retrieval database of the environment scene according to that location information.
Preferably, the map creation module includes: an image parsing unit, configured to parse the image features in the current frame image and to retrieve, from the image retrieval database, an offline key frame image satisfying a first preset condition; an initial positioning unit, configured to perform initialization and localization of the environment scene according to the relative pose between the current frame image and the offline key frame image; and a map building unit, configured to build a local map of the environment scene from the 3D points in the current frame image that satisfy a second preset condition.
Preferably, the image tracking module includes: a detection unit, configured to detect, in the current frame image of the environment scene images, the image features that match the previous frame image; a judgment unit, configured to judge whether the number of matched image features is greater than or equal to a preset threshold; and a pose calculation unit, configured to calculate the pose and 3D point coordinates of the current frame image from the pose and 3D point coordinates of the previous frame image when the judgment unit determines that the number of matched image features is greater than or equal to the preset threshold.
Preferably, the data acquisition module is further configured to retrieve, from the image retrieval database of the environment scene, an offline key frame image that matches the current frame image when the judgment unit determines that the number of matched image features is less than the preset threshold; the pose calculation unit is further configured to calculate the pose and 3D point coordinates of the current frame image from the pose and 3D point coordinates of the retrieved offline key frame image.
Preferably, the pose calculation unit is further configured to calculate the relative pose between the current frame image of the environment scene images and the virtual object to be displayed; the display module is further configured to display the virtual object in the current frame image according to that relative pose.
The present invention also provides a system for implementing augmented reality, including the above server for implementing augmented reality and the above terminal for implementing augmented reality.
The method, server, terminal and system of the present invention capture environment scene images in real time, perform image tracking according to the image features in those images, determine the relative pose between the virtual object to be displayed and the environment scene, and display the virtual object in the environment scene image. The invention superimposes virtual objects on an environment scene or target object without a template, effectively reducing the dependence of existing augmented reality techniques on planar templates, improving the accuracy of real-time registration between the virtual object and the real environment scene, and significantly enhancing the coordination and consistency with which the virtual object is superimposed into the environment scene image.
BRIEF DESCRIPTION OF THE DRAWINGS
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention; those of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a flow chart of a method for implementing augmented reality according to a first embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a server for implementing augmented reality according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of the scene reconstruction module of the server for implementing augmented reality according to the second embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a terminal for implementing augmented reality according to a third embodiment of the present invention;
FIG. 5 is a schematic structural diagram of the map creation module of the terminal for implementing augmented reality according to the third embodiment of the present invention;
FIG. 6 is a schematic structural diagram of the image tracking module of the terminal for implementing augmented reality according to the third embodiment of the present invention;
FIG. 7 is a schematic structural diagram of a system for implementing augmented reality according to a fourth embodiment of the present invention.
DETAILED DESCRIPTION
In order that the objects, features and advantages of the present invention may be more clearly understood, the present invention is described in further detail below with reference to the drawings and specific embodiments. It should be noted that the embodiments of the present application, and the features within the embodiments, may be combined with each other without conflict.
Many specific details are set forth in the following description to facilitate a full understanding of the present invention; however, these are only some embodiments of the invention, and the invention may also be practiced in ways other than those described herein. The scope of protection of the present invention is therefore not limited by the specific embodiments disclosed below.
Embodiment 1
As shown in FIG. 1, the method for implementing augmented reality according to this embodiment of the present invention mainly includes the following steps: step S101, capturing images of the environment scene in real time; step S102, extracting image features from the environment scene images; step S103, performing initialization and localization on the environment scene according to the image features, and building a local map of the environment scene; step S104, tracking the image frames of the environment scene images; and step S105, displaying the virtual object in the current frame image of the environment scene according to the pose of the virtual object to be displayed.
In this technical solution, real-time images of the environment scene are first captured, and image features, for example ORB (Oriented FAST and Rotated BRIEF) features, are extracted from them. Then, according to the extracted image features, the environment scene is initialized and localized and a local map of the environment scene is built; at the same time, the image frames of the real-time environment scene images are tracked to determine the pose of each image frame and the position coordinates of the 3D points in it, and the virtual object is displayed in the current frame image of the environment scene according to the pose of the virtual object to be displayed.
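By way of illustration only (this sketch is editorial, not part of the disclosed method), the binary intensity-comparison test underlying BRIEF/ORB descriptors can be written in a few lines of Python. The 16×16 patch size, the 256 random point pairs, and the function names are assumptions made for the example; a real ORB implementation additionally uses FAST keypoints, orientation compensation, and a learned, de-correlated sampling pattern.

```python
import numpy as np

rng = np.random.default_rng(0)
# BRIEF-style test pattern: 256 pairs of pixel offsets inside a 16x16 patch
PAIRS = rng.integers(0, 16, size=(256, 4))

def brief_descriptor(patch):
    """patch: 16x16 grayscale array -> 256-bit binary descriptor.
    Each bit is one intensity comparison between a fixed pair of pixels."""
    a = patch[PAIRS[:, 0], PAIRS[:, 1]]
    b = patch[PAIRS[:, 2], PAIRS[:, 3]]
    return (a < b).astype(np.uint8)

def hamming(d1, d2):
    # descriptor distance used when matching binary features
    return int(np.count_nonzero(d1 != d2))
```

Matching then amounts to finding, for each descriptor in one frame, the descriptor in the other frame with the smallest Hamming distance.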
In the above technical solution, preferably, before step S101 the method further includes: acquiring a video image of the environment scene; performing three-dimensional scene reconstruction of the environment scene from the environment scene video; building an image retrieval database of the environment scene from the environment scene video; and presetting, from the video image of the environment scene, the pose of the virtual object to be displayed in the environment scene.
In this technical solution, a video image of the environment scene is captured in advance to record the image details of the scene, and the environment scene is reconstructed in three dimensions from that video. Specifically, image features are extracted from every frame of the environment scene video, and inter-frame image registration is performed on those features to determine the relative poses between image frames. For inter-frame registration, a set of 2D point pairs is obtained from the image features of two image frames by feature matching or by a direct method, and the relative pose between the two frames is then computed from those 2D point pairs with the five-point algorithm.
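As an editorial illustration of the last stage of this registration step (not part of the disclosure itself), the numpy sketch below decomposes a known essential matrix E = [t]×R into its four candidate relative poses; recovering E from the 2D point pairs via the five-point algorithm is assumed to have been done already, and the determinant sign fixes follow the common OpenCV-style recipe.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x such that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0., -t[2], t[1]],
                     [t[2], 0., -t[0]],
                     [-t[1], t[0], 0.]])

def decompose_essential(E):
    """Return the four candidate (R, t) relative poses encoded by E.
    The true pose is identified in practice by cheirality (triangulated
    points must lie in front of both cameras), which is omitted here."""
    U, _, Vt = np.linalg.svd(E)
    if np.linalg.det(U) < 0:   # enforce proper rotations
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
    R1, R2 = U @ W @ Vt, U @ W.T @ Vt
    t = U[:, 2]                # translation recovered only up to scale
    return [(R1, t), (R1, -t), (R2, t), (R2, -t)]
```

The translation is recoverable only up to scale, which is why the offline reconstruction additionally anchors a world coordinate system to the offline initial frame image.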
In the above technical solution, further, an offline initial frame image is selected in the environment scene video; the world coordinate system of the environment scene is established from that offline initial frame image, and the coordinates of the 3D points in the environment scene are determined. At the same time, the offline key frame images of the environment scene video are determined from the offline initial frame image: a pose graph is built from the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image, and the pose graph is optimized to update those poses and 3D point coordinates. Specifically, the image frames after the offline initial frame image are parsed in turn; if the recurrence rate in the current frame image of the 3D points of the offline initial frame image is less than a preset threshold, the current frame image is taken as an offline key frame image of the environment scene video. Parsing then continues after that offline key frame image; whenever the recurrence rate in the current frame image of the 3D points of the latest offline key frame image falls below the preset threshold, the current frame image becomes the next offline key frame image. Proceeding in this way through all the image frames of the environment scene video yields the complete set of offline key frame images. Further, the environment scene video is parsed as a continuous sequence of image frames; sequentially registering those frames yields the pose sequence of the image frames and the coordinates of the 3D points corresponding to the 2D points in each frame, which completes the construction of the pose graph. Loop closure detection corrects the accumulated error, and the pose graph is optimized under the pose and constraint conditions to correct the pose data of every image frame of the environment scene video.
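The recurrence-rate rule for selecting offline key frame images described above can be sketched as follows (an editorial example, not the patented implementation; representing each frame's visible 3D points as a set of point ids and using a 0.5 threshold are assumptions for the example):

```python
def select_keyframes(frames_points, threshold=0.5):
    """frames_points: one set of visible 3D-point ids per frame, in order.
    Frame 0 is the (offline) initial frame; a later frame becomes a keyframe
    when the fraction of the previous keyframe's 3D points it re-observes
    (the 'recurrence rate') drops below `threshold`."""
    keyframes = [0]
    ref = frames_points[0]          # points of the most recent keyframe
    for i, pts in enumerate(frames_points[1:], start=1):
        recurrence = len(ref & pts) / len(ref) if ref else 0.0
        if recurrence < threshold:
            keyframes.append(i)
            ref = pts               # this frame becomes the new reference
    return keyframes
```

Keeping keyframes only when the view has changed enough bounds the size of the pose graph while still covering the scene.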
In the above technical solution, further, a search tree or a bag of words is built from the image features of the offline initial frame image and the offline key frame images of the environment scene video.
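The bag-of-words retrieval mentioned above can be illustrated minimally as follows (editorial sketch; production systems such as DBoW use hierarchical vocabularies over binary descriptors with TF-IDF weighting, whereas this toy example quantizes small float descriptors against a flat vocabulary of centroids):

```python
import numpy as np

def bow_histogram(descriptors, vocab):
    """Quantize each descriptor to its nearest visual word (vocabulary
    centroid) and return a normalized word-frequency histogram."""
    d = np.linalg.norm(descriptors[:, None, :] - vocab[None, :, :], axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(vocab)).astype(float)
    n = np.linalg.norm(hist)
    return hist / n if n else hist

def retrieve(query_desc, keyframe_descs, vocab):
    """Return the index of the keyframe whose word histogram is most
    similar (cosine similarity) to the query frame's histogram."""
    q = bow_histogram(query_desc, vocab)
    sims = [q @ bow_histogram(kf, vocab) for kf in keyframe_descs]
    return int(np.argmax(sims))
```

The retrieved keyframes serve only as candidates; the method then verifies them by full feature registration, as described below for step S103.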
In the above technical solution, preferably, the image retrieval database corresponding to the environment scene is acquired according to the location information of the environment scene. Specifically, on entering an environment scene, the image retrieval database corresponding to that scene can be obtained from its location information labels, such as the place name or GPS information of the scene.
In the above technical solution, step S103 is specifically: parsing the image features in the current frame image, and retrieving from the image retrieval database an offline key frame image satisfying a first preset condition; performing initialization and localization of the environment scene according to the relative pose between the current frame image and that offline key frame image; and building a local map of the environment scene from the 3D points in the current frame image that satisfy a second preset condition.
In this technical solution, images of the current environment scene are captured in real time and the image features of the current frame image, for example ORB features, are parsed. From those features, an offline key frame image satisfying the first preset condition is retrieved from the previously acquired image retrieval database. Specifically, ORB features are extracted from the current frame image of the real-time environment scene images; the dimensionality of those features is reduced to obtain a retrieval tag for image retrieval, and a preliminary search with that tag in the image retrieval database corresponding to the environment scene yields a set of candidate offline key frame images. The current frame image is then registered against the candidate offline key frame images by image features, and the candidate with the highest matching degree to the current frame image is selected (this is the first preset condition). The relative pose between the current frame image and that best-matching offline key frame image is computed, the world coordinate system of the environment scene is established, the initialization and localization of the environment scene are completed, and the current frame image is marked as the initial key frame image. A local map of the environment scene is then built from the 3D points visible in the current frame image (this is the second preset condition). Alternatively, a pre-built offline map of the environment scene can be acquired directly, and the current frame image matched against that offline map to compute the relative pose, establish the coordinate system of the environment scene, complete its initialization and localization, and build the local map of the environment scene; the details are not repeated here.
In the above technical solution, further, the offline key frame image with the highest matching degree to the current frame image of the environment scene images, the initial key frame image of the environment scene images (i.e. the current frame image), and the 3D points of the initial key frame image that satisfy the second preset condition are added to the local map of the environment scene.
In the above technical solution, step S104 is specifically: detecting, in the current frame image of the environment scene images, the image features that match the previous frame image; judging whether the number of matched image features is greater than or equal to a preset threshold; if the number of matched image features is greater than or equal to the preset threshold, determining the pose and 3D point coordinates of the current frame image from the pose and 3D point coordinates of the previous frame image; if the number of matched image features is less than the preset threshold, retrieving from the image retrieval database of the environment scene an offline key frame image that matches the current frame image, and determining the pose and 3D point coordinates of the current frame image from the pose and 3D point coordinates of that offline key frame image.
In this technical solution, further, it is judged whether the pose of the current frame image of the environment scene images satisfies a third preset condition; if so, the current frame image is added to the local map of the environment scene and to the image retrieval database corresponding to the environment scene. Here, the third preset condition is that the recurrence rate in the current frame image M of the 3D points of its preceding key frame image P is less than a preset threshold. During image tracking, therefore, every image frame of the environment scene images is parsed in turn: if the recurrence rate in the current frame image M of the 3D points of the preceding key frame image P is below the preset threshold, the current frame image M is taken as a key frame image of the environment scene images. Parsing continues after image frame M (the new key frame); when an image frame N (the new current frame) is reached in which the recurrence rate of the 3D points of image frame M falls below the preset threshold, frame image N becomes a key frame image as well, and so on.
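The track-or-relocalize decision of step S104 can be outlined as follows (editorial sketch; feature "matching" is reduced to intersecting sets of feature ids, and actual pose estimation from the chosen reference frame is elided — a real tracker would match descriptors and solve a PnP problem here):

```python
def track_frame(curr_feats, prev_frame, retrieval_db, min_matches=20):
    """curr_feats: set of feature ids seen in the current frame.
    prev_frame and each retrieval_db entry: dicts with 'feats', 'pose',
    'points'. Returns the (pose, points) of the chosen reference frame,
    standing in for the pose/3D points derived from it."""
    matches = curr_feats & prev_frame["feats"]
    if len(matches) >= min_matches:
        ref = prev_frame      # enough matches: keep tracking frame-to-frame
    else:
        # tracking lost: relocalize against the offline keyframe database
        ref = max(retrieval_db, key=lambda kf: len(curr_feats & kf["feats"]))
    return ref["pose"], ref["points"]
```

The threshold `min_matches` plays the role of the "preset threshold" above: below it, frame-to-frame tracking is deemed unreliable and the method falls back to database retrieval.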
Specifically, starting from the initial key frame image of the environment scene images, the image frames after it are parsed; if the recurrence rate in the current frame image of the 3D points of the initial key frame image is judged to be less than the preset threshold, the current frame image is taken as a key frame image of the environment scene images, and is added to the local map of the environment scene and to the image retrieval database corresponding to the environment scene. Parsing continues after that key frame image; if the recurrence rate in the current frame image of the 3D points of the key frame image is judged to be less than the preset threshold, the current frame image is taken as another key frame image of the environment scene images, and is likewise added to the local map and to the image retrieval database. A pose graph is built from the poses and 3D point coordinates of all key frame images in the local map of the environment scene, and the pose graph is optimized to update the poses and 3D point coordinates of all those key frame images.
In the above technical solution, further, loop closure detection is performed on the image frames added to the image retrieval database of the environment scene; if a loop closure is detected, the poses and 3D point coordinates of the key frame images in the image retrieval database are updated, thereby correcting the accumulated error.
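How a loop closure corrects accumulated drift can be demonstrated on a toy one-dimensional pose graph (editorial example; real systems optimize 6-DoF poses with nonlinear solvers, which this linear least-squares sketch only gestures at — all names and the 1-D setting are assumptions for illustration):

```python
import numpy as np

def optimize_pose_graph_1d(odometry, loop_closures):
    """Least-squares over 1-D poses x_0..x_n with x_0 fixed to 0.
    odometry: measured steps x_{i+1} - x_i between consecutive frames;
    loop_closures: (i, j, measured x_j - x_i) constraints from detection."""
    n = len(odometry) + 1
    rows, rhs = [], []
    for i, d in enumerate(odometry):       # one constraint per odometry edge
        r = np.zeros(n); r[i + 1], r[i] = 1, -1
        rows.append(r); rhs.append(d)
    for i, j, d in loop_closures:          # one constraint per loop closure
        r = np.zeros(n); r[j], r[i] = 1, -1
        rows.append(r); rhs.append(d)
    A, b = np.array(rows)[:, 1:], np.array(rhs)   # drop column 0: x_0 = 0
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.concatenate([[0.0], x])
```

Without the loop closure constraint the endpoint keeps the full dead-reckoning error; with it, the optimizer redistributes the error over all edges, which is exactly the correction described above.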
In the above technical solution, step S105 is specifically: displaying the virtual object in the current frame image of the environment scene according to the relative pose between the current frame image and the virtual object to be displayed. Specifically, the preset pose of the virtual object to be displayed is acquired; the relative pose between the current frame image and the virtual object is computed from the pose of the current frame image; and the virtual object to be displayed is displayed in the current frame image of the environment scene according to that relative pose.
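In rendering terms, this display step amounts to composing the preset object pose with the current frame's camera pose and projecting the object's vertices through the camera intrinsics; an editorial numpy sketch (the homogeneous-matrix conventions and function names are assumptions, not from the disclosure):

```python
import numpy as np

def project_virtual_object(K, T_cw, T_wo, points_o):
    """K: 3x3 camera intrinsics; T_cw: 4x4 world->camera pose of the current
    frame; T_wo: 4x4 object->world pose preset for the virtual object;
    points_o: Nx3 vertices of the virtual object in its own frame.
    T_cw @ T_wo is the relative pose between the current frame image and
    the virtual object; returns Nx2 pixel coordinates."""
    T_co = T_cw @ T_wo
    pts_h = np.hstack([points_o, np.ones((len(points_o), 1))])
    cam = (T_co @ pts_h.T).T[:, :3]        # vertices in camera coordinates
    uv = (K @ cam.T).T                     # pinhole projection
    return uv[:, :2] / uv[:, 2:3]
```

Because the tracked camera pose and the preset object pose share the world coordinate system established at initialization, the projected object stays registered to the scene as the camera moves.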
Embodiment 2
As shown in FIG. 2, the server 200 for implementing augmented reality according to this embodiment of the present invention includes a video acquisition module 201, a scene reconstruction module 202, and a database construction module 203. The video acquisition module 201 is configured to acquire video images of the environment scene. The scene reconstruction module 202 is configured to perform three-dimensional scene reconstruction of the environment scene from the environment scene video acquired by the video acquisition module 201. The database construction module 203 is configured to build the image retrieval database of the environment scene from the environment scene video acquired by the video acquisition module 201.
In this technical solution, the video acquisition module 201 captures, or acquires a previously captured, video image of the environment scene, recording the image details of the scene. The scene reconstruction module 202 reconstructs the environment scene in three dimensions from the video acquired by the video acquisition module 201. The database construction module 203 builds from that video the image retrieval database of the environment scene, which is used for environment scene image retrieval.
In the above technical solution, preferably, the scene reconstruction module 202, as shown in FIG. 3, includes a feature extraction unit 2021, a pose calculation unit 2022, a coordinate establishment unit 2023, a key frame selection unit 2024, and a pose graph construction unit 2025. The feature extraction unit 2021 is configured to extract image features from the image frames of the environment scene video. The pose calculation unit 2022 is configured to perform inter-frame image registration on the environment scene video according to the image features extracted by the feature extraction unit 2021, and to calculate the relative poses between image frames. The coordinate establishment unit 2023 is configured to select an offline initial frame image in the environment scene video, to establish the world coordinate system of the environment scene from that offline initial frame image, and to determine the coordinates of the 3D points in the environment scene. The key frame selection unit 2024 is configured to determine the offline key frame images of the environment scene video from its offline initial frame image. The pose graph construction unit 2025 is configured to build a pose graph from the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image, and to optimize the pose graph so as to update the poses and 3D point coordinates of all those offline key frame images. Specifically, the feature extraction unit 2021 extracts image features from every image frame of the environment scene video, and the pose calculation unit 2022 registers the frames of the video against each other using those features to determine the relative poses between image frames: from the image features of two image frames, a set of 2D point pairs is obtained by feature matching or by a direct method, and the relative pose between the two frames is computed from those pairs with the five-point algorithm. The coordinate establishment unit 2023 selects the offline initial frame image in the environment scene video, establishes the world coordinate system of the environment scene from it, and determines the coordinates of the 3D points in the scene. The key frame selection unit 2024 parses the image frames after the offline initial frame image; if the recurrence rate in the current frame image of the 3D points of the offline initial frame image is less than a preset threshold, the current frame image is taken as an offline key frame image of the environment scene video, and parsing continues in the same way after each new offline key frame image until all image frames have been parsed and the complete set of offline key frame images is obtained. The pose graph construction unit 2025 completes the construction of the pose graph from the pose sequence of the image frames, and the coordinates of the 3D points corresponding to the 2D points of each frame, obtained by the pose calculation unit 2022 through sequential registration of the video's image frames. Loop closure detection corrects the accumulated error, and the pose graph is optimized under the pose and constraint conditions to correct the pose data of every image frame of the environment scene video.
In the above technical solution, preferably, the database construction module 203 builds a search tree or a bag of words from the image features of the offline initial frame image and the offline key frame images in the environment scene video image.
In the above technical solution, preferably, the server 200 for implementing augmented reality further includes a pose setting module 204, configured to set the pose of the virtual object to be displayed in the environment scene.
In the above technical solution, preferably, the server 200 for implementing augmented reality further includes a retrieval module 205. The retrieval module 205 is configured to receive a request, sent by a terminal, for acquiring the image retrieval database of an environment scene, and to send the image retrieval database of the corresponding environment scene to the terminal. Specifically, the server stores the image retrieval databases of one or more environment scenes. The request sent by the terminal includes a location tag of the environment scene where the terminal is located, for example the place name of the scene or GPS information. The retrieval module 205 retrieves the image retrieval database of the corresponding environment scene according to the location tag and sends it to the terminal.
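A minimal sketch of the location-tag lookup, assuming scenes are keyed by name with stored GPS coordinates (the scene names, distance bound, and equirectangular approximation are illustrative assumptions, not details from the patent):

```python
import math

def pick_scene_database(databases, query_lat, query_lon, max_km=1.0):
    """Pick the stored environment-scene database nearest to the query GPS.

    databases: mapping scene_name -> (lat, lon) of the stored scene.
    Returns the scene name, or None if nothing lies within max_km.
    Uses an equirectangular approximation, adequate at city scale.
    """
    best, best_km = None, max_km
    for name, (lat, lon) in databases.items():
        x = math.radians(query_lon - lon) * math.cos(math.radians((query_lat + lat) / 2))
        y = math.radians(query_lat - lat)
        km = 6371.0 * math.hypot(x, y)  # mean Earth radius in km
        if km <= best_km:
            best, best_km = name, km
    return best
```

A place-name tag would reduce this to a plain dictionary lookup; the GPS path covers the case where only coordinates are sent.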
Embodiment 3
As shown in FIG. 4, the terminal 300 for implementing augmented reality in this embodiment of the present invention includes an image acquisition module 301, a feature extraction module 302, a map creation module 303, an image tracking module 304, a data acquisition module 305 and a display module 306. The image acquisition module 301 is configured to capture images of the environment scene in real time. The feature extraction module 302 is configured to extract image features from the environment scene images captured by the image acquisition module 301. The map creation module 303 is configured to perform initial localization of the environment scene according to the image features extracted by the feature extraction module 302 and to build a local map of the environment scene. The image tracking module 304 is configured to track the image frames in the environment scene images captured by the image acquisition module 301. The data acquisition module 305 is configured to acquire the pose of the virtual object to be displayed. The display module 306 is configured to display the virtual object in the current frame image of the environment scene image according to the pose acquired by the data acquisition module 305.
In this technical solution, the image acquisition module 301 captures images of the environment scene in real time. The feature extraction module 302 extracts image features, for example ORB features, from the real-time images. The map creation module 303 performs initial localization on the environment scene images according to the extracted image features and builds a local map of the environment scene. The image tracking module 304 tracks the image frames in the environment scene images according to the extracted image features, determining the pose of each image frame and the position coordinates of the 3D points in it. The data acquisition module 305 acquires the pose of the virtual object to be displayed, and the display module 306 displays the virtual object in the current frame image of the environment scene image according to that pose.
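The frame-pose bookkeeping behind such tracking is typically pose composition with homogeneous transforms. A sketch under that common convention (4x4 matrices; not the patent's exact formulation):

```python
import numpy as np

def chain_pose(prev_pose, rel_pose):
    """Compose the previous frame's pose with a frame-to-frame relative
    pose to obtain the current frame's pose, both expressed as 4x4
    homogeneous transforms. A standard SLAM tracking step, sketched
    here as an illustration rather than the patent's implementation."""
    return prev_pose @ rel_pose

# Example: start at the origin and move 1 m along the x axis twice.
step = np.eye(4)
step[0, 3] = 1.0
pose0 = np.eye(4)
pose2 = chain_pose(chain_pose(pose0, step), step)
```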
In the above technical solution, preferably, the terminal 300 for implementing augmented reality further includes a positioning module 307. The positioning module 307 acquires, from the server 200, the image retrieval database corresponding to the environment scene according to a location tag, such as the place name or GPS information, of the environment scene where the terminal 300 is located.
In the above technical solution, preferably, the map creation module 303, as shown in FIG. 5, includes an image parsing unit 3031, an initial positioning unit 3032 and a map establishing unit 3033. The image parsing unit 3031 is configured to parse the image features of the current frame image and to retrieve, from the image retrieval database, an offline key frame image that meets a first preset condition. The initial positioning unit 3032 is configured to perform the initial localization of the environment scene according to the relative pose between the current frame image and the offline key frame image. The map establishing unit 3033 builds a local map of the environment scene from the 3D points of the current frame image that meet a second preset condition.
In this technical solution, the image parsing unit 3031 parses the image features, for example ORB features, of the current frame image captured in real time by the image acquisition module 301, and retrieves, from the image retrieval database of the environment scene acquired in advance, an offline key frame image that meets the first preset condition. Specifically, image feature extraction is performed on the current frame image to obtain its ORB features, and the dimensionality of the ORB features is reduced to obtain a retrieval tag for image retrieval. A preliminary search with this tag in the pre-acquired image retrieval database corresponding to the environment scene yields a set of candidate offline key frame images. The current frame image is then registered against the candidate offline key frame images by image feature matching, and the candidate with the highest matching degree with the current frame image is selected.
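One common realization of the "retrieval tag" step is a bag-of-words vector: descriptors are quantized into visual-word ids against a vocabulary, and candidate key frames are ranked by histogram similarity. A toy sketch under that assumption (the quantization itself is presumed done; names are illustrative):

```python
from collections import Counter
import math

def bow_histogram(words):
    """Normalized visual-word histogram from quantized feature ids."""
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def rank_candidates(query_words, keyframe_words, top_k=2):
    """Rank offline key frames by cosine similarity of word histograms.

    query_words: visual-word ids of the current frame (its descriptors
    already quantized against a vocabulary, i.e. the dimension-reduction
    step); keyframe_words: mapping frame_id -> word ids of that key frame.
    Returns the top_k frame ids, best first.
    """
    q = bow_histogram(query_words)
    qn = math.sqrt(sum(v * v for v in q.values()))
    scores = {}
    for fid, words in keyframe_words.items():
        h = bow_histogram(words)
        dot = sum(q.get(w, 0.0) * v for w, v in h.items())
        hn = math.sqrt(sum(v * v for v in h.values()))
        scores[fid] = dot / (qn * hn)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

The shortlist produced this way would then be verified by full feature registration, as the paragraph above describes.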
The initial positioning unit 3032 calculates the relative pose between the current frame image and the offline key frame image with the highest matching degree, establishes the coordinate system of the environment scene, completes the initial localization of the environment scene, and marks the current frame image as the initial key frame image. The map establishing unit 3033 builds an initial local map from the 3D points visible in the current frame image. In addition, the offline key frame image with the highest matching degree, the initial key frame image and the 3D points of the initial key frame image that meet the second preset condition are added to the local map of the environment scene.
In the above technical solution, preferably, the image tracking module 304, as shown in FIG. 6, includes a detecting unit 3041, a judging unit 3042 and a pose calculating unit 3043. The detecting unit 3041 is configured to detect, according to the previous frame image of the environment scene image, the image features of the current frame image that match the previous frame image. The judging unit 3042 is configured to judge whether the number of matched image features is greater than or equal to a preset threshold. The pose calculating unit 3043 is configured to, when the judging unit 3042 judges that the number of matched image features is greater than or equal to the preset threshold, calculate the pose and 3D point coordinates of the current frame image according to the pose and 3D point coordinates of the previous frame image.
In the above technical solution, preferably, the data acquisition module 305 is further configured to, when the judging unit 3042 judges that the number of matched image features is less than the preset threshold, retrieve from the image retrieval database of the environment scene an offline key frame image that matches the current frame image. The pose calculating unit 3043 is further configured to calculate the pose and 3D point coordinates of the current frame image according to the pose and 3D point coordinates of the retrieved offline key frame image.
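The track-or-relocalize branch described here reduces to a threshold test on the match count. A minimal sketch (all names are illustrative; `retrieve_keyframe` stands in for the database query):

```python
def choose_reference(n_matches, threshold, last_frame, retrieve_keyframe):
    """Decide the reference for computing the current frame's pose.

    If enough features match the previous frame, keep tracking from it;
    otherwise fall back to relocalization against the offline key frame
    database (retrieve_keyframe is a callable performing that lookup).
    """
    if n_matches >= threshold:
        return ("track", last_frame)
    return ("relocalize", retrieve_keyframe())
```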
In the above technical solution, preferably, the pose calculating unit 3043 is further configured to calculate the relative pose between the current frame image of the environment scene image and the virtual object to be displayed, and the display module 306 is further configured to display the virtual object in the current frame image according to that relative pose.
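Displaying the virtual object from such a relative pose amounts to a standard pinhole projection of the object's 3D points into the current frame. A sketch with assumed camera intrinsics (the intrinsics and point values are illustrative, not from the patent):

```python
import numpy as np

def project_virtual_object(points_obj, T_cam_obj, K):
    """Project a virtual object's 3D points into the current frame.

    points_obj: (N, 3) points in the object's own frame; T_cam_obj: 4x4
    relative pose of the object with respect to the current camera;
    K: 3x3 camera intrinsics. Returns (N, 2) pixel coordinates.
    """
    pts = np.hstack([points_obj, np.ones((len(points_obj), 1))])
    cam = (T_cam_obj @ pts.T)[:3]      # points expressed in the camera frame
    pix = K @ cam                      # apply intrinsics
    return (pix[:2] / pix[2]).T        # perspective divide

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])        # assumed intrinsics
T = np.eye(4)
T[2, 3] = 2.0                          # object 2 m in front of the camera
uv = project_virtual_object(np.array([[0.0, 0.0, 0.0]]), T, K)
```

With the object centered on the optical axis, the projection lands at the principal point, so the overlay position follows the relative pose directly.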
Embodiment 4
As shown in FIG. 7, the system 400 for implementing augmented reality in this embodiment of the present invention includes at least one server 200 implementing augmented reality and at least one terminal 300 implementing augmented reality.
According to the method, server, terminal and system for implementing augmented reality of the embodiments of the present invention, images of the environment scene are captured in real time, image tracking is performed according to the image features in the environment scene images, the relative pose between the virtual object to be displayed and the environment scene is determined, and the virtual object is displayed in the environment scene image. The embodiments of the present invention achieve superimposed display of virtual objects on an environment scene or target object without a template, effectively reducing the dependence of existing augmented reality technology on planar templates, improving the accuracy of real-time registration between virtual objects and the real environment scene, and significantly enhancing the coordination and consistency of virtual objects superimposed into environment scene images.
Again, all features disclosed in this specification, or the steps of all methods or processes disclosed, may be combined in any manner, except for mutually exclusive features and/or steps.
Any feature disclosed in this specification (including any appended claims, abstract and drawings) may, unless specifically stated otherwise, be replaced by other equivalent or alternative features serving a similar purpose. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.
The present invention is not limited to the foregoing specific embodiments. The present invention extends to any new feature or any new combination disclosed in this specification, and to the steps of any new method or process or any new combination disclosed.

Claims (23)

  1. A method for implementing augmented reality, characterized by comprising the following steps:
    capturing images of an environment scene in real time;
    extracting image features from the environment scene images;
    performing initial localization of the environment scene according to the image features, and building a local map of the environment scene;
    tracking image frames in the environment scene images;
    displaying a virtual object in the current frame image of the environment scene images according to the pose of the virtual object to be displayed.
  2. The method for implementing augmented reality according to claim 1, characterized in that, before the step of capturing images of the environment scene in real time, the method further comprises:
    acquiring a video image of the environment scene;
    performing three-dimensional scene reconstruction of the environment scene according to the environment scene video image;
    constructing an image retrieval database of the environment scene according to the environment scene video image;
    presetting the pose of the virtual object to be displayed in the environment scene according to the environment scene video image.
  3. The method for implementing augmented reality according to claim 2, characterized in that the step of performing three-dimensional scene reconstruction of the environment scene according to the environment scene video image specifically comprises:
    performing image feature extraction on the image frames of the environment scene video image;
    performing inter-frame image registration on the environment scene video image according to the image features, and calculating the relative poses between the image frames;
    selecting an offline initial frame image in the environment scene video image, establishing a world coordinate system of the environment scene according to the offline initial frame image, and determining the coordinates of the 3D points in the environment scene;
    determining offline key frame images in the environment scene video image according to the offline initial frame image;
    constructing a pose graph from the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image, and optimizing the pose graph to update the poses and 3D point coordinates of those offline key frame images.
  4. The method for implementing augmented reality according to claim 3, characterized in that the step of constructing an image retrieval database of the environment scene according to the environment scene video image specifically comprises:
    building a search tree or a bag of words from the image features of the offline initial frame image and the offline key frame images in the environment scene video image.
  5. The method for implementing augmented reality according to any one of claims 1 to 4, characterized by further comprising: acquiring the image retrieval database of the environment scene according to location information of the environment scene.
  6. The method for implementing augmented reality according to claim 5, characterized in that the step of performing initial localization of the environment scene according to the image features and building a local map of the environment scene specifically comprises:
    parsing the image features of the current frame image, and retrieving from the image retrieval database an offline key frame image that meets a first preset condition;
    performing the initial localization of the environment scene according to the relative pose between the current frame image and the offline key frame image;
    building a local map of the environment scene from the 3D points of the current frame image that meet a second preset condition.
  7. The method for implementing augmented reality according to claim 6, characterized in that the step of performing initial localization of the environment scene according to the image features and building a local map of the environment scene further comprises:
    adding the current frame image, the offline key frame image that meets the first preset condition, and the 3D points of the current frame image that meet the second preset condition to the local map of the environment scene.
  8. The method for implementing augmented reality according to claim 7, characterized in that the step of tracking image frames in the environment scene images specifically comprises:
    detecting, according to the previous frame image of the environment scene image, the image features of the current frame image that match the previous frame image;
    judging whether the number of matched image features is greater than or equal to a preset threshold;
    if the number of matched image features is greater than or equal to the preset threshold, determining the pose and 3D point coordinates of the current frame image according to the pose and 3D point coordinates of the previous frame image;
    if the number of matched image features is less than the preset threshold, retrieving from the image retrieval database of the environment scene an offline key frame image that matches the current frame image, and determining the pose and 3D point coordinates of the current frame image according to the pose and 3D point coordinates of the offline key frame image.
  9. The method for implementing augmented reality according to claim 8, characterized in that the step of tracking image frames in the environment scene images further comprises:
    judging whether the pose of the current frame image meets a third preset condition, and if so, adding the current frame image to the local map of the environment scene and to the image retrieval database of the environment scene;
    constructing a pose graph from the poses and 3D point coordinates of all image frames in the local map of the environment scene, optimizing the pose graph, and updating the poses and 3D point coordinates of all image frames in the local map.
  10. The method for implementing augmented reality according to claim 9, characterized in that the step of tracking image frames in the environment scene images further comprises:
    performing loop closure detection on the image frames added to the image retrieval database of the environment scene, and, if a loop closure is detected, updating the image retrieval database of the environment scene.
  11. The method for implementing augmented reality according to claim 10, characterized in that the step of displaying the virtual object in the current frame image of the environment scene images according to the pose of the virtual object to be displayed specifically comprises:
    acquiring the pose of the virtual object to be displayed, and displaying the virtual object in the current frame image of the environment scene image according to the relative pose between the current frame image and the virtual object to be displayed.
  12. A server for implementing augmented reality, characterized by comprising:
    a video acquisition module, configured to acquire a video image of an environment scene;
    a scene reconstruction module, configured to perform three-dimensional scene reconstruction of the environment scene according to the environment scene video image acquired by the video acquisition module;
    a database construction module, configured to construct an image retrieval database of the environment scene according to the environment scene video image acquired by the video acquisition module.
  13. The server for implementing augmented reality according to claim 12, characterized in that the scene reconstruction module comprises:
    a feature extraction unit, configured to perform image feature extraction on the image frames of the environment scene video image;
    a pose calculation unit, configured to perform inter-frame image registration on the environment scene video image according to the image features extracted by the feature extraction unit, and to calculate the relative poses between the frames;
    a coordinate establishing unit, configured to select an offline initial frame image in the environment scene video image, establish a world coordinate system of the environment scene according to the offline initial frame image, and determine the coordinates of the 3D points in the environment scene;
    a key frame selection unit, configured to determine offline key frame images in the environment scene video image according to the offline initial frame image;
    a pose graph construction unit, configured to construct a pose graph from the poses and 3D point coordinates of all offline key frame images between the offline initial frame image and the current frame image, and to optimize the pose graph and update the poses and 3D point coordinates of those offline key frame images.
  14. The server for implementing augmented reality according to claim 13, characterized in that the database construction module is specifically configured to build a search tree or a bag of words from the image features of the offline initial frame image and the offline key frame images in the environment scene video image.
  15. The server for implementing augmented reality according to claim 14, characterized by further comprising:
    a pose setting module, configured to set the pose of the virtual object to be displayed in the environment scene.
  16. The server for implementing augmented reality according to claim 14 or 15, characterized by further comprising:
    a retrieval module, configured to receive a request, sent by a terminal, for acquiring the image retrieval database of an environment scene, and to send the image retrieval database of the corresponding environment scene to the terminal.
  17. A terminal for implementing augmented reality, characterized by comprising:
    an image acquisition module, configured to capture images of an environment scene in real time;
    a feature extraction module, configured to extract image features from the environment scene images captured by the image acquisition module;
    a map creation module, configured to perform initial localization on the environment scene images according to the image features extracted by the feature extraction module, and to build a local map of the environment scene;
    an image tracking module, configured to track image frames in the environment scene images captured by the image acquisition module;
    a data acquisition module, configured to acquire the pose of a virtual object to be displayed;
    a display module, configured to display the virtual object in the current frame image of the environment scene images according to the pose of the virtual object to be displayed acquired by the data acquisition module.
  18. The terminal for implementing augmented reality according to claim 17, characterized by further comprising a positioning module, configured to determine location information of the environment scene; and
    the data acquisition module is further configured to acquire the image retrieval database corresponding to the environment scene according to the location information of the environment scene.
  19. The terminal for implementing augmented reality according to claim 18, characterized in that the map creation module comprises:
    an image parsing unit, configured to parse the image features of the current frame image, and to retrieve from the image retrieval database an offline key frame image that meets a first preset condition;
    an initial positioning unit, configured to perform the initial localization of the environment scene according to the relative pose between the current frame image and the offline key frame image;
    a map establishing unit, configured to build a local map of the environment scene from the 3D points of the current frame image that meet a second preset condition.
  20. The terminal for implementing augmented reality according to claim 19, characterized in that the image tracking module comprises:
    a detecting unit, configured to detect, according to the previous frame image of the environment scene image, the image features of the current frame image that match the previous frame image;
    a judging unit, configured to judge whether the number of matched image features is greater than or equal to a preset threshold;
    a pose calculating unit, configured to, when the judging unit judges that the number of matched image features is greater than or equal to the preset threshold, calculate the pose and 3D point coordinates of the current frame image according to the pose and 3D point coordinates of the previous frame image.
  21. The terminal for implementing augmented reality according to claim 20, characterized in that the data acquisition module is further configured to, when the judging unit judges that the number of matched image features is less than the preset threshold, retrieve from the image database of the environment scene an offline key frame image that matches the current frame image; and
    the pose calculating unit is further configured to calculate the pose and 3D point coordinates of the current frame image according to the pose and 3D point coordinates of the offline key frame image retrieved by the data acquisition module.
  22. The terminal for implementing augmented reality according to claim 20 or 21, characterized in that the pose calculating unit is further configured to calculate the relative pose between the current frame image of the environment scene image and the virtual object to be displayed; and
    the display module is further configured to display the virtual object in the current frame image according to the relative pose calculated by the pose calculating unit.
  23. A system for implementing augmented reality, characterized by comprising the server for implementing augmented reality according to any one of claims 12 to 16, and the terminal for implementing augmented reality according to any one of claims 17 to 22.
PCT/CN2017/070138 2016-01-29 2017-01-04 Method, server, terminal and system for implementing augmented reality WO2017128934A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610070926.7 2016-01-29
CN201610070926.7A CN107025662B (en) 2016-01-29 2016-01-29 Method, server, terminal and system for realizing augmented reality

Publications (1)

Publication Number Publication Date
WO2017128934A1 true WO2017128934A1 (en) 2017-08-03

Family

ID=59397401

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/070138 WO2017128934A1 (en) 2016-01-29 2017-01-04 Method, server, terminal and system for implementing augmented reality

Country Status (2)

Country Link
CN (1) CN107025662B (en)
WO (1) WO2017128934A1 (en)

Families Citing this family (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107888828B (en) * 2017-11-22 2020-02-21 杭州易现先进科技有限公司 Space positioning method and device, electronic device, and storage medium
CN107993290A (en) * 2017-12-18 2018-05-04 快创科技(大连)有限公司 It is a kind of that demo system is assembled based on AR and the mechanical part of cloud storage technology
CN108090968B (en) * 2017-12-29 2022-01-25 光锐恒宇(北京)科技有限公司 Method and device for realizing augmented reality AR and computer readable storage medium
TWI744610B (en) * 2018-03-01 2021-11-01 宏達國際電子股份有限公司 Scene reconstructing system, scene reconstructing method and non-transitory computer-readable medium
CN108765921A (en) * 2018-04-04 2018-11-06 昆山市工研院智能制造技术有限公司 View-based access control model lexical analysis is applied to the intelligent patrol method of patrol robot
CN108564045B (en) * 2018-04-17 2020-12-04 广州腾讯科技有限公司 Augmented reality data processing method and device, storage medium and computer equipment
CN109126121B (en) * 2018-06-01 2022-01-04 成都通甲优博科技有限责任公司 AR terminal interconnection method, system, device and computer readable storage medium
CN108961197A (en) * 2018-06-27 2018-12-07 联想(北京)有限公司 A kind of object synthetic method and device
CN109636916B (en) * 2018-07-17 2022-12-02 北京理工大学 Dynamic calibration large-range virtual reality roaming system and method
CN110913279B (en) * 2018-09-18 2022-11-01 中科海微(北京)科技有限公司 Processing method for augmented reality and augmented reality terminal
CN111028358B (en) * 2018-10-09 2024-02-27 香港理工大学深圳研究院 Indoor environment augmented reality display method and device and terminal equipment
CN111815755B (en) * 2019-04-12 2023-06-30 Oppo广东移动通信有限公司 Method and device for determining blocked area of virtual object and terminal equipment
CN110264493B (en) * 2019-06-17 2021-06-18 北京影谱科技股份有限公司 Method and device for tracking multiple target objects in motion state
CN110275968A (en) * 2019-06-26 2019-09-24 北京百度网讯科技有限公司 Image processing method and device
CN110866977B (en) * 2019-10-31 2023-06-16 Oppo广东移动通信有限公司 Augmented reality processing method, device, system, storage medium and electronic equipment
CN112148189A (en) * 2020-09-23 2020-12-29 北京市商汤科技开发有限公司 Interaction method and device in AR scene, electronic equipment and storage medium
CN112348964B (en) * 2020-10-11 2024-06-04 中国运载火箭技术研究院 Augmented reality glasses
CN113256802A (en) * 2021-06-17 2021-08-13 中山大学 Virtual three-dimensional reconstruction and scene creation method for building
CN113641325B (en) * 2021-10-19 2022-02-08 深圳市联志光电科技有限公司 Image acquisition method and system for AR display
CN114279456B (en) * 2021-12-06 2024-04-30 纵目科技(上海)股份有限公司 Picture construction/vehicle positioning method, system, terminal and computer storage medium
CN114900545A (en) * 2022-05-10 2022-08-12 中国电信股份有限公司 Augmented reality implementation method and system and cloud server
CN115665400B (en) * 2022-09-06 2024-05-28 东软集团股份有限公司 Augmented reality head-up display imaging method, device, equipment and storage medium
CN116017010B (en) * 2022-12-01 2024-05-17 凡游在线科技(成都)有限公司 Video-based AR fusion processing method, electronic device and computer readable medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103733229A (en) * 2011-08-24 2014-04-16 索尼公司 Information processing device, information processing method, and program
WO2015048906A1 (en) * 2013-10-03 2015-04-09 Sulon Technologies Inc. Augmented reality system and method for positioning and mapping
WO2015090420A1 (en) * 2013-12-19 2015-06-25 Metaio Gmbh Slam on a mobile device
CN105025272A (en) * 2015-07-28 2015-11-04 深圳乐行天下科技有限公司 Robot and hybrid video stream generation method thereof
CN105143821A (en) * 2013-04-30 2015-12-09 高通股份有限公司 Wide area localization from SLAM maps

Cited By (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107967457B (en) * 2017-11-27 2024-03-19 全球能源互联网研究院有限公司 Site identification and relative positioning method and system adapting to visual characteristic change
CN107967457A (en) * 2017-11-27 2018-04-27 全球能源互联网研究院有限公司 A kind of place identification for adapting to visual signature change and relative positioning method and system
CN108389264A (en) * 2018-02-07 2018-08-10 网易(杭州)网络有限公司 Coordinate system determines method, apparatus, storage medium and electronic equipment
CN110858414A (en) * 2018-08-13 2020-03-03 北京嘀嘀无限科技发展有限公司 Image processing method and device, readable storage medium and augmented reality system
CN110849386A (en) * 2018-08-21 2020-02-28 三星电子株式会社 Method for providing image to vehicle and electronic device thereof
CN111274847A (en) * 2018-12-04 2020-06-12 上海汽车集团股份有限公司 Positioning method
CN112149471A (en) * 2019-06-28 2020-12-29 北京初速度科技有限公司 Loopback detection method and device based on semantic point cloud
CN112149471B (en) * 2019-06-28 2024-04-16 北京初速度科技有限公司 Loop detection method and device based on semantic point cloud
CN110457414A (en) * 2019-07-30 2019-11-15 Oppo广东移动通信有限公司 Offline map processing, virtual objects display methods, device, medium and equipment
WO2021046699A1 (en) * 2019-09-10 2021-03-18 Beijing Voyager Technology Co., Ltd. Systems and methods for positioning
US11940279B2 (en) 2019-09-10 2024-03-26 Beijing Voyager Technology Co., Ltd. Systems and methods for positioning
CN110619661A (en) * 2019-09-18 2019-12-27 王伟乾 Method for measuring volume of outdoor stock ground raw material based on augmented reality
CN110765620A (en) * 2019-10-28 2020-02-07 上海科梁信息工程股份有限公司 Aircraft visual simulation method, system, server and storage medium
CN110765620B (en) * 2019-10-28 2024-03-08 上海科梁信息科技股份有限公司 Aircraft visual simulation method, system, server and storage medium
US12020385B2 (en) 2019-12-24 2024-06-25 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Augmented reality processing method, storage medium, and electronic device
CN111179435A (en) * 2019-12-24 2020-05-19 Oppo广东移动通信有限公司 Augmented reality processing method, device and system, storage medium and electronic equipment
CN111179435B (en) * 2019-12-24 2024-02-06 Oppo广东移动通信有限公司 Augmented reality processing method, device, system, storage medium and electronic equipment
CN111177167A (en) * 2019-12-25 2020-05-19 Oppo广东移动通信有限公司 Augmented reality map updating method, device, system, storage and equipment
CN111177167B (en) * 2019-12-25 2024-01-19 Oppo广东移动通信有限公司 Augmented reality map updating method, device, system, storage and equipment
CN113643356A (en) * 2020-04-27 2021-11-12 北京达佳互联信息技术有限公司 Camera pose determination method, camera pose determination device, virtual object display method, virtual object display device and electronic equipment
CN113643356B (en) * 2020-04-27 2024-05-28 北京达佳互联信息技术有限公司 Camera pose determination method, virtual object display method, device and electronic equipment
CN113822931A (en) * 2020-07-07 2021-12-21 湖北亿立能科技股份有限公司 Front-end water level detection system based on combination of online learning and offline learning
CN113822931B (en) * 2020-07-07 2024-04-19 湖北亿立能科技股份有限公司 Front-end water level detection system based on combination of online learning and offline learning
CN112270748B (en) * 2020-11-18 2024-03-19 Oppo广东移动通信有限公司 Three-dimensional reconstruction method and device based on image
CN112270748A (en) * 2020-11-18 2021-01-26 Oppo广东移动通信有限公司 Three-dimensional reconstruction method and device based on image
CN112862876A (en) * 2021-01-29 2021-05-28 中国科学院深海科学与工程研究所 Real-time deep sea video image enhancement method for underwater robot
CN113160102A (en) * 2021-04-25 2021-07-23 北京华捷艾米科技有限公司 Method, device and equipment for reconstructing three-dimensional scene and storage medium
CN113452962B (en) * 2021-06-22 2022-08-05 北京邮电大学 Data center enhanced inspection system and method with space collaborative perception
CN113452962A (en) * 2021-06-22 2021-09-28 北京邮电大学 Data center enhanced inspection system and method with space collaborative perception

Also Published As

Publication number Publication date
CN107025662A (en) 2017-08-08
CN107025662B (en) 2020-06-09

Similar Documents

Publication Publication Date Title
WO2017128934A1 (en) Method, server, terminal and system for implementing augmented reality
CN107025661B (en) Method, server, terminal and system for realizing augmented reality
US11151744B1 (en) Automated data capture
CN108596974B (en) Dynamic scene robot positioning and mapping system and method
CN109506658B (en) Robot autonomous positioning method and system
TWI574223B (en) Navigation system using augmented reality technology
CN107665506B (en) Method and system for realizing augmented reality
US20170161546A1 (en) Method and System for Detecting and Tracking Objects and SLAM with Hierarchical Feature Grouping
CN106548519A (en) Augmented reality method based on ORB SLAM and the sense of reality of depth camera
CN105631861A (en) Method of restoring three-dimensional human body posture from unmarked monocular image in combination with height map
JP7147753B2 (en) Information processing device, information processing method, and program
CN107665505B (en) Method and device for realizing augmented reality based on plane detection
CN111784775B (en) Identification-assisted visual inertia augmented reality registration method
CN112734841B (en) Method for realizing positioning by using wheel type odometer-IMU and monocular camera
CN107665507B (en) Method and device for realizing augmented reality based on plane detection
CN107665508B (en) Method and system for realizing augmented reality
US9239965B2 (en) Method and system of tracking object
Liu et al. Towards SLAM-based outdoor localization using poor GPS and 2.5 D building models
CN106574836A (en) A method for localizing a robot in a localization plane
US20210192781A1 (en) Position estimation apparatus, tracker, position estimation method, and program
Zhu et al. Real-time global localization with a pre-built visual landmark database
KR102463698B1 (en) System and method for building a location information database of road sign, apparatus and method for estimating location of vehicle using the same
CN114937293A (en) Agricultural service management method and system based on GIS
KR102029741B1 (en) Method and system of tracking object
CN112329723A (en) Binocular camera-based multi-person human body 3D skeleton key point positioning method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17743549

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17743549

Country of ref document: EP

Kind code of ref document: A1