WO2022095543A1 - Image frame stitching method and apparatus, readable storage medium, and electronic device - Google Patents

Image frame stitching method and apparatus, readable storage medium, and electronic device

Info

Publication number
WO2022095543A1
WO2022095543A1 (PCT/CN2021/113122; CN2021113122W)
Authority
WO
WIPO (PCT)
Prior art keywords
scene images
panoramic
video stream
preview video
image
Prior art date
Application number
PCT/CN2021/113122
Other languages
English (en)
French (fr)
Inventor
施文博
Original Assignee
贝壳技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 贝壳技术有限公司
Publication of WO2022095543A1

Links

Images

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21Server components or server architectures
    • H04N21/218Source of audio or video content, e.g. local disk arrays
    • H04N21/21805Source of audio or video content, e.g. local disk arrays enabling multiple viewpoints, e.g. using a plurality of cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4038Image mosaicing, e.g. composing plane images from plane sub-images
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23424Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00Details of television systems
    • H04N5/222Studio circuitry; Studio devices; Studio equipment
    • H04N5/262Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects ; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265Mixing

Definitions

  • the present disclosure relates to computer vision technology, and in particular, to an image frame stitching method and apparatus, a computer-readable storage medium, an electronic device, and a computer program product.
  • In the related art, a panoramic image is produced by stitching multiple images together to achieve a wide-angle effect and show more of a scene.
  • In some special scenes, however, such as repetitive textures in buildings, occlusion by walls, and visually similar spaces, incorrect stitching frequently occurs.
  • an image frame stitching method, comprising: acquiring a preview video stream captured by moving a panoramic photographing device in a set space; in response to a plurality of shooting instructions received while the panoramic photographing device is being moved, acquiring, through the panoramic photographing device, images of multiple positions in the set space to obtain multiple frames of scene images; estimating corresponding pose information of the multiple frames of scene images based on the preview video stream; and stitching the multiple frames of scene images based on their corresponding pose information to obtain a panoramic image of the set space.
  • an image frame splicing apparatus including a device for implementing the above-mentioned image frame splicing method.
  • a computer-readable storage medium stores a computer program, and the computer program is used to execute the above-mentioned image frame stitching method.
  • an electronic device includes: a processor; and a memory for storing instructions executable by the processor, wherein the processor reads the executable instructions from the memory and executes them to implement the above image frame stitching method.
  • a computer program product including a computer program, wherein the computer program implements the above-mentioned image frame stitching method when executed by a processor.
  • FIG. 1 is a flowchart of an image frame stitching method according to an embodiment of the present disclosure.
  • FIG. 2 is a flowchart of an image frame stitching method according to yet another embodiment of the present disclosure.
  • FIG. 3 is a flowchart of an image frame stitching method according to still another embodiment of the present disclosure.
  • FIG. 4 is a flowchart of an image frame stitching method according to another embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of an image frame stitching apparatus according to an embodiment of the present disclosure.
  • FIG. 6 is a structural diagram of an electronic device according to an exemplary embodiment of the present disclosure.
  • a plurality may refer to two or more, and “at least one” may refer to one, two or more.
  • the term "and/or" in the present disclosure merely describes an association between objects and indicates that three relationships are possible; for example, "A and/or B" covers three cases: A alone, both A and B, and B alone.
  • the character "/" in the present disclosure generally indicates that the related objects are an "or" relationship.
  • Embodiments of the present disclosure can be applied to electronic devices such as terminal devices, computer systems, servers, etc., which can operate with numerous other general-purpose or special-purpose computing system environments or configurations.
  • Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with electronic devices such as terminal devices, computer systems, and servers include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the foregoing, among others.
  • Electronic devices such as terminal devices, computer systems, servers, etc., may be described in the general context of computer system-executable instructions, such as program modules, being executed by the computer system.
  • program modules may include routines, programs, object programs, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • Computer systems/servers may be implemented in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located on local or remote computing system storage media including storage devices.
  • FIG. 1 is a flowchart of an image frame stitching method according to an exemplary embodiment of the present disclosure. This embodiment can be applied to electronic equipment. As shown in FIG. 1 , the image frame splicing method includes the following steps:
  • S102 Acquire a preview video stream shot by moving the panoramic shooting device in the set space.
  • A panoramic shooting device refers to a device provided with a panoramic camera and a controller, where the panoramic camera can be a fisheye panoramic camera, a multi-lens panoramic camera, or a mobile client that can generate a panoramic shooting effect, and the controller can include a SLAM (simultaneous localization and mapping) system.
  • the preview video stream refers to the continuous image frame data generated after the moving panoramic shooting device has been initialized.
  • S104 in response to the multiple shooting instructions received during the process of moving the panoramic shooting device, acquire images of multiple positions in the set space through the panoramic shooting device to obtain multiple frames of scene images.
  • the preview video stream can be viewed in real time through a remote device such as a mobile phone, and a shooting instruction can be sent through the remote device to realize remote control.
  • the corresponding pose information of the multi-frame scene images is used to represent the displacement and posture of the panoramic photographing device corresponding to the corresponding scene images in the multi-frame scene images.
  • the preview video stream a1 is obtained by moving the fisheye panoramic camera in the room A1 of the house A.
  • the user continuously shoots the room A1 with the fisheye panoramic camera to obtain images of multiple positions of the room A1, and obtains multiple frames of scene images of the room A1.
  • the user continues to move the fisheye panoramic camera to the next room A2, and shoots the room A2 in the same way.
  • the corresponding pose information of the multi-frame scene images of house A is estimated based on all the preview video streams, and the interrelationships between the multi-frame scene images of house A are determined according to the corresponding pose information. By stitching adjacent scene images together, a panoramic image of house A can be obtained.
  • According to the embodiments of the present disclosure, a preview video stream captured by moving a panoramic photographing device in a set space is acquired; in response to a plurality of shooting instructions received while the panoramic photographing device is being moved, images of multiple positions in the set space are acquired through the device to obtain multi-frame scene images; the corresponding pose information of the multi-frame scene images is estimated based on the preview video stream; and the multi-frame scene images are stitched based on that pose information to obtain a panoramic image of the set space.
  • the embodiments of the present disclosure can effectively solve the problem of incorrect splicing of scene images in a panoramic image by using the corresponding pose information of multiple frames of scene images.
  • the global pose of the panoramic photographing device for photographing the set space can also be estimated, so as to obtain an accurate panoramic image.
  • Before step S106, the method may further include removing a moving object in the preview video stream to obtain a preview video stream with the moving object removed; step S106 may then include estimating the corresponding pose information of the multi-frame scene images based on the preview video stream from which the moving object has been removed.
  • FIG. 2 is a schematic flowchart of an image frame stitching method according to another exemplary embodiment of the present disclosure.
  • the above-mentioned removal of the moving target in the preview video stream may include the following steps:
  • S201 Perform moving object detection on a scene image in a preview video stream to determine whether a moving object is detected.
  • the moving target can be a person or an animal.
  • the embodiments of the present disclosure can determine whether there is a moving target through feature point detection.
  • the preset second neural network refers to a neural network that detects moving objects and removes them, for example, SSD (Single Shot MultiBox Detector, a single-stage deep detection network), YOLO (You Only Look Once), or DeepLab (a dilated-convolution segmentation model).
  • the embodiment of the present disclosure can remove redundant moving objects in the scene image, so that the image frame information is more complete and accurate.
  • FIG. 3 is a schematic flowchart of an image frame stitching method according to another exemplary embodiment of the present disclosure.
  • step S106 may specifically include the following steps:
  • S301 based on the real-time positioning and mapping algorithm and the loop closure detection algorithm, process the motion trajectory of the panoramic photographing device to estimate the pose information of the panoramic photographing device corresponding to the scene image in the preview video stream.
  • the simultaneous localization and mapping (SLAM) algorithm and the loop closure detection algorithm are pre-stored in the SLAM system.
  • the purpose of the SLAM algorithm is to estimate the pose at each moment along the motion trajectory of the panoramic shooting device; the purpose of the loop closure detection algorithm is to determine whether the current scene has appeared before and, if so, to use the earlier observation to correct the estimated trajectory.
  • the embodiments of the present disclosure can use the SLAM algorithm and the loop closure detection algorithm to estimate the pose information of the panoramic shooting device at each moment, thereby estimating the relative displacement and relative rotation between the scene images and ensuring smooth transitions between the frames of scene images.
  • Before step S108, the method may further include acquiring the pose scale of the panoramic shooting device; step S108 may then include stitching the multi-frame scene images based on the pose scale of the panoramic shooting device and the corresponding pose information of the multi-frame scene images.
  • the pose scale is used to represent the ratio of the on-map distance in the multi-frame scene image to the corresponding actual distance in the set space.
  • obtaining the pose scale of the panoramic shooting device may include: obtaining the pose scale based on the actual distance between the panoramic shooting device and a fixed reference object; or processing the preview video stream with a preset first neural network to obtain the pose scale.
  • A fixed reference object can be the floor or ceiling of a room. For example, if the distance between the observation point and the ground in the multi-frame scene images is 1 map unit, and the actual distance between the fisheye panoramic camera placed on its tripod and the ground is 1.5 meters, then the pose scale is 1:1.5. Alternatively, the preview video stream is processed by the preset first neural network, i.e., a neural network that recovers depth information, to determine the pose scale of the fisheye panoramic camera; for example, the pose scale can be obtained by feeding the preview video stream into a suitably trained convolutional neural network model.
  • In this way, the pose scale of the panoramic photographing device is obtained either from the actual distance between the device and a fixed reference object or by feeding the preview video stream into the preset first neural network, so that the multi-frame scene images can be stitched at a consistent real-world scale.
  • FIG. 4 is a schematic flowchart of an image frame stitching method according to another exemplary embodiment of the present disclosure.
  • step S108 may specifically include the following steps:
  • S401 Determine the splicing sequence of the multi-frame scene images based on the corresponding pose information of the multi-frame scene images.
  • the splicing sequence of the multi-frame scene images is used to represent the sequence of continuous changes of the pose corresponding to the panoramic photographing device, that is, the change of translation coordinates and the change of rotation coordinates.
  • S402 Determine a panoramic image of the set space based on the splicing sequence of the multiple frames of scene images.
  • image fusion processing is performed on the overlapping part of the images.
  • the embodiment of the present disclosure can also project the panoramic image onto a spherical surface, a cylindrical surface or a cube, so as to realize all-round view browsing.
  • the embodiments of the present disclosure use the pose information of the multi-frame scene images to stitch them, which effectively solves the problem that the panoramic shooting device easily produces wrong estimates in visually similar spaces and thus stitches the multi-frame scene images incorrectly.
  • Any image frame stitching method provided by the embodiments of the present disclosure may be executed by any appropriate device with data processing capabilities, including but not limited to: terminal devices and servers.
  • any image frame stitching method provided by the embodiments of the present disclosure may be executed by a processor, for example, the processor executes any of the image frame stitching methods mentioned in the embodiments of the present disclosure by invoking corresponding instructions stored in the memory. No further description will be given below.
  • FIG. 5 is a schematic structural diagram of an image frame stitching apparatus according to an exemplary embodiment of the present disclosure.
  • the apparatus can be set in electronic equipment such as terminal equipment and servers, so as to execute the image frame stitching method of any of the above-mentioned embodiments of the present disclosure.
  • the device includes:
  • the first acquiring module 51 is configured to acquire a preview video stream captured by moving the panoramic shooting device in the set space;
  • the first obtaining module 52 is configured to, in response to multiple shooting instructions received in the process of moving the panoramic shooting device, acquire images of multiple positions in the set space through the panoramic shooting device to obtain multi-frame scene images;
  • an estimation module 53 configured to estimate the corresponding pose information of the multi-frame scene images based on the preview video stream
  • the second obtaining module 54 is configured to stitch the multiple frames of scene images based on the corresponding pose information of the multiple frames of scene images to obtain a panoramic image of the set space.
  • According to the image frame stitching apparatus of the embodiments of the present disclosure, a preview video stream captured by moving a panoramic photographing device in a set space is acquired; in response to multiple shooting instructions received while the device is being moved, images of multiple positions in the set space are acquired to obtain multi-frame scene images; the corresponding pose information of the multi-frame scene images is estimated based on the preview video stream; and the multi-frame scene images are stitched based on that pose information to obtain a panoramic image of the set space.
  • the embodiments of the present disclosure can effectively solve the problem of incorrect splicing of scene images in panoramic images by using the corresponding pose information of multiple frames of scene images.
  • the global pose of the panoramic photographing device for photographing the set space can also be estimated, so as to obtain an accurate panoramic image.
  • the estimation module 53 includes:
  • a removing unit configured to remove a moving object in the preview video stream to obtain a preview video stream with the moving object removed
  • the first estimation unit is configured to estimate the corresponding pose information of the multi-frame scene images based on the preview video stream from which the moving object is removed.
  • the removal unit includes:
  • a first determining unit configured to perform moving object detection on the scene image in the preview video stream to determine whether a moving object is detected
  • the processing unit is configured to, in response to detecting the moving target, remove the moving target based on a preset second neural network.
  • the estimation module 53 includes:
  • the second estimation unit is configured to process the motion trajectory of the panoramic photographing device based on the simultaneous localization and mapping algorithm and the loop closure detection algorithm, so as to estimate the pose information of the panoramic photographing device corresponding to the scene images in the preview video stream;
  • the first obtaining unit is configured to obtain the corresponding pose information of the multi-frame scene images based on the pose information of the panoramic photographing device corresponding to the scene images in the preview video stream.
  • the second obtaining module 54 includes:
  • the second obtaining unit is configured to obtain the pose scale of the panoramic photographing device, wherein the pose scale represents the ratio of an on-map distance in the multi-frame scene images to the corresponding actual distance in the set space;
  • the stitching unit is configured to stitch the multiple frames of scene images based on the pose scale of the panoramic photographing device and the corresponding pose information of the multiple frames of scene images to obtain a panoramic image of the set space.
  • the second obtaining unit is configured to: obtain the pose scale of the panoramic shooting device based on the actual distance between the panoramic shooting device and a fixed reference object, or process the preview video stream with the preset first neural network to obtain the pose scale of the panoramic shooting device.
  • the second obtaining module 54 includes:
  • a second determining unit configured to determine the splicing sequence of the multi-frame scene images based on the corresponding pose information of the multi-frame scene images
  • the third determination unit is configured to determine the panoramic image of the set space based on the splicing sequence of the multiple frames of scene images.
  • it also includes:
  • the fusion module is configured to, in response to determining that at least one scene image in the multiple frames of scene images has image overlap, perform image fusion processing on the overlapping portion of the images.
  • the electronic device may be either or both of the first device and the second device, or a stand-alone device independent of them that can communicate with the first device and the second device to receive collected input signals from them.
  • FIG. 6 illustrates a block diagram of an electronic device according to an embodiment of the present disclosure.
  • electronic device 60 includes one or more processors 61 and memory 62 .
  • Processor 61 may be a central processing unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in electronic device 60 to perform desired functions.
  • Memory 62 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory.
  • the volatile memory may include, for example, random access memory (RAM) and/or cache memory, or the like.
  • the non-volatile memory may include, for example, read only memory (ROM), hard disk, flash memory, and the like.
  • One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 61 may execute the program instructions to implement the image frame stitching method of the various embodiments of the present disclosure described above and/or other desired functions.
  • Various contents such as input signals, signal components, noise components, etc. may also be stored in the computer-readable storage medium.
  • the electronic device 60 may also include an input device 63 and an output device 64 interconnected by a bus system and/or other form of connection mechanism (not shown).
  • the input device 63 may be the above-mentioned microphone or microphone array for capturing the input signal of the sound source.
  • the input device 63 may be a communication network connector for receiving the collected input signals from the first device and the second device.
  • the input device 63 may also include, for example, a keyboard, a mouse, and the like.
  • the output device 64 can output various information to the outside, including the determined distance information, direction information, and the like.
  • the output devices 64 may include, for example, displays, speakers, printers, and communication networks and their connected remote output devices, among others.
  • the electronic device 60 may also include any other suitable components according to the specific application.
  • embodiments of the present disclosure may also be computer program products comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps of the image frame stitching method according to various embodiments of the present disclosure described in the "Exemplary Method" section of this specification.
  • the computer program product may write program code for performing the operations of embodiments of the present disclosure in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • embodiments of the present disclosure may also be computer-readable storage media having computer program instructions stored thereon that, when executed by a processor, cause the processor to perform the steps of the image frame stitching method according to various embodiments of the present disclosure described in the "Exemplary Method" section of this specification.
  • the computer-readable storage medium may employ any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • a readable storage medium may include, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus or device, or a combination of any of the above. More specific examples (non-exhaustive list) of readable storage media include: electrical connections with one or more wires, portable disks, hard disks, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
  • the methods and apparatus of the present disclosure may be implemented in many ways.
  • the methods and apparatus of the present disclosure may be implemented in software, hardware, firmware, or any combination of software, hardware, and firmware.
  • the above-described order of steps for the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise.
  • the present disclosure can also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing methods according to the present disclosure.
  • the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
  • each component or each step may be decomposed and/or recombined. Such decompositions and/or recombinations should be considered equivalent solutions of the present disclosure.
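As a minimal numerical illustration of the pose scale described in the embodiments above, consider the tripod example from this disclosure: a map distance of 1 from the observation point to the ground versus an actual camera height of 1.5 meters gives a pose scale of 1:1.5. The sketch below is illustrative only (the helper names `pose_scale` and `to_metric` are assumptions, not terms from the disclosure):

```python
def pose_scale(map_distance, actual_distance):
    """Ratio of a distance measured in map units in the scene images
    to the corresponding physical distance in the set space."""
    return map_distance / actual_distance

def to_metric(map_length, scale):
    """Convert a map-unit length back to a physical length via the pose scale."""
    return map_length / scale
```

With the disclosure's numbers, `pose_scale(1.0, 1.5)` is 2/3, and a span of 2 map units then corresponds to `to_metric(2.0, 2/3)` = 3.0 meters, which is how a consistent real-world scale can be imposed on the stitched images.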

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Neurology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Studio Devices (AREA)

Abstract

Disclosed are an image frame stitching method and apparatus, an electronic device, and a storage medium. The method includes: acquiring a preview video stream captured by moving a panoramic photographing device in a set space; in response to a plurality of shooting instructions received while the panoramic photographing device is being moved, acquiring, through the panoramic photographing device, images of multiple positions in the set space to obtain multiple frames of scene images; estimating corresponding pose information of the multiple frames of scene images based on the preview video stream; and stitching the multiple frames of scene images based on their corresponding pose information to obtain a panoramic image of the set space.

Description

Image frame stitching method and apparatus, readable storage medium, and electronic device

Technical Field

The present disclosure relates to computer vision technology, and in particular to an image frame stitching method and apparatus, a computer-readable storage medium, an electronic device, and a computer program product.

Background

With the popularization of terminals in people's lives, users can use a terminal to capture panoramic images. In the related art, a panoramic image is produced by stitching multiple images together to achieve a wide-angle effect and show more of a scene. For some special scenes, however, such as repetitive textures in buildings, occlusion by walls, and visually similar spaces, incorrect stitching frequently occurs.

Summary

According to one aspect of the embodiments of the present disclosure, an image frame stitching method is provided, including: acquiring a preview video stream captured by moving a panoramic photographing device in a set space; in response to a plurality of shooting instructions received while the panoramic photographing device is being moved, acquiring, through the panoramic photographing device, images of multiple positions in the set space to obtain multiple frames of scene images; estimating corresponding pose information of the multiple frames of scene images based on the preview video stream; and stitching the multiple frames of scene images based on their corresponding pose information to obtain a panoramic image of the set space.

According to another aspect of the embodiments of the present disclosure, an image frame stitching apparatus is provided, including means for implementing the above image frame stitching method.

According to another aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided, which stores a computer program used to execute the above image frame stitching method.

According to another aspect of the embodiments of the present disclosure, an electronic device is provided, including: a processor; and a memory for storing instructions executable by the processor, wherein the processor reads the executable instructions from the memory and executes them to implement the above image frame stitching method.

According to another aspect of the embodiments of the present disclosure, a computer program product is provided, including a computer program which, when executed by a processor, implements the above image frame stitching method.

The technical solutions of the present disclosure are described in further detail below with reference to the drawings and embodiments.
Brief Description of the Drawings

The above and other objects, features, and advantages of the present disclosure will become more apparent from the more detailed description of its embodiments with reference to the accompanying drawings. The drawings provide a further understanding of the embodiments of the present disclosure, constitute a part of the specification, and serve, together with the embodiments, to explain the present disclosure without limiting it. In the drawings, the same reference numerals generally denote the same components or steps.

FIG. 1 is a flowchart of an image frame stitching method according to an embodiment of the present disclosure.

FIG. 2 is a flowchart of an image frame stitching method according to yet another embodiment of the present disclosure.

FIG. 3 is a flowchart of an image frame stitching method according to still another embodiment of the present disclosure.

FIG. 4 is a flowchart of an image frame stitching method according to another embodiment of the present disclosure.

FIG. 5 is a schematic structural diagram of an image frame stitching apparatus according to an embodiment of the present disclosure.

FIG. 6 is a structural diagram of an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description

Example embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present disclosure, and it should be understood that the present disclosure is not limited by the example embodiments described here.

It should be noted that, unless otherwise specified, the relative arrangement of components and steps, numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure.

Those skilled in the art will understand that terms such as "first" and "second" in the embodiments of the present disclosure are only used to distinguish different steps, devices, or modules, and represent neither any particular technical meaning nor a necessary logical order among them.

It should also be understood that, in the embodiments of the present disclosure, "a plurality" may refer to two or more, and "at least one" may refer to one, two, or more.

It should also be understood that any component, data, or structure mentioned in the embodiments of the present disclosure can generally be understood as one or more, unless explicitly limited or given a contrary indication by the context.

In addition, the term "and/or" in the present disclosure merely describes an association between objects, indicating that three relationships are possible; for example, "A and/or B" covers three cases: A alone, both A and B, and B alone. The character "/" in the present disclosure generally indicates an "or" relationship between the associated objects.

It should also be understood that the description of the embodiments in the present disclosure emphasizes the differences between them; for their identical or similar aspects, the embodiments may be referred to one another and, for brevity, are not repeated one by one.

Meanwhile, it should be understood that, for ease of description, the dimensions of the parts shown in the drawings are not drawn according to their actual proportions.

The following description of at least one example embodiment is merely illustrative and in no way limits the present disclosure or its application or use.

Technologies, methods, and devices known to those of ordinary skill in the relevant fields may not be discussed in detail, but, where appropriate, such technologies, methods, and devices should be regarded as part of the specification.

It should be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it does not need to be further discussed in subsequent drawings.

The embodiments of the present disclosure can be applied to electronic devices such as terminal devices, computer systems, and servers, which can operate with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments, and/or configurations suitable for use with such electronic devices include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, handheld or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, networked personal computers, minicomputer systems, mainframe computer systems, and distributed cloud computing environments that include any of the above systems, among others.

Electronic devices such as terminal devices, computer systems, and servers may be described in the general context of computer-system-executable instructions (such as program modules) executed by a computer system. Generally, program modules may include routines, programs, target programs, components, logic, data structures, and so on, which perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment, where tasks are performed by remote processing devices linked through a communications network. In a distributed cloud computing environment, program modules may be located on local or remote computing system storage media including storage devices.
Fig. 1 is a flowchart of an image frame stitching method according to an exemplary embodiment of the present disclosure. This embodiment can be applied to an electronic device. As shown in Fig. 1, the image frame stitching method includes the following steps:
S102: acquire a preview video stream captured by moving a panoramic photographing device through a set space.
The set space may be an indoor room or an outdoor site. The panoramic photographing device denotes a device provided with a panoramic camera and a controller, where the panoramic camera may be a fisheye panoramic camera, a multi-lens panoramic camera, or a mobile client capable of producing a panoramic photographing effect; the controller may include a SLAM (Simultaneous Localization and Mapping) system. The preview video stream denotes the continuous image frame data generated after the moving panoramic photographing device is initialized.
S104: in response to a plurality of photographing instructions received while the panoramic photographing device is being moved, acquire, by the panoramic photographing device, images at a plurality of positions in the set space to obtain multiple scene image frames.
In the embodiments of the present disclosure, the preview video stream may also be viewed in real time on a remote device such as a mobile phone, and photographing instructions may be sent through the remote device to achieve remote control.
S106: estimate, based on the preview video stream, pose information corresponding to the multiple scene image frames.
The pose information corresponding to the multiple scene image frames denotes the displacement and attitude of the panoramic photographing device corresponding to each of the multiple scene image frames.
S108: stitch the multiple scene image frames based on their corresponding pose information to obtain a panoramic image of the set space.
For example, consider panoramic photography of an entire property A. First, a preview video stream a1 is acquired by moving a fisheye panoramic camera through room A1 of property A. After acquiring the preview video stream a1, the user photographs room A1 continuously with the fisheye panoramic camera to capture images at multiple positions and obtain multiple scene image frames of room A1. The user then moves the fisheye panoramic camera to the next room A2 and photographs it in the same manner. Once all rooms of property A have been photographed, the pose information corresponding to the scene image frames of property A is estimated from the complete preview video stream, and the relationships between the scene image frames of property A are determined from that pose information. Stitching adjacent scene image frames together yields the panoramic image of property A.
According to the image frame stitching method of the embodiments of the present disclosure, a preview video stream captured by moving a panoramic photographing device through a set space is acquired; in response to a plurality of photographing instructions received while the device is being moved, images at a plurality of positions in the set space are acquired by the device to obtain multiple scene image frames; pose information corresponding to the frames is estimated based on the preview video stream; and the frames are stitched based on that pose information to obtain a panoramic image of the set space. By using the pose information corresponding to the scene image frames, the embodiments of the present disclosure effectively solve the problem of scene image frames being stitched incorrectly in a panoramic image. In addition, the preview video stream of the panoramic photographing device can be used to estimate the global pose of the device as it photographs the set space, so that an accurate panoramic image is obtained.
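The capture-and-stitch flow described above (preview stream, shot frames, pose estimation, stitching) can be sketched as a minimal pipeline. This is an illustrative simplification, not the patented implementation: poses are reduced to 2-D position plus heading, frames are string labels, and all function names are hypothetical.

```python
# Hypothetical sketch of the capture-and-stitch flow: each shot frame is
# assigned the pose the preview stream reports nearest its timestamp, then
# frames are ordered by position before joining.

def estimate_poses(preview_stream, shot_times):
    """For each shot, pick the preview pose closest in time."""
    # preview_stream: list of (timestamp, pose) tuples, sorted by timestamp
    poses = []
    for t in shot_times:
        closest = min(preview_stream, key=lambda p: abs(p[0] - t))
        poses.append(closest[1])
    return poses

def stitch(frames, poses):
    """Order frames by their estimated position along the walk and join them."""
    order = sorted(range(len(frames)), key=lambda i: poses[i][:2])
    return [frames[i] for i in order]

# Toy walk through two rooms; poses are (x, y, heading_degrees).
preview = [(0.0, (0.0, 0.0, 0.0)), (1.0, (1.0, 0.0, 0.0)), (2.0, (2.0, 0.0, 90.0))]
frames = ["A1-door", "A1-window", "A2-door"]
shots = [0.1, 1.2, 1.9]
poses = estimate_poses(preview, shots)
panorama = stitch(frames, poses)
print(panorama)
```

A real system would interpolate poses between preview timestamps and stitch pixels rather than labels; the sketch only shows how the preview stream ties shot frames to poses.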
In some implementations, before step S106, the method may further include: removing moving targets from the preview video stream to obtain a preview video stream with the moving targets removed. Step S106 may then further include: estimating the pose information corresponding to the multiple scene image frames based on the preview video stream with the moving targets removed.
Fig. 2 is a schematic flowchart of an image frame stitching method according to another exemplary embodiment of the present disclosure. Removing the moving targets from the preview video stream may include the following steps:
S201: perform moving-target detection on the scene images in the preview video stream to determine whether a moving target is detected.
A moving target may be a person or an animal.
S202: in response to a moving target being detected, remove the moving target based on a preset second neural network.
In the embodiments of the present disclosure, the presence of a moving target may be determined through feature point detection. The preset second neural network denotes a neural network for detecting and removing moving targets, for example SSD (Single Shot MultiBox Detector), YOLO (You Only Look Once), or DeepLab (an atrous-convolution segmentation model).
The embodiments of the present disclosure can remove extraneous moving targets from the scene images, making the image frame information more complete and accurate.
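The removal step above can be sketched as follows, assuming a detection/segmentation network (SSD-, YOLO-, or DeepLab-style) has already produced a per-pixel mask of the moving object; the images are tiny nested lists of intensities purely for illustration, and the fill-in strategy (copying pixels from a frame where the object was absent) is one common choice, not necessarily the patented one.

```python
# Minimal sketch of moving-target removal: masked pixels are replaced with
# pixels from another preview frame in which the mover was not present.

def remove_moving_target(frame, mask, background_frame):
    """Return a copy of `frame` with masked pixels taken from `background_frame`."""
    out = [row[:] for row in frame]          # copy so the input stays intact
    for y in range(len(frame)):
        for x in range(len(frame[0])):
            if mask[y][x]:                   # 1 = pixel belongs to the mover
                out[y][x] = background_frame[y][x]
    return out

frame      = [[5, 5, 9], [5, 9, 9], [5, 5, 5]]   # 9 = person walking through
background = [[5, 5, 5], [5, 5, 5], [5, 5, 5]]   # same view, person absent
mask       = [[0, 0, 1], [0, 1, 1], [0, 0, 0]]   # network's segmentation mask
clean = remove_moving_target(frame, mask, background)
print(clean)
```

In practice the mask comes from the second neural network and the background pixels from temporally adjacent preview frames; the loop structure is the same.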
Fig. 3 is a schematic flowchart of an image frame stitching method according to another exemplary embodiment of the present disclosure. Building on the embodiment shown in Fig. 1, step S106 may specifically include the following steps:
S301: process the motion trajectory of the panoramic photographing device based on a simultaneous localization and mapping algorithm and a loop closure detection algorithm, so as to estimate the pose information of the panoramic photographing device corresponding to the scene images in the preview video stream.
The simultaneous localization and mapping (SLAM) algorithm and the loop closure detection algorithm are pre-stored in the SLAM system. The purpose of the SLAM algorithm is to estimate the pose at each moment along the motion trajectory of the panoramic photographing device; the purpose of the loop closure detection algorithm is to determine whether the current scene has appeared before. If it has, a very strong constraint can be provided accordingly, namely correcting a device trajectory that has drifted significantly back to its proper position.
S302: obtain the pose information corresponding to the multiple scene image frames based on the pose information of the panoramic photographing device corresponding to the scene images in the preview video stream.
Thus, using the simultaneous localization and mapping algorithm and the loop closure detection algorithm, the embodiments of the present disclosure can estimate the pose information of the panoramic photographing device at each moment, and thereby estimate the relative displacement and relative rotation between scene image frames, ensuring smooth transitions between frames.
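The loop-closure constraint described above can be illustrated with a deliberately simplified 1-D sketch: when a scene is recognized as previously visited, the drift between the trajectory's current estimate and the remembered pose is spread back over the intervening poses. Real SLAM systems solve this as a pose-graph optimization; the linear drift distribution below is only a toy stand-in, and all names are hypothetical.

```python
# Toy loop-closure correction: distribute the accumulated drift linearly
# over the poses recorded since the revisited frame. Positions are 1-D.

def close_loop(trajectory, loop_index, observed_pose):
    """Correct trajectory so its last pose matches the re-observed pose
    at `loop_index`, spreading the drift over the intermediate poses."""
    drift = trajectory[-1] - observed_pose
    n = len(trajectory) - 1 - loop_index
    corrected = trajectory[:loop_index + 1]
    for k in range(1, n + 1):
        corrected.append(trajectory[loop_index + k] - drift * k / n)
    return corrected

# The device walks away and returns to its start. Odometry says it ends at
# 0.4, but loop detection recognizes the start scene (true pose 0.0).
traj = [0.0, 1.0, 2.0, 1.2, 0.4]
fixed = close_loop(traj, loop_index=0, observed_pose=0.0)
print(fixed)  # last pose pulled back onto the revisited pose
```

The "very strong constraint" in the text is exactly the requirement that the revisited pose coincide with its earlier estimate; here it anchors both ends of the corrected trajectory.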
In some implementations, step S108 may be preceded by the following step: obtaining the pose scale of the panoramic photographing device. Step S108 may then further include: stitching the multiple scene image frames based on the pose scale of the panoramic photographing device and the pose information corresponding to the frames.
The pose scale denotes the ratio between an on-image distance in the multiple scene image frames and the corresponding actual distance in the set space.
In some implementations, obtaining the pose scale of the panoramic photographing device may include the following steps: obtaining the pose scale of the panoramic photographing device based on the actual distance between the device and a fixed reference object; or processing the preview video stream based on a preset first neural network to obtain the pose scale of the panoramic photographing device.
The fixed reference object may be the floor or the ceiling of a room. For example, if the distance between an observation point and the floor in the scene image frames is set to 1, and the actual distance between a tripod-mounted fisheye panoramic camera and the floor is 1.5 m, then the ratio between the on-image distance and the actual camera-to-floor distance is 1 : 1.5. Alternatively, the preview video stream may be processed by the preset first neural network, i.e. a neural network that obtains depth information, to determine the pose scale of the fisheye panoramic camera; for example, feeding the preview video stream into a trained convolutional neural network model yields the pose scale of the fisheye panoramic camera.
Through the actual distance between the panoramic photographing device and a fixed reference object, or by feeding the preview video stream into the preset first neural network, the embodiments of the present disclosure obtain the pose scale of the device, thereby determining the distance correspondence between the information in the scene image frames and the information in the actual scene.
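The fixed-reference variant above amounts to one division: the known camera-to-floor distance fixes how many metres one SLAM unit represents, and every estimated translation can then be converted to metric units. The sketch below mirrors the 1 : 1.5 example from the text; function names are illustrative.

```python
# Sketch of recovering metric scale from a fixed reference object: the
# tripod-mounted camera's real height above the floor (1.5 m) versus the
# unit-less floor distance the pose estimator reports (1.0).

def pose_scale(slam_floor_distance, real_floor_distance_m):
    """Metres per SLAM unit."""
    return real_floor_distance_m / slam_floor_distance

def to_metres(slam_translation, scale):
    """Convert a unit-less translation vector into metres."""
    return [t * scale for t in slam_translation]

scale = pose_scale(slam_floor_distance=1.0, real_floor_distance_m=1.5)
step = to_metres([2.0, 0.0, 0.0], scale)  # a 2-unit step becomes 3 m
print(scale, step)
```

The neural-network variant replaces the division by a learned depth estimate, but the role of the result is identical: a single scalar mapping on-image distances to real distances.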
Fig. 4 is a schematic flowchart of an image frame stitching method according to another exemplary embodiment of the present disclosure. Building on the embodiment shown in Fig. 1, step S108 may specifically include the following steps:
S401: determine the stitching order of the multiple scene image frames based on their corresponding pose information.
The stitching order of the multiple scene image frames denotes the order of the continuous pose changes of the panoramic photographing device, i.e. the changes in its translation coordinates and rotation coordinates.
S402: determine the panoramic image of the set space based on the stitching order of the multiple scene image frames.
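One simple way to realize step S401 is to chain frames by spatial proximity of their estimated poses, so that adjacent frames in the output were taken at adjacent positions. The nearest-neighbour chaining below is an illustrative choice on my part, not a method stated in the disclosure; the point is only that the order comes from pose information rather than from capture order.

```python
# Sketch of deriving a stitching order from per-frame positions: start at
# the first shot, then repeatedly append the nearest not-yet-used frame.

def stitching_order(positions):
    """Return frame indices ordered so neighbours are spatially close."""
    order = [0]
    remaining = set(range(1, len(positions)))
    while remaining:
        last = positions[order[-1]]
        nxt = min(remaining,
                  key=lambda i: sum((a - b) ** 2
                                    for a, b in zip(positions[i], last)))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Shots taken out of spatial order: frame 1 is far away, frame 2 is close.
positions = [(0.0, 0.0), (5.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
print(stitching_order(positions))
```

With the poses above the chain visits 0, 2, 3, 1, i.e. frames are joined along the walked path rather than in shooting order.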
In some implementations, if some of the multiple scene image frames overlap, image fusion is applied to the overlapping portions.
For example, based on the stitching order of the multiple scene image frames, the overlapping portions of adjacent image frames are fused, and the image frames are then joined together in the stitching order to obtain the panoramic image. In addition, the embodiments of the present disclosure may project the panoramic image onto a sphere, a cylinder, or a cube to enable omnidirectional viewing.
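The fusion of an overlapping strip can be sketched as a linear (feathered) blend, a common baseline in panorama stitching: inside the overlap the weight slides from the left frame to the right one, hiding the seam. This is an assumed fusion method for illustration; the disclosure does not name a specific blending technique.

```python
# Sketch of fusing the overlap between two adjacent frames with a linear
# blend. Each strip is a list of columns; each column a list of pixels.

def blend_overlap(left_strip, right_strip):
    """Per-column linear blend: weight 0 = all left, 1 = all right."""
    n = len(left_strip)
    out = []
    for i in range(n):
        w = i / (n - 1) if n > 1 else 0.5
        col = [(1 - w) * l + w * r
               for l, r in zip(left_strip[i], right_strip[i])]
        out.append(col)
    return out

left  = [[10, 10], [10, 10], [10, 10]]   # overlap as seen by the left frame
right = [[30, 30], [30, 30], [30, 30]]   # same strip seen by the right frame
print(blend_overlap(left, right))
```

The middle column lands halfway between the two exposures, which is why a feathered overlap looks seamless even when the frames differ slightly in brightness.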
By stitching the multiple scene image frames using their pose information, the embodiments of the present disclosure effectively solve the problem that a panoramic photographing device tends to produce wrong estimates when it encounters different panoramic views of similar spaces, which leads to scene image frames being stitched incorrectly.
Any image frame stitching method provided by the embodiments of the present disclosure may be executed by any suitable device with data processing capability, including but not limited to terminal devices and servers. Alternatively, any image frame stitching method provided by the embodiments of the present disclosure may be executed by a processor; for example, the processor executes any of the image frame stitching methods mentioned in the embodiments by invoking corresponding instructions stored in a memory. This is not repeated below.
Fig. 5 is a schematic structural diagram of an image frame stitching apparatus according to an exemplary embodiment of the present disclosure. The apparatus may be provided in an electronic device such as a terminal device or a server to execute the image frame stitching method of any of the above embodiments of the present disclosure. As shown in Fig. 5, the apparatus includes:
a first acquisition module 51, configured to acquire a preview video stream captured by moving a panoramic photographing device through a set space;
a first obtaining module 52, configured to, in response to a plurality of photographing instructions received while the panoramic photographing device is being moved, acquire, by the panoramic photographing device, images at a plurality of positions in the set space to obtain multiple scene image frames;
an estimation module 53, configured to estimate, based on the preview video stream, pose information corresponding to the multiple scene image frames; and
a second obtaining module 54, configured to stitch the multiple scene image frames based on their corresponding pose information to obtain a panoramic image of the set space.
The image frame stitching apparatus provided by the above embodiment of the present disclosure acquires a preview video stream captured by moving a panoramic photographing device through a set space; in response to a plurality of photographing instructions received while the device is being moved, acquires images at a plurality of positions in the set space by the device to obtain multiple scene image frames; estimates pose information corresponding to the frames based on the preview video stream; and stitches the frames based on that pose information to obtain a panoramic image of the set space. By using the pose information corresponding to the scene image frames, the embodiment effectively solves the problem of scene image frames being stitched incorrectly in a panoramic image. In addition, the preview video stream of the panoramic photographing device can be used to estimate the global pose of the device as it photographs the set space, so that an accurate panoramic image is obtained.
In some implementations, the estimation module 53 includes:
a removal unit, configured to remove moving targets from the preview video stream to obtain a preview video stream with the moving targets removed; and
a first estimation unit, configured to estimate the pose information corresponding to the multiple scene image frames based on the preview video stream with the moving targets removed.
In some implementations, the removal unit includes:
a first determination unit, configured to perform moving-target detection on the scene images in the preview video stream to determine whether a moving target is detected; and
a processing unit, configured to, in response to a moving target being detected, remove the moving target based on a preset second neural network.
In some implementations, the estimation module 53 includes:
a second estimation unit, configured to process the motion trajectory of the panoramic photographing device based on a simultaneous localization and mapping algorithm and a loop closure detection algorithm, so as to estimate pose information of the panoramic photographing device corresponding to the scene images in the preview video stream; and
a first acquisition unit, configured to obtain the pose information corresponding to the multiple scene image frames based on the pose information of the panoramic photographing device corresponding to the scene images in the preview video stream.
In some implementations, the second obtaining module 54 includes:
a second acquisition unit, configured to obtain the pose scale of the panoramic photographing device, where the pose scale denotes the ratio between an on-image distance in the multiple scene image frames and the corresponding actual distance in the set space; and
a stitching unit, configured to stitch the multiple scene image frames based on the pose scale of the panoramic photographing device and the pose information corresponding to the frames, so as to obtain the panoramic image of the set space.
In some implementations, the second acquisition unit is configured to:
obtain the pose scale of the panoramic photographing device based on the actual distance between the panoramic photographing device and a fixed reference object; or
process the preview video stream based on a preset first neural network to obtain the pose scale of the panoramic photographing device.
In some implementations, the second obtaining module 54 includes:
a second determination unit, configured to determine the stitching order of the multiple scene image frames based on their corresponding pose information; and
a third determination unit, configured to determine the panoramic image of the set space based on the stitching order of the multiple scene image frames.
In some implementations, the apparatus further includes:
a fusion module, configured to, in response to determining that at least one of the multiple scene image frames has an image overlap, apply image fusion to the overlapping portion.
An electronic device according to an embodiment of the present disclosure is described below with reference to Fig. 6. The electronic device may be either or both of a first device and a second device, or a stand-alone device independent of them that can communicate with the first device and the second device to receive the input signals they collect.
Fig. 6 illustrates a block diagram of an electronic device according to an embodiment of the present disclosure.
As shown in Fig. 6, the electronic device 60 includes one or more processors 61 and a memory 62.
The processor 61 may be a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability, and may control other components of the electronic device 60 to perform desired functions.
The memory 62 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, and flash memory. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 61 may run the program instructions to implement the image frame stitching methods of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as input signals, signal components, and noise components may also be stored on the computer-readable storage medium.
In one example, the electronic device 60 may further include an input means 63 and an output means 64, which are interconnected through a bus system and/or another form of connection mechanism (not shown).
For example, when the electronic device is the first device or the second device, the input means 63 may be the aforementioned microphone or microphone array for capturing the input signal of a sound source. When the electronic device is a stand-alone device, the input means 63 may be a communication network connector for receiving the collected input signals from the first device and the second device.
In addition, the input means 63 may also include, for example, a keyboard and a mouse.
The output means 64 may output various information to the outside, including determined distance information, direction information, and so on. The output means 64 may include, for example, a display, a speaker, a printer, a communication network, and the remote output devices connected to it.
Of course, for simplicity, Fig. 6 shows only some of the components of the electronic device 60 relevant to the present disclosure, omitting components such as a bus and input/output interfaces. Depending on the specific application, the electronic device 60 may further include any other appropriate components.
In addition to the above methods and devices, an embodiment of the present disclosure may also be a computer program product including computer program instructions which, when run by a processor, cause the processor to perform the steps of the image frame stitching methods of the various embodiments of the present disclosure described in the "Exemplary Methods" section of this specification.
The computer program product may write program code for performing the operations of the embodiments of the present disclosure in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
Furthermore, an embodiment of the present disclosure may also be a computer-readable storage medium storing computer program instructions which, when run by a processor, cause the processor to perform the steps of the image frame stitching methods of the various embodiments of the present disclosure described in the "Exemplary Methods" section of this specification.
The computer-readable storage medium may adopt any combination of one or more readable media. A readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, but is not limited to, electric, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
The basic principles of the present disclosure have been described above with reference to specific embodiments. However, it should be pointed out that the merits, advantages, effects, and the like mentioned in the present disclosure are merely examples rather than limitations, and they cannot be considered indispensable to each embodiment of the present disclosure. In addition, the specific details disclosed above serve only the purposes of example and ease of understanding, not of limitation; the above details do not restrict the present disclosure to being implemented with those specific details.
The embodiments in this specification are described in a progressive manner, each embodiment focusing on its differences from the others; for identical or similar parts, the embodiments may refer to one another. Since the system embodiments essentially correspond to the method embodiments, their description is relatively brief, and relevant details can be found in the description of the method embodiments.
The block diagrams of components, apparatuses, devices, and systems involved in the present disclosure are only illustrative examples and are not intended to require or imply that connection, arrangement, or configuration must be carried out in the manner shown in the block diagrams. As those skilled in the art will recognize, these components, apparatuses, devices, and systems may be connected, arranged, and configured in any manner. Words such as "include", "comprise", and "have" are open-ended terms meaning "including but not limited to" and may be used interchangeably with it. The words "or" and "and" as used here mean "and/or" and may be used interchangeably with it, unless the context clearly indicates otherwise. The word "such as" as used here means the phrase "such as but not limited to" and may be used interchangeably with it.
The methods and apparatuses of the present disclosure may be implemented in many ways, for example through software, hardware, firmware, or any combination of software, hardware, and firmware. The above order of the method steps is for illustration only, and the steps of the methods of the present disclosure are not limited to the order specifically described above unless otherwise specified. Furthermore, in some embodiments, the present disclosure may also be implemented as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers recording media storing programs for executing the methods according to the present disclosure.
It should also be pointed out that, in the apparatuses, devices, and methods of the present disclosure, the components or steps may be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalent solutions of the present disclosure.
The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined here may be applied to other aspects without departing from the scope of the present disclosure. Therefore, the present disclosure is not intended to be limited to the aspects shown here, but accords with the widest scope consistent with the principles and novel features disclosed here.
The above description has been given for the purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the present disclosure to the forms disclosed here. Although multiple example aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.

Claims (12)

  1. An image frame stitching method, comprising:
    acquiring a preview video stream captured by moving a panoramic photographing device through a set space;
    in response to a plurality of photographing instructions received while the panoramic photographing device is being moved, acquiring, by the panoramic photographing device, images at a plurality of positions in the set space to obtain multiple scene image frames;
    estimating, based on the preview video stream, pose information corresponding to the multiple scene image frames; and
    stitching the multiple scene image frames based on their corresponding pose information to obtain a panoramic image of the set space.
  2. The method according to claim 1, wherein estimating, based on the preview video stream, the pose information corresponding to the multiple scene image frames comprises:
    removing moving targets from the preview video stream to obtain a preview video stream with the moving targets removed; and
    estimating the pose information corresponding to the multiple scene image frames based on the preview video stream with the moving targets removed.
  3. The method according to claim 2, wherein removing the moving targets from the preview video stream comprises:
    performing moving-target detection on scene images in the preview video stream to determine whether a moving target is detected; and
    in response to a moving target being detected, removing the moving target based on a preset second neural network.
  4. The method according to any one of claims 1 to 3, wherein estimating, based on the preview video stream, the pose information corresponding to the multiple scene image frames comprises:
    processing the motion trajectory of the panoramic photographing device based on a simultaneous localization and mapping algorithm and a loop closure detection algorithm, so as to estimate pose information of the panoramic photographing device corresponding to the scene images in the preview video stream; and
    obtaining the pose information corresponding to the multiple scene image frames based on the pose information of the panoramic photographing device corresponding to the scene images in the preview video stream.
  5. The method according to any one of claims 1 to 4, wherein stitching the multiple scene image frames based on their corresponding pose information to obtain the panoramic image of the set space comprises:
    obtaining a pose scale of the panoramic photographing device, wherein the pose scale denotes the ratio between an on-image distance in the multiple scene image frames and the corresponding actual distance in the set space; and
    stitching the multiple scene image frames based on the pose scale of the panoramic photographing device and the pose information corresponding to the frames, so as to obtain the panoramic image of the set space.
  6. The method according to claim 5, wherein obtaining the pose scale of the panoramic photographing device comprises:
    obtaining the pose scale of the panoramic photographing device based on an actual distance between the panoramic photographing device and a fixed reference object; or
    processing the preview video stream based on a preset first neural network to obtain the pose scale of the panoramic photographing device.
  7. The method according to any one of claims 1 to 4, wherein stitching the multiple scene image frames based on their corresponding pose information to obtain the panoramic image of the set space comprises:
    determining a stitching order of the multiple scene image frames based on their corresponding pose information; and
    determining the panoramic image of the set space based on the stitching order of the multiple scene image frames.
  8. The method according to any one of claims 1 to 7, further comprising:
    in response to determining that at least one of the multiple scene image frames has an image overlap, applying image fusion to the overlapping portion.
  9. An image frame stitching apparatus, comprising: means for implementing the method according to any one of claims 1 to 8.
  10. A computer-readable storage medium, wherein the storage medium stores a computer program for executing the method according to any one of claims 1 to 8.
  11. An electronic device, comprising:
    a processor; and
    a memory for storing instructions executable by the processor, wherein the executable instructions, when executed by the processor, implement the method according to any one of claims 1 to 8.
  12. A computer program product, comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 8.
PCT/CN2021/113122 2020-11-04 2021-08-17 Image frame stitching method and apparatus, readable storage medium and electronic device WO2022095543A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011219006.XA CN112399188A (zh) 2020-11-04 2020-11-04 Image frame stitching method and apparatus, readable storage medium and electronic device
CN202011219006.X 2020-11-04

Publications (1)

Publication Number Publication Date
WO2022095543A1 true WO2022095543A1 (zh) 2022-05-12

Family

ID=74597482

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/113122 WO2022095543A1 (zh) 2020-11-04 2021-08-17 Image frame stitching method and apparatus, readable storage medium and electronic device

Country Status (2)

Country Link
CN (1) CN112399188A (zh)
WO (1) WO2022095543A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115050013A (zh) * 2022-06-14 2022-09-13 南京人工智能高等研究院有限公司 Behavior detection method and apparatus, vehicle, storage medium and electronic device
CN115861050A (zh) * 2022-08-29 2023-03-28 如你所视(北京)科技有限公司 Method, apparatus, device and storage medium for generating panoramic images
WO2023221923A1 (zh) * 2022-05-19 2023-11-23 影石创新科技股份有限公司 Video processing method and apparatus, electronic device and storage medium

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112399188A (zh) * 2020-11-04 2021-02-23 贝壳技术有限公司 Image frame stitching method and apparatus, readable storage medium and electronic device
CN113160053B (zh) * 2021-04-01 2022-06-14 华南理工大学 Underwater video image restoration and stitching method based on pose information
CN113344789B (zh) * 2021-06-29 2023-03-21 Oppo广东移动通信有限公司 Image stitching method and apparatus, electronic device, and computer-readable storage medium
CN113744339B (zh) * 2021-11-05 2022-02-08 贝壳技术有限公司 Method and apparatus for generating a panoramic image, electronic device and storage medium
CN115988322A (zh) * 2022-11-29 2023-04-18 北京百度网讯科技有限公司 Method and apparatus for generating a panoramic image, electronic device and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108447105A (zh) * 2018-02-02 2018-08-24 微幻科技(北京)有限公司 Panoramic image processing method and apparatus
US20190020817A1 (en) * 2017-07-13 2019-01-17 Zillow Group, Inc. Connecting and using building interior data acquired from mobile devices
CN110198438A (zh) * 2019-07-05 2019-09-03 浙江开奇科技有限公司 Image processing method for panoramic video images and terminal device
CN110505463A (zh) * 2019-08-23 2019-11-26 上海亦我信息技术有限公司 Real-time automatic 3D modeling method based on photographing
US20200116493A1 (en) * 2018-10-11 2020-04-16 Zillow Group, Inc. Automated Mapping Information Generation From Inter-Connected Images
CN112399188A (zh) * 2020-11-04 2021-02-23 贝壳技术有限公司 Image frame stitching method and apparatus, readable storage medium and electronic device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9646571B1 (en) * 2013-06-04 2017-05-09 Bentley Systems, Incorporated Panoramic video augmented reality
CN106598387A (zh) * 2016-12-06 2017-04-26 北京尊豪网络科技有限公司 一种显示房源信息的方法及装置
CN111145352A (zh) * 2019-12-20 2020-05-12 北京乐新创展科技有限公司 一种房屋实景图展示方法、装置、终端设备及存储介质


Also Published As

Publication number Publication date
CN112399188A (zh) 2021-02-23

Similar Documents

Publication Publication Date Title
WO2022095543A1 (zh) Image frame stitching method and apparatus, readable storage medium and electronic device
US11165959B2 Connecting and using building data acquired from mobile devices
US11632516B2 Capture, analysis and use of building data from mobile devices
US9661214B2 Depth determination using camera focus
US10250800B2 Computing device having an interactive method for sharing events
CN112712584B (zh) Spatial modeling method, apparatus and device
WO2021249390A1 (zh) Method and apparatus for implementing augmented reality, storage medium and electronic device
US8989506B1 Incremental image processing pipeline for matching multiple photos based on image overlap
CN111432119B (zh) Image capturing method and apparatus, computer-readable storage medium and electronic device
CN112037279B (zh) Article position recognition method and apparatus, storage medium and electronic device
WO2022247414A1 (zh) Method and apparatus for generating a spatial geometric information estimation model
CA3069813C Capturing, connecting and using building interior data from mobile devices
CN113436311A (zh) Floor plan generation method and apparatus
WO2022262273A1 (zh) Optical center alignment detection method and apparatus, storage medium and electronic device
WO2021073562A1 (zh) Multi-point-cloud plane fusion method and apparatus
CN112328150B (zh) Automatic screenshot method, apparatus, device and storage medium
WO2022256651A1 Matching segments of video for virtual display of space
CN111429519B (zh) Three-dimensional scene display method and apparatus, readable storage medium and electronic device
TW201439664A Control method and electronic device
CN112184766A (zh) Object tracking method and apparatus, computer device and storage medium
CN113379838B (zh) Method for generating a roaming path in a virtual reality scene, and storage medium
WO2024140962A1 (zh) Method, apparatus, system, device and medium for determining relative pose
CN112991542B (zh) House three-dimensional reconstruction method and apparatus, and electronic device
CN115499594B (zh) Panoramic image generation method and computer-readable storage medium
CN111627061A (zh) Pose detection method and apparatus, electronic device and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21888242

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21888242

Country of ref document: EP

Kind code of ref document: A1