CN112399188A - Image frame splicing method and device, readable storage medium and electronic equipment

Info

Publication number: CN112399188A
Application number: CN202011219006.XA
Authority: CN
Inventors: 施文博
Original assignee: Beike Technology Co Ltd
Current assignee: You Can See Beijing Technology Co ltd AS
Priority date: 2020-11-04
Filing date: 2020-11-04
Publication date: 2021-02-23
Also published as: WO2022095543A1

Abstract

Embodiments of the disclosure provide an image frame stitching method and apparatus, an electronic device, and a storage medium. The image frame stitching method comprises: acquiring a preview video stream by moving a panoramic shooting device in a set space; in response to a plurality of shooting instructions received while the panoramic shooting device is moving, capturing images at a plurality of positions in the set space with the panoramic shooting device to obtain multiple frames of scene images; estimating, based on the preview video stream, pose information corresponding to each frame of scene image in the multiple frames of scene images; and stitching the multiple frames of scene images based on that pose information to obtain a panoramic image of the set space. By using the pose information corresponding to each frame of scene image, embodiments of the disclosure can effectively address incorrect stitching of the scene images in the panoramic image.

Description

Image frame splicing method and device, readable storage medium and electronic equipment

Technical Field

The present disclosure relates to computer vision technologies, and in particular, to an image frame stitching method and apparatus, a readable storage medium, and an electronic device.

Background

With the spread of terminals in everyday life, users can shoot panoramic images with a terminal. In the prior art, a panoramic image is produced by stitching a plurality of images into a wide-angle view so that more of the scene is displayed. In some special scenes, however, such as buildings with repetitive textures, walls that occlude the view, or spaces that look alike, stitching errors often occur.

Disclosure of Invention

The present disclosure is proposed to solve the above technical problems. The embodiment of the disclosure provides an image frame splicing method and device, a readable storage medium and an electronic device.

According to an aspect of the embodiments of the present disclosure, there is provided an image frame stitching method, including:

acquiring a preview video stream by moving the panoramic shooting equipment in a set space;

responding to a plurality of shooting instructions received in the moving process of the panoramic shooting equipment, and acquiring images of a plurality of positions in the set space through the panoramic shooting equipment to obtain a plurality of frames of scene images;

estimating pose information corresponding to each frame of scene image in the multi-frame scene images based on the preview video stream;

and splicing the multi-frame scene images based on the pose information corresponding to each frame scene image in the multi-frame scene images to obtain the panoramic image of the set space.

Optionally, in each of the above method embodiments of the present disclosure, before estimating the pose information corresponding to each frame of scene image in the multiple frames of scene images based on the preview video stream, the method further includes:

removing a moving target from the preview video stream to obtain a preview video stream with the moving target removed;

the estimating pose information corresponding to each frame of scene image in the multiple frames of scene images based on the preview video stream includes:

estimating the pose information corresponding to each frame of scene image in the multiple frames of scene images based on the preview video stream with the moving target removed.

Optionally, in each of the above method embodiments of the present disclosure, the removing a moving target from the preview video stream includes:

performing moving target detection on each frame of scene image in the preview video stream, and determining whether a moving target exists in the multiple frames of scene images;

and if a moving target is detected in the multiple frames of scene images, removing the moving target from those scene images based on a preset second neural network.

Optionally, in each of the above method embodiments of the present disclosure, the estimating pose information corresponding to each frame of scene image in the multiple frames of scene images based on the preview video stream includes:

processing the motion trajectory of the panoramic shooting device based on a simultaneous localization and mapping (SLAM) algorithm and a loop closure detection algorithm, and estimating the pose information of the panoramic shooting device corresponding to each frame of scene image in the preview video stream;

and acquiring the pose information corresponding to each frame of scene image in the multiple frames of scene images based on the pose information of the panoramic shooting device corresponding to each frame of scene image in the preview video stream.

Optionally, in each method embodiment of the present disclosure, before the stitching of the multiple frames of scene images based on the pose information corresponding to each frame of scene image to obtain the panoramic image of the set space, the method further includes:

acquiring a pose scale of the panoramic shooting device, wherein the pose scale represents the ratio of the on-map distance in each frame of scene image to the corresponding actual distance in the set space;

the stitching of the multiple frames of scene images based on the pose information corresponding to each frame of scene image in the multiple frames of scene images includes:

stitching the multiple frames of scene images based on the pose scale of the panoramic shooting device and the pose information corresponding to each frame of scene image in the multiple frames of scene images.

Optionally, in each of the method embodiments of the present disclosure, the acquiring a pose scale of the panoramic shooting device includes:

acquiring the pose scale of the panoramic shooting device based on an actual distance between the panoramic shooting device and a fixed reference object; or

processing the preview video stream based on a preset first neural network to obtain the pose scale of the panoramic shooting device.

Optionally, in each method embodiment of the present disclosure, the stitching of the multiple frames of scene images based on the pose scale of the panoramic shooting device and the pose information corresponding to each frame of scene image in the multiple frames of scene images to obtain the panoramic image of the set space includes:

determining the stitching order of the scene images based on the pose information corresponding to each frame of scene image in the multiple frames of scene images;

and determining the panoramic image of the set space based on the stitching order of the scene images in the multiple frames of scene images.

Optionally, in each of the method embodiments of the present disclosure, the method further includes:

if scene images in the multiple frames of scene images overlap, performing image fusion processing on the overlapping part of the images.

According to another aspect of the embodiments of the present disclosure, there is provided an image frame stitching apparatus including:

a first acquisition module, configured to acquire a preview video stream by moving the panoramic shooting device in a set space;

a first obtaining module, configured to obtain, by the panoramic shooting device, images at multiple positions in the set space in response to multiple shooting instructions received during movement of the panoramic shooting device, so as to obtain a multi-frame scene image;

the estimation module is used for estimating pose information corresponding to each frame of scene image in the plurality of frames of scene images based on the preview video stream;

and the second obtaining module is used for splicing the multi-frame scene images based on the pose information corresponding to each frame scene image in the multi-frame scene images to obtain the panoramic image of the set space.

Optionally, in each of the above apparatus embodiments of the present disclosure, the apparatus further includes, before the estimation module:

a deduction module, configured to remove a moving target from the preview video stream to obtain a preview video stream with the moving target removed;

the estimation module is specifically configured to estimate the pose information corresponding to each frame of scene image in the multiple frames of scene images based on the preview video stream with the moving target removed.

Optionally, in each of the above apparatus embodiments of the present disclosure, the deduction module includes:

a first determining unit, configured to perform moving target detection on each frame of scene image in the preview video stream and determine whether a moving target exists in the multiple frames of scene images;

and a deduction unit, configured to remove the moving target from the multiple frames of scene images based on a preset second neural network if a moving target is detected in them.

Optionally, in each of the apparatus embodiments of the present disclosure, the estimation module includes:

an estimating unit, configured to process the motion trajectory of the panoramic shooting device based on a simultaneous localization and mapping (SLAM) algorithm and a loop closure detection algorithm, and to estimate the pose information of the panoramic shooting device corresponding to each frame of scene image in the preview video stream;

and an acquisition unit, configured to acquire the pose information corresponding to each frame of scene image in the multiple frames of scene images based on the pose information of the panoramic shooting device corresponding to each frame of scene image in the preview video stream.

Optionally, in each of the above apparatus embodiments of the present disclosure, the apparatus further includes, before the second obtaining module:

a second acquisition module, configured to acquire a pose scale of the panoramic shooting device, wherein the pose scale represents the ratio of the on-map distance in each frame of scene image to the corresponding actual distance in the set space;

the second obtaining module is specifically configured to stitch the multiple frames of scene images based on the pose scale of the panoramic shooting device and the pose information corresponding to each frame of scene image in the multiple frames of scene images.

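Optionally, in each apparatus embodiment of the present disclosure, the second acquisition module is specifically configured to: acquire the pose scale of the panoramic shooting device based on an actual distance between the panoramic shooting device and a fixed reference object; or process the preview video stream based on a preset first neural network to obtain the pose scale of the panoramic shooting device.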

Optionally, in each apparatus embodiment of the present disclosure, the second obtaining module includes:

a second determining unit, configured to determine the stitching order of the scene images based on the pose information corresponding to each frame of scene image in the multiple frames of scene images;

and a third determining unit, configured to determine the panoramic image of the set space based on the stitching order of the scene images in the multiple frames of scene images.

Optionally, in each of the above apparatus embodiments of the present disclosure, the apparatus further includes:

a fusion module, configured to perform image fusion processing on the overlapping part of the images if scene images in the multiple frames of scene images overlap.

According to another aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium storing a computer program for executing the above-described image frame stitching method.

According to another aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing processor-executable instructions; and the processor is used for reading the executable instructions from the memory and executing the instructions to realize the image frame splicing method.

Based on the image frame stitching method and apparatus, the readable storage medium, and the electronic device provided by the embodiments of the present disclosure, a preview video stream is acquired by moving the panoramic shooting device in the set space; in response to a plurality of shooting instructions received while the panoramic shooting device is moving, images at a plurality of positions in the set space are captured by the panoramic shooting device to obtain multiple frames of scene images; pose information corresponding to each frame of scene image in the multiple frames of scene images is estimated based on the preview video stream; and the multiple frames of scene images are stitched based on that pose information to obtain a panoramic image of the set space. By using the pose information corresponding to each frame of scene image, the embodiments of the disclosure can effectively address incorrect stitching of the scene images in the panoramic image. In addition, the preview video stream of the panoramic shooting device can be used to estimate the global pose of the device as it shoots the set space, so that an accurate panoramic image is obtained.

The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.

Drawings

The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail embodiments of the present disclosure with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. In the drawings, like reference numbers generally represent like parts or steps.

Fig. 1 is a flow chart of one embodiment of the disclosed image frame stitching method.

Fig. 2 is a flowchart of yet another embodiment of the image frame stitching method of the present disclosure.

Fig. 3 is a flow chart of yet another embodiment of the disclosed image frame stitching method.

Fig. 4 is a flow chart of another embodiment of the disclosed image frame stitching method.

Fig. 5 is a schematic structural diagram of an embodiment of the image frame stitching apparatus according to the present disclosure.

Fig. 6 is a block diagram of an electronic device provided in an exemplary embodiment of the present disclosure.

Detailed Description

Hereinafter, example embodiments according to the present disclosure will be described in detail with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the present disclosure and not all embodiments of the present disclosure, with the understanding that the present disclosure is not limited to the example embodiments described herein.

It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.

It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present disclosure are used merely to distinguish one element from another, and imply neither any particular technical meaning nor any necessary logical order between them.

It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more and "at least one" may refer to one, two or more.

It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.

In addition, the term "and/or" in the present disclosure merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in the present disclosure generally indicates that the associated objects before and after it are in an "or" relationship.

It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.

Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.

The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.

Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.

The disclosed embodiments may be applied to electronic devices such as terminal devices, computer systems, and servers, which are operational with numerous other general purpose or special purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with such electronic devices include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above systems, and the like.

Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be practiced in distributed cloud computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.

Fig. 1 is a flowchart of an image frame stitching method according to an exemplary embodiment of the present disclosure. The embodiment can be applied to an electronic device, and as shown in Fig. 1, the image frame stitching method includes the following steps:

S102, moving the panoramic shooting device in the set space to acquire a preview video stream.

The set space may be an indoor room or an outdoor site. The panoramic shooting device refers to a device equipped with a panoramic camera and a controller. The panoramic camera may be a fisheye panoramic camera, a multi-lens panoramic camera, or a mobile client capable of producing a panoramic shooting effect, and the controller may include a SLAM (simultaneous localization and mapping) system. The preview video stream refers to the continuous image frame data generated after the moving panoramic shooting device has been initialized.

S104, in response to a plurality of shooting instructions received while the panoramic shooting device is moving, capturing images at a plurality of positions in the set space with the panoramic shooting device to obtain multiple frames of scene images.

In embodiments of the disclosure, the preview video stream can also be viewed in real time on a remote device such as a mobile phone, and shooting instructions can be sent from that remote device for remote control.

S106, estimating, based on the preview video stream, the pose information corresponding to each frame of scene image in the multiple frames of scene images.

The pose information corresponding to each frame of scene image represents the displacement and orientation of the moving panoramic shooting device corresponding to that frame.

S108, stitching the multiple frames of scene images based on the pose information corresponding to each frame of scene image in the multiple frames of scene images to obtain a panoramic image of the set space.

For example, to shoot a panorama of an entire property A, the user first moves a fisheye panoramic camera in room A1 of property A and acquires preview video stream A1. The user then shoots room A1 repeatedly with the fisheye panoramic camera, capturing images at a plurality of positions in room A1 to obtain multiple frames of scene images of room A1. Next, the user carries the fisheye panoramic camera on to room A2 and shoots it in the same way, and so on until every room in property A has been shot. Based on all of the preview video streams, the pose information corresponding to each frame of scene image of property A is estimated; the relationships between the scene images are determined from that pose information; and adjacent scene images are stitched to obtain the panoramic image of property A.
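The data flow of steps S102 to S108 can be sketched as a small pipeline. This is only an illustration under stated assumptions: the names `SceneImage`, `run_pipeline`, `fake_estimate_pose`, and `fake_stitch` are hypothetical, a pose is simplified to a planar `(x, y, yaw)` tuple, and the pose estimator and stitcher are stand-ins rather than the SLAM-based estimation and image stitching described in this disclosure.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Pose = Tuple[float, float, float]  # (x, y, yaw in radians); simplified planar pose


@dataclass
class SceneImage:
    name: str         # label of the captured frame, e.g. "A1-0"
    timestamp: float  # time at which the shooting instruction was received


def run_pipeline(
    scene_images: Sequence[SceneImage],
    estimate_pose: Callable[[float], Pose],
    stitch: Callable[[List[Tuple[SceneImage, Pose]]], str],
) -> str:
    """S106 then S108: attach a pose to every scene image, then stitch."""
    posed = [(img, estimate_pose(img.timestamp)) for img in scene_images]
    return stitch(posed)


# Stand-ins for the real components (assumptions, not the disclosed method).
def fake_estimate_pose(t: float) -> Pose:
    # Pretend the device moves along x at 0.5 units per second without turning.
    return (0.5 * t, 0.0, 0.0)


def fake_stitch(posed: List[Tuple[SceneImage, Pose]]) -> str:
    # Order the images by position along x and report the resulting sequence.
    ordered = sorted(posed, key=lambda item: item[1][0])
    return " | ".join(img.name for img, _ in ordered)


# Scene images arrive in arbitrary order; the poses decide how they are arranged.
shots = [SceneImage("A2-0", 8.0), SceneImage("A1-0", 1.0), SceneImage("A1-1", 3.0)]
panorama = run_pipeline(shots, fake_estimate_pose, fake_stitch)
print(panorama)
```

Passing the estimator and the stitcher in as callables keeps the sketch independent of any particular SLAM or stitching library.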

Based on the image frame stitching method provided by the embodiments of the disclosure, a preview video stream is acquired by moving the panoramic shooting device in the set space; in response to a plurality of shooting instructions received while the panoramic shooting device is moving, images at a plurality of positions in the set space are captured by the panoramic shooting device to obtain multiple frames of scene images; pose information corresponding to each frame of scene image in the multiple frames of scene images is estimated based on the preview video stream; and the multiple frames of scene images are stitched based on that pose information to obtain a panoramic image of the set space. By using the pose information corresponding to each frame of scene image, the embodiments of the disclosure can effectively address incorrect stitching of the scene images in the panoramic image. In addition, the preview video stream of the panoramic shooting device can be used to estimate the global pose of the device as it shoots the set space, so that an accurate panoramic image is obtained.

In some optional embodiments, before step S106 the method may further include: removing a moving target from the preview video stream to obtain a preview video stream with the moving target removed. Step S106 may then include: estimating the pose information corresponding to each frame of scene image in the multiple frames of scene images based on the preview video stream with the moving target removed.

Fig. 2 is a flowchart illustrating an image frame stitching method according to another exemplary embodiment of the present disclosure, in which removing the moving target from the preview video stream may include the following steps:

S201, performing moving target detection on each frame of scene image in the preview video stream, and determining whether a moving target exists in the multiple frames of scene images.

The moving target may be a human or an animal.

S202, if a moving target is detected in the multiple frames of scene images, removing the moving target from those scene images based on a preset second neural network.

Embodiments of the disclosure can determine whether a moving target exists through feature point detection. The preset second neural network refers to a neural network for detecting and removing moving targets, for example SSD (Single Shot MultiBox Detector), YOLO (You Only Look Once), or DeepLab (a segmentation model based on atrous, i.e. dilated, convolution).

Embodiments of the disclosure can remove unwanted moving targets from each frame of scene image, so that the image frame information is more complete and accurate.
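One way to picture steps S201 and S202 is as masking: once a detector has reported where the moving targets are, the covered pixels are excluded so that they do not influence pose estimation. The sketch below assumes the detector output is a list of axis-aligned bounding boxes; the helper names `build_static_mask` and `keep_static_points` are hypothetical, and the neural network itself (SSD, YOLO, DeepLab, or similar) is not modelled.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), max is exclusive
Point = Tuple[int, int]          # (x, y) pixel coordinate


def build_static_mask(width: int, height: int, boxes: List[Box]) -> List[List[bool]]:
    """Return mask[y][x] == True where the pixel is NOT covered by a moving target."""
    mask = [[True] * width for _ in range(height)]
    for x0, y0, x1, y1 in boxes:
        # Clip each box to the frame so out-of-range detections are harmless.
        for y in range(max(0, y0), min(height, y1)):
            for x in range(max(0, x0), min(width, x1)):
                mask[y][x] = False
    return mask


def keep_static_points(points: List[Point], mask: List[List[bool]]) -> List[Point]:
    """Drop feature points that lie on a detected moving target."""
    return [(x, y) for x, y in points if mask[y][x]]


# A 6x4 frame with one detected person covering columns 2-3 and rows 1-2.
detections = [(2, 1, 4, 3)]
mask = build_static_mask(6, 4, detections)
features = [(0, 0), (2, 1), (3, 2), (5, 3)]
static_features = keep_static_points(features, mask)
print(static_features)
```

A segmentation network would yield a per-pixel mask directly, in which case only `keep_static_points` is needed.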

Fig. 3 is a schematic flowchart of an image frame stitching method according to another exemplary embodiment of the present disclosure. On the basis of the embodiment shown in Fig. 1, step S106 may specifically include the following steps:

S301, processing the motion trajectory of the panoramic shooting device based on a simultaneous localization and mapping (SLAM) algorithm and a loop closure detection algorithm, and estimating the pose information of the panoramic shooting device corresponding to each frame of scene image in the preview video stream.

The SLAM algorithm and the loop closure detection algorithm are pre-stored in the SLAM system. The purpose of the SLAM algorithm is to estimate the pose at each moment along the motion trajectory of the panoramic shooting device. The purpose of the loop closure detection algorithm is to determine whether the current scene has been seen before; if it has, this provides a very strong constraint with which a trajectory of the panoramic shooting device that has drifted substantially can be corrected back to the right position.

S302, acquiring the pose information corresponding to each frame of scene image in the multiple frames of scene images based on the pose information of the panoramic shooting device corresponding to each frame of scene image in the preview video stream.

Embodiments of the disclosure can therefore estimate the pose information of the panoramic shooting device at each moment by using the SLAM algorithm and the loop closure detection algorithm, which makes it possible to estimate the relative displacement and relative rotation between the scene images and ensures smooth transitions between them.
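Step S302 and the relative displacement and rotation mentioned above can be illustrated as follows. The sketch assumes that every preview frame already carries a timestamp and a pose from the SLAM system, that a scene image takes the pose of the preview frame closest in time, and that a pose is a planar `(x, y, yaw)` tuple. The function names `pose_at` and `relative_motion` and all numeric values are hypothetical.

```python
import math
from bisect import bisect_left
from typing import List, Tuple

Pose = Tuple[float, float, float]  # (x, y, yaw in radians), planar simplification


def pose_at(timestamps: List[float], poses: List[Pose], t: float) -> Pose:
    """Return the pose of the preview frame whose timestamp is closest to t.

    `timestamps` must be sorted in ascending order and aligned with `poses`.
    """
    i = bisect_left(timestamps, t)
    if i == 0:
        return poses[0]
    if i == len(timestamps):
        return poses[-1]
    before, after = timestamps[i - 1], timestamps[i]
    return poses[i] if after - t < t - before else poses[i - 1]


def relative_motion(a: Pose, b: Pose) -> Tuple[float, float, float]:
    """Displacement of b expressed in a's frame, plus the relative rotation."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    c, s = math.cos(-a[2]), math.sin(-a[2])  # rotate the world offset by -yaw_a
    dyaw = (b[2] - a[2] + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
    return (c * dx - s * dy, s * dx + c * dy, dyaw)


# Preview-frame poses as a SLAM system might report them (made-up values).
timestamps = [0.0, 1.0, 2.0, 3.0]
poses = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, math.pi / 2), (0.0, 1.0, math.pi)]

# Two shooting instructions arrived at t = 0.1 s and t = 2.2 s.
pose_a = pose_at(timestamps, poses, 0.1)
pose_b = pose_at(timestamps, poses, 2.2)
motion = relative_motion(pose_a, pose_b)
print(motion)
```

Interpolating between the two neighbouring preview poses instead of picking the nearest one would be a natural refinement when the preview frame rate is low.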

In some optional embodiments, before step S108 the method may further include: acquiring a pose scale of the panoramic shooting device. Step S108 may then include: stitching the multiple frames of scene images based on the pose scale of the panoramic shooting device and the pose information corresponding to each frame of scene image in the multiple frames of scene images.

The pose scale represents the ratio of the on-map distance in each frame of scene image to the corresponding actual distance in the set space.

In some optional embodiments, acquiring the pose scale of the panoramic shooting device may include the following steps: acquiring the pose scale of the panoramic shooting device based on an actual distance between the panoramic shooting device and a fixed reference object; or processing the preview video stream based on a preset first neural network to obtain the pose scale of the panoramic shooting device.

The fixed reference object may be the floor or the ceiling of a room. For example, if the distance between the observation point and the floor in each frame of scene image is set to 1, and the actual distance between the tripod-mounted fisheye panoramic camera and the floor is 1.5 m, then the ratio of the in-image distance to the actual distance is 1:1.5. Alternatively, the preview video stream is processed by the preset first neural network, that is, a neural network for acquiring depth information, to determine the pose scale of the fisheye panoramic camera; for example, the preview video stream data is input into a trained convolutional neural network model that has been evaluated on a test set, so as to obtain the pose scale of the fisheye panoramic camera.

According to the embodiments of the disclosure, the pose scale of the panoramic shooting device is obtained either from the actual distance between the panoramic shooting device and a fixed reference object or by inputting the preview video stream into the preset first neural network, so as to determine the correspondence between distances in the multiple frames of scene images and distances in the actual scene.
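The floor-reference example amounts to one division followed by a multiplication. The sketch below works it through with the 1 : 1.5 ratio from the text; the function name `metres_per_map_unit` and the sample translation are illustrative assumptions.

```python
from typing import Tuple


def metres_per_map_unit(map_distance: float, actual_distance_m: float) -> float:
    """Convert the ratio map_distance : actual_distance into a metric factor."""
    if map_distance <= 0:
        raise ValueError("map_distance must be positive")
    return actual_distance_m / map_distance


# Camera height above the floor: 1 unit on the map, 1.5 m in the real room.
scale = metres_per_map_unit(1.0, 1.5)

# A translation between two scene images, estimated in (unitless) map coordinates.
translation_map: Tuple[float, float, float] = (2.0, -0.5, 0.0)
translation_m = tuple(scale * v for v in translation_map)
print(scale, translation_m)
```

With the scale known, every translation estimated from the preview video stream can be expressed in metres before the scene images are placed relative to one another.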

Fig. 4 is a schematic flowchart of an image frame stitching method according to another exemplary embodiment of the present disclosure. On the basis of the embodiment shown in Fig. 1, step S108 may specifically include the following steps:

S401, determining the stitching order of the scene images based on the pose information corresponding to each frame of scene image in the multiple frames of scene images.

The stitching order of the scene images represents the order in which the corresponding pose of the panoramic shooting device changes continuously, that is, the changes in its translation coordinates and rotation coordinates.

S402, determining the panoramic image of the set space based on the stitching order of the scene images in the multiple frames of scene images.

If scene images in the multiple frames of scene images overlap, image fusion processing is performed on the overlapping part.

For example, based on the stitching order of the scene images, the overlapping parts of adjacent image frames are fused, and the image frames are then stitched together in that order to obtain the panoramic image. In addition, the panoramic image can be projected onto a sphere, a cylinder, or a cube to support all-around viewing.

Embodiments of the disclosure stitch the multiple frames of scene images by using the pose information corresponding to each frame of scene image. This effectively addresses the incorrect stitching that arises when the panoramic shooting device encounters different panoramic images of similar-looking spaces and is therefore prone to produce a wrong estimate.
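The ordering and fusion steps can be sketched in a few lines. Two simplifications are assumed here and are not taken from the disclosure: the stitching order is obtained by sorting frames on their yaw angle, as if all frames were shot from roughly one location, and the fusion is a linear (feather) blend over a known overlap width on a single row of pixel intensities. The names `stitching_order` and `blend_rows` are hypothetical.

```python
import math
from typing import List, Tuple

Pose = Tuple[float, float, float]  # (x, y, yaw in radians)


def stitching_order(poses: List[Pose]) -> List[int]:
    """Indices of the frames sorted by yaw, i.e. by viewing direction."""
    return sorted(range(len(poses)), key=lambda i: poses[i][2] % (2 * math.pi))


def blend_rows(left: List[float], right: List[float], overlap: int) -> List[float]:
    """Join two pixel rows whose last/first `overlap` samples show the same content."""
    if not 0 < overlap <= min(len(left), len(right)):
        raise ValueError("overlap must be between 1 and the shorter row length")
    fused = []
    for k in range(overlap):
        w = (k + 1) / (overlap + 1)  # weight of the right image grows across the overlap
        fused.append((1 - w) * left[len(left) - overlap + k] + w * right[k])
    return left[:-overlap] + fused + right[overlap:]


# Three frames facing 180, 0 and 90 degrees: the order follows the rotation.
order = stitching_order([(0.0, 0.0, math.pi), (0.0, 0.0, 0.0), (0.0, 0.0, math.pi / 2)])

# Two rows with different exposure that overlap by three pixels.
row = blend_rows([10.0, 10.0, 10.0, 10.0], [20.0, 20.0, 20.0, 20.0], overlap=3)
print(order, row)
```

The gradual weight change hides the exposure step between the two rows; production stitchers typically apply the same idea per pixel in two dimensions, often with multi-band blending.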

Any of the image frame stitching methods provided by the embodiments of the present disclosure may be performed by any suitable device having data processing capabilities, including but not limited to terminal equipment, a server, and the like. Alternatively, any image frame stitching method provided by the embodiments of the present disclosure may be executed by a processor; for example, the processor may execute any image frame stitching method mentioned in the embodiments of the present disclosure by calling corresponding instructions stored in a memory. This is not described in further detail below.

Fig. 5 is a schematic structural diagram of an image frame stitching apparatus according to an exemplary embodiment of the present disclosure. The image frame stitching apparatus can be arranged in electronic equipment such as terminal equipment and a server, and executes the image frame stitching method of any embodiment of the disclosure. As shown in fig. 5, the image frame stitching apparatus includes:

a first obtaining module 51, configured to obtain a preview video stream by moving the panorama shooting apparatus in a set space;

a first obtaining module 52, configured to obtain, by the panoramic shooting device, images at multiple positions in the set space in response to multiple shooting instructions received during movement of the panoramic shooting device, so as to obtain multi-frame scene images;

an estimating module 53, configured to estimate, based on the preview video stream, pose information corresponding to each frame of scene image in the multiple frames of scene images;

and a second obtaining module 54, configured to splice the multiple frames of scene images based on pose information corresponding to each frame of scene image in the multiple frames of scene images, so as to obtain a panoramic image of the set space.

Based on the image frame splicing device provided by the embodiment of the disclosure, the preview video stream is obtained by moving the panoramic shooting equipment in the set space; in response to a plurality of shooting instructions received while the panoramic shooting equipment is moving, images of a plurality of positions in the set space are acquired through the panoramic shooting equipment to obtain multi-frame scene images; pose information corresponding to each frame of scene image in the multi-frame scene images is estimated based on the preview video stream; and the multi-frame scene images are spliced based on the pose information corresponding to each frame of scene image to obtain a panoramic image of the set space. By using the pose information corresponding to each frame of scene image, the embodiment of the disclosure can effectively solve the problem of incorrect splicing of the scene images in the panoramic image. In addition, the preview video stream of the panoramic shooting equipment can be used to estimate the global pose of the panoramic shooting equipment while it shoots the set space, so that an accurate panoramic image is obtained.
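The cooperation of the four modules can be sketched as a small pipeline. In the Python below each module is a pluggable callable; the class name `ImageFrameStitcher`, its field names, and the length check are hypothetical and only illustrate the data flow between modules 51 to 54, not an actual implementation of the apparatus.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass
class ImageFrameStitcher:
    # Each field stands in for one module of Fig. 5 (names are illustrative).
    get_preview_stream: Callable[[], Sequence[Any]]                   # module 51
    capture_scene_images: Callable[[], List[Any]]                     # module 52
    estimate_poses: Callable[[Sequence[Any], List[Any]], List[Any]]   # module 53
    stitch: Callable[[List[Any], List[Any]], Any]                     # module 54

    def run(self) -> Any:
        preview = self.get_preview_stream()
        images = self.capture_scene_images()
        poses = self.estimate_poses(preview, images)
        if len(poses) != len(images):
            raise ValueError("expected one pose per scene image")
        return self.stitch(images, poses)


if __name__ == "__main__":
    # Stub modules: poses are plain numbers, and stitching concatenates the
    # image labels in pose order.
    stitcher = ImageFrameStitcher(
        get_preview_stream=lambda: ["p0", "p1"],
        capture_scene_images=lambda: ["B", "A"],
        estimate_poses=lambda preview, images: [1, 0],
        stitch=lambda images, poses: "".join(
            img for _, img in sorted(zip(poses, images))
        ),
    )
    print(stitcher.run())
```

Keeping the modules as separate callables mirrors the apparatus description, in which pose estimation depends only on the preview video stream and the captured frames, and stitching depends only on the frames and their poses.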

In some embodiments, the estimation module further comprises:

In some embodiments, the knock-out module comprises:

In some embodiments, the estimation module comprises:

In some embodiments, before the second obtaining module, the method further includes:

In some embodiments, the second obtaining module is specifically configured to:

In some embodiments, the second obtaining module includes:

In some embodiments, the apparatus further comprises:

a fusion module, configured to perform image fusion processing on the overlapping part if scene images in the multi-frame scene images overlap.

Next, an electronic apparatus according to an embodiment of the present disclosure is described with reference to fig. 6. The electronic device may be either or both of the first device 100 and the second device 200, or a stand-alone device separate from them that may communicate with the first device and the second device to receive the collected input signals therefrom.

FIG. 6 illustrates a block diagram of an electronic device in accordance with an embodiment of the disclosure.

As shown in fig. 6, the electronic device 60 includes one or more processors 61 and memory 62.

The processor 61 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 60 to perform desired functions.

Memory 62 may include one or more computer program products that may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM), cache memory (cache), and/or the like. The non-volatile memory may include, for example, Read Only Memory (ROM), hard disk, flash memory, etc. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 61 to implement the image frame stitching methods of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as an input signal, a signal component, a noise component, etc. may also be stored in the computer-readable storage medium.

In one example, the electronic device 60 may further include: an input device 63 and an output device 64, which are interconnected by a bus system and/or other form of connection mechanism (not shown).

For example, when the electronic device is the first device 100 or the second device 200, the input device 63 may be a microphone or a microphone array as described above for capturing an input signal of a sound source. When the electronic device is a stand-alone device, the input means 63 may be a communication network connector for receiving the acquired input signals from the first device 100 and the second device 200.

The input device 63 may also include, for example, a keyboard, a mouse, and the like.

The output device 64 may output various information including the determined distance information, direction information, and the like to the outside. The output devices 64 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, among others.

Of course, for simplicity, only some of the components of the electronic device 60 relevant to the present disclosure are shown in fig. 6, omitting components such as buses, input/output interfaces, and the like. In addition, the electronic device 60 may include any other suitable components depending on the particular application.

In addition to the above-described methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the image frame stitching method according to various embodiments of the present disclosure described in the "exemplary methods" section above of this specification.

Program code of the computer program product for carrying out operations of embodiments of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.

Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform the steps in the image frame stitching method according to various embodiments of the present disclosure described in the "exemplary methods" section above in this specification.

The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.

In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts in the embodiments are referred to each other. For the system embodiment, since it basically corresponds to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.

The block diagrams of devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, or configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the phrase "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".

The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.

It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.

The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims

1. An image frame splicing method, comprising:

2. The method of claim 1, wherein before estimating pose information corresponding to each frame of the plurality of frames of scene images based on the preview video stream, further comprising:

3. The method of claim 2, wherein subtracting moving objects in the preview video stream comprises:

4. The method according to any one of claims 1-3, wherein said estimating pose information corresponding to each frame of the plurality of frames of scene images based on the preview video stream comprises:

5. The method according to any one of claims 1 to 4, wherein before the stitching the multiple frames of scene images based on the pose information corresponding to each frame of scene image in the multiple frames of scene images to obtain the panoramic image of the set space, the method further comprises:

6. The method of claim 5, wherein the obtaining the pose scale of the panoramic shooting device comprises:

7. The method according to any one of claims 1 to 6, wherein the stitching the multiple frames of scene images based on the pose scale of the panoramic shooting device and the pose information corresponding to each frame of scene image in the multiple frames of scene images to obtain the panoramic image of the set space comprises:

8. An image frame stitching device, comprising:

9. A computer-readable storage medium storing a computer program for executing the image frame stitching method according to any one of claims 1 to 7.

10. An electronic device, the electronic device comprising:

a processor;

a memory for storing the processor-executable instructions;

the processor is configured to read the executable instructions from the memory and execute the instructions to implement the image frame stitching method according to any one of the claims 1 to 7.