WO2024119902A1 - Image stitching method and device - Google Patents

Image stitching method and device

Info

Publication number
WO2024119902A1
Authority
WO
WIPO (PCT)
Prior art keywords
camera
target
image
stitching
positions
Application number
PCT/CN2023/115094
Other languages
English (en)
French (fr)
Inventor
彭璐
徐海
林建平
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2024119902A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4038 Image mosaicing, e.g. composing plane images from plane sub-images
    • G06T 7/00 Image analysis
    • G06T 7/80 Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration

Definitions

  • the present application relates to the field of image processing technology, and in particular to an image stitching method and device.
  • the field of view (FOV) of cameras currently used in conference terminals is usually small, and the horizontal field of view of most cameras is within 90°.
  • the shooting area of a single camera cannot completely cover the entire conference scene, and there may be blind spots on both sides of the camera where participants are located.
  • a common method is to deploy multiple cameras to collect images separately, and then stitch the images collected by the multiple cameras to obtain a panoramic image.
  • the panoramic image obtained by stitching the images collected by multiple cameras often has obvious seams, and the stitching effect is poor. How to improve the image stitching effect is an urgent problem to be solved.
  • the present application provides an image stitching method and device, which can improve the image stitching effect.
  • In a first aspect, an image stitching method is provided, which can be applied to an image processing device.
  • the method includes: obtaining multiple frames of images captured by a multi-camera module at the same time, the multi-camera module includes multiple cameras, and the multiple cameras include a first camera and a second camera deployed adjacently. The first camera and the second camera have a field of view overlap area.
  • the multiple frames of images include a first image captured by the first camera and a second image captured by the second camera.
  • the target stitching parameters are determined according to multiple sets of stitching parameters corresponding to the first camera and the second camera and the position of the shooting target, and the multiple sets of stitching parameters are obtained based on the camera parameters calibrated by the first camera and the second camera for different calibration positions in the field of view overlap area.
  • the first image and the second image are stitched using the target stitching parameters to obtain a stitched image.
  • stitching parameters can be further determined based on the camera parameters calibrated by the two cameras for the calibration position, so as to obtain stitching parameters corresponding to different calibration positions.
  • the image processing device can determine the stitching parameters suitable for the position where the shooting target is located based on the stitching parameters corresponding to the multiple calibration positions within the overlapping field of view, thereby making the stitching effect of the position where the shooting target is located in the stitched image better, thereby improving the display effect of the shooting target in the stitched image.
  • In one implementation, each set of stitching parameters corresponding to the first camera and the second camera includes projection transformation parameters from the image captured by the first camera to the image captured by the second camera.
  • In another implementation, each set of stitching parameters corresponding to the first camera and the second camera includes projection transformation parameters from the image captured by the first camera to the target plane coordinate system and projection transformation parameters from the image captured by the second camera to the target plane coordinate system.
  • the projection transformation parameters of the image captured by the first camera to the image captured by the second camera are used to transform the image captured by the first camera to the pixel coordinate system corresponding to the image captured by the second camera.
  • the projection transformation parameters of the image captured by the first camera to the target plane coordinate system are used to transform the image captured by the first camera to the target plane coordinate system
  • the projection transformation parameters of the image captured by the second camera to the target plane coordinate system are used to transform the image captured by the second camera to the target plane coordinate system.
  • the present application can achieve pairing of pixel points in the image captured by the first camera and the image captured by the second camera by transforming the image captured by the first camera and the image captured by the second camera to the same plane coordinate system.
  • any two calibration positions among the multiple calibration positions within the overlapping area of the field of view of the first camera and the second camera meet one or more of the following conditions: the distances from the two calibration positions to the center positions of the first camera and the second camera are different; the horizontal angles of the two calibration positions relative to the arrangement direction of the first camera and the second camera are different; the vertical angles of the two calibration positions relative to the arrangement direction of the first camera and the second camera are different.
  • the arrangement direction of the first camera and the second camera may be, for example, the direction of the straight line where the connecting line of the first camera and the second camera is located.
  • In one implementation, the overlapping area of the fields of view of the first camera and the second camera includes one photographing target.
  • In this case, the method for determining the target stitching parameters according to the multiple sets of stitching parameters corresponding to the first camera and the second camera and the position of the photographed target includes: determining the target stitching parameters according to the stitching parameters corresponding to the one or more calibration positions closest to the position of the photographed target among the multiple calibration positions.
  • the implementation method of determining the target stitching parameters according to the stitching parameters corresponding to one or more calibration positions that are closest to the position where the photographed target is located among the multiple calibration positions includes: if the multiple calibration positions include the position where the photographed target is located, taking the stitching parameters corresponding to the position where the photographed target is located as the target stitching parameters. If the multiple calibration positions do not include the position where the photographed target is located, determining the target stitching parameters according to the stitching parameters corresponding to the two calibration positions that are closest to the position where the photographed target is located among the multiple calibration positions.
  • the stitching parameters corresponding to the position of the shooting target are used as the target stitching parameters, or the target stitching parameters are calculated according to the stitching parameters corresponding to the two calibration positions closest to the position of the shooting target.
  • the target stitching parameters are used to stitch the images captured by the first camera and the second camera, the stitching effect of the position of the shooting target can be better, thereby ensuring the display effect of the shooting target in the stitched image.
  • the implementation method of determining the target stitching parameters according to the stitching parameters corresponding to two calibration positions closest to the position of the photographed target among multiple calibration positions includes: based on the distance of the position of the photographed target relative to the two calibration positions, using the stitching parameters corresponding to the two calibration positions to interpolate and calculate the target stitching parameters corresponding to the position of the photographed target.
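As a purely illustrative sketch of this interpolation (not taken from the application; the function names, the one-dimensional distance representation, and the element-wise blending of 3x3 projection matrices are assumptions), the target stitching parameters could be computed as a distance-weighted blend of the parameters of the two nearest calibration positions:

```python
import numpy as np

def interpolate_stitching_params(target_pos, calib_positions, homographies):
    """Distance-weighted interpolation between the two calibration positions
    closest to the shooting target. `calib_positions` holds hypothetical 1-D
    positions (e.g. meters from the camera pair); `homographies` holds the
    matching 3x3 projection transformation matrices."""
    calib_positions = np.asarray(calib_positions, dtype=float)
    order = np.argsort(np.abs(calib_positions - target_pos))
    i, j = order[0], order[1]              # two closest calibration positions
    d_i = abs(calib_positions[i] - target_pos)
    d_j = abs(calib_positions[j] - target_pos)
    if d_i + d_j == 0:                     # target coincides with a calibration position
        return homographies[i]
    w_i = d_j / (d_i + d_j)                # the closer position gets the larger weight
    w_j = d_i / (d_i + d_j)
    return w_i * homographies[i] + w_j * homographies[j]

# Example: calibration positions at 1 m and 3 m, shooting target at 2.5 m.
H_1m, H_3m = np.eye(3), np.diag([1.1, 1.1, 1.0])
H_target = interpolate_stitching_params(2.5, [1.0, 3.0], [H_1m, H_3m])
```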
  • the overlapping area of the fields of view of the first camera and the second camera includes multiple shooting targets.
  • An implementation method for determining the target stitching parameters according to the multiple sets of stitching parameters corresponding to the first camera and the second camera and the positions of the shooting targets includes: determining the target stitching parameters according to the multiple sets of stitching parameters and the positions of the multiple shooting targets.
  • This implementation method can be applied to scenarios where the multi-camera module adopts director mode.
  • an implementation method for determining target stitching parameters based on multiple sets of stitching parameters and the positions of multiple shooting targets includes: using the stitching parameters corresponding to the target calibration position as the target stitching parameters, the target calibration position being the calibration position among the multiple calibration positions having the smallest sum of distances to the positions of the multiple shooting targets.
  • This implementation method comprehensively considers the image stitching effect of multiple shooting targets, selects the stitching parameter corresponding to the calibration position with the smallest sum of distances to the positions of the multiple shooting targets as the target stitching parameter, and when the target stitching parameter is used to stitch the images captured by the first camera and the second camera, the overall stitching effect of the positions of the multiple shooting targets can be better, thereby making the overall display effect of the multiple shooting targets in the stitched image better.
  • the calculation process of this implementation method is simple and consumes less processing resources.
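The selection described above can be illustrated with a minimal sketch (hypothetical one-dimensional positions; not the application's actual code):

```python
import numpy as np

def pick_target_calibration(target_positions, calib_positions):
    """Return the index of the calibration position whose summed distance to
    all shooting targets is smallest (positions given as distances in meters)."""
    targets = np.asarray(target_positions, dtype=float)
    calibs = np.asarray(calib_positions, dtype=float)
    # For every calibration position, sum |target - calibration| over all targets.
    sums = np.abs(targets[:, None] - calibs[None, :]).sum(axis=0)
    return int(np.argmin(sums))

# Example: two speakers at 2.6 m and 4.2 m; calibration positions at 1/3/5/8 m.
idx = pick_target_calibration([2.6, 4.2], [1.0, 3.0, 5.0, 8.0])  # idx == 1, i.e. 3 m
```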
  • another implementation of determining target stitching parameters according to multiple sets of stitching parameters and the positions of multiple shooting targets includes: for each shooting target among the multiple shooting targets, obtaining stitching parameters corresponding to one or more calibrated positions closest to the position of the shooting target among the multiple calibrated positions. Determine the target stitching parameters according to all the stitching parameters obtained for the multiple shooting targets.
  • This implementation method comprehensively considers the image stitching effects of multiple shooting targets, determines the target stitching parameters according to the stitching parameters corresponding to one or more calibration positions closest to each shooting target, and uses the target stitching parameters to stitch the images captured by the first camera and the second camera.
  • In this way, the stitching effects of the positions where the multiple shooting targets are located are balanced and made as consistent as possible, so that the overall display effect of the multiple shooting targets in the stitched image is better.
  • a cropped image is output to a screen for display.
  • the cropped image is cropped from the stitched image and includes all captured targets within an overlapping area of the field of view of the first camera and the second camera.
  • the overlapping area of the field of view includes multiple shooting targets
  • another implementation method of determining the target stitching parameters according to the multiple sets of stitching parameters corresponding to the first camera and the second camera and the position of the shooting target includes: for each shooting target among the multiple shooting targets, determining the stitching parameters corresponding to the shooting target according to the stitching parameters corresponding to one or more calibration positions closest to the position of the shooting target among the multiple calibration positions.
  • the implementation method of stitching the first image and the second image using the target stitching parameters to obtain the stitched image includes: for each shooting target, stitching the first image and the second image using the stitching parameters corresponding to the shooting target to obtain the stitched image corresponding to the shooting target.
  • This implementation method can be applied to scenarios where the multi-camera module adopts intelligent averaging mode or multi-person co-frame mode.
  • the first image and the second image are stitched using stitching parameters corresponding to the shooting target to obtain a stitched image corresponding to the shooting target, and then the combined image is output to a screen for display.
  • the combined image is obtained by combining multiple cropped images, and the multiple cropped images are respectively cropped from multiple stitched images corresponding to the multiple shooting targets, and each cropped image respectively contains the corresponding shooting target in the cropped stitched image.
  • the image processing device performs image stitching for each of the multiple shooting targets to obtain multiple stitched images, so as to ensure the display effect of each shooting target in its corresponding stitched image; as a result, every shooting target in the combined image output by the image processing device comes from a stitched image in which that target is well displayed, which in turn guarantees the display effect of the combined image that is finally output.
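A minimal sketch of the cropping and combining step described above (OpenCV and NumPy assumed; the bounding boxes and the common output height are hypothetical and would in practice come from person detection and the chosen output layout):

```python
import cv2
import numpy as np

def combine_cropped_targets(stitched_images, target_boxes, common_height=360):
    """Crop each shooting target's bounding box (x, y, w, h) from the stitched
    image produced with that target's stitching parameters, resize the crops to
    a common height, and place them side by side as the combined image."""
    crops = []
    for img, (x, y, w, h) in zip(stitched_images, target_boxes):
        crop = img[y:y + h, x:x + w]
        scale = common_height / crop.shape[0]
        crops.append(cv2.resize(crop, (max(1, int(crop.shape[1] * scale)), common_height)))
    return np.hstack(crops)  # the combined image that is output to the screen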
  • the image processing device stores stitching parameters corresponding to two adjacent cameras in a multi-camera module in multiple deployment scenarios, and obtains the deployment scenario of the multi-camera module before determining the target stitching parameters according to the multiple sets of stitching parameters corresponding to the first camera and the second camera and the position of the shooting target. Obtain multiple sets of stitching parameters corresponding to the first camera and the second camera in the deployment scenario of the multi-camera module.
  • This application provides stitching parameters corresponding to different positions of multi-camera modules in various deployment scenarios.
  • the image processing device can flexibly select the corresponding stitching parameters according to the deployment scenario of the multi-camera module, so that the image stitching effect is more matched with the current deployment scenario, thereby achieving better image stitching effect for images collected in different deployment scenarios of the multi-camera module.
  • the deployment scenario of the multi-camera module can be ignored and the same set of stitching parameters can be used for all deployment scenarios.
  • In a second aspect, an image stitching device is provided. The device includes multiple functional modules, and the multiple functional modules interact with each other to implement the method in the first aspect and its respective embodiments.
  • the multiple functional modules can be implemented based on software, hardware, or a combination of software and hardware, and the multiple functional modules can be arbitrarily combined or divided based on specific implementations.
  • In a third aspect, an image stitching device is provided, comprising a processor and a memory;
  • the memory is used to store a computer program, wherein the computer program includes program instructions
  • the processor is used to call the computer program to implement the method in the above-mentioned first aspect and its various embodiments.
  • In a fourth aspect, a computer-readable storage medium is provided, on which instructions are stored.
  • When the instructions are executed by a processor, the method in the above-mentioned first aspect and its various embodiments is implemented.
  • In a fifth aspect, a computer program product is provided, including a computer program; when the computer program is executed by a processor, the method in the above-mentioned first aspect and its various embodiments is implemented.
  • In a sixth aspect, a chip is provided. The chip includes a programmable logic circuit and/or program instructions, and when the chip is running, the method in the above-mentioned first aspect and its various embodiments is implemented.
  • FIG. 1 is a schematic diagram of the structure of a multi-camera module provided in an embodiment of the present application.
  • FIG. 2 is a schematic diagram of an application scenario provided by an embodiment of the present application.
  • FIG. 3 is a schematic flow chart of an image stitching method provided in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a camera calibration scenario provided in an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a user interaction interface provided in an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a position of a shooting target provided by an embodiment of the present application.
  • FIG. 7 is a schematic diagram of another position of a shooting target provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of yet another position of a shooting target provided in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of a conference room scene provided in an embodiment of the present application.
  • FIG. 10 is a schematic diagram of another conference room scene provided in an embodiment of the present application.
  • FIG. 11 is a schematic diagram of images captured by a multi-camera module according to an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a stitched image provided in an embodiment of the present application.
  • FIG. 13 is a schematic diagram of a cropped image provided in an embodiment of the present application.
  • FIG. 14 is a schematic diagram of other images captured by a multi-camera module provided in an embodiment of the present application.
  • FIG. 15 is a schematic diagram of a combined image provided by an embodiment of the present application.
  • FIG. 16 is a schematic diagram of the structure of an image stitching device provided in an embodiment of the present application.
  • FIG. 17 is a schematic diagram of the hardware structure of an image processing device provided in an embodiment of the present application.
  • Object detection: finding all targets (objects) of interest in an image and determining the target category and the target location in the image.
  • SSL: sound source localization.
  • Audio-visual matching: taking the frame positions of all targets in the image and the sound source localization coordinates as input, the sound source localization coordinates are matched to the corresponding pixel positions in the image so that, for example, the speaker can be found. This process is called audio-visual matching.
  • the frame position of the target refers to the position of the frame containing the target, which can be a labeled real frame or a predicted frame obtained through target detection.
  • Pixel coordinate system is a coordinate system with the upper left vertex of the image captured by the camera as the origin.
  • the x-axis (horizontal axis) and y-axis (vertical axis) of the pixel coordinate system are the width and height directions of the image captured by the camera.
  • the world coordinate system can describe the position of the camera in the real world, and can also describe the position of objects in the images captured by the camera in the real world.
  • Camera calibration: the process of solving the projection transformation relationship between the world coordinate system and the pixel coordinate system corresponding to the image collected by the camera through a series of algorithms. It can also be understood as the process of determining the camera parameters.
  • Camera parameters include camera intrinsic parameters and camera extrinsic parameters. Among them, the camera intrinsic parameters are the inherent properties of the camera, including distortion coefficients, and are related to the camera focal length and pixel size.
  • the camera extrinsic parameters are related to the position and posture of the camera in the world coordinate system.
  • the position and posture include position and attitude. The position refers to the coordinates of the camera in the world coordinate system, and the attitude refers to the orientation of the camera in the world coordinate system.
  • Image stitching technology is an increasingly popular research field. It has become a hot topic in photogrammetry, computer vision, image processing, and computer graphics. A series of spatially overlapping images are processed through image stitching technology to form a high-definition image. The stitched image has a higher resolution and a larger field of view than a single image. However, due to binocular parallax between different cameras, as well as factors such as stitching algorithm errors and limited computing power, the panoramic images obtained by stitching images collected by multiple cameras often have obvious seams and unnatural transitions. For conference scenes, when participants are located in the overlapping area of the image, the face or body may be incompletely displayed due to the seams, affecting the image display effect.
  • One related approach is offline stitching: the main principle is to perform offline calibration on the multi-camera module before leaving the factory or before installation to obtain the fixed camera parameters of each camera in the multi-camera module, then obtain fixed stitching parameters based on the fixed camera parameters of each camera, and use the fixed stitching parameters to stitch the images collected by the multi-camera module.
  • the camera parameters include camera intrinsic parameters and camera extrinsic parameters
  • the camera intrinsic parameters include distortion parameters.
  • Because the stitching parameters used when stitching images are always fixed, some positions in the overlapping area of the final stitched image are stitched well while other positions are stitched poorly, and the stitching effects of multiple positions in the overlapping area of the image cannot all be taken into account.
  • The other related approach is online real-time stitching: during stitching, feature points in the overlapping area of the images are detected and matched, so as to generate the inter-camera image mapping relationship used for stitching in real time.
  • an embodiment of the present application provides an image stitching method, in which images are collected by a multi-camera module, the multi-camera module includes multiple cameras, the relative deployment positions of the multiple cameras are fixed, and two cameras deployed adjacently in the multiple cameras have a field of view overlap area.
  • the image processing device obtains multiple frames of images collected by the multi-camera module at the same time, when the field of view overlap area of two cameras deployed adjacently in the multi-camera module includes a shooting target, the target stitching parameters are determined according to the multiple sets of stitching parameters corresponding to the two cameras and the position of the shooting target, and then the images collected by the two cameras are stitched using the target stitching parameters to obtain a stitched image.
  • the multiple sets of stitching parameters corresponding to the two cameras are respectively obtained based on the camera parameters calibrated by the two cameras for different calibration positions in the field of view overlap area.
  • the embodiment of the present application offline calibrates the camera parameters of two cameras deployed adjacently in the multi-camera module for different calibration positions in the field of view overlap area.
  • the stitching parameters can be further determined according to the camera parameters calibrated by the two cameras for the calibration position, so as to obtain the stitching parameters corresponding to different calibration positions.
  • the image processing device can determine the stitching parameters suitable for the position where the photographed object is located according to the stitching parameters corresponding to the multiple calibrated positions in the overlapping area of the fields of view, which makes the stitching effect of the position where the shooting target is located in the stitched image better, thereby improving the display effect of the shooting target in the stitched image.
  • In addition, there is no need to detect and match feature points on the images, so the method is applicable to more shooting scenes and requires less computing power.
  • the image stitching method provided in the embodiment of the present application can be applied to an image processing device.
  • the image processing device can be a multi-camera module, or a display device, or a video server connected to a display device.
  • the display device has a built-in multi-camera module, or the display device is connected to an external multi-camera module.
  • the multi-camera module includes a plurality of cameras, and the relative deployment positions of the plurality of cameras are fixed. The plurality of cameras are respectively used to capture images of different shooting areas to obtain multi-channel video streams.
  • the multi-camera module can also be referred to as a panoramic camera.
  • the image processing device is used to stitch multiple frames of images captured by the multi-camera modules at the same time, so that the display device can display the stitched images.
  • the video server can be a single server, or a server cluster consisting of a plurality of servers, or a cloud computing platform, etc.
  • all cameras in the multi-camera module capture images at the same time and at the same frame rate.
  • camera synchronization technology can be used to achieve synchronous shooting of all cameras in the multi-camera module. Any two adjacent cameras in the multi-camera module have overlapping fields of view. Among them, two cameras have overlapping fields of view, which means that the shooting areas of the two cameras have overlapping areas.
  • multiple cameras in the multi-camera module can be arranged in a straight line, in a fan-shaped arrangement, or in other irregular arrangements, and the corresponding camera arrangement can be designed according to the actual shooting scene.
  • FIG. 1 is a structural schematic diagram of a multi-camera module provided in an embodiment of the present application.
  • As shown in FIG. 1, the multi-camera module 10 includes three cameras, namely cameras 1 to 3, which are arranged in sequence in a straight line. Camera 1 and camera 2 have an overlapping field of view A12, and camera 2 and camera 3 have an overlapping field of view A23.
  • the number of cameras and the camera arrangement included in the multi-camera module shown in FIG. 1 are only used as an exemplary illustration and are not intended to limit the multi-camera module involved in the embodiment of the present application.
  • the encoding format of the image captured by the camera in the multi-camera module can be RGB, YUV or HSV, etc.
  • R (red) in RGB is the red component
  • G (green) is the green component
  • B (blue) is the blue component
  • Y in YUV is the brightness component
  • U and V are color components
  • H (hue) in HSV is the hue component
  • S (saturation) is the saturation component
  • V (value) is the brightness component.
  • the resolution of the image captured by the camera can be 4K, or it can also be 1080P, 720P, 540P or 360P, etc.
  • the aspect ratio of the image captured by the camera can be 4:3 or 16:9, etc.
  • the encoding format, resolution, and aspect ratio of the images captured by different cameras in the multi-camera module can be the same or different. If different, the image needs to be converted into a unified format before image processing.
  • the embodiment of the present application does not limit the encoding format, resolution, and aspect ratio of the image captured by the camera.
  • the image stitching method provided in the embodiment of the present application can be applied to a variety of scenarios, including but not limited to video conferencing scenarios, monitoring scenarios or video live broadcast scenarios.
  • the embodiment of the present application takes the application of the image stitching method to a video conferencing scenario as an example for explanation, and the display device can be a conference terminal, such as a large screen, an electronic whiteboard, a mobile phone, a tablet computer or a smart wearable device and other electronic devices with display functions.
  • FIG. 2 is a schematic diagram of an application scenario provided by an embodiment of the present application.
  • the application scenario is a video conferencing scenario.
  • the application scenario includes a conference terminal 201A and a conference terminal 201B (collectively referred to as conference terminal 201).
  • Conference terminal 201A is communicatively connected to conference terminal 201B.
  • Conference terminal 201A has a built-in multi-camera module (not shown in the figure).
  • the application scenario also includes a video server 202.
  • Multiple conference terminals 201 are respectively connected to the video server 202.
  • Multiple conference terminals 201 communicate with each other through the video server 202, and the video server 202 can be, for example, a multipoint control unit (MCU).
  • the embodiment of the present application does not exclude the situation where different conference terminals are directly connected.
  • In one implementation, the conference terminal 201A can stitch the multi-frame images captured at the same time in the multi-channel video streams, and send the resulting stitched images as one video stream to the video server 202, which then sends them to the conference terminal 201B for display by the conference terminal 201B.
  • In another implementation, the conference terminal 201A can send the multi-channel video streams to the video server 202, which stitches the multi-frame images captured at the same time in the multi-channel video streams and then sends the resulting stitched images as one video stream to the conference terminal 201B for display by the conference terminal 201B.
  • In yet another implementation, the conference terminal 201A can send the multi-channel video streams to the video server 202, which forwards the multi-channel video streams to the conference terminal 201B; the conference terminal 201B then stitches the multi-frame images captured at the same time in the multi-channel video streams and displays the resulting stitched images.
  • the image stitching method provided in the embodiment of the present application can be executed by a device on the image acquisition side (such as conference terminal 201A), or can be executed by an image forwarding device (such as video server 202), or can be executed by a device on the image receiving side (such as conference terminal 201B).
  • the embodiment of the present application does not limit the execution subject of the scheme.
  • FIG3 is a flow chart of an image stitching method provided in an embodiment of the present application.
  • the method can be applied to an image processing device.
  • the image processing device can be, for example, a conference terminal 201A, a video server 202, or a conference terminal 201B in the application scenario shown in FIG2.
  • the method includes but is not limited to the following steps 301 to 303.
  • Step 301: Acquire multiple frames of images captured by the multi-camera module at the same time.
  • the multi-camera module includes a plurality of cameras, and the plurality of cameras include a first camera and a second camera that are adjacently deployed.
  • the first camera and the second camera have an overlapping field of view.
  • the plurality of frames of images captured by the multi-camera module at the same time include a first image captured by the first camera and a second image captured by the second camera.
  • the following embodiments of the present application all take the first camera and the second camera that are adjacently deployed in the multi-camera module as an example to illustrate the stitching process of the images captured by the first camera and the second camera.
  • the stitching process of the images captured by other adjacently deployed cameras in the multi-camera module can refer to the stitching process of the images captured by the first camera and the second camera, and the embodiments of the present application will not be repeated one by one.
  • the multiple frames of images can be preprocessed separately to remove noise in the images.
  • the images can be median filtered and then a subsequent stitching process can be performed on the preprocessed images.
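A minimal preprocessing sketch under these assumptions (OpenCV assumed; the kernel size is an example value, not one specified by the application):

```python
import cv2

def preprocess_frames(frames, ksize=5):
    """Apply a median filter to every frame captured at the same time to remove
    noise before stitching. The 5x5 kernel size is an assumed example value."""
    return [cv2.medianBlur(frame, ksize) for frame in frames]
```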
  • Step 302: When a photographed target is included in the overlapping area of the fields of view of the first camera and the second camera, determine the target stitching parameters according to the multiple groups of stitching parameters corresponding to the first camera and the second camera and the position of the photographed target.
  • the multiple groups of stitching parameters corresponding to the first camera and the second camera are respectively obtained based on the camera parameters calibrated by the first camera and the second camera for different calibration positions in the overlapping field of view, that is, a group of stitching parameters corresponding to the first camera and the second camera are obtained based on the camera parameters calibrated by the first camera and the second camera for the same calibration position in the overlapping field of view.
  • any two calibration positions among the multiple calibration positions in the overlapping field of view of the first camera and the second camera meet one or more of the following conditions: the distances from the two calibration positions to the center positions of the first camera and the second camera are different; the horizontal angles of the two calibration positions relative to the arrangement direction of the first camera and the second camera are different. The vertical angles of the two calibration positions relative to the arrangement direction of the first camera and the second camera are different.
  • the arrangement direction of the first camera and the second camera can be, for example, the direction of the straight line where the connecting line of the first camera and the second camera is located.
  • the camera parameters of each camera in the multi-camera module can be obtained by offline calibration, and the offline calibration can be performed before the product leaves the factory, or when the product is installed, or after the product is installed.
  • FIG4 is a schematic diagram of a camera calibration scene provided in an embodiment of the present application. Taking the different distances from different calibration positions to the multi-camera module as an example, as shown in FIG4, six calibration positions are set in the overlapping area of the field of view of the first camera and the second camera, including calibration positions A-F. The distances from the calibration positions A-F to the multi-camera module are 1 meter, 3 meters, 5 meters, 8 meters, 10 meters and 20 meters, respectively.
  • the first camera and the second camera are calibrated for the calibration positions A-F, respectively, to obtain 6 sets of camera parameters calibrated for the calibration positions A-F of the first camera and 6 sets of camera parameters calibrated for the calibration positions A-F of the second camera.
  • the imaging effect of the calibration position is better than that of other positions, for example, the reprojection error of the calibration position is smaller.
  • a checkerboard can be used as a calibration reference for calibration. Specifically, the checkerboard is placed at different positions in the shooting scene, and multiple images containing the checkerboard are taken respectively. Then, the positions of the corner points of the checkerboard are detected, and the corresponding camera parameters are obtained by solving the calibration algorithm.
  • the calibration algorithm here can adopt the Zhang Zhengyou calibration algorithm, or other algorithms can be used. In this implementation method, more images taken by the camera when the checkerboard is placed at a certain calibration position can be used for camera calibration, and accordingly, the camera parameters calibrated for the calibration position can be obtained.
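A minimal sketch of such checkerboard calibration using OpenCV's standard Zhang-style routine (the board dimensions, square size, and function name are assumptions; this is one possible realization, not the application's own code):

```python
import cv2
import numpy as np

def calibrate_for_position(images, board_size=(9, 6), square_size=0.025):
    """Calibrate one camera from several images of a checkerboard placed at one
    calibration position. Board dimensions and square size (meters) are
    assumed example values."""
    # 3-D coordinates of the board corners in the board's own plane (z = 0)
    objp = np.zeros((board_size[0] * board_size[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board_size[0], 0:board_size[1]].T.reshape(-1, 2) * square_size

    obj_points, img_points = [], []
    image_size = None
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        image_size = gray.shape[::-1]  # (width, height)
        found, corners = cv2.findChessboardCorners(gray, board_size)
        if found:
            obj_points.append(objp)
            img_points.append(corners)

    # Returns the intrinsic matrix, distortion coefficients, and per-view extrinsics.
    _, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj_points, img_points, image_size, None, None)
    return K, dist, rvecs, tvecs
```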
  • the checkerboard is not required as a calibration reference.
  • an active visual camera calibration method can be used to calibrate the camera using certain known motion information of the camera, or a self-calibration algorithm can be used, including but not limited to Hartley's QR decomposition method, Triggs' absolute quadratic surface method, and Pollefeys' modular constraint method.
  • stitching parameters corresponding to the multiple calibration positions can be further calculated.
  • each set of stitching parameters corresponding to the first camera and the second camera includes a projection transformation parameter from the image captured by the first camera to the image captured by the second camera, and the projection transformation parameter is used to transform the image captured by the first camera to the pixel coordinate system corresponding to the image captured by the second camera.
  • the image captured by the second camera is used as the reference image
  • the image captured by the first camera is used as the image to be registered.
  • the projection transformation parameters from the image captured by the first camera to the image captured by the second camera can be represented by a pixel coordinate mapping table, and the pixel coordinate mapping table includes a correspondence between multiple pixel coordinates in the image captured by the first camera and multiple pixel coordinates in the image captured by the second camera.
  • The correspondence here can be a correspondence between one or more pixel coordinates in the image captured by the first camera and a pixel coordinate in the image captured by the second camera. For example, if a pixel coordinate (x1, y1) in the image captured by the first camera corresponds to a pixel coordinate (x2, y2) in the image captured by the second camera, then when the image captured by the first camera is transformed into the pixel coordinate system corresponding to the image captured by the second camera, the pixel value at the pixel coordinate (x1, y1) can be set at the pixel coordinate (x2, y2) accordingly.
  • For another example, the pixel coordinates (x11, y11) and (x12, y12) in the image captured by the first camera may both correspond to the pixel coordinate (x2, y2) in the image captured by the second camera.
  • In this case, the pixel value at the pixel coordinate (x11, y11) and the pixel value at the pixel coordinate (x12, y12) can be interpolated or averaged, and the calculated pixel value can be set at the pixel coordinate (x2, y2) accordingly.
  • the projection transformation parameters from the image captured by the first camera to the image captured by the second camera can also be represented by an image transformation matrix. Then, the image processing device can multiply the pixel coordinates of the image captured by the first camera by the image transformation matrix to obtain the pixel coordinates of the image captured by the first camera in the pixel coordinate system corresponding to the image captured by the second camera.
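A minimal sketch of applying such an image transformation matrix with OpenCV (a 3x3 homography-style matrix H is assumed; the function names are illustrative):

```python
import cv2
import numpy as np

def warp_to_reference(first_image, H, reference_shape):
    """Transform the first camera's image into the pixel coordinate system of
    the second camera's image using a 3x3 image transformation matrix H.
    `reference_shape` is the (height, width) of the second camera's image."""
    h, w = reference_shape[:2]
    return cv2.warpPerspective(first_image, H, (w, h))

def map_pixel(H, x1, y1):
    """Map a single pixel coordinate by multiplying its homogeneous coordinates
    by H and renormalizing."""
    p = H @ np.array([x1, y1, 1.0])
    return p[0] / p[2], p[1] / p[2]
```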
  • the projection transformation parameters of the image captured by the first camera to the image captured by the second camera can be calculated using the camera parameters calibrated by the first camera for the calibration position and the camera parameters calibrated by the second camera for the calibration position.
  • the image captured by the first camera can be subjected to a cylindrical projection transformation or a spherical projection transformation using the camera parameters calibrated by the first camera for the calibration position
  • the image captured by the second camera can be subjected to a cylindrical projection transformation or a spherical projection transformation using the camera parameters calibrated by the second camera for the calibration position, so that the image captured by the first camera and the image captured by the second camera are projected and transformed onto the same cylinder or the same sphere, thereby determining the same pixel points in the overlapping area of the image captured by the first camera and the image captured by the second camera, and then generating a pixel coordinate mapping table according to the pixel coordinates of the multiple pixel points in the overlapping area in the image captured by the first camera and the pixel coordinates in the image captured by the second camera, or calculating the image transformation matrix from the image captured by the first camera to the image captured by the second camera.
  • each set of stitching parameters corresponding to the first camera and the second camera includes projection transformation parameters of the image captured by the first camera to the target plane coordinate system and projection transformation parameters of the image captured by the second camera to the target plane coordinate system.
  • the target plane coordinate system is different from the pixel coordinate system corresponding to the image captured by the first camera and the pixel coordinate system corresponding to the image captured by the second camera.
  • the projection transformation parameters of the image captured by the first camera to the target plane coordinate system and the projection transformation parameters of the image captured by the second camera to the target plane coordinate system can be represented by pixel coordinate mapping tables, respectively.
  • the projection transformation parameters of the image captured by the first camera to the target plane coordinate system can be represented by pixel coordinate mapping table A, and pixel coordinate mapping table A includes the correspondence between multiple pixel coordinates in the image captured by the first camera and multiple coordinates in the target plane coordinate system.
  • the projection transformation parameters of the image captured by the second camera to the target plane coordinate system can be represented by pixel coordinate mapping table B, and pixel coordinate mapping table B includes the correspondence between multiple pixel coordinates in the image captured by the second camera and multiple coordinates in the target plane coordinate system.
  • the projection transformation parameters of the image captured by the first camera to the target plane coordinate system and the projection transformation parameters of the image captured by the second camera to the target plane coordinate system can be represented by image transformation matrices, respectively.
  • the projection transformation parameters of the image captured by the first camera to the target plane coordinate system can be represented by the image transformation matrix A
  • the projection transformation parameters of the image captured by the second camera to the target plane coordinate system can be represented by the image transformation matrix B.
  • the image processing device can multiply the pixel coordinates of the image captured by the first camera by the image transformation matrix A to obtain the pixel coordinates of the image captured by the first camera in the target plane coordinate system.
  • the image processing device can multiply the pixel coordinates of the image captured by the second camera by the image transformation matrix B to obtain the pixel coordinates of the image captured by the second camera in the target plane coordinate system.
  • the camera parameters calibrated by the first camera for the calibration position can be used to calculate the projection transformation parameters of the image captured by the first camera to the target plane coordinate system.
  • the camera parameters calibrated by the first camera for the calibration position can be used to perform a cylindrical projection transformation or a spherical projection transformation on the image captured by the first camera, and then the cylindrical image or the spherical image can be projected onto the plane where the target plane coordinate system is located, and a pixel coordinate mapping table A is generated according to the pixel coordinates of multiple pixel points in the image captured by the first camera and the coordinates in the target plane coordinate system, or the projection transformation parameters of the image captured by the first camera to the target plane coordinate system are calculated.
  • the camera parameters calibrated by the second camera for the calibration position can be used to calculate the projection transformation parameters of the image captured by the second camera to the target plane coordinate system.
  • the specific calculation method can refer to the calculation method of the projection transformation parameters of the image captured by the first camera to the target plane coordinate system, and the embodiments of the present application will not be repeated here.
  • the pairing of pixel points in the image captured by the first camera with the pixel points in the image captured by the second camera can be achieved based on a pixel coordinate mapping table, or the image captured by the first camera and the image captured by the second camera can be transformed into the same plane coordinate system through an image transformation matrix to achieve the pairing of pixel points in the image captured by the first camera with the pixel points in the image captured by the second camera.
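A minimal sketch of pairing pixels via pixel coordinate mapping tables, using OpenCV's remap as one possible realization (the table layout as two float32 lookup arrays is an assumption):

```python
import cv2

def project_to_target_plane(image, map_x, map_y):
    """Resample a camera image onto the target plane coordinate system using a
    pixel coordinate mapping table stored as two float32 arrays: for every
    output coordinate (u, v), map_x[v, u] and map_y[v, u] give the source pixel
    coordinate in the camera image. The table contents would be derived from the
    calibrated camera parameters and the chosen cylindrical or spherical model."""
    return cv2.remap(image, map_x, map_y, interpolation=cv2.INTER_LINEAR)

# first_plane  = project_to_target_plane(first_image,  map_x_A, map_y_A)
# second_plane = project_to_target_plane(second_image, map_x_B, map_y_B)
# After this step, a pixel at the same (u, v) in both projected images refers
# to the same point in the target plane coordinate system.
```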
  • each set of stitching parameters corresponding to the first camera and the second camera also includes image fusion parameters, which are used to perform image fusion processing on the image captured by the first camera and the image captured by the second camera.
  • the image fusion parameters include but are not limited to the pixel value weights of the image captured by the first camera and the image captured by the second camera, the exposure weight of the image captured by the first camera, the exposure weight of the image captured by the second camera, the white balance weight of the image captured by the first camera or the white balance weight of the image captured by the second camera, etc.
  • the pixel value weights are used to calculate the pixel values of the pixel points in the overlapping area of the image captured by the first camera and the image captured by the second camera, and specifically include the proportion of the pixel values of the image captured by the first camera in the fused image and the proportion of the pixel values of the image captured by the second camera in the fused image.
  • the exposure weight is used to adjust the image brightness.
  • the exposure weight of the image captured by the first camera is used to adjust the brightness of the image captured by the first camera
  • the exposure weight of the image captured by the second camera is used to adjust the brightness of the image captured by the second camera.
  • the white balance weight is used to adjust the color of the image.
  • the white balance weight of the image captured by the first camera is used to adjust the color of the image captured by the first camera
  • the white balance weight of the image captured by the second camera is used to adjust the color of the image captured by the second camera.
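A minimal sketch of fusing the aligned overlap regions with pixel value weights (a horizontal linear ramp and three-channel images are assumed; real weights would come from the stored image fusion parameters):

```python
import numpy as np

def fuse_overlap(first_overlap, second_overlap):
    """Blend the aligned overlapping regions of the two images. A horizontal
    linear ramp is used as the pixel value weight: pixels near the first
    camera's side of the seam take most of their value from the first image,
    pixels near the second camera's side from the second image."""
    h, w = first_overlap.shape[:2]
    w_first = np.linspace(1.0, 0.0, w).reshape(1, w, 1)  # weight of the first image
    w_second = 1.0 - w_first                             # weight of the second image
    fused = w_first * first_overlap.astype(np.float32) + \
            w_second * second_overlap.astype(np.float32)
    return fused.astype(first_overlap.dtype)
```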
  • the stitching effect of the calibrated position is better than that of other positions.
  • the image processing device may store a correspondence between multiple sets of stitching parameters corresponding to two cameras deployed adjacently in a multi-camera module and multiple calibration positions in the overlapping area of the field of view of the two cameras.
  • the image processing device may store a correspondence between multiple sets of camera parameters of two cameras deployed adjacently in a multi-camera module and multiple calibration positions in the overlapping area of the field of view of the two cameras.
  • the embodiment of the present application can also generate stitching parameters corresponding to multiple different calibration positions for different deployment scenarios.
  • different deployment scenarios are different conference room scenarios, including different conference room types and/or different conference room sizes.
  • the conference room types can be divided into three types: open conference room, semi-open conference room and closed conference room, or can be divided into two types: indoor conference room and outdoor conference room.
  • the size of the conference room includes the length (maximum depth distance from the multi-camera module), width (left and right width), and height (up and down height) of the conference room.
  • the embodiment of the present application can generate stitching parameters at different positions for each combination of the three conference room types (open conference room, semi-open conference room, and closed conference room) and the three conference room sizes, i.e. a total of 9 conference room scenes. For example, for a closed, large conference room, stitching parameters corresponding to six positions at 1 meter, 3 meters, 5 meters, 8 meters, 10 meters, and 20 meters from the multi-camera module can be generated.
  • an image processing device stores stitching parameters corresponding to two adjacent cameras in a multi-camera module in multiple deployment scenarios. Before the image processing device determines the target stitching parameters based on the multiple sets of stitching parameters corresponding to the first camera and the second camera and the position of the shooting target, it is necessary to first obtain the deployment scenario of the multi-camera module, and then obtain the multiple sets of stitching parameters corresponding to the first camera and the second camera in the deployment scenario of the multi-camera module.
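One possible way to organize and query such stored parameter sets is sketched below (the data layout, keys, and placeholder values are assumptions, not the application's actual storage format):

```python
# The structure and keys below are assumptions made for illustration only.
stitching_param_store = {
    ("closed", "large"): {                 # deployment scenario (type, size)
        ("camera_1", "camera_2"): {        # a pair of adjacently deployed cameras
            1.0: "params_1m",              # calibration position (meters) -> parameter set
            3.0: "params_3m",
            5.0: "params_5m",
            8.0: "params_8m",
            10.0: "params_10m",
            20.0: "params_20m",
        },
    },
}

def get_param_sets(scenario, camera_pair):
    """Return the {calibration position: stitching parameters} table for the
    detected or user-selected deployment scenario and the given camera pair."""
    return stitching_param_store[scenario][camera_pair]
```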
  • the image processing device first needs to obtain the deployment scenario of the multi-camera module.
  • the deployment scenario of the multi-camera module can be determined by the image processing device based on the images collected by the multi-camera module, or can be identified by other sensors, or can be input or selected through a user interaction interface.
  • the image processing device can identify the type of conference room and estimate the size of the conference room based on the images collected by the multi-camera modules.
  • the image processing device can use a classification algorithm to identify the type of the current conference room for images collected by one or more cameras in the multi-camera module. If the images collected by multiple cameras are used for classification, the image processing device can classify the images collected by each camera separately, and weighted average the classification results to obtain the final classification result, or the image processing device can also input the images collected by multiple cameras into a classification model, and then obtain the classification results output by the classification model.
  • the image processing device can estimate the spatial size through three-dimensional layout (3D layout) estimation based on a monocular image, for example by estimating the spatial size of the input monocular image through a deep learning model.
  • Distance measurement or generation of a three-dimensional point cloud image can also be performed by a millimeter-wave radar, an ultrasonic radar, or a multi-microphone array, etc.; the size of the conference room can be calculated from the result, and the conference room scene can then be determined by the classification model.
  • the type of conference room can be determined based on the image, and the size of the conference room can be calculated by millimeter wave radar, and then the conference room scene can be determined according to the type of conference room and the size of the conference room.
  • the image processing device can display multiple deployment scenario options for user selection through a user interaction interface.
  • Figure 5 is a schematic diagram of a user interaction interface provided in an embodiment of the present application.
  • the user interaction interface is a deployment scenario selection interface
  • the user interaction interface includes three conference room type options and three conference room size options.
  • the three conference room type options are open, semi-open and closed, and the three conference room size options are large, medium and small.
  • the conference room size options can also include specific conference room dimensions corresponding to large, medium and small, such as length, width and height, which are not shown one by one in the figure.
  • the user can select the conference room type and conference room size on the user interaction interface, and then the image processing device determines the conference room scene deployed by the multi-camera module according to the user selection result.
  • the embodiment of the present application provides stitching parameters corresponding to different positions of the multi-camera modules in a variety of different deployment scenarios.
  • the image processing device can flexibly select the corresponding stitching parameters according to the deployment scenario of the multi-camera module, so that the image stitching effect is better matched with the current deployment scenario, thereby achieving good image stitching effects for images collected in different deployment scenarios of the multi-camera module.
  • Alternatively, the deployment scenarios of the multi-camera module may not be distinguished, and all deployment scenarios may use the same sets of stitching parameters, which is not limited in the embodiments of the present application.
  • the image processing device can estimate the position of the shooting target based on the image, sound or sensor.
  • Position estimation can be implemented using director technology. For example, if the shooting target is a person, the person's position can be determined by face or body tracking, or by sound source positioning, or by using a millimeter wave sensor or an ultrasonic sensor to use a liveness detection algorithm to determine the person's position, or by using the position of a moving object determined by a motion detection algorithm as the person's position, or by combining at least two of the above position estimation schemes to determine the person's position.
  • the position of the shooting target can also be obtained by manual input, for example, the user can enter the position coordinates through a user interaction interface. The embodiment of the present application does not limit the manner in which the image processing device obtains the position of the shooting target.
  • the position of the photographed target may be one-dimensional, two-dimensional or three-dimensional, and the dimension of the position of the photographed target determined here may be consistent with the dimension of the calibration position selected during camera calibration.
  • the one-dimensional representation of the position of the photographed target located in the overlapping area of the field of view of two adjacently deployed cameras may be the distance from the photographed target to the center position of the two cameras, or the horizontal (left-right) angle of the photographed target relative to the arrangement direction of the two cameras, or the vertical (up-down) angle of the photographed target relative to the arrangement direction of the two cameras.
  • the two-dimensional representation of the position of the photographed target located in the overlapping area of the field of view of two adjacently deployed cameras may be the distance from the photographed target to the center position of the two cameras and the horizontal angle of the photographed target relative to the arrangement direction of the two cameras, or the distance from the photographed target to the center position of the two cameras and the vertical angle of the photographed target relative to the arrangement direction of the two cameras, or the horizontal angle of the photographed target relative to the arrangement direction of the two cameras and the vertical angle of the photographed target relative to the arrangement direction of the two cameras.
  • the three-dimensional representation of the position of the shooting target can be the distance from the shooting target to the center position of the two cameras, the horizontal angle of the shooting target relative to the arrangement direction of the two cameras, and the vertical angle of the shooting target relative to the arrangement direction of the two cameras.
  • Figures 6 to 8 are schematic diagrams of positions of a shooting target provided in embodiments of the present application.
  • as shown in Figure 6, the distance from the shooting target to the multi-camera module is 3 meters.
  • as shown in Figure 7, the distance from the shooting target to the multi-camera module is 3 meters and the horizontal angle of the shooting target relative to the multi-camera module is 30°, where 0° represents straight ahead, a positive horizontal angle represents the right side of the multi-camera module and a negative horizontal angle represents its left side, so a horizontal angle of 30° means the shooting target is 30° to the right of straight ahead.
  • as shown in Figure 8, the distance from the shooting target to the multi-camera module is 3 meters, the horizontal angle of the shooting target relative to the multi-camera module is 30°, and the vertical angle of the shooting target relative to the multi-camera module is 20°, where a positive vertical angle indicates above the multi-camera module and a negative vertical angle indicates below it, so a vertical angle of 20° means the shooting target is 20° above straight ahead.
  • the distance from the shooting target to the multi-camera module can be the distance from the shooting target to the center position of the two cameras constituting the overlapping area of the field of view where the shooting target is located
  • the horizontal angle or vertical angle of the shooting target relative to the multi-camera module can be the horizontal angle or vertical angle of the arrangement direction of the shooting target to the two cameras constituting the overlapping area of the field of view where the shooting target is located.
  • the individual cameras in the multi-camera module are not shown in these figures and are represented collectively as the multi-camera module; the x-axis direction in FIG7 and FIG8 is the horizontal direction, the y-axis direction is the depth direction, and the z-axis direction in FIG8 is the height direction.
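  • as a concrete illustration of these one-, two- and three-dimensional position representations, the sketch below converts a target point given in a coordinate frame centered on the two cameras (x horizontal, y depth, z height, as in FIG7 and FIG8) into distance, horizontal angle and vertical angle; the function name and the exact sign conventions are assumptions made for illustration.

```python
import math

def target_position_representation(x: float, y: float, z: float):
    """Convert a target point (x, y, z), measured from the center position of
    the two cameras, into (distance, horizontal angle, vertical angle).

    Assumed conventions, mirroring FIG. 6-8: x is horizontal (positive to the
    right of the module), y is depth (positive in front of the module), z is
    height (positive above the module); angles are returned in degrees."""
    distance = math.sqrt(x * x + y * y + z * z)                    # 1-D representation
    horizontal_deg = math.degrees(math.atan2(x, y))                # + right, - left
    vertical_deg = math.degrees(math.atan2(z, math.hypot(x, y)))   # + up, - down
    return distance, horizontal_deg, vertical_deg

# Example: a target 3 m away, 30° to the right and 20° above straight ahead
# corresponds roughly to x = 1.41 m, y = 2.44 m, z = 1.03 m.
print(target_position_representation(1.41, 2.44, 1.03))
```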
  • the shooting target may be a person, or may be any object such as an animal, a car, or a factory workpiece, and the embodiments of the present application do not limit the type of the shooting target.
  • the implementation method of step 302 may be that the image processing device determines the target stitching parameters according to the stitching parameters corresponding to one or more calibration positions closest to the position of the photographing target among the multiple calibration positions in the overlapping area of the field of view of the first camera and the second camera.
  • if the multiple calibration positions include the position of the photographing target, the image processing device may use the stitching parameters corresponding to the position of the photographing target as the target stitching parameters. If the multiple calibration positions do not include the position of the photographing target, the image processing device may determine the target stitching parameters according to the stitching parameters corresponding to the two calibration positions closest to the position of the photographing target among the multiple calibration positions.
  • an implementation method in which the image processing device determines the target stitching parameters based on the stitching parameters corresponding to the two calibration positions that are closest to the position of the shooting target among the multiple calibration positions includes: the image processing device uses the stitching parameters corresponding to the two calibration positions to interpolate and calculate the target stitching parameters corresponding to the position of the shooting target based on the distance of the position of the shooting target relative to the two calibration positions.
  • for example, assume the shooting target is 2.2 meters away from the multi-camera module
  • the two calibration positions closest to the shooting target are 2 meters and 3 meters away from the multi-camera module respectively.
  • the stitching parameters corresponding to the calibration position 2 meters away from the multi-camera module and the stitching parameters corresponding to the calibration position 3 meters away from the multi-camera module can be interpolated and calculated to obtain the stitching parameters at a distance of 2.2 meters from the multi-camera module, and the stitching parameters are used as the target stitching parameters.
  • the interpolation algorithm used here can be a linear interpolation algorithm or a nonlinear interpolation algorithm.
  • taking linear interpolation as an example, assume the shooting target is 2.2 meters from the multi-camera module, the stitching parameter corresponding to the calibration position 2 meters from the multi-camera module is T1, and the stitching parameter corresponding to the calibration position 3 meters from the multi-camera module is T2; the stitching parameter at a distance of 2.2 meters from the multi-camera module calculated by linear interpolation is then (0.8*T1+0.2*T2).
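  • a minimal sketch of this linear interpolation is given below; representing a stitching parameter as a numeric array that can be interpolated element-wise (for example a projection transformation matrix) is a simplifying assumption made only for illustration.

```python
import numpy as np

def interpolate_stitching_params(d_target: float,
                                 d1: float, params1: np.ndarray,
                                 d2: float, params2: np.ndarray) -> np.ndarray:
    """Linearly interpolate stitching parameters for a target at distance
    d_target between the two nearest calibration distances d1 and d2."""
    w2 = (d_target - d1) / (d2 - d1)   # weight of the farther calibration position
    w1 = 1.0 - w2                      # weight of the nearer calibration position
    return w1 * params1 + w2 * params2

# Example from the text: target at 2.2 m, calibration positions at 2 m and 3 m,
# giving 0.8*T1 + 0.2*T2.  T1 and T2 are placeholder parameter matrices here.
T1 = np.eye(3)
T2 = 2.0 * np.eye(3)
T_target = interpolate_stitching_params(2.2, 2.0, T1, 3.0, T2)
```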
  • the image processing device may also use the average of the stitching parameters corresponding to the two calibrated positions closest to the position where the photographed target is located as the target stitching parameter corresponding to the position where the photographed target is located.
  • in this way, when there is a single shooting target in the overlapping area of the field of view, the stitching parameters corresponding to the position of the shooting target are used as the target stitching parameters, or the target stitching parameters are calculated from the stitching parameters corresponding to the two calibration positions closest to that position; when these target stitching parameters are used to stitch the images captured by the first camera and the second camera, the stitching effect at the position of the shooting target is better, thereby ensuring the display effect of the shooting target in the stitched image.
  • the overlapping area of the field of view of the first camera and the second camera includes multiple shooting targets.
  • the image processing device can determine a stitching parameter for the multiple shooting targets, or can also determine a corresponding stitching parameter for each of the multiple shooting targets.
  • in the first possible case, the image processing device determines a single set of stitching parameters for the multiple shooting targets; this can be applied, for example, to a scenario in which the multi-camera module adopts a director mode.
  • the director mode includes but is not limited to auto framing mode, speaker close-up mode, speaker tracking mode or dialogue mode.
  • the speaker close-up mode and the speaker tracking mode refer to selecting and tracking the person speaking in the meeting; face detection is required in these modes, and the speaker close-up mode additionally requires sound source localization. Because sound source localization has errors, the sound source position and the image need to be matched: based on the localized sound source position, a face is searched for at that position in the image, and if a face is found, the director gives a close-up of that face. This is the sound and image matching process.
  • a specific implementation manner of step 302 may be that the image processing device determines the target stitching parameters according to the multiple groups of stitching parameters corresponding to the first camera and the second camera and the positions of the multiple shooting targets.
  • an implementation method in which the image processing device determines the target stitching parameters according to the multiple sets of stitching parameters corresponding to the first camera and the second camera and the positions of the multiple shooting targets includes: the image processing device uses the stitching parameters corresponding to the target calibration position as the target stitching parameters, and the target calibration position is the calibration position with the smallest sum of distances to the positions of the multiple shooting targets among the multiple calibration positions. For example, there are three shooting targets in the overlapping area of the field of view of the first camera and the second camera, and the distances of the three shooting targets to the multi-camera module are 2 meters, 3 meters and 5 meters respectively.
  • the stitching parameters corresponding to the calibration position 3 meters away from the multi-camera module can be used as the target stitching parameters.
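  • a minimal sketch of this selection rule (choose the calibration position whose summed distance to all target positions is smallest) is given below; representing every position simply by its distance to the multi-camera module, and the particular calibration distances used, are assumptions made only for illustration.

```python
def select_calibration_position(calibration_distances, target_distances):
    """Return the calibration distance whose sum of distances to all shooting
    targets is smallest (positions are reduced to 1-D distances here)."""
    return min(calibration_distances,
               key=lambda c: sum(abs(c - t) for t in target_distances))

# Example from the text: targets at 2 m, 3 m and 5 m.  With calibration
# positions at 1, 3, 5, 8, 10 and 20 m, the 3 m position has the smallest
# summed distance and is selected.
print(select_calibration_position([1, 3, 5, 8, 10, 20], [2, 3, 5]))  # 3
```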
  • This implementation method comprehensively considers the image stitching effect of multiple shooting targets, selects the stitching parameter corresponding to the calibration position with the smallest sum of distances to the positions of the multiple shooting targets as the target stitching parameter, and when the target stitching parameter is used to stitch the images captured by the first camera and the second camera, the overall stitching effect of the positions of the multiple shooting targets can be better, thereby making the overall display effect of the multiple shooting targets in the stitched image better.
  • the calculation process of this implementation method is simple and consumes less processing resources.
  • another implementation manner in which the image processing device determines the target stitching parameters based on the multiple groups of stitching parameters corresponding to the first camera and the second camera and the positions of the multiple shooting targets includes: the image processing device obtains, for each shooting target among the multiple shooting targets, stitching parameters corresponding to one or more calibration positions that are closest to the position of the shooting target among the multiple calibration positions. The image processing device determines the target stitching parameters based on all the stitching parameters obtained for the multiple shooting targets. If the multiple calibration positions include the position of the shooting target, the one or more calibration positions that are closest to the position of the shooting target may be the position of the shooting target. If the multiple calibration positions do not include the position of the shooting target, the one or more calibration positions that are closest to the position of the shooting target may be the two calibration positions that are closest to the position of the shooting target.
  • an implementation process of the image processing device determining the target stitching parameters according to all stitching parameters obtained for the multiple shooting targets includes: the image processing device first determines the stitching parameters corresponding to the position of each shooting target, and then determines the target stitching parameters according to the stitching parameters corresponding to the positions of the multiple shooting targets.
  • the implementation process of the image processing device determining the stitching parameters corresponding to the position of a single shooting target can refer to the relevant description in the first possible case above, and the embodiments of the present application will not be repeated here.
  • the image processing device may use the average value of the stitching parameters corresponding to the positions of the multiple shooting targets as the target stitching parameter. For example, there are two shooting targets in the overlapping area of the field of view of the first camera and the second camera, and the stitching parameter corresponding to the position of one shooting target is T1, and the stitching parameter corresponding to the position of the other shooting target is T2, then the target stitching parameter may be (T1+T2)/2.
  • the image processing device may also use the weighted average value of the stitching parameters corresponding to the positions of the multiple shooting targets as the target stitching parameter, and the weight ratio of the stitching parameters corresponding to the positions of each shooting target is positively correlated with the distance to the cluster center of the positions of the multiple shooting targets.
  • the image processing device determines another implementation process of the target stitching parameter based on all the stitching parameters obtained for the multiple shooting targets, including: the image processing device uses the average value of all the stitching parameters obtained for the multiple shooting targets as the target stitching parameter. For example, there are two shooting targets in the overlapping area of the field of view of the first camera and the second camera, and the stitching parameters corresponding to the two calibration positions closest to the position of one of the shooting targets are T1 and T2 respectively, and the stitching parameters corresponding to the two calibration positions closest to the position of the other shooting target are T3 and T4 respectively, then the target stitching parameter can be (T1+T2+T3+T4)/4.
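  • the averaging strategy just described might be sketched as follows; treating each stitching parameter as a numeric array and the way the nearest calibration parameters are gathered per target are assumptions made only for illustration.

```python
import numpy as np

def average_target_params(per_target_nearest_params):
    """per_target_nearest_params holds, for each shooting target, the stitching
    parameters of the one or two calibration positions nearest to that target;
    the target stitching parameters are the mean over all collected parameters."""
    all_params = [p for params in per_target_nearest_params for p in params]
    return sum(all_params) / len(all_params)

# Example from the text: two targets whose nearest calibration parameters are
# (T1, T2) and (T3, T4), giving (T1 + T2 + T3 + T4) / 4.
T1, T2, T3, T4 = (k * np.eye(3) for k in (1.0, 2.0, 3.0, 4.0))
T_target = average_target_params([[T1, T2], [T3, T4]])
```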
  • this implementation comprehensively considers the image stitching effects for multiple shooting targets and determines the target stitching parameters from the stitching parameters corresponding to the one or more calibration positions closest to each shooting target; when these target stitching parameters are used to stitch the images captured by the first camera and the second camera, the stitching effects at the positions of the multiple shooting targets are traded off against one another and made as consistent as possible, so that the overall display effect of the multiple shooting targets in the stitched image is better.
  • alternatively, in the second possible case, the image processing device may determine corresponding stitching parameters for each of the multiple shooting targets. This implementation may be applied to scenarios where the multi-camera module adopts an intelligent averaging mode or a multi-person same-frame mode.
  • the specific implementation of step 302 may be that, for each of the multiple shooting targets, the image processing device determines the stitching parameters corresponding to the shooting target according to the stitching parameters corresponding to one or more calibrated positions closest to the position of the shooting target among the multiple calibrated positions in the overlapping area of the field of view of the first camera and the second camera. That is, in this possible case, the image processing device determines the stitching parameters corresponding to the position of the shooting target for each shooting target, and the implementation process of the image processing device determining the stitching parameters corresponding to the position of a single shooting target can refer to the relevant description in the above-mentioned first possible case, and the embodiment of the present application will not be repeated here.
  • the intelligent averaging mode or the multi-person same-frame mode refers to combining close-up images of multiple people and displaying them on the same screen.
  • the target stitching parameters include stitching parameters corresponding to each of the multiple shooting targets.
  • Step 303: Use the target stitching parameters to stitch the first image captured by the first camera and the second image captured by the second camera to obtain a stitched image.
  • in one possible case, the target stitching parameters include projection transformation parameters from the first image to the second image.
  • in another possible case, the target stitching parameters include projection transformation parameters from the first image to a target plane coordinate system and projection transformation parameters from the second image to the target plane coordinate system.
  • accordingly, using the target stitching parameters to stitch the first image captured by the first camera and the second image captured by the second camera may mean either: transforming the first image into the pixel coordinate system corresponding to the second image using the projection transformation parameters from the first image to the second image, and then fusing the second image with the transformed first image; or transforming the first image onto the plane of the target plane coordinate system using the projection transformation parameters from the first image to the target plane coordinate system, transforming the second image onto that plane using the projection transformation parameters from the second image to the target plane coordinate system, and then fusing the two transformed images.
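  • a minimal sketch of the first variant (warping the first image into the pixel coordinate system of the second image with a 3x3 projection transformation matrix) is given below, using OpenCV; OpenCV itself, the placeholder matrix values and the output canvas size are assumptions made only for illustration, and the actual parameters would come from the offline calibration.

```python
import cv2
import numpy as np

def warp_first_to_second(first_image: np.ndarray,
                         second_image: np.ndarray,
                         H_first_to_second: np.ndarray) -> np.ndarray:
    """Warp the first image into the pixel coordinate system of the second
    image using a 3x3 projection transformation (homography) matrix."""
    h, w = second_image.shape[:2]
    # A real implementation would typically use a wider output canvas so that
    # the non-overlapping part of the first image is preserved.
    return cv2.warpPerspective(first_image, H_first_to_second, (w, h))

# Placeholder projection transformation matrix; in practice it is part of the
# target stitching parameters selected for the shooting target's position.
H = np.array([[1.0, 0.02, 150.0],
              [0.0, 1.00,   0.0],
              [0.0, 0.00,   1.0]])
```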
  • Image fusion of the first image and the second image is actually to fuse the overlapping area of the first image and the second image.
  • the target stitching parameters also include image fusion parameters
  • the image processing device can perform image fusion on the overlapping area of the first image and the second image based on the image fusion parameters, such as adjusting the brightness of the first image and/or the second image based on the exposure weight, adjusting the color of the first image and/or the second image based on the white balance weight, and calculating the pixel value of the pixel point in the overlapping area of the first image and the second image based on the pixel value weight.
  • Performing image fusion on the overlapping area of the first image and the second image may include calculating the target pixel value according to the pixel value of the same pixel point in the overlapping area of the first image and the second image, and using the target pixel value as the pixel value of the pixel point in the fused image.
  • the image fusion parameters include the pixel value weights of the first image and the second image
  • the pixel value weights may be used to calculate the target pixel value.
  • the weighted average method may also be used to calculate the target pixel value of each pixel point in the overlapping area of the two frames of images.
  • for example, if pixel a has pixel value p in the first image and pixel value q in the second image, and the pixel value weights of the two images are both 0.5, the target pixel value of pixel a in the image obtained by fusing the first image and the second image can be (p*0.5+q*0.5).
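  • a sketch of fusing the overlapping area with per-image pixel value weights (a fixed 0.5/0.5 split here, matching the example above) is given below; detecting non-overlapping pixels with a simple non-zero mask is an assumption made only for illustration, and in practice the weights could also vary gradually across the overlap.

```python
import numpy as np

def fuse_overlap(first_warped: np.ndarray, second_image: np.ndarray,
                 w_first: float = 0.5, w_second: float = 0.5) -> np.ndarray:
    """Fuse two aligned images: pixels covered by both images get a weighted
    average, pixels covered by only one image keep that image's value."""
    first_mask = first_warped.sum(axis=-1, keepdims=True) > 0
    second_mask = second_image.sum(axis=-1, keepdims=True) > 0
    overlap = first_mask & second_mask
    fused = np.where(first_mask, first_warped, second_image).astype(np.float32)
    blended = (w_first * first_warped.astype(np.float32)
               + w_second * second_image.astype(np.float32))
    return np.where(overlap, blended, fused).astype(np.uint8)
```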
  • the image processing device performs image stitching for all the shooting targets to obtain a stitched image.
  • the image processing device can output a cropped image to the screen for display, the cropped image is cropped from the stitched image, and the cropped image includes all the shooting targets located in the overlapping area of the field of view of the first camera and the second camera.
  • the image processing device can crop the stitched image to obtain a cropped image including all the shooting targets, and output the cropped image to the screen for display.
  • the image processing device may first stitch together a panorama, and then crop the panorama to obtain a close-up image of the speaker.
  • the image processing device may also stitch together only the image of the speaker's area based on the speaker's position to obtain a stitched close-up image of the speaker.
  • Figures 9 and 10 are schematic diagrams of a conference room scene provided by an embodiment of the present application.
  • the multi-camera module is deployed in front of the conference room, and there are four different positions A, B, C, and D in the conference room.
  • the structure of the multi-camera module can be as shown in Figure 1, where positions A and C are located in the overlapping area A12 of the field of view of camera 1 and camera 2, and positions B and D are located in the overlapping area A23 of the field of view of camera 2 and camera 3.
  • the images captured by the three cameras in the multi-camera module can be shown in FIG11 , where image 1a is the image captured by camera 1, image 2a is the image captured by camera 2, and image 3a is the image captured by camera 3.
  • Image 2a refers to the area within the rectangular frame. For illustration, the part of the speaker that camera 2 did not fully capture is also shown in the figure; the part outside the rectangular frame does not actually belong to image 2a. Since the speaker featured by the director is at position A, the image processing device can use the stitching parameters corresponding to position A to stitch image 1a and image 2a to obtain a stitched image as shown in FIG12.
  • the image processing device can crop the position frame containing the speaker from FIG12 for enlarged display, and the resulting cropped image can be shown in FIG13 .
  • the image processing device can use the stitching parameters corresponding to position B to stitch the image captured by camera 2 and the image captured by camera 3.
  • the image processing device can use the stitching parameters corresponding to the speaker's position to stitch the images captured by the corresponding cameras.
  • the image processing device uses the stitching parameters corresponding to position A to stitch the images captured by camera 1 and camera 2.
  • the image processing device uses the stitching parameters corresponding to position B to stitch the images captured by camera 2 and camera 3.
  • the image processing device uses the stitching parameters corresponding to position C to stitch the images captured by camera 1 and camera 2.
  • the image processing device uses the stitching parameters corresponding to position D to stitch the images captured by camera 2 and camera 3. In this way, the current position of the director's close-up object can have a better image stitching effect, thereby ensuring the display effect of the director's close-up object.
  • the director mode and the panoramic display mode can be switched between each other.
  • the image processing device stitches the images captured by all cameras in the multi-camera module.
  • the image processing device can first stitch the images captured by camera 1 and camera 2, and then stitch the obtained stitched image with the image captured by camera 3 to obtain a panoramic image.
  • the image processing device can first stitch the images captured by camera 1 and camera 2, and stitch the images captured by camera 2 and camera 3, and then further stitch the two stitched images obtained to obtain a panoramic image.
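  • for a three-camera module, the two stitching orders described above might look like the sketch below; stitch_pair stands for the pairwise stitching of step 303 with the appropriate stitching parameters and is an assumed helper rather than an interface defined in this application.

```python
def build_panorama(img1, img2, img3, stitch_pair):
    """Two possible orders for stitching the images of three adjacent cameras.

    stitch_pair(left, right) is assumed to stitch two adjacent images using the
    stitching parameters selected for the corresponding camera pair."""
    # Order 1: stitch camera 1 + camera 2 first, then add camera 3.
    panorama_a = stitch_pair(stitch_pair(img1, img2), img3)

    # Order 2: stitch the two adjacent pairs first, then merge the two results.
    panorama_b = stitch_pair(stitch_pair(img1, img2), stitch_pair(img2, img3))
    return panorama_a, panorama_b
```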
  • the image stitching in the embodiment of the present application can be stitching images captured by adjacently deployed cameras.
  • the overlapping area of the field of view of the first camera and the second camera includes multiple shooting targets
  • the implementation of step 303 is that for each shooting target, the image processing device uses the stitching parameters corresponding to the shooting target to stitch the first image and the second image to obtain a stitched image corresponding to the shooting target. That is, the image processing device stitches the images once for each of the multiple shooting targets to obtain multiple stitched images to ensure the display effect of each shooting target in the corresponding stitched image.
  • the image processing device can output a combined image to the screen for display, the combined image is obtained by combining multiple cropped images, the multiple cropped images are respectively cropped from the multiple stitched images corresponding to the multiple shooting targets, and each cropped image contains the corresponding shooting targets in the cropped stitched images.
  • the image processing device can crop the stitched image corresponding to each shooting target to obtain a cropped image containing the corresponding shooting target, the image processing device combines the multiple cropped images corresponding to the multiple shooting targets to obtain a combined image, and outputs the combined image to the screen for display.
  • since each photographed target in the combined image generated by the image processing device comes from a stitched image that ensures a good display effect for that target, the display effect of the finally output combined image can be guaranteed.
  • the image processing device can directly stitch the image containing the area where the shooting target is located using the stitching parameters corresponding to the shooting target according to the location of the shooting target. For example, if the multi-camera module includes three cameras and a participant only appears in the field of view of two cameras, only the images captured by these two cameras will be stitched during stitching. If a participant appears in the field of view of three cameras at the same time, the images of two adjacent cameras can be selected for stitching, and it is only necessary to ensure that the imaging of the participant in the stitched image is complete.
  • images captured by the three cameras in the multi-camera module can be shown in FIG14, where image 1b is the image captured by camera 1, image 2b is the image captured by camera 2, and image 3b is the image captured by camera 3.
  • Images 1b-3b refer to the area within the rectangular frame.
  • the image processing device may use the stitching parameters corresponding to the A position to stitch image 1b and image 2b, and cut out the area containing the speaker at position A from the stitched image.
  • the image processing device may use the stitching parameters corresponding to the B position to stitch image 2b and image 3b, and cut out the area containing the speaker at position B from the stitched image.
  • the image processing device may use the stitching parameters corresponding to the C position to stitch image 1b and image 2b, and cut out the area containing the speaker at position C from the stitched image.
  • the image processing device may use the stitching parameters corresponding to the D position to stitch image 2b and image 3b, and cut out the area containing the speaker at position D from the stitched image. Further, the image processing device may combine the four cut-out areas to obtain a combined image as shown in FIG15.
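  • a sketch of this per-target stitch, crop and combine flow is given below; the helpers stitch_pair and crop_around, the fields assumed on each target, and the simple side-by-side combination are assumptions made only for illustration.

```python
import numpy as np

def combine_target_closeups(targets, images, stitch_pair, crop_around):
    """For each shooting target, stitch the images of the camera pair covering
    it using that target's stitching parameters, crop the region containing the
    target, and place all crops side by side in one combined image."""
    crops = []
    for t in targets:
        # Each target is assumed to carry the indices of its camera pair, its
        # own stitching parameters, and its position in the stitched image.
        left_img, right_img = images[t["left_cam"]], images[t["right_cam"]]
        stitched = stitch_pair(left_img, right_img, t["stitch_params"])
        crops.append(crop_around(stitched, t["position"]))
    height = max(c.shape[0] for c in crops)
    padded = [np.pad(c, ((0, height - c.shape[0]), (0, 0), (0, 0))) for c in crops]
    return np.hstack(padded)
```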
  • the image processing device outputs the image to a screen for display, which may be to display the image on its own screen, or to send the image to other devices for display on other devices.
  • the conference terminal may stitch together multiple frames of images captured by multiple camera modules on the local end to obtain a stitched image, and may further display the stitched image, cropped image, or combined image.
  • the conference terminal may also send the stitched image, cropped image, or combined image to other conference terminals at the remote end for display by other conference terminals.
  • the camera parameters of two cameras deployed adjacently in the multi-camera module for different calibration positions in the overlapping field of view are calibrated offline.
  • stitching parameters can be further determined based on the camera parameters calibrated by the two cameras for the calibration position to obtain stitching parameters corresponding to different calibration positions.
  • the image processing device can determine the stitching parameters applicable to the position where the shooting target is located based on the stitching parameters corresponding to the multiple calibration positions in the overlapping field of view, thereby making the stitching effect of the position where the shooting target is located in the stitched image better, thereby improving the display effect of the shooting target in the stitched image.
  • in addition, there is no need to detect and match feature points on the images, which makes the method applicable to more shooting scenes and requires less computing power.
  • the order of the steps of the image stitching method provided in the embodiments of the present application can be adjusted appropriately, and steps can be added or removed according to the situation. Any variation readily conceivable by a person skilled in the art within the technical scope disclosed in this application shall be covered by the protection scope of this application.
  • the embodiment of the present application takes the shooting target being located in the shooting area of two cameras in a multi-camera module as an example to illustrate the image stitching process. If the shooting target is located in the shooting area of three or more cameras in the multi-camera module, the image processing device can select the images captured by two adjacently deployed cameras for stitching. It is only necessary to ensure that the imaging of the shooting target in the stitched image is complete, and the embodiments of the present application will not be repeated one by one.
  • FIG16 is a schematic diagram of the structure of an image stitching device provided in an embodiment of the present application.
  • the image stitching device can be applied to an image processing device.
  • the device 1600 includes but is not limited to: an acquisition module 1601, a determination module 1602, and a stitching module 1603.
  • the device 1600 also includes an output module 1604.
  • the acquisition module 1601 is used to acquire multiple frames of images captured by a multi-camera module at the same time.
  • the multi-camera module includes multiple cameras, and the multiple cameras include a first camera and a second camera deployed adjacent to each other. The first camera and the second camera have an overlapping field of view.
  • the multiple frames of images include a first image captured by the first camera and a second image captured by the second camera.
  • Determination module 1602 is used to determine the target stitching parameters according to multiple sets of stitching parameters corresponding to the first camera and the second camera and the position of the target when the overlapping field of view includes the target, and the multiple sets of stitching parameters are obtained based on the camera parameters calibrated by the first camera and the second camera at different calibration positions in the overlapping field of view.
  • the stitching module 1603 is used to stitch the first image and the second image using target stitching parameters to obtain a stitched image.
  • Each set of stitching parameters corresponding to the first camera and the second camera includes projection transformation parameters of the image captured by the first camera to the image captured by the second camera, or each set of stitching parameters corresponding to the first camera and the second camera includes projection transformation parameters of the image captured by the first camera to the target plane coordinate system and projection transformation parameters of the image captured by the second camera to the target plane coordinate system.
  • any two calibration positions among the multiple calibration positions satisfy one or more of the following conditions: the distances between the two calibration positions and the center positions of the first camera and the second camera are different; the horizontal angles of the two calibration positions relative to the arrangement direction of the first camera and the second camera are different; the vertical angles of the two calibration positions relative to the arrangement direction of the first camera and the second camera are different.
  • the overlapping field of view includes a photographic target
  • the determination module 1602 is configured to determine the target stitching parameters according to stitching parameters corresponding to one or more calibrated positions that are closest to the position of the photographic target among the multiple calibrated positions.
  • the determination module 1602 is specifically configured to: if the multiple calibration positions include the position where the photographed target is located, use the stitching parameters corresponding to the position where the photographed target is located as the target stitching parameters. If the multiple calibration positions do not include the position where the photographed target is located, determine the target stitching parameters according to the stitching parameters corresponding to the two calibration positions closest to the position where the photographed target is located among the multiple calibration positions.
  • the determination module 1602 is specifically configured to: based on the distance between the position of the photographed target and the two calibrated positions, obtain the target stitching parameters corresponding to the position of the photographed target by interpolating the stitching parameters corresponding to the two calibrated positions.
  • the overlapping field of view includes multiple shooting targets
  • the determination module 1602 is used to determine the target stitching parameters according to the multiple groups of stitching parameters and the positions of the multiple shooting targets.
  • the determination module 1602 is specifically configured to: use the stitching parameters corresponding to the target calibration position as the target stitching parameters, and the target calibration position is the calibration position with the smallest sum of distances to the positions of the multiple shooting targets among the multiple calibration positions.
  • the determination module 1602 is specifically configured to: for each of the multiple shooting targets, obtain stitching parameters corresponding to one or more calibrated positions closest to the position of the shooting target among the multiple calibrated positions, and determine the target stitching parameters based on all stitching parameters obtained for the multiple shooting targets.
  • the output module 1604 is used to output a cropped image to a screen for display after stitching the first image and the second image using the target stitching parameters to obtain the stitched image, wherein the cropped image is cropped from the stitched image and includes all the photographed targets.
  • the overlapping area of the field of view includes multiple shooting targets
  • the determination module 1602 is used to: for each shooting target among the multiple shooting targets, determine the stitching parameters corresponding to the shooting target according to the stitching parameters corresponding to one or more calibration positions closest to the position where the shooting target is located among the multiple calibration positions.
  • the stitching module 1603 is used to: for each shooting target, use the stitching parameters corresponding to the shooting target to stitch the first image and the second image to obtain a stitched image corresponding to the shooting target.
  • the output module 1604 is used to stitch the first image and the second image for each of the multiple shooting targets using stitching parameters corresponding to the shooting target, and after obtaining the stitched image corresponding to the shooting target, output the combined image to the screen for display, where the combined image is obtained by combining multiple cropped images, and the multiple cropped images are respectively cropped from the multiple stitched images corresponding to the multiple shooting targets, and each cropped image respectively contains the corresponding shooting target in the cropped stitched image.
  • the image processing device stores stitching parameters corresponding to two adjacent cameras in the multi-camera module in multiple deployment scenarios.
  • the acquisition module 1601 is further used to acquire the deployment scenario of the multi-camera module before determining the target stitching parameters according to the multiple sets of stitching parameters corresponding to the first camera and the second camera and the position of the shooting target, and acquire the multiple sets of stitching parameters corresponding to the first camera and the second camera in the deployment scenario of the multi-camera module.
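  • the functional split of device 1600 might be sketched as follows; the class and method names are illustrative assumptions, and the modules may equally be implemented in software, hardware, or a combination of both.

```python
class ImageStitchingDevice:
    """Illustrative skeleton of the functional modules of device 1600."""

    def acquire(self, multi_camera_module):                       # acquisition module 1601
        """Return the frames captured by all cameras at the same moment."""
        return multi_camera_module.capture_synchronized()

    def determine_params(self, calib_params, target_positions):   # determination module 1602
        """Select or interpolate the target stitching parameters from the
        per-calibration-position parameter sets and the target positions."""
        raise NotImplementedError

    def stitch(self, first_image, second_image, target_params):   # stitching module 1603
        """Stitch the two images using the target stitching parameters."""
        raise NotImplementedError

    def output(self, image, screen):                               # output module 1604
        """Output a stitched, cropped or combined image for display."""
        screen.show(image)
```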
  • FIG17 is a schematic diagram of the hardware structure of an image processing device provided in an embodiment of the present application.
  • the image processing device 1700 includes a processor 1701 and a memory 1702, and the processor 1701 is connected to the memory 1702 via a bus 1703.
  • FIG17 illustrates the processor 1701 and the memory 1702 as being independent of each other.
  • the processor 1701 and the memory 1702 are integrated together.
  • the image processing device 1700 in FIG17 may be, for example, any conference terminal 201 or the video server 202 in the application scenario shown in FIG2.
  • the memory 1702 is used to store computer programs, including operating systems and program codes.
  • the memory 1702 is any of various types of storage media, such as read-only memory (ROM), random access memory (RAM), electrically erasable programmable read-only memory (EEPROM), compact disc read-only memory (CD-ROM), flash memory, optical storage, registers, optical disk storage, magnetic disk or other magnetic storage devices.
  • the processor 1701 is a general-purpose processor or a special-purpose processor.
  • the processor 1701 may be a single-core processor or a multi-core processor.
  • the processor 1701 includes at least one circuit to execute the above-mentioned image stitching method provided in the embodiment of the present application.
  • the image processing device 1700 further includes a network interface 1704, which is connected to the processor 1701 and the memory 1702 via the bus 1703.
  • the network interface 1704 enables the image processing device 1700 to communicate with other devices.
  • the processor 1701 can communicate with other devices via the network interface 1704 to obtain images captured by the camera, etc.
  • the image processing device 1700 further includes an input/output (I/O) interface 1705, which is connected to the processor 1701 and the memory 1702 via the bus 1703.
  • the processor 1701 can receive input commands or data, etc. via the I/O interface 1705.
  • the I/O interface 1705 is used for the image processing device 1700 to connect input devices, such as keyboards, mice, etc.
  • the above-mentioned network interface 1704 and the I/O interface 1705 are collectively referred to as communication interfaces.
  • the image processing device 1700 further includes a display 1706, which is connected to the processor 1701 and the memory 1702 via the bus 1703.
  • the display 1706 can be used to display the intermediate results and/or final results generated by the processor 1701 when executing the above method, for example, to display stitched images, cropped images or combined images.
  • the display 1706 is a touch display screen to provide a human-computer interaction interface.
  • the bus 1703 is any type of communication bus used to interconnect the internal devices of the image processing device 1700, for example, a system bus.
  • the embodiment of the present application takes the interconnection of the above-mentioned devices in the image processing device 1700 through the bus 1703 as an example.
  • the above-mentioned devices in the image processing device 1700 are connected to each other in a communication manner other than the bus 1703, for example, the above-mentioned devices in the image processing device 1700 are interconnected through a logical interface in the image processing device 1700.
  • the above devices may be arranged on independent chips, or at least partially or completely on the same chip. Whether to arrange each device independently on different chips or to integrate them on one or more chips often depends on the needs of product design.
  • the embodiments of the present application do not limit the specific implementation form of the above devices.
  • the image processing device 1700 shown in Fig. 17 is merely exemplary. During implementation, the image processing device 1700 may include other components, which are not listed here one by one.
  • the image processing device 1700 shown in Fig. 17 can implement image stitching by executing all or part of the steps of the method provided in the above embodiment.
  • the embodiment of the present application further provides a computer-readable storage medium, on which instructions are stored.
  • when the instructions are executed by a processor, the image stitching method shown in FIG. 3 is implemented.
  • the embodiment of the present application further provides a computer program product, including a computer program.
  • when the computer program is executed by a processor, the image stitching method shown in FIG. 3 is implemented.
  • "A and/or B" can represent three cases: A exists alone, A and B exist at the same time, or B exists alone.
  • the character "/" in this article generally indicates that the associated objects before and after are in an "or" relationship.
  • the information (including but not limited to user device information, user personal information, etc.), data (including but not limited to data used for analysis, stored data, displayed data, etc.) and signals involved in this application are all authorized by the user or fully authorized by all parties, and the collection, use and processing of the relevant data comply with the relevant laws, regulations and standards of the relevant countries and regions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Studio Devices (AREA)
  • Image Processing (AREA)

Abstract

本申请公开了一种图像拼接方法及装置,属于图像处理技术领域。获取多相机模组在同一时刻采集的多帧图像。当该多相机模组中相邻部署的第一相机和第二相机的视野重叠区域内包括拍摄目标时,根据第一相机和第二相机对应的多组拼接参数以及拍摄目标所在的位置确定目标拼接参数。采用目标拼接参数对第一相机采集的图像和第二相机采集的图像进行拼接,得到拼接图像。本申请通过离线标定第一相机和第二相机针对视野重叠区域内的不同标定位置的相机参数,得到不同标定位置对应的拼接参数,进一步根据多个标定位置对应的拼接参数确定适用于拍摄目标所在的位置的拼接参数,从而提升对拍摄目标的成像拼接效果。

Description

图像拼接方法及装置
本申请要求于2022年12月05日提交的申请号为202211550973.3、发明名称为“一种图像拼接的方法”的中国专利申请的优先权,以及于2023年02月14日提交的申请号为202310140790.2、发明名称为“图像拼接方法及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及图像处理技术领域,特别涉及一种图像拼接方法及装置。
背景技术
目前应用于会议终端的相机的视场角(field of view,FOV)通常较小,大部分相机的水平视场角在90°以内。对于大中型会议室,单相机的拍摄区域无法完全覆盖整个会议场景,可能会出现与会人位于相机两侧的视野盲区的情况。为了增大拍摄视角,目前一种常用的方法是通过部署多个相机分别采集图像,然后对该多个相机采集到的图像进行拼接,得到全景图像。但是,对多个相机采集到的图像进行拼接得到的全景图像经常存在明显的拼缝,拼接效果较差。如何提升图像拼接效果是目前亟需解决的问题。
发明内容
本申请提供了一种图像拼接方法及装置,可以提升图像拼接效果。
第一方面,提供了一种图像拼接方法,该方法可以应用于图像处理设备。该方法包括:获取多相机模组在同一时刻采集的多帧图像,该多相机模组包括多个相机,该多个相机包括相邻部署的第一相机和第二相机。第一相机和第二相机具有视野重叠区域。该多帧图像包括第一相机采集的第一图像和第二相机采集的第二图像。当第一相机和第二相机的视野重叠区域内包括拍摄目标时,根据第一相机和第二相机对应的多组拼接参数以及拍摄目标所在的位置确定目标拼接参数,该多组拼接参数分别基于第一相机和第二相机共同针对视野重叠区域内的不同标定位置标定的相机参数得到。采用目标拼接参数对第一图像和第二图像进行拼接,得到拼接图像。
本申请中,通过离线标定多相机模组中相邻部署的两个相机针对视野重叠区域内的不同标定位置的相机参数,对于每个标定位置,进一步可以根据这两个相机针对该标定位置标定得到的相机参数确定拼接参数,以得到不同标定位置分别对应的拼接参数。当拍摄目标位于两个相机的视野重叠区域内时,图像处理设备可以根据该视野重叠区域内的多个标定位置各自对应的拼接参数确定适用于拍摄目标所在的位置的拼接参数,进而使得拼接图像中拍摄目标所在的位置的拼接效果较好,从而提升拍摄目标在拼接图像中的显示效果。
可选地,第一相机和第二相机对应的每组拼接参数包括第一相机采集的图像到第二相机采集的图像的投影变换参数,或者,第一相机和第二相机对应的每组拼接参数包括第一相机采集的图像到目标平面坐标系的投影变换参数以及第二相机采集的图像到目标平面坐标系的投影变换参数。
其中,第一相机采集的图像到第二相机采集的图像的投影变换参数用于将第一相机采集的图像变换到第二相机采集的图像对应的像素坐标系下。第一相机采集的图像到目标平面坐标系的投影变换参数用于将第一相机采集的图像变换到目标平面坐标系下,第二相机采集的图像到目标平面坐标系的投影变换参数用于将第二相机采集的图像变换到目标平面坐标系下。本申请通过将第一相机采集的图像与第二相机采集的图像变换到同一平面坐标系下,可以实现对第一相机采集的图像中的像素点与第二相机采集的图像中的像素点的配对。
可选地,第一相机和第二相机的视野重叠区域内的多个标定位置中的任意两个标定位置满足以下一个或多个条件:两个标定位置到第一相机和第二相机的中心位置的距离不同;两个标定位置相对于第一相机和第二相机的排布方向的水平角度不同;两个标定位置相对于第一相机和第二相机的排布方向的垂直角度不同。第一相机和第二相机的排布方向例如可以是第一相机与第二相机的连线所在直线的方向。
可选地,第一相机和第二相机的视野重叠区域内包括一个拍摄目标,根据第一相机和第二相机对应的 多组拼接参数以及拍摄目标所在的位置确定目标拼接参数的实现方式,包括:根据多个标定位置中距离拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数确定目标拼接参数。
可选地,根据多个标定位置中距离拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数确定目标拼接参数的实现方式,包括:如果多个标定位置包括拍摄目标所在的位置,将拍摄目标所在的位置对应的拼接参数作为目标拼接参数。如果多个标定位置不包括拍摄目标所在的位置,根据多个标定位置中距离拍摄目标所在的位置最近的两个标定位置对应的拼接参数确定目标拼接参数。
本申请中,在第一相机和第二相机的视野重叠区域内只有一个拍摄目标的情况下,将拍摄目标所在的位置对应的拼接参数作为目标拼接参数,或者根据距离拍摄目标所在的位置最近的两个标定位置对应的拼接参数计算得到目标拼接参数,将该目标拼接参数用于对第一相机和第二相机采集的图像进行拼接时,可以使拍摄目标所在的位置的拼接效果较好,从而可以保证拍摄目标在拼接图像中的显示效果。
可选地,根据多个标定位置中距离拍摄目标所在的位置最近的两个标定位置对应的拼接参数确定目标拼接参数的实现方式,包括:基于拍摄目标所在的位置相对于两个标定位置的距离,采用两个标定位置对应的拼接参数插值计算得到拍摄目标所在的位置对应的目标拼接参数。
可选地,第一相机和第二相机的视野重叠区域内包括多个拍摄目标,根据第一相机和第二相机对应的多组拼接参数以及拍摄目标所在的位置确定目标拼接参数的一种实现方式,包括:根据多组拼接参数以及多个拍摄目标所在的位置确定目标拼接参数。
本实现方式可以应用于多相机模组采用导播模式的场景。
可选地,根据多组拼接参数以及多个拍摄目标所在的位置确定目标拼接参数的一种实现方式,包括:将目标标定位置对应的拼接参数作为目标拼接参数,目标标定位置为多个标定位置中到多个拍摄目标所在的位置的距离之和最小的标定位置。
本实现方式综合考虑对多个拍摄目标的图像拼接效果,选择到多个拍摄目标所在的位置的距离之和最小的标定位置对应的拼接参数作为目标拼接参数,将该目标拼接参数用于对第一相机和第二相机采集的图像进行拼接时,可以使多个拍摄目标所在的位置的整体拼接效果较好,从而使得多个拍摄目标在拼接图像中的整体显示效果较好。另外本实现方式的计算过程简单,所消耗的处理资源较少。
或者,根据多组拼接参数以及多个拍摄目标所在的位置确定目标拼接参数的另一种实现方式,包括:针对多个拍摄目标中的每个拍摄目标,获取多个标定位置中距离拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数。根据针对多个拍摄目标获取的所有拼接参数,确定目标拼接参数。
本实现方式综合考虑对多个拍摄目标的图像拼接效果,根据距离各个拍摄目标最近的一个或多个标定位置对应的拼接参数确定目标拼接参数,将该目标拼接参数用于对第一相机和第二相机采集的图像进行拼接时,可以折中多个拍摄目标所在的位置的拼接效果,尽可能使多个拍摄目标所在的位置的拼接效果趋于一致,从而使得多个拍摄目标在拼接图像中的整体显示效果较好。
可选地,在采用目标拼接参数对第一图像和第二图像进行拼接,得到拼接图像之后,输出裁剪图像到屏幕上显示,该裁剪图像从拼接图像中裁剪得到,裁剪图像包含位于第一相机和第二相机的视野重叠区域内的所有拍摄目标。
可选地,视野重叠区域内包括多个拍摄目标,根据第一相机和第二相机对应的多组拼接参数以及拍摄目标所在的位置确定目标拼接参数的另一种实现方式,包括:针对多个拍摄目标中的每个拍摄目标,根据多个标定位置中距离拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数确定拍摄目标对应的拼接参数。相应地,采用目标拼接参数对第一图像和第二图像进行拼接,得到拼接图像的实现方式,包括:针对每个拍摄目标,采用拍摄目标对应的拼接参数对第一图像和第二图像进行拼接,得到拍摄目标对应的拼接图像。
本实现方式可以应用于多相机模组采用智能均分模式或多人同框模式的场景。
可选地,在针对多个拍摄目标中的每个拍摄目标,采用拍摄目标对应的拼接参数对第一图像和第二图像进行拼接,得到拍摄目标对应的拼接图像之后,输出组合图像到屏幕上显示,该组合图像由多张裁剪图像组合得到,多张裁剪图像分别从多个拍摄目标对应的多张拼接图像中裁剪得到,且每张裁剪图像分别包含所裁剪的拼接图像中对应的拍摄目标。
本申请中,图像处理设备针对多个拍摄目标中的每个拍摄目标分别进行一次图像拼接,得到多张拼接图像,以保证每个拍摄目标在对应的拼接图像中的显示效果,使得图像处理设备输出的组合图像中的每个 拍摄目标所来自的拼接图像都能够保证该拍摄目标具有较好的显示效果,因此能够保证最终输出的组合图像的显示效果。
可选地,图像处理设备中存储有多相机模组中相邻部署的两个相机在多种部署场景下分别对应的拼接参数,在根据第一相机和第二相机对应的多组拼接参数以及拍摄目标所在的位置确定目标拼接参数之前,获取多相机模组的部署场景。获取第一相机和第二相机在多相机模组的部署场景下对应的多组拼接参数。
本申请通过提供多相机模组在多种不同部署场景下不同位置对应的拼接参数,图像处理设备可以根据多相机模组的部署场景灵活选择对应的拼接参数,使得图像拼接效果与当前部署场景更匹配,从而能够实现多相机模组不同部署场景下采集的图像都有较好的图像拼接效果。或者也可以不区分多相机模组的部署场景,所有部署场景使用同一套拼接参数。
第二方面,提供了一种图像拼接装置。所述装置包括多个功能模块,所述多个功能模块相互作用,实现上述第一方面及其各实施方式中的方法。所述多个功能模块可以基于软件、硬件或软件和硬件的结合实现,且所述多个功能模块可以基于具体实现进行任意组合或分割。
第三方面,提供了一种图像拼接装置,包括:处理器和存储器;
所述存储器,用于存储计算机程序,所述计算机程序包括程序指令;
所述处理器,用于调用所述计算机程序,实现上述第一方面及其各实施方式中的方法。
第四方面,提供了一种计算机可读存储介质,所述计算机可读存储介质上存储有指令,当所述指令被处理器执行时,实现上述第一方面及其各实施方式中的方法。
第五方面,提供了一种计算机程序产品,包括计算机程序,所述计算机程序被处理器执行时,实现上述第一方面及其各实施方式中的方法。
第六方面,提供了一种芯片,芯片包括可编程逻辑电路和/或程序指令,当芯片运行时,实现上述第一方面及其各实施方式中的方法。
附图说明
图1是本申请实施例提供的一种多相机模组的结构示意图;
图2是本申请实施例提供的一种应用场景示意图;
图3是本申请实施例提供的一种图像拼接方法的流程示意图;
图4是本申请实施例提供的一种相机标定场景示意图;
图5是本申请实施例提供的一种用户交互界面示意图;
图6是本申请实施例提供的一种拍摄目标所在的位置的示意图;
图7是本申请实施例提供的另一种拍摄目标所在的位置的示意图;
图8是本申请实施例提供的又一种拍摄目标所在的位置的示意图;
图9是本申请实施例提供的一种会议室场景示意图;
图10是本申请实施例提供的另一种会议室场景示意图;
图11是本申请实施例提供的一种多相机模组采集的图像示意图;
图12是本申请实施例提供的一种拼接图像示意图;
图13是本申请实施例提供的一种裁剪图像示意图;
图14是本申请实施例提供的另一种多相机模组采集的图像示意图;
图15是本申请实施例提供的一种组合图像示意图;
图16是本申请实施例提供的一种图像拼接装置的结构示意图;
图17是本申请实施例提供的一种图像处理设备的硬件结构示意图。
具体实施方式
为使本申请的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施方式作进一步地详细描述。
为了便于读者对本申请技术方案的理解,首先对本申请实施例涉及的部分名词进行介绍。
1、目标检测(object detection,OD):找出图像中所有感兴趣的目标(物体),确定目标的类别以及目标在图像中的位置。
2、声源定位(sound source localization,SSL):利用多个麦克风在环境不同位置点对声信号进行测量,由于声信号到达各麦克风的时间有不同程度的延迟,利用算法对测量到的声信号进行处理,由此获得声源点相对于麦克风的到达方向(包括方位角和俯仰角)和距离等。
3、声像匹配:以图像中所有目标的框位置和声源定位坐标为输入,将声源定位坐标对应到图像中对应的像素位置,就可以找到发声者,如发言人,该过程称为声像匹配。其中,目标的框位置指的是包含目标在内的框的位置,该框可以是标注的真实框或通过目标检测得到的预测框。
4、像素坐标系:像素坐标系是以相机采集到的图像的左上顶点为坐标原点的坐标系。像素坐标系的x轴(横轴)与y轴(纵轴)分别为相机采集到的图像的宽高方向。
5、世界坐标系:世界坐标系能够描述相机在现实世界中的位置,同样还能够描述相机采集到的图像中的物体在现实世界中的位置。
6、相机标定:通过一系列算法求解世界坐标系与相机采集的图像对应的像素坐标系之间的投影变换关系的过程,也可理解为确定相机参数的过程。相机参数包括相机内参和相机外参。其中相机内参是相机的固有属性,包含畸变系数,与相机焦距、像元尺寸相关。相机外参与相机在世界坐标系下的位姿有关,位姿包括位置和姿态,位置指相机在世界坐标系下的坐标,姿态指相机在世界坐标系下的朝向。
图像拼接技术是一个日益流行的研究领域,它已经成为照相绘图学、计算机视觉、图像处理和计算机图形学研究的热点。一系列空间重叠的图像,通过图像拼接技术处理,构成一个高清晰的图像,拼接后的图像具有比单个图像更高的分辨率和更大的视野。但是,由于不同相机之间存在双目视差,以及受到拼接算法误差、算力受限等因素影响,对多个相机采集到的图像进行拼接得到的全景图像经常存在明显的拼缝,过渡不自然。对于会议场景,当与会人位于图像重叠区域时,人脸或人体可能会因拼缝而显示不完整,影响图像显示效果。
为了提升图像拼接效果,目前常用的图像拼接技术主要有两种。一种是利用离线生成的固定的拼接参数在线进行拼接,主要原理是对多相机模组在出厂前或安装前进行离线标定,得到多相机模组中每个相机固定的相机参数,然后基于各个相机固定的相机参数得到固定的拼接参数,并使用固定的拼接参数对多相机模组采集的图像进行拼接。其中,相机参数包括相机内参和相机外参,相机内参包括畸变参数。在这种图像拼接技术下,由于拼接图像时使用的拼接参数始终是固定的,最终得到的拼接图像的重叠区域内有些位置的拼接效果较好,有些位置的拼接效果较差,无法兼顾图像重叠区域内多个位置的拼接效果。另一种是在线实时拼接,拼接时检测图像重叠区域的特征点并进行匹配,从而实时生成拼接的相机图像映射关系。在这种图像拼接技术下,一方面,需要相邻部署的两个相机的视野重叠区域较大且视野重叠区域内要有明显的特征点,否则特征点匹配容易出错,导致拼接效果较差;另一方面针对每个相机采集的每帧图像都需要实时进行特征点检测与匹配,图像拼接所需的算力较大。
基于此,本申请实施例提供了一种图像拼接方法,由多相机模组采集图像,该多相机模组包括多个相机,该多个相机的相对部署位置是固定的,该多个相机中相邻部署的两个相机具有视野重叠区域。图像处理设备获取多相机模组在同一时刻采集的多帧图像之后,当该多相机模组中相邻部署的两个相机的视野重叠区域内包括拍摄目标时,根据该两个相机对应的多组拼接参数以及拍摄目标所在的位置确定目标拼接参数,再采用目标拼接参数对该两个相机采集的图像进行拼接,得到拼接图像。其中,该两个相机对应的多组拼接参数分别基于该两个相机共同针对视野重叠区域内的不同标定位置标定的相机参数得到。本申请实施例通过离线标定多相机模组中相邻部署的两个相机针对视野重叠区域内的不同标定位置的相机参数,对于每个标定位置,进一步可以根据这两个相机针对该标定位置标定得到的相机参数确定拼接参数,以得到不同标定位置分别对应的拼接参数。当拍摄目标位于两个相机的视野重叠区域内时,图像处理设备可以根据该视野重叠区域内的多个标定位置各自对应的拼接参数确定适用于拍摄目标所在的位置的拼接参数,进 而使得拼接图像中拍摄目标所在的位置的拼接效果较好,从而提升拍摄目标在拼接图像中的显示效果。另外,本申请实施例中无需对图像进行特征点检测和匹配,适用于更多的拍摄场景且所需算力较低。
下面从应用场景、方法流程、软件装置、硬件装置等多个角度,对本申请提供的技术方案进行详细介绍。
下面对本申请实施例的应用场景举例说明。
本申请实施例提供的图像拼接方法可以应用于图像处理设备。该图像处理设备可以是多相机模组,或者也可以是显示设备,又或者可以是与显示设备连接的视频服务器。可选地,显示设备内置有多相机模组,或者,显示设备与外置的多相机模组相连。多相机模组包括多个相机,该多个相机的相对部署位置固定。该多个相机分别用于采集不同拍摄区域的图像,以得到多路视频流。多相机模组也可称为全景相机。图像处理设备用于对多相机模组在同一时刻采集的多帧图像进行拼接,以供显示设备显示拼接图像。视频服务器可以是一台服务器,或者由多台服务器组成的服务器集群,或者云计算平台等。
可选地,多相机模组中所有相机采集图像的时刻和频率相同。例如可以采用相机同步技术实现多相机模组中所有相机的同步拍摄。多相机模组中的任意相邻两个相机具有视野重叠区域。其中,两个相机具有视野重叠区域,是指该两个相机的拍摄区域具有重合区域。可选地,多相机模组中的多个相机可以采用直线排布方式、扇形排布方式或其它不规则排布方式等,可根据实际拍摄场景设计相应的相机排布方式。例如,图1是本申请实施例提供的一种多相机模组的结构示意图。如图1所示,多相机模组10包括3个相机,分别为相机1-3。相机1-3采用直线排布方式依次设置。其中,相机1与相机2具有视野重叠区域A12。相机2与相机3具有视野重叠区域A23。图1示出的多相机模组所包含的相机数量和相机排布方式仅用作示例性说明,不作为对本申请实施例涉及的多相机模组的限定。
可选地,多相机模组中的相机所采集图像的编码格式可以是RGB、YUV或HSV等。其中,RGB中的R(red)是红色分量,G(green)是绿色分量,B(blue)是蓝色分量。YUV中的Y是亮度分量,U和V是色彩分量。HSV中的H(hue)是色相分量,S(saturation)是饱和度分量,V(value)是明度分量。相机所采集图像的分辨率可以是4K,或者也可以是1080P、720P、540P或360P等。相机所采集图像的画面比例可以是4:3或16:9等。多相机模组中不同相机所采集的图像的编码格式、分辨率、画面比例可以相同或者也可以不同,如果不同,则后续需要先将图像转换成统一格式之后再进行图像处理。本申请实施例对相机所采集的图像的编码格式、分辨率和画面比例均不作限定。
本申请实施例提供的图像拼接方法可以应用于多种场景,包括但不限于视频会议场景、监控场景或视频直播场景。本申请实施例以图像拼接方法应用于视频会议场景为例进行说明,显示设备可以是会议终端,例如可以是大屏、电子白板、手机、平板电脑或智能可穿戴设备等具有显示功能的电子设备。
例如,图2是本申请实施例提供的一种应用场景示意图。该应用场景是视频会议场景。如图2所示,该应用场景包括会议终端201A和会议终端201B(统称为会议终端201)。会议终端201A与会议终端201B通信连接。会议终端201A内置有多相机模组(图中未示出)。
可选地,请继续参见图2,该应用场景还包括视频服务器202。多个会议终端201分别与视频服务器202连接。多个会议终端201之间通过视频服务器202实现通信,视频服务器202例如可以是多点控制单元(multi control unit,MCU)。当然,本申请实施例也不排除不同会议终端之间直接相连的情况。
在如图2所示的应用场景中,会议终端201A在获取内置的多相机模组采集的多路视频流之后,会议终端201A可以对该多路视频流中采集时刻相同的多帧图像进行拼接处理,并将拼接得到的拼接图像作为一路视频流发送给视频服务器202,再由视频服务器202发送给会议终端201B,以供会议终端201B显示。或者,会议终端201A可以将该多路视频流发送给视频服务器202,由视频服务器202对该多路视频流中采集时刻相同的多帧图像进行拼接处理,再将拼接得到的拼接图像作为一路视频流发送给会议终端201B,以供会议终端201B显示。又或者,会议终端201A可以将该多路视频流发送给视频服务器202,由视频服务器202将该多路视频流发送给会议终端201B,再由会议终端201B对该多路视频流中采集时刻相同的多帧图像进行拼接处理,并显示拼接得到的拼接图像。也就是说,本申请实施例提供的图像拼接方法可以由图像采集侧的设备(比如会议终端201A)执行,或者可以由图像转发设备(比如视频服务器202)执行,又或者可以由图像接收侧的设备(比如会议终端201B)执行,本申请实施例对方案的执行主体不做限定。
下面对本申请实施例的方法流程举例说明。
例如,图3是本申请实施例提供的一种图像拼接方法的流程示意图。该方法可以应用于图像处理设备。图像处理设备例如可以是图2示出的应用场景中的会议终端201A、视频服务器202或会议终端201B。如图3所示,该方法包括但不限于以下步骤301至步骤303。
步骤301、获取多相机模组在同一时刻采集的多帧图像。
多相机模组包括多个相机,该多个相机包括相邻部署的第一相机和第二相机。第一相机和第二相机具有视野重叠区域。多相机模组在同一时刻采集的多帧图像包括第一相机采集的第一图像和第二相机采集的第二图像。本申请以下实施例均以多相机模组中相邻部署的第一相机和第二相机为例,对第一相机和第二相机采集的图像的拼接过程进行说明,对多相机模组中其它相邻部署的相机采集的图像的拼接过程可参考对第一相机和第二相机采集的图像的拼接过程,本申请实施例不再一一赘述。
可选地,图像处理设备获取多相机模组在同一时刻采集的多帧图像之后,可以先对多帧图像分别进行预处理,以去除图像中的噪声,例如可以对图像进行中值滤波处理,再对经过预处理的图像执行后续拼接流程。
步骤302、当第一相机和第二相机的视野重叠区域内包括拍摄目标时,根据第一相机和第二相机对应的多组拼接参数以及拍摄目标所在的位置确定目标拼接参数。
第一相机和第二相机对应的多组拼接参数分别基于第一相机和第二相机共同针对视野重叠区域内的不同标定位置标定的相机参数得到,也就是说,第一相机和第二相机对应的一组拼接参数基于第一相机和第二相机针对视野重叠区域内的同一标定位置标定的相机参数得到。可选地,第一相机和第二相机的视野重叠区域内的多个标定位置中的任意两个标定位置满足以下一个或多个条件:两个标定位置到第一相机和第二相机中的中心位置的距离不同;两个标定位置相对于第一相机和第二相机的排布方向的水平角度不同。两个标定位置相对于第一相机和第二相机的排布方向的垂直角度不同。第一相机和第二相机的排布方向例如可以是第一相机与第二相机的连线所在直线的方向。
可选地,多相机模组中各个相机的相机参数可以通过离线标定得到,离线标定可以是在产品出厂前、或者产品安装时、或者产品安装后进行。例如,图4是本申请实施例提供的一种相机标定场景示意图。以不同标定位置到多相机模组的距离不同为例,如图4所示,第一相机和第二相机的视野重叠区域内设置有6个标定位置,包括标定位置A-F。标定位置A-F到多相机模组的距离分别为1米、3米、5米、8米、10米和20米。则在多相机模组应用之前,先针对标定位置A-F分别对第一相机和第二相机进行相机标定,得到第一相机分别针对标定位置A-F标定的6组相机参数以及第二相机分别针对标定位置A-F标定的6组相机参数。相机采用针对某个标定位置标定的相机参数成像时,该标定位置的成像效果相较于其它位置的成像效果更好,例如该标定位置的重投影误差更小。
本申请实施例对相机标定的实现方式不做限定。一种实现方式,可以将棋盘格作为标定参照物进行标定,具体是将棋盘格放置在拍摄场景中的不同位置,分别拍摄多张包含棋盘格的图像,然后检测棋盘格角点位置,通过标定算法求解得到对应的相机参数。此处标定算法可以采用张正友标定算法,也可以采用其他算法。这种实现方式下,可以使用更多将棋盘格放置在某个标定位置时相机拍摄的图像进行相机标定,相应地,可以得到该相机针对该标定位置标定的相机参数。另一种实现方式,不需要棋盘格作为标定参照物,例如可以采用主动视觉相机标定法,利用已知相机的某些运动信息对相机进行标定,或者也可以采用自标定算法,包括但不限于Hartley的QR分解法、Triggs的绝对二次曲面法、Pollefeys的模约束法。
可选地,在获取第一相机针对第一相机与第二相机的视野重叠区域内的多个标定位置标定的相机参数以及第二相机针对该多个标定位置标定的相机参数之后,可以进一步计算得到该多个标定位置分别对应的拼接参数。
可选地,第一相机和第二相机对应的每组拼接参数包括第一相机采集的图像到第二相机采集的图像的投影变换参数,该投影变换参数用于将第一相机采集的图像变换到第二相机采集的图像对应的像素坐标系下。本申请实施例中将第二相机采集的图像作为基准图像,并将第一相机采集的图像作为待配准图像。第一相机采集的图像到第二相机采集的图像的投影变换参数可以采用像素坐标映射表来表示,像素坐标映射表包括第一相机采集的图像中的多个像素坐标与第二相机采集的图像中的多个像素坐标的对应关系,这里的对应关系可以是第一相机采集的图像中的一个或多个像素坐标与第二相机采集的图像中的一个像素坐标的对应关系,比如第一相机采集的图像中的像素坐标(x1,y1)与第二相机采集的图像中的像素坐标(x2, y2)对应,则在将第一相机采集的图像变换到第二相机采集的图像对应的像素坐标系时,可以将像素坐标(x1,y1)处的像素值对应设置在像素坐标(x2,y2)处,又比如第一相机采集的图像中的像素坐标(x11,y11)和(x12,y12)与第二相机采集的图像中的像素坐标(x2,y2)对应,则在将第一相机采集的图像变换到第二相机采集的图像对应的像素坐标系时,可以对像素坐标(x11,y11)处的像素值与像素坐标(x12,y12)处的像素值进行插值计算或取均值计算等,并将计算得到的像素值对应设置在像素坐标(x2,y2)处。或者,第一相机采集的图像到第二相机采集的图像的投影变换参数也可以采用图像变换矩阵表示,则图像处理设备可以将第一相机采集的图像的像素坐标与该图像变换矩阵相乘,以得到第一相机采集的图像在第二相机采集的图像对应的像素坐标系下的像素坐标。
本申请实施例中,针对第一相机与第二相机的视野重叠区域内的一个标定位置,可以采用第一相机针对该标定位置标定的相机参数以及第二相机针对该标定位置标定的相机参数计算得到第一相机采集的图像到第二相机采集的图像的投影变换参数。例如可以采用第一相机针对该标定位置标定的相机参数对第一相机采集的图像进行柱面投影变换或球面投影变换,以及采用第二相机针对该标定位置标定的相机参数对第二相机采集的图像进行柱面投影变换或球面投影变换,以使第一相机采集的图像和第二相机采集的图像投影变换到同一柱面或同一球面上,从而确定第一相机采集的图像与第二相机采集的图像的重叠区域内的相同像素点,再根据重叠区域内的多个像素点分别在第一相机采集的图像中的像素坐标以及在第二相机采集的图像中的像素坐标生成像素坐标映射表,或者计算第一相机采集的图像到第二相机采集的图像的图像变换矩阵。
或者,第一相机和第二相机对应的每组拼接参数包括第一相机采集的图像到目标平面坐标系的投影变换参数以及第二相机采集的图像到目标平面坐标系的投影变换参数。目标平面坐标系不同于第一相机采集的图像对应的像素坐标系以及第二相机采集的图像对应的像素坐标系。第一相机采集的图像到目标平面坐标系的投影变换参数和第二相机采集的图像到目标平面坐标系的投影变换参数分别可以采用像素坐标映射表来表示。比如第一相机采集的图像到目标平面坐标系的投影变换参数可以采用像素坐标映射表A来表示,像素坐标映射表A包括第一相机采集的图像中的多个像素坐标与目标平面坐标系中的多个坐标的对应关系。第二相机采集的图像到目标平面坐标系的投影变换参数可以采用像素坐标映射表B来表示,像素坐标映射表B包括第二相机采集的图像中的多个像素坐标与目标平面坐标系中的多个坐标的对应关系。或者,第一相机采集的图像到目标平面坐标系的投影变换参数和第二相机采集的图像到目标平面坐标系的投影变换参数分别可以采用图像变换矩阵表示。比如第一相机采集的图像到目标平面坐标系的投影变换参数可以采用图像变换矩阵A表示,第二相机采集的图像到目标平面坐标系的投影变换参数可以采用图像变换矩阵B表示。图像处理设备可以将第一相机采集的图像的像素坐标与图像变换矩阵A相乘,以得到第一相机采集的图像在目标平面坐标系下的像素坐标。图像处理设备可以将第二相机采集的图像的像素坐标与图像变换矩阵B相乘,以得到第二相机采集的图像在目标平面坐标系下的像素坐标。
针对第一相机与第二相机的视野重叠区域内的一个标定位置,可以采用第一相机针对该标定位置标定的相机参数计算得到第一相机采集的图像到目标平面坐标系的投影变换参数。例如可以采用第一相机针对该标定位置标定的相机参数对第一相机采集的图像进行柱面投影变换或球面投影变换,再将柱面图像或球面图像投影到目标平面坐标系所在的平面上,根据多个像素点分别在第一相机采集的图像中的像素坐标以及在目标平面坐标系中的坐标生成像素坐标映射表A,或者计算第一相机采集的图像到目标平面坐标系的投影变换参数。同理,可以采用第二相机针对标定位置标定的相机参数计算得到第二相机采集的图像到目标平面坐标系的投影变换参数,具体计算方式可参考对第一相机采集的图像到目标平面坐标系的投影变换参数的计算方式,本申请实施例在此不再赘述。
本申请实施例中,可以基于像素坐标映射表实现对第一相机采集的图像中的像素点与第二相机采集的图像中的像素点的配对,或者可以通过图像变换矩阵将第一相机采集的图像与第二相机采集的图像变换到同一平面坐标系下,实现对第一相机采集的图像中的像素点与第二相机采集的图像中的像素点的配对。
可选地，第一相机和第二相机对应的每组拼接参数还包括图像融合参数，该图像融合参数用于对第一相机采集的图像与第二相机采集的图像进行图像融合处理。图像融合参数包括但不限于第一相机采集的图像与第二相机采集的图像的像素值权重、第一相机采集的图像的曝光权重、第二相机采集的图像的曝光权重、第一相机采集的图像的白平衡权重或第二相机采集的图像的白平衡权重等。其中，像素值权重用于计算第一相机采集的图像和第二相机采集的图像的重叠区域内的像素点的像素值，像素值权重具体包括第一相机采集的图像的像素值在融合后的图像中的占比以及第二相机采集的图像的像素值在融合后的图像中的占比。曝光权重用于调整图像亮度，第一相机采集的图像的曝光权重用于调整第一相机采集的图像的亮度，第二相机采集的图像的曝光权重用于调整第二相机采集的图像的亮度，通过设置曝光权重，可以使第一相机采集的图像与第二相机采集的图像的亮度趋于一致。白平衡权重用于调整图像色彩度，第一相机采集的图像的白平衡权重用于调整第一相机采集的图像的色彩度，第二相机采集的图像的白平衡权重用于调整第二相机采集的图像的色彩度，通过设置白平衡权重，可以使第一相机采集的图像与第二相机采集的图像的色彩度趋于一致。图像融合参数可以是人工设置且可调整的。
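作为示意，一组拼接参数（投影变换参数与图像融合参数）可以采用如下Python数据结构组织，其中字段划分与默认取值均为示例性假设，并非唯一实现方式：

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class StitchParams:
    """一组拼接参数的示例性数据结构。"""
    transform: np.ndarray            # 投影变换参数，如3x3图像变换矩阵或像素坐标映射表
    pixel_weight_cam1: float = 0.5   # 重叠区域内第一相机图像像素值的融合权重
    pixel_weight_cam2: float = 0.5   # 重叠区域内第二相机图像像素值的融合权重
    exposure_gain_cam1: float = 1.0  # 第一相机图像的曝光（亮度）调整系数
    exposure_gain_cam2: float = 1.0  # 第二相机图像的曝光（亮度）调整系数
    wb_gain_cam1: float = 1.0        # 第一相机图像的白平衡（色彩度）调整系数
    wb_gain_cam2: float = 1.0        # 第二相机图像的白平衡（色彩度）调整系数
```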
本申请实施例中,使用第一相机与第二相机的视野重叠区域内的某个标定位置对应的拼接参数对第一相机采集的图像和第二相机采集的图像进行拼接时,该标定位置的拼接效果相较于其它位置的拼接效果更好。
可选地,图像处理设备中可以存储有多相机模组中相邻部署的两个相机对应的多组拼接参数与该两个相机的视野重叠区域内多个标定位置的对应关系。或者,图像处理设备中可以存储有多相机模组中相邻部署的两个相机各自的多组相机参数与该两个相机的视野重叠区域内多个标定位置的对应关系,在需要使用该两个相机对应的拼接参数时,图像处理设备根据各个标定位置分别对应的该两个相机的相机参数计算对应的拼接参数。
本申请实施例还可以针对不同部署场景分别生成多个不同标定位置对应的拼接参数。以视频会议场景为例,不同部署场景即不同会议室场景,包括会议室类型不同和/或会议室大小不同。会议室类型可以分为开放式会议室、半开放式会议室和封闭式会议室这三种,或者可以分为室内会议室和室外会议室这两种。会议室大小包括会议室的长(距离多相机模组的最大深度距离)、宽(左右宽度)、高(上下高度)。例如,本申请实施例可以针对开放式会议室、半开放式会议室、封闭式会议室这3种类型,以及3种大小的共9种会议室场景中的每种场景分别生成不同位置下的拼接参数。比如对于封闭式、大型会议室,可以生成距离多相机模组1米、3米、5米、8米、10米和20米这6个位置对应的拼接参数。
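以上按部署场景和标定位置组织的拼接参数，可以采用如下示意性的Python查找表存储与读取；其中的场景键、距离取值与参数占位符均为示例性假设：

```python
# 按（会议室类型，会议室大小）与标定距离（米）组织的拼接参数查找表（示例）
stitch_param_table = {
    ("封闭式", "大"): {1: {"name": "T_1m"}, 3: {"name": "T_3m"}, 5: {"name": "T_5m"},
                       8: {"name": "T_8m"}, 10: {"name": "T_10m"}, 20: {"name": "T_20m"}},
    # 其余会议室场景对应的拼接参数按同样方式组织
}

def get_candidate_params(room_type, room_size):
    """根据识别或用户选择得到的部署场景，取出该场景下各标定位置对应的多组拼接参数（示例）。"""
    return stitch_param_table[(room_type, room_size)]
```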
一种可能实现方式,图像处理设备中存储有多相机模组中相邻部署的两个相机在多种部署场景下分别对应的拼接参数,则图像处理设备在根据第一相机和第二相机对应的多组拼接参数以及拍摄目标所在的位置确定目标拼接参数之前,需要先获取多相机模组的部署场景,再获取第一相机和第二相机在该多相机模组的部署场景下对应的多组拼接参数。
这种实现方式下,图像处理设备首先需要获取多相机模组的部署场景。可选地,多相机模组的部署场景可以由图像处理设备根据多相机模组采集的图像确定,或者可以由其它传感器识别,又或者可以通过用户交互界面输入或选择。
以多相机模组的部署场景为视频会议场景为例,图像处理设备可以根据多相机模组采集的图像识别会议室类型以及估计会议室大小。一种具体实现方式,图像处理设备可以对多相机模组中一个或者多个相机采集的图像采用分类算法以识别当前会议室类型,如果采用多个相机采集的图像进行分类,图像处理设备可以对每个相机采集的图像分别进行分类,并对分类结果加权平均得到最终分类结果,或者图像处理设备也可以将多个相机采集的图像输入一个分类模型,然后得到分类模型输出的分类结果。另一种具体实现方式,图像处理设备可以根据单目图像进行三维设计(3D layout)空间尺寸估计,例如通过深度学习模型对输入的单目图像进行空间尺寸估计。或者,可以通过毫米波雷达、超声波雷达或多麦克风阵列等进行测距或生成三维点云图像,计算得到会议室大小,再通过分类模型确定会议室场景。上述图像识别方案和传感器识别方案也可以结合使用,例如可以基于图像确定会议室类型,以及通过毫米波雷达计算会议室大小,进而根据会议室类型以及会议室大小确定会议室场景。
可选地,图像处理设备可以通过用户交互界面显示多种部署场景的选项以供用户选择。例如,图5是本申请实施例提供的一种用户交互界面示意图。如图5所示,该用户交互界面为部署场景选择界面,该用户交互界面包括三种会议室类型选项和三种会议室大小选项,三种会议室类型选项分别为开放式、半开放式和封闭式,三种会议室大小选项分别为大、中、小。可选地,会议室大小选项还可以包括大、中、小分别对应的具体会议室尺寸,例如长宽高,图中未一一展示。用户可以在该用户交互界面上选择会议室类型和会议室大小,进而图像处理设备根据用户选择结果确定多相机模组所部署的会议室场景。
本申请实施例通过提供多相机模组在多种不同部署场景下不同位置对应的拼接参数，图像处理设备可以根据多相机模组的部署场景灵活选择对应的拼接参数，使得图像拼接效果与当前部署场景更匹配，从而能够实现多相机模组在不同部署场景下采集的图像都有较好的图像拼接效果。或者也可以不区分多相机模组的部署场景，所有部署场景使用同一套拼接参数，本申请实施例对此不做限定。
可选地,第一相机和第二相机的视野重叠区域内可能有一个拍摄目标,也可能有多个拍摄目标。图像处理设备可以基于图像、声音或传感器估计拍摄目标所在的位置。位置估计可以采用导播技术实现,例如拍摄目标为人,可以通过人脸或人体跟踪确定人的位置,或者通过声源定位确定人的位置,又或者通过毫米波传感器或超声波传感器采用活体检测算法确定人的位置,又或者将通过运动检测算法确定的运动物体的位置作为人的位置,又或者结合使用至少两种上述位置估计方案确定人的位置。或者,拍摄目标所在的位置也可以通过手动输入得到,例如用户可以通过用户交互界面输入位置坐标。本申请实施例对图像处理设备获取拍摄目标所在的位置的方式不做限定。
可选地,拍摄目标所在的位置可以是一维、二维或三维的,这里所确定的拍摄目标所在的位置的维度可以与相机标定时所选用的标定位置的维度一致。比如,位于相邻部署的两个相机的视野重叠区域内的拍摄目标所在的位置的一维表示可以是拍摄目标到该两个相机的中心位置的距离,或者拍摄目标相对于该两个相机的排布方向的水平(左右)角度,或者拍摄目标相对于该两个相机的排布方向的垂直(上下)角度。位于相邻部署的两个相机的视野重叠区域内的拍摄目标所在的位置的二维表示可以是拍摄目标到该两个相机的中心位置的距离以及拍摄目标相对于该两个相机的排布方向的水平角度,或者拍摄目标到该两个相机的中心位置的距离以及拍摄目标相对于该两个相机的排布方向的垂直角度,或者拍摄目标相对于该两个相机的排布方向的水平角度以及拍摄目标相对于该两个相机的排布方向的垂直角度。拍摄目标所在的位置的三维表示可以是拍摄目标到该两个相机的中心位置的距离,拍摄目标相对于该两个相机的排布方向的水平角度,以及拍摄目标相对于该两个相机的排布方向的垂直角度。例如,图6至图8分别是本申请实施例提供的一种拍摄目标所在的位置的示意图。如图6所示,拍摄目标到多相机模组的距离为3米。如图7所示,拍摄目标到多相机模组的距离为3米,且拍摄目标相对于多相机模组的水平角度为30°,其中0°表示正前方,水平正角度表示在多相机模组右边,水平负角度表示在多相机模组左边,拍摄目标相对于多相机模组的水平角度为30°表示拍摄目标在多相机模组的正前方右侧30°。如图8所示,拍摄目标到多相机模组的距离为3米,拍摄目标相对于多相机模组的水平角度为30°,且拍摄目标相对于多相机模组的垂直角度为20°,其中垂直正角度表示在多相机模组上方,垂直负角度表示在多相机模组下方,拍摄目标相对于多相机模组的垂直角度为20°表示拍摄目标在多相机模组的正上方20°。在针对图6至图8的描述中,拍摄目标到多相机模组的距离可以是拍摄目标到构成拍摄目标所在视野重叠区域的两个相机的中心位置的距离,拍摄目标相对于多相机模组的水平角度或垂直角度可以是拍摄目标到构成拍摄目标所在视野重叠区域的两个相机的排布方向的水平角度或垂直角度,图中未具体示出多相机模组中的相机,仅以多相机模组统一表示。其中,图7和图8中的x轴方向为水平方向,y轴方向为深度方向,图8中的z轴方向为高度方向。
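针对上述拍摄目标位置的三维表示（距离、水平角度、垂直角度），下面给出一段示意性的Python代码，假设已得到拍摄目标相对于两相机中心位置的坐标(x, y, z)，坐标轴定义与图7、图8一致；计算方式仅为一种可能实现的示意：

```python
import math

def target_position_3d(x, y, z):
    """由拍摄目标相对于两相机中心位置的坐标计算（距离, 水平角度, 垂直角度）（示例）。

    x: 水平方向坐标（右为正），y: 深度方向坐标（正前方），z: 高度方向坐标（上为正），单位为米。
    返回的角度单位为度，水平角度右侧为正、左侧为负，垂直角度上方为正、下方为负。
    """
    distance = math.sqrt(x * x + y * y + z * z)
    horizontal_deg = math.degrees(math.atan2(x, y))
    vertical_deg = math.degrees(math.atan2(z, math.hypot(x, y)))
    return distance, horizontal_deg, vertical_deg

# 例如：拍摄目标位于模组正前方偏右且略高于模组的位置
print(target_position_3d(1.5, 2.6, 1.0))
```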
本申请以下实施例针对视野重叠区域内包括一个拍摄目标的情况和包括多个拍摄目标的情况,对步骤302的实现方式分别进行说明。可选地,拍摄目标可以是人,或者也可以是动物、汽车或工厂工件等任意物体,本申请实施例对拍摄目标的类型不做限定。
第一种可能情况,第一相机和第二相机的视野重叠区域内包括一个拍摄目标,步骤302的实现方式可以是,图像处理设备根据第一相机和第二相机的视野重叠区域内的多个标定位置中距离拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数确定目标拼接参数。可选地,如果该多个标定位置包括拍摄目标所在的位置,图像处理设备可以将拍摄目标所在的位置对应的拼接参数作为目标拼接参数。如果该多个标定位置不包括拍摄目标所在的位置,图像处理设备可以根据该多个标定位置中距离拍摄目标所在的位置最近的两个标定位置对应的拼接参数确定目标拼接参数。
可选地，图像处理设备根据多个标定位置中距离拍摄目标所在的位置最近的两个标定位置对应的拼接参数确定目标拼接参数的一种实现方式，包括：图像处理设备基于拍摄目标所在的位置相对于两个标定位置的距离，采用两个标定位置对应的拼接参数插值计算得到拍摄目标所在的位置对应的目标拼接参数。比如，拍摄目标距离多相机模组2.2米，离拍摄目标最近的两个标定位置分别距离多相机模组2米和3米，那么可以采用距离多相机模组2米的标定位置对应的拼接参数以及距离多相机模组3米的标定位置对应的拼接参数插值计算得到距离多相机模组2.2米处的拼接参数，并将该拼接参数作为目标拼接参数。可选地，这里采用的插值算法可以是线性插值算法或非线性插值算法。以采用线性插值算法为例，假设距离多相机模组2米的标定位置对应的拼接参数为T1，距离多相机模组3米的标定位置对应的拼接参数为T2，则采用线性插值算法计算得到的距离多相机模组2.2米处的拼接参数为(0.8*T1+0.2*T2)。
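针对上述线性插值计算，下面给出一段示意性的Python代码（基于NumPy），其中拼接参数以numpy数组（如3x3图像变换矩阵）示意，实际也可对像素坐标映射表逐元素插值；代码与正文中"0.8*T1+0.2*T2"的示例一致：

```python
import numpy as np

def interpolate_params(d_target, d1, params1, d2, params2):
    """在距离拍摄目标最近的两个标定位置之间线性插值拼接参数（示例）。

    d_target:          拍摄目标到多相机模组的距离
    d1, d2:            两个最近标定位置到多相机模组的距离
    params1, params2:  两个标定位置对应的拼接参数（numpy数组）
    """
    w2 = (d_target - d1) / (d2 - d1)
    w1 = 1.0 - w2
    return w1 * params1 + w2 * params2

# 例如：目标距离2.2米，最近标定位置距离2米与3米，结果即为0.8*T1+0.2*T2
T1, T2 = np.eye(3), 2 * np.eye(3)
print(interpolate_params(2.2, 2.0, T1, 3.0, T2))
```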
或者,图像处理设备也可以将距离拍摄目标所在的位置最近的两个标定位置对应的拼接参数的均值作为拍摄目标所在的位置对应的目标拼接参数。
本申请实施例中,在第一相机和第二相机的视野重叠区域内只有一个拍摄目标的情况下,将拍摄目标所在的位置对应的拼接参数作为目标拼接参数,或者根据距离拍摄目标所在的位置最近的两个标定位置对应的拼接参数计算得到目标拼接参数,将该目标拼接参数用于对第一相机和第二相机采集的图像进行拼接时,可以使拍摄目标所在的位置的拼接效果较好,从而可以保证拍摄目标在拼接图像中的显示效果。
第二种可能情况,第一相机和第二相机的视野重叠区域内包括多个拍摄目标。这种可能情况下,图像处理设备可以针对多个拍摄目标确定一个拼接参数,或者也可以针对多个拍摄目标中的每个拍摄目标分别确定对应的拼接参数,本申请以下实施例对这两种实现方式分别进行说明。
第一种实现方式,图像处理设备可以针对多个拍摄目标确定一个拼接参数。该实现方式可以应用于多相机模组采用导播模式的场景。可选地,导播模式包括但不限于智能取景(auto framing)模式、发言人特写模式、演讲者跟踪模式或对话模式。其中发言人特写模式和演讲者跟踪模式是指对会议中发言的个人进行框选和跟踪。这些模式中都需要进行人脸检测,其中发言人特写模式还需要进行声源定位。由于声源定位存在误差,这就要求声源位置和图像进行匹配。根据声源定位后的位置,在图像中该位置寻找人脸,如果成功找到人脸,则导播特写该人脸,这也就是声像匹配的过程。
在第一种实现方式下,步骤302的具体实现方式可以是,图像处理设备根据第一相机和第二相机对应的多组拼接参数以及多个拍摄目标所在的位置确定目标拼接参数。
可选地,图像处理设备根据第一相机和第二相机对应的多组拼接参数以及多个拍摄目标所在的位置确定目标拼接参数的一种实现方式,包括:图像处理设备将目标标定位置对应的拼接参数作为目标拼接参数,目标标定位置为多个标定位置中到该多个拍摄目标所在的位置的距离之和最小的标定位置。比如,第一相机和第二相机的视野重叠区域内有3个拍摄目标,该3个拍摄目标到多相机模组的距离分别为2米,3米和5米,则距离多相机模组3米的标定位置到这3个拍摄目标所在的位置的距离之和为1+0+2=3米,距离多相机模组4米的标定位置到这3个拍摄目标所在的位置的距离之和为2+1+1=4米,因此可以将距离多相机模组3米的标定位置对应的拼接参数作为目标拼接参数。
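针对上述选取目标标定位置的过程，下面给出一段示意性的Python代码，仅按一维距离（拍摄目标及标定位置到多相机模组的距离）衡量，与正文中的数值示例一致；函数名与输入形式均为示例性假设：

```python
def pick_target_calib_position(calib_distances, target_distances):
    """在多个标定位置中选出到多个拍摄目标距离之和最小的目标标定位置（示例）。

    calib_distances:  各标定位置到多相机模组的距离列表
    target_distances: 各拍摄目标到多相机模组的距离列表
    """
    return min(calib_distances,
               key=lambda c: sum(abs(c - t) for t in target_distances))

# 例如：3个拍摄目标距离分别为2米、3米、5米，3米处标定位置的距离之和为3米，小于4米处的4米
print(pick_target_calib_position([3, 4], [2, 3, 5]))  # 输出 3
```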
本实现方式综合考虑对多个拍摄目标的图像拼接效果,选择到多个拍摄目标所在的位置的距离之和最小的标定位置对应的拼接参数作为目标拼接参数,将该目标拼接参数用于对第一相机和第二相机采集的图像进行拼接时,可以使多个拍摄目标所在的位置的整体拼接效果较好,从而使得多个拍摄目标在拼接图像中的整体显示效果较好。另外本实现方式的计算过程简单,所消耗的处理资源较少。
可选地,图像处理设备根据第一相机和第二相机对应的多组拼接参数以及多个拍摄目标所在的位置确定目标拼接参数的另一种实现方式,包括:图像处理设备针对该多个拍摄目标中的每个拍摄目标,获取多个标定位置中距离该拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数。图像处理设备根据针对该多个拍摄目标获取的所有拼接参数,确定目标拼接参数。其中,如果多个标定位置包括拍摄目标所在的位置,则距离拍摄目标所在的位置最近的一个或多个标定位置可以是该拍摄目标所在的位置。如果多个标定位置不包括拍摄目标所在的位置,则距离拍摄目标所在的位置最近的一个或多个标定位置可以是距离拍摄目标所在的位置最近的两个标定位置。
可选地,图像处理设备根据针对该多个拍摄目标获取的所有拼接参数,确定目标拼接参数的一种实现过程,包括:图像处理设备先确定每个拍摄目标所在的位置分别对应的拼接参数,再根据该多个拍摄目标所在的位置分别对应的拼接参数,确定目标拼接参数。其中,图像处理设备确定单个拍摄目标所在的位置对应的拼接参数的实现过程可参考上述第一种可能情况中的相关描述,本申请实施例在此不再赘述。
可选地,图像处理设备可以将该多个拍摄目标所在的位置分别对应的拼接参数的平均值作为目标拼接参数。比如,第一相机和第二相机的视野重叠区域内有2个拍摄目标,其中一个拍摄目标所在的位置对应的拼接参数为T1,另一个拍摄目标所在的位置对应的拼接参数为T2,则目标拼接参数可以是(T1+T2)/2。或者,图像处理设备也可以将该多个拍摄目标所在的位置分别对应的拼接参数的加权平均值作为目标拼接参数,各个拍摄目标所在的位置对应的拼接参数的权重占比与到该多个拍摄目标所在的位置的聚类中心的距离正相关。
或者,图像处理设备根据针对该多个拍摄目标获取的所有拼接参数,确定目标拼接参数的另一种实现过程,包括:图像处理设备将针对该多个拍摄目标获取的所有拼接参数的平均值作为目标拼接参数。比如,第一相机和第二相机的视野重叠区域内有2个拍摄目标,距离其中一个拍摄目标所在的位置最近的2个标定位置对应的拼接参数分别为T1和T2,距离其中另一个拍摄目标所在的位置最近的2个标定位置对应的拼接参数分别为T3和T4,则目标拼接参数可以是(T1+T2+T3+T4)/4。
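针对上述对多个拼接参数取平均或加权平均的计算，下面给出一段示意性的Python代码（基于NumPy），其中拼接参数同样以numpy数组示意，权重的具体取值方式（如与到聚类中心的距离正相关）由调用方确定：

```python
import numpy as np

def average_params(param_list, weights=None):
    """对多组拼接参数取（加权）平均作为目标拼接参数（示例）。

    param_list: 拼接参数列表，如[T1, T2, T3, T4]，每个元素为同形状的numpy数组
    weights:    与param_list等长的权重列表；为None时取简单平均
    """
    params = np.stack(param_list, axis=0)
    if weights is None:
        return params.mean(axis=0)
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()
    return np.tensordot(w, params, axes=1)
```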
本实现方式综合考虑对多个拍摄目标的图像拼接效果,根据距离各个拍摄目标最近的一个或多个标定位置对应的拼接参数确定目标拼接参数,将该目标拼接参数用于对第一相机和第二相机采集的图像进行拼接时,可以折中多个拍摄目标所在的位置的拼接效果,尽可能使多个拍摄目标所在的位置的拼接效果趋于一致,从而使得多个拍摄目标在拼接图像中的整体显示效果较好。
第二种实现方式,图像处理设备可以针对多个拍摄目标中的每个拍摄目标分别确定对应的拼接参数。该实现方式可以应用于多相机模组采用智能均分模式或多人同框模式的场景。
在第二种实现方式下,步骤302的具体实现方式可以是,针对多个拍摄目标中的每个拍摄目标,图像处理设备根据第一相机和第二相机的视野重叠区域内的多个标定位置中距离该拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数确定该拍摄目标对应的拼接参数。也即是,这种可能情况下,图像处理设备针对每个拍摄目标分别确定该拍摄目标所在的位置对应的拼接参数,图像处理设备确定单个拍摄目标所在的位置对应的拼接参数的实现过程可参考上述第一种可能情况中的相关描述,本申请实施例在此不再赘述。
智能均分模式或多人同框模式是指,将多人特写画面组合后在同一个屏幕上显示。这种可能情况下,目标拼接参数包括多个拍摄目标中的每个拍摄目标对应的拼接参数。
步骤303、采用目标拼接参数对第一相机采集的第一图像和第二相机采集的第二图像进行拼接,得到拼接图像。
可选地,目标拼接参数包括第一图像到第二图像的投影变换参数,或者,目标拼接参数包括第一图像到目标平面坐标系的投影变换参数以及第二图像到目标平面坐标系的投影变换参数。采用目标拼接参数对第一相机采集的第一图像和第二相机采集的第二图像进行拼接,可以是采用第一图像到第二图像的投影变换参数将第一图像变换到第二图像对应的像素坐标系下,再对第二图像与经过变换的第一图像进行图像融合,或者可以是采用第一图像到目标平面坐标系的投影变换参数将第一图像变换到目标平面坐标系所在的平面上,以及采用第二图像到目标平面坐标系的投影变换参数将第二图像变换到目标平面坐标系所在的平面上,再对经过变换的第一图像和经过变换的第二图像进行图像融合。对第一图像和第二图像进行图像融合,实际上是对第一图像和第二图像的重叠区域进行图像融合。可选地,目标拼接参数还包括图像融合参数,图像处理设备可以基于图像融合参数对第一图像和第二图像的重叠区域进行图像融合,比如基于曝光权重调整第一图像和/或第二图像的亮度,基于白平衡权重调整第一图像和/或第二图像的色彩度,基于像素值权重计算第一图像和第二图像的重叠区域内像素点的像素值。对第一图像和第二图像的重叠区域进行图像融合,可以包括,根据第一图像和第二图像的重叠区域内同一像素点的像素值计算得到目标像素值,并将该目标像素值作为融合得到的图像中该像素点的像素值。在图像融合参数包括第一图像与第二图像的像素值权重的情况下,可以采用该像素值权重计算目标像素值。或者,在图像融合参数不包括第一图像与第二图像的像素值权重的情况下,也可以采用加权平均法计算两帧图像的重叠区域中每个像素点的目标像素值。假设第一图像和第二图像的重叠区域存在同一像素点a,像素点a在第一图像中的像素值为p,像素点a在第二图像中的像素值为q,则像素点a在由第一图像和第二图像融合得到的最终图像中的目标像素值可以是(p*0.5+q*0.5)。
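针对上述对重叠区域的图像融合过程，下面给出一段示意性的Python代码（基于NumPy），假设两帧图像已经变换到同一坐标系且重叠区域尺寸一致，其中像素值权重、曝光/白平衡调整系数的取值均为示例：

```python
import numpy as np

def blend_overlap(img1_overlap, img2_overlap, w1=0.5, w2=0.5, gain1=1.0, gain2=1.0):
    """对两帧图像的重叠区域做加权融合（示例）。

    img1_overlap, img2_overlap: 两帧图像重叠区域的像素数组（尺寸相同）
    w1, w2:        像素值权重，例如正文示例中的0.5与0.5
    gain1, gain2:  曝光/白平衡调整系数的简化表示，用于使两图亮度、色彩趋于一致
    """
    a = img1_overlap.astype(np.float32) * gain1
    b = img2_overlap.astype(np.float32) * gain2
    fused = w1 * a + w2 * b
    return np.clip(fused, 0, 255).astype(np.uint8)
```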
结合上述步骤302中的第一种可能情况或第二种可能情况中的第一种实现方式,图像处理设备针对所有拍摄目标进行一次图像拼接,得到一张拼接图像。在图像处理设备得到拼接图像之后,图像处理设备可以输出裁剪图像到屏幕上显示,该裁剪图像从拼接图像中裁剪得到,该裁剪图像包含位于第一相机和第二相机的视野重叠区域内的所有拍摄目标。比如,图像处理设备可以对该拼接图像进行裁剪,以得到包含所有拍摄目标的裁剪图像,并输出裁剪图像到屏幕上显示。
可选地,对于发言人特写模式,图像处理设备可以先拼接得到全景图,然后对全景图裁剪得到发言人的特写图像。为了优化性能、降低资源消耗,图像处理设备也可以直接根据发言人的位置,仅对发言人所在区域的图像进行拼接,得到拼接后的发言人特写图像。
例如,图9和图10分别是本申请实施例提供的一种会议室场景示意图。如图9或图10所示,多相机模组部署在会议室前方,会议室中有A、B、C、D四个不同位置。其中多相机模组的结构可以如图1所示,A位置和C位置位于相机1和相机2的视野重叠区域A12内,B位置和D位置位于相机2和相机3的视野重叠区域A23内。
如图9所示，假设导播场景的发言人初始时在A位置，则多相机模组中的3个相机采集的图像可以如图11所示，其中图像1a是相机1采集的图像，图像2a是相机2采集的图像，图像3a是相机3采集的图像。图像2a是指位于矩形框内的区域，图中为了示意，将相机2对发言人未拍全的部分也进行了体现，位于矩形框外的部分实际不属于图像2a。由于导播的发言人在A位置，此时图像处理设备可以采用A位置对应的拼接参数对图像1a和图像2a进行拼接，得到如图12所示的拼接图像。如果切换到发言人特写模式，则图像处理设备可以从图12中裁剪出包含发言人的位置框进行放大显示，得到的裁剪图像可以如图13所示。当导播的发言人从A位置移动到B位置时，图像处理设备可以采用B位置对应的拼接参数对相机2采集的图像和相机3采集的图像进行拼接。同理，当导播的发言人在C位置或D位置时，图像处理设备均可以采用发言人所在位置对应的拼接参数对相应相机采集的图像进行拼接。
如图10所示，A、B、C、D四个位置处有4个不同的发言人，可以在不同时刻分别对不同发言人进行导播特写。在对A位置的发言人进行导播特写时，图像处理设备采用A位置对应的拼接参数对相机1和相机2采集的图像进行拼接。在对B位置的发言人进行导播特写时，图像处理设备采用B位置对应的拼接参数对相机2和相机3采集的图像进行拼接。在对C位置的发言人进行导播特写时，图像处理设备采用C位置对应的拼接参数对相机1和相机2采集的图像进行拼接。在对D位置的发言人进行导播特写时，图像处理设备采用D位置对应的拼接参数对相机2和相机3采集的图像进行拼接。这样能够使得当前导播特写对象所在位置具有较好的图像拼接效果，进而保证导播特写对象的显示效果。
可选地,导播模式与全景显示模式之间可以相互切换。在全景显示模式下,图像处理设备对多相机模组中所有相机采集的图像进行拼接。比如在图9或图10示出的会议室场景中,图像处理设备可以先对相机1和相机2采集的图像进行拼接,再将得到的拼接图像与相机3采集的图像进行拼接,得到全景图像。或者,图像处理设备可以先对相机1和相机2采集的图像进行拼接,以及对相机2和相机3采集的图像进行拼接,再对得到的两张拼接图像进行进一步拼接,得到全景图像。本申请实施例中的图像拼接可以是对相邻部署的相机采集的图像进行拼接。
结合上述步骤302中的第二种可能情况的第二种实现方式,第一相机和第二相机的视野重叠区域内包括多个拍摄目标,步骤303的实现方式为,针对每个拍摄目标,图像处理设备采用该拍摄目标对应的拼接参数对第一图像和第二图像进行拼接,得到该拍摄目标对应的拼接图像。也即是,图像处理设备针对多个拍摄目标中的每个拍摄目标分别进行一次图像拼接,得到多张拼接图像,以保证每个拍摄目标在对应的拼接图像中的显示效果。
可选地,在图像处理设备得到多个拍摄目标分别对应的拼接图像之后,图像处理设备可以输出组合图像到屏幕上显示,该组合图像由多张裁剪图像组合得到,多张裁剪图像分别从多个拍摄目标对应的多张拼接图像中裁剪得到,且每张裁剪图像分别包含所裁剪的拼接图像中对应的拍摄目标。比如,图像处理设备可以针对每个拍摄目标对应的拼接图像,对该拼接图像进行裁剪,以得到包含对应的拍摄目标的裁剪图像,图像处理设备对该多个拍摄目标对应的多张裁剪图像进行组合,以得到组合图像,并输出组合图像到屏幕上显示。
本申请实施例中,由于图像处理设备生成的组合图像中的每个拍摄目标所来自的拼接图像能够保证该拍摄目标具有较好的显示效果,因此能够保证最终输出的组合图像的显示效果。
可选地,对于智能均分模式或多人同框模式,为了优化性能、降低资源消耗,图像处理设备可以直接根据拍摄目标所在的位置,采用该拍摄目标对应的拼接参数对包含拍摄目标所在区域的图像进行拼接。比如多相机模组包括3个相机,如果一个与会者只出现在两个相机的视野范围内,则拼接时只对这两个相机采集的图像进行拼接。如果一个与会者同时出现在三个相机的视野范围内,则拼接时可以选取其中相邻两个相机的图像进行拼接,只需保证拼接得到的图像中该与会者的成像完整即可。
例如，在如图10所示的会议室场景中，多相机模组中的3个相机采集的图像可以如图14所示，其中图像1b是相机1采集的图像，图像2b是相机2采集的图像，图像3b是相机3采集的图像。图像1b-3b是指位于矩形框内的区域，图中为了示意，将相机对发言人未拍全的部分也进行了体现，位于矩形框外的部分实际不属于对应的图像。图像处理设备可以采用A位置对应的拼接参数对图像1b和图像2b进行拼接，并从拼接得到的图像中裁剪出包含A位置的发言人的区域。图像处理设备可以采用B位置对应的拼接参数对图像2b和图像3b进行拼接，并从拼接得到的图像中裁剪出包含B位置的发言人的区域。图像处理设备可以采用C位置对应的拼接参数对图像1b和图像2b进行拼接，并从拼接得到的图像中裁剪出包含C位置的发言人的区域。图像处理设备可以采用D位置对应的拼接参数对图像2b和图像3b进行拼接，并从拼接得到的图像中裁剪出包含D位置的发言人的区域。进一步地，图像处理设备可以对裁剪出来的四个区域进行组合，得到如图15所示的组合图像。
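针对上述从多张拼接图像中裁剪目标区域并组合显示的过程，下面给出一段示意性的Python代码（基于OpenCV与NumPy），假设共有4个拍摄目标且组合为2x2网格，裁剪框与统一缩放尺寸均为示例性假设：

```python
import cv2
import numpy as np

def crop_and_compose(stitched_images, boxes, tile_size=(640, 360)):
    """从各拍摄目标对应的拼接图像中裁剪目标区域，并按2x2网格组合（示例）。

    stitched_images: 各拍摄目标对应的拼接图像列表（本例假设为4张）
    boxes:           每张拼接图像中包含对应拍摄目标的裁剪框，元素为(x, y, w, h)
    tile_size:       每个裁剪块统一缩放到的尺寸（宽, 高）
    """
    tiles = []
    for img, (x, y, w, h) in zip(stitched_images, boxes):
        crop = img[y:y + h, x:x + w]
        tiles.append(cv2.resize(crop, tile_size))
    top = np.hstack(tiles[:2])
    bottom = np.hstack(tiles[2:])
    return np.vstack([top, bottom])
```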
可选地,图像处理设备输出图像到屏幕上显示,可以是在自身屏幕上显示图像,或者也可以是向其它设备发送图像以供其它设备显示。例如在视频会议场景中,会议终端可以对本端多相机模组采集的多帧图像进行拼接得到拼接图像,进一步可以显示拼接图像、裁剪图像或组合图像。或者,会议终端还可以向远端的其它会议终端发送拼接图像、裁剪图像或组合图像,以供其它会议终端显示。
在本申请实施例提供的图像拼接方法中,通过离线标定多相机模组中相邻部署的两个相机针对视野重叠区域内的不同标定位置的相机参数,对于每个标定位置,进一步可以根据这两个相机针对该标定位置标定得到的相机参数确定拼接参数,以得到不同标定位置分别对应的拼接参数。当拍摄目标位于两个相机的视野重叠区域内时,图像处理设备可以根据该视野重叠区域内的多个标定位置各自对应的拼接参数确定适用于拍摄目标所在的位置的拼接参数,进而使得拼接图像中拍摄目标所在的位置的拼接效果较好,从而提升拍摄目标在拼接图像中的显示效果。另外,本申请实施例中无需对图像进行特征点检测和匹配,适用于更多的拍摄场景且所需算力较低。
本申请实施例提供的图像拼接方法的步骤的先后顺序能够进行适当调整,步骤也能够根据情况进行相应增减。任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化的方法,都应涵盖在本申请的保护范围之内。比如,本申请实施例以拍摄目标位于多相机模组中的两个相机的拍摄区域内为例对图像拼接过程进行说明,如果拍摄目标位于多相机模组中的三个或三个以上相机的拍摄区域内,则图像处理设备可以选取其中相邻部署的两个相机采集的图像进行拼接,只需保证拍摄目标在拼接得到的图像中的成像完整即可,本申请实施例不再一一赘述。
下面对本申请实施例涉及的虚拟装置举例说明。
例如，图16是本申请实施例提供的一种图像拼接装置的结构示意图。该图像拼接装置可以应用于图像处理设备，如图16所示，该装置1600包括但不限于：获取模块1601、确定模块1602和拼接模块1603。可选地，该装置1600还包括输出模块1604。
获取模块1601,用于获取多相机模组在同一时刻采集的多帧图像,多相机模组包括多个相机,多个相机包括相邻部署的第一相机和第二相机,第一相机和第二相机具有视野重叠区域,多帧图像包括第一相机采集的第一图像和第二相机采集的第二图像。
确定模块1602,用于当视野重叠区域内包括拍摄目标时,根据第一相机和第二相机对应的多组拼接参数以及拍摄目标所在的位置确定目标拼接参数,多组拼接参数分别基于第一相机和第二相机共同针对视野重叠区域内的不同标定位置标定的相机参数得到。
拼接模块1603,用于采用目标拼接参数对第一图像和第二图像进行拼接,得到拼接图像。
第一相机和第二相机对应的每组拼接参数包括第一相机采集的图像到第二相机采集的图像的投影变换参数,或者,第一相机和第二相机对应的每组拼接参数包括第一相机采集的图像到目标平面坐标系的投影变换参数以及第二相机采集的图像到目标平面坐标系的投影变换参数。
可选地,多个标定位置中的任意两个标定位置满足以下一个或多个条件:两个标定位置到第一相机和第二相机的中心位置的距离不同;两个标定位置相对于第一相机和第二相机的排布方向的水平角度不同;两个标定位置相对于第一相机和第二相机的排布方向的垂直角度不同。
可选地,视野重叠区域内包括一个拍摄目标,确定模块1602,用于:根据多个标定位置中距离拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数确定目标拼接参数。
可选地,确定模块1602,具体用于:如果多个标定位置包括拍摄目标所在的位置,将拍摄目标所在的位置对应的拼接参数作为目标拼接参数。如果多个标定位置不包括拍摄目标所在的位置,根据多个标定位置中距离拍摄目标所在的位置最近的两个标定位置对应的拼接参数确定目标拼接参数。
可选地,确定模块1602,具体用于:基于拍摄目标所在的位置相对于两个标定位置的距离,采用两个标定位置对应的拼接参数插值计算得到拍摄目标所在的位置对应的目标拼接参数。
可选地,视野重叠区域内包括多个拍摄目标,确定模块1602,用于:根据多组拼接参数以及多个拍摄目标所在的位置确定目标拼接参数。
可选地,确定模块1602,具体用于:将目标标定位置对应的拼接参数作为目标拼接参数,目标标定位置为多个标定位置中到多个拍摄目标所在的位置的距离之和最小的标定位置。
或者,确定模块1602,具体用于:针对多个拍摄目标中的每个拍摄目标,获取多个标定位置中距离拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数。根据针对多个拍摄目标获取的所有拼接参数,确定目标拼接参数。
可选地,输出模块1604,用于在采用目标拼接参数对第一图像和第二图像进行拼接,得到拼接图像之后,输出裁剪图像到屏幕上显示,裁剪图像从拼接图像中裁剪得到,裁剪图像包含所有拍摄目标。
可选地,视野重叠区域内包括多个拍摄目标,确定模块1602,用于:针对多个拍摄目标中的每个拍摄目标,根据多个标定位置中距离拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数确定拍摄目标对应的拼接参数。相应地,拼接模块1603,用于:针对每个拍摄目标,采用拍摄目标对应的拼接参数对第一图像和第二图像进行拼接,得到拍摄目标对应的拼接图像。
可选地,输出模块1604,用于在针对多个拍摄目标中的每个拍摄目标,采用拍摄目标对应的拼接参数对第一图像和第二图像进行拼接,得到拍摄目标对应的拼接图像之后,输出组合图像到屏幕上显示,组合图像由多张裁剪图像组合得到,多张裁剪图像分别从多个拍摄目标对应的多张拼接图像中裁剪得到,且每张裁剪图像分别包含所裁剪的拼接图像中对应的拍摄目标。
可选地,图像处理设备中存储有多相机模组中相邻部署的两个相机在多种部署场景下分别对应的拼接参数。获取模块1601,还用于在根据第一相机和第二相机对应的多组拼接参数以及拍摄目标所在的位置确定目标拼接参数之前,获取多相机模组的部署场景,并获取第一相机和第二相机在多相机模组的部署场景下对应的多组拼接参数。
关于上述实施例中的装置,其中各个模块执行操作的具体方式已经在有关该方法的实施例中进行了详细描述,此处将不做详细阐述说明。
下面对本申请实施例涉及的基本硬件结构举例说明。
例如，图17是本申请实施例提供的一种图像处理设备的硬件结构示意图。如图17所示，图像处理设备1700包括处理器1701和存储器1702，处理器1701与存储器1702通过总线1703连接。图17以处理器1701和存储器1702相互独立为例进行说明。可选地，处理器1701和存储器1702集成在一起。可选地，结合图2来看，图17中的图像处理设备1700是图2所示的应用场景中的任一会议终端201或视频服务器202。
其中,存储器1702用于存储计算机程序,计算机程序包括操作***和程序代码。存储器1702是各种类型的存储介质,例如只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、电可擦可编程只读存储器(electrically erasable programmable read-only memory,EEPROM)、只读光盘(compact disc read-only memory,CD-ROM)、闪存、光存储器、寄存器、光盘存储、光碟存储、磁盘或者其它磁存储设备。
其中,处理器1701是通用处理器或专用处理器。处理器1701可能是单核处理器或多核处理器。处理器1701包括至少一个电路,以执行本申请实施例提供的上述图像拼接方法。
可选地,图像处理设备1700还包括网络接口1704,网络接口1704通过总线1703与处理器1701和存储器1702连接。网络接口1704能够实现图像处理设备1700与其它设备通信。例如,处理器1701能够通过网络接口1704与其它设备通信来获取相机采集的图像等。
可选地,图像处理设备1700还包括输入/输出(input/output,I/O)接口1705,I/O接口1705通过总线1703与处理器1701和存储器1702连接。处理器1701能够通过I/O接口1705接收输入的命令或数据等。I/O接口1705用于图像处理设备1700连接输入设备,这些输入设备例如是键盘、鼠标等。可选地,在一些可能的场景中,上述网络接口1704和I/O接口1705被统称为通信接口。
可选地，图像处理设备1700还包括显示器1706，显示器1706通过总线1703与处理器1701和存储器1702连接。显示器1706能够用于显示处理器1701执行上述方法产生的中间结果和/或最终结果等，例如显示拼接图像、裁剪图像或组合图像等。在一种可能的实现方式中，显示器1706是触控显示屏，以提供人机交互接口。
其中,总线1703是任何类型的,用于实现图像处理设备1700的内部器件互连的通信总线。例如***总线。本申请实施例以图像处理设备1700内部的上述器件通过总线1703互连为例说明,可选地,图像处理设备1700内部的上述器件采用除了总线1703之外的其他连接方式彼此通信连接,例如图像处理设备1700内部的上述器件通过图像处理设备1700内部的逻辑接口互连。
上述器件可以分别设置在彼此独立的芯片上,也可以至少部分的或者全部的设置在同一块芯片上。将各个器件独立设置在不同的芯片上,还是整合设置在一个或者多个芯片上,往往取决于产品设计的需要。本申请实施例对上述器件的具体实现形式不做限定。
图17所示的图像处理设备1700仅仅是示例性的,在实现过程中,图像处理设备1700包括其他组件,本文不再一一列举。图17所示的图像处理设备1700可以通过执行上述实施例提供的方法的全部或部分步骤来实现图像拼接。
本申请实施例还提供了一种计算机可读存储介质,所述计算机可读存储介质上存储有指令,当所述指令被处理器执行时,实现如图3所示的图像拼接方法。
本申请实施例还提供了一种计算机程序产品,包括计算机程序,所述计算机程序被处理器执行时,实现如图3所示的图像拼接方法。
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
在本申请实施例中,术语“第一”、“第二”和“第三”仅用于描述目的,而不能理解为指示或暗示相对重要性。
本申请中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符“/”,一般表示前后关联对象是一种“或”的关系。
需要说明的是,本申请所涉及的信息(包括但不限于用户设备信息、用户个人信息等)、数据(包括但不限于用于分析的数据、存储的数据、展示的数据等)以及信号,均为经用户授权或者经过各方充分授权的,且相关数据的收集、使用和处理需要遵守相关国家和地区的相关法律法规和标准。
以上所述仅为本申请的可选实施例,并不用以限制本申请,凡在本申请的构思和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (29)

  1. 一种图像拼接方法,其特征在于,应用于图像处理设备,所述方法包括:
    获取多相机模组在同一时刻采集的多帧图像,所述多相机模组包括多个相机,所述多个相机包括相邻部署的第一相机和第二相机,所述第一相机和所述第二相机具有视野重叠区域,所述多帧图像包括所述第一相机采集的第一图像和所述第二相机采集的第二图像;
    当所述视野重叠区域内包括拍摄目标时,根据所述第一相机和所述第二相机对应的多组拼接参数以及所述拍摄目标所在的位置确定目标拼接参数,所述多组拼接参数分别基于所述第一相机和所述第二相机共同针对所述视野重叠区域内的不同标定位置标定的相机参数得到;
    采用所述目标拼接参数对所述第一图像和所述第二图像进行拼接,得到拼接图像。
  2. 根据权利要求1所述的方法,其特征在于,每组所述拼接参数包括所述第一相机采集的图像到所述第二相机采集的图像的投影变换参数,或者,每组所述拼接参数包括所述第一相机采集的图像到目标平面坐标系的投影变换参数以及所述第二相机采集的图像到所述目标平面坐标系的投影变换参数。
  3. 根据权利要求1或2所述的方法,其特征在于,所述多个标定位置中的任意两个标定位置满足以下一个或多个条件:
    所述两个标定位置到所述第一相机和所述第二相机的中心位置的距离不同;
    所述两个标定位置相对于所述第一相机和所述第二相机的排布方向的水平角度不同;
    所述两个标定位置相对于所述第一相机和所述第二相机的排布方向的垂直角度不同。
  4. 根据权利要求1至3任一所述的方法,其特征在于,所述视野重叠区域内包括一个拍摄目标,所述根据所述第一相机和所述第二相机对应的多组拼接参数以及所述拍摄目标所在的位置确定目标拼接参数,包括:
    根据所述多个标定位置中距离所述拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数确定所述目标拼接参数。
  5. 根据权利要求4所述的方法,其特征在于,所述根据所述多个标定位置中距离所述拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数确定所述目标拼接参数,包括:
    如果所述多个标定位置包括所述拍摄目标所在的位置,将所述拍摄目标所在的位置对应的拼接参数作为所述目标拼接参数;
    如果所述多个标定位置不包括所述拍摄目标所在的位置,根据所述多个标定位置中距离所述拍摄目标所在的位置最近的两个标定位置对应的拼接参数确定所述目标拼接参数。
  6. 根据权利要求5所述的方法,其特征在于,所述根据所述多个标定位置中距离所述拍摄目标所在的位置最近的两个标定位置对应的拼接参数确定所述目标拼接参数,包括:
    基于所述拍摄目标所在的位置相对于所述两个标定位置的距离,采用所述两个标定位置对应的拼接参数插值计算得到所述拍摄目标所在的位置对应的所述目标拼接参数。
  7. 根据权利要求1至3任一所述的方法,其特征在于,所述视野重叠区域内包括多个拍摄目标,所述根据所述第一相机和所述第二相机对应的多组拼接参数以及所述拍摄目标所在的位置确定目标拼接参数,包括:
    根据所述多组拼接参数以及所述多个拍摄目标所在的位置确定所述目标拼接参数。
  8. 根据权利要求7所述的方法,其特征在于,所述根据所述多组拼接参数以及所述多个拍摄目标所在的位置确定所述目标拼接参数,包括:
    将目标标定位置对应的拼接参数作为所述目标拼接参数,所述目标标定位置为所述多个标定位置中到所述多个拍摄目标所在的位置的距离之和最小的标定位置。
  9. 根据权利要求7所述的方法,其特征在于,所述根据所述多组拼接参数以及所述多个拍摄目标所在的位置确定所述目标拼接参数,包括:
    针对所述多个拍摄目标中的每个拍摄目标,获取所述多个标定位置中距离所述拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数;
    根据针对所述多个拍摄目标获取的所有拼接参数,确定所述目标拼接参数。
  10. 根据权利要求4至9任一所述的方法,其特征在于,在所述采用所述目标拼接参数对所述第一图像和所述第二图像进行拼接,得到拼接图像之后,所述方法还包括:
    输出裁剪图像到屏幕上显示,所述裁剪图像从所述拼接图像中裁剪得到,所述裁剪图像包含所有所述拍摄目标。
  11. 根据权利要求1至3任一所述的方法,其特征在于,所述视野重叠区域内包括多个拍摄目标,所述根据所述第一相机和所述第二相机对应的多组拼接参数以及所述拍摄目标所在的位置确定目标拼接参数,包括:
    针对所述多个拍摄目标中的每个拍摄目标,根据所述多个标定位置中距离所述拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数确定所述拍摄目标对应的拼接参数;
    所述采用所述目标拼接参数对所述第一图像和所述第二图像进行拼接,得到拼接图像,包括:
    针对所述每个拍摄目标,采用所述拍摄目标对应的拼接参数对所述第一图像和所述第二图像进行拼接,得到所述拍摄目标对应的拼接图像。
  12. 根据权利要求11所述的方法,其特征在于,在所述针对所述多个拍摄目标中的每个拍摄目标,采用所述拍摄目标对应的拼接参数对所述第一图像和所述第二图像进行拼接,得到所述拍摄目标对应的拼接图像之后,所述方法还包括:
    输出组合图像到屏幕上显示,所述组合图像由多张裁剪图像组合得到,所述多张裁剪图像分别从所述多个拍摄目标对应的多张拼接图像中裁剪得到,且每张所述裁剪图像分别包含所裁剪的拼接图像中对应的拍摄目标。
  13. 根据权利要求1至12任一所述的方法,其特征在于,所述图像处理设备中存储有所述多相机模组中相邻部署的两个相机在多种部署场景下分别对应的拼接参数,在所述根据所述第一相机和所述第二相机对应的多组拼接参数以及所述拍摄目标所在的位置确定目标拼接参数之前,所述方法还包括:
    获取所述多相机模组的部署场景;
    获取所述第一相机和所述第二相机在所述多相机模组的部署场景下对应的所述多组拼接参数。
  14. 一种图像拼接装置,其特征在于,应用于图像处理设备,所述装置包括:
    获取模块,用于获取多相机模组在同一时刻采集的多帧图像,所述多相机模组包括多个相机,所述多个相机包括相邻部署的第一相机和第二相机,所述第一相机和所述第二相机具有视野重叠区域,所述多帧图像包括所述第一相机采集的第一图像和所述第二相机采集的第二图像;
    确定模块,用于当所述视野重叠区域内包括拍摄目标时,根据所述第一相机和所述第二相机对应的多组拼接参数以及所述拍摄目标所在的位置确定目标拼接参数,所述多组拼接参数分别基于所述第一相机和所述第二相机共同针对所述视野重叠区域内的不同标定位置标定的相机参数得到;
    拼接模块,用于采用所述目标拼接参数对所述第一图像和所述第二图像进行拼接,得到拼接图像。
  15. 根据权利要求14所述的装置，其特征在于，每组所述拼接参数包括所述第一相机采集的图像到所述第二相机采集的图像的投影变换参数，或者，每组所述拼接参数包括所述第一相机采集的图像到目标平面坐标系的投影变换参数以及所述第二相机采集的图像到所述目标平面坐标系的投影变换参数。
  16. 根据权利要求14或15所述的装置,其特征在于,所述多个标定位置中的任意两个标定位置满足以下一个或多个条件:
    所述两个标定位置到所述第一相机和所述第二相机的中心位置的距离不同;
    所述两个标定位置相对于所述第一相机和所述第二相机的排布方向的水平角度不同;
    所述两个标定位置相对于所述第一相机和所述第二相机的排布方向的垂直角度不同。
  17. 根据权利要求14至16任一所述的装置,其特征在于,所述视野重叠区域内包括一个拍摄目标,所述确定模块,用于:
    根据所述多个标定位置中距离所述拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数确定所述目标拼接参数。
  18. 根据权利要求17所述的装置,其特征在于,所述确定模块,用于:
    如果所述多个标定位置包括所述拍摄目标所在的位置,将所述拍摄目标所在的位置对应的拼接参数作为所述目标拼接参数;
    如果所述多个标定位置不包括所述拍摄目标所在的位置,根据所述多个标定位置中距离所述拍摄目标所在的位置最近的两个标定位置对应的拼接参数确定所述目标拼接参数。
  19. 根据权利要求18所述的装置,其特征在于,所述确定模块,用于:
    基于所述拍摄目标所在的位置相对于所述两个标定位置的距离,采用所述两个标定位置对应的拼接参数插值计算得到所述拍摄目标所在的位置对应的所述目标拼接参数。
  20. 根据权利要求14至16任一所述的装置,其特征在于,所述视野重叠区域内包括多个拍摄目标,所述确定模块,用于:
    根据所述多组拼接参数以及所述多个拍摄目标所在的位置确定所述目标拼接参数。
  21. 根据权利要求20所述的装置,其特征在于,所述确定模块,用于:
    将目标标定位置对应的拼接参数作为所述目标拼接参数,所述目标标定位置为所述多个标定位置中到所述多个拍摄目标所在的位置的距离之和最小的标定位置。
  22. 根据权利要求20所述的装置,其特征在于,所述确定模块,用于:
    针对所述多个拍摄目标中的每个拍摄目标,获取所述多个标定位置中距离所述拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数;
    根据针对所述多个拍摄目标获取的所有拼接参数,确定所述目标拼接参数。
  23. 根据权利要求17至22任一所述的装置,其特征在于,所述装置还包括:
    输出模块,用于在采用所述目标拼接参数对所述第一图像和所述第二图像进行拼接,得到拼接图像之后,输出裁剪图像到屏幕上显示,所述裁剪图像从所述拼接图像中裁剪得到,所述裁剪图像包含所有所述拍摄目标。
  24. 根据权利要求14至16任一所述的装置,其特征在于,所述视野重叠区域内包括多个拍摄目标,所述确定模块,用于:
    针对所述多个拍摄目标中的每个拍摄目标,根据所述多个标定位置中距离所述拍摄目标所在的位置最近的一个或多个标定位置对应的拼接参数确定所述拍摄目标对应的拼接参数;
    所述拼接模块,用于:
    针对所述每个拍摄目标，采用所述拍摄目标对应的拼接参数对所述第一图像和所述第二图像进行拼接，得到所述拍摄目标对应的拼接图像。
  25. 根据权利要求24所述的装置,其特征在于,所述装置还包括:
    输出模块,用于在针对所述多个拍摄目标中的每个拍摄目标,采用所述拍摄目标对应的拼接参数对所述第一图像和所述第二图像进行拼接,得到所述拍摄目标对应的拼接图像之后,输出组合图像到屏幕上显示,所述组合图像由多张裁剪图像组合得到,所述多张裁剪图像分别从所述多个拍摄目标对应的多张拼接图像中裁剪得到,且每张所述裁剪图像分别包含所裁剪的拼接图像中对应的拍摄目标。
  26. 根据权利要求14至25任一所述的装置,其特征在于,所述图像处理设备中存储有所述多相机模组中相邻部署的两个相机在多种部署场景下分别对应的拼接参数,
    所述获取模块,还用于在所述根据所述第一相机和所述第二相机对应的多组拼接参数以及所述拍摄目标所在的位置确定目标拼接参数之前,获取所述多相机模组的部署场景,并获取所述第一相机和所述第二相机在所述多相机模组的部署场景下对应的所述多组拼接参数。
  27. 一种图像拼接装置,其特征在于,包括:处理器和存储器;
    所述存储器,用于存储计算机程序,所述计算机程序包括程序指令;
    所述处理器,用于调用所述计算机程序,实现如权利要求1至13任一所述的图像拼接方法。
  28. 一种计算机可读存储介质,其特征在于,所述计算机可读存储介质上存储有指令,当所述指令被处理器执行时,实现如权利要求1至13任一所述的图像拼接方法。
  29. 一种计算机程序产品,其特征在于,包括计算机程序,所述计算机程序被处理器执行时,实现如权利要求1至13任一所述的图像拼接方法。
PCT/CN2023/115094 2022-12-05 2023-08-25 图像拼接方法及装置 WO2024119902A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202211550973.3 2022-12-05
CN202211550973 2022-12-05
CN202310140790.2A CN118154415A (zh) 2022-12-05 2023-02-14 图像拼接方法及装置
CN202310140790.2 2023-02-14

Publications (1)

Publication Number Publication Date
WO2024119902A1 true WO2024119902A1 (zh) 2024-06-13

Family

ID=91295657

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/115094 WO2024119902A1 (zh) 2022-12-05 2023-08-25 图像拼接方法及装置

Country Status (2)

Country Link
CN (1) CN118154415A (zh)
WO (1) WO2024119902A1 (zh)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109064404A (zh) * 2018-08-10 2018-12-21 西安电子科技大学 一种基于多相机标定的全景拼接方法、全景拼接***
EP3629292A1 (en) * 2018-09-27 2020-04-01 Continental Automotive GmbH Reference point selection for extrinsic parameter calibration
CN109255754A (zh) * 2018-09-30 2019-01-22 北京宇航时代科技发展有限公司 一种大场景多相机图像拼接与真实展现的方法和***
CN113256742A (zh) * 2021-07-15 2021-08-13 禾多科技(北京)有限公司 界面展示方法、装置、电子设备和计算机可读介质

Also Published As

Publication number Publication date
CN118154415A (zh) 2024-06-07

Similar Documents

Publication Publication Date Title
CN106251334B (zh) 一种摄像机参数调整方法、导播摄像机及***
EP3054414B1 (en) Image processing system, image generation apparatus, and image generation method
TWI558208B (zh) 影像處理方法、影像處理裝置及顯示系統
JP4268206B2 (ja) 魚眼レンズカメラ装置及びその画像歪み補正方法
JP4243767B2 (ja) 魚眼レンズカメラ装置及びその画像抽出方法
US20200275079A1 (en) Generating three-dimensional video content from a set of images captured by a camera array
US11736801B2 (en) Merging webcam signals from multiple cameras
US8749607B2 (en) Face equalization in video conferencing
JP2001094857A (ja) バーチャル・カメラの制御方法、カメラアレイ、及びカメラアレイの整合方法
US10691012B2 (en) Image capturing apparatus, method of controlling image capturing apparatus, and non-transitory computer-readable storage medium
JP5963006B2 (ja) 画像変換装置、カメラ、映像システム、画像変換方法およびプログラムを記録した記録媒体
KR101916419B1 (ko) 광각 카메라용 다중 뷰 영상 생성 장치 및 영상 생성 방법
CN114640833A (zh) 投影画面调整方法、装置、电子设备和存储介质
TWI615808B (zh) 全景即時影像處理方法
JP6665917B2 (ja) 画像処理装置
KR20190019059A (ko) 수평 시차 스테레오 파노라마를 캡쳐하는 시스템 및 방법
US20080100697A1 (en) Methods and systems for producing seamless composite images without requiring overlap of source images
JP7424076B2 (ja) 画像処理装置、画像処理システム、撮像装置、画像処理方法およびプログラム
US20220230275A1 (en) Imaging system, image processing apparatus, imaging device, and recording medium
CN114096984A (zh) 从通过拼接部分图像而创建的全向图像中去除图像捕获装置
WO2024119902A1 (zh) 图像拼接方法及装置
CN111325790B (zh) 目标追踪方法、设备及***
CN112261281B (zh) 视野调整方法及电子设备、存储装置
JP2014192557A (ja) 被写体画像抽出装置および被写体画像抽出・合成装置
WO2024001342A1 (zh) 成像畸变矫正方法及装置