CN116580152A - Three-dimensional modeling method and device for large-resolution oblique photographic image

Three-dimensional modeling method and device for large-resolution oblique photographic image

Info

Publication number
CN116580152A
Authority
CN
China
Prior art keywords
image
segmented
original image
segmented image
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310501817.6A
Other languages
Chinese (zh)
Inventor
陈明杰
田淇元
李浩杰
蔡明成
郭林春
王江安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tudou Data Technology Group Co ltd
Original Assignee
Tudou Data Technology Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tudou Data Technology Group Co ltd filed Critical Tudou Data Technology Group Co ltd
Priority to CN202310501817.6A priority Critical patent/CN116580152A/en
Publication of CN116580152A publication Critical patent/CN116580152A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00Three dimensional [3D] modelling, e.g. data description of 3D objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/04Texture mapping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/50Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/80Geometric correction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74Image or video pattern matching; Proximity measures in feature spaces
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10028Range image; Depth image; 3D point clouds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20221Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Computer Graphics (AREA)
  • Geometry (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Image Processing (AREA)

Abstract

The application discloses a three-dimensional modeling method and device for large-resolution oblique photographic images. The method includes: segmenting the original image so that the video memory required to compute the depth map of each segmented image is less than or equal to the video memory of the computer; extracting features from each segmented image to obtain its feature points; converting the pixel coordinates of each feature point in the corresponding segmented image into pixel coordinates on the original image, and determining the camera intrinsic parameters and distortion parameters of the original image; calculating the camera intrinsic parameters and distortion parameters of each segmented image, and correcting each segmented image; and performing feature matching on the distortion-corrected segmented images, then obtaining and fusing the depth maps of the segmented images to obtain a three-dimensional point cloud. The method addresses the computer memory and video-memory leaks that arise when existing methods perform three-dimensional modeling on large, high-resolution aerial images, where video-memory consumption becomes extreme and the video memory overflows even when a single image is processed.

Description

Three-dimensional modeling method and device for large-resolution oblique photographic image
Technical Field
The application relates to the technical field of remote sensing and mapping, and in particular to a three-dimensional modeling method and device for large-resolution oblique photographic images.
Background
With the continuing advance of science and technology, computer vision has drawn growing attention across industries, in fields such as equipment inspection and monitoring, medical image processing, cultural relic protection, robot vision, autonomous navigation, and industrial product design and production. Computer vision technology presents both opportunities and challenges. Three-dimensional reconstruction is one of the most active research directions in computer vision and draws on several disciplines, including image processing, stereoscopic vision, and pattern recognition. With the continuous development of industrialization, many applications are realized by acquiring three-dimensional information of target objects.
The oblique photography technique overcomes the limitation of traditional orthographic imagery, which can only be captured from a vertical angle: by carrying several sensors on the same flight platform and collecting images from five angles (one vertical and four oblique), it presents the user with a realistic view of the world that matches human vision. Three-dimensional modeling from such aerial image data quickly yields a realistic ground three-dimensional model, accurate geometric and spatial information, and a large number of terrain elements.
When performing three-dimensional modeling on large, high-resolution aerial images, existing methods suffer from computer memory and video-memory leaks; video-memory consumption becomes extreme, and the video memory overflows even when a single image is processed.
Disclosure of Invention
The embodiments of the present application provide a three-dimensional modeling method and device for large-resolution oblique photographic images, which address the computer memory and video-memory leaks that arise when existing methods perform three-dimensional modeling on large, high-resolution aerial images, leaks that drive video-memory consumption to extremes and cause the video memory to overflow when a single image is processed.
In a first aspect, an embodiment of the present application provides a three-dimensional modeling method for a large-resolution oblique photographic image, the method including: splitting the original image to enable the video memory required for calculating the depth map of each split image to be smaller than or equal to the video memory of a computer; extracting the characteristics of each segmented image to obtain characteristic points of each segmented image; converting pixel coordinates of each characteristic point in the corresponding segmented image into pixel coordinates on the original image, and determining camera internal parameters and distortion parameters of the original image; calculating camera internal parameters and distortion parameters of each segmented image, and correcting each segmented image; and performing feature matching on each segmented image subjected to distortion correction, and acquiring and fusing depth maps of a plurality of segmented images to obtain a three-dimensional point cloud.
With reference to the first aspect, in one possible implementation manner, the video memory required for the depth map of each of the segmented images is determined by the following steps: calculating the video memory required for the depth map of each segmented image according to the formula bytes = w × h × m; wherein w is the width of each segmented image, h is the height of each segmented image, and m is the memory space occupied by one pixel.
With reference to the first aspect, in a possible implementation manner, the converting pixel coordinates of each feature point in the corresponding segmented image into pixel coordinates on the original image includes: and respectively calculating pixel coordinates of each characteristic point on the original image according to the position of the segmented image on the original image and the width and the height of the original image.
With reference to the first aspect, in a possible implementation manner, the determining camera intrinsic parameters and distortion parameters of the original image includes: performing feature matching on the original image according to the pixel coordinates of each feature point on the original image; performing aerial triangulation on the original image to obtain a measurement result; wherein the measurement results include camera intrinsic parameters of the original image and connection points between the original image and each of the segmented images.
With reference to the first aspect, in a possible implementation manner, the calculating camera intrinsic parameters and distortion parameters of each of the segmented images includes: taking the distortion parameters of the original image as the distortion parameters of each segmented image; and converting the camera internal parameters of the original image into the camera internal parameters of the segmented image according to the position of the segmented image on the original image and the width and the height of the original image.
With reference to the first aspect, in one possible implementation manner, the fusing of the depth maps of the plurality of segmented images includes: reconstructing the three-dimensional coordinates of the segmented images to obtain the depth information of the depth maps; taking each segmented image in turn as the reference image and projecting its depth estimates into world coordinates; and, subject to the estimated depth, normal, and reprojection-error constraints of each segmented image in world coordinates, extracting a three-dimensional point cloud from the three-dimensional point clouds of adjacent segmented images and retaining only points for which two adjacent estimates agree.
With reference to the first aspect, in one possible implementation manner, the method further includes: constructing a three-dimensional mesh from the three-dimensional point cloud such that any point in the three-dimensional point cloud falls on a vertex, an edge, or the interior of the corresponding triangle; and performing texture mapping on the three-dimensional mesh.
In a second aspect, an embodiment of the present application provides a three-dimensional modeling apparatus for a large-resolution oblique photographic image, the apparatus including:
a segmentation module configured to segment the original image so that the video memory required to compute the depth map of each segmented image is less than or equal to the video memory of the computer; a feature extraction module configured to extract features from each segmented image to obtain its feature points; a coordinate conversion module configured to convert the pixel coordinates of each feature point in the corresponding segmented image into pixel coordinates on the original image and to determine the camera intrinsic parameters and distortion parameters of the original image; a correction module configured to calculate the camera intrinsic parameters and distortion parameters of each segmented image and to correct each segmented image; and a fusion module configured to perform feature matching on the distortion-corrected segmented images, and to obtain and fuse the depth maps of the segmented images to obtain a three-dimensional point cloud.
With reference to the second aspect, in one possible implementation manner, the video memory required for the depth map of each of the segmented images is determined by the following steps: calculating the video memory required for the depth map of each segmented image according to the formula bytes = w × h × m; wherein w is the width of each segmented image, h is the height of each segmented image, and m is the memory space occupied by one pixel.
With reference to the second aspect, in one possible implementation manner, the coordinate conversion module is specifically configured to: and respectively calculating pixel coordinates of each characteristic point on the original image according to the position of the segmented image on the original image and the width and the height of the original image.
With reference to the second aspect, in one possible implementation manner, the coordinate conversion module is further specifically configured to: performing feature matching on the original image according to the pixel coordinates of each feature point on the original image; performing aerial triangulation on the original image to obtain a measurement result; wherein the measurement results include camera intrinsic parameters of the original image and connection points between the original image and each of the segmented images.
With reference to the second aspect, in one possible implementation manner, the correction module is specifically configured to: taking the distortion parameters of the original image as the distortion parameters of each segmented image; and converting the camera internal parameters of the original image into the camera internal parameters of the segmented image according to the position of the segmented image on the original image and the width and the height of the original image.
With reference to the second aspect, in one possible implementation manner, the fusion module is specifically configured to: reconstruct the three-dimensional coordinates of the segmented images to obtain the depth information of the depth maps; take each segmented image in turn as the reference image and project its depth estimates into world coordinates; and, subject to the estimated depth, normal, and reprojection-error constraints of each segmented image in world coordinates, extract a three-dimensional point cloud from the three-dimensional point clouds of adjacent segmented images and retain only points for which two adjacent estimates agree.
With reference to the second aspect, in a possible implementation manner, the apparatus further includes: a texture mapping module configured to construct a three-dimensional mesh from the three-dimensional point cloud such that any point in the three-dimensional point cloud falls on a vertex, an edge, or the interior of the corresponding triangle, and to perform texture mapping on the three-dimensional mesh.
In a third aspect, an embodiment of the present application provides a three-dimensional modeling server for a large-resolution oblique photographic image, including a memory and a processor; the memory is used for storing computer executable instructions; the processor is configured to execute the computer-executable instructions to implement the method of the first aspect or any one of the possible implementation manners of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing executable instructions that when executed by a computer are capable of implementing the method of the first aspect or any one of the possible implementations of the first aspect.
One or more technical solutions provided in the embodiments of the present application at least have the following technical effects:
the embodiment of the application provides a three-dimensional modeling method of a large-resolution oblique photographic image, which comprises the steps of segmenting an original image through a set segmentation rule, carrying out feature matching and aerial triangulation on the original image, obtaining the coordinates of connection points between the original image and a plurality of segmented images according to the result of aerial triangulation, calculating new coordinates of the original image and the plurality of segmented images according to the coordinates of the connection points, and finishing updating and reorganizing data between the original image and the plurality of segmented images.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present application; other drawings may be derived from them by a person of ordinary skill in the art without inventive effort.
FIG. 1 is a flow chart of a three-dimensional modeling method for a large-resolution oblique photographic image according to an embodiment of the application;
FIG. 2 is a flowchart showing specific steps for determining camera intrinsic and distortion parameters of an original image according to an embodiment of the present application;
FIG. 3 is a flowchart showing specific steps for calculating camera intrinsic and distortion parameters for each segmented image according to an embodiment of the present application;
FIG. 4 is a flowchart illustrating specific steps for fusing depth maps of multiple segmented images according to an embodiment of the present application;
FIG. 5 is a flow chart of texture mapping for a three-dimensional mesh according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a three-dimensional modeling apparatus for large-resolution oblique photographic images according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a three-dimensional modeling server for large resolution oblique photographic images provided by an embodiment of the application;
fig. 8 is a schematic diagram of original image segmentation according to an embodiment of the present application;
fig. 9 is a schematic diagram of a pixel coordinate system for an original image and a segmented image according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. It will be apparent that the described embodiments are some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
First, a brief description will be given of related technologies or concepts involved in the embodiments of the present application.
In aerial triangulation, control-point densification is performed in the office from a small number of field control points, yielding the elevation and planimetric position of the densified points. Its main purpose is to provide absolute orientation control points for mapping regions that lack field control points.
The embodiment of the application provides a three-dimensional modeling method of a large-resolution oblique photographic image, which comprises steps S101 to S105 as shown in FIG. 1.
S101: the original image is segmented, so that the video memory required for calculating the depth map of each segmented image is smaller than or equal to the video memory of a computer.
Specifically, the original image may be an aerial image obtained by oblique photography from an unmanned aerial vehicle. In general, both feature extraction and depth-map computation for the original image run on a graphics processing unit (English: graphics processing unit, abbreviated GPU) and both consume video memory, so when the GPU's video memory is insufficient the original image must be segmented. Because the video memory required to compute the depth map is generally greater than that required for feature extraction, and the host memory used in these computations is comparatively small and usually sufficient, it suffices to compare the video memory required for the depth map of the original image with the computer's video memory and to segment the original image accordingly.
Further, the video memory required for the depth map of each segmented image is determined as follows: calculate it according to the formula bytes = w × h × m, where w is the width of each segmented image, h is its height, and m is the memory space occupied by one pixel (640 bytes in the example below).
The method described above will be described in detail with reference to one embodiment, but of course, other embodiments are possible, and the application is not limited to this embodiment.
Assume the original image is 15000 × 10000 pixels, i.e. its width is 15000 and its height is 10000. First, substitute the width and height into the formula bytes = w × h × m, where the memory space m occupied by one pixel is generally 640 bytes: bytes = 15000 × 10000 × 640 = 96,000,000,000 bytes = 96 GB, so the video memory required for the depth map of the original image is 96 GB. If the computer's video memory is 24 GB, the original image is segmented into 4 pieces, each 7500 × 5000 pixels. Substituting each segmented image's width and height into the same formula gives bytes = 7500 × 5000 × 640 = 24,000,000,000 bytes = 24 GB, so the video memory required for each segmented image's depth map exactly satisfies the condition of being less than or equal to the computer's video memory.
When segmenting the original image, each segmented image should be kept as large as possible subject to the condition that the video memory required for its depth map does not exceed the computer's video memory: if the segments are too small, there are more of them, extracting the depth map of each segmented image takes more time, and storage becomes inconvenient.
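The splitting rule above can be captured in a short calculation. Below is a minimal sketch in Python; the 640-bytes-per-pixel cost and the halve-both-dimensions strategy are taken from the worked example above, and the function names are illustrative rather than part of the claimed method.

BYTES_PER_PIXEL = 640  # per-pixel video-memory cost taken from the example above

def depth_map_bytes(w, h):
    """Video memory required for the depth map of a w x h image (bytes = w * h * m)."""
    return w * h * BYTES_PER_PIXEL

def split_factor(w, h, gpu_bytes):
    """Smallest number of equal segments (1, 4, 16, ...) such that each
    segment's depth map fits in the available video memory; each step halves
    both width and height, as in the 2 x 2 example above."""
    n = 1
    while depth_map_bytes(w, h) > gpu_bytes:
        w, h, n = w // 2, h // 2, n * 4
    return n

# 15000 x 10000 original image and 24 GB of video memory -> 4 segments
print(split_factor(15000, 10000, 24 * 10**9))  # 4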
S102: and extracting the characteristics of each segmented image to obtain the characteristic points of each segmented image.
Specifically, obtaining the feature points of each segmented image yields each feature point's descriptor and pixel coordinates. The descriptor is used to decide whether two feature points match, and a feature point is the position of a two-dimensional image point that can be detected stably and repeatably under different illumination and viewing angles.
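As a concrete illustration of step S102, the sketch below extracts feature points and descriptors from one segmented image with OpenCV's SIFT detector. The application does not name a particular detector, so SIFT is an assumption here (consistent with the G06V10/46 classification above), and the file name is hypothetical.

import cv2

def extract_features(segmented_image):
    """Return feature-point pixel coordinates (in the segmented image's own
    coordinate system) and their descriptors."""
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(segmented_image, None)
    coords = [kp.pt for kp in keypoints]  # (x1, y1) per feature point
    return coords, descriptors

tile = cv2.imread("segment_1.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file name
coords, descriptors = extract_features(tile)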
S103: and converting the pixel coordinates of each characteristic point in the corresponding segmented image into pixel coordinates on the original image, and determining camera internal parameters and distortion parameters of the original image.
The pixel coordinates of each feature point in the corresponding segmented image are converted into pixel coordinates on the original image as follows: calculate the pixel coordinates of each feature point on the original image from the position of the segmented image on the original image and the width and height of the original image, then merge the feature points of all segmented images to obtain all feature points of the original image.
Fig. 2 is a flowchart showing specific steps for determining camera intrinsic parameters and distortion parameters of an original image in step S103 according to an embodiment of the present application, including steps S201 to S202.
S201: and carrying out feature matching on the original image according to the pixel coordinates of each feature point on the original image.
S202: and carrying out aerial triangulation on the original image to obtain a measurement result. Wherein the measurement results include camera intrinsic to the original image and the connection point between the original image and each segmented image.
Specifically, aerial triangulation uses bundle adjustment to iteratively compute the pose, connection points, camera parameters, and other information of the original image, jointly optimizing the poses, connection points, and camera parameters so that the reprojection error of the connection points is minimized. The adjustment model is as follows:
min over (e, ε, p) of Σ_i Σ_j ‖ π(e_i, ε_i, p_j) - q_ij ‖², wherein e_i represents the extrinsic parameters of the i-th camera, ε_i represents the intrinsic parameters of the i-th camera, p_j represents the coordinates of the j-th connection point, q_ij represents the observed projection coordinates of the j-th connection point in the i-th camera, and π represents the reprojection function.
Further, the basic idea of iterative bundle adjustment in aerial triangulation is: by rotating and translating the individual ray bundles in space, the rays at common points between models are made to intersect optimally, and the whole block is optimally fitted into the known control-point coordinate system. The rotation corresponds to the exterior orientation elements of the ray bundle, and the translation corresponds to the spatial coordinates of the camera station. Because of measurement errors in the image point coordinates, under redundant observations the coordinates of common intersection points of adjacent images should be equal, and the densified control-point coordinates should agree with the ground-surveyed coordinates, only in the sense of minimizing [pvv], where [pvv] denotes the sum over the n observations of each observation's weight multiplied by the square of its correction.
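The adjustment model above can be posed as a nonlinear least-squares problem. The following Python sketch builds the reprojection residuals for SciPy's least_squares; the simple pinhole projection is a stand-in for the full reprojection function π, which in practice would also apply the distortion model, and the parameter layout is an assumption of this sketch.

import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def project(rvec, tvec, intrinsics, point3d):
    """Minimal pinhole stand-in for the reprojection function pi
    (the full model would also apply the distortion parameters)."""
    f, cx, cy = intrinsics
    p_cam = Rotation.from_rotvec(rvec).apply(point3d) + tvec
    return np.array([f * p_cam[0] / p_cam[2] + cx,
                     f * p_cam[1] / p_cam[2] + cy])

def residuals(params, n_cams, n_pts, observations):
    """Stack the reprojection errors pi(e_i, eps_i, p_j) - q_ij.
    observations is a list of (camera index i, point index j, observed q_ij)."""
    cams = params[:n_cams * 9].reshape(n_cams, 9)  # per camera: rvec(3), tvec(3), (f, cx, cy)
    pts = params[n_cams * 9:].reshape(n_pts, 3)    # connection-point coordinates p_j
    res = []
    for i, j, q_ij in observations:
        res.extend(project(cams[i, :3], cams[i, 3:6], cams[i, 6:9], pts[j]) - q_ij)
    return np.asarray(res)

# Joint refinement of poses, intrinsics and connection points, minimising the
# summed squared reprojection error (x0 is the initial parameter vector):
# result = least_squares(residuals, x0, args=(n_cams, n_pts, observations))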
The method described above will be described in detail with reference to one embodiment, but of course, other embodiments are possible, and the application is not limited to this embodiment.
Fig. 8 is a schematic diagram of original image segmentation provided by an embodiment of the present application. As shown in Fig. 8, taking segmentation of the original image into 4 pieces as an example, the original image is split at half its height and half its width, and the segmented images are numbered sequentially from left to right and top to bottom.
Fig. 9 is a schematic diagram of the pixel coordinate systems of the original image and the segmented images according to an embodiment of the present application. As shown in Fig. 9, take the 4th segmented image as an example, and denote the pixel coordinates of a feature point in its segmented image's pixel coordinate system as (x1, y1). With the original image of width w and height h, the converted pixel coordinates (x, y) of the feature point on the original image are:

Segmented image 1: (x, y) = (x1, y1)

Segmented image 2: (x, y) = (x1 + w/2, y1)

Segmented image 3: (x, y) = (x1, y1 + h/2)

Segmented image 4: (x, y) = (x1 + w/2, y1 + h/2)
it should be noted that, if the original image is segmented into more segmented images, the pixel coordinates of the feature points on the original image after feature combination can be obtained in the same way, so as to obtain the positions of the feature points of the original image.
S104: and calculating camera internal parameters and distortion parameters of each segmented image, and correcting each segmented image.
Fig. 3 is a flowchart showing specific steps for calculating camera intrinsic parameters and distortion parameters of each segmented image in step S104 according to an embodiment of the present application, including steps S301 to S302.
S301: and taking the distortion parameter of the original image as the distortion parameter of each segmented image.
S302: and converting the camera intrinsic parameters of the original image into the camera intrinsic parameters of the segmented image according to the position of the segmented image on the original image and the width and the height of the original image.
Because the pixel coordinates of the feature points of the original image were obtained in step S103, the connection points between each segmented image and the original image, and the camera parameters, change once the original image is segmented. Therefore, the camera intrinsic parameters in the aerial triangulation result of the original image, the connection points between the original image and each segmented image, and each segmented image need to be reorganized before step S104, so as to obtain new segmented images.
The method described above will be described in detail with reference to one embodiment, but of course, other embodiments are possible, and the application is not limited to this embodiment.
Assume that a connection point P(X, Y, Z) between the original image and the segmented images has projection coordinates (x, y) on the original image. Its coordinates (x1, y1) on each segmented image are, respectively:

Segmented image 1: (x1, y1) = (x, y)

Segmented image 2: (x1, y1) = (x - w/2, y)

Segmented image 3: (x1, y1) = (x, y - h/2)

Segmented image 4: (x1, y1) = (x - w/2, y - h/2)
assume that the camera intrinsic and distortion parameters of the original image are (x 0 ,y 0 F, k1, k2, k3, p1, p 2), wherein x 0 ,y 0 Representing principal point coordinates relative to an imaging plane, k1, k2, k3 being radial distortion parameters, p1, p2 being tangential distortion coefficients, radial distortion occurring during transformation of the camera coordinate system into a physical coordinate systemThe distortion occurs because the lens is not perfectly parallel to the image. The internal parameters and distortion parameters of each segmented image are respectively:
sheet 1: (x) 0 ,y 0 ,f,k1,k2,k3,p1,p2)
And (2): (x) 0 -w/2,y 0 ,f,k1,k2,k3,p1,p2)
3 rd: (x) 0 ,y 0 -h/2,f,k1,k2,k3,p1,p2)
4: (x) 0 -w/2,y 0 -h/2,f,k1,k2,k3,p1,p2)
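Steps S301 and S302 reduce to shifting the principal point by the segment's offset while reusing the distortion coefficients unchanged. The sketch below expresses this with OpenCV; the numeric intrinsics are illustrative only, and cv2.undistort is shown as one possible way to perform the per-segment correction.

import numpy as np
import cv2

def segment_camera(K_orig, dist_orig, index, w, h, rows=2, cols=2):
    """Per-segment intrinsics (S302): shift the principal point by the
    segment's offset in the original image; the distortion coefficients are
    reused unchanged (S301). Segments are numbered 1..rows*cols, left to
    right, top to bottom."""
    row, col = divmod(index - 1, cols)
    K = K_orig.copy()
    K[0, 2] -= col * (w / cols)  # x0 - w/2 for right-hand segments
    K[1, 2] -= row * (h / rows)  # y0 - h/2 for bottom segments
    return K, dist_orig

K_orig = np.array([[4000.0, 0.0, 7500.0],
                   [0.0, 4000.0, 5000.0],
                   [0.0, 0.0, 1.0]])            # illustrative values only
dist = np.array([0.01, -0.002, 0.0, 0.0, 0.0])  # OpenCV order: (k1, k2, p1, p2, k3)
K4, dist4 = segment_camera(K_orig, dist, 4, 15000, 10000)
# corrected = cv2.undistort(segment_image, K4, dist4)  # per-segment correction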
S105: and performing feature matching on each segmented image subjected to distortion correction, and acquiring and fusing depth maps of a plurality of segmented images to obtain a three-dimensional point cloud.
Fig. 4 is a flowchart of specific steps for implementing the fusion of depth maps of multiple segmented images in step S105 according to an embodiment of the present application, including steps S401 to S403.
S401: reconstructing three-dimensional coordinates of the plurality of segmented images to obtain depth information of the depth map.
S402: and sequentially taking each segmented image as a reference image, and projecting the depth estimation value of each segmented image to world coordinates.
S403: and extracting a three-dimensional point cloud from the three-dimensional point clouds of the adjacent segmented images, and reserving the three-dimensional point clouds with two identical adjacent estimates according to the estimated depth, normal line and reprojection error constraint of each segmented image in the world coordinates.
Fig. 5 is a flowchart of texture mapping on a three-dimensional mesh according to an embodiment of the present application, including steps S501 to S502.
S501: and constructing a three-dimensional grid according to the three-dimensional point cloud, so that any point in the three-dimensional point cloud falls on the vertex, the side or the inside of the corresponding triangle.
S502: texture mapping is performed on the three-dimensional grid.
The three-dimensional modeling method for large-resolution oblique photographic images provided by the present application is simple and effective. Applying the segmentation technique to oblique photography makes it possible to process larger, higher-resolution images and avoids the video-memory overflow that would otherwise prevent three-dimensional modeling during feature extraction and depth-map computation and fusion. In addition, compared with extracting feature points from a downsampled original image, the method improves both the positioning accuracy and the spatial distribution of the feature points, and depth-map computation and fusion are faster than before segmentation.
The embodiment of the application also provides a three-dimensional modeling apparatus 600 for large-resolution oblique photography image, as shown in fig. 6, the apparatus comprises: the device comprises a segmentation module 601, a feature extraction module 602, a coordinate conversion module 603, a correction module 604 and a fusion module 605.
The segmentation module 601 is configured to segment the original image, so that the video memory required for computing the depth map of each segmented image is smaller than or equal to the video memory of the computer.
The feature extraction module 602 is configured to perform feature extraction on each segmented image, and obtain feature points of each segmented image.
The coordinate conversion module 603 is configured to convert the pixel coordinates of each feature point in the corresponding segmented image into pixel coordinates on the original image, and determine the camera intrinsic parameters and distortion parameters of the original image.
The correction module 604 is configured to calculate camera parameters and distortion parameters for each segmented image and correct each segmented image.
The fusion module 605 is configured to perform feature matching on each segmented image after distortion correction, obtain and fuse depth maps of multiple segmented images, and obtain a three-dimensional point cloud.
The coordinate conversion module 603 is specifically configured to: and respectively calculating pixel coordinates of each characteristic point on the original image according to the position of the segmented image on the original image and the width and the height of the original image.
The coordinate conversion module 603 is further configured to: perform feature matching on the original image according to the pixel coordinates of each feature point on the original image; and perform aerial triangulation on the original image to obtain a measurement result, where the measurement result includes the camera intrinsic parameters of the original image and the connection points between the original image and each segmented image.
The correction module 604 is specifically configured to: taking the distortion parameters of the original image as the distortion parameters of each segmented image; and converting the camera internal parameters of the original image into the camera internal parameters of the segmented image according to the position of the segmented image on the original image and the width and the height of the original image.
The fusion module 605 is specifically configured to: reconstruct the three-dimensional coordinates of the segmented images to obtain the depth information of the depth maps; take each segmented image in turn as the reference image and project its depth estimates into world coordinates; and, subject to the estimated depth, normal, and reprojection-error constraints of each segmented image in world coordinates, extract a three-dimensional point cloud from the three-dimensional point clouds of adjacent segmented images and retain only points for which two adjacent estimates agree.
Further, the three-dimensional modeling apparatus 600 for a large-resolution oblique photographic image provided by the embodiment of the application further includes: a texture mapping module configured to construct a three-dimensional mesh from the three-dimensional point cloud such that any point in the point cloud falls on a vertex, an edge, or the interior of the corresponding triangle, and to perform texture mapping on the three-dimensional mesh.
Some of the modules of the apparatus of the embodiments of the present application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, classes, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
As shown in fig. 7, an embodiment of the present application further provides a three-dimensional modeling server for a large-resolution oblique photographic image, including a memory 701 and a processor 702; memory 701 is used to store computer-executable instructions; the processor 702 is configured to execute computer-executable instructions to implement the method for three-dimensional modeling of a large-resolution oblique photographic image described above in accordance with an embodiment of the present application.
The embodiment of the application also provides a computer-readable storage medium storing executable instructions which, when executed by a computer, implement the three-dimensional modeling method for a large-resolution oblique photographic image described above.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented by software plus the necessary hardware. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a storage medium such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions for causing a computer device (which may be a personal computer, a mobile terminal, a server, or a network device) to perform the methods described in the embodiments of the present application.
In this specification, each embodiment is described in a progressive manner; identical or similar parts of the embodiments may be referred to across embodiments, and each embodiment focuses on its differences from the others. All or portions of the present application are operational with numerous general-purpose or special-purpose computer system environments or configurations.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the present application; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced with equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (10)

1. A three-dimensional modeling method of a large-resolution oblique photographic image, comprising:
splitting the original image to enable the video memory required for calculating the depth map of each split image to be smaller than or equal to the video memory of a computer;
extracting the characteristics of each segmented image to obtain characteristic points of each segmented image;
converting pixel coordinates of each characteristic point in the corresponding segmented image into pixel coordinates on the original image, and determining camera internal parameters and distortion parameters of the original image;
calculating camera internal parameters and distortion parameters of each segmented image, and correcting each segmented image;
and performing feature matching on each segmented image subjected to distortion correction, and acquiring and fusing depth maps of a plurality of segmented images to obtain a three-dimensional point cloud.
2. The method according to claim 1, wherein the required memory for the depth map of each of the segmented images is determined by:
calculating the video memory required for the depth map of each segmented image according to the formula bytes = w × h × m; wherein w is the width of each segmented image, h is the height of each segmented image, and m is the memory space occupied by one pixel.
3. The method according to claim 1, wherein said converting the pixel coordinates of each of said feature points in the corresponding segmented image into pixel coordinates on said original image comprises:
and respectively calculating pixel coordinates of each characteristic point on the original image according to the position of the segmented image on the original image and the width and the height of the original image.
4. The method of claim 1, wherein the determining camera intrinsic and distortion parameters of the original image comprises:
performing feature matching on the original image according to the pixel coordinates of each feature point on the original image;
performing aerial triangulation on the original image to obtain a measurement result; wherein the measurement results include camera intrinsic parameters of the original image and connection points between the original image and each of the segmented images.
5. The method of claim 1, wherein said calculating camera intrinsic and distortion parameters for each of said segmented images comprises:
taking the distortion parameters of the original image as the distortion parameters of each segmented image;
and converting the camera internal parameters of the original image into the camera internal parameters of the segmented image according to the position of the segmented image on the original image and the width and the height of the original image.
6. The method of claim 1, wherein said fusing depth maps of a plurality of said segmented images comprises:
reconstructing three-dimensional coordinates of a plurality of segmented images to obtain depth information of the depth map;
sequentially taking each segmented image as a reference image, and projecting the depth estimates of each segmented image into world coordinates;
and, subject to the estimated depth, normal, and reprojection-error constraints of each segmented image in the world coordinates, extracting a three-dimensional point cloud from the three-dimensional point clouds of adjacent segmented images and retaining only points for which two adjacent estimates agree.
7. The method as recited in claim 1, further comprising:
constructing a three-dimensional mesh according to the three-dimensional point cloud, so that any point in the three-dimensional point cloud falls on a vertex, an edge, or the interior of the corresponding triangle;
and performing texture mapping on the three-dimensional mesh.
8. A three-dimensional modeling apparatus for a large-resolution oblique photographic image, comprising:
the segmentation module is used for segmenting the original image, so that the video memory required by calculating the depth map of each segmented image is smaller than or equal to the video memory of the computer;
the feature extraction module is used for extracting features of each segmented image to obtain feature points of each segmented image;
the coordinate conversion module is used for converting the pixel coordinates of each feature point in the corresponding segmented image into pixel coordinates on the original image and determining the camera internal parameters and distortion parameters of the original image;
the correction module is used for calculating camera internal parameters and distortion parameters of each segmented image and correcting each segmented image;
and the fusion module is used for carrying out feature matching on each segmented image subjected to distortion correction, obtaining and fusing depth maps of a plurality of segmented images, and obtaining a three-dimensional point cloud.
9. A three-dimensional modeling server for a large-resolution oblique photographic image, comprising a memory and a processor;
the memory is used for storing computer executable instructions;
the processor is configured to execute the computer-executable instructions to implement the method of any of claims 1-6.
10. A computer-readable storage medium storing executable instructions which, when executed by a computer, implement the method of any one of claims 1 to 6.
CN202310501817.6A 2023-05-06 2023-05-06 Three-dimensional modeling method and device for large-resolution oblique photographic image Pending CN116580152A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310501817.6A CN116580152A (en) 2023-05-06 2023-05-06 Three-dimensional modeling method and device for large-resolution oblique photographic image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310501817.6A CN116580152A (en) 2023-05-06 2023-05-06 Three-dimensional modeling method and device for large-resolution oblique photographic image

Publications (1)

Publication Number Publication Date
CN116580152A true CN116580152A (en) 2023-08-11

Family

ID=87542516

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310501817.6A Pending CN116580152A (en) 2023-05-06 2023-05-06 Three-dimensional modeling method and device for large-resolution oblique photographic image

Country Status (1)

Country Link
CN (1) CN116580152A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination