CN110135474A - Oblique aerial image matching method and system based on deep learning - Google Patents

Oblique aerial image matching method and system based on deep learning

Info

Publication number
CN110135474A
CN110135474A (application CN201910344297.6A)
Authority
CN
China
Prior art keywords
data set
deep learning
point
network
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910344297.6A
Other languages
Chinese (zh)
Inventor
吴春
郑振华
宋洁
张翼峰
孙佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Land Use And Urban Spatial Planning Research Center
Original Assignee
Wuhan Land Use And Urban Spatial Planning Research Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Land Use And Urban Spatial Planning Research Center filed Critical Wuhan Land Use And Urban Spatial Planning Research Center
Priority to CN201910344297.6A priority Critical patent/CN110135474A/en
Publication of CN110135474A publication Critical patent/CN110135474A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01C MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
    • G01C11/00 Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/02 Affine transformations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/75 Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
    • G06V10/751 Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The present invention provides an oblique aerial image matching method and system based on deep learning, comprising the following steps. Step 1: obtain a training sample data set and a performance test data set. Step 2: improve the Triplet network by selecting affine transformation model parameters and constructing a different loss function, and train the improved Triplet network with the training sample data set. Step 3: input the performance test data set into the trained Triplet network, output the matching results, and reject mismatched point pairs with the fundamental matrix F; that is, corresponding point pairs must satisfy the condition |x'Fx| ≤ threshold T. Experimental results show that, compared with conventional methods, for aerial images the method of the present invention avoids mistakenly deleting a large number of spatial feature point pairs as mismatched points.

Description

Oblique aerial image matching method and system based on deep learning
Technical field
The invention belongs to the field of photogrammetry within surveying science and technology, and more particularly relates to a matching method and system for oblique aerial images.
Background technique
With the rapid development of Earth observation technology, modern geographic information science is evolving from two-dimensional paper maps toward three-dimensional space. Oblique photogrammetry systems, which have developed rapidly in recent years, can effectively solve problems such as occlusion by urban high-rise buildings and the acquisition of facade texture information, and are increasingly becoming the main data source for 3D building modeling. Image feature matching is the key technique for recovering three-dimensional information in systems based on two-dimensional images, and comprises two core steps: feature detection and feature description. Feature detectors such as DoG, Harris, Harris-Affine, Hessian, Hessian-Affine, and MSERs, together with feature descriptors such as SIFT, SURF, and DAISY, belong to hand-engineered shallow learning models; the image matching process is shown in Fig. 1. Despite the dramatic growth of data volume and computing power, the main limitation of shallow learning, namely its reliance on finite samples and computing units, restricts its applicability in complex three-dimensional scenes such as urban areas. The fundamental reason is that oblique images within an urban coverage area usually cannot be approximated as a two-dimensional plane, so a single set of estimated affine transformation parameters does not fit everywhere, and the region features obtained by conventional methods lack robustness.
Summary of the invention
The present invention aims to solve the problem that, in oblique aerial image matching, drastic depth variations of ground-object targets make it impossible to approximately express the transformation between images with a single affine transformation model.
The technical solution of the present invention is an oblique aerial image matching method based on deep learning, comprising the following steps:
Step 1: obtain a training sample data set and a performance test data set;
wherein the training sample data set is obtained by sampling the HPatches data set, implemented as follows:
Let χ = {(A_i, P_i)}, i = 1, 2, ..., n, be the sample matching point pairs generated from the data set, where n is the number of matching pairs; the corresponding network outputs are (a_i, p_i), i = 1, 2, ..., n, with distance matrix D = [d_ij]_(n×n), d_ij = d(a_i, p_j).
Define the non-matching point nearest to a_i: j_min = argmin_j d(a_i, p_j), j = 1, 2, ..., n, j ≠ i;
Define the non-matching point nearest to p_i: k_min = argmin_k d(a_k, p_i), k = 1, 2, ..., n, k ≠ i;
Construct the input values of the Triplet network structure: if d(a_i, p_(j_min)) < d(a_(k_min), p_i), take the triplet (a_i, p_i, p_(j_min)); otherwise, take the triplet (p_i, a_i, a_(k_min)). When p_(j_min) (or a_(k_min)) is an isolated point, the corresponding pair (A_i, P_i) is rejected from the input sample pairs;
Step 2: improve the Triplet network by selecting affine transformation model parameters and constructing a different loss function, and train the improved Triplet network with the training sample data set;
Step 3: input the performance test data set into the trained Triplet network and output the matching results, and reject mismatched point pairs with the fundamental matrix F, i.e., corresponding point pairs must satisfy the condition |x'Fx| ≤ threshold T, where x and x' denote a corresponding point pair expressed in homogeneous coordinates in the two images, and F is a 3×3 matrix.
Further, the affine transformation model parameters selected in step 2 are those of the decomposition A = λ R(ψ) T_t R(φ), where R(·) denotes a 2×2 rotation matrix, T_t = diag(t, 1), and t = 1/cos θ; λ is the scale, φ is the longitude, θ is the latitude, and ψ is the swing angle.
Further, the loss function constructed in step 2 is as follows:
Loss = (1/n) Σ_(i=1)^n [d(a_i, p_i) - d(a_i, q_i)],
where a_i is the anchor (reference), p_i is the positive sample, q_i is the negative sample, and n is the number of corresponding point pairs.
Further, the performance test data set comprises 4 groups of natural image data sets of general two-dimensional planar scenes from the computer vision field, and 2 groups of multi-view images of urban-center three-dimensional scenes provided by the experimental project of Technical Commission III of the International Society for Photogrammetry and Remote Sensing (ISPRS).
Further, the value of the threshold T is 3 or 5.
The present invention also provides an oblique aerial image matching system based on deep learning, comprising the following modules:
a data acquisition module, configured to obtain the training sample data set and the performance test data set;
wherein the training sample data set is obtained by sampling the HPatches data set, implemented as follows:
let χ = {(A_i, P_i)}, i = 1, 2, ..., n, be the sample matching point pairs generated from the data set, where n is the number of matching pairs; the corresponding network outputs are (a_i, p_i), i = 1, 2, ..., n, with distance matrix D = [d_ij]_(n×n), d_ij = d(a_i, p_j);
define the non-matching point nearest to a_i: j_min = argmin_j d(a_i, p_j), j = 1, 2, ..., n, j ≠ i;
define the non-matching point nearest to p_i: k_min = argmin_k d(a_k, p_i), k = 1, 2, ..., n, k ≠ i;
construct the input values of the Triplet network structure: if d(a_i, p_(j_min)) < d(a_(k_min), p_i), take the triplet (a_i, p_i, p_(j_min)); otherwise, take the triplet (p_i, a_i, a_(k_min)); when p_(j_min) (or a_(k_min)) is an isolated point, the corresponding pair (A_i, P_i) is rejected from the input sample pairs;
a network improvement and training module, configured to improve the Triplet network by selecting affine transformation model parameters and constructing a different loss function, and to train the improved Triplet network with the training sample data set;
a mismatched point pair rejection module, configured to input the performance test data set into the trained Triplet network and output the matching results, and to reject mismatched point pairs with the fundamental matrix F, i.e., corresponding point pairs must satisfy the condition |x'Fx| ≤ threshold T, where x and x' denote a corresponding point pair expressed in homogeneous coordinates in the two images, and F is a 3×3 matrix.
Further, the affine transformation model parameters selected in the network improvement and training module are those of the decomposition A = λ R(ψ) T_t R(φ), where R(·) denotes a 2×2 rotation matrix, T_t = diag(t, 1), and t = 1/cos θ; λ is the scale, φ is the longitude, θ is the latitude, and ψ is the swing angle.
Further, the loss function constructed in the network improvement and training module is as follows:
Loss = (1/n) Σ_(i=1)^n [d(a_i, p_i) - d(a_i, q_i)],
where a_i is the anchor (reference), p_i is the positive sample, q_i is the negative sample, and n is the number of corresponding point pairs.
Further, the performance test data set comprises 4 groups of natural image data sets of general two-dimensional planar scenes from the computer vision field, and 2 groups of multi-view images of urban-center three-dimensional scenes provided by the experimental project of Technical Commission III of the International Society for Photogrammetry and Remote Sensing (ISPRS).
Further, the value of the threshold T is 3 or 5.
Compared with the prior art, the present invention has the following advantages: the differences that mainly exist between oblique images are viewpoint changes, and the translation and rotation parameter estimation common in previous deep learning literature is extended to affine transformation parameters suited to viewpoint change, realizing dynamic estimation of the affine transformation parameters; and by constructing a specific loss function, 128-dimensional feature vectors are obtained for which the Euclidean distance between matching samples is relatively minimal and the correlation between different dimensions is low.
Description of the drawings
Fig. 1 is the image feature matching process based on a shallow learning model (SIFT).
Fig. 2 shows the Siamese network in an embodiment of the present invention.
Fig. 3 shows the Triplet network in an embodiment of the present invention.
Fig. 4 shows the training data set sampling process in an embodiment of the present invention.
Fig. 5 shows algorithm performance test data sets in an embodiment of the present invention: (a)-(d) are 4 groups of general two-dimensional planar scenes from the computer vision field.
Fig. 6 shows an algorithm performance test data set in an embodiment of the present invention: a simple three-dimensional scene, where (a) forward view, (b) backward view, (c) left view, (d) right view, (e) nadir view.
Fig. 7 shows an algorithm performance test data set in an embodiment of the present invention: a complex three-dimensional scene, where (a) forward view, (b) backward view, (c) left view, (d) right view, (e) nadir view.
Fig. 8 shows the affine camera model in an embodiment of the present invention.
Fig. 9 is a schematic diagram of ASIFT affine parameter space sampling in an embodiment of the present invention.
Fig. 10 shows the homography matrix H of a planar scene in an embodiment of the present invention.
Fig. 11 shows the fundamental matrix F in an embodiment of the present invention.
Fig. 12 shows the comparative mismatch rejection experiment with the homography matrix H and the fundamental matrix F in an embodiment of the present invention, where (a), (b) are the SIFT matching results of the nadir-view and right-view aerial image pair with mismatched points rejected by the fundamental matrix, and (c), (d) are the mismatch rejection results of the same image pair with the RANSAC algorithm.
Specific embodiment
The technical solution of the present invention is further described below with reference to the accompanying drawings and embodiments.
In a general classification problem, the total number of classes is known and fixed, and each input value has its corresponding class. In image feature matching, the number of feature points extracted from each image of a matching pair is large, but only a fraction of them form corresponding (same-name) feature point pairs; the rest are "isolated points". To improve matching accuracy, besides conventional mismatch rejection strategies, two other points related to the point to be matched are usually also compared: the nearest point and the second-nearest point. Neural network models suitable for image feature matching are therefore based on "parallel" structures with multiple input (and output) values, and take two main forms:
I. the Siamese network with 2 input values (also known as the "twin network"; see Fig. 2);
II. the Triplet network with 3 input values (see Fig. 3); a minimal illustrative sketch follows.
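For illustration only, a minimal PyTorch sketch of this three-input form; the convolution layers, patch size, and names here are assumptions rather than the network of the invention, and only the shared-weight branch arrangement and the normalized 128-dimensional output follow the text:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DescriptorNet(nn.Module):
    """Toy patch descriptor branch (a stand-in for the L2-Net-style units cited later)."""
    def __init__(self, dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.BatchNorm2d(64), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, dim)

    def forward(self, x):
        v = self.fc(self.features(x).flatten(1))
        return F.normalize(v, dim=1)  # unit-length 128-d descriptor

class TripletNet(nn.Module):
    """Three parallel inputs passed through one shared-weight branch (cf. Fig. 3)."""
    def __init__(self):
        super().__init__()
        self.branch = DescriptorNet()

    def forward(self, anchor, positive, negative):
        return self.branch(anchor), self.branch(positive), self.branch(negative)

net = TripletNet()
a, p, q = (torch.randn(4, 1, 32, 32) for _ in range(3))  # 4 patches per input
ea, ep, eq = net(a, p, q)  # each output: (4, 128)
```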
For oblique image matching in the presence of viewpoint changes, the latest comparative experiments show that the "traditional matching strategy" (hand-designed feature detection + hand-designed feature description) outperforms "partially deep learning" strategies (learned feature detection + hand-crafted feature description, or hand-crafted feature detection + learned feature description) (Fan et al., 2017; Schönberger et al., 2017; Lenc et al., 2018). The present invention adopts a "learned feature detection + learned feature description" matching strategy, and the technical route comprises three main steps:
Step 1: obtain the training sample data set and the performance test data set.
I. Training sample data set sampling
The HPatches data set (Balntas et al., 2017), currently the newest and most widely used, is adopted; its advantages are rich real scenes, a large data volume, and suitability for multiple tasks. In the training data the number of non-matching points far exceeds the number of matching points, and traversing all possible combinations is unnecessary, so an effective sampling strategy is crucial (Tian et al., 2017).
Let χ = {(A_i, P_i)}, i = 1, 2, ..., n, be the sample matching point pairs generated from the data set, where n is the number of matching pairs; the corresponding network outputs are (a_i, p_i), i = 1, 2, ..., n, with distance matrix D = [d_ij]_(n×n), d_ij = d(a_i, p_j).
Define the non-matching point nearest to a_i: j_min = argmin_j d(a_i, p_j), j = 1, 2, ..., n, j ≠ i;
Define the non-matching point nearest to p_i: k_min = argmin_k d(a_k, p_i), k = 1, 2, ..., n, k ≠ i;
Construct the input values of the Triplet network structure: if d(a_i, p_(j_min)) < d(a_(k_min), p_i), take the triplet (a_i, p_i, p_(j_min)); otherwise, take the triplet (p_i, a_i, a_(k_min)), as shown in Fig. 4.
Unlike Mishkin et al. (2018), we add to the random selection mechanism an additional judgement of whether p_(j_min) (or a_(k_min)) is an "isolated point": when p_(j_min) (or a_(k_min)) is an isolated point, the corresponding pair is rejected from the input sample pairs. The reason is that among the feature points to be matched, only a fraction have true corresponding points, while a large number of "isolated points" without correspondences exist. This problem is more prominent in aerial image matching, because aerial images are dominated by urban buildings (complex three-dimensional scenes), and the ground-object information contained in images from different viewpoints is not fully identical.
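For illustration only, a sketch of the sampling rule described above, assuming the descriptors a and p have already been computed by the network; the function name and the boolean isolated-point masks are hypothetical stand-ins for the isolated-point judgement:

```python
import torch

def build_triplets(a, p, isolated_a=None, isolated_p=None):
    """Hardest-in-batch triplet construction with isolated-point rejection.

    a, p: (n, d) descriptor tensors of the matching pairs (a_i, p_i).
    isolated_a, isolated_p: optional (n,) boolean masks marking points judged
    to be "isolated" (no true correspondence); pairs whose selected negative
    is isolated are dropped, as described in the text above.
    """
    d = torch.cdist(a, p)                       # distance matrix D = [d_ij]
    n = d.size(0)
    off = d.masked_fill(torch.eye(n, dtype=torch.bool), float("inf"))
    jmin = off.argmin(dim=1)                    # nearest non-matching p_j to each a_i
    kmin = off.argmin(dim=0)                    # nearest non-matching a_k to each p_i

    triplets = []
    for i in range(n):
        if d[i, jmin[i]] < d[kmin[i], i]:       # negative drawn from the p side
            if isolated_p is not None and isolated_p[jmin[i]]:
                continue                        # reject: selected negative is isolated
            triplets.append((a[i], p[i], p[jmin[i]]))
        else:                                   # negative drawn from the a side
            if isolated_a is not None and isolated_a[kmin[i]]:
                continue
            triplets.append((p[i], a[i], a[kmin[i]]))
    return triplets
```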
II. Performance test data set
The experiments select 4 groups of natural image data sets of general two-dimensional planar scenes from the computer vision field (Fig. 5) and 2 groups of multi-view images of urban-center three-dimensional scenes provided by the experimental project of Technical Commission III of the International Society for Photogrammetry and Remote Sensing (ISPRS) (partial intercepted regions shown in Fig. 6 and Fig. 7). The area covered by the images in Fig. 6 has simple building structures, small building coverage, and low building heights, and can be regarded as a simple three-dimensional scene (experimental data set I). The area covered by the images in Fig. 7 has complex building structures and large building coverage, and can be regarded as a complex three-dimensional scene (experimental data set II). Both groups of scene data include interior orientation elements, exterior orientation elements, pixel size, radiometric resolution, and other information, and have undergone rigorous optical distortion correction. Since the main output of the algorithm is the pixel coordinates of the matched feature point pairs, algorithm performance is evaluated by the number of correct matching pairs (quantitative evaluation).
The experimental environment is as follows: Ubuntu 18.04.1 and Microsoft Windows 10 Professional 64-bit operating systems; development platforms Visual Studio 2015 and Matlab 2017a; deep learning tool PyTorch. The SIFT algorithm uses the open-source software VLFeat provided by Professor Andrew Zisserman's computer vision group at Oxford University, and the ASIFT algorithm uses the source program provided by the algorithm's original authors.
Table 1. Number of correct matching feature point pairs obtained by different methods in the two-dimensional planar scenes
The statistical results of Table 1 show that in two-dimensional planar scenes the affine-space parameters are continuous, and the equally spaced sparse-sampling matching strategy of the ASIFT algorithm can cover this affine space well, so the matching results of the ASIFT algorithm are substantially better than those of the deep learning method. Deep learning methods are generally suited to cases where a model cannot be established by physical laws or mathematical equations; conversely, when a model can be so established, deep learning methods often cannot play to their advantage.
Table 2. Number of correct matching point pairs obtained by different methods in the simple three-dimensional scene intercepted area
As shown in Table 2, in the simple three-dimensional scene with viewpoint changes, the SIFT algorithm essentially fails, while the ASIFT algorithm still shows some adaptability; however, for the forward-backward and left-right view image pairs the number of true matching pairs drops sharply, and the deep learning method behaves similarly, because the viewing directions of the two cameras are exactly opposite and the parallax contrast is most pronounced. Unlike in the two-dimensional planar scenes, the advantage of the deep learning method begins to show: except for the nadir-forward and nadir-right image pairs, the deep learning method obtains more correct matching pairs than the ASIFT algorithm.
Table 3. Number of correct matching point pairs obtained by different methods in the complex three-dimensional scene intercepted area
As shown in Table 3, the ASIFT algorithm, which retains some adaptability in the simple three-dimensional scene, almost completely fails in the complex three-dimensional scene, with matching failing for three image pairs (nadir-forward, forward-backward, left-right). The deep learning method obtains correct matching pairs in varying numbers for all pairs, although the counts drop markedly compared with the simple three-dimensional scene.
Step 2: deep network construction and training
Deep network construction comprises two parts: affine transformation model parameter selection, and construction of the loss function of the feature description network; the constructed network is then trained with the sampled training samples.
I. Affine transformation model parameter selection
Direction estimation is realized by Spatial Transformer Networks (Jaderberg et al., 2016). In direction estimation under viewpoint change, the covariant constraint needs to be extended to accommodate affine transformations, and the affine transformation model parameters must be selected. The affine transformation matrix A admits many decomposed forms, and in geometry estimation based on convolutional neural network models the choice of decomposition parameters can have a significant impact (Mishkin et al., 2018). The affine parameters adopted by the present invention are those of the decomposition A = λ R(ψ) T_t R(φ), where R(·) denotes a 2×2 rotation matrix, T_t = diag(t, 1), and t = 1/cos θ.
The above affine parameters are selected based primarily on the following two considerations:
1) The parameters have explicit physical meaning (Fig. 8; λ: scale; φ: longitude; θ: latitude; ψ: swing angle), can be dynamically estimated by the trained deep network, and are close to the true imaging process of aerial images.
2) They are convenient for comparison with the ASIFT algorithm. In planar scenes the ASIFT algorithm matches very well, but in three-dimensional scenes its results are unsatisfactory. The reason is that in three-dimensional scenes the affine-space parameters are not continuous, and the equal-interval sampling strategy of the ASIFT affine parameters (Fig. 9) cannot cover the entire affine space (Chen Min, 2014). Previous comparative-experiment literature rarely compares deep learning methods with ASIFT; the present invention addresses exactly this discontinuous affine-space parameter estimation problem through deep learning (see the sketch after this list).
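For illustration only, a short sketch composing the affine matrix from the four parameters under the decomposition A = λ R(ψ) T_t R(φ) given above; the function name and example values are hypothetical:

```python
import numpy as np

def affine_from_params(lam, phi, theta, psi):
    """Compose A = lam * R(psi) * T_t * R(phi): lam scale, phi longitude,
    theta latitude (inducing the tilt t = 1/cos(theta)), psi swing angle."""
    def rot(x):  # 2x2 rotation matrix
        return np.array([[np.cos(x), -np.sin(x)],
                         [np.sin(x),  np.cos(x)]])
    T_t = np.diag([1.0 / np.cos(theta), 1.0])
    return lam * rot(psi) @ T_t @ rot(phi)

# Example: scale 1.2, longitude 30 deg, latitude 45 deg, swing 10 deg.
A = affine_from_params(1.2, np.deg2rad(30), np.deg2rad(45), np.deg2rad(10))
```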
II. Loss function construction
The basic building blocks of the neural network used for feature description are taken from L2-Net (Tian et al., 2017), but we use a Triplet network structure different from the original model (Fig. 3), and the loss function construction also differs. Let the feature description vectors output by the network be (a_i, p_i, q_i), where a_i is the anchor (reference), p_i is the positive sample (the point matching a_i), and q_i is the negative sample (a point not matching a_i);
Loss function: Loss = (1/n) Σ_(i=1)^n [d(a_i, p_i) - d(a_i, q_i)]
In the ideal case d(a_i, p_i) < d(a_i, q_i), that is, the distance between a matching pair is smaller than the distance between a non-matching pair, so negative values appear during actual network training.
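For illustration only, a minimal sketch of this loss read as the batch mean of d(a_i, p_i) - d(a_i, q_i) over the network outputs; this reading is an assumption consistent with the remark that negative values appear during training:

```python
import torch

def triplet_descriptor_loss(ea, ep, eq):
    """Batch mean of d(a_i, p_i) - d(a_i, q_i) over (n, 128) descriptor tensors.
    The value goes negative once matching pairs are closer than non-matching
    ones, matching the behaviour noted in the text above."""
    d_pos = torch.norm(ea - ep, dim=1)   # distances of matching pairs
    d_neg = torch.norm(ea - eq, dim=1)   # distances to negative samples
    return (d_pos - d_neg).mean()
```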
Step 3: mismatched point pair rejection
A large number of mismatched point pairs exist in the network output, so appropriate measures must be taken to reject them. In the computer vision field, the natural images widely used in viewpoint-transformation research are mostly planar or near-planar scenes. In a planar scene, all corresponding point pairs satisfy the same homography matrix transformation model x' = Hx, as shown in Fig. 10: π is a space plane, x_π is a point on the space plane, c and c' are the projection rays, x and x' are the projections of the planar point x_π expressed in homogeneous coordinates in the image planes, i.e., the corresponding (same-name) image point pair, and H is a 3×3 matrix. In image matching, the planar-scene homography matrix H can be used both for early sparse sampling of the affine-space parameters and for later RANSAC-based mismatch deletion.
Oblique aerial images are dominated by complex urban three-dimensional scenes in which the depth of ground objects varies discontinuously, so the corresponding point pairs in the images do not satisfy a single homography transformation model; this is also the reason why many well-known operators in the computer vision field cannot obtain ideal results here. Since the exterior orientation elements of the images are known, rejecting mismatches with the fundamental matrix F (Fig. 11) better fits reality: when x and x' denote a corresponding point pair expressed in homogeneous coordinates in the left and right images, the equation x'Fx = 0 holds, where F is a 3×3 matrix. Errors inevitably exist in the imaging and matching processes, so a threshold of 3 to 5 pixels is normally set, and corresponding point pairs must satisfy the condition |x'Fx| ≤ 3.
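For illustration only, a sketch of this rejection rule, assuming F has already been obtained (for example, from the known exterior orientation elements) and applying the threshold directly to the algebraic residual |x'Fx| as in the text; the function name is hypothetical:

```python
import numpy as np

def reject_with_F(x, x_prime, F, T=3.0):
    """Keep correspondences whose residual |x'^T F x| <= T.

    x, x_prime: (n, 3) homogeneous point coordinates in the two images.
    F: 3x3 fundamental matrix. T: threshold, typically 3 to 5.
    Returns a boolean mask of accepted point pairs.
    """
    residual = np.abs(np.einsum("ni,ij,nj->n", x_prime, F, x))
    return residual <= T
```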
As shown in Fig. 12, (a) and (b) are the SIFT matching results of the nadir-view and right-view aerial image pair with mismatched points rejected by the fundamental matrix, and (c) and (d) are the mismatch rejection results of the same image pair with the RANSAC algorithm. The results show that, for aerial images, approximating the relationship between corresponding point pairs with a homography matrix causes a large number of spatial feature points (such as building corner points) to be mistakenly deleted as mismatched points.
An embodiment of the present invention also provides an oblique aerial image matching system based on deep learning, comprising the following modules:
a data acquisition module, configured to obtain the training sample data set and the performance test data set;
wherein the training sample data set is obtained by sampling the HPatches data set, implemented as follows:
let χ = {(A_i, P_i)}, i = 1, 2, ..., n, be the sample matching point pairs generated from the data set, where n is the number of matching pairs; the corresponding network outputs are (a_i, p_i), i = 1, 2, ..., n, with distance matrix D = [d_ij]_(n×n), d_ij = d(a_i, p_j);
define the non-matching point nearest to a_i: j_min = argmin_j d(a_i, p_j), j = 1, 2, ..., n, j ≠ i;
define the non-matching point nearest to p_i: k_min = argmin_k d(a_k, p_i), k = 1, 2, ..., n, k ≠ i;
construct the input values of the Triplet network structure: if d(a_i, p_(j_min)) < d(a_(k_min), p_i), take the triplet (a_i, p_i, p_(j_min)); otherwise, take the triplet (p_i, a_i, a_(k_min)); when p_(j_min) (or a_(k_min)) is an isolated point, the corresponding pair (A_i, P_i) is rejected from the input sample pairs;
a network improvement and training module, configured to improve the Triplet network by selecting affine transformation model parameters and constructing a different loss function, and to train the improved Triplet network with the training sample data set;
a mismatched point pair rejection module, configured to input the performance test data set into the trained Triplet network and output the matching results, and to reject mismatched point pairs with the fundamental matrix F, i.e., corresponding point pairs must satisfy the condition |x'Fx| ≤ threshold T, where x and x' denote a corresponding point pair expressed in homogeneous coordinates in the two images, and F is a 3×3 matrix.
Wherein the affine transformation model parameters selected in the network improvement and training module are those of the decomposition A = λ R(ψ) T_t R(φ), where R(·) denotes a 2×2 rotation matrix, T_t = diag(t, 1), and t = 1/cos θ; λ is the scale, φ is the longitude, θ is the latitude, and ψ is the swing angle.
Wherein the loss function constructed in the network improvement and training module is as follows:
Loss = (1/n) Σ_(i=1)^n [d(a_i, p_i) - d(a_i, q_i)],
where a_i is the anchor (reference), p_i is the positive sample, q_i is the negative sample, and n is the number of corresponding point pairs.
The specific implementation of each module corresponds to that of the respective method step and is not repeated here.
The relevant references are as follows:
[1] Chen Min. 2014. Research on feature matching techniques for multi-source remote sensing images [D]. PhD dissertation. Wuhan: Wuhan University.
[2] Balntas V, Lenc K, Vedaldi A, et al. 2017. HPatches: A benchmark and evaluation of handcrafted and learned local descriptors [C]. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
[3] Fan B, Kong Q, Wang X, et al. 2017. A performance evaluation of local features for image-based 3D reconstruction [C]. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
[4] Jaderberg M, Simonyan K, Zisserman A, et al. 2016. Spatial transformer networks [C]. Proceedings of Advances in Neural Information Processing Systems.
[5] Lenc K, Vedaldi A. 2017. Large scale evaluation of local image feature detectors on homography datasets [J].
[6] Mishkin D, Radenovic F, Matas J. 2018. Repeatability is not enough: Learning discriminative affine regions via discriminability [C]. Proceedings of Computing Research Repository.
[7] Schönberger J L, Hardmeier H, Sattler T, et al. 2017. Comparative evaluation of hand-crafted and learned local features [C]. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
[8] Tian Y, Fan B, Wu F. 2017. L2-Net: Deep learning of discriminative patch descriptor in Euclidean space [C]. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
[9] Yu G, Morel J M. 2009. A fully affine invariant image comparison method [C]. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing.
The specific embodiments described herein are merely illustrative of the spirit of the present invention. Those skilled in the art to which the present invention belongs may make various modifications or additions to the described embodiments, or substitute them in a similar manner, without departing from the spirit of the present invention or exceeding the scope defined by the appended claims.

Claims (10)

1. An oblique aerial image matching method based on deep learning, characterized by comprising the following steps:
Step 1: obtain a training sample data set and a performance test data set;
wherein the training sample data set is obtained by sampling the HPatches data set, implemented as follows:
let χ = {(A_i, P_i)}, i = 1, 2, ..., n, be the sample matching point pairs generated from the data set, where n is the number of matching pairs; the corresponding network outputs are (a_i, p_i), i = 1, 2, ..., n, with distance matrix D = [d_ij]_(n×n), d_ij = d(a_i, p_j);
define the non-matching point nearest to a_i: j_min = argmin_j d(a_i, p_j), j = 1, 2, ..., n, j ≠ i;
define the non-matching point nearest to p_i: k_min = argmin_k d(a_k, p_i), k = 1, 2, ..., n, k ≠ i;
construct the input values of the Triplet network structure: if d(a_i, p_(j_min)) < d(a_(k_min), p_i), take the triplet (a_i, p_i, p_(j_min)); otherwise, take the triplet (p_i, a_i, a_(k_min)); when p_(j_min) (or a_(k_min)) is an isolated point, the corresponding pair (A_i, P_i) is rejected from the input sample pairs;
Step 2: improve the Triplet network by selecting affine transformation model parameters and constructing a different loss function, and train the improved Triplet network with the training sample data set;
Step 3: input the performance test data set into the trained Triplet network and output the matching results, and reject mismatched point pairs with the fundamental matrix F, i.e., corresponding point pairs must satisfy the condition |x'Fx| ≤ threshold T, where x and x' denote a corresponding point pair expressed in homogeneous coordinates in the two images, and F is a 3×3 matrix.
2. The oblique aerial image matching method based on deep learning according to claim 1, characterized in that: the affine transformation model parameters selected in step 2 are those of the decomposition A = λ R(ψ) T_t R(φ), where R(·) denotes a 2×2 rotation matrix, T_t = diag(t, 1), and t = 1/cos θ; λ is the scale, φ is the longitude, θ is the latitude, and ψ is the swing angle.
3. The oblique aerial image matching method based on deep learning according to claim 1, characterized in that: the loss function constructed in step 2 is as follows,
Loss = (1/n) Σ_(i=1)^n [d(a_i, p_i) - d(a_i, q_i)],
where a_i is the anchor (reference), p_i is the positive sample, q_i is the negative sample, and n is the number of corresponding point pairs.
4. The oblique aerial image matching method based on deep learning according to claim 1, characterized in that: the performance test data set comprises 4 groups of natural image data sets of general two-dimensional planar scenes from the computer vision field, and 2 groups of multi-view images of urban-center three-dimensional scenes provided by the experimental project of Technical Commission III of the International Society for Photogrammetry and Remote Sensing (ISPRS).
5. The oblique aerial image matching method based on deep learning according to claim 1, characterized in that: the value of the threshold T is 3 or 5.
6. An oblique aerial image matching system based on deep learning, characterized by comprising the following modules:
a data acquisition module, configured to obtain a training sample data set and a performance test data set;
wherein the training sample data set is obtained by sampling the HPatches data set, implemented as follows:
let χ = {(A_i, P_i)}, i = 1, 2, ..., n, be the sample matching point pairs generated from the data set, where n is the number of matching pairs; the corresponding network outputs are (a_i, p_i), i = 1, 2, ..., n, with distance matrix D = [d_ij]_(n×n), d_ij = d(a_i, p_j);
define the non-matching point nearest to a_i: j_min = argmin_j d(a_i, p_j), j = 1, 2, ..., n, j ≠ i;
define the non-matching point nearest to p_i: k_min = argmin_k d(a_k, p_i), k = 1, 2, ..., n, k ≠ i;
construct the input values of the Triplet network structure: if d(a_i, p_(j_min)) < d(a_(k_min), p_i), take the triplet (a_i, p_i, p_(j_min)); otherwise, take the triplet (p_i, a_i, a_(k_min)); when p_(j_min) (or a_(k_min)) is an isolated point, the corresponding pair (A_i, P_i) is rejected from the input sample pairs;
a network improvement and training module, configured to improve the Triplet network by selecting affine transformation model parameters and constructing a different loss function, and to train the improved Triplet network with the training sample data set;
a mismatched point pair rejection module, configured to input the performance test data set into the trained Triplet network and output the matching results, and to reject mismatched point pairs with the fundamental matrix F, i.e., corresponding point pairs must satisfy the condition |x'Fx| ≤ threshold T, where x and x' denote a corresponding point pair expressed in homogeneous coordinates in the two images, and F is a 3×3 matrix.
7. The oblique aerial image matching system based on deep learning according to claim 6, characterized in that: the affine transformation model parameters selected in the network improvement and training module are those of the decomposition A = λ R(ψ) T_t R(φ), where R(·) denotes a 2×2 rotation matrix, T_t = diag(t, 1), and t = 1/cos θ; λ is the scale, φ is the longitude, θ is the latitude, and ψ is the swing angle.
8. The oblique aerial image matching system based on deep learning according to claim 6, characterized in that: the loss function constructed in the network improvement and training module is as follows,
Loss = (1/n) Σ_(i=1)^n [d(a_i, p_i) - d(a_i, q_i)],
where a_i is the anchor (reference), p_i is the positive sample, q_i is the negative sample, and n is the number of corresponding point pairs.
9. The oblique aerial image matching system based on deep learning according to claim 6, characterized in that: the performance test data set comprises 4 groups of natural image data sets of general two-dimensional planar scenes from the computer vision field, and 2 groups of multi-view images of urban-center three-dimensional scenes provided by the experimental project of Technical Commission III of the International Society for Photogrammetry and Remote Sensing (ISPRS).
10. The oblique aerial image matching system based on deep learning according to claim 6, characterized in that: the value of the threshold T is 3 or 5.
CN201910344297.6A 2019-04-26 2019-04-26 Oblique aerial image matching method and system based on deep learning Pending CN110135474A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910344297.6A CN110135474A (en) Oblique aerial image matching method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910344297.6A CN110135474A (en) Oblique aerial image matching method and system based on deep learning

Publications (1)

Publication Number Publication Date
CN110135474A true CN110135474A (en) 2019-08-16

Family

ID=67575342

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910344297.6A Pending CN110135474A (en) Oblique aerial image matching method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN110135474A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113112529A (en) * 2021-03-08 2021-07-13 武汉市土地利用和城市空间规划研究中心 Dense matching mismatching point processing method based on region adjacent point search
CN114937393A (en) * 2022-03-30 2022-08-23 中国石油化工股份有限公司 Petrochemical enterprise high-altitude operation simulation training system based on augmented reality

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104794490A (en) * 2015-04-28 2015-07-22 中测新图(北京)遥感技术有限责任公司 Slanted image homonymy point acquisition method and slanted image homonymy point acquisition device for aerial multi-view images
CN108446627A (en) * 2018-03-19 2018-08-24 南京信息工程大学 Aerial image matching method based on local deep hashing
US20190005670A1 (en) * 2017-06-28 2019-01-03 Magic Leap, Inc. Method and system for performing simultaneous localization and mapping using convolutional image transformation
CN109344845A (en) * 2018-09-21 2019-02-15 哈尔滨工业大学 Feature matching method based on a Triplet deep neural network structure

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104794490A (en) * 2015-04-28 2015-07-22 中测新图(北京)遥感技术有限责任公司 Slanted image homonymy point acquisition method and slanted image homonymy point acquisition device for aerial multi-view images
US20190005670A1 (en) * 2017-06-28 2019-01-03 Magic Leap, Inc. Method and system for performing simultaneous localization and mapping using convolutional image transformation
CN108446627A (en) * 2018-03-19 2018-08-24 南京信息工程大学 Aerial image matching method based on local deep hashing
CN109344845A (en) * 2018-09-21 2019-02-15 哈尔滨工业大学 Feature matching method based on a Triplet deep neural network structure

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
ANASTASIYA MISHCHUK ET AL: "Working hard to know your neighbor's margins: Local descriptor learning loss", 《ARXIV:1705.10872V4 [CS.CV]》 *
DMYTRO MISHKIN ET.AL: "Learning Discriminative Affine Regions via Discriminability", 《ARXIV:1711.06704V2 [CS.CV]》 *
HANI ALTWAIJRY ET.AL: "Learning to Detect and Match Keypoints with Deep Architectures", 《BRITISH MACHINE VISION CONFERENCE》 *
李竹林 et al.: "Image Stereo Matching Technology and Its Development and Application", 31 July 2007, Xi'an: Shaanxi Science and Technology Press *
每天都要深度学习: "Deep learning notes (2): triplet loss", 《HTTPS://BLOG.CSDN.NET/LUCIFER_ZZQ/ARTICLE/DETAILS/81271260》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113112529A (en) * 2021-03-08 2021-07-13 武汉市土地利用和城市空间规划研究中心 Dense matching mismatching point processing method based on region adjacent point search
CN114937393A (en) * 2022-03-30 2022-08-23 中国石油化工股份有限公司 Petrochemical enterprise high-altitude operation simulation training system based on augmented reality
CN114937393B (en) * 2022-03-30 2023-10-13 中国石油化工股份有限公司 Petrochemical enterprise high-altitude operation simulation training system based on augmented reality

Similar Documents

Publication Publication Date Title
CN110135455B (en) Image matching method, device and computer readable storage medium
Pham et al. Lcd: Learned cross-domain descriptors for 2d-3d matching
JP7453470B2 (en) 3D reconstruction and related interactions, measurement methods and related devices and equipment
US10043097B2 (en) Image abstraction system
CN112270249A (en) Target pose estimation method fusing RGB-D visual features
CN108369741A (en) Method and system for registration data
CN103700099A (en) Rotation- and scale-invariant wide-baseline stereo matching method
JP2013186902A (en) Vehicle detection method and apparatus
CN110222572A (en) Tracking, device, electronic equipment and storage medium
Zhang et al. Vehicle global 6-DoF pose estimation under traffic surveillance camera
CN113674400A (en) Spectrum three-dimensional reconstruction method and system based on repositioning technology and storage medium
CN104463962B (en) Three-dimensional scene reconstruction method based on GPS information video
WO2022247126A1 (en) Visual localization method and apparatus, and device, medium and program
CN110135474A (en) Oblique aerial image matching method and system based on deep learning
Sun et al. A fast underwater calibration method based on vanishing point optimization of two orthogonal parallel lines
CN112329662B (en) Multi-view saliency estimation method based on unsupervised learning
CN113902802A (en) Visual positioning method and related device, electronic equipment and storage medium
Jiang et al. Contrastive learning of features between images and lidar
Lee et al. Robust uncertainty-aware multiview triangulation
CN117197333A (en) Space target reconstruction and pose estimation method and system based on multi-view vision
CN114998630B (en) Ground-to-air image registration method from coarse to fine
Budianti et al. Background blurring and removal for 3d modelling of cultural heritage objects
CN114723973A (en) Image feature matching method and device for large-scale change robustness
CN113570535A (en) Visual positioning method and related device and equipment
CN111414802B (en) Protein data characteristic extraction method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190816)