CN110135474A - Oblique aerial image matching method and system based on deep learning - Google Patents
Oblique aerial image matching method and system based on deep learning
- Publication number
- CN110135474A CN110135474A CN201910344297.6A CN201910344297A CN110135474A CN 110135474 A CN110135474 A CN 110135474A CN 201910344297 A CN201910344297 A CN 201910344297A CN 110135474 A CN110135474 A CN 110135474A
- Authority
- CN
- China
- Prior art keywords
- data set
- deep learning
- point
- network
- matching
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 33
- 238000013135 deep learning Methods 0.000 title claims abstract description 26
- 238000012549 training Methods 0.000 claims abstract description 34
- 239000011159 matrix material Substances 0.000 claims abstract description 31
- 238000013480 data collection Methods 0.000 claims abstract description 25
- 238000011056 performance test Methods 0.000 claims abstract description 21
- 230000006870 function Effects 0.000 claims abstract description 16
- 238000010276 construction Methods 0.000 claims abstract description 8
- 230000006872 improvement Effects 0.000 claims description 8
- 238000012217 deletion Methods 0.000 abstract description 4
- 230000037430 deletion Effects 0.000 abstract description 4
- 238000007796 conventional method Methods 0.000 abstract description 2
- 238000004422 calculation algorithm Methods 0.000 description 19
- 238000005516 engineering process Methods 0.000 description 7
- 230000008569 process Effects 0.000 description 7
- 230000009466 transformation Effects 0.000 description 7
- 230000000007 visual effect Effects 0.000 description 7
- 230000008859 change Effects 0.000 description 6
- 238000001514 detection method Methods 0.000 description 6
- 238000005070 sampling Methods 0.000 description 5
- 230000008901 benefit Effects 0.000 description 4
- 238000002474 experimental method Methods 0.000 description 4
- 238000003909 pattern recognition Methods 0.000 description 4
- 238000011156 evaluation Methods 0.000 description 3
- 230000000052 comparative effect Effects 0.000 description 2
- 238000011161 development Methods 0.000 description 2
- 230000000694 effects Effects 0.000 description 2
- 238000003384 imaging method Methods 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 239000013598 vector Substances 0.000 description 2
- 230000006978 adaptation Effects 0.000 description 1
- 238000007792 addition Methods 0.000 description 1
- 238000013528 artificial neural network Methods 0.000 description 1
- 230000006399 behavior Effects 0.000 description 1
- 238000004364 calculation method Methods 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 238000010835 comparative analysis Methods 0.000 description 1
- 230000007423 decrease Effects 0.000 description 1
- 238000010586 diagram Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000003062 neural network model Methods 0.000 description 1
- 238000013519 translation Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01C—MEASURING DISTANCES, LEVELS OR BEARINGS; SURVEYING; NAVIGATION; GYROSCOPIC INSTRUMENTS; PHOTOGRAMMETRY OR VIDEOGRAMMETRY
- G01C11/00—Photogrammetry or videogrammetry, e.g. stereogrammetry; Photographic surveying
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T3/00—Geometric image transformations in the plane of the image
- G06T3/02—Affine transformations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Multimedia (AREA)
- Computing Systems (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Software Systems (AREA)
- Evolutionary Computation (AREA)
- Databases & Information Systems (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The present invention provides an oblique aerial image matching method and system based on deep learning, comprising the following steps. Step 1: obtain a training sample data set and a performance test data set. Step 2: improve the Triplet network by selecting different affine transform model parameters and constructing a different loss function, and train the improved Triplet network with the training sample data set. Step 3: input the performance test data set into the trained Triplet network, output the matching results, and reject mismatched point pairs with the fundamental matrix F, i.e. each corresponding point pair must satisfy the condition x'Fx ≤ threshold T. Experimental results show that, compared with conventional methods, for aerial images the method of the present invention avoids mistakenly deleting a large number of spatial feature point pairs as mismatches.
Description
Technical field
The invention belongs to the photogrammetric technology field within the discipline of surveying science and technology, and more particularly relates to a matching method and system for oblique aerial images.
Background technique
With the rapid development of Earth observation technology, modern geographic information science is moving from two-dimensional paper maps toward three-dimensional space. Oblique photogrammetry systems, which have developed rapidly in recent years, can effectively solve problems such as occlusion by urban high-rise buildings and acquisition of façade texture information, and are increasingly becoming the main data source for 3D building modeling. Feature matching is the key technology for recovering three-dimensional information in systems based on two-dimensional images, and comprises two core steps: feature detection and feature description. Feature detectors such as DOG, Harris, Harris-Affine, Hessian, Hessian-Affine and MSERs, and feature descriptors such as SIFT, SURF and DAISY, belong to hand-engineered shallow learning models; the image matching process is shown in Fig. 1. With the dramatic increase in data volume and computing capability, the limitation of shallow learning, which is designed for finite samples and limited computing units, is its restricted applicability in complex three-dimensional scenes such as urban areas. The fundamental reason is that oblique images covering urban areas usually cannot be approximated as a two-dimensional plane, the affine transform parameters to be estimated differ from patch to patch, and the robustness of region features obtained by conventional methods is weak.
Summary of the invention
The present invention aims to solve the problem that, in oblique aerial image matching, the drastic depth variation of ground objects means the images cannot be approximated by a single affine transform model.
The technical scheme of the present invention is an oblique aerial image matching method based on deep learning, comprising the following steps:

Step 1: obtain a training sample data set and a performance test data set;

wherein the training sample data set is obtained by sampling the HPatches data set, implemented as follows:

let χ = {(A_i, P_i)}, i = 1, 2, …, n, be the sample matching point pairs generated from the data set, where n is the number of matching pairs, the corresponding network output values are (a_i, p_i), i = 1, 2, …, n, and the distance matrix is D = [d_ij]_{n×n};

define the non-matching point closest to a_i: j_min = argmin_j d(a_i, p_j), j = 1, 2, …, n, j ≠ i;

define the non-matching point closest to p_i: k_min = argmin_k d(a_k, p_i), k = 1, 2, …, n, k ≠ i;

construct the input values of the Triplet network structure: if d(a_i, p_{j_min}) < d(a_{k_min}, p_i), take the triplet (a_i, p_i, p_{j_min}); otherwise take (p_i, a_i, a_{k_min}); when a_i (or p_i) is judged to be an isolated point, the corresponding pair is rejected from the input sample pairs;

Step 2: improve the Triplet network by selecting different affine transform model parameters and constructing a different loss function, and train the improved Triplet network with the training sample data set;

Step 3: input the performance test data set into the trained Triplet network and output the matching results, then reject mismatched point pairs with the fundamental matrix F, i.e. each corresponding point pair must satisfy the condition x'Fx ≤ threshold T; where x and x' denote a corresponding point pair expressed in homogeneous coordinates in the two images, and F is a 3×3 matrix.
Further, the affine transform model parameters selected in step 2 are (λ, φ, θ, ψ), where λ is the scale, φ the longitude, θ the latitude, and ψ the camera spin angle.
Further, in the loss function constructed in step 2, a_i is the anchor, p_i the positive sample, q_i the negative sample, and n the number of corresponding point pairs.
Further, the performance test data set comprises 4 groups of natural image data sets of general two-dimensional planar scenes from the computer vision field, and 2 groups of urban-center three-dimensional-scene multi-view images provided by the benchmark project of Commission III of the International Society for Photogrammetry and Remote Sensing (ISPRS).
Further, the threshold T takes the value 3 or 5.
The present invention also provides an oblique aerial image matching system based on deep learning, comprising the following modules:

a data acquisition module, for obtaining the training sample data set and the performance test data set;

wherein the training sample data set is obtained by sampling the HPatches data set, implemented as follows:

let χ = {(A_i, P_i)}, i = 1, 2, …, n, be the sample matching point pairs generated from the data set, where n is the number of matching pairs, the corresponding network output values are (a_i, p_i), i = 1, 2, …, n, and the distance matrix is D = [d_ij]_{n×n};

define the non-matching point closest to a_i: j_min = argmin_j d(a_i, p_j), j = 1, 2, …, n, j ≠ i;

define the non-matching point closest to p_i: k_min = argmin_k d(a_k, p_i), k = 1, 2, …, n, k ≠ i;

construct the input values of the Triplet network structure: if d(a_i, p_{j_min}) < d(a_{k_min}, p_i), take the triplet (a_i, p_i, p_{j_min}); otherwise take (p_i, a_i, a_{k_min}); when a_i (or p_i) is judged to be an isolated point, the corresponding pair is rejected from the input sample pairs;

a network improvement and training module, for improving the Triplet network by selecting different affine transform model parameters and constructing a different loss function, and training the improved Triplet network with the training sample data set;

a mismatched point pair rejection module, for inputting the performance test data set into the trained Triplet network, outputting the matching results, and rejecting mismatched point pairs with the fundamental matrix F, i.e. each corresponding point pair must satisfy the condition x'Fx ≤ threshold T; where x and x' denote a corresponding point pair expressed in homogeneous coordinates in the two images, and F is a 3×3 matrix.
Further, the affine transform model parameters selected in the network improvement and training module are (λ, φ, θ, ψ), where λ is the scale, φ the longitude, θ the latitude, and ψ the camera spin angle.
Further, in the loss function constructed in the network improvement and training module, a_i is the anchor, p_i the positive sample, q_i the negative sample, and n the number of corresponding point pairs.
Further, the performance test data set comprises 4 groups of natural image data sets of general two-dimensional planar scenes from the computer vision field, and 2 groups of urban-center three-dimensional-scene multi-view images provided by the benchmark project of ISPRS Commission III.
Further, the threshold T takes the value 3 or 5.
Compared with the prior art, the advantages of the present invention are: the differences between oblique images are mainly viewpoint changes, and the translation and rotation parameter estimation common in previous deep learning literature is extended to affine transform parameters suited to viewpoint change, realizing dynamic estimation of the affine transform parameters; and by constructing a specific loss function, 128-dimensional low-dimensional feature vectors are obtained in which the Euclidean distance between matched samples is relatively minimal and the correlation between different dimensions is low.
Brief description of the drawings
Fig. 1 shows the feature matching process based on a shallow learning model (SIFT).
Fig. 2 shows the Siamese network in an embodiment of the present invention.
Fig. 3 shows the Triplet network in an embodiment of the present invention.
Fig. 4 shows the training data set sampling process in an embodiment of the present invention.
Fig. 5 shows algorithm performance test data sets in an embodiment of the present invention: (a)-(d) are 4 groups of general two-dimensional planar scenes from the computer vision field.
Fig. 6 shows an algorithm performance test data set in an embodiment of the present invention: a simple three-dimensional scene, where (a) is the forward view, (b) the backward view, (c) the left view, (d) the right view, and (e) the nadir view.
Fig. 7 shows an algorithm performance test data set in an embodiment of the present invention: a complex three-dimensional scene, where (a) is the forward view, (b) the backward view, (c) the left view, (d) the right view, and (e) the nadir view.
Fig. 8 shows the affine camera model in an embodiment of the present invention.
Fig. 9 shows a schematic of ASIFT affine parameter space sampling in an embodiment of the present invention.
Fig. 10 shows the planar-scene homography matrix H in an embodiment of the present invention.
Fig. 11 shows the fundamental matrix F in an embodiment of the present invention.
Fig. 12 shows the comparative experiment on mismatch rejection with the homography matrix H and the fundamental matrix F in an embodiment of the present invention, where (a) and (b) are SIFT matching results for the nadir-view/right-view aerial image pair with mismatches rejected by the fundamental matrix, and (c) and (d) are RANSAC mismatch-rejection results for the same image pair.
Detailed description of the embodiments
The technical solution of the present invention is further described below with reference to the accompanying drawings and embodiments.
In a general classification problem, the total number of classes is known and fixed, and every input value has a corresponding class. In image feature matching, the feature points extracted from the two images of a pair are numerous, but only a portion of them form corresponding point pairs; the rest are "isolated points". To improve matching accuracy, besides conventional mismatch-rejection strategies, two other points related to the point to be matched are usually also compared: the nearest point and the second-nearest point. A neural network model suited to image feature matching is therefore based on a "parallel" multi-input structure, of which there are mainly two forms:
I. the Siamese network with 2 input values (also known as the "twin network", Fig. 2);
II. the Triplet network with 3 input values (Fig. 3).
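The defining property of both "parallel" forms is weight sharing: the same embedding network is applied to every input. The following toy sketch illustrates this with a stand-in linear layer plus L2 normalization rather than the patent's L2Net-style CNN; all function names and the embedding itself are illustrative assumptions, not the patent's network.

```python
import math

def embed(patch, weights):
    """Toy shared embedding: one linear layer followed by L2 normalization.
    Stands in for the patent's CNN branch; purely illustrative."""
    v = [sum(w * x for w, x in zip(row, patch)) for row in weights]
    norm = math.sqrt(sum(t * t for t in v)) or 1.0
    return [t / norm for t in v]

def descriptor_distance(p1, p2, weights):
    """Siamese branch: BOTH patches pass through the SAME weights; the
    descriptor distance is the Euclidean distance of the embeddings.
    A Triplet network applies the same embed() to a third input as well."""
    e1, e2 = embed(p1, weights), embed(p2, weights)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(e1, e2)))
```

The Triplet form simply evaluates `embed` on an anchor, a positive and a negative patch with the same `weights`, which is what makes the loss of step 2 well defined.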
For matching oblique images with viewpoint changes, the latest comparative experiments show that the "traditional matching strategy" (hand-designed feature detection + hand-designed feature description) outperforms matching strategies that use deep learning only locally (learned feature detection + hand-crafted feature description, or hand-crafted feature detection + learned feature description) (Fan et al., 2017; Schönberger et al., 2017; Lenc et al., 2018). The present invention adopts the "learned feature detection + learned feature description" matching strategy; the technical route comprises three main steps:
Step 1: obtain the training sample data set and the performance test data set.

I. Training sample data set sampling

The currently newest and most popular HPatches data set (Balntas et al., 2017) is used; its advantages are rich real scenes, large data volume, and suitability for multiple tasks. In the training data the number of non-matching points far exceeds the number of matching points, and traversing all possible combinations is unnecessary, so an effective sampling strategy is crucial (Tian et al., 2017).

Let χ = {(A_i, P_i)}, i = 1, 2, …, n, be the sample matching point pairs generated from the data set, where n is the number of matching pairs, the corresponding network output values are (a_i, p_i), i = 1, 2, …, n, and the distance matrix is D = [d_ij]_{n×n}.

Define the non-matching point closest to a_i: j_min = argmin_j d(a_i, p_j), j = 1, 2, …, n, j ≠ i.

Define the non-matching point closest to p_i: k_min = argmin_k d(a_k, p_i), k = 1, 2, …, n, k ≠ i.

Construct the input values of the Triplet network structure: if d(a_i, p_{j_min}) < d(a_{k_min}, p_i), take the triplet (a_i, p_i, p_{j_min}); otherwise take (p_i, a_i, a_{k_min}), as shown in Fig. 4.

Unlike Mishkin et al. (2018), we add an extra random-selection mechanism that judges whether a_i (or p_i) is an "isolated point"; when a_i (or p_i) is judged an isolated point, the corresponding pair is rejected from the input sample pairs. The reason is that among the feature points to be matched only a portion have corresponding matching points, while a large number of "isolated points" have no match at all. This problem is more prominent in aerial image matching, because aerial images are dominated by urban buildings (complex three-dimensional scenes), and the ground-object information contained in images from different viewpoints is not completely identical.
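The sampling rule above can be sketched in pure Python as follows. This is a minimal hardest-in-batch miner over a precomputed distance matrix, in the style of Mishkin et al. (2018); the function name, the `(pair index, negative index, side)` triplet encoding and the `isolated` flag are illustrative assumptions, not from the patent.

```python
def mine_triplets(D, isolated=()):
    """D[i][j] = d(a_i, p_j) for a batch of n matching pairs.

    For each pair i, find the closest non-matching p_j (j_min) and the
    closest non-matching a_k (k_min); whichever is nearer becomes the
    negative.  Pairs flagged as isolated points are dropped entirely.
    Returns tuples (i, negative_index, side) with side 'p' or 'a'."""
    n = len(D)
    triplets = []
    for i in range(n):
        if i in isolated:
            continue  # no true match exists for this pair: reject it
        j_min = min((j for j in range(n) if j != i), key=lambda j: D[i][j])
        k_min = min((k for k in range(n) if k != i), key=lambda k: D[k][i])
        if D[i][j_min] < D[k_min][i]:
            triplets.append((i, j_min, 'p'))  # triplet (a_i, p_i, p_jmin)
        else:
            triplets.append((i, k_min, 'a'))  # triplet (p_i, a_i, a_kmin)
    return triplets
```

In practice the distance matrix would come from the network's descriptors for one training batch; here it is simply a nested list.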
II. Performance test data set

The experiments select 4 groups of natural image data sets of general two-dimensional planar scenes from the computer vision field (Fig. 5), and 2 groups of urban-center three-dimensional-scene multi-view images provided by the benchmark project of ISPRS Commission III (partial cropped regions shown in Figs. 6 and 7). The area covered by the Fig. 6 images has simple building structures with small footprints and low heights, and can be regarded as a simple three-dimensional scene; it serves as experimental data set I. The area covered by the Fig. 7 images has complex building structures with large footprints, and can be regarded as a complex three-dimensional scene; it serves as experimental data set II. Both scene data sets include interior orientation elements, exterior orientation elements, pixel size, radiometric resolution and other information, and have undergone strict optical distortion correction. Since the main output of the algorithm is the pixel coordinates of the matched feature point pairs, algorithm performance is evaluated by the number of correct matching pairs (quantitative evaluation).

The experimental environment is as follows: Ubuntu 18.04.1 and Microsoft Windows 10 Professional 64-bit operating systems; development platforms Visual Studio 2015 and Matlab 2017a; deep learning toolkit PyTorch. The SIFT algorithm uses the open-source software VLFeat provided by Professor Andrew Zisserman's computer vision group at Oxford University, and the ASIFT algorithm uses the source program provided by the algorithm's original authors.
Table 1: number of correct matching feature point pairs obtained by different methods in the two-dimensional planar scenes.

The statistics in Table 1 show that in two-dimensional planar scenes the affine-space parameters are continuous, and the evenly spaced sparse-sampling strategy of the ASIFT algorithm covers this affine space well, so the matching results of the ASIFT algorithm are substantially better than those of the deep learning method. Deep learning methods are generally suited to situations where "a model cannot be established by physical laws or mathematical equations"; conversely, when a model can be established by physical laws or mathematical equations, deep learning methods often cannot exert their advantage.

Table 2: number of correct matching pairs obtained by different methods in the cropped simple three-dimensional scene area.

As shown in Table 2, in the simple three-dimensional scene with viewpoint change, the SIFT algorithm essentially fails, while the ASIFT algorithm still shows some adaptability; however, the number of true matching pairs drops sharply for the forward-backward and left-right view image pairs, and the deep learning method behaves similarly, because the camera observation directions of the two images are completely opposite and the parallax contrast is the most obvious. Unlike the two-dimensional planar scenes, the advantage of the deep learning method begins to show: except for the nadir-forward and nadir-right image pairs, the deep learning method obtains more correct matching pairs than the ASIFT algorithm.

Table 3: number of correct matching pairs obtained by different methods in the cropped complex three-dimensional scene area.

As shown in Table 3, the ASIFT algorithm, which still had some adaptability in the simple three-dimensional scene, almost completely fails in the complex three-dimensional scene; three image pairs (nadir-forward, forward-backward, left-right) fail to match at all. The deep learning method obtains varying numbers of correct matching pairs, but the quantity declines noticeably compared with the simple three-dimensional scene.
Step 2: deep network construction and training

Deep network construction comprises two parts: affine transform model parameter selection, and construction of the loss function of the feature description network; the constructed network is then trained with the sampled training samples.

I. Affine transform model parameter selection

Direction estimation is realized by Spatial Transformer Networks (Jaderberg et al., 2016). In direction estimation under viewpoint change, the covariant constraint needs to be extended to accommodate affine transformation, and the affine transform model parameters must be selected. The affine transformation matrix A has many decomposition forms, and in geometry estimation with convolutional neural network models the choice of decomposition parameters has a significant impact (Mishkin et al., 2018). The affine parameters adopted by the present invention are (λ, φ, θ, ψ).

The above affine parameters are selected mainly for the following two reasons:

1) The parameters have explicit physical meaning (Fig. 8; λ: scale, φ: longitude, θ: latitude, ψ: camera spin angle), and they can be dynamically estimated by the trained deep network, close to the true imaging process of aerial images.

2) They are convenient for comparison with the ASIFT algorithm. In planar scenes the matching effect of the ASIFT algorithm is very good, but in three-dimensional scenes it is unsatisfactory. The reason is that in three-dimensional scenes the affine-space parameters are not continuous, and the evenly spaced sampling strategy of ASIFT's affine parameters (Fig. 9) cannot cover the whole affine space (Chen Min, 2014). Previous comparative-experiment literature rarely compares deep learning methods with the ASIFT method; the present invention aims precisely at solving the discontinuous affine-space parameter estimation problem by means of deep learning.
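The patent's displayed decomposition of A is rendered only as an image. As a hedged stand-in, the ASIFT decomposition of the cited Yu and Morel (2009), which uses exactly the four parameters named here, is A = λ · R(ψ) · diag(1/cos θ, 1) · R(φ); the sketch below builds that 2×2 matrix in pure Python. Function names are illustrative.

```python
import math

def affine_from_params(lam, phi, theta, psi):
    """Build A = lam * R(psi) * diag(1/cos(theta), 1) * R(phi),
    the ASIFT-style decomposition (scale lam, longitude phi,
    latitude theta, camera spin psi; angles in radians)."""
    def rot(a):
        c, s = math.cos(a), math.sin(a)
        return [[c, -s], [s, c]]
    def matmul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]
    tilt = [[1.0 / math.cos(theta), 0.0], [0.0, 1.0]]  # tilt t = 1/cos(theta) >= 1
    A = matmul(rot(psi), matmul(tilt, rot(phi)))
    return [[lam * A[i][j] for j in range(2)] for i in range(2)]
```

ASIFT samples (θ, φ) on a fixed grid; the network of step 2 instead regresses these parameters per patch, which is the "dynamic estimation" claimed above.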
II. Loss function construction

The basic building units of the neural network used for feature description are taken from L2Net (Tian et al., 2017), but we use a Triplet network structure different from the original model (as shown in Fig. 3), and the loss function construction is also different. Let the feature description vectors output by the network be (a_i, p_i, q_i), where a_i is the anchor, p_i the positive sample (the point matching a_i), and q_i the negative sample (a point not matching a_i).

Loss function: in the ideal case, d(a_i, p_i) < d(a_i, q_i), i.e. the distance between a matching pair is smaller than the distance between a non-matching pair; in the actual network training process, negative values will appear.
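The exact loss formula appears only as an image in the original document. A plausible sketch consistent with the description above is a hinged triplet loss over Euclidean distances, in the style of the cited Tian et al. (2017); the margin value of 1.0 is an assumption, not taken from the patent.

```python
import math

def triplet_loss(anchors, positives, negatives, margin=1.0):
    """Mean over the batch of max(0, margin + d(a_i,p_i) - d(a_i,q_i)).
    The raw difference d(a_i,p_i) - d(a_i,q_i) is negative whenever the
    matching pair is closer than the non-matching pair, as desired."""
    def d(u, v):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
    n = len(anchors)
    return sum(max(0.0, margin + d(a, p) - d(a, q))
               for a, p, q in zip(anchors, positives, negatives)) / n
```

With L2-normalized descriptors (as the network outputs), minimizing this drives matched pairs together and pushes the hardest negatives from the step 1 sampling away.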
Step 3: mismatched point pair rejection

A large number of mismatched point pairs exist in the network output, so appropriate measures must be taken to reject them. In the computer vision field, the natural images widely used in viewpoint-transformation research are mostly planar or near-planar scenes. In a planar scene, all corresponding point pairs satisfy the same homography matrix transformation model x' = Hx, as shown in Fig. 10: π is a spatial plane, x_π a point on that plane, c and c' projection rays, and x, x' the projections of the planar point x_π expressed in homogeneous coordinates in the image planes, i.e. (image) corresponding point pairs; H is a 3×3 matrix. In image matching, the planar-scene homography matrix H can be used both for the early sparse sampling of the affine-space parameters and for the later RANSAC-based removal of mismatched points.
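The planar model x' = Hx can be sketched in a few lines; the helper name is illustrative, and the point is only that one 3×3 matrix maps every corresponding pair in a planar scene.

```python
def apply_homography(H, x):
    """Map homogeneous point x (3-vector) through the 3x3 matrix H and
    normalize so the last coordinate is 1 (planar-scene model x' = Hx)."""
    xp = [sum(H[i][j] * x[j] for j in range(3)) for i in range(3)]
    return [v / xp[2] for v in xp]
```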
Oblique aerial images are based on complex urban three-dimensional scenes, where the depth of ground objects varies discontinuously, so the corresponding point pairs in the images do not satisfy a single homography transformation model; this is also the reason why many famous operators in the computer vision field fail to obtain ideal results. Since the exterior orientation elements of the images are known, rejecting mismatched points with the fundamental matrix F (Fig. 11) better fits reality: when x and x' denote a corresponding point pair expressed in homogeneous coordinates in the left and right images, the equation x'Fx = 0 holds, with F a 3×3 matrix. Errors inevitably exist in the imaging and matching processes, so a threshold of 3 to 5 pixels is normally set, and a corresponding point pair must satisfy the condition x'Fx ≤ 3.
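The rejection test above can be sketched as follows: keep a pair only when the algebraic residual |x'ᵀFx| is at or below the threshold T. Function names are illustrative; note that in practice the algebraic residual is often first normalized to a point-to-epipolar-line distance before comparing against a pixel threshold, a refinement this sketch omits.

```python
def epipolar_residual(xp, F, x):
    """Algebraic residual |x'^T F x| for homogeneous 3-vectors x, x'."""
    Fx = [sum(F[i][j] * x[j] for j in range(3)) for i in range(3)]
    return abs(sum(xp[i] * Fx[i] for i in range(3)))

def filter_matches(pairs, F, T=3.0):
    """Keep corresponding point pairs (x, x') satisfying x'Fx <= T."""
    return [(x, xp) for x, xp in pairs if epipolar_residual(xp, F, x) <= T]
```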
As shown in Fig. 12, (a) and (b) are SIFT matching results for the nadir-view/right-view aerial image pair with mismatches rejected by the fundamental matrix; (c) and (d) are RANSAC mismatch-rejection results for the same image pair. The results show that, for aerial images, approximating the relationship between corresponding point pairs with a homography matrix causes a large number of spatial feature point pairs (such as building corner points) to be mistakenly deleted as mismatches.
The embodiment of the present invention also provides an oblique aerial image matching system based on deep learning, comprising the following modules:

a data acquisition module, for obtaining the training sample data set and the performance test data set;

wherein the training sample data set is obtained by sampling the HPatches data set, implemented as follows:

let χ = {(A_i, P_i)}, i = 1, 2, …, n, be the sample matching point pairs generated from the data set, where n is the number of matching pairs, the corresponding network output values are (a_i, p_i), i = 1, 2, …, n, and the distance matrix is D = [d_ij]_{n×n};

define the non-matching point closest to a_i: j_min = argmin_j d(a_i, p_j), j = 1, 2, …, n, j ≠ i;

define the non-matching point closest to p_i: k_min = argmin_k d(a_k, p_i), k = 1, 2, …, n, k ≠ i;

construct the input values of the Triplet network structure: if d(a_i, p_{j_min}) < d(a_{k_min}, p_i), take the triplet (a_i, p_i, p_{j_min}); otherwise take (p_i, a_i, a_{k_min}); when a_i (or p_i) is judged to be an isolated point, the corresponding pair is rejected from the input sample pairs;

a network improvement and training module, for improving the Triplet network by selecting different affine transform model parameters and constructing a different loss function, and training the improved Triplet network with the training sample data set;

a mismatched point pair rejection module, for inputting the performance test data set into the trained Triplet network, outputting the matching results, and rejecting mismatched point pairs with the fundamental matrix F, i.e. each corresponding point pair must satisfy the condition x'Fx ≤ threshold T; where x and x' denote a corresponding point pair expressed in homogeneous coordinates in the two images, and F is a 3×3 matrix.
Therein, the affine transform model parameters selected in the network improvement and training module are (λ, φ, θ, ψ), where λ is the scale, φ the longitude, θ the latitude, and ψ the camera spin angle.

Therein, in the loss function constructed in the network improvement and training module, a_i is the anchor, p_i the positive sample, q_i the negative sample, and n the number of corresponding point pairs.
The specific implementation of each module corresponds to that of each step, and is not repeated here.
Relevant references are as follows:
[1] Chen Min. 2014. Research on feature matching techniques for multi-source remote sensing images [D]. Wuhan: Wuhan University.
[2] Balntas V, Lenc K, Vedaldi A, et al. 2017. HPatches: A benchmark and evaluation of handcrafted and learned local descriptors [C]. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
[3] Fan B, Kong Q, Wang X, et al. 2017. A performance evaluation of local features for image based 3D reconstruction [C]. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
[4] Jaderberg M, Simonyan K, Zisserman A, et al. 2016. Spatial transformer networks [C]. Proceedings of Advances in Neural Information Processing Systems.
[5] Lenc K, Vedaldi A. 2017. Large scale evaluation of local image feature detectors on homography datasets [J].
[6] Mishkin D, Radenovic F, Matas J. 2018. Repeatability is not enough: Learning discriminative affine regions via discriminability [C]. Proceedings of Computing Research Repository.
[7] Schönberger J L, Hardmeier H, Sattler T, et al. 2017. Comparative evaluation of hand-crafted and learned local features [C]. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
[8] Tian Y, Fan B, Wu F. 2017. L2-Net: Deep learning of discriminative patch descriptor in Euclidean space [C]. Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.
[9] Yu G, Morel J M. 2009. A fully affine invariant image comparison method [C]. Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing.
The specific embodiments described herein are merely illustrative of the spirit of the present invention. Those skilled in the art to which the present invention belongs can make various modifications or additions to the described embodiments, or substitute them in similar ways, without departing from the spirit of the present invention or exceeding the scope defined by the appended claims.
Claims (10)
1. An oblique aerial image matching method based on deep learning, characterized by comprising the following steps:
Step 1: obtain a training sample data set and a performance test data set;
wherein the training sample data set is obtained by sampling from the HPatches data set, implemented as follows:
let χ = {(A_i, P_i)}, i = 1, 2, ..., n, be the sample matching point pairs generated from the data set, where n is the number of matching point pairs; the corresponding network output values are (a_i, p_i), i = 1, 2, ..., n, with distance matrix D = [d_ij]_{n×n};
define the non-matching point nearest to a_i: j_min = argmin_j d(a_i, p_j), j = 1, 2, ..., n, j ≠ i;
define the non-matching point nearest to p_i: k_min = argmin_k d(a_k, p_i), k = 1, 2, ..., n, k ≠ i;
construct the input values of the Triplet network structure: if d(a_i, p_{j_min}) < d(a_{k_min}, p_i), take (a_i, p_i, p_{j_min}); otherwise take (a_i, p_i, a_{k_min}); when a_i (or p_i) is set as an isolated point, the corresponding (a_i, p_i) is rejected from the input sample pairs;
Step 2: improve the Triplet network by selecting different affine transformation model parameters and constructing a different loss function, and train the improved Triplet network using the training sample data set;
Step 3: input the performance test data set into the trained Triplet network, output the matching result, and reject mismatched point pairs using the fundamental matrix F, i.e. each corresponding point pair must satisfy the condition x'ᵀFx ≤ T, where T is a threshold, x and x' are the corresponding point pair expressed in homogeneous coordinates in the two images, and F is a 3 × 3 matrix.
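The Step 1 triplet construction (distance matrix over the n output pairs, nearest non-matching points j_min and k_min, hardest negative chosen between them, in the style of the HardNet loss cited in the non-patent references) can be sketched as follows; the function name, array layout, and tie-breaking are illustrative assumptions, not from the patent:

```python
import numpy as np

def build_triplets(a, p):
    """Construct Triplet network inputs from n matched descriptor pairs (a_i, p_i).

    For each pair, the negative is the hardest non-matching point: either
    p[j_min], the non-matching p closest to a_i, or a[k_min], the non-matching
    a closest to p_i, whichever is nearer.
    """
    n = len(a)
    # distance matrix D = [d_ij], d_ij = ||a_i - p_j||
    D = np.linalg.norm(a[:, None, :] - p[None, :, :], axis=2)
    # exclude the matching pairs (the diagonal) from the nearest-neighbor search
    masked = D + np.where(np.eye(n, dtype=bool), np.inf, 0.0)
    j_min = np.argmin(masked, axis=1)  # nearest non-matching p for each a_i
    k_min = np.argmin(masked, axis=0)  # nearest non-matching a for each p_i
    triplets = []
    for i in range(n):
        if masked[i, j_min[i]] < masked[k_min[i], i]:
            triplets.append((a[i], p[i], p[j_min[i]]))
        else:
            triplets.append((a[i], p[i], a[k_min[i]]))
    return triplets
```

Isolated-point rejection (dropping a pair whose anchor or positive is flagged as an outlier) would be a filtering pass before this function.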
2. The oblique aerial image matching method based on deep learning according to claim 1, characterized in that: the affine transformation model parameters selected in step 2 are the scale λ, the longitude angle φ, the latitude angle θ, and the swing angle ψ.
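The parameter formula itself is published only as an image; in the fully affine model of reference [9] (Yu and Morel), which uses these same four parameters, the affine matrix decomposes as A = λ · R(ψ) · T_t · R(φ) with tilt t = 1/cos θ. A sketch under that assumption (the decomposition is taken from [9], not recovered from the elided formula):

```python
import numpy as np

def affine_from_camera_params(lam, phi, theta, psi):
    """Affine model A = lam * R(psi) * T_t * R(phi), after Yu & Morel (2009).

    lam: scale; phi: longitude angle; theta: latitude angle, inducing the
    tilt t = 1/cos(theta); psi: camera swing angle. Angles in radians.
    """
    def rot(g):
        # 2x2 rotation matrix for angle g
        return np.array([[np.cos(g), -np.sin(g)], [np.sin(g), np.cos(g)]])
    t = 1.0 / np.cos(theta)   # tilt induced by the latitude angle
    T_t = np.diag([t, 1.0])   # anisotropic scaling along one axis
    return lam * rot(psi) @ T_t @ rot(phi)
```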
3. The oblique aerial image matching method based on deep learning according to claim 1, characterized in that: the loss function constructed in step 2 is as follows,
where a_i is the anchor (reference) sample, p_i the positive sample, q_i the negative sample, and n the number of corresponding point pairs.
4. The oblique aerial image matching method based on deep learning according to claim 1, characterized in that: the performance test data set comprises 4 groups of general two-dimensional planar-scene natural image data sets from the computer vision field and 2 groups of multi-view images of urban central area three-dimensional scenes provided by the test project of Commission III of the International Society for Photogrammetry and Remote Sensing (ISPRS).
5. The oblique aerial image matching method based on deep learning according to claim 1, characterized in that: the value of the threshold T is 3 or 5.
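The fundamental-matrix check of claims 1 and 5 can be sketched as below; taking the residual as the absolute value |x'ᵀFx| per pair is an assumption, since the published condition is written without it:

```python
import numpy as np

def reject_mismatches(F, x, x_prime, T=3.0):
    """Keep point pairs satisfying the epipolar condition |x'^T F x| <= T.

    F: 3x3 fundamental matrix; x, x_prime: (n, 3) homogeneous coordinates of
    corresponding points in the two images; T: threshold (3 or 5 per claim 5).
    """
    # residual_i = |x'_i^T F x_i|, computed for all pairs at once
    residual = np.abs(np.einsum('ni,ij,nj->n', x_prime, F, x))
    keep = residual <= T
    return x[keep], x_prime[keep], residual
```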
6. An oblique aerial image matching system based on deep learning, characterized by comprising the following modules:
a data acquisition module, for obtaining a training sample data set and a performance test data set;
wherein the training sample data set is obtained by sampling from the HPatches data set, implemented as follows:
let χ = {(A_i, P_i)}, i = 1, 2, ..., n, be the sample matching point pairs generated from the data set, where n is the number of matching point pairs; the corresponding network output values are (a_i, p_i), i = 1, 2, ..., n, with distance matrix D = [d_ij]_{n×n};
define the non-matching point nearest to a_i: j_min = argmin_j d(a_i, p_j), j = 1, 2, ..., n, j ≠ i;
define the non-matching point nearest to p_i: k_min = argmin_k d(a_k, p_i), k = 1, 2, ..., n, k ≠ i;
construct the input values of the Triplet network structure: if d(a_i, p_{j_min}) < d(a_{k_min}, p_i), take (a_i, p_i, p_{j_min}); otherwise take (a_i, p_i, a_{k_min}); when a_i (or p_i) is set as an isolated point, the corresponding (a_i, p_i) is rejected from the input sample pairs;
a network improvement and training module, for improving the Triplet network by selecting different affine transformation model parameters and constructing a different loss function, and training the improved Triplet network using the training sample data set;
a mismatched point pair rejection module, for inputting the performance test data set into the trained Triplet network, outputting the matching result, and rejecting mismatched point pairs using the fundamental matrix F, i.e. each corresponding point pair must satisfy the condition x'ᵀFx ≤ T, where T is a threshold, x and x' are the corresponding point pair expressed in homogeneous coordinates in the two images, and F is a 3 × 3 matrix.
7. The oblique aerial image matching system based on deep learning according to claim 6, characterized in that: the affine transformation model parameters selected in the network improvement and training module are the scale λ, the longitude angle φ, the latitude angle θ, and the swing angle ψ.
8. The oblique aerial image matching system based on deep learning according to claim 6, characterized in that: the loss function constructed in the network improvement and training module is as follows,
where a_i is the anchor (reference) sample, p_i the positive sample, q_i the negative sample, and n the number of corresponding point pairs.
9. The oblique aerial image matching system based on deep learning according to claim 6, characterized in that: the performance test data set comprises 4 groups of general two-dimensional planar-scene natural image data sets from the computer vision field and 2 groups of multi-view images of urban central area three-dimensional scenes provided by the test project of Commission III of the International Society for Photogrammetry and Remote Sensing (ISPRS).
10. The oblique aerial image matching system based on deep learning according to claim 6, characterized in that: the value of the threshold T is 3 or 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910344297.6A CN110135474A (en) | 2019-04-26 | 2019-04-26 | A kind of oblique aerial image matching method and system based on deep learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110135474A (en) | 2019-08-16 |
Family
ID=67575342
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910344297.6A Pending CN110135474A (en) | 2019-04-26 | 2019-04-26 | A kind of oblique aerial image matching method and system based on deep learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110135474A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104794490A (en) * | 2015-04-28 | 2015-07-22 | China TopRS (Beijing) Remote Sensing Technology Co., Ltd. | Method and device for acquiring corresponding points of aerial multi-view oblique images |
CN108446627A (en) * | 2018-03-19 | 2018-08-24 | Nanjing University of Information Science and Technology | An aerial image matching method based on local deep hashing |
US20190005670A1 (en) * | 2017-06-28 | 2019-01-03 | Magic Leap, Inc. | Method and system for performing simultaneous localization and mapping using convolutional image transformation |
CN109344845A (en) * | 2018-09-21 | 2019-02-15 | Harbin Institute of Technology | A feature matching method based on the Triplet deep neural network structure |
Non-Patent Citations (5)
Title |
---|
Anastasiya Mishchuk et al.: "Working hard to know your neighbor's margins: Local descriptor learning loss", arXiv:1705.10872v4 [cs.CV] *
Dmytro Mishkin et al.: "Learning Discriminative Affine Regions via Discriminability", arXiv:1711.06704v2 [cs.CV] *
Hani Altwaijry et al.: "Learning to Detect and Match Keypoints with Deep Architectures", British Machine Vision Conference *
Li Zhulin et al.: "Image Stereo Matching Technology and Its Development and Application", Xi'an: Shaanxi Science and Technology Press, 31 July 2007 *
"Deep Learning Every Day" (blog): "Deep learning notes (2): triplet loss", https://blog.csdn.net/lucifer_zzq/article/details/81271260 *
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113112529A (en) * | 2021-03-08 | 2021-07-13 | Wuhan Land Use and Urban Spatial Planning Research Center | Dense matching mismatched point processing method based on regional adjacent point search |
CN114937393A (en) * | 2022-03-30 | 2022-08-23 | China Petroleum & Chemical Corporation (Sinopec) | Augmented-reality-based high-altitude operation simulation training system for petrochemical enterprises |
CN114937393B (en) * | 2022-03-30 | 2023-10-13 | China Petroleum & Chemical Corporation (Sinopec) | Augmented-reality-based high-altitude operation simulation training system for petrochemical enterprises |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110135455B (en) | Image matching method, device and computer readable storage medium | |
Pham et al. | Lcd: Learned cross-domain descriptors for 2d-3d matching | |
JP7453470B2 (en) | 3D reconstruction and related interactions, measurement methods and related devices and equipment | |
US10043097B2 (en) | Image abstraction system | |
CN112270249A (en) | Target pose estimation method fusing RGB-D visual features | |
CN108369741A (en) | Method and system for registration data | |
CN103700099A (en) | Rotation and dimension unchanged wide baseline stereo matching method | |
JP2013186902A (en) | Vehicle detection method and apparatus | |
CN110222572A (en) | Tracking, device, electronic equipment and storage medium | |
Zhang et al. | Vehicle global 6-DoF pose estimation under traffic surveillance camera | |
CN113674400A (en) | Spectrum three-dimensional reconstruction method and system based on repositioning technology and storage medium | |
CN104463962B (en) | Three-dimensional scene reconstruction method based on GPS information video | |
WO2022247126A1 (en) | Visual localization method and apparatus, and device, medium and program | |
CN110135474A (en) | A kind of oblique aerial image matching method and system based on deep learning | |
Sun et al. | A fast underwater calibration method based on vanishing point optimization of two orthogonal parallel lines | |
CN112329662B (en) | Multi-view saliency estimation method based on unsupervised learning | |
CN113902802A (en) | Visual positioning method and related device, electronic equipment and storage medium | |
Jiang et al. | Contrastive learning of features between images and lidar | |
Lee et al. | Robust uncertainty-aware multiview triangulation | |
CN117197333A (en) | Space target reconstruction and pose estimation method and system based on multi-view vision | |
CN114998630B (en) | Ground-to-air image registration method from coarse to fine | |
Budianti et al. | Background blurring and removal for 3d modelling of cultural heritage objects | |
CN114723973A (en) | Image feature matching method and device for large-scale change robustness | |
CN113570535A (en) | Visual positioning method and related device and equipment | |
CN111414802B (en) | Protein data characteristic extraction method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190816 |