CN115861792A

CN115861792A - Multi-mode remote sensing image matching method for weighted phase orientation description

Info

Publication number: CN115861792A
Application number: CN202211397518.4A
Authority: CN
Inventors: 张永军; 姚永祥; 刘伟玉; 熊铭涛; 万一
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2022-11-09
Filing date: 2022-11-09
Publication date: 2023-03-28

Abstract

The invention discloses a multi-mode remote sensing image matching method based on weighted phase orientation description, which comprises three parts of key point extraction, descriptor construction, matching, gross error elimination and the like of aggregation characteristics. Firstly, realizing the extraction of aggregation feature points, namely establishing the features of the maximum moment and the minimum moment of the image by adopting a phase consistency model, respectively extracting the features of spots and angular points, and then optimizing the feature points by the designed aggregation feature model to complete the extraction of key points. Second, a weighted phase-oriented feature descriptor is constructed. Firstly, a weighting bandwidth function is constructed to generate weight coefficients of phase orientation features, the weight coefficients are adopted to construct weighting phase orientation features, and the features are used for generating regularized polar coordinate descriptor vectors. And finally, performing nearest neighbor feature point matching by adopting the Euclidean distance, and eliminating mismatching by utilizing a rapid sample consensus algorithm to complete multi-mode remote sensing image matching.

Description

Multi-mode remote sensing image matching method for weighted phase orientation description

Technical Field

The invention belongs to a remote sensing image processing method, and particularly relates to a multi-mode remote sensing image matching method based on aggregation feature extraction and weighted phase orientation description.

Background

The multimode remote sensing image generally refers to two or more images (such as a point cloud depth map, an infrared image, a navigation map, an SAR image, a noctilucent image and the like) which are shot at different times under different sensors or different imaging conditions, and due to the fact that the images have obvious nonlinear distortion and geometric difference due to different imaging mechanisms, extraction and stable description of feature points cannot be achieved frequently, and the multimode remote sensing image is difficult to match and poor in effect. However, the multi-mode remote sensing image plays an important role in the fields of natural disaster assessment, disaster relief search, change detection, image splicing, aerial triangulation, three-dimensional reconstruction and the like. Therefore, it is necessary to study this.

With the continuous development of computer vision and image processing technology, remote sensing image matching methods can be roughly classified into three types, namely, a region-based method, a feature-based method and a deep learning-based method. The region-based method mainly calculates the similarity of images by means of the characteristics of image intensity, mutual information and the like, has the advantage of high matching precision, but has the problems of large calculation amount, poor invariance to the scale and rotation of the images and the like, so that the application range of the method is limited. The characteristic-based method mainly develops research around image gradient characteristics in the early stage, such as a plurality of methods of SIFT, SIFT-like and the like, the method can achieve satisfactory effects under the conditions of translation, scale and rotation differences, but in the multi-mode image, the image gradient characteristics are sensitive to nonlinear distortion and contrast difference of the multi-mode remote sensing image, so that the method is not suitable for multi-mode matching. Later scholars obtain breakthrough in applying phase characteristics of images to multi-mode remote sensing image matching, algorithms such as a phase consistency orientation histogram, radiation invariant characteristic transformation and the like, but the algorithms have certain restriction on scale and rotation invariance of the algorithms, and although some people try to improve, the problems of phase direction reversal, phase extreme value mutation and the like of the multi-mode remote sensing images still exist. The method based on deep learning, although showing great development potential in image matching, has remarkable effect. However, in view of the difficulty of multi-modal remote sensing image sample data acquisition and the complexity of remote sensing image scene application, the deep learning matching method still needs a certain development time.

In summary, whether effective feature points and robust descriptors can be extracted becomes a key factor affecting success of multimodal image matching. In feature extraction, most of the study of scholars is focused on finding the corner or spot features between images, and meanwhile, the study considering the comprehensive extraction of the corner and spot of the images is less. In the feature description, the phase features extracted by directly adopting a phase consistency model have the problems of obvious direction reversal, sudden change of phase extreme values and the like. The phase essentially represents the shape component of the image, and the above problem changes the shape component of the image to some extent, thereby destroying the structural integrity of the image, causing the phase orientation feature to be unable to correctly characterize the direction change between the images, and increasing the description difficulty of the feature descriptor. Based on the method, the invention provides a multi-mode remote sensing image matching method for aggregation feature extraction and weighted phase orientation description to realize the stable matching of the images.

Disclosure of Invention

The invention provides a multi-mode remote sensing image matching method based on weighted phase orientation description, which is used for solving the problem of matching multi-mode remote sensing images.

The technical scheme adopted by the invention is as follows: the multi-mode remote sensing image matching method based on aggregation feature extraction and weighted phase orientation description comprises the following steps:

step 1, initializing calculation parameters matched with the multi-mode remote sensing image, and performing nonlinear diffusion on the multi-mode remote sensing image;

step 2, respectively resolving the maximum moment characteristic and the minimum moment characteristic of the image scale space through a phase consistency model, and outputting resolving results;

step 3, performing speckle feature extraction on the maximum moment by using a Hessian matrix, performing corner feature extraction on the minimum moment by using a Fast detector, sequentially completing extraction of corresponding image layers, summarizing to a feature point set, and outputting the feature point set;

step 4, filtering the feature points extracted in the step 3 by using a polymerization feature optimization model, and determining a final feature point set by setting a feature point significance detection threshold;

step 5, calculating the multi-mode remote sensing image phase shape component by using a weighted bandwidth function, and acquiring a weight coefficient of the image in the phase characteristics;

step 6, constructing a weighted phase orientation characteristic model by using the weight coefficient, and calculating a weighted phase orientation characteristic diagram of the multi-mode remote sensing image;

step 7, calculating a regularized log-polar descriptor of each feature point according to the obtained weighted phase orientation feature map, and outputting a descriptor vector set of the feature points;

and 8, matching according to the descriptor vector set of the feature points, and eliminating gross errors to complete the matching of the multi-mode remote sensing images.

Further, in step 1, the number of layers in the nonlinear image scale space, the size of the non-maximum threshold window range of the feature points, the filtering threshold of the feature points, and the size parameter of the initial neighborhood range of the descriptor are initialized.

Further, the formulas of the maximum moment and the minimum moment characteristics in step 2 are shown as (1) and (2):

in the formula (1), PC (x, y) represents the result of measuring the phase consistency of the image; w is a _O (x, y) represents a weighting function; a. The _SO (x, y) represents an amplitude component; s represents a scale; o represents the convolution direction; ξ represents a minimum value;

zero when the number of seals is negative; delta phi _SO (x, y) representsA phase deviation function; t represents a noise compensation term; in the formula (2), M _max Representing the maximum moment of phase consistency of the image; m _min Representing the minimum moment of phase consistency of the image; PC (θ o) represents the mapping of the image in the o direction; A. b and C are intermediate quantities of the phase moment calculation; θ is the angle of the direction o.

Further, the specific implementation manner in step 4 is as follows;

/>

in the formula (3), S _points Representing the set of blobs and corners after mask optimization, f _Blob Representing a blob extract function; f. of _Corner Representing a corner extraction function, M _max Representing the maximum moment of phase consistency of the image; m _min Representing the minimum moment of phase consistency of the image;

representing a mask function, R representing a mask radius; n is a radical of _w A neighborhood window is represented, where the mask function is defined as,

in the formula (4), mask (x, y) represents a calculation result of the Mask function; i is _x Representing the pixel value of the original image in the x direction; i is _y Representing the pixel value of the original image in the y direction; n is a radical of _w Representing a neighborhood window;

constructing a significance fraction extraction equation, wherein the mathematical expression of the significance fraction extraction equation is shown as a formula (5):

in formula (5), f _nms (. -) represents the characteristic point resulting in a non-maximum suppression function; AF denotes the last keypoint; s _score Representation Filtering by saliency scoreA set of points of (a); k represents a filtering threshold value of the characteristic point, the value is a non-fixed value, and the filtering threshold value is set and adjusted according to the strength of different remote sensing image textures;

representing its phase intensity value, f _nms (. Cndot.) represents a characteristic point retained after non-maximum inhibition, and S _points Representing points removed from the edge.

Further, the specific implementation of step 5 includes,

firstly, performing Log-Gabor convolution on the multimodal remote sensing image to extract phase information of the image, and decomposing the image into two parts, wherein the formula is shown as (6):

in the formula (6), s and o represent the scale and direction of the Log-Gabor filter;

an even symmetric filter representing Log-Gabor filtering; />

Represents an odd symmetric filter; symbol i represents an imaginary unit of a complex number; i (x, y) represents an image; EO (x, y) represents the result of the image's response on the real filter; OO (x, y) represents the response of the image on the imaginary filter; />

Represents a convolution operation;

secondly, a weighted bandwidth function is designed according to the response result of the real part and the imaginary part of the Log-Gabor filtering, and the mathematical expression of the weighted bandwidth function is shown as (7):

in the formula (7), wc' represents the maximum weighting coefficient of the image; wc "represents the minimum weighting factor for the image; exp represents the exponential calculation; cutoff represents the fractional scale of the frequency distribution; width (width) _i Representing a frequency score in the current image; g represents the sharpness of the transition in the control phase consistency model; ξ represents a minimum value to prevent the value from being zero.

Further, in step 6, the weighted phase orientation characteristic is solved, and the mathematical expression thereof is shown in equation (8):

in the formula (8), W represents a weighted phase orientation characteristic; OO _i Representing the odd symmetric convolution result on the ith direction layer; EO (ethylene oxide) _i Representing the even symmetric convolution result on the ith direction layer; θ represents the rotation angle.

Further, the specific implementation manner of step 7 is as follows;

dividing the description sub-neighborhood range into 4 layers and performing 12 equal divisions on the weighted phase orientation characteristic diagram, namely dividing into 4 concentric circles, performing regularized equal divisions on the four concentric circles, and finally dividing the whole characteristic neighborhood range into antipodal coordinate grids of 48 sub-area grids; the horizontal direction in each grid represents the polar angle of the position of the circular neighborhood pixel point, after the direction histogram of each feature point is calculated, one dimension is divided at intervals of 45 degrees, and the directions of 0-360 degrees are divided into 8 dimensions; therefore, the adjacent points of each sub-region grid have 8-dimensional gradient position orientation histograms, and finally a 384-dimensional regularization log-polar descriptor is generated;

regularizing the mathematical expression of the log-polar descriptor, as shown in (9):

RGLOH＝[D ₁ ,D ₂ ,…,D _N ] ^T (9)

in formula (9), RGLOH represents a descriptor set of all feature points; d _i A 384-dimensional characteristic vector value representing the ith feature point, i.e. a regularized log-polar descriptor in 384 dimensions(ii) a T represents a matrix transposition character; n represents the number of feature points.

Further, in step 8, the euclidean distance is used to measure the descriptors of the feature points, so that nearest neighbor matching is completed, and gross errors are removed through a fast sample consensus algorithm.

Further, the method comprises a step 9 of evaluating the matching effect of the multi-mode remote sensing image by using the correct homonymy point, and the step 9 of quantitatively checking the matching precision by using the calculated root mean square error of the homonymy point and the number of homonymy point pairs.

Compared with the prior art, the invention has the following advantages and beneficial effects:

the multi-mode remote sensing image matching method provided by the invention is divided into three parts of feature point extraction, descriptor construction, matching, gross error elimination and the like of feature aggregation. Firstly, establishing image maximum moment and minimum moment characteristics by adopting a phase consistency model, respectively extracting spot and corner characteristics, and optimizing characteristic points by a designed aggregation characteristic equation to complete key point extraction. Secondly, constructing weighted phase orientation features through a designed weighted bandwidth function, and generating a regularized polar coordinate descriptor through the features. And finally, performing nearest neighbor image matching by adopting the Euclidean distance, and removing error points by means of a rapid sample consensus algorithm to complete image matching. The weighted phase orientation characteristic constructed by the weighted bandwidth function can better overcome the problems of sudden change of phase extreme values, direction reversal and the like, and the expression capability of the descriptor is increased.

Drawings

FIG. 1: a method flow diagram of the invention;

FIG. 2: a schematic diagram of optimization of the aggregation characteristic points;

FIG. 3: a logarithm polar coordinate descriptor schematic diagram of the weighted phase orientation feature;

FIG. 4: the multimode remote sensing image data set comprises a point cloud depth map and an optical image, an infrared image and an optical image, a navigation map and an optical image, an SAR image and an optical image, and a luminous image and an optical image, wherein the point cloud depth map and the optical image are used as (a) images, (b) images of infrared and optical images are used as (c) images of navigation and optical images, the SAR image and the optical images are used as (d) images of SAR and optical images, and the luminous image and the optical image are used as (e) images of night light;

FIG. 5 is a schematic view of: and matching results of the multi-mode remote sensing images.

Detailed Description

In order to facilitate the understanding and implementation of the present invention for those of ordinary skill in the art, the present invention is further described in detail with reference to the accompanying drawings and examples, it is to be understood that the embodiments described herein are merely illustrative and explanatory of the present invention and are not restrictive thereof.

Referring to the flowchart in fig. 1, the multi-modal remote sensing image matching method for weighted phase orientation description provided by the invention includes the following steps:

step 1: and initializing calculation parameters matched with the multi-mode remote sensing image, and performing nonlinear diffusion on the multi-mode remote sensing image.

Preferably, in the step 1, for the nonlinear image scale space, in the process of constructing the scale space, the initialization is mainly completed by the number of layers of the nonlinear image scale space, the size of a non-maximum threshold window range of a feature point, a filtering threshold of the feature point, and an initial neighborhood range size parameter of a descriptor. According to a large amount of experimental experience, the number of layers in a nonlinear image scale space, the size of a non-maximum threshold window range of a feature point, the size of a filtering threshold of the feature point and the size of an initial neighborhood range of a descriptor are respectively set to be 3, 0.85 and 38.

Step 2: and respectively resolving the maximum moment characteristic and the minimum moment characteristic of the image scale space through the phase consistency model, and outputting a resolving result. The phase consistency model is a model capable of effectively checking local energy characteristics of an image, and is convenient for extracting edge characteristics and corner characteristics of the image. The maximum moment and the minimum moment are used to better describe the salient features of the image. From the moment analysis algorithm, the axis corresponding to the smallest moment is called the principal axis, which usually represents the directional information of the feature. The axis of maximum moment is perpendicular to the principal axis, reflecting the significance of the feature. The feature point extraction of the multi-mode remote sensing image is facilitated through the features of the maximum moment and the minimum moment.

Therefore, the maximum moment and the minimum moment of the image are calculated through the phase consistency model. The phase consistency measurement is mainly obtained by local fourier transform calculation. The method can well extract the frequency domain characteristic information of the image, and extracts the maximum moment and the minimum moment characteristics of the multi-mode remote sensing image through a phase consistency model, wherein the formulas are shown as (1) and (2):

zero when the number of seals is negative; delta phi _SO (x, y) represents a phase deviation function; t denotes a noise compensation term. In the formula (2), M _max Representing the maximum moment of phase consistency of the image; m _min Representing the minimum moment of phase consistency of the image; PC (θ o) represents the mapping of the image in the o direction; A. b and C are intermediate quantities of the phase moment calculation; θ is the angle of the direction o. />

And 3, step 3: and performing speckle feature extraction on the maximum moment by using a Hessian matrix, performing corner feature extraction on the minimum moment by using a Fast detector, sequentially completing extraction of corresponding image layers, summarizing the image layers to a feature point set, and outputting the feature point set. Feature point extraction is an important step of multi-modal image matching, and common feature points can be mainly divided into two types, namely spots and angular points. The speckle is usually an area with color and gray difference from its periphery, and has the advantages of strong noise resistance and good stability. The corner points generally refer to corner regions of the ground features in the image or intersection parts between lines, and the significance of the corner points is high.

And 4, step 4: and (4) filtering the feature points extracted in the step (3) by using a designed aggregation feature optimization model, and determining a final feature point set by setting a feature point significance detection threshold value.

Conventional detectors extract only one type of keypoint, which is not beneficial for image matching. Inspired by the method, the invention designs an aggregation characteristic strategy to optimize the characteristic points, thereby improving the richness of the characteristic points. The aggregation feature optimization strategy has three steps: removing boundary region points, suppressing non-maximum values and filtering significance scores. The mathematical expression for the removal of the boundary region points is shown in (3):

in formula (3), S _points Representing the set of blobs and corners after mask optimization. f. of _Blob Representing a blob extract function; f. of _Corner Representing a corner extraction function;

representing a mask function, R representing a mask radius; n is a radical of _w Representing a neighborhood window. Wherein, the mask function is defined as,

in the formula (4), mask (x, y) represents a calculation result of a Mask function; i is _x Representing the pixel value of the original image in the x direction; i is _y Representing the pixel value of the original image in the y direction; n is a radical of _w Representing a neighborhood window.

Non-maxima suppression is a common approach and the present invention will not be discussed further.

And finally, calculating the significance score of the characteristic points after the first two steps of operation, wherein the mathematical expression of the significance score is shown as the formula (5):

in the formula (5), f _nms () represents the characteristic point-derived non-maxima suppression function; AF denotes the last keypoint; s _score Representing a set of points filtered by a saliency score; k represents a filtering threshold value of the characteristic point, the value is a non-fixed value, and the filtering threshold value is automatically set and adjusted according to the strength degree of different remote sensing image textures;

representing its phase intensity value, f _nms (. Represents the characteristic points retained after non-maximum suppression, and S _points Points removed from the edge are indicated and the results are shown in figure 2.

And 5: and calculating the change of the multi-mode remote sensing image phase shape component by using the designed weighted bandwidth function to obtain the noise weight coefficient of the image in the phase characteristics.

First, convolution processing is performed on an input image by using a Log-Gabor filter, and local phase information is extracted. The Log-Gabor filter is decomposed into two parts in the spatial domain, and the two parts can better resist the noise of the image and the gray difference of the image. The formula is shown as (6):

an even symmetric filter representing Log-Gabor filtering; />

Represents a convolution operation;

in the formula (7), wc' represents the maximum weighting coefficient of the image; wc "represents the maximum weighting factor of the image; exp represents the exponential calculation; cutoff represents the fractional scale of the frequency distribution; width (width) _i Representing a frequency score in the current image; g represents the sharpness of the transitions in the control phase consistency model; ξ represents a minimum value to prevent the value from going to zero.

Step 6: and constructing a weighted phase orientation characteristic model, and calculating a weighted phase orientation characteristic diagram of the multi-mode remote sensing image.

Due to the influence of abrupt phase energy extreme value change and direction reversal in the phase orientation characteristics of the multi-modal image, descriptors of characteristic points have obvious instability. To better address such issues, the present invention introduces the results of weighting the bandwidth functions into the Log-Gabor odd and even symmetric functions. The operation can effectively overcome the negative influence caused by sudden change of the phase extreme value and reversal of the direction, thereby enhancing the robustness of the characteristic point descriptor.

Substituting the weighted bandwidth function into the phase orientation feature calculation, wherein the mathematical expression of the weighted bandwidth function is shown as (8):

in the formula (8), W represents a weighted phase orientation characteristic; OO _i Representing the odd symmetric convolution result on the ith direction layer; EO (ethylene oxide) _i Representing the even symmetric convolution result on the ith direction layer; θ represents the rotation angle. Second, the orientation feature is migrated to [0, 360 ]]And generating final weighted phase orientation features.

And 7: and calculating a regularized log-polar descriptor of each feature point according to the obtained weighted phase orientation feature map, and outputting a descriptor vector set of the feature points, wherein the result is shown in fig. 3.

According to the method, the descriptor neighborhood range is divided into 4 layers and is divided into 12 equal parts on the weighted phase orientation characteristic diagram in consideration of the stability and robustness of the descriptor. Namely, the four concentric circles are divided into 4 concentric circles, and the four concentric circles are divided into four equal circles in a regularized mode, and finally the whole characteristic neighborhood range is divided into a antipodal coordinate grid of (12 × 4) =48 sub-area grids. The grid dividing mode makes up for instability of descriptors caused by small areas, and enables description in the neighborhood range of the feature points to be more detailed and accurate.

The areas of the polar coordinate subregions are approximately consistent. And the horizontal direction in each grid represents the polar angle of the position of the circular neighborhood pixel point. After the direction histogram of each feature point is calculated, one dimension is divided at intervals of 45 degrees, and the directions of 0-360 degrees are divided into 8 dimensions. Therefore, the adjacent points of each sub-region grid have 8-dimensional gradient position orientation histograms, and finally, 384-dimensional regularized log-polar descriptors are generated.

The mathematical expression of the regularized log-polar descriptor is shown as (9) below:

RGLOH＝[D ₁ ,D ₂ ,…,D _N ] ^T ， (9)

in formula (9), RGLOH represents a descriptor set of all feature points; d _i A 384-dimensional characteristic vector value representing the ith feature point, namely a regularized logarithmic polar descriptor of 384 dimensions; t represents a matrix transposition character; n represents the number of feature points.

And 8: and after the descriptor calculation is completed, measuring the descriptor of the feature point by adopting the Euclidean distance, thereby completing the nearest neighbor matching. Gross errors are removed through a fast sample consensus algorithm, matching of the multi-mode remote sensing images is completed, and the result is shown in fig. 5.

And step 9: and evaluating the matching effect of the multi-mode remote sensing image by using the correct same-name points. The performance of 5 groups of multi-mode remote sensing image testing algorithms is utilized, and a data set is shown in figure 4. For each image pair, the matching number is quantitatively checked using Root-Mean-Square Error (RMSE) of the same-name point and the same-name point, where RMSE is in pixels. The multi-mode remote sensing image matching method is named as a HORP algorithm, and compared with the optimal image matching methods (LGHD, PSO-SIFT and RIFT), and the comparison result is shown in a table 1.

TABLE 1 several image matching method comparisons

As can be seen from Table 1, in the multi-modal remote sensing image data, compared with LGHD, PSO-SIFT and RIFT algorithms, the HORP algorithm can acquire more homonymous point pairs. The HORP algorithm provided by the invention can achieve a relatively optimal result. Among them, RMSE of HORP algorithm is superior to LGHD, PSO-SIFT and RIFT methods. The values of the RMSE of the HORP algorithm provided by the invention are less than 2 pixels, and the HORP algorithm is further proved to greatly increase the matching number of the homonymous points and keep better matching precision. Meanwhile, a large number of experiments show that when the extraction difficulty of the multi-mode sleeve remote sensing image is high, the size of a window of the descriptor field can be increased; on the contrary, when the multi-mode remote sensing image has rich textures, the values of the parameters can be properly reduced. Wherein, the number parameter of the image layers in the image scale space is set between 2 and 6.

It should be understood that parts of the specification not set forth in detail are well within the prior art.

It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A multi-mode remote sensing image matching method of weighted phase orientation description is characterized by comprising the following steps:

step 3, performing speckle feature extraction on the maximum moment by using a Hessian matrix, performing corner feature extraction on the minimum moment by using a Fast detector, sequentially completing extraction of corresponding image layers, summarizing the image layers to a feature point set, and outputting the feature point set;

step 5, calculating the multi-mode remote sensing image phase shape component by using a weighted bandwidth function, and acquiring a weight coefficient of the image in phase characteristics;

step 7, calculating a regularized log-polar coordinate descriptor of each feature point according to the obtained weighted phase orientation feature map, and outputting a descriptor vector set of the feature points;

2. The method for matching the multi-modal remote sensing image based on the weighted phase orientation description as claimed in claim 1, wherein: in the step 1, the number of layers of a nonlinear image scale space, the size of a non-maximum threshold window range of a characteristic point, a filtering threshold of the characteristic point and the size parameter of an initial neighborhood range of a descriptor are initialized.

3. The method for matching the multi-modal remote sensing image based on the weighted phase orientation description as claimed in claim 1, wherein: the formulas of the maximum moment and the minimum moment in the step 2 are shown as (1) and (2):

zero when the number of seals is negative; delta phi _SO (x, y) represents a phase deviation function; t represents a noise compensation term; in the formula (2), M _max Representing the maximum moment of phase consistency of the image; m _min Representing the minimum moment of phase consistency of the image; PC (θ o) represents the mapping of the image in the o direction; A. b and C are intermediate quantities of the phase moment calculation; θ is the angle of the direction o.

4. The method for matching the multi-modal remote sensing image based on the weighted phase orientation description as claimed in claim 1, wherein: the specific implementation manner in the step 4 is as follows;

in the formula (3), S _points Representing a set of spots and corners, f, optimized by a mask _Blob Representing a blob extract function; f. of _Corner Representing a corner extraction function, M _max Representing the maximum moment of phase consistency of the image; m _min Representing the minimum moment of phase consistency of the image;

in the formula (4), mask (x, y) represents a calculation result of the Mask function; i is _x Representing the pixel value of the original image in the x direction; I.C. A _y Representing the pixel value of the original image in the y direction; n is a radical of _w Representing a neighborhood window;

in the formula (5), f _nms (. -) represents the characteristic point resulting in a non-maximum suppression function; AF denotes the last keypoint; s _score Representing a set of points filtered by a saliency score; k represents a filtering threshold value of the characteristic point, the value is a non-fixed value, and the filtering threshold value is set and adjusted according to the strength of different remote sensing image textures;

5. The method for matching the multi-modal remote sensing image based on the weighted phase orientation description as claimed in claim 1, wherein: the specific implementation of the step 5 includes that,

an even symmetric filter representing Log-Gabor filtering; />

Represents a convolution operation;

in the formula (7), wc' represents the maximum weighting coefficient of the image; wc "represents the minimum weighting factor for the image; exp represents the exponential calculation; cutoff represents the fractional scale of the frequency distribution; width (width) _i Representing a frequency score in the current image; g represents the sharpness of the transitions in the control phase consistency model; ξ represents a minimum value to prevent the value from being zero.

6. The method for matching the multi-modal remote sensing image of the weighted phase orientation description according to claim 5, wherein: in step 6, solving the weighted phase orientation characteristic, wherein the mathematical expression of the weighted phase orientation characteristic is shown as formula (8):

7. The method for matching the multi-modal remote sensing image based on the weighted phase orientation description as claimed in claim 1, wherein: the specific implementation of step 7 is as follows;

dividing the description sub-neighborhood range into 4 layers and performing 12 equal divisions on the weighted phase orientation characteristic diagram, namely dividing the description sub-neighborhood range into 4 concentric circles, performing regularized equal divisions on the four concentric circles, and finally dividing the whole characteristic neighborhood range into antipodal coordinate grids of 48 sub-area grids; the horizontal direction in each grid represents the polar angle of the position of the circular neighborhood pixel point, after the direction histogram of each feature point is calculated, one dimension is divided at intervals of 45 degrees, and the directions of 0-360 degrees are divided into 8 dimensions; therefore, the adjacent points of each sub-region grid have 8-dimensional gradient position orientation histograms, and finally a 384-dimensional regularization log-polar descriptor is generated;

RGLOH＝[D ₁ ,D ₂ ,…,D _N ] ^T (9)

8. The method for matching the multi-modal remote sensing images of the weighted phase orientation description according to claim 1, wherein the method comprises the following steps: in step 8, euclidean distance is adopted to measure descriptors of the feature points, so that nearest neighbor matching is completed, and gross errors are eliminated through a fast sample consensus algorithm.

9. The method for matching the multi-modal remote sensing image based on the weighted phase orientation description as claimed in claim 1, wherein: the method further comprises a step 9 of evaluating the matching effect of the multi-mode remote sensing image by using the correct homonymy point, and quantitatively checking the matching precision by using the solved root mean square error of the homonymy point and the number of homonymy point pairs in the step 9.