CN107392963B - Eagle eye-imitated moving target positioning method for soft autonomous aerial refueling - Google Patents


Info

Publication number
CN107392963B
Authority
CN
China
Prior art keywords: image, area, point, taper sleeve, texture
Legal status: Active
Application number: CN201710506141.4A
Other languages: Chinese (zh)
Other versions: CN107392963A
Inventors: Duan Haibin (段海滨), Wang Xiaohua (王晓华), Deng Yimin (邓亦敏)
Current Assignee
Beijing University of Aeronautics and Astronautics
Original Assignee
Beijing University of Aeronautics and Astronautics
Priority date
Application filed by Beijing University of Aeronautics and Astronautics
Priority to CN201710506141.4A
Publication of CN107392963A
Application granted
Publication of CN107392963B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/136: Segmentation; Edge detection involving thresholding
    • G06T 7/90: Determination of colour characteristics

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to an eagle-eye-inspired moving-target positioning method for soft (hose-and-drogue) autonomous aerial refueling. The method is implemented in the following steps: step one, computing the responses of eagle-optic-tectum-like cells; step two, eagle-brain-nucleus-inspired texture suppression and saliency map extraction; step three, color threshold segmentation; step four, region-of-interest extraction; step five, extraction of the drogue marker-point coordinates; step six, marker-point matching; step seven, camera parameter calibration; step eight, pose measurement of the refueling drogue. The proposed method can accurately extract the refueling drogue (taper sleeve) during soft refueling of an unmanned aerial vehicle and accurately determine its position, with high accuracy and robustness.

Description

Eagle eye-imitated moving target positioning method for soft autonomous aerial refueling
One, Technical Field
The invention discloses an eagle-eye-inspired moving-target positioning method for soft autonomous aerial refueling and belongs to the technical field of computer vision.
Two, Background Art
Autonomous aerial refueling is a hot topic in research on aircraft autonomy and intelligence. It can extend the combat radius and improve the combat efficiency of unmanned aerial vehicles, and it also improves the safety and operability of aerial refueling for manned aircraft. In severe weather in particular, autonomous refueling technology can greatly reduce the technical difficulty and workload of pilots performing aerial refueling. In April 2011, Northrop Grumman, DARPA and the NASA Dryden Flight Research Center completed a buddy aerial-refueling demonstration between a modified unmanned demonstrator and a Global Hawk at an altitude of 13,716 m, pioneering automated aerial refueling for unmanned aircraft. In April 2015, the U.S. X-47B performed the first autonomous aerial-refueling docking test by an unmanned aircraft, successfully inserting its refueling probe into the hose-and-drogue trailed by the tanker.
Precise relative navigation between the tanker and the receiver is a key technology and research focus of autonomous aerial refueling. The aerial-refueling navigation technologies currently studied at home and abroad mainly include inertial navigation systems, the Global Positioning System (GPS) and visual navigation systems. The positioning error of an inertial navigation system accumulates over time and must be corrected with other navigation systems. GPS navigation is widely used, relatively mature and simple to operate, but it relies entirely on the reception of satellite signals and is therefore heavily dependent on external signals. Moreover, for soft (hose-and-drogue) refueling, the hose and the refueling drogue it carries swing under the influence of the airflow, so the relative position between tanker and receiver obtained from differential GPS and inertial navigation cannot accurately give the positions of the refueling drogue and the refueling receptacle. A relative navigation method based on biologically inspired vision can directly measure the relative position between the refueling drogue and the receiver, providing accurate navigation information for aerial refueling of unmanned aerial vehicles.
Many biological systems in nature have extremely strong environment-perception capability; applying them to the relative navigation process of autonomous aerial refueling can provide high precision, real-time performance and robustness at the same time. The eagle eye is an excellent biological visual information processing system, and using its visual processing mechanism in biologically inspired visual positioning improves the accuracy of target detection and hence the precision of visual positioning, providing accurate relative navigation for soft autonomous aerial refueling. The eagle visual system has a pop-out mechanism that locks visual attention onto the most valuable target regions, greatly improving the image-analysis speed and target-capture accuracy of the vision system. Starting from the eagle retina, there is mutual inhibition among the retinal ganglion cells; this widespread inhibition limits the dynamic response range of the cells while effectively encoding and integrating the information passed to the next layer. A tectofugal pathway exists between the retina and the brain; the nucleus isthmi cells and optic tectum cells in this pathway receive stimulation from the ganglion cells and other lower-level cells, further integrate it, filter out invalid and noisy information, and extract target information for subsequent target detection and positioning.
Starting from the processing mechanism of the eagle visual system, the present method simulates the eagle's visual attention mechanism in combination with the interactions among the eagle brain nuclei to extract the rough region containing the refueling drogue, then performs effective feature extraction on the drogue region using color segmentation, and finally measures the relative position between the tanker's drogue and the receiver's vision system with a pose estimation algorithm. In addition, the invention builds an aerial verification platform and verifies the proposed biologically inspired visual positioning method for soft autonomous aerial refueling.
Three, Summary of the Invention
1. Purpose of the invention:
The invention provides an eagle-eye-inspired moving-target positioning method for soft autonomous aerial refueling. Its purpose is to provide an accurate relative navigation scheme for soft autonomous aerial refueling, supply reliable relative-position measurements to the refueling system, improve the autonomy of aerial-refueling relative navigation, reduce dependence on external signals such as satellites, lower the accident rate during docking, and advance current aerial refueling technology.
2. Technical scheme:
Aiming at the relative-navigation requirements of the close-range docking phase of soft aerial refueling, the invention provides an eagle-eye-inspired moving-target positioning method with strong robustness and high accuracy, and designs an aerial verification platform system whose structure is shown in FIG. 1. Starting from the processing mechanism of the eagle visual system, the method simulates the eagle's visual attention mechanism in combination with the interactions among the eagle brain nuclei to extract the rough region containing the refueling drogue, performs effective feature extraction on the drogue region using color segmentation, and then measures the relative position between the tanker's drogue and the receiver's vision system with a pose estimation algorithm. The specific steps are as follows:
Step one: eagle-optic-tectum-like cell response calculation
An effective coding mechanism imitating eagle optic-tectum cells is established, simulating the efficiency and sparseness of optic-tectum cell coding to obtain effective invariance information from an image. It is assumed that any image I can be represented by a linear combination of a series of image bases Bk. The image bases Bk are trained from a large number of natural images and represent information common to natural images; ak is the coefficient corresponding to the image base Bk. The coefficients have a certain sparsity and are computed with a coding filter Ck, which is the inverse or pseudo-inverse of the image base Bk. Filtering any image with these filters yields the responses of the optic-tectum-like cells; the responses exhibit a certain sparsity, i.e. most response values are 0 and only a small fraction are large, which is consistent with physiological findings on optic-tectum cells.
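The two formulas referred to above appear in the patent only as drawings and are not reproduced in this text; reconstructed from the surrounding definitions (a hedged reconstruction, not the original typesetting), they take the standard sparse-coding form:
I = ∑k ak·Bk (1)
ak = Ck·I (2)
where Ck·I denotes applying the coding filter Ck to the (vectorized) image or image block I.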
In a soft aerial-refueling scene, the optic-tectum-like cell responses corresponding to background regions are similar to one another, whereas the responses of the refueling-drogue region differ greatly from those of the background. The cell response is determined jointly by the input image and the receptive field, and the receptive fields of optic-tectum cells are strongly selective for direction and edge information. The larger the response of a given receptive field, the better the direction and edge information of the image block matches the selectivity of that receptive field, so the main information of an image block can be described by its maximum response and the corresponding receptive field. The maximum responses of background regions are very similar to one another, while that of the refueling-drogue region differs greatly from them. Moreover, background regions are generally flat with weak edge information, so no single receptive field produces a response much larger than the others. In contrast, the refueling-drogue region contains rich edge information and strong directionality, so its response in some receptive field is often far higher than in the others. Therefore, the invention uses the maximum response of each image block to describe the probability that the block belongs to the refueling-drogue region.
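As an illustration only (not part of the original patent text), the patch-wise maximum-response computation described in step one can be sketched as follows; the filter set, the patch size of 14 and all names are assumptions consistent with the embodiment described later, and filter training itself is not shown.

```python
import numpy as np

def max_response_map(image_gray, filters, patch=14):
    """Per-block maximum filter response, used as a coarse cue for the
    probability that a block belongs to the refueling-drogue region."""
    h, w = image_gray.shape
    resp = np.zeros((h // patch, w // patch))
    for r in range(h // patch):
        for c in range(w // patch):
            block = image_gray[r * patch:(r + 1) * patch,
                               c * patch:(c + 1) * patch].astype(float).ravel()
            block -= block.mean()                 # remove the DC component
            responses = filters @ block           # one response per receptive field
            resp[r, c] = np.abs(responses).max()  # keep only the strongest response
    return resp

# filters: e.g. 128 oriented kernels of size 14 x 14, flattened to rows (128 x 196)
```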
Step two: eagle nucleus-like texture inhibition and significant map extraction
A large amount of lateral inhibition exists among the cell receptive fields of the eagle visual system: when a stimulus arrives, a central cell is inhibited by its surrounding cells. In the algorithm this inhibition appears as improved interference resistance, and it also accounts for the texture consistency of background regions. When the texture consistency between an image block and its surrounding blocks is strong, the block is considered likely to be background and is suppressed with a large texture suppression coefficient; conversely, when the texture consistency is weak, the block is more likely to belong to the refueling-drogue region and is given a small texture suppression coefficient.
The texture-consistency computation is based on the gray-level co-occurrence matrix, which is defined as follows. Assume the image to be analyzed has Nx pixels in the horizontal direction and Ny pixels in the vertical direction, and that its gray level is G. Let X = {1, 2, …, Nx} denote the horizontal pixel coordinates, Y = {1, 2, …, Ny} the vertical pixel coordinates, and N = {0, 1, …, G} the quantized gray levels; the original image can then be represented as a mapping f: X × Y → N from the pixel coordinates to the gray levels. The statistics of the gray levels of pixel pairs separated by a given distance along a given direction reflect the texture characteristics of the image; describing these statistics for every pixel pair with a matrix yields the gray-level co-occurrence matrix, denoted W.
Any point (x, y) in the image forms a pixel pair with the pixel (x + a, y + b) at a fixed offset from it; let the gray values of the pair be (i, j), i.e. the gray value of (x, y) is i and that of (x + a, y + b) is j. Fixing a and b and moving the point (x, y) over the whole image yields the various values of (i, j). If the image has G gray levels, there are G² combinations of i and j. Counting the frequency P(i, j, d, θ) of each combination over the whole image gives the G × G square matrix [P(i, j, d, θ)], which is called the gray-level co-occurrence matrix. It is essentially the joint histogram of two pixels; different offsets (a, b) give the co-occurrence matrices of the image at a given distance along a given direction θ.
In the present invention, the offset a = 2, θ ∈ {0°, 45°, 90°, 135°} and the quantized gray level is 8, so the gray-level co-occurrence matrix is of size 8 × 4 (8 quantized gray levels, 4 directions). The co-occurrence matrix reflects information about the direction, neighboring interval and variation amplitude of the image gray levels. To analyze the local patterns and arrangement rules of the image, the co-occurrence matrix is generally not used directly; instead, secondary statistics are derived from it. Before the characteristic parameters of the co-occurrence matrix are acquired, normalization is performed according to the following formula:
P(i,j,d,θ)=P(i,j,d,θ)/R (3)
where R is a normalization constant equal to the sum of all elements of the gray-level co-occurrence matrix.
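A hand-rolled sketch of the co-occurrence counting and the normalization of formula (3) is given below for illustration; the offset and quantization level are placeholders, and the routine is not taken from any particular library.

```python
import numpy as np

def glcm(block, levels=8, offset=(0, 2)):
    """Normalized gray-level co-occurrence matrix of an image block for one offset."""
    q = (block.astype(np.float64) / 256.0 * levels).astype(int).clip(0, levels - 1)
    dy, dx = offset
    m = np.zeros((levels, levels))
    h, w = q.shape
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            m[q[y, x], q[y + dy, x + dx]] += 1    # count co-occurring gray pairs (i, j)
    return m / max(m.sum(), 1)                    # divide by R, cf. formula (3)
```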
The secondary statistics used by the invention are contrast and entropy, where the contrast is defined as follows:
W1=∑∑[(i-j)²×P(i,j,d,θ)] (4)
The contrast is the moment of inertia about the main diagonal of the W matrix; it measures the distribution of the matrix values and the local variation of the image. The larger the value of W1, the stronger the texture contrast, the clearer the image and the more obvious the texture.
The entropy is defined as follows:
W2=-∑∑P(i,j,d,θ)×logP(i,j,d,θ) (5)
Entropy represents the amount of information in the image; it is a measure of the randomness of the image content and characterizes the complexity of the texture. The entropy is 0 when the image has no texture and reaches its maximum when the image is full of texture.
To compute the texture suppression coefficient of the image, two windows of different sizes are sampled around each pixel and the gray-level co-occurrence matrices of the two sampled image blocks are computed separately. Each matrix is then normalized with formula (3). The secondary statistics of the two blocks are computed from their co-occurrence matrices, and the distance between the two sets of statistics is taken; this distance describes the texture consistency between an image block and its surroundings. If the distance is large, the texture of the block differs greatly from that of its surroundings, so the block is assigned a small texture suppression coefficient; conversely, if the distance is small, the texture of the block is similar to its surroundings and the suppression coefficient should be large. The eagle-brain-nucleus-inspired texture suppression and saliency map extraction of the invention derives the texture suppression coefficient of each image block from the distance between the secondary statistics of its two co-occurrence matrices, and multiplies the reciprocal of the suppression coefficient, taken as a texture enhancement coefficient, by the maximum response of each image block to obtain the saliency map of the image; the region corresponding to the maximum saliency is the drogue region. The extracted eagle-eye-like saliency map filters out part of the background information and reduces the computational load of the subsequent processing.
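A minimal sketch of how the secondary statistics and the suppression/enhancement step could be combined for one image block, assuming the two co-occurrence matrices of the small and large sampling windows have already been computed; the statistics distance and the small constant are illustrative choices, not the patent's values.

```python
import numpy as np

def contrast_entropy(P):
    """Secondary statistics of a normalized co-occurrence matrix, cf. formulas (4) and (5)."""
    i, j = np.indices(P.shape)
    contrast = np.sum((i - j) ** 2 * P)
    entropy = -np.sum(P[P > 0] * np.log(P[P > 0]))
    return np.array([contrast, entropy])

def block_saliency(max_response, glcm_small, glcm_large):
    """Saliency of one image block: maximum tectum-like response times the
    texture enhancement coefficient (reciprocal of the suppression coefficient)."""
    d = np.linalg.norm(contrast_entropy(glcm_small) - contrast_entropy(glcm_large))
    suppression = 1.0 / (d + 1e-6)   # similar local and contextual texture -> strong suppression
    return max_response / suppression
```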
Step three: color thresholding
In the verification platform designed by the invention, the refueling-drogue region is a red ring. As shown in FIG. 2, the drogue region is represented by a ring, and green and blue markers are affixed to the ring for visual positioning; the topmost point is blue and serves as the first marker point, the points are numbered in sequence, and the remaining points are green. The eagle-eye-like saliency map extraction yields a rough region containing the drogue, and on this basis color threshold segmentation is applied to the salient region to obtain a more accurate drogue region. Compared with the RGB (red-green-blue) color space, the HSV (hue-saturation-value) color space is closer to the way humans perceive color, decomposing color into the three components perceived by the human eye: hue, saturation and value. The three HSV components express the brightness, hue and vividness of a color more intuitively. On the basis of the saliency map, the image is threshold-segmented using the H (hue) and S (saturation) channels to obtain the regions containing red objects such as the drogue, and the segmented image is binarized to obtain a segmented binary map.
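For illustration, an OpenCV-style sketch of the H/S threshold segmentation of step three is given below; the threshold values are placeholders rather than the patent's tuned parameters, and red needs two hue bands because the hue axis wraps around at red.

```python
import cv2

def red_mask(bgr_image):
    """Binary mask of red pixels obtained from the H and S channels."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    # OpenCV hue values lie in [0, 180); red occupies both ends of the range.
    low = cv2.inRange(hsv, (0, 80, 0), (10, 255, 255))
    high = cv2.inRange(hsv, (170, 80, 0), (180, 255, 255))
    return cv2.bitwise_or(low, high)
```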
Step four: region of interest extraction
Since the regions extracted by threshold segmentation may be incomplete and contain noise, morphological operations are first applied to the binary image obtained from the HSV threshold segmentation, and the outer contour of each red region is extracted, in order to obtain the regions of interest of the original image. Let qi denote the contour point set of the i-th region. The two dimensions of the image coordinates of the contour points of each region are sorted to obtain the maximum and minimum contour coordinates of the region, which give the circumscribed rectangle of the region. This rectangle is taken as the Region of Interest (ROI) and represented as ROIi = (ui, vi, wi, hi), where ui and vi are the image coordinates of the top-left vertex of the ROI rectangle and wi and hi are its width and height, so that the circumscribed rectangle of each region is uniquely determined.
The red regions occupy only a small fraction of the image pixels, so further processing of the extracted ROIs consumes far fewer computing resources than processing the original image, which increases the computation speed. Because the red region of the designed refueling drogue contains green and blue marker points, holes appear in the drogue region of the binary image obtained by red threshold segmentation; these holes must therefore be filled while the ROIs are extracted. As shown in FIG. 2, the refueling drogue is a ring; to prevent the inner-ring region from being filled, the area of each cavity in every segmented red region is examined, and contours whose area exceeds the area threshold are not filled, which yields the correct red ROIs.
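A sketch of step four under the same assumptions: clean the red mask morphologically, take the bounding rectangle of each outer contour as an ROI, and fill only holes below an area threshold so that the inner ring of the drogue is preserved; the kernel size and area threshold are illustrative.

```python
import cv2

def extract_rois(red_mask, hole_area_thresh=500):
    """Bounding-rectangle ROIs of red regions with small holes filled."""
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(red_mask, cv2.MORPH_CLOSE, kernel)
    contours, hierarchy = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
    rois = []
    for idx, cnt in enumerate(contours):
        if hierarchy[0][idx][3] == -1:                  # outer contour of a red region
            rois.append(cv2.boundingRect(cnt))          # (u_i, v_i, w_i, h_i)
        elif cv2.contourArea(cnt) < hole_area_thresh:   # small hole left by a marker disc
            cv2.drawContours(mask, [cnt], -1, 255, -1)  # fill it; the large inner ring stays open
    return rois, mask
```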
Step five: taper sleeve mark point coordinate acquisition
In the invention, blue and green circular marker points are affixed to the drogue region; when the distance between the drogue and the receiver is relatively short, whether a region is the drogue region can be judged by whether each ROI contains blue and green discs. HSV threshold segmentation is applied to the blue and green channels within each ROI to decide whether a blue or green disc is contained in the red region, so that non-target red distractors are removed and the drogue region of interest is found.
After the drogue region has been detected, the center points of the circular markers must be extracted. First, the gray image is segmented into a set of binary images with consecutive equally spaced thresholds: if the threshold range is [T1, T2] and the step is t, the thresholds are T1, T1 + t, T1 + 2t, …, T2. Second, the boundary of each binary image is extracted, its connected regions are detected, and the image coordinates of the centers of the connected regions are extracted. Third, the center coordinates of the connected regions of all binary images are collected; if the distance between the centers of connected regions from different binary images is smaller than a threshold, those binary-image blobs are grouped into one gray-image blob. Finally, the image coordinates and size of each gray-image blob are determined: the blob position in the gray image is the weighted average of the centers of all its corresponding binary-image blobs, where the weight qi is the inertia ratio of the i-th binary-image blob given by formula (6), so the closer a binary-image blob is to a circle, the larger its contribution to the gray-image blob position; the size of the gray-image blob is the median of the radii of all its binary-image blobs.
During binary-image blob extraction, spurious points can be filtered out by constraining the shape, area and color of the blobs. Because the marker points on the drogue are circular, the invention filters spurious points by setting area and circularity thresholds for the blobs; each remaining blob corresponds to a green or blue marker point inside the red ring of the refueling drogue. In the marker-point extraction process the input image is the image obtained after HSV threshold segmentation; connected-region detection is performed on it directly without further binarization, non-circular spurious points are filtered with the circularity and area thresholds, and the center image coordinate of each marker point is output.
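The multi-threshold blob extraction described in step five is close to what OpenCV's SimpleBlobDetector implements (successive thresholds, grouping of nearby centers, inertia weighting, median radius); a hedged sketch with illustrative area and circularity limits follows.

```python
import cv2

def marker_centers(mask):
    """Centers of circular marker blobs in a thresholded image,
    filtered by area and circularity (threshold values illustrative)."""
    params = cv2.SimpleBlobDetector_Params()
    params.filterByArea = True
    params.minArea = 20
    params.filterByCircularity = True
    params.minCircularity = 0.6
    params.filterByColor = True
    params.blobColor = 255               # bright blobs in the segmented mask
    detector = cv2.SimpleBlobDetector_create(params)
    return [kp.pt for kp in detector.detect(mask)]
```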
Step six: identification point matching
Before the pose measurement, the one-to-one correspondence between the extracted image coordinate points and the actual circular marker points must be determined. The extraction method of step five yields the image coordinates of the blue and green marker points, but it cannot distinguish markers with different numbers, so a feature-point matching algorithm is needed to solve the correspondence problem.
The invention numbers the feature points with a simple method: the green circular marker is taken as the first point, and the blue point closest to the first marker in Euclidean distance on the image plane is taken as the second marker. Excluding the first marker, the point closest to the second marker is the third marker, and so on, until all marker points are numbered.
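The nearest-neighbour numbering rule of step six can be sketched as follows, assuming the image coordinate of the uniquely coloured first marker is already known; the names are illustrative.

```python
import numpy as np

def order_markers(first_pt, other_pts):
    """Number the markers by repeatedly taking the unvisited point closest
    to the most recently numbered one (greedy nearest-neighbour chaining)."""
    ordered = [np.asarray(first_pt, dtype=float)]
    remaining = [np.asarray(p, dtype=float) for p in other_pts]
    while remaining:
        dists = [np.linalg.norm(p - ordered[-1]) for p in remaining]
        ordered.append(remaining.pop(int(np.argmin(dists))))
    return ordered
```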
Step seven: camera parameter calibration
A black-and-white checkerboard calibration board is manufactured, with a known side length for each square. The vision sensor photographs the checkerboard at different angles and different depths to reduce the calibration error and obtain more accurate camera intrinsic parameters. In the calibration experiment the camera collects images of the checkerboard at various angles, and after the corner points of each calibration image are extracted, the camera model can be computed.
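Step seven corresponds to standard chessboard calibration; a minimal OpenCV sketch is given for illustration, where the board dimensions are assumptions and the square size follows the 5.8 mm value stated in the detailed embodiment.

```python
import cv2
import numpy as np

def calibrate(images, board=(8, 6), square=0.0058):
    """Estimate the camera intrinsics from chessboard images (square size in metres)."""
    objp = np.zeros((board[0] * board[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2) * square
    obj_pts, img_pts, size = [], [], None
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        size = gray.shape[::-1]
        found, corners = cv2.findChessboardCorners(gray, board)
        if found:
            corners = cv2.cornerSubPix(
                gray, corners, (11, 11), (-1, -1),
                (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
            obj_pts.append(objp)
            img_pts.append(corners)
    rms, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)
    return K, dist, rms
```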
step eight: pose measurement of refueling drogue
For the soft aerial-refueling problem, the invention assumes the camera is mounted at a particular location on the receiver. To obtain the position of the refueling drogue relative to the refueling receptacle, the relative position is computed from the positions of the marker points and the imaging model of the camera. The invention measures the pose of the refueling drogue with the robust pose measurement algorithm Robust Perspective-n-Point (RPnP), which obtains the solution of the PnP problem by constructing a seventh-order polynomial as the cost function. The overall flow of the invention is shown in FIG. 3.
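Step eight solves a Perspective-n-Point problem. The RPnP algorithm used by the invention is not part of OpenCV, so the sketch below substitutes cv2.solvePnP purely to illustrate the data flow: matched marker world and image coordinates plus the calibrated intrinsics in, relative pose out.

```python
import cv2
import numpy as np

def drogue_pose(world_pts, image_pts, K, dist):
    """Pose of the drogue coordinate frame expressed in the camera frame.
    world_pts: Nx3 marker centres in the drogue frame; image_pts: Nx2 matched
    image coordinates from steps five and six."""
    ok, rvec, tvec = cv2.solvePnP(np.asarray(world_pts, np.float32),
                                  np.asarray(image_pts, np.float32),
                                  K, dist, flags=cv2.SOLVEPNP_ITERATIVE)
    R, _ = cv2.Rodrigues(rvec)
    return ok, R, tvec    # tvec is the drogue origin in camera coordinates
```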
3. Advantages and effects:
The invention provides an eagle-eye-inspired moving-target positioning method for soft autonomous aerial refueling and offers a practical solution for drogue detection and pose measurement in the visual navigation of soft autonomous aerial refueling. The method simulates the processing mechanism of the eagle visual system, performs saliency computation on the image acquired by the vision sensor, extracts the region of interest with color segmentation, then extracts and matches the feature points, and finally achieves accurate positioning of the refueling drogue through pose estimation. The method has strong autonomy, little dependence on external signals, strong robustness and high accuracy, and can greatly improve the safety and reliability of soft autonomous aerial refueling.
Four, Description of the Drawings
FIG. 1 Architecture of the soft autonomous aerial refueling verification platform.
FIG. 2 Relationship between the drogue marker points and the world coordinate system.
FIG. 3 Flow of the eagle-eye-inspired moving-target positioning method for soft autonomous aerial refueling.
FIG. 4 X-axis position measurements.
FIG. 5 Y-axis position measurements.
FIG. 6 Z-axis position measurements.
FIG. 7 Reprojection error curve.
Five, Detailed Description of the Invention
The effectiveness of the designed method is verified with a drogue-positioning example on a dedicated aerial verification platform. In this example two unmanned aerial vehicles are used for the test, one serving as the tanker and the other as the receiver; the main components are shown in FIG. 1. The tanker is a DJI S900 hexacopter carrying a 3DR APM 2.5 open-source flight controller; the wireless data link is a 3DR APM telemetry module, and the onboard processor is a Raspberry Pi. The receiver is also a DJI S900 hexacopter, on which a binocular vision sensor, an onboard vision processor and a digital wireless image transmission device are installed. Its flight controller is a 3DR Pixhawk open-source autopilot, the vision processor is an industrial embedded computer, the camera is a Basler acA-series industrial camera from Germany, and the wireless image transmission uses a DJI Lightbridge high-definition digital wireless image transmission module from DJI Innovations. The main configuration of the vision measurement navigation system is as follows:
(1) Onboard vision processor: PICO880; Intel Core i7-4650U processor, 1.7 GHz base frequency; 8 GB memory; 120 GB solid-state disk; size 100 × 72 × 40.3 mm; total weight about 450 g; four USB 3.0 interfaces.
(2) Onboard color vision sensor: Basler acA1920-155uc color camera; USB 3.0 interface; resolution 1920 × 1200 pixels; maximum frame rate 164 fps; 2/3-inch sensor; pixel size 3.75 µm × 3.75 µm.
Step one: eagle-optic-tectum-like cell response calculation
When a visual scene is input to the optic-tectum cells, only a small fraction of the cells respond, i.e. most cells are not activated, so an efficient coding mechanism can be established to obtain effective invariance information from the image. It is assumed that any image I can be represented by a linear combination of a series of image bases Bk. The image bases Bk are trained from a large number of natural images and represent information common to natural images; ak is the coefficient corresponding to the image base Bk. The coefficients have a certain sparsity and are computed with the coding filter Ck, which is the inverse or pseudo-inverse of the image base Bk. The invention takes 128 filter kernels with obvious direction selectivity as the receptive fields of the optic-tectum-like cells, the size of each receptive-field filter kernel being 14 × 14 = 196. Filtering any image with these filters yields the optic-tectum-like cell responses; the responses exhibit a certain sparsity, i.e. most response values are 0 and only a small fraction are large, which is consistent with physiological findings on optic-tectum cells.
In a soft aerial-refueling scene, the optic-tectum-like cell responses corresponding to background regions are similar to one another, whereas the responses of the refueling-drogue region differ greatly from those of the background. The cell response is determined jointly by the input image and the receptive field, and the receptive fields of optic-tectum cells are strongly selective for direction and edge information. The larger the response of a given receptive field, the better the direction and edge information of the image block matches the selectivity of that receptive field, so the main information of an image block can be described by its maximum response and the corresponding receptive field. The maximum responses of background regions are very similar to one another, while that of the refueling-drogue region differs greatly from them. Moreover, background regions are generally flat with weak edge information, so no single receptive field produces a response much larger than the others. In contrast, the refueling-drogue region contains rich edge information and strong directionality, so its response in some receptive field is often far higher than in the others. Therefore, the invention uses the maximum response of each image block to describe the probability that the block belongs to the refueling-drogue region.
Step two: eagle nucleus-like texture inhibition and significant map extraction
A large amount of lateral inhibition exists among the cell receptive fields of the eagle visual system: when a stimulus arrives, a central cell is inhibited by its surrounding cells. In the algorithm this inhibition appears as improved interference resistance, and it also accounts for the texture consistency of background regions. When the texture consistency between an image block and its surrounding blocks is strong, the block is considered likely to be background and is suppressed with a large texture suppression coefficient; conversely, when the texture consistency is weak, the block is more likely to belong to the refueling-drogue region and is given a small texture suppression coefficient.
The texture-consistency computation is based on the gray-level co-occurrence matrix, which is defined as follows. Assume the image to be analyzed has Nx pixels in the horizontal direction and Ny pixels in the vertical direction, and that its gray level is G. Let X = {1, 2, …, Nx} denote the horizontal pixel coordinates, Y = {1, 2, …, Ny} the vertical pixel coordinates, and N = {0, 1, …, G} the quantized gray levels; the original image can then be represented as a mapping f: X × Y → N from the pixel coordinates to the gray levels. The statistics of the gray levels of pixel pairs separated by a given distance along a given direction reflect the texture characteristics of the image; describing these statistics for every pixel pair with a matrix yields the gray-level co-occurrence matrix, denoted W.
Any point (x, y) in the image forms a pixel pair with the pixel (x + a, y + b) at a fixed offset from it; let the gray values of the pair be (i, j), i.e. the gray value of (x, y) is i and that of (x + a, y + b) is j. Fixing a and b and moving the point (x, y) over the whole image yields the various values of (i, j). If the image has G gray levels, there are G² combinations of i and j. Counting the frequency P(i, j, d, θ) of each combination over the whole image gives the G × G square matrix [P(i, j, d, θ)], which is called the gray-level co-occurrence matrix. It is essentially the joint histogram of two pixels; different offsets (a, b) give the co-occurrence matrices of the image at a given distance along a given direction θ.
In the present invention, the offset a = 2, θ ∈ {0°, 45°, 90°, 135°} and the quantized gray level is 8, so the gray-level co-occurrence matrix is of size 8 × 4 (8 quantized gray levels, 4 directions). The co-occurrence matrix reflects information about the direction, neighboring interval and variation amplitude of the image gray levels. To analyze the local patterns and arrangement rules of the image, the co-occurrence matrix is generally not used directly; instead, secondary statistics are derived from it. Before the characteristic parameters of the co-occurrence matrix are acquired, normalization is performed according to the following formula:
P(i,j,d,θ)=P(i,j,d,θ)/R (3)
where R is a normalization constant equal to the sum of all elements of the gray-level co-occurrence matrix.
The secondary statistics used by the invention are contrast and entropy, where the contrast is defined as follows:
W1=∑∑[(i-j)²×P(i,j,d,θ)] (4)
The contrast is the moment of inertia about the main diagonal of the W matrix; it measures the distribution of the matrix values and the local variation of the image. The larger the value of W1, the stronger the texture contrast, the clearer the image and the more obvious the texture.
The entropy is defined as follows:
W2=-∑∑P(i,j,d,θ)×logP(i,j,d,θ) (5)
Entropy represents the amount of information in the image; it is a measure of the randomness of the image content and characterizes the complexity of the texture. The entropy is 0 when the image has no texture and reaches its maximum when the image is full of texture.
To compute the texture suppression coefficient of the image, two windows of different sizes are sampled around each pixel and the gray-level co-occurrence matrices of the two sampled image blocks are computed separately. Each matrix is then normalized with formula (3). The secondary statistics of the two blocks are computed from their co-occurrence matrices, and the distance between the two sets of statistics is taken; this distance describes the texture consistency between an image block and its surroundings. If the distance is large, the texture of the block differs greatly from that of its surroundings, so the block is assigned a small texture suppression coefficient; conversely, if the distance is small, the texture of the block is similar to its surroundings and the suppression coefficient should be large. The eagle-brain-nucleus-inspired texture suppression and saliency map extraction of the invention derives the texture suppression coefficient of each image block from the distance between the secondary statistics of its two co-occurrence matrices, and multiplies the reciprocal of the suppression coefficient, taken as a texture enhancement coefficient, by the maximum response of each image block to obtain the saliency map of the image; the region corresponding to the maximum saliency is the drogue region. The extracted eagle-eye-like saliency map filters out part of the background information and reduces the computational load of the subsequent processing.
step three: color thresholding
In the verification platform designed by the invention, the refueling-drogue region is a red ring. As shown in FIG. 2, the drogue region is represented by a ring, and green and blue markers are affixed to the ring for visual positioning; the topmost point is blue and serves as the first marker point, the points are numbered in sequence, and the remaining points are green. The eagle-eye-like saliency map extraction yields a rough region containing the drogue, and on this basis color threshold segmentation is applied to the salient region to obtain a more accurate drogue region. Compared with the RGB (red-green-blue) color space, the HSV (hue-saturation-value) color space is closer to the way humans perceive color, decomposing color into the three components perceived by the human eye: hue, saturation and value. The three HSV components express the brightness, hue and vividness of a color more intuitively. On the basis of the saliency map, the image is threshold-segmented using the H (hue) and S (saturation) channels to obtain the regions containing red objects such as the drogue, and the segmented image is binarized to obtain a segmented binary map.
Step four: region of interest extraction
Since the regions extracted by threshold segmentation may be incomplete and contain noise, morphological operations are first applied to the binary image obtained from the HSV threshold segmentation, and the outer contour of each red region is extracted, in order to obtain the regions of interest of the original image. Let qi denote the contour point set of the i-th region. The two dimensions of the image coordinates of the contour points of each region are sorted to obtain the maximum and minimum contour coordinates of the region, which give the circumscribed rectangle of the region. This rectangle is taken as the Region of Interest (ROI) and represented as ROIi = (ui, vi, wi, hi), where ui and vi are the image coordinates of the top-left vertex of the ROI rectangle and wi and hi are its width and height, so that the circumscribed rectangle of each region is uniquely determined.
The red regions occupy only a small fraction of the image pixels, so further processing of the extracted ROIs consumes far fewer computing resources than processing the original image, which increases the computation speed. Because the red region of the designed refueling drogue contains green and blue marker points, holes appear in the drogue region of the binary image obtained by red threshold segmentation; these holes must therefore be filled while the ROIs are extracted. As shown in FIG. 2, the refueling drogue is a ring; to prevent the inner-ring region from being filled, the area of each cavity in every segmented red region is examined, and contours whose area exceeds the area threshold are not filled, which yields the correct red ROIs.
Step five: taper sleeve mark point coordinate acquisition
In the invention, blue and green circular marker points are affixed to the drogue region; when the distance between the drogue and the receiver is relatively short, whether a region is the drogue region can be judged by whether each ROI contains blue and green discs. HSV threshold segmentation is applied to the blue and green channels within each ROI to decide whether a blue or green disc is contained in the red region, so that non-target red distractors are removed and the drogue region of interest is found.
After the drogue region has been detected, the center points of the circular markers must be extracted. First, the gray image is segmented into a set of binary images with consecutive equally spaced thresholds: if the threshold range is [T1, T2] and the step is t, the thresholds are T1, T1 + t, T1 + 2t, …, T2. Second, the boundary of each binary image is extracted, its connected regions are detected, and the image coordinates of the centers of the connected regions are extracted. Third, the center coordinates of the connected regions of all binary images are collected; if the distance between the centers of connected regions from different binary images is smaller than a threshold, those binary-image blobs are grouped into one gray-image blob. Finally, the image coordinates and size of each gray-image blob are determined: the blob position in the gray image is the weighted average of the centers of all its corresponding binary-image blobs, where the weight qi is the inertia ratio of the i-th binary-image blob given by formula (6), so the closer a binary-image blob is to a circle, the larger its contribution to the gray-image blob position; the size of the gray-image blob is the median of the radii of all its binary-image blobs.
During binary-image blob extraction, spurious points can be filtered out by constraining the shape, area and color of the blobs. Because the marker points on the drogue are circular, the invention filters spurious points by setting area and circularity thresholds for the blobs; each remaining blob corresponds to a green or blue marker point inside the red ring of the refueling drogue. In the marker-point extraction process the input image is the image obtained after HSV threshold segmentation; connected-region detection is performed on it directly without further binarization, non-circular spurious points are filtered with the circularity and area thresholds, and the center image coordinate of each marker point is output.
Step six: identification point matching
Before the pose measurement, the one-to-one correspondence between the extracted image coordinate points and the actual circular marker points must be determined. The extraction method of step five yields the image coordinates of the blue and green marker points, but it cannot distinguish markers with different numbers, so a feature-point matching algorithm is needed to solve the correspondence problem.
The invention numbers the feature points with a simple method: the green circular marker is taken as the first point, and the blue point closest to the first marker in Euclidean distance on the image plane is taken as the second marker. Excluding the first marker, the point closest to the second marker is the third marker, and so on, until all marker points are numbered.
Step seven: camera parameter calibration
First, a black-and-white checkerboard calibration board is manufactured, with each square having a side length of 5.8 mm. The Basler vision sensor photographs the checkerboard at different angles and different depths to reduce the calibration error and obtain more accurate camera intrinsic parameters. In the calibration experiment the camera collects images of the checkerboard at various angles, and after the corner points of each calibration-image checkerboard are extracted, the camera model can be computed.
Camera parameters:
step eight: pose measurement of refueling drogue
The drogue designed by the invention has an outer diameter of 35 cm and an inner diameter of 20 cm; six blue solid circles and one green solid circle are arranged on the ring of the drogue, each circle having a radius of about 2 cm. The centers of the seven circular markers on the drogue are coplanar. According to this positional relationship, a world coordinate system is established with this plane as the X-Y plane, the center of the drogue circle as the origin, and the direction perpendicular to the X-Y plane as the positive Z axis. Once the world coordinate system is established, the world coordinates of the centers of the seven circular markers are obtained, as shown in FIG. 2. To obtain the position of the refueling drogue relative to the refueling receptacle, the relative position is computed from the relative position information of the marker points and the camera imaging model. The invention measures the pose of the refueling drogue based on the existing advanced pose measurement algorithm Robust Perspective-n-Point (RPnP).
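For illustration only, the marker world coordinates of the coordinate system defined above could be tabulated as in the sketch below; the even angular spacing and the ring radius used here are assumptions made for the example, since the patent gives the actual coordinates in FIG. 2.

```python
import numpy as np

def marker_world_points(n_markers=7, radius=0.1375):
    """Marker centres on the drogue ring in the drogue frame (Z = 0),
    first marker at the top; radius and spacing are placeholder assumptions."""
    angles = np.pi / 2 + 2 * np.pi * np.arange(n_markers) / n_markers
    return np.stack([radius * np.cos(angles),
                     radius * np.sin(angles),
                     np.zeros(n_markers)], axis=1)
```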
Pose solution is performed on a number of images from a continuous image sequence, and the solved displacements along the three axes are shown in FIGS. 4-6. Based on the solution, the marker points are reprojected back to obtain the pixel coordinates of their centers, and the reprojection error is obtained by subtracting the marker-point coordinates extracted in step five; the error curve is shown in FIG. 7. The test results show that the invention can accurately measure the pose of the simulated refueling drogue in soft aerial refueling.
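The reprojection check described above can be reproduced with cv2.projectPoints; a short sketch, with illustrative names, is given below.

```python
import cv2
import numpy as np

def reprojection_error(world_pts, image_pts, rvec, tvec, K, dist):
    """Mean pixel distance between reprojected marker centres and the
    centres extracted in step five."""
    proj, _ = cv2.projectPoints(np.asarray(world_pts, np.float32), rvec, tvec, K, dist)
    proj = proj.reshape(-1, 2)
    err = np.linalg.norm(proj - np.asarray(image_pts, np.float32), axis=1)
    return float(err.mean())
```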

Claims (1)

1. An eagle-eye-inspired moving-target positioning method for soft autonomous aerial refueling, characterized by: starting from the processing mechanism of the eagle visual system, simulating the eagle's visual attention mechanism in combination with the interactions among the eagle brain nuclei, extracting the rough region containing the refueling drogue, then performing feature extraction on the drogue region using color segmentation, and further measuring the relative position information between the refueling drogue and the vision system of the receiver with a pose estimation algorithm; the specific steps are as follows:
Step one: eagle-optic-tectum-like cell response calculation
An effective coding mechanism imitating eagle optic-tectum cells is established, simulating the efficiency and sparsity of optic-tectum cell coding and acquiring direction and edge information from an image; any image I is expressed by a linear combination of a series of image bases Bk; the image bases Bk are obtained by training on natural images and represent information common to natural images, and ak is the coefficient corresponding to the image base Bk; the coefficients have sparsity and are obtained with a coding filter Ck, which is the inverse or pseudo-inverse of the image base Bk; filtering any image with these filters yields the responses of the optic-tectum-like cells, and the responses exhibit sparsity, i.e. most response values are 0 and only a small fraction are non-zero, which is consistent with physiological findings on optic-tectum cells;
The optic-tectum-like cell responses corresponding to background regions in the soft aerial-refueling scene are similar, while the response of the refueling-drogue region differs from that of the background; the cell response is determined jointly by the input image and the receptive field, and the receptive fields of optic-tectum cells are selective for direction and edge information; the larger the response of a given receptive field, the better the direction and edge information of the image block matches the selectivity of that receptive field; the information of an image block is described by its maximum response and the corresponding receptive field; the maximum responses of background regions are very similar, while that of the refueling-drogue region differs from them; meanwhile, the background region is flat with inconspicuous edge information, so no single receptive field produces a response much larger than the others; in contrast, the refueling-drogue region contains rich edge information and is directional, so its response in some receptive field is often higher than in the others; therefore, the maximum response of each image block is used to describe the probability that the block belongs to the refueling-drogue region;
Step two: eagle nucleus-like texture inhibition and significant map extraction
Lateral inhibition exists among the cell receptive fields of the eagle visual system: when a stimulus arrives, a central cell is inhibited by its surrounding cells; in the algorithm this inhibition appears as improved interference resistance, and the texture consistency of the background region is also considered; when the texture consistency between an image block and its surrounding blocks is strong, the block is considered likely to be background and is suppressed with a large texture suppression coefficient; conversely, when the texture consistency is weak, the block is more likely to belong to the refueling-drogue region and is given a small texture suppression coefficient; the texture-consistency computation is based on the gray-level co-occurrence matrix, which is defined as follows: the image to be analyzed has Nx pixels in the horizontal direction and Ny pixels in the vertical direction, and its gray level is G; X = {1, 2, …, Nx} denotes the horizontal pixel coordinates, Y = {1, 2, …, Ny} the vertical pixel coordinates, and N = {0, 1, …, G} the quantized gray levels; the original image is then represented as a mapping f: X × Y → N from the pixel coordinates to the gray levels; the statistics of the gray levels of pixel pairs separated by a given distance along a given direction reflect the texture characteristics of the image, and describing these statistics for every pixel pair with a matrix gives the gray-level co-occurrence matrix, denoted W;
Any point (x, y) in the image forms a pixel pair with the pixel (x + a, y + b) at a fixed offset from it, and the gray values of the pair are set to (i, j), i.e. the gray value of (x, y) is i and that of (x + a, y + b) is j; fixing a and b and moving the point (x, y) over the whole image yields the various values of (i, j); if the image has G gray levels, there are G² combinations of i and j; counting the frequency P(i, j, d, θ) of each combination over the whole image gives the G × G square matrix [P(i, j, d, θ)], which is called the gray-level co-occurrence matrix; it is the joint histogram of two pixels, and different offsets (a, b) give the co-occurrence matrices of the image at a given distance along a given direction θ;
Setting a = b = 2, θ ∈ {0°, 45°, 90°, 135°}, and quantizing the gray scale to 8 levels gives an 8 × 8 gray level co-occurrence matrix for each of the four directions; the gray level co-occurrence matrix reflects information about the direction, adjacent interval and variation amplitude of the image gray levels. In order to analyze the local patterns and arrangement rules of the image, the gray level co-occurrence matrix is not used directly; instead, secondary statistics are computed from it. Before the characteristic parameters of the gray level co-occurrence matrix are computed, it is normalized according to the following formula:
P(i,j,d,θ)=P(i,j,d,θ)/R (3)
wherein R is a normalization constant and is the sum of all elements in the gray level co-occurrence matrix;
the secondary statistics are contrast and entropy, where contrast is defined as follows:
W1=∑i∑j[(i-j)²×P(i,j,d,θ)] (4)
The contrast W1 is the moment of inertia about the main diagonal of the gray level co-occurrence matrix; it measures how the matrix values are distributed and how much the image varies locally. The larger W1 is, the stronger the texture contrast, the clearer the image and the more pronounced the texture appears;
Entropy is defined as follows:
W2=-∑i∑jP(i,j,d,θ)×logP(i,j,d,θ) (5)
The entropy represents the amount of information in the image; it is a measure of the randomness of the image content and characterizes the complexity of the texture. When the image has no texture the entropy is 0, and when the image is full of fine texture the entropy reaches its maximum. To compute the texture suppression coefficient of the image, two windows of different sizes are sampled around each pixel and the gray level co-occurrence matrices of the two sampled image blocks are computed separately; the normalized gray level co-occurrence matrices are then obtained with formula (3); the secondary statistics of each block are computed from its gray level co-occurrence matrix, and the distance between the two sets of secondary statistics describes the texture consistency between an image block and its surroundings. If this distance is large, the texture of the block differs considerably from that of the surrounding area, so the block is assigned a small texture suppression coefficient; conversely, if the distance is small, the block resembles the texture of the surrounding area and its texture suppression coefficient is large. The distance between the secondary statistics of the two gray level co-occurrence matrices of each image block is used as the texture suppression coefficient, and the reciprocal of this coefficient, acting as a texture enhancement coefficient, is multiplied by the maximum response of each image block to obtain the saliency map of the image; the region corresponding to the maximum of the saliency is the taper sleeve region. The eagle-eye-imitated saliency map thus filters out part of the background information and reduces the amount of computation in the subsequent processing;
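The sketch below shows, under stated assumptions, how the suppression coefficient and the saliency value of one image block could be assembled from the quantities defined above; the two window sizes, the Euclidean distance between the statistic vectors and the small constant guarding against division by zero are all assumptions, and glcm() and max_orientation_response() refer to the illustrative helpers sketched earlier, not to the patent's exact implementation.

```python
import numpy as np

def secondary_stats(P):
    # Contrast (formula (4)) and entropy (formula (5)) of a normalized GLCM
    i, j = np.indices(P.shape)
    contrast = np.sum((i - j) ** 2 * P)
    nz = P[P > 0]
    entropy = -np.sum(nz * np.log(nz))
    return np.array([contrast, entropy])

def block_saliency(small_win, large_win, max_response, dx=2, dy=2):
    # Texture consistency: distance between the statistics of the two windows
    d = np.linalg.norm(secondary_stats(glcm(small_win, dx, dy))
                       - secondary_stats(glcm(large_win, dx, dy)))
    suppression = 1.0 / (d + 1e-6)   # similar textures -> strong suppression
    # Saliency = maximum cell response x reciprocal of the suppression coefficient
    return max_response / suppression
```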
Step three: color thresholding
In the designed verification platform, the refueling taper sleeve area is represented by a red circular ring, and green and blue marks are pasted on the ring for visual positioning; the topmost point is blue and serves as the first identification point, the remaining points are green, and the points are numbered in sequence. An approximate region containing the taper sleeve is extracted from the eagle-eye-imitated saliency map, and color threshold segmentation is then applied to this salient region to obtain a more accurate taper sleeve region. Compared with the RGB (red-green-blue) color space, the HSV (hue-saturation-value) color space is closer to the way humans perceive color: it decomposes a color into three components (hue, saturation and value) that express the tone, vividness and brightness of the color more intuitively. On the basis of the saliency map extraction, threshold segmentation is performed on the H (hue) and S (saturation) channels to obtain the regions of red objects containing the taper sleeve, and the segmented image is binarized to obtain a segmented binary map;
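A minimal OpenCV sketch of this red segmentation in HSV is shown below; the hue and saturation limits are assumptions chosen for illustration, not the thresholds used by the method.

```python
import cv2

def segment_red(bgr):
    """Binary map of candidate red (taper sleeve) pixels in HSV space."""
    hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
    # Red wraps around the hue axis, so two H ranges are combined;
    # the S lower bound rejects washed-out background pixels
    low  = cv2.inRange(hsv, (0,   90, 50), (10,  255, 255))
    high = cv2.inRange(hsv, (170, 90, 50), (180, 255, 255))
    return cv2.bitwise_or(low, high)
```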
step four: region of interest extraction
Because the region extracted by threshold segmentation is incomplete and contains noise, a morphological operation is first applied to the binary image obtained from the HSV threshold segmentation in order to obtain the region of interest of the original image, and the external contour of each red region is extracted. Let the contour point set of the r-th region be qr, whose m-th element is the image coordinate of the m-th contour point of that region. Sorting the two coordinate dimensions of the contour points of each region gives the maximum and minimum coordinate values of each region, from which the circumscribed rectangle of each region is obtained and used as the Region of Interest (ROI), represented as
ROIr = (ur, vr, wr, hr), where ur and vr are the image coordinates of the top-left vertex of the rectangular ROI and wr and hr are the width and height of the rectangle; these four values uniquely determine the circumscribed rectangle of each region;
The red area occupies only a small proportion of the image pixels, so further processing of the extracted ROI consumes far fewer computing resources than processing the original image, which increases the computation speed. Because the red ring of the designed refueling taper sleeve contains green and blue identification points, holes appear inside the taper sleeve area in the binary image obtained from the red threshold segmentation; these contour holes are therefore filled while the ROI is extracted. Since the refueling taper sleeve is a circular ring, to prevent the inner ring area from being filled, the area of each hole in each segmented red region is checked and contours exceeding the area threshold are not filled, which yields the correct red ROI;
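The following sketch, under the assumption of OpenCV contour analysis and an illustrative hole-area threshold, shows one way to realize the ROI extraction and selective hole filling described above.

```python
import cv2

def extract_rois(binary, hole_area_thresh=2000.0):
    """Bounding boxes of the red regions plus selective hole filling."""
    # Two-level hierarchy: outer contours of red regions and their inner holes
    contours, hierarchy = cv2.findContours(binary, cv2.RETR_CCOMP,
                                           cv2.CHAIN_APPROX_SIMPLE)
    rois = []
    for idx, c in enumerate(contours):
        if hierarchy[0][idx][3] == -1:
            # Outer contour: its bounding rectangle is the ROI (u, v, w, h)
            rois.append(cv2.boundingRect(c))
        elif cv2.contourArea(c) < hole_area_thresh:
            # Small inner hole (an identification point): fill it;
            # the large inner-ring hole stays untouched
            cv2.drawContours(binary, [c], -1, 255, cv2.FILLED)
    return rois, binary
```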
step five: taper sleeve identification point coordinate acquisition
Blue and green circular identification points are pasted in the taper sleeve area. When the taper sleeve is close to the receiver, whether a region is the taper sleeve area is judged by whether its ROI contains blue and green discs: HSV threshold segmentation of the blue and green channels is applied to each ROI, and whether the red region contains a blue or green disc is checked; non-target red interference objects are thereby removed and the taper sleeve region of interest is found;
After detecting the taper sleeve area, the center points of the circular identification points are extracted. First, the gray level image is segmented into a set of binary images with consecutive equally spaced thresholds: if the threshold range is [T1, T2] and the step size is t, the thresholds are T1, T1 + t, T1 + 2t, …, T2. Second, the boundary of each binary image is extracted, its connected regions are detected, and the image coordinates of the centers of the connected regions are extracted. Third, the center coordinates of the connected regions of all binary images are collected; if the distance between the centers of connected regions from different binary images is smaller than a threshold, those binary image blobs are assigned to the same gray image blob. Finally, the image coordinates and size of each gray image blob are determined: the position of a blob in the gray image is the weighted mean of the center coordinates of all its corresponding binary image blobs, as in formula (6), where qr is the inertia ratio of the r-th binary image blob, so that the closer the shape of a binary image blob is to a circle, the larger its contribution to the gray image blob position; the size of the gray image blob is the median of the radii of all its binary image blobs;
During binary image blob extraction, clutter points are filtered out by constraints on the shape, area and color of the blobs; since the identification points on the taper sleeve are circular, clutter is removed by setting area and circularity thresholds. Each remaining blob corresponds to a green or blue identification point in the red ring of the refueling taper sleeve. In this identification point extraction step the input image is already the result of HSV threshold segmentation, so it does not need to be binarized again: connected region detection is applied to it directly, non-circular clutter points are filtered out according to the circularity and area thresholds, and the center image coordinate of each identification point is output;
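The multi-threshold blob grouping described in this step closely matches the behaviour of OpenCV's SimpleBlobDetector, so a hedged configuration sketch is given below; all numeric limits (threshold range, step, minimum area, minimum circularity, minimum inertia ratio) are assumptions, not the values fixed by the method.

```python
import cv2

def detect_marker_centers(marker_mask):
    """Centers of circular identification points in a binary marker mask
    (e.g. the blue or green HSV threshold image)."""
    params = cv2.SimpleBlobDetector_Params()
    params.minThreshold, params.maxThreshold = 50, 220   # threshold range [T1, T2]
    params.thresholdStep = 10                            # step size t
    params.filterByColor, params.blobColor = True, 255   # bright blobs on dark background
    params.filterByArea, params.minArea = True, 30       # reject tiny clutter
    params.filterByCircularity, params.minCircularity = True, 0.7  # circular markers only
    params.filterByInertia, params.minInertiaRatio = True, 0.5
    detector = cv2.SimpleBlobDetector_create(params)
    return [kp.pt for kp in detector.detect(marker_mask)]  # marker center coordinates
```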
Step six: identification point matching
Before the pose measurement, the one-to-one correspondence between the extracted image coordinate points and the actual circular identification points must be determined. The extraction method of step five gives the image coordinates of the blue identification point and of the green identification points, but identification points with different numbers cannot yet be distinguished, so a feature point matching algorithm is adopted to solve the correspondence problem;
The feature points are numbered as follows: the blue circular identification point is taken as the first point; the green point closest to the first identification point in Euclidean distance on the imaging plane is the second identification point; excluding the first identification point, the point closest to the second identification point is the third identification point, and so on until all identification points are numbered;
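A minimal sketch of this greedy nearest-neighbour numbering rule, assuming the blue and green marker centers have already been extracted, is:

```python
import numpy as np

def order_markers(blue_pt, green_pts):
    """Number the markers: blue first, then repeatedly the nearest
    not-yet-numbered green marker to the previously numbered point."""
    ordered = [np.asarray(blue_pt, dtype=float)]
    remaining = [np.asarray(p, dtype=float) for p in green_pts]
    while remaining:
        dists = [np.linalg.norm(p - ordered[-1]) for p in remaining]
        ordered.append(remaining.pop(int(np.argmin(dists))))
    return ordered   # ordered[k] is identification point number k + 1
```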
step seven: camera parameter calibration
A black and white checkerboard calibration board is made, with the side length of each square a known value. The checkerboard is photographed by the vision sensor at different angles and different depths, which reduces the calibration error and gives more accurate camera intrinsic parameters. In the calibration experiment the camera acquires images of the checkerboard at different angles; after the corner points of the checkerboard in each calibration image are extracted, the camera model is computed. Since the lens distortion is small, only the radial distortion of the camera is considered; the camera is calibrated with a MATLAB 2015a toolbox to obtain the camera intrinsic parameters and distortion coefficients;
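The calibration itself is performed with a MATLAB 2015a toolbox; the OpenCV sketch below is only an equivalent illustration of the same procedure, with the board dimensions and square size chosen as assumptions.

```python
import cv2
import numpy as np

def calibrate(images, pattern=(9, 6), square=0.025):
    """Camera intrinsics and distortion coefficients from checkerboard views."""
    # 3-D checkerboard corner coordinates in the board frame (Z = 0 plane)
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2) * square
    obj_pts, img_pts = [], []
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            obj_pts.append(objp)
            img_pts.append(corners)
    # Intrinsic matrix K and distortion coefficients from all detected views
    _, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts,
                                           gray.shape[::-1], None, None)
    return K, dist
```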
Step eight: pose measurement of refueling drogue
For the soft aerial refueling problem, a camera is mounted at a fixed position on the receiver aircraft. To obtain the relative position of the refueling drogue with respect to the fuel receiving port, this position is computed from the position information of the identification points and the camera imaging model. A robust pose measurement algorithm is adopted to measure the pose of the refueling drogue; the solution of the pose problem is obtained by constructing a seventh-order polynomial as the cost function.
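The robust algorithm with a seventh-order polynomial cost is not reproduced here; as a stand-in sketch, the code below recovers the drogue pose from the numbered marker centers with OpenCV's standard PnP solver, assuming the 3-D positions of the identification points on the drogue ring are known from the design.

```python
import cv2
import numpy as np

def drogue_pose(object_pts, image_pts, K, dist):
    """Pose of the drogue in the camera frame from matched marker points."""
    # object_pts: known 3-D coordinates of the identification points on the ring
    # image_pts:  matched image coordinates from steps five and six
    ok, rvec, tvec = cv2.solvePnP(np.asarray(object_pts, np.float32),
                                  np.asarray(image_pts, np.float32),
                                  K, dist, flags=cv2.SOLVEPNP_ITERATIVE)
    return rvec, tvec   # rotation and translation of the drogue relative to the camera
```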
CN201710506141.4A 2017-06-28 2017-06-28 Eagle eye-imitated moving target positioning method for soft autonomous aerial refueling Active CN107392963B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710506141.4A CN107392963B (en) 2017-06-28 2017-06-28 Eagle eye-imitated moving target positioning method for soft autonomous aerial refueling

Publications (2)

Publication Number Publication Date
CN107392963A CN107392963A (en) 2017-11-24
CN107392963B true CN107392963B (en) 2019-12-06

Family

ID=60333918

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710506141.4A Active CN107392963B (en) 2017-06-28 2017-06-28 Eagle eye-imitated moving target positioning method for soft autonomous aerial refueling

Country Status (1)

Country Link
CN (1) CN107392963B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107640292A (en) * 2017-08-07 2018-01-30 吴金伟 The autonomous oiling method of unmanned boat and system
CN108960026B (en) * 2018-03-10 2019-04-12 王梦梦 Unmanned plane during flying Orientation system
CN108536132A (en) * 2018-03-20 2018-09-14 南京航空航天大学 A kind of fixed-wing unmanned plane air refuelling platform and its oiling method
CN110599507B (en) * 2018-06-13 2022-04-22 中国农业大学 Tomato identification and positioning method and system
CN109085845B (en) * 2018-07-31 2020-08-11 北京航空航天大学 Autonomous air refueling and docking bionic visual navigation control system and method
CN108983815A (en) * 2018-08-03 2018-12-11 北京航空航天大学 A kind of anti-interference autonomous docking control method based on the control of terminal iterative learning
CN109446892B (en) * 2018-09-14 2023-03-24 杭州宇泛智能科技有限公司 Human eye attention positioning method and system based on deep neural network
CN109360240B (en) * 2018-09-18 2022-04-22 华南理工大学 Small unmanned aerial vehicle positioning method based on binocular vision
CN109557944B (en) * 2018-11-30 2021-08-20 南通大学 Moving target position detection method
CN110969603B (en) * 2019-11-26 2021-01-05 联博智能科技有限公司 Relative positioning method and device for lesion position and terminal equipment
CN112101099B (en) * 2020-08-04 2022-09-06 北京航空航天大学 Eagle eye self-adaptive mechanism-simulated unmanned aerial vehicle sea surface small target identification method
CN112232181B (en) * 2020-10-14 2022-08-16 北京航空航天大学 Eagle eye color cognitive antagonism mechanism-simulated unmanned aerial vehicle marine target detection method
CN114022551A (en) * 2021-10-28 2022-02-08 北京理工大学 Method for accurately identifying and estimating pose of fuel filling cover of fuel vehicle
CN114953700A (en) * 2021-12-06 2022-08-30 黄河水利职业技术学院 Method for manufacturing ultrahigh-precision cooperative target for industrial photogrammetry
CN114758119B (en) * 2022-04-20 2024-06-07 北京航空航天大学 Sea surface recovery target detection method based on eagle eye imitating vision and similar physical properties
CN115393352A (en) * 2022-10-27 2022-11-25 浙江托普云农科技股份有限公司 Crop included angle measuring method based on image recognition and application thereof

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105825505A (en) * 2016-03-14 2016-08-03 北京航空航天大学 Vision measurement method facing boom air refueling
CN106875403A (en) * 2017-01-12 2017-06-20 北京航空航天大学 A kind of imitative hawkeye visual movement object detection method for air refuelling

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A binocular vision-based UAVs autonomous aerial refueling platform; Haibin Duan et al.; SCIENCE CHINA Information Sciences; 2016-05-31; vol. 59; pp. 053201:1-053201:7 *
Research progress in eagle-eye-inspired vision technology (仿鹰眼视觉技术研究进展); Zhao Guozhi et al.; Science China (中国科学); 2017-05-31; vol. 47, no. 5; pp. 514-523 *
UAV target detection based on a bionic visual attention mechanism (基于仿生视觉注意机制的无人机目标检测); Wang Xiaohua et al.; Aeronautical Science & Technology (航空科学技术); 2015-11-15; vol. 26, no. 11; pp. 78-82 *
Autonomous aerial refueling for UAVs based on eagle-eye-inspired vision (基于仿鹰眼视觉的无人机自主空中加油); Duan Haibin et al.; Chinese Journal of Scientific Instrument (仪器仪表学报); 2014-07-31; vol. 35, no. 7; pp. 1451-1458 *

Also Published As

Publication number Publication date
CN107392963A (en) 2017-11-24

Similar Documents

Publication Publication Date Title
CN107392963B (en) Eagle eye-imitated moving target positioning method for soft autonomous aerial refueling
CN110472623B (en) Image detection method, device and system
CN112818988B (en) Automatic identification reading method and system for pointer instrument
CN104778721B (en) The distance measurement method of conspicuousness target in a kind of binocular image
CN109949361A (en) A kind of rotor wing unmanned aerial vehicle Attitude estimation method based on monocular vision positioning
CN107330376A (en) A kind of Lane detection method and system
CN107392929B (en) Intelligent target detection and size measurement method based on human eye vision model
CN112308916B (en) Target pose recognition method based on image target
CN110443201B (en) Target identification method based on multi-source image joint shape analysis and multi-attribute fusion
CN107063261B (en) Multi-feature information landmark detection method for precise landing of unmanned aerial vehicle
CN108509875B (en) Unmanned plane target identification positioning system
CN104102909B (en) Vehicle characteristics positioning and matching process based on lenticular information
CN107424156B (en) Unmanned aerial vehicle autonomous formation accurate measurement method based on visual attention of barn owl eyes
CN108985274A (en) Water surface method for recognizing impurities
CN103528568A (en) Wireless channel based target pose image measuring method
CN110807424B (en) Port ship comparison method based on aerial image
CN104951765A (en) Remote sensing image target division method based on shape priori information and vision contrast ratio
CN112529827A (en) Training method and device for remote sensing image fusion model
CN102855485A (en) Automatic wheat earing detection method
CN104504675A (en) Active vision positioning method
CN107564006A (en) A kind of circular target detection method using Hough transform
CN106127147B (en) A kind of face depth texture restorative procedure based on three-dimensional data
Christen et al. Target marker: A visual marker for long distances and detection in realtime on mobile devices
CN103617428A (en) Target detection method for aerial refueling drogue based on mixed characteristics
CN112233186A (en) Equipment air tightness detection camera self-calibration method based on image perception

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant