CN117132597B - Image recognition target positioning method and device and electronic equipment - Google Patents

Image recognition target positioning method and device and electronic equipment

Info

Publication number
CN117132597B
CN117132597B CN202311394887.2A
Authority
CN
China
Prior art keywords
information
image
airport
target object
visual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311394887.2A
Other languages
Chinese (zh)
Other versions
CN117132597A (en)
Inventor
陈方平
崔强强
陆煜衡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Yunsheng Intelligent Technology Co ltd
Original Assignee
Tianjin Yunsheng Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin Yunsheng Intelligent Technology Co ltd filed Critical Tianjin Yunsheng Intelligent Technology Co ltd
Priority to CN202311394887.2A priority Critical patent/CN117132597B/en
Publication of CN117132597A publication Critical patent/CN117132597A/en
Application granted granted Critical
Publication of CN117132597B publication Critical patent/CN117132597B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

Some embodiments of the present application provide a method, an apparatus, and an electronic device for target positioning based on image recognition. The method includes: determining target object position information, attitude information and an airport position through visual positioning; adjusting the attitude information based on the target object position information and the airport position so that the airport position always stays at the center of the airport image; and identifying an auxiliary positioning identifier in the airport image and generating landing guide information, where the landing guide information is used to guide the target object to position itself and land. Some embodiments of the present application can achieve accurate positioning of a target object.

Description

Image recognition target positioning method and device and electronic equipment
Technical Field
The application relates to the technical field of image recognition and positioning, and in particular to a method and an apparatus for target positioning based on image recognition, and an electronic device.
Background
During the automatic movement of a target object, a GPS positioning signal is generally used for positioning and thereby for controlling the target object. In practical applications, however, the GPS signal is easily blocked by objects or received unstably, so the target object cannot obtain its own positioning information and therefore cannot land automatically at an automatic airport.
Therefore, how to provide an accurate method for target positioning based on image recognition is a technical problem to be solved.
Disclosure of Invention
The application provides a method, an apparatus and an electronic device for target positioning based on image recognition. The technical solution of the embodiments of the application can achieve accurate positioning and landing of the target object when no GPS positioning signal is available.
In a first aspect, some embodiments of the present application provide a method for target positioning based on image recognition, including: determining target object position information, attitude information and an airport position through visual positioning; adjusting the attitude information based on the target object position information and the airport position so that the airport position stays at the center of the airport image throughout the descent of the target object; and identifying an auxiliary positioning identifier in the airport image and generating landing guide information, where the landing guide information is used to guide the target object to land.
In some embodiments of the present application, the position information, attitude information and airport position of the target object are determined by visual positioning, the attitude information of the target object is adjusted using the position information, and landing guide information is generated from the auxiliary positioning identifier, so that the target object can land accurately even when no GPS positioning signal is available.
In some embodiments, determining the target object position information and the attitude information through visual positioning includes: collecting multi-frame images of the surrounding environment; determining a plurality of key frame images by tracking and matching the visual feature point information extracted from each frame of the multi-frame images; integrating and optimizing flight control data between adjacent key frame images among the plurality of key frame images to obtain at least one visual estimation value, where the visual estimation value includes an initial value of the target object position and initial attitude information; and accumulating the at least one visual estimation value to obtain the target object position information and the attitude information.
In some embodiments of the present application, after the collected multi-frame images of the surrounding environment are processed by feature extraction, tracking and matching, a visual estimation value is obtained by combining flight control data, and the position information and attitude information of the target object are then obtained by accumulation, so that the relevant position information of the target object can be acquired accurately.
In some embodiments, determining a plurality of key frame images by tracking and matching the extracted visual feature point information of each frame of the multi-frame images includes: calculating the visual feature point information of each frame of image with a feature point extraction algorithm; tracking and matching the visual feature point information to obtain matching frames; removing abnormal frames from the matching frames to obtain an image queue; and, upon confirming that the visual feature point information of the i-th frame image in the image queue is larger than a first preset threshold and that the average feature parallax between the i-th frame image and the (i+1)-th frame image is larger than a second preset threshold, taking the i-th frame image as a key frame image, where the i-th frame image is any frame of the multi-frame images.
In some embodiments of the present application, the visual feature point information of each frame of image is computed with the feature point extraction algorithm, key frame images can then be selected based on this information, and effective data support is provided for subsequent target object positioning.
In some embodiments, integrating and optimizing flight control data between adjacent key frame images among the plurality of key frame images to obtain at least one visual estimation value includes: integrating the flight control data between the adjacent key frame images to obtain a visual estimation initial value; and optimizing the visual estimation initial value with an objective function to obtain the visual estimation value, where the objective function relates to the pre-integration increment of the adjacent key frame images and to the Jacobian matrix and covariance term of the pre-integration error.
In some embodiments of the present application, the visual estimation initial value is obtained by integrating the flight control data and is then optimized with the objective function to obtain the visual estimation value, providing effective data support for subsequent target object positioning.
In some embodiments, determining the airport position includes: collecting an airport image from the top-view perspective of the target object; and inputting the airport image into a target detection model to obtain the airport position.
Some embodiments of the application obtain the airport position efficiently by inputting the airport image into the target detection model.
In some embodiments, adjusting the attitude information based on the target object position information and the airport position includes: calculating the center point position deviation between the target object position information and the airport position; and generating a control signal from the center point position deviation and adjusting the attitude information with the control signal, where the attitude information includes the flight speed and the flight angle of the target object.
In some embodiments of the present application, a control signal for adjusting the attitude of the target object is generated from the center point deviation between the target object position and the airport position, so that the target object can be adjusted effectively, which facilitates accurate landing.
In a second aspect, some embodiments of the present application provide an apparatus for target positioning based on image recognition, comprising: a position determining module, configured to determine the target object position information, attitude information and airport position through visual positioning; an attitude adjusting module, configured to adjust the attitude information based on the target object position information and the airport position so that the airport position stays at the center of the airport image; and a landing module, configured to identify the auxiliary positioning identifier in the airport image and generate landing guide information, where the landing guide information is used to guide the target object to land.
In some embodiments, the position determining module is configured to: collect multi-frame images of the surrounding environment; determine a plurality of key frame images by tracking and matching the visual feature point information extracted from each frame of the multi-frame images; integrate and optimize flight control data between adjacent key frame images among the plurality of key frame images to obtain at least one visual estimation value, where the visual estimation value includes an initial value of the target object position and initial attitude information; and accumulate the at least one visual estimation value to obtain the target object position information and the attitude information.
In a third aspect, some embodiments of the present application provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs a method according to any of the embodiments of the first aspect.
In a fourth aspect, some embodiments of the present application provide an electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, can implement a method according to any of the embodiments of the first aspect.
Drawings
To illustrate the technical solutions of some embodiments of the present application more clearly, the drawings required by some embodiments of the present application are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and should not be regarded as limiting the scope; a person of ordinary skill in the art may obtain other related drawings from these drawings without inventive effort.
FIG. 1 is a system diagram of image recognition-based targeting provided in some embodiments of the present application;
FIG. 2 is one of the flow charts of the method for image-based identification object localization provided in some embodiments of the present application;
FIG. 3 is a second flowchart of a method for image-based identification object localization according to some embodiments of the present application;
FIG. 4 is a block diagram of an apparatus for image recognition-based targeting provided in some embodiments of the present application;
fig. 5 is a schematic diagram of an electronic device according to some embodiments of the present application.
Detailed Description
The technical solutions in some embodiments of the present application will be described below with reference to the drawings in some embodiments of the present application.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more clear, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it should be understood that the accompanying drawings in the present application are only for the purpose of illustration and description, and are not intended to limit the protection scope of the present application. In addition, it should be understood that the schematic drawings are not drawn to scale. A flowchart, as used in this application, illustrates operations implemented according to some embodiments of the present application. It should be understood that the operations of the flow diagrams may be implemented out of order and that steps without logical context may be performed in reverse order or concurrently. Moreover, one or more other operations may be added to the flow diagrams and one or more operations may be removed from the flow diagrams as directed by those skilled in the art.
In addition, the described embodiments are only some, but not all, of the embodiments of the present application. The components of the embodiments of the present application, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, as provided in the accompanying drawings, is not intended to limit the scope of the application, as claimed, but is merely representative of selected embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present application without making any inventive effort, are intended to be within the scope of the present application.
It should be noted that the term "comprising" will be used in the embodiments of the present application to indicate the presence of the features stated hereinafter, but not to exclude the addition of other features.
In the related art, when an automatic inspection target object (for example, an inspection unmanned aerial vehicle or another type of unmanned aerial vehicle) lands at an automatic airport, a GPS positioning signal is generally used for positioning to achieve the landing of the unmanned aerial vehicle. If the GPS positioning signal is disturbed, the landing fails. That is, because the GPS positioning signal is easily blocked by objects or received unstably while the unmanned aerial vehicle is in automatic flight, the unmanned aerial vehicle cannot obtain its own positioning information and therefore cannot land automatically at the automatic airport.
In other words, in the prior art, when no GPS positioning signal is available, the unmanned aerial vehicle can hardly obtain its own position information and therefore cannot land automatically and accurately at the airport.
In view of this, some embodiments of the present application provide a method for target positioning based on image recognition, which determines the target object position information and attitude information by visual positioning, and then adjusts the attitude information of the target object according to the determined airport position. Finally, landing guide information for guiding the target object to descend is generated by identifying the auxiliary positioning identifier in the automatic airport. Some implementations of the application help the target object obtain stable self-positioning information by visual positioning and, at the same time, help it position and land accurately at the automatic airport by detecting and identifying the position of the automatic airport in the airport image of the airborne pan-tilt camera in real time.
To facilitate understanding of some embodiments provided in the present application, the process of positioning an unmanned aerial vehicle for accurate landing based on image recognition technology is described below by way of example, taking an unmanned aerial vehicle as an example of the target object.
The overall composition of the image recognition target localization based system provided in some embodiments of the present application is described below by way of example with reference to fig. 1.
As shown in fig. 1, some embodiments of the present application provide a system for target positioning based on image recognition. The system comprises the unmanned aerial vehicle 100, and a four-eye fisheye camera 110 and a pan-tilt camera 120 disposed on the unmanned aerial vehicle 100. The four-eye fisheye camera 110 can acquire multi-frame panoramic image information of the 360° surrounding environment. The processor of the unmanned aerial vehicle 100 can then determine the unmanned aerial vehicle position information and attitude information from the multi-frame panoramic image information and the flight control data by means of a visual positioning algorithm. The pan-tilt camera 120 enables the unmanned aerial vehicle 100 to obtain the airport position and the auxiliary positioning identifier by acquiring airport images from a top view. Finally, the unmanned aerial vehicle 100 guides its landing with the unmanned aerial vehicle position information, the attitude information, the airport position and the auxiliary positioning identifier, so as to land accurately.
In some embodiments of the present application, the type of camera deployed on the drone 100 may be set according to the actual situation, and embodiments of the present application are not limited thereto.
The implementation of drone landing performed by drone 100 provided in some embodiments of the present application is described below by way of example in connection with fig. 2.
Referring to fig. 2, fig. 2 is a flowchart of a method for target positioning based on image recognition according to some embodiments of the present application. The method includes:
S210, determining target object position information, attitude information and the airport position through visual positioning.
For example, in some embodiments of the present application, visual positioning helps the unmanned aerial vehicle obtain stable self-positioning information (as a specific example of target object position information, abbreviated below as position) and attitude information (abbreviated below as attitude). The automatic airport position is obtained by means of image detection.
In some embodiments of the present application, S210 may include: collecting multi-frame images of the surrounding environment; determining a plurality of key frame images by tracking and matching the visual feature point information extracted from each frame of the multi-frame images; integrating and optimizing flight control data between adjacent key frame images among the plurality of key frame images to obtain at least one visual estimation value, where the visual estimation value includes an initial value of the target object position and initial attitude information; and accumulating the at least one visual estimation value to obtain the target object position information and the attitude information.
For example, in some embodiments of the present application, the four-eye fisheye camera 110 carried by the unmanned aerial vehicle 100 acquires 360° panoramic image information of the surrounding environment (as a specific example of multi-frame images). Visual feature point information in each frame of image is computed with an image feature point extraction algorithm, and the matching relationships of the visual feature points between the preceding and following frames of each acquired image are combined with the flight control IMU data to finally solve the position and attitude of the unmanned aerial vehicle 100.
In some embodiments of the present application, S210 may include: calculating the visual feature point information of each frame of image with a feature point extraction algorithm; tracking and matching the visual feature point information to obtain matching frames; removing abnormal frames from the matching frames to obtain an image queue; and, upon confirming that the visual feature point information of the i-th frame image in the image queue is larger than a first preset threshold and that the average feature parallax between the i-th frame image and the (i+1)-th frame image is larger than a second preset threshold, taking the i-th frame image as a key frame image, where the i-th frame image is any frame of the multi-frame images.
For example, in some embodiments of the present application, the Harris corner points of each frame of image are extracted with an image feature point extraction algorithm; the visual feature points of adjacent frames are then tracked with pyramid optical flow to obtain matching frames. Abnormal points (for example, wrongly matched feature points) in the matching frames are removed with RANSAC, and the images corresponding to the tracked visual feature points are placed into an image queue. Finally, a frame is taken as a key frame when the number of visual feature points of that frame in the outlier-free image queue is larger than a first preset threshold and the average feature parallax between that frame (as a specific example of an i-th frame) and its nearest neighboring frame (as a specific example of an (i+1)-th frame) is larger than a second preset threshold. The first preset threshold and the second preset threshold may be set flexibly according to the actual situation and are not specifically limited here.
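By way of a non-limiting illustration, the following sketch shows how this tracking-and-selection step could be implemented with OpenCV; the thresholds MIN_FEATURES and MIN_PARALLAX stand in for the first and second preset thresholds and, like the detector parameters, are hypothetical values rather than values taken from the patent.

```python
import cv2
import numpy as np

MIN_FEATURES = 50      # hypothetical "first preset threshold"
MIN_PARALLAX = 10.0    # hypothetical "second preset threshold", in pixels

def detect_corners(gray):
    """Harris-style corner extraction for a frame of the panoramic image."""
    pts = cv2.goodFeaturesToTrack(gray, maxCorners=200, qualityLevel=0.01,
                                  minDistance=20, useHarrisDetector=True)
    return pts.astype(np.float32) if pts is not None else np.empty((0, 1, 2), np.float32)

def track_and_select_keyframe(prev_gray, cur_gray, prev_pts):
    """Track corners with pyramid optical flow, drop RANSAC outliers, and
    decide whether the current frame qualifies as a key frame."""
    cur_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, cur_gray, prev_pts, None)
    ok = status.flatten() == 1
    good_prev = prev_pts[ok].reshape(-1, 2)
    good_cur = cur_pts[ok].reshape(-1, 2)

    # RANSAC on the fundamental matrix removes wrongly matched feature points
    if len(good_prev) >= 8:
        _, mask = cv2.findFundamentalMat(good_prev, good_cur, cv2.FM_RANSAC, 1.0, 0.99)
        if mask is not None:
            keep = mask.flatten().astype(bool)
            good_prev, good_cur = good_prev[keep], good_cur[keep]

    # Average feature parallax between the two adjacent frames, in pixels
    parallax = float(np.mean(np.linalg.norm(good_cur - good_prev, axis=1))) if len(good_cur) else 0.0
    is_keyframe = len(good_cur) > MIN_FEATURES and parallax > MIN_PARALLAX
    return good_cur.reshape(-1, 1, 2), is_keyframe
```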
In some embodiments of the present application, S210 may include: integrating the flight control data between the adjacent key frame images to obtain a visual estimation initial value; and optimizing the visual estimation initial value with an objective function to obtain the visual estimation value, where the objective function relates to the pre-integration increment of the adjacent key frame images and to the Jacobian matrix and covariance term of the pre-integration error.
For example, in some embodiments of the present application, the PVQ (initial position, initial velocity, initial rotation angle) of the (k+1)-th frame may be obtained as the visual estimation initial value by integrating all IMU data (that is, flight control data) between the k-th frame and the (k+1)-th frame (as a specific example of adjacent key frames). At the same time, the pre-integration increments of the adjacent key frames, together with the Jacobian matrix and covariance term of the pre-integration error, are computed for use in the optimization, and the objective function is constructed from different optimization factors. The visual estimation initial value is then optimized with the objective function to obtain the visual estimation value.
Specifically, the formula of the objective function is as follows:
min_X ( w1·‖r_p − H_p·X‖² + w2·Σ_k ‖r_B(k, k+1)‖² + w3·Σ_(l,j)∈C ‖r_C(l, j)‖² )

wherein w1, w2 and w3 are the different optimization factors, X is the state vector, ‖r_p − H_p·X‖² is the marginalized prior information, r_B(k, k+1) is the IMU measurement residual between adjacent key frames k and k+1, r_C(l, j) is the visual cost error of the l-th group feature observed in the j-th group, and C is the set of visual feature point observations.
By optimizing the error items, more accurate attitude information and position information of the unmanned aerial vehicle can be solved.
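For readers unfamiliar with pre-integration, the following simplified sketch shows how the IMU samples between the k-th and (k+1)-th key frames could be integrated to propagate position, velocity and rotation (PVQ) as the visual estimation initial value. It is an assumption-laden illustration: biases, noise terms, the pre-integration Jacobian and the covariance propagation used in the objective function above are deliberately omitted.

```python
import numpy as np

GRAVITY = np.array([0.0, 0.0, -9.81])  # world-frame gravity, m/s^2

def integrate_imu(p, v, q, imu_samples):
    """Propagate position p, velocity v and rotation q (a 3x3 matrix here,
    for brevity) across all IMU samples between frame k and frame k+1.
    Each sample is (dt, accel, gyro), with accel and gyro in the body frame."""
    for dt, accel, gyro in imu_samples:
        # World-frame acceleration from the current attitude estimate
        acc_world = q @ accel + GRAVITY
        p = p + v * dt + 0.5 * acc_world * dt * dt
        v = v + acc_world * dt
        # First-order rotation update from the gyro rate (skew-symmetric matrix)
        omega = np.array([[0.0, -gyro[2], gyro[1]],
                          [gyro[2], 0.0, -gyro[0]],
                          [-gyro[1], gyro[0], 0.0]])
        q = q @ (np.eye(3) + omega * dt)
    return p, v, q  # PVQ initial value for the (k+1)-th key frame
```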
In some embodiments of the present application, S210 may include: collecting an airport image from the top-view perspective of the target object; and inputting the airport image into a target detection model to obtain the airport position.
For example, in some embodiments of the present application, in an early preparation stage, photographs of automatic airports taken from a top view are collected with the pan-tilt camera 120 carried by the unmanned aerial vehicle, and the positions of the automatic airports in these photographs are marked manually; a lightweight target detection neural network is then trained with the labeled data set. The trained neural network (as a specific example of the target detection model) can automatically identify the position of the automatic airport (that is, the airport position) in a photograph captured by the pan-tilt camera 120 (as a specific example of the airport image).
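The patent does not name a particular detector. As one possible realization only, the sketch below trains and queries a lightweight off-the-shelf detector through the ultralytics YOLO API; the weight file, the dataset description file "airport.yaml" and all training parameters are hypothetical.

```python
from ultralytics import YOLO  # one possible lightweight detector, not named by the patent

# Training on the manually labelled top-view airport photographs
# ("airport.yaml" pointing at the labelled data set is a hypothetical path)
model = YOLO("yolov8n.pt")
model.train(data="airport.yaml", epochs=100, imgsz=640)

def detect_airport(frame):
    """Return the pixel-space bounding box (x1, y1, x2, y2) of the automatic
    airport in a pan-tilt camera frame, or None if it is not visible."""
    results = model(frame, verbose=False)
    boxes = results[0].boxes
    if len(boxes) == 0:
        return None
    best = int(boxes.conf.argmax())          # keep the highest-confidence detection
    return boxes.xyxy[best].tolist()
```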
S220, adjusting the attitude information based on the target object position information and the airport position, so that the airport position stays at the center of the airport image throughout the descent of the target object.
For example, in some embodiments of the present application, during the automatic landing of the unmanned aerial vehicle 100, the horizontal speed of the unmanned aerial vehicle 100 (as a specific example of attitude information) is controlled so that the automatic airport in the pan-tilt camera picture (as a specific example of the airport image) is always located at the center of the picture, which ensures that the unmanned aerial vehicle 100 lands fully aligned with the automatic airport.
In some embodiments of the present application, S220 may include: calculating the center point position deviation between the target object position information and the airport position; and generating a control signal from the center point position deviation and adjusting the attitude information with the control signal, where the attitude information includes the flight speed and the flight angle of the target object.
For example, in some embodiments of the present application, the movement of the unmanned aerial vehicle in the horizontal plane is controlled based on the obtained current position information and attitude information of the unmanned aerial vehicle 100. By continuously recognizing the airport position in the airport image, the position deviation between the position of the unmanned aerial vehicle 100 and the predetermined center point of the airport in the airport image can be identified. This position deviation is output as a control signal to the electronic controller, which flies the unmanned aerial vehicle 100 toward the corresponding center point; during this process the flight speed and flight angle are continuously re-identified and adjusted, and the unmanned aerial vehicle 100 is moved according to the position deviation. When the airport position in the airport image is judged to be at the center, the unmanned aerial vehicle 100 is fully aligned with the automatic airport and is directly above it, ready to land. For example, the recognition frequency can be set according to the position deviation: the larger the deviation, the lower the recognition frequency; the smaller the deviation, the higher the recognition frequency, so that the unmanned aerial vehicle can be quickly adjusted to the position directly above the preset airport according to the recognition results.
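A minimal sketch of this control step is given below: the pixel deviation between the detected airport center and the image center is mapped to a horizontal velocity command, and the recognition frequency is lowered when the deviation is large and raised when it is small. The gain, speed limit and frequency values are assumptions for illustration, not values from the patent.

```python
import numpy as np

KP = 0.002            # hypothetical proportional gain, m/s per pixel of deviation
MAX_SPEED = 1.0       # hypothetical horizontal speed limit, m/s

def horizontal_command(airport_box, image_shape):
    """Map the deviation between the detected airport center and the image
    center to a horizontal velocity command and a recognition frequency."""
    x1, y1, x2, y2 = airport_box
    target = np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0])
    center = np.array([image_shape[1] / 2.0, image_shape[0] / 2.0])
    deviation = target - center                      # center point position deviation, pixels

    velocity = np.clip(KP * deviation, -MAX_SPEED, MAX_SPEED)  # (vx, vy) command

    # Adaptive recognition frequency: larger deviation, lower frequency;
    # smaller deviation, higher frequency (values hypothetical)
    rate_hz = 10.0 if np.linalg.norm(deviation) < 50 else 2.0
    return velocity, rate_hz
```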
S230, identifying the auxiliary positioning identifier in the airport image and generating landing guide information, where the landing guide information is used to guide the target object to land.
For example, in some embodiments of the present application, the unmanned aerial vehicle 100 can descend slowly from high altitude toward the airport under the continuous adjustment described above. As the height of the pan-tilt camera 120 gradually decreases during the landing, the airport occupies more and more of the camera's field of view until it disappears from the picture. At that point the unmanned aerial vehicle 100 is still some height away from fully landing into the airport, so a small two-dimensional code (as a specific example of the auxiliary positioning identifier) is attached inside the airport to help the unmanned aerial vehicle stay aligned in the final stage. That is, the automatic landing of the unmanned aerial vehicle 100 can be divided into two stages: in the first stage, the position of the automatic airport is identified so that the unmanned aerial vehicle always descends toward the center of the automatic airport; in the second stage, when the automatic airport can no longer be identified, landing guide information is obtained by automatically identifying the label (for example, a two-dimensional code) inside the airport, so that the unmanned aerial vehicle 100 lands more accurately. It should be noted that the auxiliary positioning identifier may also take forms other than a two-dimensional code (such as a barcode or a custom image), and embodiments of the application are not limited in this respect.
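Purely as an illustration of the second stage, the sketch below detects an ArUco tag with OpenCV's cv2.aruco module (OpenCV 4.7 or later) and returns its offset from the image center as landing guidance; the patent only requires a two-dimensional code or similar label, so the choice of tag family and dictionary here is an assumption.

```python
import cv2
import numpy as np

# An ArUco tag is used here only as an example of the auxiliary positioning identifier
DETECTOR = cv2.aruco.ArucoDetector(cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50))

def landing_guidance(frame_gray):
    """Return the pixel offset of the auxiliary marker from the image center,
    which can drive the final-stage landing guidance, or None if not found."""
    corners, ids, _ = DETECTOR.detectMarkers(frame_gray)
    if ids is None or len(corners) == 0:
        return None
    marker_center = corners[0].reshape(-1, 2).mean(axis=0)
    image_center = np.array([frame_gray.shape[1] / 2.0, frame_gray.shape[0] / 2.0])
    return marker_center - image_center
```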
The specific process of image-based recognition object localization provided in some embodiments of the present application is described below by way of example in conjunction with fig. 3.
Referring to fig. 3, fig. 3 is a flowchart of a method for positioning an object based on image recognition according to some embodiments of the present application.
The above-described process is exemplarily set forth below.
S310, acquiring multi-frame images of the surrounding environment with the four-eye fisheye camera of the unmanned aerial vehicle;
S320, determining the position information and attitude information of the unmanned aerial vehicle with an image feature point extraction algorithm;
S330, inputting the airport image acquired by the pan-tilt camera of the unmanned aerial vehicle into the target detection model to obtain the airport position;
S340, adjusting the attitude information of the unmanned aerial vehicle according to the position information of the unmanned aerial vehicle and the airport position, so that the airport stays at the center of the airport image while the unmanned aerial vehicle descends;
S350, when the unmanned aerial vehicle reaches the second stage, identifying the auxiliary positioning identifier in the airport image and generating landing guide information.
It will be appreciated that the condition for the unmanned aerial vehicle to enter the second stage may be the moment when the airport position can no longer be detected in the airport image, or the moment when the distance between the unmanned aerial vehicle and the airport is not greater than a threshold value. This may be determined according to the actual situation, and embodiments of the present application are not limited in this respect.
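A trivial sketch of this stage-switching rule, with a hypothetical height threshold, might look as follows.

```python
HEIGHT_THRESHOLD = 2.0  # hypothetical height (metres) below which stage two begins

def landing_stage(airport_box, height_above_airport):
    """Stage 1: steer toward the detected airport; stage 2: switch to the
    auxiliary positioning identifier once the airport detection is lost or
    the unmanned aerial vehicle is close enough to the airport."""
    if airport_box is None or height_above_airport <= HEIGHT_THRESHOLD:
        return 2
    return 1
```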
It should be noted that, the specific implementation process of S310 to S350 may refer to the method embodiments provided above, and detailed descriptions are omitted here appropriately to avoid repetition.
According to the embodiments of the application, automatic and accurate landing at an automatic airport can be achieved without GPS; moreover, the open and closed states of the airport can be identified automatically, a forced landing is prevented and alarm information is issued when the airport is closed, which greatly reduces crash accidents during the landing of the unmanned aerial vehicle.
Referring to fig. 4, fig. 4 illustrates a block diagram of an apparatus for target positioning based on image recognition according to some embodiments of the present application. It should be understood that the apparatus corresponds to the above method embodiments and can perform the steps involved in them; for the specific functions of the apparatus, reference may be made to the description above, and detailed descriptions are omitted here as appropriate to avoid redundancy.
The apparatus for target positioning based on image recognition in fig. 4 includes at least one software functional module that can be stored in a memory in the form of software or firmware, or solidified in the apparatus. The apparatus includes: a position determining module 410, configured to determine the target object position information, attitude information and airport position through visual positioning; an attitude adjusting module 420, configured to adjust the attitude information based on the target object position information and the airport position so that the airport position stays at the center of the airport image; and a landing module 430, configured to identify the auxiliary positioning identifier in the airport image and generate landing guide information, where the landing guide information is used to guide the target object to land.
In some embodiments of the present application, the position determining module 410 is configured to collect multi-frame images of the surrounding environment; determine a plurality of key frame images by tracking and matching the visual feature point information extracted from each frame of the multi-frame images; integrate and optimize flight control data between adjacent key frame images among the plurality of key frame images to obtain at least one visual estimation value, where the visual estimation value includes an initial value of the target object position and initial attitude information; and accumulate the at least one visual estimation value to obtain the target object position information and the attitude information.
In some embodiments of the present application, the position determining module 410 is configured to calculate the visual feature point information of each frame of image with a feature point extraction algorithm; track and match the visual feature point information to obtain matching frames; remove abnormal frames from the matching frames to obtain an image queue; and, upon confirming that the visual feature point information of the i-th frame image in the image queue is larger than a first preset threshold and that the average feature parallax between the i-th frame image and the (i+1)-th frame image is larger than a second preset threshold, take the i-th frame image as a key frame image, where the i-th frame image is any frame of the multi-frame images.
In some embodiments of the present application, the position determining module 410 is configured to integrate the flight control data between the adjacent key frame images to obtain a visual estimation initial value; and optimize the visual estimation initial value with an objective function to obtain the visual estimation value, where the objective function relates to the pre-integration increment of the adjacent key frame images and to the Jacobian matrix and covariance term of the pre-integration error.
In some embodiments of the present application, the position determining module 410 is configured to collect an airport image from the top-view perspective of the target object, and input the airport image into a target detection model to obtain the airport position.
In some embodiments of the present application, the attitude adjusting module 420 is configured to calculate the center point position deviation between the target object position information and the airport position; and generate a control signal from the center point position deviation and adjust the attitude information with the control signal, where the attitude information includes the flight speed and the flight angle of the target object.
It will be clear to those skilled in the art that, for convenience and brevity of description, reference may be made to the corresponding procedure in the foregoing method for the specific working procedure of the apparatus described above, and this will not be repeated here.
Some embodiments of the present application also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, can implement the operations of any of the methods provided by the above embodiments.
Some embodiments of the present application further provide a computer program product comprising a computer program which, when executed by a processor, can implement the operations of any of the methods provided by the above embodiments.
For ease of understanding, fig. 5 shows a schematic diagram of exemplary hardware and software components of an electronic device 500 that may implement the concepts of the present application, according to some embodiments of the present application. For example, the processor 520 may be used on the electronic device 500 to perform the functions described herein.
The electronic device 500 may be a general-purpose computer, a special-purpose computer, or an intelligent device such as a vehicle-mounted computer or a robot, any of which may be used to implement the methods shown in the above embodiments of the present application. Although only one computer is shown, for convenience the functionality described herein may be implemented in a distributed fashion across multiple similar platforms to balance the processing load.
For example, the electronic device 500 may include a network port 510 connected to a network, one or more processors 520 for executing program instructions, a communication bus 530, and storage media 540 in different forms, such as magnetic disk, ROM or RAM, or any combination thereof. By way of example, the computer platform may also include program instructions stored in ROM, RAM, or other types of non-transitory storage media, or any combination thereof; the methods of the present application may be implemented in accordance with these program instructions. The electronic device 500 also includes an Input/Output (I/O) interface 550 between the computer and other input/output devices (for example, a keyboard and a display screen).
For ease of illustration, only one processor is depicted in electronic device 500. It should be noted, however, that the electronic device 500 in the present application may also include multiple processors, and thus steps performed by one processor described in the present application may also be performed jointly by multiple processors or separately. For example, if the processor of the electronic device 500 performs steps a and B, it should be understood that steps a and B may also be performed by two different processors together or performed separately in one processor. For example, the first processor performs step a, the second processor performs step B, or the first processor and the second processor collectively perform steps a and B, etc.
The foregoing is merely exemplary embodiments of the present application and is not intended to limit the scope of the present application; various modifications and variations may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application should be included in the protection scope of the present application. It should be noted that like reference numerals and letters denote like items in the following figures, so once an item is defined in one figure it need not be further defined or explained in subsequent figures.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.

Claims (7)

1. A method for target positioning based on image recognition, comprising:
determining target object position information, attitude information and an airport position through visual positioning;
adjusting the attitude information based on the target object position information and the airport position, so that the airport position stays at the center of the airport image throughout the descent of the target object;
identifying an auxiliary positioning identifier in the airport image and generating landing guide information, wherein the landing guide information is used to guide the target object to land;
wherein determining the target object position information and the attitude information through visual positioning comprises: determining a plurality of key frame images by tracking and matching visual feature point information extracted from each frame of collected multi-frame images; integrating and optimizing flight control data between adjacent key frame images among the plurality of key frame images to obtain at least one visual estimation value, wherein the visual estimation value comprises an initial value of the target object position and initial attitude information; and accumulating the at least one visual estimation value to obtain the target object position information and the attitude information;
wherein determining the plurality of key frame images by tracking and matching the extracted visual feature point information of each frame of the multi-frame images comprises: tracking and matching the visual feature point information of each frame of image, calculated with a feature point extraction algorithm, to obtain matching frames; removing abnormal frames from the matching frames to obtain an image queue; and, upon confirming that the visual feature point information of the i-th frame image in the image queue is larger than a first preset threshold and that the average feature parallax between the i-th frame image and the (i+1)-th frame image is larger than a second preset threshold, taking the i-th frame image as a key frame image, wherein the i-th frame image is any frame of the multi-frame images;
wherein adjusting the attitude information based on the target object position information and the airport position comprises: calculating the center point position deviation between the target object position information and the airport position; and generating a control signal from the center point position deviation and adjusting the attitude information with the control signal, wherein the attitude information comprises the flight speed and the flight angle of the target object;
wherein a recognition frequency is set according to the center point position deviation: the larger the center point position deviation, the lower the recognition frequency; the smaller the center point position deviation, the higher the recognition frequency.
2. The method for target positioning based on image recognition of claim 1, wherein integrating and optimizing the flight control data between adjacent key frame images among the plurality of key frame images to obtain at least one visual estimation value comprises:
integrating the flight control data between the adjacent key frame images to obtain a visual estimation initial value;
and optimizing the visual estimation initial value with an objective function to obtain the visual estimation value, wherein the objective function relates to the pre-integration increment of the adjacent key frame images and to the Jacobian matrix and covariance term of the pre-integration error.
3. The method for target positioning based on image recognition of claim 1, wherein determining the airport position comprises:
collecting an airport image from the top-view perspective of the target object;
inputting the airport image into a target detection model to obtain the airport position.
4. An apparatus for target positioning based on image recognition, configured to perform the method of claim 1 and comprising:
a position determining module, configured to determine the target object position information, attitude information and airport position through visual positioning;
an attitude adjusting module, configured to adjust the attitude information based on the target object position information and the airport position so that the airport position stays at the center of the airport image;
and a landing module, configured to identify the auxiliary positioning identifier in the airport image and generate landing guide information, wherein the landing guide information is used to guide the target object to land.
5. The apparatus for target positioning based on image recognition of claim 4, wherein the position determining module is configured to:
collect multi-frame images of the surrounding environment;
determine a plurality of key frame images by tracking and matching the visual feature point information extracted from each frame of the multi-frame images;
integrate and optimize flight control data between adjacent key frame images among the plurality of key frame images to obtain at least one visual estimation value, wherein the visual estimation value comprises an initial value of the target object position and initial attitude information;
and accumulate the at least one visual estimation value to obtain the target object position information and the attitude information.
6. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program, wherein the computer program when run by a processor performs the method according to any of claims 1-3.
7. An electronic device comprising a memory, a processor, and a computer program stored on the memory and running on the processor, wherein the computer program when run by the processor performs the method of any one of claims 1-3.
CN202311394887.2A 2023-10-26 2023-10-26 Image recognition target positioning method and device and electronic equipment Active CN117132597B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311394887.2A CN117132597B (en) 2023-10-26 2023-10-26 Image recognition target positioning method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311394887.2A CN117132597B (en) 2023-10-26 2023-10-26 Image recognition target positioning method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN117132597A (en) 2023-11-28
CN117132597B (en) 2024-02-09

Family

ID=88863137

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311394887.2A Active CN117132597B (en) 2023-10-26 2023-10-26 Image recognition target positioning method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN117132597B (en)


Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108549397A (en) * 2018-04-19 2018-09-18 武汉大学 The unmanned plane Autonomous landing method and system assisted based on Quick Response Code and inertial navigation
CN109993113A (en) * 2019-03-29 2019-07-09 东北大学 A kind of position and orientation estimation method based on the fusion of RGB-D and IMU information
CN110989661A (en) * 2019-11-19 2020-04-10 山东大学 Unmanned aerial vehicle accurate landing method and system based on multiple positioning two-dimensional codes
CN111679682A (en) * 2020-08-11 2020-09-18 北京云圣智能科技有限责任公司 Unmanned aerial vehicle landing method and device and electronic equipment
CN113362377A (en) * 2021-06-29 2021-09-07 东南大学 VO weighted optimization method based on monocular camera
CN113899367A (en) * 2021-08-25 2022-01-07 广州优飞智能设备有限公司 Positioning method and device for unmanned aerial vehicle landing, computer equipment and storage medium
CN113867373A (en) * 2021-09-30 2021-12-31 广州极飞科技股份有限公司 Unmanned aerial vehicle landing method and device, parking apron and electronic equipment
CN115291624A (en) * 2022-07-11 2022-11-04 广州中科云图智能科技有限公司 Unmanned aerial vehicle positioning landing method, storage medium and computer equipment
CN116430901A (en) * 2023-05-06 2023-07-14 普宙科技有限公司 Unmanned aerial vehicle return control method and system based on mobile parking apron

Also Published As

Publication number Publication date
CN117132597A (en) 2023-11-28

Similar Documents

Publication Publication Date Title
US11042755B2 (en) Method for foreign object debris detection
CN109324337B (en) Unmanned aerial vehicle route generation and positioning method and device and unmanned aerial vehicle
US8446468B1 (en) Moving object detection using a mobile infrared camera
WO2022100470A1 (en) Systems and methods for target detection
JP6510247B2 (en) Survey data processing apparatus, survey data processing method and program
AU2017223463B2 (en) Automated loading bridge positioning using encoded decals
CN111326023A (en) Unmanned aerial vehicle route early warning method, device, equipment and storage medium
KR102177655B1 (en) System for tracking an object in unmanned aerial vehicle based on mvs
CN109035294B (en) Image extraction system and method for moving target
CN112683228A (en) Monocular camera ranging method and device
WO2024087962A1 (en) Truck bed orientation recognition system and method, and electronic device and storage medium
CN110751270A (en) Unmanned aerial vehicle wire fault detection method, system and equipment
CN111160280B (en) RGBD camera-based target object identification and positioning method and mobile robot
CN112800918A (en) Identity recognition method and device for illegal moving target
US20220306311A1 (en) Segmentation-based fuel receptacle localization for air-to-air refueling (a3r)
CN114689030A (en) Unmanned aerial vehicle auxiliary positioning method and system based on airborne vision
CN117132597B (en) Image recognition target positioning method and device and electronic equipment
CN110287957B (en) Low-slow small target positioning method and positioning device
CN112802100A (en) Intrusion detection method, device, equipment and computer readable storage medium
CN111105429A (en) Integrated unmanned aerial vehicle detection method
CN112802112B (en) Visual positioning method, device, server and storage medium
CN111275771B (en) Camera calibration method and device, electronic equipment and storage medium
CN113971789A (en) Method and computing system for detection
CN112802058A (en) Method and device for tracking illegal moving target
CN111860050A (en) Loop detection method and device based on image frame and vehicle-mounted terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant