CN114828146A

CN114828146A - Routing method for geographical position of unmanned cluster based on neural network and iterative learning

Info

Publication number: CN114828146A
Application number: CN202210412303.9A
Authority: CN
Inventors: 郑墨泓; 李勇; 李新宇; 姜虎
Original assignee: CETC 7 Research Institute
Current assignee: CETC 7 Research Institute
Priority date: 2022-04-19
Filing date: 2022-04-19
Publication date: 2022-07-29

Abstract

The invention discloses an unmanned cluster geographical position routing method based on a neural network and iterative learning, which comprises the following steps: before networking, performing offline training on historical data based on a neural network and iterative learning to obtain a prediction rule, wherein a plurality of value function sub-gradients are designed during training to prevent local convergence; when networking, generating a neighbor parameter prediction table according to the prediction rule; and selecting the next hop for forwarding according to the neighbor parameter prediction table and the requirement. Because the learning and training on historical data are completed offline, the unmanned cluster does not need to train online during networking, which greatly reduces the computation and power consumption of the unmanned aerial vehicle and lightens the burden on the onboard computer and power supply. The large number of iterations and operations, and the large power consumption they would cause, are avoided on board, so the method is suitable for airborne environments where computing resources and power are scarce.

Description

Routing method for geographical position of unmanned cluster based on neural network and iterative learning

Technical Field

The invention relates to the technical field of network switching, in particular to an unmanned cluster geographical position routing method based on a neural network and iterative learning.

Background

Unmanned aerial vehicles are highly flexible, can carry out tasks in place of people, reduce the loss of life and property, and are low in cost; as the technology has matured, they have been widely applied in many fields. However, a single unmanned aerial vehicle cannot perform a large-scale task, so the concept of the unmanned cluster was proposed. Cooperative work within an unmanned cluster requires that the unmanned aerial vehicles be able to communicate with each other, so the unmanned cluster needs to complete networking in the air, and a routing method suitable for unmanned clusters has become an important research topic.

The high-speed movement of unmanned aerial vehicles causes the topology to change drastically, and the nodes are sparsely distributed. In addition, some transmission tasks have strict real-time requirements. These characteristics and requirements pose great challenges to traditional routing protocols. Therefore, for airborne networking of unmanned aerial vehicles, the design and improvement of a routing method must mainly consider the kinematic and dynamic characteristics of the unmanned aerial vehicles and the unmanned cluster, together with the network characteristics of the unmanned cluster, so as to ensure the stability and reliability of the network.

Ad hoc routing protocols include static, proactive, reactive, hybrid, and location-based routing protocols. Static routing protocols are widely used in unmanned cluster networks, but their reliability is poor. Proactive routing protocols can be used in networks with frequent topology changes, but their control overhead is high, which makes them difficult to adapt to the narrow-band network of an unmanned cluster. Reactive routing protocols have low overhead, but the delay of the first packet is high. Hybrid routing protocols address some shortcomings of both proactive and reactive protocols, but it is difficult to define the scope in which the proactive protocol should be used. Location-based routing protocols make full use of the position information of the unmanned cluster, which reduces both control overhead and delay, so they are better suited to unmanned cluster networking than the other four types.

Machine learning can learn from and train on historical data, summarize regularities, and predict the future. Applied to unmanned cluster routing, it learns the regularities of networking from the data of historical tasks. Compared with traditional routing protocols that do not use machine learning, a routing protocol or method based on machine learning can therefore predict and optimize routes.

As machine learning algorithms have matured, many scholars have begun to use them to design and improve routing methods. The machine learning algorithms widely used in routing include reinforcement learning, neural networks and decision trees, and they can take requirements such as low delay and load balancing into account. However, some of this research is better suited to special scenarios such as the Internet of Vehicles, where the height difference between nodes is small, heavily delayed underwater networks, and intertidal zones. Examples are the reinforcement-learning-based multicast routing method for the urban Internet of Vehicles proposed by Xidian University, the machine-learning-based low-delay routing method for intertidal-zone sensor networks proposed by Zhejiang University, the position-information-based reinforcement learning routing method for the Internet of Vehicles proposed by Nanjing University of Posts and Telecommunications, and the Q-learning ant colony routing method for underwater multi-agent systems proposed by Tsinghua University.

Other research focuses on multicast, multiple data centers, software-defined networks, delay-tolerant networks and the like, such as the reinforcement-learning-based energy-saving routing method and system for multiple data centers proposed by Shandong University, the machine-learning-based optimal path selection algorithm under SDN proposed by Xi'an Jiaotong University, the intelligent routing decision method based on the DDPG reinforcement learning algorithm proposed by Xidian University, and the software-defined network routing method based on multi-agent reinforcement learning proposed by Nanjing University of Science and Technology.

In addition, some research does not consider that the resources of an airborne system, including power supply and computing resources, are insufficient, so that routing protocols relying on online learning are difficult to run on an unmanned aerial vehicle. Examples are the intelligent routing method based on deep reinforcement learning in a wireless network environment proposed by Nanjing Tech University, the routing optimization method and system based on a graph neural network and deep reinforcement learning proposed by Huazhong University of Science and Technology, the network routing planning method and system based on a BP neural network and ant colony algorithm proposed by Zhejiang University of Technology, and the Q-learning-based QoS routing method for flying ad hoc networks proposed by the Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences.

Finally, some machine-learning-based routing protocols for unmanned systems do not consider frequent node movement, large movement ranges or even node damage, and are not designed in combination with the geographic positions of the unmanned cluster, so their efficiency is low. Examples are the adaptive routing method and system for unmanned system networks based on deep reinforcement learning proposed by the Institute of Computing Technology, Chinese Academy of Sciences, the reinforcement-learning-based mobile ad hoc network routing method proposed by Shanghai Dynasty institute of technology, and the distributed intelligent routing method for unmanned aerial vehicle network slices proposed by the University of Electronic Science and Technology of China.

Disclosure of Invention

To remedy the defects and shortcomings of the prior art, the invention provides an unmanned cluster geographical position routing method based on a neural network and iterative learning.

In order to realize the purpose of the invention, the technical scheme is as follows:

an unmanned cluster geographical position routing method based on neural network and iterative learning, the method comprises the following steps:

before networking, off-line training is carried out on historical data based on a neural network and iterative learning, a prediction rule is obtained through training, and a plurality of value function sub-gradients are designed during training to prevent local convergence, wherein the value function sub-gradients comprise a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient;

when networking, generating a neighbor parameter prediction table according to a prediction rule;

and selecting the next hop for forwarding according to the neighbor parameter prediction table and the requirement.

Preferably, the historical data includes a set of m aerial mobile nodes, the neighbor set of each node, and the transmission characteristic parameters.

Further, the transmission characteristic parameters include a distance between the current node and a neighboring node, a distance between the destination node and the neighboring node, an included angle between a relative position vector between the neighboring node and the current node and a relative position vector between the destination node and the current node, an included angle between a velocity vector of the neighboring node relative to the current node and a relative position vector between the neighboring node and the current node, a transmission bandwidth between the current node and the neighboring node, a time delay, a hop count, and a transmission time between the current node and the neighboring node; wherein the current node and the destination node are not coincident.
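The four geometric parameters listed above can be derived from node positions and velocities. The following is a minimal sketch under stated assumptions, not the patent's own formula: the function and key names are hypothetical, positions and velocities are taken to be 3-D Cartesian vectors from navigation, and the angle is defined as 0 when either vector has zero length.

```python
import math
from typing import Dict, List, Sequence

Vec = Sequence[float]


def sub(a: Vec, b: Vec) -> List[float]:
    """Component-wise difference a - b."""
    return [x - y for x, y in zip(a, b)]


def norm(a: Vec) -> float:
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in a))


def angle(a: Vec, b: Vec) -> float:
    """Angle in radians between a and b; 0.0 if either is a zero vector (assumption)."""
    na, nb = norm(a), norm(b)
    if na == 0.0 or nb == 0.0:
        return 0.0
    cos = sum(x * y for x, y in zip(a, b)) / (na * nb)
    return math.acos(max(-1.0, min(1.0, cos)))  # clamp against rounding error


def geometric_features(p_c: Vec, p_j: Vec, p_d: Vec, v_c: Vec, v_j: Vec) -> Dict[str, float]:
    """Geometric transmission parameters for current node c, neighbor j, destination d."""
    r_cj = sub(p_j, p_c)   # position of the neighbor relative to the current node
    r_cd = sub(p_d, p_c)   # position of the destination relative to the current node
    v_rel = sub(v_j, v_c)  # velocity of the neighbor relative to the current node
    return {
        "dist_current_neighbor": norm(r_cj),
        "dist_dest_neighbor": norm(sub(p_d, p_j)),
        "angle_position": angle(r_cj, r_cd),
        "angle_velocity": angle(v_rel, r_cj),
    }


features = geometric_features(
    p_c=(0.0, 0.0, 0.0), p_j=(1.0, 0.0, 0.0), p_d=(1.0, 1.0, 0.0),
    v_c=(0.0, 0.0, 0.0), v_j=(0.0, 2.0, 0.0),
)
```

Bandwidth and time delay are link measurements rather than geometric quantities, so they would be supplied separately.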

Further, the distance between the current node and the neighbor node, the distance between the destination node and the neighbor node, the included angle between the relative position vector of the neighbor node and the current node and the relative position vector of the destination node and the current node, the included angle between the velocity vector of the neighbor node relative to the current node and the relative position vector of the neighbor node and the current node, the transmission bandwidth between the current node and the neighbor node, and the time delay between the current node and the neighbor node are taken as the training pair and fed to the neural network for off-line training;

and the hop count and the transmission time are taken as the output pair of the neural network.

Still further, the off-line training of the historical data based on the neural network and the iterative learning comprises the following specific steps:

S101: selecting the value of each constant parameter, including the numbers of neurons in the first hidden layer and the second hidden layer of the neural network, the coefficients $\kappa_{hh}$ and $\kappa_{hy}$, the values of the diagonal matrices $\lambda_y$, $\lambda_h$ and $\lambda_u$, the value function standard value $V_{req}$, the maximum number of iterations $k_{max}$, and the training step size $\eta$;

s102: initializing a first weighting matrix from an input layer node to a hidden layer node, a second weighting matrix from the first hidden layer node to a second hidden layer node, and a third weighting matrix from the second hidden layer node to an output layer node; wherein k represents an iteration number;

s103: respectively calculating the output of a first hidden layer node, the output of a second hidden layer node and an output pair matrix of the kth iteration for n groups of training pairs, wherein the output pair matrix comprises a hop count predicted value and a transmission time predicted value;

S104: respectively calculating, according to the calculated output of the first hidden layer nodes, output of the second hidden layer nodes and output pair matrix, the first value function $V_{uh}(k)$, the second value function $V_{hh}(k)$ and the third value function $V_{hy}(k)$ corresponding to the first weighting matrix, the second weighting matrix and the third weighting matrix;

S105: if $\|V(k)\|_\infty \le V_{req}$, where $V(k) = [V_{uh}(k)\ V_{hh}(k)\ V_{hy}(k)]$ is the value function matrix and $\|\cdot\|_\infty$ is the infinity norm, or if the iteration number $k$ reaches the set maximum number $k_{max}$, ending the training and entering step S111; otherwise, going to step S106;

S106: respectively calculating the value function sub-gradients corresponding to the first weighting matrix, the second weighting matrix and the third weighting matrix, wherein the value function sub-gradients comprise a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient;

s107: respectively calculating the value function gradients corresponding to the first weighting matrix, the second weighting matrix and the third weighting matrix according to the steepest descent sub-gradient, the proportional sub-gradient and the differential sub-gradient obtained by calculation;

S108: entering the next iteration, i.e. $k = k + 1$;

s109: updating the corresponding first weighting matrix, second weighting matrix and third weighting matrix according to the value function gradient corresponding to the first weighting matrix, second weighting matrix and third weighting matrix obtained by the last iterative computation;

s110: repeating the steps S103-S105 until the iteration ending condition is met, and entering the step S111;

s111: and taking the updated first weighting matrix, second weighting matrix and third weighting matrix as the optimal weighting matrix and as the prediction rule.
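The control flow of steps S103 to S111 can be sketched as a short loop. This is an illustration under stated assumptions rather than the patented procedure: the three weighting matrices are stood in for by three scalars, a plain gradient replaces the combined steepest descent, proportional and differential sub-gradients, and the update rule W(k) = W(k-1) - η·G(k-1) is assumed. All function names are hypothetical.

```python
from typing import Callable, List, Tuple


def iterative_training(
    weights: List[float],
    value_fn: Callable[[List[float]], List[float]],
    grad_fn: Callable[[List[float]], List[float]],
    v_req: float,
    k_max: int,
    eta: float,
) -> Tuple[List[float], int]:
    """Iterate until the infinity norm of the value functions is small or k_max is hit."""
    k = 0
    while True:
        values = value_fn(weights)                    # S103-S104: forward pass, value functions
        if max(abs(v) for v in values) <= v_req or k >= k_max:
            return weights, k                         # S105 -> S111: stop, keep these weights
        grads = grad_fn(weights)                      # S106-S107: (simplified) gradient
        k += 1                                        # S108: next iteration
        weights = [w - eta * g for w, g in zip(weights, grads)]  # S109: assumed update rule


# Toy problem: each "weighting matrix" is one scalar with a quadratic value function.
targets = [1.0, -2.0, 0.5]


def toy_values(ws: List[float]) -> List[float]:
    return [(w - t) ** 2 for w, t in zip(ws, targets)]


def toy_grads(ws: List[float]) -> List[float]:
    return [2.0 * (w - t) for w, t in zip(ws, targets)]


final_weights, iterations = iterative_training(
    [0.0, 0.0, 0.0], toy_values, toy_grads, v_req=1e-6, k_max=100, eta=0.25
)
```

The stopping test is checked before each update, so the weights returned are exactly those whose value functions satisfied the threshold, or those reached when the iteration limit was hit.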

Further, a neighbor parameter prediction table is generated according to the prediction rule, which specifically includes:

S201: when networking, obtaining the information of the current node $X_c$ and the destination node $X_d$;

S202: if the destination node is a neighbor of the current node, i.e. $X_d \in \chi_{c,B}$, directly selecting $X_d$ as the next hop; otherwise, entering the next step;

S203: obtaining the next-hop feasible solution set $\chi_{c,B}$ from the neighbor information, selecting a neighbor node $X_j \in \chi_B$, where $j \in [1, m]$, and predicting, according to the prediction rule, the hop count and the transmission time for the message to reach the destination node when $X_j$ is selected as the next hop;

S204: traversing all the neighbor nodes and repeating step S203 to predict the hop count and the transmission time for each neighbor node, thereby obtaining the neighbor parameter prediction table.
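Steps S201 to S204 amount to a direct-delivery check followed by one prediction per neighbor. A minimal sketch, assuming the trained prediction rule is available as a callable `predict(current, neighbor, destination)` that returns a (hop count, transmission time) pair; that callable, the function name and the sample values are hypothetical.

```python
from typing import Callable, Dict, Hashable, Iterable, Optional, Tuple

Prediction = Tuple[float, float]  # (predicted hop count, predicted transmission time)


def build_prediction_table(
    current: Hashable,
    destination: Hashable,
    neighbors: Iterable[Hashable],
    predict: Callable[[Hashable, Hashable, Hashable], Prediction],
) -> Tuple[Optional[Hashable], Dict[Hashable, Prediction]]:
    """Return (direct_next_hop, table); the table is empty when delivery is direct."""
    neighbor_list = list(neighbors)
    if destination in neighbor_list:          # S202: destination is itself a neighbor
        return destination, {}
    table: Dict[Hashable, Prediction] = {}
    for j in neighbor_list:                   # S203-S204: one prediction per neighbor
        table[j] = predict(current, j, destination)
    return None, table


# Stub predictor with made-up numbers, standing in for the trained network.
made_up = {"X3": (3.0, 0.42), "X4": (2.0, 0.55), "X5": (4.0, 0.40)}


def stub_predict(current: Hashable, neighbor: Hashable, destination: Hashable) -> Prediction:
    return made_up[neighbor]


direct, table = build_prediction_table("X2", "X6", ["X3", "X4", "X5"], stub_predict)
```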

Further, the hop count and the transmission time for the message to reach the destination node when $X_j$ is selected as the next hop are predicted from the transmission characteristic parameters. The inputs to the prediction are:

the distance between the current node and the neighbor node;

the distance between the destination node and the neighbor node;

the included angle between the relative position vector of the neighbor node and the current node and the relative position vector of the destination node and the current node;

the included angle between the velocity vector of the neighbor node relative to the current node and the relative position vector of the neighbor node and the current node;

the transmission bandwidth between the current node and the neighbor node $X_j$;

the time delay between the current node and the neighbor node $X_j$.

Preferably, the next hop is selected for forwarding according to the neighbor parameter prediction table and according to requirements, and the current routing information is stored in a database; prior to networking, historical data is obtained from a database.
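Once a neighbor parameter prediction table exists, choosing the next hop reduces to a minimum search under the stated requirement. The text does not enumerate the possible requirements, so the two shown here (fewest hops, shortest transmission time) and their names are assumptions, and the table values are made up for illustration.

```python
from typing import Dict, Hashable, Tuple

Prediction = Tuple[float, float]  # (predicted hop count, predicted transmission time)


def select_next_hop(table: Dict[Hashable, Prediction], requirement: str = "min_time") -> Hashable:
    """Pick the neighbor that best meets the requirement; ties fall back to the other metric."""
    if not table:
        raise ValueError("prediction table is empty")
    if requirement == "min_time":
        def key(item):
            return (item[1][1], item[1][0])   # time first, then hops
    elif requirement == "min_hops":
        def key(item):
            return (item[1][0], item[1][1])   # hops first, then time
    else:
        raise ValueError(f"unknown requirement: {requirement}")
    return min(table.items(), key=key)[0]


example_table = {"X3": (3.0, 0.42), "X4": (2.0, 0.55), "X5": (4.0, 0.40)}
fastest = select_next_hop(example_table, "min_time")
fewest_hops = select_next_hop(example_table, "min_hops")
```

Keeping the requirement as a parameter lets the same table serve delay-sensitive and hop-sensitive traffic without re-running the prediction.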

An aerial mobile device comprises an offline training module, a prediction module and a forwarding selection module;

before networking, the offline training module trains on historical data to obtain a prediction rule, and a plurality of value function sub-gradients are designed during training to prevent local convergence, wherein the value function sub-gradients comprise a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient;

when networking, the prediction module generates a neighbor parameter prediction table according to a prediction rule;

and the forwarding selection module selects the next hop for forwarding according to the neighbor parameter prediction table and according to requirements, and stores the current routing information into a database.

A computer system comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the unmanned cluster geographical position routing method based on a neural network and iterative learning when executing the computer program.

The invention has the following beneficial effects:

1. The invention is oriented to airborne networking of unmanned clusters. It combines a neural network and iterative learning to design a routing method based on geographic position, makes full use of the navigation function and the motion characteristics of the unmanned cluster, and adapts to adversarial environments in which the network topology changes frequently.

2. In addition, the invention adopts offline training: the learning and training on historical data are completed offline, so the unmanned cluster does not need to train online during networking. This greatly reduces the computation and power consumption of the unmanned aerial vehicle and lightens the burden on the onboard computer and power supply. The large number of iterations and operations, and the large power consumption they would cause, are avoided on board, so the method is suitable for airborne environments where computing resources and power are scarce.

Drawings

FIG. 1 is a flow chart of the routing method of the unmanned cluster geographic position based on neural network and iterative learning.

Fig. 2 is a flow chart showing the detailed steps of the method of the present invention.

FIG. 3 is a diagram illustrating the physical significance of data among a current node, a neighbor node, and a destination node according to the present invention.

FIG. 4 is a flow chart of the operation of the off-line training of the present invention.

FIG. 5 is a diagram of the structure and training process of the neural network of the present invention.

FIG. 6 is a graphical illustration of historical data for the present invention.

Figure 7 is a flowchart of the operation of the present invention to generate a neighbor parameter prediction table and forwarding selection.

Fig. 8 is an exemplary diagram of neighbor node information of the present invention.

Fig. 9 is transmission parameter characteristic information of the node in fig. 7.

Fig. 10 is a representation diagram of neighbor parameter prediction in accordance with the present invention.

Fig. 11 is a system schematic diagram of an over-the-air mobile device according to the present invention.

Detailed Description

The invention is described in detail below with reference to the drawings and the detailed description.

Example 1

The positions of the unmanned aerial vehicle nodes in an unmanned cluster change frequently but follow regular patterns, and these patterns are related to the tasks executed by the cluster and to its kinematics and dynamics. The invention therefore combines a neural network and iterative learning to design a position-based routing method that can adapt to a network environment with frequent topology changes. It adopts off-line training to avoid a large amount of computation and storage on the onboard computer and the large power consumption this would cause, and is thus suited to the airborne environment. The aerial mobile nodes involved in the present embodiment include, but are not limited to, drone nodes, balloon nodes, glider nodes, airship nodes, airplane nodes and helicopter nodes. The present embodiment takes the unmanned aerial vehicle node as an example and is described in detail as follows:

as shown in fig. 1 and fig. 2, an unmanned cluster geographical location routing method based on neural network and iterative learning includes the following steps:

before networking, offline training is carried out on historical data based on a neural network and iterative learning, a prediction rule is obtained through training, and a plurality of value function sub-gradients including a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient are designed during training to prevent local convergence;

when networking, generating a neighbor parameter prediction table according to a prediction rule without knowing the topology of the whole network;

In a specific embodiment, the historical data includes the unmanned aerial vehicle node set $\chi = \{X_1, X_2, \ldots, X_m\}$, the neighbor set $\chi_{i,B}$ of each node $X_i \in \chi$, and the transmission characteristic parameters. As shown in fig. 3, the transmission characteristic parameters describe the sending of a message from the current node $X_c \in \chi$ through the next hop $X_b \in \chi_{i,B}$ to the destination node $X_d$, and include:

the distance between the current node and the neighbor node;

the distance between the destination node and the neighbor node;

the included angle between the relative position vector of the neighbor node and the current node and the relative position vector of the destination node and the current node;

the included angle between the velocity vector of the neighbor node relative to the current node and the relative position vector of the neighbor node and the current node;

the transmission bandwidth between the current node and the neighbor node;

the time delay between the current node and the neighbor node;

the hop count;

the transmission time;

where $c \neq d$ indicates that the current node and the destination node do not coincide. Together these quantities form one set of data, where $n$ is the data sequence number.
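One set of historical data, as described above, can be held in a small record type that separates the six network inputs from the two outputs. The class name, field names and sample values below are hypothetical; units are left to the caller.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HistoryRecord:
    """One set of historical data (sequence number n) for a current/neighbor/destination triple."""
    dist_current_neighbor: float   # distance between current node and neighbor node
    dist_dest_neighbor: float      # distance between destination node and neighbor node
    angle_position: float          # angle between the two relative position vectors
    angle_velocity: float          # angle between relative velocity and relative position
    bandwidth: float               # transmission bandwidth, current node to neighbor
    delay: float                   # time delay, current node to neighbor
    hop_count: float               # observed hop count to the destination
    transmission_time: float       # observed transmission time to the destination

    def inputs(self) -> List[float]:
        """The six quantities fed to the neural network."""
        return [
            self.dist_current_neighbor, self.dist_dest_neighbor,
            self.angle_position, self.angle_velocity,
            self.bandwidth, self.delay,
        ]

    def outputs(self) -> List[float]:
        """The two quantities the neural network is trained to predict."""
        return [self.hop_count, self.transmission_time]


# Illustrative values only; they are not taken from the patent.
record = HistoryRecord(120.0, 340.0, 0.3, 1.1, 2.0e6, 0.015, 3.0, 0.42)
```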

Wherein velocity vectors of neighboring nodes relative to the current node

Can be calculated according to the following formula:

where

is a position vector of the destination node relative to the neighbor nodes, and

and

all are obtained by navigation.

In a specific embodiment, the following quantities are fed to the neural network as the training pair for off-line training: the distance between the current node and the neighbor node; the distance between the destination node and the neighbor node; the included angle between the relative position vector of the neighbor node and the current node and the relative position vector of the destination node and the current node; the included angle between the velocity vector of the neighbor node relative to the current node and the relative position vector of the neighbor node and the current node; the transmission bandwidth between the current node and the neighbor node; and the time delay between the current node and the neighbor node.

The hop count and the transmission time are taken as the output pair of the neural network.

In this embodiment, the neural network is trained off-line on the training pairs to obtain the output pairs.

The embodiment adopts an off-line training mode, which avoids carrying out a large amount of computation and storage on the airborne computer and the large power consumption this would cause, and is therefore suited to the airborne environment.

In a specific embodiment, the neural network described in this embodiment sequentially includes an input layer, a first hidden layer, a second hidden layer, and an output layer.
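The four-layer structure can be illustrated with a forward pass through the three weighting matrices. This is a sketch under assumptions the text does not confirm: tanh activations on both hidden layers, a linear output layer, no bias terms, matrices stored as lists of rows with one row per receiving node, and hidden layer sizes of 8 and 4. The names `W_uh`, `W_hh` and `W_hy` follow the weighting matrix symbols used in this embodiment.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matvec(w: Matrix, x: List[float]) -> List[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def forward(x: List[float], W_uh: Matrix, W_hh: Matrix, W_hy: Matrix
            ) -> Tuple[List[float], List[float], List[float]]:
    """Return (first hidden output, second hidden output, output pair)."""
    z1 = [math.tanh(a) for a in matvec(W_uh, x)]    # input layer -> first hidden layer
    z2 = [math.tanh(a) for a in matvec(W_hh, z1)]   # first hidden -> second hidden layer
    y = matvec(W_hy, z2)                            # second hidden -> output (hops, time)
    return z1, z2, y


def init_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Small random initial weights (the initialization scheme is an assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_in, n_h1, n_h2, n_out = 6, 8, 4, 2               # 6 inputs and 2 outputs per the text
W_uh = init_matrix(n_h1, n_in, rng)
W_hh = init_matrix(n_h2, n_h1, rng)
W_hy = init_matrix(n_out, n_h2, rng)
z1, z2, y = forward([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], W_uh, W_hh, W_hy)
```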

As shown in fig. 4, 5, and 6, the offline training of the historical data based on the neural network and the iterative learning specifically includes the following steps:

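S101: selecting the value of each constant parameter, including the numbers of neurons in the first hidden layer and the second hidden layer of the neural network, the coefficients $\kappa_{hh}$ and $\kappa_{hy}$, the values of the diagonal matrices $\lambda_y$, $\lambda_h$ and $\lambda_u$, the value function standard value $V_{req}$, the maximum number of iterations $k_{max}$, and the training step size $\eta$; wherein $k$ represents the iteration number;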

S102: initializing the off-line training parameters. The first weighting matrix from the input layer nodes to the first hidden layer nodes is denoted $W_{uh}(k)$, the second weighting matrix from the first hidden layer nodes to the second hidden layer nodes is denoted $W_{hh}(k)$, and the third weighting matrix from the second hidden layer nodes to the output layer nodes is denoted $W_{hy}(k)$, where $k$ represents the iteration number; the first, second and third weighting matrices are initialized to $W_{uh}(0)$, $W_{hh}(0)$ and $W_{hy}(0)$, with $k = 0$.

S103: for n training pairs, respectively calculating the first iteration of the kOutput of hidden layer nodes

Output of the second hidden layer node

And output pair matrix

Wherein the output pair matrix includes hop count prediction values

And transmission time prediction value

A specific calculation method provided in this embodiment is as follows:

where

is the training pair matrix;

S104: according to the calculated output of the first hidden layer nodes, output of the second hidden layer nodes and output pair matrix, respectively calculating the first value function $V_{uh}(k)$, the second value function $V_{hh}(k)$ and the third value function $V_{hy}(k)$ corresponding to the first weighting matrix $W_{uh}(k)$, the second weighting matrix $W_{hh}(k)$ and the third weighting matrix $W_{hy}(k)$;

A specific calculation formula provided in this embodiment is as follows:

wherein e is _y (N, k) is the error of the N (N is less than or equal to N) th training set to the output node, e ₁ (n, k) is the n-th set of virtual errors trained on the first hidden layer nodes, e ₂ (n, k) is the n-th set of virtual errors, k, for training the second hidden layer nodes _hh Not less than 0 and κ _hy Not less than 0 is a constant parameter, | ·| non-woven phosphor ₂ Is a two-norm.

where

is the theoretical value at the output nodes for the $n$-th group of training pairs, $z_{2,T}(n,k)$ is the virtual theoretical value of the second hidden layer nodes, $z_{1,T}(n,k)$ is the virtual theoretical value of the first hidden layer nodes, and $\lambda_y$, $\lambda_h$ and $\lambda_u$ are positive diagonal matrices.

S105: if the value functions $V_{uh}(k)$, $V_{hh}(k)$ and $V_{hy}(k)$ satisfy the requirement, that is, if $\|V(k)\|_\infty \le V_{req}$, where $V(k) = [V_{uh}(k)\ V_{hh}(k)\ V_{hy}(k)]$ is the value function matrix, $\|\cdot\|_\infty$ is the infinity norm and $V_{req}$ is the value function standard value, or if the iteration number $k$ reaches the set maximum number $k_{max}$, ending the training and entering step S111; otherwise, going to step S106;

S106: respectively calculating the value function sub-gradients corresponding to the first weighting matrix $W_{uh}(k)$, the second weighting matrix $W_{hh}(k)$ and the third weighting matrix $W_{hy}(k)$, wherein the value function sub-gradients include a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient. Specifically, for each of $W_{hy}(k)$, $W_{hh}(k)$ and $W_{uh}(k)$, the corresponding value function sub-gradients are expressed as a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient.

The sub-gradients corresponding to $W_{hy}(k)$ are calculated according to the following formula:

where the indexed term is the element in row $i$ and column $j$ of the corresponding matrix, $(\cdot)'$ denotes the derivative, $E(\cdot)$ denotes the expectation, and the difference term represents the difference between the virtual theoretical value of $W_{hy}(k)$ and $W_{hy}(k)$.

The sub-gradients corresponding to $W_{hh}(k)$ are calculated according to the following formula:

where the indexed term is the element in row $i$ and column $j$ of the corresponding matrix, and the difference term represents the difference between the virtual theoretical value of $W_{hh}(k)$ and $W_{hh}(k)$.

The sub-gradients corresponding to $W_{uh}(k)$ are calculated according to the following formula:

where the indexed term is the element in row $i$ and column $j$ of the corresponding matrix, and the difference term represents the difference between the virtual theoretical value of $W_{uh}(k)$ and $W_{uh}(k)$.

In this embodiment, the signs of the differences between the virtual theoretical values and the iteration values of each weighting matrix are calculated according to the following formula:

s107: respectively calculating the value function gradients corresponding to the first weighting matrix, the second weighting matrix and the third weighting matrix according to the steepest descent sub-gradient, the proportional sub-gradient and the differential sub-gradient obtained by calculation; each value function sub-gradient is used for preventing each weighting matrix from locally converging, and a specific calculation formula is as follows:

wherein G is _hy (k) Represents the gradient of the cost function G corresponding to the first weighting matrix _hh (k) Representing the gradient of the cost function, G, corresponding to the second weighting matrix _uh (k) And representing the gradient of the cost function corresponding to the third weighting matrix.

S108: entering the next iteration, i.e. $k = k + 1$;

S109: updating the first weighting matrix, the second weighting matrix and the third weighting matrix according to the corresponding value function gradients obtained in the previous iteration; specifically, according to the value function gradients $G_{uh}(k-1)$, $G_{hh}(k-1)$ and $G_{hy}(k-1)$, the weighting matrices are updated to $W_{uh}(k)$, $W_{hh}(k)$ and $W_{hy}(k)$:

Wherein, the training step length eta > 0 is a constant parameter.

S111: the updated first weighting matrix, second weighting matrix and third weighting matrix are taken as the optimal weighting matrices and serve as the prediction rule.

In a specific embodiment, as shown in fig. 7, the neighbor parameter prediction table is generated according to the prediction rule, which is as follows:

S203: obtaining the next-hop feasible solution set $\chi_{c,B}$ from the neighbor information, selecting a neighbor node $X_j \in \chi_B$, where $j \in [1, m]$, and predicting, according to the transmission characteristic parameters, the hop count and the transmission time for the message to reach the destination node when $X_j$ is selected as the next hop;

In this embodiment, in step S202, if the destination node is determined to be a neighbor node, i.e. $X_d \in \chi_{c,B}$, the routing information is stored into the database;

in a specific embodiment, a next hop is selected for forwarding according to a neighbor parameter prediction table and according to requirements, and current routing information is stored in a database; prior to networking, historical data is obtained from a database.

The neighbor parameter prediction table obtained in this embodiment is as follows:

In a specific embodiment, the hop count and the transmission time for delivering the message to the destination node when $X_j$ is selected as the next hop are predicted from the transmission characteristic parameters. The inputs of the prediction formula include the distance between the current node and the neighbor node, the distance between the destination node and the neighbor node, and the time delay between the current node and the neighbor node $X_j$.
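As an illustration of how the trained weighting matrices could act as a prediction rule, the following Python sketch runs one forward pass through a network with two hidden layers. The tanh activation, the linear output layer, the layer sizes, and all weight values are assumptions chosen for the example, and the names `matvec` and `predict` are hypothetical.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def predict(features, w_uh, w_hh, w_hy):
    """Map six transmission features to (hop count, transmission time).

    Assumed structure: input -> hidden 1 -> hidden 2 -> output,
    with tanh on both hidden layers and a linear output layer.
    """
    hidden1 = [math.tanh(v) for v in matvec(w_uh, features)]
    hidden2 = [math.tanh(v) for v in matvec(w_hh, hidden1)]
    hops, time = matvec(w_hy, hidden2)
    return hops, time


# Hypothetical weights: 6 inputs, 3 + 3 hidden nodes, 2 outputs.
w_uh = [[0.1] * 6 for _ in range(3)]
w_hh = [[0.2] * 3 for _ in range(3)]
w_hy = [[1.0] * 3, [0.5] * 3]

# Hypothetical feature vector: two distances, two angles, bandwidth, delay.
features = [5.0, 5.0, 0.0, 0.93, 2.0, 0.01]
predicted_hops, predicted_time = predict(features, w_uh, w_hh, w_hy)
```

In practice the outputs would be rescaled to physical units according to how the training data were normalized, which this sketch leaves out.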

As shown in Figs. 8 and 9, assume again that the unmanned cluster includes 6 air mobile nodes. After the offline training is completed and the air networking is performed, assume that the current node is node 2, i.e., $X_c = X_2$, the destination node is node 6, i.e., $X_d = X_6$, and the neighbors of node 2 are node 3, node 4, and node 5, i.e., $\chi_B = \{X_3, X_4, X_5\}$. When the prediction module works, it obtains the following information for each neighbor:

For node 3: the distance of node 3 relative to node 2; the distance of node 6 relative to node 3; the angle between the position vector of node 3 relative to node 2 and the position vector of node 6 relative to node 2; the angle between the velocity vector of node 3 relative to node 2 and the position vector of node 6 relative to node 2; the bandwidth between nodes 2 and 3; and the time delay between nodes 2 and 3.

For node 4: the distance of node 4 relative to node 2; the distance of node 6 relative to node 4; the angle between the position vector of node 4 relative to node 2 and the position vector of node 6 relative to node 2; the angle between the velocity vector of node 4 relative to node 2 and the position vector of node 6 relative to node 2; the bandwidth between nodes 2 and 4; and the time delay between nodes 2 and 4.

For node 5: the distance of node 5 relative to node 2; the distance of node 6 relative to node 5; the angle between the position vector of node 5 relative to node 2 and the position vector of node 6 relative to node 2; the angle between the velocity vector of node 5 relative to node 2 and the position vector of node 6 relative to node 2; the bandwidth between nodes 2 and 5; and the time delay between nodes 2 and 5.
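The geometric quantities listed for each neighbor can be computed from node positions and velocities. The Python sketch below shows one way to do so; the coordinates, velocities, bandwidth, and delay values are hypothetical, the helper names are illustrative, and reporting an undefined angle (zero-length vector) as 0 is an assumption.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two vectors."""
    return tuple(x - y for x, y in zip(a, b))


def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))


def angle(u, v):
    """Angle in radians between u and v (0.0 if either vector is zero)."""
    denom = norm(u) * norm(v)
    if denom == 0.0:
        return 0.0  # assumption: undefined angles are reported as 0
    cos = sum(x * y for x, y in zip(u, v)) / denom
    return math.acos(max(-1.0, min(1.0, cos)))


def neighbor_features(cur_pos, cur_vel, nbr_pos, nbr_vel, dst_pos,
                      bandwidth, delay):
    """Build the six-element feature list for one neighbor."""
    rel_pos = sub(nbr_pos, cur_pos)   # neighbor relative to current node
    rel_vel = sub(nbr_vel, cur_vel)   # relative velocity of the neighbor
    to_dst = sub(dst_pos, cur_pos)    # destination relative to current node
    return [
        norm(rel_pos),                # distance current node to neighbor
        norm(sub(dst_pos, nbr_pos)),  # distance neighbor to destination
        angle(rel_pos, to_dst),       # position-vector angle
        angle(rel_vel, to_dst),       # velocity-vector angle
        bandwidth,
        delay,
    ]


# Hypothetical data for node 2 (current), node 3 (neighbor), node 6 (destination).
features_node3 = neighbor_features(
    cur_pos=(0.0, 0.0, 0.0), cur_vel=(0.0, 0.0, 0.0),
    nbr_pos=(3.0, 4.0, 0.0), nbr_vel=(1.0, 0.0, 0.0),
    dst_pos=(6.0, 8.0, 0.0), bandwidth=2.0, delay=0.01)
```

The same function would be called once per neighbor (nodes 3, 4, and 5 in this example) to fill the inputs of the prediction step.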

The optimal weighting matrices obtained by offline training are used as the prediction rule to generate the neighbor parameter prediction table shown in Fig. 10. For each candidate next hop (node 3, node 4, and node 5), the table contains the predicted hop count and the predicted transmission time from node 2 to node 6 when that node is selected as the next hop.

Based on the neighbor parameter prediction table, the forwarding selection module may select the next hop according to the criterion of a low hop count or a short transmission time. For example, when the requirement is a low hop count, the selected next hop is the neighbor with the smallest predicted hop count; when the requirement is a short transmission time, the selected next hop is the neighbor with the smallest predicted transmission time.
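The two selection criteria amount to a minimum search over the neighbor parameter prediction table. A minimal Python sketch follows; the table values, the requirement labels `few_hops` and `short_time`, and the function name `select_next_hop` are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical prediction table for the neighbors of node 2.
prediction_table = {
    "X3": {"hops": 3, "time": 12.0},
    "X4": {"hops": 2, "time": 15.0},
    "X5": {"hops": 4, "time": 9.0},
}


def select_next_hop(table, requirement):
    """Return the neighbor that minimizes the predicted quantity.

    requirement: "few_hops" or "short_time" (labels assumed here).
    """
    column = {"few_hops": "hops", "short_time": "time"}[requirement]
    return min(table, key=lambda node: table[node][column])


next_hop_by_hops = select_next_hop(prediction_table, "few_hops")
next_hop_by_time = select_next_hop(prediction_table, "short_time")
```

With these example values, the low-hop-count requirement picks X4 and the short-transmission-time requirement picks X5, showing that the two criteria can lead to different next hops.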

Example 2

As shown in Fig. 11, an airborne mobile device includes an offline training module, a prediction module, and a forwarding selection module;

the forwarding selection module selects the next hop for forwarding according to the neighbor parameter prediction table and the requirement, and stores the current routing information into a database; prior to networking, historical data is obtained from a database.

The offline training module performs offline training on the historical data based on a neural network and iterative learning, implementing steps S101 to S111 of embodiment 1.

The prediction module generates the neighbor parameter prediction table according to the prediction rule, implementing steps S201 to S204 of embodiment 1.

The airborne mobile device described in this embodiment refers to a man-made flying apparatus that can leave the ground and fly, under human control, in the atmosphere or in the space outside the atmosphere (outer space), including but not limited to an unmanned aerial vehicle, a balloon, a glider, an airship, an airplane, and a helicopter.

Example 3

A computer system comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following method steps:

The memory and the processor are connected by a bus. The bus may comprise any number of interconnected buses and bridges, which connect together one or more processors and the various circuits of the memory. The bus may also connect various other circuits, such as peripherals, voltage regulators, and power management circuits, which are well known in the art and are therefore not described further here. A bus interface provides an interface between the bus and the transceiver. The transceiver may be a single element or a plurality of elements, such as a plurality of receivers and transmitters, and provides a means for communicating with various other apparatus over a transmission medium. The data processed by the processor is transmitted over a wireless medium via an antenna; the antenna also receives incoming data and passes it to the processor.

It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims

1. An unmanned cluster geographical position routing method based on neural network and iterative learning is characterized in that: the method comprises the following steps:

2. The unmanned cluster geographical position routing method based on a neural network and iterative learning according to claim 1, wherein: the historical data comprises a set of m air mobile nodes, the neighbor sets of all nodes, and transmission characteristic parameters.

3. The unmanned cluster geographical position routing method based on a neural network and iterative learning according to claim 2, wherein: the transmission characteristic parameters comprise the distance between the current node and the neighbor node, the distance between the destination node and the neighbor node, the included angle between the relative position vector between the neighbor node and the current node and the relative position vector between the destination node and the current node, the included angle between the velocity vector between the neighbor node and the current node and the relative position vector between the neighbor node and the current node, the transmission bandwidth between the current node and the neighbor node, the time delay between the current node and the neighbor node, the hop count, and the transmission time; wherein the current node and the destination node do not coincide.

4. The unmanned cluster geographical position routing method based on a neural network and iterative learning according to claim 3, wherein: the distance between the current node and the neighbor node, the distance between the destination node and the neighbor node, the included angle between the relative position vector between the neighbor node and the current node and the relative position vector between the destination node and the current node, the included angle between the velocity vector between the neighbor node and the current node and the relative position vector between the neighbor node and the current node, the transmission bandwidth between the current node and the neighbor node, and the time delay between the current node and the neighbor node are taken as the training inputs of the neural network for offline training;

5. The unmanned cluster geographical position routing method based on a neural network and iterative learning according to claim 4, wherein: the offline training on the historical data based on the neural network and iterative learning comprises the following specific steps:

S102: initializing a first weighting matrix from an input layer node to a hidden layer node, a second weighting matrix from the first hidden layer node to a second hidden layer node, and a third weighting matrix from the second hidden layer node to an output layer node;

S105: if $\|V(k)\|_\infty \le V_{req}$, where $V(k) = [V_{uh}(k)\ V_{hh}(k)\ V_{hy}(k)]$ is the value function matrix and $\|\cdot\|_\infty$ is the infinity norm, or if the iteration count $k$ reaches the set maximum number $k_{max}$, ending the training and entering step S111; otherwise, entering step S106;

S108: entering the next iteration, where $k = k + 1$;

6. The unmanned cluster geographical position routing method based on a neural network and iterative learning according to claim 5, wherein: the neighbor parameter prediction table is generated according to the prediction rule, specifically as follows:

S202: if the destination node is a neighbor node, i.e., $X_d \in \chi_{c,B}$, directly selecting $X_d$ as the next hop; otherwise, entering the next step;

S203: obtaining the next-hop feasible solution set $\chi_{c,B}$ from the neighbor information, and selecting a neighbor node $X_j \in \chi_B$, where $j \in [1, m]$,

7. The unmanned cluster geographical position routing method based on a neural network and iterative learning according to claim 6, wherein: the hop count and the transmission time for delivering the message to the destination node when $X_j$ is selected as the next hop are predicted from the transmission characteristic parameters, the inputs of the prediction formula including the distance between the destination node and the neighbor node, the included angle between the velocity vector of the current node and the relative position vector between the neighbor node and the current node, and the time delay between the current node and the neighbor node $X_j$.

8. The unmanned cluster geographical position routing method based on a neural network and iterative learning according to claim 1, wherein: the next hop is selected for forwarding according to the neighbor parameter prediction table and the requirement, the current routing information is stored in a database, and the historical data is obtained from the database before networking.

9. An airborne mobile device, characterized in that: the device comprises an offline training module, a prediction module, and a forwarding selection module;

before networking, the offline training module performs training based on historical data to obtain a prediction rule, and a plurality of value function sub-gradients are designed during training to prevent local convergence, the value function sub-gradients comprising a steepest descent sub-gradient, a proportional sub-gradient, and a differential sub-gradient;

10. A computer system comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein: the processor, when executing the computer program, performs the steps of the method according to any of claims 1 to 8.