CN114828146A - Routing method for geographical position of unmanned cluster based on neural network and iterative learning - Google Patents

Routing method for geographical position of unmanned cluster based on neural network and iterative learning

Info

Publication number
CN114828146A
Authority
CN
China
Prior art keywords
node
neighbor
weighting matrix
training
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210412303.9A
Other languages
Chinese (zh)
Inventor
郑墨泓
李勇
李新宇
姜虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 7 Research Institute
Original Assignee
CETC 7 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 7 Research Institute
Priority to CN202210412303.9A
Publication of CN114828146A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W40/00 Communication routing or communication path finding
    • H04W40/02 Communication route or path selection, e.g. power-based or shortest path routing
    • H04W40/20 Communication route or path selection, e.g. power-based or shortest path routing based on geographic position or location
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B7/00 Radio transmission systems, i.e. using radiation field
    • H04B7/14 Relay systems
    • H04B7/15 Active relay systems
    • H04B7/185 Space-based or airborne stations; Stations for satellite systems
    • H04B7/18502 Airborne stations
    • H04B7/18506 Communications with or from aircraft, i.e. aeronautical mobile service
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W40/00 Communication routing or communication path finding
    • H04W40/02 Communication route or path selection, e.g. power-based or shortest path routing
    • H04W40/12 Communication route or path selection, e.g. power-based or shortest path routing based on transmission quality or channel quality

Abstract

The invention discloses an unmanned cluster geographical position routing method based on a neural network and iterative learning, which comprises the following steps: before networking, training offline on historical data based on a neural network and iterative learning to obtain a prediction rule, with a plurality of value function sub-gradients designed into the training to prevent local convergence; during networking, generating a neighbor parameter prediction table according to the prediction rule; and selecting the next hop for forwarding according to the neighbor parameter prediction table and the requirements. Because learning and training on historical data are completed offline, the unmanned cluster does not need to train online during networking, which greatly reduces the computation and power consumption of the unmanned aerial vehicle and lightens the burden on the onboard computer and power supply. The large number of iterations and operations, and the large power consumption they would cause, are avoided, so the method suits onboard environments where computing resources and power are scarce.

Description

Routing method for geographical position of unmanned cluster based on neural network and iterative learning
Technical Field
The invention relates to the technical field of network switching, in particular to an unmanned cluster geographical position routing method based on a neural network and iterative learning.
Background
Unmanned aerial vehicles are highly flexible, can carry out tasks in place of people, reduce loss of life and property, and are low in cost. As unmanned aerial vehicle technology has matured, unmanned aerial vehicles have been widely applied in many fields. Because a single unmanned aerial vehicle cannot perform a large-scale task, the concept of the unmanned cluster was proposed. Cooperative work within an unmanned cluster requires that the unmanned aerial vehicles can communicate with one another, so the unmanned cluster needs to complete networking in the air, and a routing method suited to the unmanned cluster has become an important research topic.
The high-speed movement of unmanned aerial vehicles causes the topology to change drastically, and the nodes are sparse. In addition, some transmission tasks have strict real-time requirements. These characteristics and requirements pose great challenges to traditional routing protocols. Therefore, for aerial networking of unmanned aerial vehicles, the design and improvement of a routing method must mainly consider the kinematic and dynamic characteristics of the unmanned aerial vehicles and the unmanned cluster, as well as the network characteristics of the unmanned cluster, so as to ensure the stability and reliability of the network.
Ad hoc routing protocols include static routing protocols, proactive routing protocols, reactive routing protocols, hybrid routing protocols, and location-based routing protocols. Static routing protocols are widely used in unmanned cluster networks, but their reliability is poor. Proactive routing protocols can serve networks whose topology changes frequently, but their control overhead is high, which makes them hard to fit to the narrow-band network of an unmanned cluster. Reactive routing protocols have low overhead, but the delay of the first packet is high. Hybrid routing protocols address some shortcomings of both proactive and reactive routing protocols, but it is difficult to define the scope in which the proactive protocol is used. Location-based routing protocols make full use of the position information of the unmanned cluster, which reduces both control overhead and delay, so they are better suited to unmanned cluster networking than the other four types.
Machine learning can learn from and train on historical data, summarize regularities, and predict the future. Applied to unmanned cluster routing, it learns the regularities of networking from the data of historical tasks. Thus, compared with traditional routing protocols that do not use machine learning, a machine-learning-based routing protocol or method can predict and optimize routes.
As machine learning algorithms have matured, many researchers have begun to use them to design and improve routing methods. Machine learning algorithms widely used in routing include reinforcement learning, neural networks and decision trees, and they can take requirements such as low delay and load balancing into account. However, the existing work has several limitations.
Some studies are better suited to special scenarios, such as the Internet of Vehicles, where the height difference between nodes is small, severely delayed underwater network environments, and intertidal zones. Examples include the reinforcement-learning-based multicast routing method for the urban Internet of Vehicles proposed by Xidian University, the machine-learning-based low-delay routing method for intertidal-zone sensor networks proposed by Zhejiang University, the position-information-based reinforcement learning routing method for the Internet of Vehicles proposed by Nanjing University of Posts and Telecommunications, and the Q-learning ant colony routing method for underwater multi-agent systems proposed by Tsinghua University.
Other studies focus on multicast, multiple data centers, software-defined networks, delay-tolerant networks and the like. Examples include the reinforcement-learning-based energy-saving routing method and system for multiple data centers proposed by Shandong University, the machine-learning-based optimal path selection algorithm under SDN proposed by Xi'an Jiaotong University, the intelligent routing decision method based on the DDPG reinforcement learning algorithm proposed by Xidian University, and the software-defined network routing method based on multi-agent reinforcement learning proposed by Nanjing University of Science and Technology.
In addition, some studies do not consider that the resources of an airborne system, including power supply and computing resources, are limited, so a routing protocol that learns online is difficult to run on an unmanned aerial vehicle. Examples include the intelligent routing method based on deep reinforcement learning in a wireless network environment proposed by Nanjing Tech University, the routing optimization method and system based on a graph neural network and deep reinforcement learning proposed by Huazhong University of Science and Technology, the network route planning method and system based on a BP neural network and ant colony algorithm proposed by Zhejiang University of Technology, and the Q-learning-based QoS routing method for flying ad hoc networks proposed by the Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences.
Finally, some machine-learning-based routing protocols for unmanned systems do not consider frequent node movement, large moving ranges or even node damage, and are not designed in combination with the geographic positions of the unmanned cluster, so their efficiency is low. Examples include the adaptive routing method and system for unmanned system networks based on deep reinforcement learning proposed by the Institute of Computing Technology, Chinese Academy of Sciences, the reinforcement-learning-based mobile ad hoc network routing method proposed by Shanghai Dynasty institute of technology, and the distributed intelligent routing method for unmanned aerial vehicle network slices proposed by the University of Electronic Science and Technology of China.
Disclosure of Invention
The invention provides an unmanned cluster geographical position routing method based on a neural network and iterative learning, which aims to overcome the defects and shortcomings of the prior art.
In order to realize the purpose of the invention, the technical scheme is as follows:
an unmanned cluster geographical position routing method based on neural network and iterative learning, the method comprises the following steps:
before networking, off-line training is carried out on historical data based on a neural network and iterative learning, a prediction rule is obtained through training, and a plurality of value function sub-gradients are designed during training to prevent local convergence, wherein the value function sub-gradients comprise a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient;
when networking, generating a neighbor parameter prediction table according to a prediction rule;
and selecting the next hop for forwarding according to the neighbor parameter prediction table and the requirement.
Preferably, the historical data includes a set of m airborne mobile nodes, the neighbor set of each node, and the transmission characteristic parameters.
Further, the transmission characteristic parameters include: the distance between the current node and the neighbor node; the distance between the destination node and the neighbor node; the included angle between the relative position vector of the neighbor node and the current node and the relative position vector of the destination node and the current node; the included angle between the velocity vector of the neighbor node relative to the current node and the relative position vector of the neighbor node and the current node; the transmission bandwidth between the current node and the neighbor node; the time delay between the current node and the neighbor node; the hop count; and the transmission time; wherein the current node and the destination node do not coincide.
Further, the two distances, the two included angles, the transmission bandwidth and the time delay listed above are input to the neural network as the training inputs for off-line training;
and the hop count and the transmission time are taken as the corresponding output pair of the neural network.
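The six input quantities can be derived from node positions and velocities with ordinary vector arithmetic. The following is a minimal sketch, not the patent's own formulas (those are in figures not reproduced here): the function names, the argument order and the choice to return 0.0 for a zero-length vector are all assumptions made for illustration.

```python
import math

def sub(a, b):
    """Component-wise difference a - b of two equal-length vectors."""
    return tuple(x - y for x, y in zip(a, b))

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

def angle_between(u, v):
    """Angle in radians between u and v; 0.0 if either vector has zero length (assumed convention)."""
    nu, nv = norm(u), norm(v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos_theta = sum(x * y for x, y in zip(u, v)) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp against rounding error

def transmission_features(pos_c, vel_c, pos_b, vel_b, pos_d, bandwidth, delay):
    """Hypothetical helper: build the six network inputs for one (current, neighbor, destination) triple."""
    r_cb = sub(pos_b, pos_c)  # neighbor position relative to the current node
    r_cd = sub(pos_d, pos_c)  # destination position relative to the current node
    v_cb = sub(vel_b, vel_c)  # neighbor velocity relative to the current node
    return [
        norm(r_cb),                  # distance current node <-> neighbor
        norm(sub(pos_d, pos_b)),     # distance destination <-> neighbor
        angle_between(r_cb, r_cd),   # angle between the two relative position vectors
        angle_between(v_cb, r_cb),   # angle between relative velocity and relative position
        bandwidth,                   # link bandwidth current node <-> neighbor
        delay,                       # link delay current node <-> neighbor
    ]
```

Positions and velocities are plain tuples here; in practice they would come from the navigation system mentioned in the description.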
Still further, the off-line training of the historical data based on the neural network and the iterative learning comprises the following specific steps:
S101: selecting the value of each constant parameter, including the numbers of neurons in the first hidden layer and the second hidden layer of the neural network (Figure BDA0003604456630000031 and Figure BDA0003604456630000032), the coefficients $\kappa_{hh}$ and $\kappa_{hy}$, the magnitudes of the diagonal matrices $\lambda_y$, $\lambda_h$ and $\lambda_u$, the value function standard value $V_{req}$, the maximum number of iterations $k_{max}$, and the training step length $\eta$;
S102: initializing a first weighting matrix from the input layer nodes to the first hidden layer nodes, a second weighting matrix from the first hidden layer nodes to the second hidden layer nodes, and a third weighting matrix from the second hidden layer nodes to the output layer nodes; wherein k represents the iteration number;
S103: for n groups of training pairs, respectively calculating the output of the first hidden layer nodes, the output of the second hidden layer nodes and the output pair matrix of the k-th iteration, wherein the output pair matrix comprises a hop count predicted value and a transmission time predicted value;
S104: according to the calculated output of the first hidden layer nodes, output of the second hidden layer nodes and output pair matrix, respectively calculating the first value function $V_{uh}(k)$, the second value function $V_{hh}(k)$ and the third value function $V_{hy}(k)$ corresponding to the first weighting matrix, the second weighting matrix and the third weighting matrix;
S105: if $\|V(k)\|_\infty \le V_{req}$, wherein $V(k) = [V_{uh}(k)\ V_{hh}(k)\ V_{hy}(k)]$ is the value function matrix and $\|\cdot\|_\infty$ is the infinity norm, or if the iteration number k reaches the set maximum number $k_{max}$, ending the training and entering step S111; otherwise, entering step S106;
S106: respectively calculating the value function sub-gradients corresponding to the first weighting matrix, the second weighting matrix and the third weighting matrix, wherein the value function sub-gradients comprise a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient;
S107: respectively calculating the value function gradients corresponding to the first weighting matrix, the second weighting matrix and the third weighting matrix according to the calculated steepest descent sub-gradient, proportional sub-gradient and differential sub-gradient;
S108: entering the next iteration, with k = k + 1;
S109: updating the first weighting matrix, the second weighting matrix and the third weighting matrix according to the corresponding value function gradients obtained in the previous iteration;
S110: repeating steps S103-S105 until the iteration ending condition is met, and entering step S111;
S111: taking the updated first weighting matrix, second weighting matrix and third weighting matrix as the optimal weighting matrices and as the prediction rule.
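The control flow of steps S103 to S111 can be sketched as a loop with two exits: the infinity norm of the value functions falling to the standard value, or the iteration count reaching its maximum. The sketch below shows only that control flow. The weights are flattened into a plain list, and `value_fn` and `gradient_fn` are hypothetical callables standing in for the patent's value functions and for the combination of the three sub-gradients, whose formulas are in figures not reproduced here.

```python
def train(weights, value_fn, gradient_fn, v_req, k_max, eta):
    """Iterate until max|V(k)| <= v_req or k reaches k_max; returns (weights, k).

    weights     : flat list of floats (an illustrative simplification of the three matrices)
    value_fn    : weights -> list of value-function entries V(k)          (assumed interface)
    gradient_fn : (weights, k) -> list of combined gradient entries       (assumed interface)
    """
    k = 0
    while True:
        values = value_fn(weights)                                # S103-S104: evaluate V(k)
        if max(abs(v) for v in values) <= v_req or k >= k_max:    # S105: stop test
            return weights, k                                     # S111: weights are the prediction rule
        grads = gradient_fn(weights, k)                           # S106-S107: combined gradient
        k += 1                                                    # S108: next iteration
        weights = [w - eta * g for w, g in zip(weights, grads)]   # S109: update with step eta

# Toy usage with a quadratic stand-in objective (not the patent's value functions).
final_weights, iterations = train(
    weights=[1.0, -2.0, 0.5],
    value_fn=lambda w: [x * x for x in w],
    gradient_fn=lambda w, k: [2.0 * x for x in w],
    v_req=1e-6,
    k_max=100,
    eta=0.25,
)
```

Passing the gradient as a callable keeps the loop independent of how the steepest descent, proportional and differential sub-gradients are actually combined.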
Further, a neighbor parameter prediction table is generated according to the prediction rule, which specifically includes:
S201: during networking, obtaining the information of the current node $X_c$ and the destination node $X_d$;
S202: if the destination node $X_d \in \chi_{c,B}$, directly selecting $X_d$ as the next hop; otherwise, entering the next step;
S203: obtaining the feasible next-hop set $\chi_{c,B}$ from the neighbor information, selecting a neighbor node $X_j \in \chi_{c,B}$, wherein $j \in [1, m]$ (Figure BDA0003604456630000041), and predicting, according to the prediction rule, the hop count and the transmission time for sending the message to the destination node when $X_j$ is selected as the next hop;
S204: traversing all the neighbor nodes, repeating step S203 to predict the hop count and the transmission time for each neighbor node, and obtaining the neighbor parameter prediction table.
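Steps S201 to S204 amount to one pass over the neighbor set with a shortcut when the destination is itself a neighbor. A minimal sketch follows; the node identifiers, the `features_fn` and `predict_fn` callables and the dictionary layout of the table are assumptions for illustration, not details given in the patent.

```python
def build_prediction_table(current, destination, neighbors, features_fn, predict_fn):
    """Return (direct_next_hop, table).

    direct_next_hop is the destination itself when it is a neighbor (S202), else None.
    table maps each neighbor id to its predicted (hop_count, transmission_time).
    features_fn(current, neighbor, destination) and predict_fn(features) are
    hypothetical stand-ins for feature extraction and the trained prediction rule.
    """
    if destination in neighbors:                       # S202: destination is one hop away
        return destination, {}
    table = {}
    for j in neighbors:                                # S203-S204: visit every neighbor
        features = features_fn(current, j, destination)
        hops, t = predict_fn(features)
        table[j] = (hops, t)
    return None, table
```

Only local neighbor information is needed, which matches the statement in the description that the table is generated without knowing the topology of the whole network.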
Further, the hop count (Figure BDA0003604456630000042) and the transmission time (Figure BDA0003604456630000043) for sending the message to the destination node when $X_j$ is selected as the next hop are predicted from the transmission characteristic parameters, expressed by the following formula:
Figure BDA0003604456630000044
wherein Figure BDA0003604456630000045 denotes the distance between the current node and the neighbor node, Figure BDA0003604456630000046 denotes the distance between the destination node and the neighbor node, Figure BDA0003604456630000051 denotes the included angle between the relative position vector of the neighbor node and the current node and the relative position vector of the destination node and the current node, Figure BDA0003604456630000052 denotes the included angle between the velocity vector of the neighbor node relative to the current node and the relative position vector of the neighbor node and the current node, Figure BDA0003604456630000053 denotes the transmission bandwidth between the current node and the neighbor node $X_j$, and Figure BDA0003604456630000054 denotes the time delay between the current node and the neighbor node $X_j$.
Preferably, after the next hop is selected for forwarding according to the neighbor parameter prediction table and the requirements, the current routing information is stored in a database; before networking, the historical data is obtained from the database.
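The patent leaves "the requirements" open. As one possible reading, the selection can be sketched as picking the table entry that minimizes either predicted hop count or predicted transmission time. The requirement names `"min_hops"` and `"min_time"`, the tie-breaking rule and the table layout are assumptions made for this illustration.

```python
def select_next_hop(table, requirement="min_time"):
    """Pick a next hop from a table mapping neighbor id -> (hop_count, transmission_time).

    requirement is a hypothetical policy name:
      "min_hops" : fewest predicted hops, ties broken by shorter predicted time
      "min_time" : shortest predicted time, ties broken by fewer predicted hops
    Returns None when the table is empty.
    """
    if not table:
        return None
    if requirement == "min_hops":
        key = lambda j: (table[j][0], table[j][1])
    elif requirement == "min_time":
        key = lambda j: (table[j][1], table[j][0])
    else:
        raise ValueError(f"unknown requirement: {requirement!r}")
    return min(table, key=key)

# Example table with made-up predictions for three neighbors.
example_table = {"A": (3, 0.20), "B": (2, 0.35), "C": (2, 0.30)}
fewest_hops = select_next_hop(example_table, "min_hops")
fastest = select_next_hop(example_table, "min_time")
```

A delay-sensitive task would use the time-based policy, while a task that wants to limit relaying load might prefer the hop-based one; other weighted combinations are equally plausible.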
An aerial mobile device comprises an offline training module, a prediction module and a forwarding selection module;
before networking, the offline training module trains on historical data to obtain a prediction rule, and a plurality of value function sub-gradients, comprising a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient, are designed into the training to prevent local convergence;
when networking, the prediction module generates a neighbor parameter prediction table according to a prediction rule;
and the forwarding selection module selects the next hop for forwarding according to the neighbor parameter prediction table and according to requirements, and stores the current routing information into a database.
A computer system comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the unmanned cluster geographical position routing method based on a neural network and iterative learning.
The invention has the following beneficial effects:
1. The invention is oriented to aerial networking of unmanned clusters and combines a neural network with iterative learning to design a routing method based on geographic position; it makes full use of the navigation function and motion characteristics of the unmanned cluster and adapts to adversarial environments in which the network topology changes frequently.
2. In addition, the invention adopts offline training: learning and training on historical data are completed offline, and the unmanned cluster does not need to train online during networking, which greatly reduces the computation and power consumption of the unmanned aerial vehicle and lightens the burden on the onboard computer and power supply. The large number of iterations and operations, and the large power consumption they would cause, are avoided, so the method suits onboard environments where computing resources and power are scarce.
Drawings
FIG. 1 is a flow chart of the routing method of the unmanned cluster geographic position based on neural network and iterative learning.
Fig. 2 is a flow chart showing the detailed steps of the method of the present invention.
FIG. 3 is a diagram illustrating the physical significance of data among a current node, a neighbor node, and a destination node according to the present invention.
FIG. 4 is a flow chart of the operation of the off-line training of the present invention.
FIG. 5 is a diagram of the structure and training process of the neural network of the present invention.
FIG. 6 is a graphical illustration of historical data for the present invention.
Figure 7 is a flowchart of the operation of the present invention to generate a neighbor parameter prediction table and forwarding selection.
Fig. 8 is an exemplary diagram of neighbor node information of the present invention.
Fig. 9 is transmission parameter characteristic information of the node in fig. 7.
Fig. 10 is a representation diagram of neighbor parameter prediction in accordance with the present invention.
Fig. 11 is a system schematic diagram of an over-the-air mobile device according to the present invention.
Detailed Description
The invention is described in detail below with reference to the drawings and the detailed description.
Example 1
Considering that the positions of the unmanned aerial vehicle nodes in an unmanned cluster change frequently but follow regular patterns, and that these patterns are related to the tasks executed by the cluster and to its kinematics and dynamics, the invention combines a neural network with iterative learning to design a position-based routing method that can adapt to a network environment with frequent topology changes. Offline training is adopted to avoid heavy computation and storage on the onboard computer and the large power consumption they would cause, so the method suits the airborne environment. The airborne mobile nodes involved in this embodiment include, but are not limited to, unmanned aerial vehicle nodes, balloon nodes, glider nodes, airship nodes, airplane nodes and helicopter nodes. This embodiment takes unmanned aerial vehicle nodes as an example and is described in detail as follows:
as shown in fig. 1 and fig. 2, an unmanned cluster geographical location routing method based on neural network and iterative learning includes the following steps:
before networking, offline training is carried out on historical data based on a neural network and iterative learning, a prediction rule is obtained through training, and a plurality of value function sub-gradients including a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient are designed during training to prevent local convergence;
when networking, generating a neighbor parameter prediction table according to a prediction rule without knowing the topology of the whole network;
and selecting the next hop for forwarding according to the neighbor parameter prediction table and the requirement.
In a specific embodiment, the historical data includes the unmanned aerial vehicle node set (Figure BDA0003604456630000061) $\chi = \{X_1, X_2, \ldots, X_m\}$, the neighbor set (Figure BDA0003604456630000062) of each node $X_i \in \chi$, and the transmission characteristic parameters. As shown in fig. 3, for a message sent from the current node $X_c \in \chi$ through the next hop $X_b \in \chi_{i,B}$ to the destination node $X_d$, the transmission characteristic parameters include: the distance between the current node and the neighbor node (Figure BDA0003604456630000063); the distance between the destination node and the neighbor node (Figure BDA0003604456630000064); the included angle (Figure BDA0003604456630000073) between the relative position vector of the neighbor node and the current node (Figure BDA0003604456630000071) and the relative position vector of the destination node and the current node (Figure BDA0003604456630000072); the included angle (Figure BDA0003604456630000075) between the velocity vector of the neighbor node relative to the current node (Figure BDA0003604456630000074) and the relative position vector of the neighbor node and the current node; the transmission bandwidth between the current node and the neighbor node (Figure BDA0003604456630000076); the time delay between the current node and the neighbor node (Figure BDA0003604456630000077); the hop count (Figure BDA0003604456630000078); and the transmission time (Figure BDA0003604456630000079); wherein $c \neq d$ indicates that the current node and the destination node do not coincide. One group of data is therefore:
Figure BDA00036044566300000710
wherein n is the data sequence number.
The included angle (Figure BDA00036044566300000712) between the velocity vector of the neighbor node relative to the current node (Figure BDA00036044566300000711) and the relative position vector of the neighbor node and the current node can be calculated according to the following formula:
Figure BDA00036044566300000713
wherein Figure BDA00036044566300000714 is the position vector of the destination node relative to the neighbor node, and Figure BDA00036044566300000715 and Figure BDA00036044566300000716 are both obtained by navigation.
In a specific embodiment, the following are input to the neural network as the training inputs for off-line training: the distance between the current node and the neighbor node (Figure BDA00036044566300000717); the distance between the destination node and the neighbor node (Figure BDA00036044566300000718); the included angle (Figure BDA00036044566300000721) between the relative position vector of the neighbor node and the current node (Figure BDA00036044566300000719) and the relative position vector of the destination node and the current node (Figure BDA00036044566300000720); the included angle (Figure BDA00036044566300000723) between the velocity vector of the neighbor node relative to the current node (Figure BDA00036044566300000722) and the relative position vector of the neighbor node and the current node; the transmission bandwidth between the current node and the neighbor node (Figure BDA00036044566300000724); and the time delay between the current node and the neighbor node (Figure BDA00036044566300000725). The hop count (Figure BDA00036044566300000726) and the transmission time (Figure BDA00036044566300000727) are taken as the corresponding output pair of the neural network.
That is, Figure BDA00036044566300000728 is input to the neural network as the training input for off-line training, and Figure BDA00036044566300000729 is taken as the corresponding output pair of the neural network.
In this embodiment, the neural network is trained offline so that it produces the output pair from these inputs.
This embodiment adopts off-line training, which avoids heavy computation and storage on the onboard computer and the large power consumption they would cause, and thus suits the airborne environment.
In a specific embodiment, the neural network described in this embodiment sequentially includes an input layer, a first hidden layer, a second hidden layer, and an output layer.
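The four-layer structure can be sketched as a forward pass through the three weighting matrices $W_{uh}$, $W_{hh}$ and $W_{hy}$ named in the training steps. The patent's exact calculation is in a figure not reproduced here, so the tanh activation, the linear output layer, the absence of bias terms, the hidden-layer sizes and the initialization range below are all assumptions chosen for illustration.

```python
import math
import random

def matvec(w, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]

def init_matrix(rows, cols, rng):
    """Small random weights in [-0.5, 0.5] (assumed initialization)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def forward(x, w_uh, w_hh, w_hy):
    """Return (z1, z2, y): outputs of the first hidden layer, second hidden layer and output layer."""
    z1 = [math.tanh(s) for s in matvec(w_uh, x)]    # input layer -> first hidden layer
    z2 = [math.tanh(s) for s in matvec(w_hh, z1)]   # first hidden -> second hidden layer
    y = matvec(w_hy, z2)                            # second hidden -> output (hop count, time)
    return z1, z2, y

# Six inputs (two distances, two angles, bandwidth, delay) and two outputs;
# the hidden sizes 8 and 5 are arbitrary illustrative values.
rng = random.Random(0)
n_in, n_h1, n_h2, n_out = 6, 8, 5, 2
w_uh = init_matrix(n_h1, n_in, rng)
w_hh = init_matrix(n_h2, n_h1, rng)
w_hy = init_matrix(n_out, n_h2, rng)
z1, z2, y = forward([1.0, 2.0, 0.3, 1.2, 5.0, 0.01], w_uh, w_hh, w_hy)
```

After offline training, only this forward pass needs to run on the airborne node, which is where the reduction in onboard computation comes from.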
As shown in fig. 4, 5, and 6, the offline training of the historical data based on the neural network and the iterative learning specifically includes the following steps:
S101: selecting the value of each constant parameter, including the numbers of neurons in the first hidden layer and the second hidden layer of the neural network (Figure BDA0003604456630000081 and Figure BDA0003604456630000082), the coefficients $\kappa_{hh}$ and $\kappa_{hy}$, the magnitudes of the diagonal matrices $\lambda_y$, $\lambda_h$ and $\lambda_u$, the value function standard value $V_{req}$, the maximum number of iterations $k_{max}$, and the training step length $\eta$; wherein k represents the iteration number;
S102: initializing the off-line training parameters: the first weighting matrix from the input layer nodes to the first hidden layer nodes is Figure BDA0003604456630000083, the second weighting matrix from the first hidden layer nodes to the second hidden layer nodes is Figure BDA0003604456630000084, and the third weighting matrix from the second hidden layer nodes to the output layer nodes is Figure BDA0003604456630000085, wherein k represents the iteration number; the first weighting matrix, the second weighting matrix and the third weighting matrix are initialized to $W_{uh}(0)$, $W_{hh}(0)$ and $W_{hy}(0)$, and k = 0.
S103: for n groups of training pairs, respectively calculating, for the k-th iteration, the output of the first hidden layer nodes (Figure BDA0003604456630000086), the output of the second hidden layer nodes (Figure BDA0003604456630000087) and the output pair matrix (Figure BDA0003604456630000088), wherein the output pair matrix includes the hop count predicted value (Figure BDA0003604456630000089) and the transmission time predicted value (Figure BDA00036044566300000810).
A specific calculation method provided in this embodiment is as follows:
Figure BDA00036044566300000811
wherein Figure BDA00036044566300000812 is the training pair matrix;
s104: according to the calculated output of the first hidden layer node
Figure BDA00036044566300000813
Output of the second hidden layer node
Figure BDA00036044566300000814
And output pair matrix
Figure BDA00036044566300000815
Respectively calculating first weighting matrix
Figure BDA00036044566300000816
Second weighting matrix
Figure BDA00036044566300000817
Third weighting matrix
Figure BDA0003604456630000091
Corresponding first value function V uh (k) Second value function V hh (k) Third value function V hy (k);
A specific calculation formula provided in this embodiment is as follows:
Figure BDA0003604456630000092
wherein e is y (N, k) is the error of the N (N is less than or equal to N) th training set to the output node, e 1 (n, k) is the n-th set of virtual errors trained on the first hidden layer nodes, e 2 (n, k) is the n-th set of virtual errors, k, for training the second hidden layer nodes hh Not less than 0 and κ hy Not less than 0 is a constant parameter, | ·| non-woven phosphor 2 Is a two-norm.
Figure BDA0003604456630000093
where
Figure BDA0003604456630000094
is the theoretical value of the n-th training pair at the output node, z_{2,T}(n, k) is the virtual theoretical value of the second hidden layer node, z_{1,T}(n, k) is the virtual theoretical value of the first hidden layer node, and λ_y, λ_h and λ_u are positive diagonal matrices.
S105: if the value functions V_uh(k), V_hh(k) and V_hy(k) meet the requirement, that is, if ‖V(k)‖_∞ ≤ V_req, where V(k) = [V_uh(k) V_hh(k) V_hy(k)] is the value function matrix, ‖·‖_∞ is the infinity norm and V_req is the value function standard value, or if the iteration number k reaches the set maximum number k_max, ending the training and entering step S111; otherwise, entering step S106;
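The stopping test of step S105 can be written in a few lines. The sketch below assumes that V(k) is treated as a three-element vector, so that its infinity norm is the largest absolute entry; if it were instead treated as a 1×3 matrix with the induced norm, the sum of the absolute entries would be used. The function name `should_stop` is hypothetical.

```python
def should_stop(V_uh, V_hh, V_hy, V_req, k, k_max):
    """Return True when training should end and step S111 should be entered.

    Assumption: the infinity norm of V(k) = [V_uh V_hh V_hy] is taken as the
    vector norm, i.e. the largest absolute entry.
    """
    v_inf = max(abs(V_uh), abs(V_hh), abs(V_hy))
    return v_inf <= V_req or k >= k_max
```

For instance, with V_req = 0.05, value functions (0.01, 0.02, 0.005) end the training, while (0.2, 0.01, 0.01) continue to step S106 unless k has reached k_max.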
S106: respectively calculating the value function sub-gradients corresponding to the third weighting matrix W_hy(k), the second weighting matrix W_hh(k) and the first weighting matrix W_uh(k), the sub-gradients including the steepest descent sub-gradient, the proportional sub-gradient and the differential sub-gradient; in particular, the value function sub-gradients corresponding to the third weighting matrix W_hy(k) are respectively denoted as the steepest descent sub-gradient
Figure BDA0003604456630000095
the proportional sub-gradient
Figure BDA0003604456630000096
and the differential sub-gradient
Figure BDA0003604456630000097
the value function sub-gradients corresponding to the second weighting matrix W_hh(k) are respectively denoted as the steepest descent sub-gradient
Figure BDA0003604456630000098
the proportional sub-gradient
Figure BDA0003604456630000099
and the differential sub-gradient
Figure BDA00036044566300000910
and the value function sub-gradients corresponding to the first weighting matrix W_uh(k) are respectively denoted as the steepest descent sub-gradient
Figure BDA0003604456630000101
the proportional sub-gradient
Figure BDA0003604456630000102
and the differential sub-gradient
Figure BDA0003604456630000103
where
Figure BDA0003604456630000104
and
Figure BDA0003604456630000105
are calculated according to the following formulas:
Figure BDA0003604456630000106
Figure BDA0003604456630000107
Figure BDA0003604456630000108
where
Figure BDA0003604456630000109
is the element in row i and column j of the matrix
Figure BDA00036044566300001010
(·)' denotes the derivative,
Figure BDA00036044566300001011
represents the difference between the virtual theoretical value of W_hy(k),
Figure BDA00036044566300001012
and W_hy(k), and E(·) denotes the expectation.
Figure BDA00036044566300001013
and
Figure BDA00036044566300001014
are calculated according to the following formulas:
Figure BDA00036044566300001015
Figure BDA00036044566300001016
Figure BDA00036044566300001017
where
Figure BDA00036044566300001018
is the element in row i and column j of the matrix
Figure BDA00036044566300001019
and
Figure BDA00036044566300001020
represents the difference between the virtual theoretical value of W_hh(k),
Figure BDA00036044566300001021
and W_hh(k).
Figure BDA00036044566300001022
and
Figure BDA00036044566300001023
are calculated according to the following formulas:
Figure BDA00036044566300001024
Figure BDA00036044566300001025
Figure BDA00036044566300001026
where
Figure BDA00036044566300001027
is the element in row i and column j of the matrix
Figure BDA00036044566300001028
and
Figure BDA00036044566300001029
represents the difference between the virtual theoretical value of W_uh(k),
Figure BDA0003604456630000111
and W_uh(k).
In this embodiment, the sign of the difference between the virtual theoretical value and the iteration value of each weighting matrix is calculated according to the following formula:
Figure BDA0003604456630000112
S107: respectively calculating the value function gradients corresponding to the first weighting matrix, the second weighting matrix and the third weighting matrix from the calculated steepest descent sub-gradient, proportional sub-gradient and differential sub-gradient; the value function sub-gradients are used to prevent each weighting matrix from converging locally, and the specific calculation formula is as follows:
Figure BDA0003604456630000113
where G_hy(k) is the value function gradient corresponding to the third weighting matrix W_hy(k), G_hh(k) is the value function gradient corresponding to the second weighting matrix W_hh(k), and G_uh(k) is the value function gradient corresponding to the first weighting matrix W_uh(k).
S108: entering the next iteration, with k = k + 1;
S109: updating the first weighting matrix, the second weighting matrix and the third weighting matrix according to the corresponding value function gradients obtained in the previous iteration; in particular, updating the weighting matrices to W_uh(k), W_hh(k) and W_hy(k) according to the value function gradients G_uh(k-1), G_hh(k-1) and G_hy(k-1):
Figure BDA0003604456630000114
where the training step size η > 0 is a constant parameter.
S110: repeating the steps S103-S105 until the iteration ending condition is met, and entering the step S111;
S111: taking the updated first weighting matrix
Figure BDA0003604456630000115
second weighting matrix
Figure BDA0003604456630000116
and third weighting matrix
Figure BDA0003604456630000117
as the optimal weighting matrices and as the prediction rule.
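The control flow of steps S103 to S111 can be summarized as a loop. The sketch below is schematic only: `forward_all`, `value_functions` and `gradients` are hypothetical callables standing in for the formulas contained in the figures, and the update W(k) = W(k-1) − η·G(k-1) is an assumed plain descent step, since the patent's own update formula is not reproduced in this text.

```python
def train(W, forward_all, value_functions, gradients, eta, V_req, k_max):
    """Iterate S103-S110 and return (weights, iterations) as in S111.

    W is a dict {'uh': ..., 'hh': ..., 'hy': ...} of matrices (lists of rows).
    The three callables are placeholders supplied by the caller.
    """
    k = 0
    while True:
        outputs = forward_all(W)                   # S103: hidden and output values
        V = value_functions(W, outputs)            # S104: (V_uh, V_hh, V_hy)
        if max(abs(v) for v in V) <= V_req or k >= k_max:
            return W, k                            # S105 satisfied, go to S111
        G = gradients(W, outputs)                  # S106-S107: value function gradients
        k += 1                                     # S108
        W = {name: [[w - eta * g for w, g in zip(w_row, g_row)]
                    for w_row, g_row in zip(W[name], G[name])]
             for name in W}                        # S109: assumed descent update
```

The function builds new matrices on every pass rather than mutating its input, so the initial weights W_uh(0), W_hh(0) and W_hy(0) remain available to the caller.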
In a specific embodiment, as shown in fig. 7, the neighbor parameter prediction table is generated according to the prediction rule, which is as follows:
S201: during networking, obtaining information on the current node X_c and the destination node X_d;
S202: if the destination node is a neighbor of the current node, i.e. X_d ∈ χ_{c,B}, directly selecting X_d as the next hop; otherwise, entering the next step;
S203: obtaining the next-hop feasible solution set χ_{c,B} from the neighbor information, selecting a neighbor node X_j ∈ χ_B, where j ∈ [1, m],
Figure BDA0003604456630000121
and predicting, according to the transmission characteristic parameters, the hop count and the transmission time for sending the message to the destination node when X_j is selected as the next hop;
S204: traversing all the neighbor nodes and repeating step S203 to predict the hop count and the transmission time for each neighbor node, thereby obtaining the neighbor parameter prediction table.
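Steps S201 to S204 can be sketched as follows. The function name, the representation of nodes as plain identifiers, and the `features` and `predict` callables are hypothetical; `predict` stands in for the trained network used as the prediction rule.

```python
def neighbor_prediction_table(x_c, x_d, neighbors, features, predict):
    """Return (direct_next_hop, table) for the current node x_c.

    neighbors: identifiers of the neighbor nodes of x_c (feasible next hops).
    features(x_c, x_j, x_d): transmission characteristic parameters for x_j.
    predict(f): returns (hop count, transmission time) for a feature vector f.
    """
    if x_d in neighbors:              # S202: destination is a neighbor
        return x_d, {}
    table = {}
    for x_j in neighbors:             # S203-S204: one prediction per neighbor
        hops, time = predict(features(x_c, x_j, x_d))
        table[x_j] = (hops, time)
    return None, table
```

When the destination is itself a neighbor the table is left empty, because the next hop is chosen directly without any prediction.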
In this embodiment, in step S202, if the destination node is determined to be a neighbor node, i.e. X_d ∈ χ_{c,B}, the routing information is stored in the database.
In a specific embodiment, the next hop is selected for forwarding according to the neighbor parameter prediction table and the requirement, and the current routing information is stored in the database; before networking, the historical data is obtained from the database.
The neighbor parameter prediction table obtained in this embodiment is as follows:
Figure BDA0003604456630000122
In a specific embodiment, according to the transmission characteristic parameters, the hop count for sending the message to the destination node when X_j is selected as the next hop,
Figure BDA0003604456630000123
and the corresponding transmission time,
Figure BDA0003604456630000124
are predicted, expressed by the following formula:
Figure BDA0003604456630000125
where
Figure BDA0003604456630000126
denotes the distance between the current node and the neighbor node,
Figure BDA0003604456630000127
denotes the distance between the destination node and the neighbor node,
Figure BDA0003604456630000128
denotes the included angle between the relative position vector of the neighbor node with respect to the current node and the relative position vector of the destination node with respect to the current node,
Figure BDA0003604456630000131
denotes the included angle between the velocity vector of the current node and the relative position vector of the neighbor node with respect to the current node,
Figure BDA0003604456630000132
denotes the transmission bandwidth between the current node and the neighbor node X_j, and
Figure BDA0003604456630000133
denotes the time delay between the current node and the neighbor node X_j.
As shown in FIG. 8 and FIG. 9, assume again that the unmanned cluster includes 6 air mobile nodes. After the offline training is completed and during air networking, assume that the current node is node 2, i.e. X_c = X_2, the destination node is node 6, i.e. X_d = X_6, and the neighbors of node 2 are node 3, node 4 and node 5, i.e. χ_B = {X_3, X_4, X_5}. When the prediction module works, the following node information is obtained: distance of node 3 relative to node 2
Figure BDA0003604456630000134
Distance of node 6 relative to node 3
Figure BDA0003604456630000135
The angle between the position vector of node 3 relative to node 2 and the position vector of node 6 relative to node 2
Figure BDA0003604456630000136
The angle between the velocity vector of node 3 relative to node 2 and the position vector of node 6 relative to node 2
Figure BDA0003604456630000137
Bandwidth of nodes 2 and 3
Figure BDA0003604456630000138
Time delay of nodes 2 and 3
Figure BDA0003604456630000139
Distance of node 4 relative to node 2
Figure BDA00036044566300001310
Distance of node 6 relative to node 4
Figure BDA00036044566300001311
The angle between the position vector of node 4 relative to node 2 and the position vector of node 6 relative to node 2
Figure BDA00036044566300001312
The angle between the velocity vector of node 4 relative to node 2 and the position vector of node 6 relative to node 2
Figure BDA00036044566300001313
Bandwidth of nodes 2 and 4
Figure BDA00036044566300001314
Time delay of nodes 2 and 4
Figure BDA00036044566300001315
Distance of node 5 relative to node 2
Figure BDA00036044566300001316
Distance of node 5 relative to node 6
Figure BDA00036044566300001317
The angle between the position vector of node 5 relative to node 2 and the position vector of node 6 relative to node 2
Figure BDA00036044566300001318
The angle between the velocity vector of node 5 relative to node 2 and the position vector of node 6 relative to node 2
Figure BDA00036044566300001319
Bandwidth of nodes 2 and 5
Figure BDA00036044566300001320
Time delay of nodes 2 and 5
Figure BDA00036044566300001321
The optimal weighting matrix obtained by off-line training is used as a prediction rule for prediction to obtain a neighbor parameter prediction table, as shown in fig. 10, which includes the hop number prediction value from the node 2 to the node 6 when the node 3 is selected as the next hop
Figure BDA00036044566300001322
And transmission time prediction value
Figure BDA00036044566300001323
Hop count prediction from node 2 to node 6 when node 4 is selected as the next hop
Figure BDA00036044566300001324
And transmission time prediction value
Figure BDA00036044566300001325
Hop count prediction from node 2 to node 6 when node 5 is selected as the next hop
Figure BDA00036044566300001326
And transmission time prediction value
Figure BDA00036044566300001327
The forwarding selection module may select the next hop according to the criterion of low hop count or low transmission time based on the neighbor parameter prediction table, for example, when the requirement is low hop count, the selected next hop is
Figure BDA00036044566300001328
When the demand is that the transmission time is short, the next hop is selected to be
Figure BDA00036044566300001329
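The selection performed by the forwarding selection module amounts to taking a minimum over the neighbor parameter prediction table. In the sketch below the function name, the `requirement` keywords and the table values are hypothetical; the table maps each neighbor to a (hop count, transmission time) pair.

```python
def select_next_hop(table, requirement="hops"):
    """Pick the neighbor with the lowest predicted hop count or transmission time."""
    idx = {"hops": 0, "time": 1}[requirement]
    return min(table, key=lambda x_j: table[x_j][idx])

# Illustrative values only, not taken from the patent's FIG. 10.
example_table = {3: (3, 0.8), 4: (2, 1.2), 5: (4, 0.5)}
fewest_hops = select_next_hop(example_table, "hops")  # node 4
fastest = select_next_hop(example_table, "time")      # node 5
```

Keeping both predictions in one table lets the same prediction pass serve either requirement without re-running the network.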
Example 2
As shown in FIG. 11, an airborne mobile device includes an offline training module, a prediction module, and a forwarding selection module;
before networking, the offline training module trains on historical data to obtain a prediction rule, and a plurality of value function sub-gradients, including a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient, are designed during training to prevent local convergence;
when networking, the prediction module generates a neighbor parameter prediction table according to a prediction rule;
the forwarding selection module selects the next hop for forwarding according to the neighbor parameter prediction table and the requirement, and stores the current routing information into a database; prior to networking, historical data is obtained from a database.
The offline training module performs offline training on historical data based on a neural network and iterative learning, and the method of steps S101 to S111 in embodiment 1 is implemented.
The prediction module generates the neighbor parameter prediction table according to the prediction rule, implementing the method of steps S201 to S204 in embodiment 1.
The airborne mobile device described in this embodiment refers to a man-made flying apparatus that can leave the ground and fly, under human control, within the atmosphere or in the space outside the atmosphere, including but not limited to an unmanned aerial vehicle, a balloon, a glider, an airship, an airplane and a helicopter.
Example 3
A computer system comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following method steps when executing the computer program:
before networking, offline training is carried out on historical data based on a neural network and iterative learning, a prediction rule is obtained through training, and a plurality of value function sub-gradients including a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient are designed during training to prevent local convergence;
when networking, generating a neighbor parameter prediction table according to a prediction rule;
and selecting the next hop for forwarding according to the neighbor parameter prediction table and the requirement.
Where the memory and processor are connected by a bus, the bus may comprise any number of interconnected buses and bridges, the buses connecting together one or more of the various circuits of the processor and the memory. The bus may also connect various other circuits such as peripherals, voltage regulators, power management circuits, etc., which are well known in the art, and therefore, will not be described any further herein. A bus interface provides an interface between the bus and the transceiver. The transceiver may be one element or a plurality of elements, such as a plurality of receivers and transmitters, providing a means for communicating with various other apparatus over a transmission medium. The data processed by the processor is transmitted over a wireless medium via an antenna, which further receives the data and transmits the data to the processor.
It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims (10)

1. An unmanned cluster geographical position routing method based on a neural network and iterative learning, characterized in that the method comprises the following steps:
before networking, offline training is carried out on historical data based on a neural network and iterative learning, a prediction rule is obtained through training, and a plurality of value function sub-gradients including a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient are designed during training to prevent local convergence;
when networking, generating a neighbor parameter prediction table according to a prediction rule;
and selecting the next hop for forwarding according to the neighbor parameter prediction table and the requirement.
2. The neural network and iterative learning based unmanned cluster geographical position routing method of claim 1, wherein: the historical data comprises a set of m air mobile nodes, the neighbor set of each node, and transmission characteristic parameters.
3. The neural network and iterative learning based unmanned cluster geographical position routing method of claim 2, wherein: the transmission characteristic parameters comprise the distance between the current node and the neighbor node, the distance between the destination node and the neighbor node, the included angle between the relative position vector between the neighbor node and the current node and the relative position vector between the destination node and the current node, the included angle between the velocity vector between the neighbor node and the current node and the relative position vector between the neighbor node and the current node, the transmission bandwidth between the current node and the neighbor node, the time delay between the current node and the neighbor node, the hop count and the transmission time; wherein the current node and the destination node are not coincident.
4. The neural network and iterative learning based unmanned cluster geographical position routing method of claim 3, wherein: the distance between the current node and the neighbor node, the distance between the destination node and the neighbor node, the included angle between the relative position vector between the neighbor node and the current node and the relative position vector between the destination node and the current node, the included angle between the velocity vector between the neighbor node and the current node and the relative position vector between the neighbor node and the current node, the transmission bandwidth between the current node and the neighbor node, and the time delay between the current node and the neighbor node are taken as the training inputs for offline training of the neural network;
and the hop count and the transmission time are taken as the output pair of the neural network.
5. The neural network and iterative learning based unmanned cluster geographical position routing method of claim 4, wherein: the offline training of the historical data based on the neural network and iterative learning comprises the following specific steps:
S101: selecting the value of each constant parameter, including the numbers of neurons in the first hidden layer and the second hidden layer of the neural network
Figure FDA0003604456620000011
and
Figure FDA0003604456620000012
the coefficients κ_hh and κ_hy, the magnitudes of the diagonal matrices λ_y, λ_h and λ_u, the value function standard value V_req, the maximum number of iterations k_max, and the training step size η; wherein k represents the iteration number;
S102: initializing a first weighting matrix from the input layer node to the first hidden layer node, a second weighting matrix from the first hidden layer node to the second hidden layer node, and a third weighting matrix from the second hidden layer node to the output layer node;
S103: respectively calculating, for the n training pairs, the output of the first hidden layer node, the output of the second hidden layer node and the output pair matrix of the k-th iteration, wherein the output pair matrix comprises a hop count prediction value and a transmission time prediction value;
S104: respectively calculating the first value function V_uh(k), the second value function V_hh(k) and the third value function V_hy(k) corresponding to the first weighting matrix, the second weighting matrix and the third weighting matrix according to the calculated output of the first hidden layer node, output of the second hidden layer node and output pair matrix;
S105: if ‖V(k)‖_∞ ≤ V_req, where V(k) = [V_uh(k) V_hh(k) V_hy(k)] is the value function matrix and ‖·‖_∞ is the infinity norm, or if the iteration number k reaches the set maximum number k_max, ending the training and entering step S111; otherwise, entering step S106;
S106: respectively calculating the value function sub-gradients corresponding to the first weighting matrix, the second weighting matrix and the third weighting matrix, wherein the value function sub-gradients comprise a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient;
S107: respectively calculating the value function gradients corresponding to the first weighting matrix, the second weighting matrix and the third weighting matrix according to the calculated steepest descent sub-gradient, proportional sub-gradient and differential sub-gradient;
S108: entering the next iteration, with k = k + 1;
S109: updating the first weighting matrix, the second weighting matrix and the third weighting matrix according to the corresponding value function gradients obtained in the previous iteration;
S110: repeating the steps S103-S105 until the iteration ending condition is met, and entering the step S111;
S111: taking the updated first weighting matrix, second weighting matrix and third weighting matrix as the optimal weighting matrices and as the prediction rule.
6. The neural network and iterative learning based unmanned cluster geographical position routing method of claim 5, wherein: the neighbor parameter prediction table is generated according to the prediction rule, specifically as follows:
S201: during networking, obtaining information on the current node X_c and the destination node X_d;
S202: if the destination node satisfies
Figure FDA0003604456620000021
directly selecting X_d as the next hop; otherwise, entering the next step;
S203: obtaining the next-hop feasible solution set
Figure FDA0003604456620000031
from the neighbor information, selecting a neighbor node
Figure FDA0003604456620000032
where j ∈ [1, m],
Figure FDA0003604456620000033
and predicting, according to the prediction rule, the hop count and the transmission time for sending the message to the destination node when X_j is selected as the next hop;
S204: traversing all the neighbor nodes and repeating step S203 to predict the hop count and the transmission time for each neighbor node, thereby obtaining the neighbor parameter prediction table.
7. The neural network and iterative learning based unmanned cluster geographical position routing method of claim 6, wherein: according to the transmission characteristic parameters, the hop count for sending the message to the destination node when X_j is selected as the next hop
Figure FDA0003604456620000034
and the transmission time
Figure FDA0003604456620000035
are predicted, expressed by the following formula:
Figure FDA0003604456620000036
where
Figure FDA0003604456620000037
denotes the distance between the current node and the neighbor node,
Figure FDA0003604456620000038
denotes the distance between the destination node and the neighbor node,
Figure FDA0003604456620000039
denotes the included angle between the relative position vector of the neighbor node with respect to the current node and the relative position vector of the destination node with respect to the current node,
Figure FDA00036044566200000310
denotes the included angle between the velocity vector of the current node and the relative position vector of the neighbor node with respect to the current node,
Figure FDA00036044566200000311
denotes the transmission bandwidth between the current node and the neighbor node X_j, and
Figure FDA00036044566200000312
denotes the time delay between the current node and the neighbor node X_j.
8. The neural network and iterative learning based unmanned cluster geographical position routing method of claim 1, wherein: the next hop is selected for forwarding according to the neighbor parameter prediction table and the requirement, the current routing information is stored in a database, and the historical data is obtained from the database before networking.
9. An airborne mobile device, characterized in that the device comprises an offline training module, a prediction module and a forwarding selection module;
before networking, the offline training module trains on historical data to obtain a prediction rule, and a plurality of value function sub-gradients, including a steepest descent sub-gradient, a proportional sub-gradient and a differential sub-gradient, are designed during training to prevent local convergence;
during networking, the prediction module generates a neighbor parameter prediction table according to the prediction rule;
and the forwarding selection module selects the next hop for forwarding according to the neighbor parameter prediction table and the requirement, and stores the current routing information in a database.
10. A computer system comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein: the processor, when executing the computer program, performs the steps of the method according to any of claims 1 to 8.
CN202210412303.9A 2022-04-19 2022-04-19 Routing method for geographical position of unmanned cluster based on neural network and iterative learning Pending CN114828146A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210412303.9A CN114828146A (en) 2022-04-19 2022-04-19 Routing method for geographical position of unmanned cluster based on neural network and iterative learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210412303.9A CN114828146A (en) 2022-04-19 2022-04-19 Routing method for geographical position of unmanned cluster based on neural network and iterative learning

Publications (1)

Publication Number Publication Date
CN114828146A true CN114828146A (en) 2022-07-29

Family

ID=82504843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210412303.9A Pending CN114828146A (en) 2022-04-19 2022-04-19 Routing method for geographical position of unmanned cluster based on neural network and iterative learning

Country Status (1)

Country Link
CN (1) CN114828146A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116489738A (en) * 2023-06-25 2023-07-25 深圳市华曦达科技股份有限公司 QoS route model processing method and device based on wireless Mesh network
CN116489738B (en) * 2023-06-25 2023-09-19 深圳市华曦达科技股份有限公司 QoS route model processing method and device based on wireless Mesh network
CN117376934A (en) * 2023-12-08 2024-01-09 山东科技大学 Deep reinforcement learning-based multi-unmanned aerial vehicle offshore mobile base station deployment method
CN117376934B (en) * 2023-12-08 2024-02-27 山东科技大学 Deep reinforcement learning-based multi-unmanned aerial vehicle offshore mobile base station deployment method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination