CN113093124A - DQN algorithm-based real-time allocation method for radar interference resources - Google Patents
DQN algorithm-based real-time allocation method for radar interference resources
- Publication number: CN113093124A (application CN202110370353.0A)
- Authority: CN (China)
- Prior art keywords: unmanned aerial vehicle, radar, interference, jamming
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G01S7/38: Jamming means, e.g. producing false echoes
- G06F17/10: Complex mathematical operations
- G06N3/02: Neural networks; G06N3/08: Learning methods
- Y02T10/40: Engine management systems
Abstract
The invention belongs to the technical field of radar interference and specifically relates to a DQN-algorithm-based method for real-time allocation of radar interference resources. The invention introduces the DQN algorithm into the allocation of the interference pattern resources carried by unmanned aerial vehicles, overcomes the prior art's shortcomings in dynamic, real-time allocation, realizes real-time allocation of those resources from the start of a task to its completion, and can handle radars that switch among multiple working modes.
Description
Technical Field
The invention belongs to the technical field of radar interference, and particularly relates to a method for real-time allocation of radar interference resources based on a DQN algorithm.
Background
At present, more and more radars change their behavior automatically according to the surrounding environment, which places ever higher demands on the interference resource allocation strategy: the unmanned aerial vehicle must adaptively change its strategy in real time according to the radar parameters it obtains, so that the currently threatening radars are interfered with effectively, in real time, and quickly throughout the flight. Studying how the unmanned aerial vehicle's interference pattern resources can be allocated in real time as its flight distance grows is therefore of great significance.
Allocating interference pattern resources produces a large amount of accumulated data and computation, which places high demands on the unmanned aerial vehicle's ability to quickly allocate the interference pattern resources it carries. The existing algorithms applicable to this problem are traditional dynamic programming and population-based intelligent search. Both allocate the interference pattern resources carried by the unmanned aerial vehicle statically rather than dynamically: the allocation cannot change in real time with the vehicle's flight distance, in particular when the radar has multiple working modes (the multifunctional radar here is classified into three working modes: search, tracking, and guidance). To make up for these shortcomings, the invention introduces the DQN algorithm into the study of the unmanned aerial vehicle's interference pattern resource allocation, which compensates for both algorithms' deficits in dynamic, real-time allocation and handles radars that switch among multiple working modes.
Disclosure of Invention
The invention aims to provide a method for real-time allocation of radar interference resources based on a DQN algorithm.
The purpose of the invention is realized by the following technical scheme: the method comprises the following steps:
step 1: obtain the interference resource pool J = {j1, j2, ..., jx}, the radar resource pool P = {P1, P2, ..., Pm}, and the unmanned aerial vehicle group to be allocated jam = {jam1, jam2, ..., jamm}; obtain the required success rate SRmax for the task executed by the unmanned aerial vehicle group;
wherein x represents the number of interference patterns, and the number of unmanned aerial vehicles in the group equals the number of radars in the environment, namely m;
step 2: set the distance L from the unmanned aerial vehicle's starting point to the task point, the number of iteration steps num, and the maximum capacity Dmax of the experience replay pool; initialize t = 1 and the states S1 = {s11, s21, ..., sm1} of the m radars; initialize the experience replay pool D;
wherein sut represents the state of unmanned aerial vehicle jamu interfering with radar Pi at step t, comprising the accumulated flight distance of jamu at step t and fucli(t), the state of radar Pi at step t; u = 1, 2, ..., m; i = 1, 2, ..., m;
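The resource pools and per-pair states described in steps 1 and 2 can be sketched as plain data structures. This is an illustrative assumption, not the patent's own implementation; field names such as `flown` and `radar_mode` are invented for the sketch:

```python
from dataclasses import dataclass

@dataclass
class DroneRadarState:
    """State s_ut of drone jam_u jamming its paired radar P_i at step t."""
    flown: float = 0.0     # accumulated flight distance of the drone
    radar_mode: int = 0    # radar working mode: 0=search, 1=track, 2=guide

def make_initial_state(m: int) -> list:
    """S_1 = {s_11, ..., s_m1}: one state per drone/radar pair at t = 1."""
    return [DroneRadarState() for _ in range(m)]

states = make_initial_state(4)  # m = 4 pairs, as in the embodiment below
```

Because the pairing is one-to-one, there are exactly m such states, one per unmanned aerial vehicle.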
step 3: select the interference action At = {a1t, a2t, ..., amt} to be executed by the unmanned aerial vehicle group using a greedy strategy;
wherein aut = {Pi, jk} denotes that unmanned aerial vehicle jamu performs jamming action jk on radar Pi at step t, jk ∈ J;
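The greedy selection of step 3 can be sketched as an ε-greedy rule. Two assumptions here: ε is read, as in the later embodiment (ε = 0.9), as the probability of exploiting the best-valued action, and `q_values` stands in for the network's Q estimates:

```python
import random

def epsilon_greedy_action(q_values, n_actions, epsilon):
    """With probability epsilon pick the interference pattern with the
    highest Q value; otherwise explore a random pattern index."""
    if random.random() < epsilon:
        return max(range(n_actions), key=lambda a: q_values[a])
    return random.randrange(n_actions)

def select_joint_action(per_pair_q, n_actions, epsilon):
    """A_t = {a_1t, ..., a_mt}: one action index per drone/radar pair."""
    return [epsilon_greedy_action(q, n_actions, epsilon) for q in per_pair_q]
```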
step 4: after performing the interference action At, obtain the reward value Rt and the states St+1 of the m radars;
wherein, if radar Pi keeps its working mode unchanged, sit is unchanged; if radar Pi switches from search mode to tracking mode, or from tracking mode to guidance mode, sit increases; if radar Pi switches from guidance mode to tracking mode, from guidance mode to search mode, or from tracking mode to search mode, sit decreases;
step 5: store (St, At, Rt, St+1) in the experience replay pool D; if D has not reached its maximum capacity Dmax, set t = t + 1 and return to step 3; otherwise, execute step 6;
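The experience replay pool of step 5 behaves like a bounded FIFO buffer; a minimal sketch (class and method names are illustrative):

```python
from collections import deque
import random

class ReplayPool:
    """Experience replay pool D with maximum capacity D_max."""
    def __init__(self, d_max):
        self.d_max = d_max
        self.pool = deque(maxlen=d_max)  # oldest transitions drop out first

    def store(self, s_t, a_t, r_t, s_next):
        self.pool.append((s_t, a_t, r_t, s_next))

    def is_full(self):
        return len(self.pool) == self.d_max

    def sample(self, batch_size):
        """Random minibatch for the training step (step 6)."""
        return random.sample(self.pool, batch_size)
```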
step 6: initialize G1 = 0, G2 = 0; randomly sample a batch of samples from the experience pool D, input the combined state sit and action ait into the neural network for training, and use the DQN algorithm to correct the network's output for each state sit so that the network's output approaches the action ait;
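The correction applied in step 6 follows the standard DQN temporal-difference rule: the network's prediction Q(s_it, a_it) is regressed toward the target y = r + γ·max Q(s_{t+1}, a'). The gradient-free stand-in below (no actual neural network) only illustrates that rule:

```python
def dqn_target(reward, next_q_values, gamma, terminal=False):
    """TD target y = r + gamma * max_a' Q(s', a'); just r at episode end."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)

def dqn_correct(q_pred, next_q_values, reward, gamma, lr):
    """Move the predicted Q value a step of size lr toward the TD target."""
    y = dqn_target(reward, next_q_values, gamma)
    return q_pred + lr * (y - q_pred)
```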
step 7: use the trained neural network to predict the actions taken by the unmanned aerial vehicle group from step 1 to step num, and record whether the group successfully reaches the task point after num steps;
step 8: repeatedly execute step 7 and calculate the success rate sr of the task executed by the unmanned aerial vehicle group; if sr is greater than SRmax, end training and execute step 9; otherwise, return to step 2;
sr = G2/G1
wherein G1 is the total number of times step 7 is executed, i.e., the total number of flights of the unmanned aerial vehicle group, and G2 is the number of times the group completes the flight task;
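Steps 7 and 8 amount to a Monte-Carlo estimate of the task success rate; a sketch, with `fly_once` standing in (as an assumption) for one predicted num-step flight of the swarm:

```python
def evaluate_success_rate(fly_once, trials):
    """sr = G2 / G1: G1 counts every flight, G2 the successful ones."""
    g1 = g2 = 0
    for _ in range(trials):
        g1 += 1
        if fly_once():  # True if the swarm reached the task point
            g2 += 1
    return g2 / g1
```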
step 9: the neural network that satisfies the required task success rate is used for real-time allocation of the unmanned aerial vehicle group's radar interference resources: the states St of the m radars at a given moment are input into this network to obtain the interference action At taken by the unmanned aerial vehicle group, namely the real-time allocation result of the radar interference resources.
The invention has the beneficial effects that:
The invention introduces the DQN algorithm into the allocation of the interference pattern resources carried by unmanned aerial vehicles, overcomes the prior art's shortcomings in dynamic, real-time allocation, realizes real-time allocation of those resources from the start of a task to its completion, and can handle radars that switch among multiple working modes.
Drawings
Fig. 1 is a DQN learning diagram.
Fig. 2 is a flow chart of DQN algorithm training in conjunction with radar interference strategy assignment.
Fig. 3 is a conversion diagram of the operation mode of the multifunctional radar.
Fig. 4 shows the relationship between the radar and the radial position of the drone.
Fig. 5 is a TensorBoard visualization of the network graph.
Fig. 6 is the interference resource allocation diagram at t = 20 steps.
Fig. 7 is the interference resource allocation diagram at t = 40 steps.
Fig. 8 is the interference resource allocation diagram at t = 60 steps.
Fig. 9 is the interference resource allocation diagram at t = 80 steps.
FIG. 10 is a graph of error as a function of iteration number.
FIG. 11 is a graph of flight success rate as a function of iteration number.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention provides a DQN-based method suitable for dynamically allocating the interference pattern resources carried by an unmanned aerial vehicle; in particular, by handling radars with multiple working modes, it realizes real-time allocation of interference pattern resources from the start of the vehicle's task to its completion.
The invention uses the DQN algorithm as the solving tool; its network structure is shown in fig. 1. The network is applied to the allocation of interference resources against several multifunctional radars: on the basis of a complex electronic countermeasure environment, and with a one-to-one interference pairing, a dynamic allocation strategy that changes with the drone swarm's flight distance is studied. The flow of the whole scheme is shown in fig. 2 and comprises the following steps:
step 1: bringing the electronic countermeasure information into an interference resource pool J and a radar resource pool P ═ P1,P2,......,PmThe unmanned aerial vehicle group jam to be distributed is { jam ═ jam1,jam2,...,jamm}; wherein J ═ { J ═ J1,j2,......,jxX represents the number of interference patterns.
Ground radar resource pool P ═ { P ═ P1,P2,......,PmAnd m represents the number of radars in the environment. Pi={fucl,sys,pp,gr,qs},PiThe method comprises the steps of (1) representing an ith multifunctional radar, and fucl representing different working mode parameter sets of the multifunctional radar; qs represents a measure of the radar against interference; sys represents radar constitution, representing different radar types; pp denotes the peak power (KW) of the radar and gr denotes the radar antenna gain (dB).
Relevant parameters in fucl: fcl ═ pwj,bwj,prfj,rfjJ is 0-2, which represents three different working modes of the first multifunctional radar, wherein pwj,bwj,prfj,rfjThe radar signal pulse width, the receiver bandwidth, the pulse repetition frequency and the carrier frequency of the radar under different modes are respectively.
Unmanned aerial vehicle group jam ═ jam of interference resource to be distributed1,jam2,...,jammAnd m represents the number of drones. Wherein the ith unmanned plane is jami={pjam,gj,bwjam,J},pjamFor unmanned aerial vehicle power (W), gj for unmanned aerial vehicle antenna gain (dB), bwjamIs the drone bandwidth (MHz).
Step 2: setting of relevant parameters in DQN networks, Dnum(empirical playback set size), γ (reward discount factor), r (learning rate), ε (ε -greedy), C (number of network weight reset steps).
And step 3: training of the DQN network is started.The distance from a starting point to a task point of the unmanned aerial vehicle is L, the unmanned aerial vehicle is divided into num steps, t represents that the unmanned aerial vehicle flies t steps, the initialized t is 1, and the state S of m radars is detected1={s11,s21,...,sm1}; initializing an experience playback poolHaving a capacity of Dmax(ii) a Initializing a randomly generated weight θ1;
Wherein,express the u unmanned plane jamuThe state of the interference ith radar in the step t; jamuRepresents the accumulated flight distance of the unmanned plane during the step t, and
Step 4: select the interference action At = {a1t, a2t, ..., amt} to be executed by the drone swarm using a greedy strategy;
wherein ait = {Pi, jk} denotes that drone jami performs jamming action jk on radar Pi at step t, jk ∈ J;
Step 5: after performing the interference action At, obtain the states St+1 of the m radars and the reward value Rt, as given by formula (2);
wherein rt(i) denotes the reward obtained by interfering with the ith radar when the drone has flown t steps, and Rt denotes the total reward obtained by interfering with the m radars at step t.
When the ith radar goes from search mode to tracking mode and on to guidance mode, sit increases accordingly; in the reverse direction it decreases; if the radar keeps its working mode unchanged, sit is unchanged. The radar working-mode transitions are shown in fig. 3. The drone detects a radar's working-mode transition through changes in the fucl parameters defined in step 1.
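The sit update tied to the mode transitions of fig. 3 can be sketched as follows; encoding search/track/guide as 0/1/2 (so that an escalation raises the state value) and the unit step size are assumptions of this sketch:

```python
SEARCH, TRACK, GUIDE = 0, 1, 2  # assumed encoding of the working modes

def update_threat_state(s_it, old_mode, new_mode, step=1):
    """s_it rises on escalation (search->track, track->guide), falls on
    de-escalation, and is unchanged if the radar keeps its mode."""
    if new_mode > old_mode:
        return s_it + step
    if new_mode < old_mode:
        return s_it - step
    return s_it
```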
Step 6: store (St, At, Rt, St+1) in the experience pool D; if D does not yet contain enough samples, set t = t + 1 and return to step 4 until D is full. Otherwise, during training, a batch of samples is randomly drawn from the experience pool D every C steps to adjust the internal parameters of the training network.
Step 7: with the experience pool full, training starts, proceeding in order through steps 1 to num. Training the network means combining state sit and action ait and learning them in the neural network; exploiting the neural network's strengths and the DQN algorithm, the action ait taken in the current state sit is corrected at each step through the reward so that it gradually approaches the optimal action. If the drone's entire 1-to-num-step flight succeeds, G2 is incremented by 1; whether it fails or succeeds, G1 is incremented by 1.
Step 8: with G1 the total number of drone flights so far and G2 the number of flights that successfully completed the mission, the task success rate is given by formula (4); when sr exceeds the task requirement SRmax, training ends and step 9 follows; otherwise, continue from step 3.
sr = G2/G1 (4)
Step 9: at this point, training of the interference pattern resource allocation with the DQN algorithm is finished and the internal neural network parameters are trained. Inputting a state St into the trained DQN network now yields the corresponding optimal interference pattern resource allocation result.
Example 1:
The invention provides a DQN-based method suitable for dynamically allocating the interference pattern resources carried by an unmanned aerial vehicle; in particular, by handling radars with multiple working modes, it realizes real-time allocation of interference pattern resources from the start of the drone's task to its completion. To verify the method's effectiveness, it is applied as shown in fig. 2: the DQN algorithm allocates, in real time, the drone's interference resources as they change along the flight path.
The method comprises the following steps: obtaining an interference resource pool J and a radar resource pool P ═ { P ═ P1,P2,P3,P4Resource pool for confrontation environment E ═ E1,E2The unmanned aerial vehicle group jam to be distributed is { jam ═ jam1,jam2,jam3,jam4};
Wherein J ═ { J ═ J1,j2,j3,j4,j5,j6,j7},j1Representing noise frequency modulation suppressed interference, j2Representing noise frequency modulation suppressed interference, j3Representing smart noise convolution disturbances, j4Suppression of disturbances, j, representing dense decoys5Representing distance-trailing spoofing interference, j6Representing speed-pulling spoofing disturbances, j7Representing a combined range-velocity tow spoofing disturbance.
The ground radar resource pool is P = {P1, P2, P3, P4}; each established radar has the two basic anti-jamming capabilities of pulse compression and pulse accumulation. We denote a ranging radar by 0, a pulse-Doppler radar by 1, and an MTI (moving target indication) radar by 2.
wherein P1 = {fucl, 0, 320, 32, qs}, where qs adds the pulse leading-edge tracking anti-jamming measure; in the search state fucl = {32, 24, 0.3, 8.7}, and in the tracking state fucl = {15, 40, 1.2, 8.7}.
wherein P2 = {fucl, 1, 250, 33, qs}, where qs adds the clutter cancellation and pulse leading-edge tracking anti-jamming measures; in the search state fucl = {20, 24, 0.5, 10.3}, and in the tracking state fucl = {5, 60, 1.5, 11.1}.
wherein P3 = {fucl, 2, 180, 34, qs}, where qs adds the clutter cancellation and velocity discrimination anti-jamming measures; in the search state fucl = {15, 32, 0.8, 9.5}, and in the tracking state fucl = {8, 50, 1.8, 9.5}.
wherein P4 = {fucl, 1, 220, 33, qs}, where qs adds the clutter cancellation and velocity discrimination anti-jamming measures; in the search state fucl = {15, 32, 0.8, 11.8}, and in the tracking state fucl = {4, 60, 2.4, 11.8}.
The unmanned aerial vehicle group awaiting interference resource allocation is jam = {jam1, jam2, jam3, jam4}, where m = 4 is the number of drones. The ith drone is jami = {pjam, gj, bwjam, J}: pjam is the drone's power (W), gj its antenna gain (dB), and bwjam its bandwidth (MHz).
jam1 = {10, 9, 200, J}, jam2 = {10, 9, 200, J}, jam3 = {10, 9, 200, J}, jam4 = {10, 9, 200, J}, with J taken from j1 to j7.
rdm and jdm denote the position coordinates (km) of the mth radar and drone respectively. The specific coordinates are:
rd1 = [-30, 200], rd2 = [30, 120], rd3 = [-20, 40], rd4 = [20, 0]; jd1 = [0, 10], jd2 = [0, 10], jd3 = [0, 10], jd4 = [0, 10].
The position information of the drones and radars can thus be described in two-dimensional coordinates, and the distance between a drone and a radar can be computed from them; the variation of the radial distance between the drone and each radar is shown in fig. 4.
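With the coordinates above, the radial distance curves of fig. 4 reduce to plane Euclidean distances; a sketch using the embodiment's start positions:

```python
import math

# radar positions and the common drone start position, in km (from above)
rd = {1: (-30, 200), 2: (30, 120), 3: (-20, 40), 4: (20, 0)}
jd_start = (0, 10)

def radial_distance(drone_xy, radar_xy):
    """Euclidean distance between a drone and a radar in the 2-D plane."""
    return math.hypot(drone_xy[0] - radar_xy[0], drone_xy[1] - radar_xy[1])

d4 = radial_distance(jd_start, rd[4])  # distance to radar P4 at the start
```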
Step 2: set the relevant parameters of the DQN network: experience replay pool size D = 2000, reward discount factor γ = 0.9, learning rate r = 0.001, ε (ε-greedy) = 0.9, and network-weight reset interval C = 200 steps.
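Collected into one configuration object, the embodiment's hyperparameters read as follows; the field names are illustrative, the values are the ones set above:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DQNConfig:
    replay_size: int = 2000    # D, experience replay pool size
    gamma: float = 0.9         # reward discount factor
    lr: float = 0.001          # learning rate r
    epsilon: float = 0.9       # epsilon-greedy parameter
    sync_every: int = 200      # C, network-weight reset interval (steps)
    num_steps: int = 100       # flight discretized into num steps
    distance_km: float = 300.0 # start-to-task distance L

cfg = DQNConfig()
```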
Step 3: begin training the DQN network. The distance from the drone's starting point to the task point is 300 km, divided into 100 steps; t denotes that the drone has flown t steps. Initialize t = 1 and the states S1 = {s11, s21, s31, s41} of the m radars; initialize the experience replay pool D with capacity 2000; initialize the randomly generated weights θ1;
wherein sut denotes the state of the uth drone jamu interfering with the ith radar at step t, comprising the drone's accumulated flight distance at step t and the radar's working-mode state.
Step 4: select the interference action At = {a1t, a2t, a3t, a4t} to be executed by the drone swarm using a greedy strategy; wherein ait = {Pi, jk} denotes that drone jami performs jamming action jk on radar Pi at step t, jk ∈ J;
Step 5: after performing the interference action At, obtain the states St+1 of the m radars and the reward value Rt, as given by formula (6);
wherein rt(i) denotes the reward obtained by interfering with the ith radar when the drone has flown t steps, and Rt denotes the total reward obtained by interfering with the 4 radars at step t.
When the ith radar goes from search mode to tracking mode and on to guidance mode, sit increases accordingly; in the reverse direction it decreases; if the radar keeps its working mode unchanged, sit is unchanged. The radar working-mode transitions are shown in fig. 3. The drone detects a radar's working-mode transition through changes in the fucl parameters defined in step 1.
Step 6: store (St, At, Rt, St+1) in the experience pool D; if D does not yet contain enough samples, set t = t + 1 and return to step 4 until D is full. Otherwise, during training, a batch of samples is randomly drawn from the experience pool D every 200 steps to adjust the internal parameters of the training network.
Step 7: when the experience pool is full, training starts, proceeding in order through steps 1 to 100. Training the network means combining state sit and action ait and learning them in the neural network; exploiting the neural network's strengths and the DQN algorithm, the action ait taken in the current state sit is corrected at each step through the reward so that it gradually approaches the optimal action. If the drone's entire 100-step flight succeeds, G2 is incremented by 1; whether it fails or succeeds, G1 is incremented by 1. Both G1 and G2 start at zero.
Step 8: with G1 the total number of drone flights so far and G2 the number of flights that successfully completed the mission, the task success rate is given by formula (4); when sr exceeds the task requirement SRmax, training ends and step 9 follows; otherwise, continue from step 3.
Step 9: at this point, training of the interference pattern resource allocation with the DQN algorithm is finished and the internal neural network parameters are trained. Inputting a state St into the trained DQN network now yields the corresponding optimal interference pattern resource allocation result.
The results of dynamic interference resource allocation through the DQN network are shown in figs. 6 to 9, where t is the number of flight steps of the drone. After training and learning with the DQN algorithm, the optimal interference pattern resource allocation result under the current environment is obtained, together with the change of the DQN error function over iterations shown in fig. 10. The TensorBoard visualization of the network graph is shown in fig. 5.
2. Analysis of simulation results
The results of interference resource allocation in the simulation environment are shown in figs. 6 to 9. Over the swarm's entire flight, the allocation of interference resources changes dynamically with flight distance: at different moments, different jammers apply different interference patterns against the different multifunctional radars, and the flight task is completed. The experiment comprises 1600 simulation runs. Although the DQN error function still fluctuates by about 0.2 between runs 1200 and 1600, it essentially converges to between 0.1 and 0.3, so the interference resource allocation essentially converges. As the final flight success rate in fig. 11 shows, the success rate of the interference effect under the DQN algorithm finally stabilizes above 70%, and the overall interference allocation result is good over the whole interference process. This meets the requirement of dynamic allocation of interference resources and further verifies the feasibility and effectiveness of the established method.
The above description is only a preferred embodiment of the present invention and is not intended to limit it; those skilled in the art may make various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.
Claims (1)
1. A DQN algorithm-based real-time allocation method for radar interference resources is characterized by comprising the following steps:
step 1: obtain the interference resource pool J = {j1, j2, ..., jx}, the radar resource pool P = {P1, P2, ..., Pm}, and the unmanned aerial vehicle group to be allocated jam = {jam1, jam2, ..., jamm}; obtain the required success rate SRmax for the task executed by the unmanned aerial vehicle group;
wherein x represents the number of interference patterns, and the number of unmanned aerial vehicles in the group equals the number of radars in the environment, namely m;
step 2: set the distance L from the unmanned aerial vehicle's starting point to the task point, the number of iteration steps num, and the maximum capacity Dmax of the experience replay pool; initialize t = 1 and the states S1 = {s11, s21, ..., sm1} of the m radars; initialize the experience replay pool D;
wherein sut represents the state of unmanned aerial vehicle jamu interfering with radar Pi at step t, comprising the accumulated flight distance of jamu at step t and fucli(t), the state of radar Pi at step t; u = 1, 2, ..., m; i = 1, 2, ..., m;
step 3: select the interference action At = {a1t, a2t, ..., amt} to be executed by the unmanned aerial vehicle group using a greedy strategy;
wherein aut = {Pi, jk} denotes that unmanned aerial vehicle jamu performs jamming action jk on radar Pi at step t, jk ∈ J;
step 4: after performing the interference action At, obtain the reward value Rt and the states St+1 of the m radars;
wherein, if radar Pi keeps its working mode unchanged, sit is unchanged; if radar Pi switches from search mode to tracking mode, or from tracking mode to guidance mode, sit increases; if radar Pi switches from guidance mode to tracking mode, from guidance mode to search mode, or from tracking mode to search mode, sit decreases;
step 5: store (St, At, Rt, St+1) in the experience replay pool D; if D has not reached its maximum capacity Dmax, set t = t + 1 and return to step 3; otherwise, execute step 6;
step 6: initialize G1 = 0, G2 = 0; randomly sample a batch of samples from the experience pool D, input the combined state sit and action ait into the neural network for training, and use the DQN algorithm to correct the network's output for each state sit so that the network's output approaches the action ait;
step 7: predicting, with the trained neural network, the actions taken by the unmanned aerial vehicle group from step 1 to step num, and recording whether the unmanned aerial vehicle group successfully reaches the task point after num steps;
step 8: repeatedly executing step 7 and calculating the success rate sr of the unmanned aerial vehicle group in executing the task; if the success rate sr is larger than SR_max, ending the training and executing step 9; otherwise, returning to step 2;
sr = G_2 / G_1
wherein G_1 is the total number of times step 7 is executed, namely the total number of flights of the unmanned aerial vehicle group; G_2 is the number of times the unmanned aerial vehicle group completed the flight task;
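Steps 7-8 amount to a Monte Carlo estimate of the task success rate. A sketch, where `run_episode` is an assumed stand-in for one num-step flight that returns whether the swarm reached the task point:

```python
# Sketch of steps 7-8: fly the swarm G1 times with the trained policy,
# count the G2 successful flights, and return sr = G2 / G1.
def success_rate(run_episode, n_runs):
    g1 = g2 = 0
    for _ in range(n_runs):
        g1 += 1
        if run_episode():  # True if the swarm reached the task point
            g2 += 1
    return g2 / g1

outcomes = iter([True, False, True, True])
assert success_rate(lambda: next(outcomes), 4) == 0.75
```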
step 9: using the neural network that meets the required task success rate for the real-time allocation of the radar jamming resources of the unmanned aerial vehicle group: the state S_t of the m radars at a given moment is input into this neural network to obtain the jamming action A_t to be taken by the unmanned aerial vehicle group, namely the real-time allocation result of the radar jamming resources.
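At deployment (step 9), allocation reduces to a greedy argmax over the trained network's Q-values for the current radar state; `q_net` below is any callable returning Q(S_t, a) and is an illustrative stand-in for the trained network:

```python
# Step-9 deployment sketch: map the current radar state S_t to a jamming
# action A_t by greedy argmax over Q-values.
def allocate(q_net, state, n_actions):
    """Real-time allocation: A_t = argmax_a Q(S_t, a)."""
    qvals = [q_net(state, a) for a in range(n_actions)]
    return max(range(n_actions), key=lambda a: qvals[a])

# Toy Q-function for demonstration: prefers the action equal to state mod 3.
q_net = lambda s, a: 1.0 if a == s % 3 else 0.0
assert allocate(q_net, state=5, n_actions=3) == 2
```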
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110370353.0A CN113093124B (en) | 2021-04-07 | 2021-04-07 | DQN algorithm-based real-time allocation method for radar interference resources |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113093124A true CN113093124A (en) | 2021-07-09 |
CN113093124B CN113093124B (en) | 2022-09-02 |
Family
ID=76674257
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110370353.0A Active CN113093124B (en) | 2021-04-07 | 2021-04-07 | DQN algorithm-based real-time allocation method for radar interference resources |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113093124B (en) |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150260828A1 (en) * | 2012-10-27 | 2015-09-17 | Valeo Schalter Und Sensoren Gmbh | Method for suppressing interference in a received signal of a radar sensor of a motor vehicle and corresponding driver assistance device |
US9622133B1 (en) * | 2015-10-23 | 2017-04-11 | The Florida International University Board Of Trustees | Interference and mobility management in UAV-assisted wireless networks |
CN108710110A (en) * | 2018-04-11 | 2018-10-26 | 哈尔滨工程大学 | A kind of cognitive interference method based on Markov process decision |
CN108777872A (en) * | 2018-05-22 | 2018-11-09 | 中国人民解放军陆军工程大学 | Deep Q neural network anti-interference model and intelligent anti-interference algorithm |
CN109444832A (en) * | 2018-10-25 | 2019-03-08 | 哈尔滨工程大学 | Colony intelligence interfering well cluster method based on more jamming effectiveness values |
CN109862610A (en) * | 2019-01-08 | 2019-06-07 | 华中科技大学 | A kind of D2D subscriber resource distribution method based on deeply study DDPG algorithm |
CN109884599A (en) * | 2019-03-15 | 2019-06-14 | 西安电子科技大学 | A kind of radar chaff method, apparatus, computer equipment and storage medium |
CN110031807A (en) * | 2019-04-19 | 2019-07-19 | 电子科技大学 | A kind of multistage smart noise jamming realization method based on model-free intensified learning |
CN110515045A (en) * | 2019-08-30 | 2019-11-29 | 河海大学 | A kind of radar anti-interference method and system based on Q- study |
CN111199127A (en) * | 2020-01-13 | 2020-05-26 | 西安电子科技大学 | Radar interference decision method based on deep reinforcement learning |
CN111970072A (en) * | 2020-07-01 | 2020-11-20 | 中国人民解放军陆军工程大学 | Deep reinforcement learning-based broadband anti-interference system and anti-interference method |
CN112435275A (en) * | 2020-12-07 | 2021-03-02 | 中国电子科技集团公司第二十研究所 | Unmanned aerial vehicle maneuvering target tracking method integrating Kalman filtering and DDQN algorithm |
CN112543038A (en) * | 2020-11-02 | 2021-03-23 | 杭州电子科技大学 | Intelligent anti-interference decision method of frequency hopping system based on HAQL-PSO |
Non-Patent Citations (6)
Title |
---|
KOZY, M等: "Applying Deep-Q Networks to Target Tracking to Improve Cognitive Radar", 《2019 IEEE RADAR CONFERENCE (RADARCONF)》 * |
VAN HASSELT H等: "Deep reinforcement learning with double Q-Learning", 《NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE》 * |
张柏开 et al.: "Cognitive jamming decision method for multifunction radar based on Q-Learning", 《电讯技术》 * |
张柏开 et al.: "DQN cognitive jamming decision method for multifunction radar", 《***工程与电子技术》 * |
杨鸿杰 et al.: "Research on intelligent jamming algorithms based on reinforcement learning", 《电子测量技术》 * |
王帅康: "Research on autonomous landing methods for UAVs based on deep reinforcement learning", 《中国优秀博硕士学位论文全文数据库(硕士)工程科技Ⅱ辑》 * |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114444398A (en) * | 2022-02-08 | 2022-05-06 | 扬州宇安电子科技有限公司 | Grey wolf algorithm-based networking radar cooperative interference resource allocation method |
CN114444398B (en) * | 2022-02-08 | 2022-11-01 | 扬州宇安电子科技有限公司 | Grey wolf algorithm-based networking radar cooperative interference resource allocation method |
CN114509732A (en) * | 2022-02-21 | 2022-05-17 | 四川大学 | Deep reinforcement learning anti-interference method of frequency agile radar |
CN114509732B (en) * | 2022-02-21 | 2023-05-09 | 四川大学 | Deep reinforcement learning anti-interference method of frequency agile radar |
Also Published As
Publication number | Publication date |
---|---|
CN113093124B (en) | 2022-09-02 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wicks | Spectrum crowding and cognitive radar | |
CN113093124B (en) | DQN algorithm-based real-time allocation method for radar interference resources | |
CN111090078B (en) | Networking radar residence time optimal control method based on radio frequency stealth | |
CN111812599B (en) | Networking radar optimal waveform design method based on low interception performance under game condition | |
CN113341383B (en) | Anti-interference intelligent decision method for radar based on DQN algorithm | |
CN107329136B (en) | MIMO radar multi-target self-adaptive tracking method based on variable analysis time | |
CN104007419B (en) | About residence time and the radar time resource combined distributing method of heavily visiting interval | |
CN111190176B (en) | Self-adaptive resource management method of co-location MIMO radar networking system | |
Yi et al. | Reinforcement learning-based joint adaptive frequency hopping and pulse-width allocation for radar anti-jamming | |
CN113406579B (en) | Camouflage interference waveform generation method based on deep reinforcement learning | |
CN116299408B (en) | Multi-radar autonomous cooperative detection system and detection method | |
CN113376607B (en) | Airborne distributed radar small sample space-time self-adaptive processing method | |
CN115567353B (en) | Interference multi-beam scheduling and interference power combined optimization method for radar networking system | |
CN115343680A (en) | Radar anti-interference decision method based on deep reinforcement learning and combined frequency hopping and pulse width distribution | |
CN113311857A (en) | Environment sensing and obstacle avoidance system and method based on unmanned aerial vehicle | |
Zhang et al. | Research on decision-making system of cognitive jamming against multifunctional radar | |
CN115236607A (en) | Radar anti-interference strategy optimization method based on double-layer Q learning | |
CN109633587B (en) | Adaptive adjustment method for networking radar signal bandwidth | |
Zhang et al. | Joint jamming beam and power scheduling for suppressing netted radar system | |
Zhang et al. | Performance analysis of deep reinforcement learning-based intelligent cooperative jamming method confronting multi-functional networked radar | |
CN112051552A (en) | Multi-station-based main lobe anti-interference method and device | |
CN109212494A (en) | A kind of stealthy interference waveform design method of radio frequency for radar network system | |
CN113114399B (en) | Three-dimensional spectrum situation complementing method and device based on generation countermeasure network | |
CN117709678A (en) | Multi-machine collaborative radar search resource optimization method based on multi-agent reinforcement learning | |
Bi et al. | Optimization method of passive omnidirectional buoy array in on-call anti-submarine search based on improved NSGA-II |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||