CN114285645B - Man-in-the-middle attack coping method based on repeated game - Google Patents

Info

Publication number
CN114285645B
Authority
CN
China
Prior art keywords
strategy
port
ports
man
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111604797.2A
Other languages
Chinese (zh)
Other versions
CN114285645A (en)
Inventor
朱进
张景龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC
Priority to CN202111604797.2A
Publication of CN114285645A
Application granted
Publication of CN114285645B
Legal status: Active
Anticipated expiration

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a method, in the field of network security, for coping with man-in-the-middle attacks based on a repeated game. Information originally transmitted through one port is distributed over a plurality of ports; a fixed number of these ports are selected to transmit invalid information, while the remaining ports transmit the valid information that actually needs to be transmitted. At regular intervals, the port allocation strategy provided by the invention re-determines which ports transmit invalid information and which transmit valid information. The invention reduces the loss caused by man-in-the-middle attacks as much as possible while also reducing the number of port reallocations.

Description

Man-in-the-middle attack coping method based on repeated game
Technical Field
The invention relates to the field of network security, and in particular to a method for coping with man-in-the-middle attacks based on a repeated game, a technique that can effectively counter man-in-the-middle attacks and reduce the loss from information leakage.
Background
In the current information era, a large amount of information is transmitted over the Internet, and the security and confidentiality of information receive increasing attention. In a network, computers communicate with each other through ports. A network attack known as a man-in-the-middle attack causes information leakage by intercepting normal network communication data and then tampering with or sniffing it.
Existing countermeasures distribute the service originally provided by one port over a plurality of ports, that is, they disperse the transmitted information. Because of practical limitations, a man-in-the-middle attacker can attack only some of the ports at the same time, so the loss caused by the attack is reduced.
However, this still does not achieve the best result: by attacking some of the ports, a man-in-the-middle attacker can still steal information and cause loss.
Disclosure of Invention
Problem solved by the invention: to overcome the shortcomings of the prior art, the invention provides a method for coping with man-in-the-middle attacks based on a repeated game, which further reduces the information leakage loss caused by man-in-the-middle attacks.
The invention discloses a technique that can effectively cope with man-in-the-middle attacks and reduce information leakage loss. It models the attack-defense scenario of a man-in-the-middle attack as a repeated game, has some ports transmit invalid information while the remaining ports transmit valid information, and at regular intervals reallocates which ports transmit valid and invalid information, with the aim of reducing the loss caused by man-in-the-middle attacks as much as possible. In addition, because reallocating ports affects information transmission to some extent, the number of reallocations is also kept as low as possible.
The technical scheme of the invention is as follows: a method for coping with man-in-the-middle attacks based on a repeated game distributes information originally transmitted through one port over a plurality of ports, selects a fixed number of these ports to transmit invalid information, and has the remaining ports transmit the valid information that actually needs to be transmitted. The problem of defending against man-in-the-middle attacks is modeled as a repeated game, and at regular intervals a new port allocation is generated by the port allocation strategy method provided by the invention, that is, the game is repeated to determine which ports transmit invalid information and which transmit valid information, so that the loss caused by man-in-the-middle attacks is reduced as much as possible while the number of port reallocations is also reduced as much as possible.
The innovative port allocation strategy of the present invention is specifically implemented as follows:
Step 1: generate an exploration strategy set such that, for every port, the set contains at least one allocation strategy under which that port transmits invalid information;
Step 2: initialize a cumulative reward estimate and a strategy perturbation for each port, and execute the subsequent steps at fixed time intervals to generate the port allocation strategy for each time period;
Step 3: at the current time, independently sample a random value for each port from a Gaussian distribution and add it to that port's strategy perturbation;
Step 4: with a certain probability, explore by randomly selecting one strategy from the exploration strategy set as the allocation strategy for the current round; otherwise, adopt as the allocation strategy the strategy that maximizes the sum of the cumulative reward estimate and the strategy perturbation;
Step 5: determine, according to the allocation strategy, whether each port transmits invalid or valid information, and, according to the action taken by the man-in-the-middle attacker, observe the reward of the ports that are attacked while transmitting invalid information;
Step 6: use a resampling algorithm to estimate, by simulation, the reciprocal of the probability that each port transmits invalid information at the current time;
Step 7: update the cumulative reward estimate according to the allocation strategy actually adopted, the partial rewards observed, and the reciprocal probabilities obtained by simulation;
Step 8: at the next time, return to Step 2 and continue generating the next round's allocation strategy, until the final round is finished.
Compared with the prior art, the invention has the following advantages:
(1) when facing a man-in-the-middle attack, the method needs no information about the opponent; that is, it is robust and can cope with various types of opponents;
(2) the port allocation strategy of the invention reduces the loss caused by man-in-the-middle attacks as much as possible while also reducing the number of port reallocations; in other words, it reduces both the information leakage loss caused by man-in-the-middle attacks and the adverse effect of port switching on the transmission of valid information.
Drawings
FIG. 1 is a flow chart of the implementation of the method of the present invention.
Detailed Description
The man-in-the-middle attack is a common form of network attack in which the attacker steals the information transmitted through a port by attacking that port, causing information leakage loss. Existing approaches distribute the information originally transmitted through one port over a plurality of ports, that is, they disperse the transmitted information, to reduce the loss caused by man-in-the-middle attacks. This alone is not sufficient, because an attacker can still steal information and cause loss by attacking some of the ports. The invention therefore builds a repeated game model on top of the prior art: some ports transmit invalid information while the remaining ports transmit valid information, the ports transmitting valid and invalid information are reallocated in each round, and the number of port reallocations is reduced as much as possible while the loss caused by man-in-the-middle attacks is reduced as much as possible.
From the perspective of a repeated game, the man-in-the-middle attack scenario is modeled mathematically as follows. The total number of ports capable of transmitting information is n. In each round the defender selects k (k < n) ports to transmit invalid information. The defender's port allocation strategy is represented by an n-dimensional binary vector v: if the i-th port (i = 1, …, n) transmits invalid information, the i-th element of v is 1, otherwise it is 0, so that $\|v\|_1 = k$. The set of all strategies v is denoted by $\mathcal{V}$. Correspondingly, the attacker can attack only m ports at the same time, so the attacker's strategy a satisfies $\|a\|_1 = m$, and $\mathcal{A}$ denotes the set of all strategies a.
The per-round port reward $r_t$ is an n-dimensional vector set as follows. If port i is attacked and transmits invalid information, the i-th component $r_{t,i}$ of $r_t$ is a random value in [0, 0.5]; if port i is attacked and transmits valid information, the defender suffers a loss and $r_{t,i}$ is a random value in [-0.5, 0]. For a port that is not attacked, the defender's reward is 0 regardless of whether the port transmits valid information. Because the content and importance of the information transmitted through each port differ, the protection value of each port differs, and so the reward value set for each port in the model also differs.
To stay closer to practice, the model has two further important features: the defender has no prior knowledge, and the defender's perception is limited. The former means that the defender does not know the game payoffs or the attacker's behavior model in advance; the latter means that in each round the defender can observe only the rewards on the ports that are not transmitting valid information. Under this model, online learning methods can be used to generate the defender's strategy. The strategy pursues two objectives: on the one hand, to prevent valid information on the ports from being stolen as far as possible and to obtain more reward, that is, to reduce the regret as much as possible; on the other hand, because port reallocation affects information transmission to some extent, to reduce the number of reallocations as much as possible.
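The model described above can be illustrated with a short Python sketch of a single round. This is only a sketch under stated assumptions: the function names, the uniform sampling of the reward magnitudes (the text only says "a random value"), and the randomly acting attacker are hypothetical choices made for illustration and are not specified by the invention.

```python
import random

def random_binary_strategy(n, ones, rng):
    """Return an n-dimensional 0/1 vector with exactly `ones` entries equal to 1."""
    chosen = set(rng.sample(range(n), ones))
    return [1 if i in chosen else 0 for i in range(n)]

def round_rewards(v, a, rng):
    """Per-port reward vector r_t for defender strategy v and attacker strategy a.

    Assumption: magnitudes are drawn uniformly from the stated intervals.
    """
    r = []
    for v_i, a_i in zip(v, a):
        if a_i == 0:
            r.append(0.0)                      # port not attacked: reward 0
        elif v_i == 1:
            r.append(rng.uniform(0.0, 0.5))    # attacked while sending invalid information: gain
        else:
            r.append(-rng.uniform(0.0, 0.5))   # attacked while sending valid information: loss
    return r

def observed_feedback(v, r):
    """Limited perception: only rewards on ports sending invalid information are visible."""
    return {i: r_i for i, (v_i, r_i) in enumerate(zip(v, r)) if v_i == 1}

rng = random.Random(0)
n, k, m = 6, 2, 3
v = random_binary_strategy(n, k, rng)   # defender: k ports send invalid information
a = random_binary_strategy(n, m, rng)   # attacker: m ports are attacked
r = round_rewards(v, a, rng)
print(v, a, r, observed_feedback(v, r))
```

The dictionary returned by `observed_feedback` is all the defender sees in a round, which is why the method later needs an estimate of the full cumulative reward vector.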
In a general repeated security game, the quality of a defender's strategy algorithm is usually evaluated by the notion of "regret", that is, the difference between the cumulative reward of the optimal fixed strategy known in hindsight and the cumulative reward obtained by the strategies actually adopted. The lower the regret, the better the actual strategy and the greater the reward obtained. Regret is defined as follows:
Figure BDA0003433337150000032
where v is the theoretically optimal strategy, $v_t$ is the strategy actually adopted by the defender at time t, and T is the total duration of the attack-defense scenario.
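The verbal definition of regret can be made concrete with a small Python sketch. It assumes, purely for illustration, that each round has a fixed reward vector and that the reward of a strategy v in that round is the inner product of v with that vector; this linear form and the function names are assumptions, not the patent's own equation.

```python
def best_fixed_reward(rewards, k):
    """Cumulative reward of the best fixed strategy in hindsight.

    Under the assumed linear reward, the best fixed strategy puts its k ones
    on the ports with the largest cumulative reward.
    """
    n = len(rewards[0])
    totals = [sum(r[i] for r in rewards) for i in range(n)]
    return sum(sorted(totals, reverse=True)[:k])

def regret(strategies, rewards, k):
    """Best fixed cumulative reward minus the cumulative reward actually obtained."""
    actual = sum(sum(v_i * r_i for v_i, r_i in zip(v, r))
                 for v, r in zip(strategies, rewards))
    return best_fixed_reward(rewards, k) - actual

# Three rounds, three ports, k = 1 port sending invalid information per round.
rewards = [[0.5, 0.0, 0.25], [0.5, 0.25, 0.0], [0.0, 0.25, 0.5]]
strategies = [[0, 1, 0], [0, 1, 0], [0, 0, 1]]
print(regret(strategies, rewards, k=1))
```

In this toy data the best fixed choice is port 0, so the regret measures how much reward was lost by not holding that port throughout.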
In addition, in network defense, additional loss such as delay or loss of information transmission may be caused by re-allocating ports to transmit valid information, so the number of re-allocation should be reduced as much as possible. Therefore, the 'number of reallocations' can be used to evaluate the quality of the strategy, and the lower the value of the index, the better the strategy. The "number of reallocations" is defined as follows:
$S_T = |\{1 < t \le T : v_{t-1} \ne v_t\}|$
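The number of reallocations is simply a count of the rounds whose strategy differs from the previous round's. A minimal Python sketch follows; the function name and the sample history are illustrative assumptions.

```python
def count_reallocations(strategies):
    """S_T: number of rounds t (1 < t <= T) whose strategy differs from round t-1."""
    return sum(1 for prev, cur in zip(strategies, strategies[1:]) if prev != cur)

history = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0]]
print(count_reallocations(history))
```

Here the strategy changes at the third and fifth rounds, so two reallocations are counted.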
For the scenario of coping with man-in-the-middle attacks in network defense, the method generates an effective defense strategy. It has two important hyperparameters: $\sigma^2$, the variance of the Gaussian distribution, and γ, the exploration probability. The method comprises the following steps:
Step 1: generate an exploration strategy set $\mathcal{E} = \{\varepsilon_1, \dots, \varepsilon_n\}$ consisting of n-dimensional vectors, where the i-th component of the vector $\varepsilon_i$ is fixed to 1, meaning that port i must transmit invalid information, the remaining components are 0 or 1, and exactly k components of $\varepsilon_i$ are 1, meaning that k ports transmit invalid information;
Step 2: initialize a cumulative reward estimate for each of the n ports available for transmitting information
Figure BDA0003433337150000041
Figure BDA0003433337150000042
All initial estimates together form an n-dimensional cumulative reward estimate vector
Figure BDA0003433337150000043
Similarly, initialize a perturbation $Z_{0,i} = 0$ for each port, forming an n-dimensional perturbation vector $Z_0 = (Z_{0,1}, Z_{0,2}, \dots, Z_{0,n})$. The subsequent steps are performed at regular intervals, that is, for t = 1, 2, …, T;
Step 3: at time t, independently draw n random values $X_{t,i}$ (i = 1, …, n) from the Gaussian distribution $\mathcal{N}(0, \sigma^2)$ with mean 0 and preset variance $\sigma^2$, forming the n-dimensional vector $X_t = (X_{t,1}, X_{t,2}, \dots, X_{t,n})$. The random vector $X_t$ is accumulated onto the perturbation vector $Z_{t-1}$ to obtain $Z_t$, i.e. $Z_t = Z_{t-1} + X_t$;
Step 4: draw a value α uniformly at random from the interval [0, 1]. If α is smaller than the preset exploration probability γ, randomly select a vector from the exploration strategy set $\mathcal{E}$ as the allocation strategy $v_t$ for the current round; otherwise, take the cumulative reward estimate
Figure BDA0003433337150000046
and the random-walk perturbation $Z_t$, and use as the allocation strategy $v_t$ the strategy v for which the sum of the two attains its maximum, i.e.
Figure BDA0003433337150000047
Step 5: determine the port assignment according to the allocation strategy $v_t$: if the i-th component of $v_t$ is 1, the i-th port transmits invalid information, and if it is 0, the i-th port transmits valid information. From the action taken by the man-in-the-middle attacker, some components $r_{t,i}$ of the actual reward vector $r_t$ can be observed, namely the rewards of the ports that are attacked while transmitting invalid information;
Step 6: execute Steps 7-9 (the resampling algorithm) to estimate the reciprocal of the probability that port i transmits invalid information at the current time, denoted K(t, i);
Step 7: initialize K(t, i) = 0 for all i = 1, 2, …, n; repeat Steps 8-9 for k = 1, 2, …, M, where M is the preset maximum number of simulations and k here is the simulation counter;
Step 8: execute Steps 3-4 to generate a simulation of the allocation strategy $v_t$
Figure BDA0003433337150000048
Step 9: for all i = 1, 2, …, n, if k < M,
Figure BDA0003433337150000049
and K(t, i) = 0, set K(t, i) to k; otherwise, if k = M and K(t, i) = 0, set K(t, i) to M;
Step 10: update the cumulative reward estimate according to the allocation strategy $v_t$ actually adopted, the partial rewards $r_{t,i}$ observed, and the K(t, i) obtained by simulation
Figure BDA00034333371500000410
The update follows the equation:
Figure BDA00034333371500000411
Step 11: at the next time t + 1, return to Step 2 and continue generating the next round's allocation strategy, until time T has been completed.
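Steps 1 to 11 can be sketched end to end in Python as follows. This is a hedged illustration, not the patented implementation: the cyclic construction of the exploration set, the `attacker` callback, the fixed attacker used in the demo, and all parameter values are hypothetical. In particular, the update in Step 10 is assumed here to be the usual importance-weighted form (add K(t, i) times the observed reward on ports that were attacked while sending invalid information), because the update equation itself is only available as an image in the source.

```python
import random

def exploration_set(n, k):
    """Step 1: strategy i has component i equal to 1 and exactly k ones.

    Assumption: the other k - 1 ones go to the next ports in cyclic order.
    """
    return [[1 if (p - i) % n < k else 0 for p in range(n)] for i in range(n)]

def top_k_strategy(scores, k):
    """0/1 vector with ones on the k largest scores (maximizes v . scores with k ones)."""
    best = set(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k])
    return [1 if i in best else 0 for i in range(len(scores))]

def choose_strategy(estimate, z, explore, k, gamma, rng):
    """Step 4: explore with probability gamma, otherwise maximize estimate + perturbation."""
    if rng.random() < gamma:
        return list(rng.choice(explore))
    return top_k_strategy([e + p for e, p in zip(estimate, z)], k)

def resample(estimate, z_prev, explore, k, gamma, sigma, max_sims, rng):
    """Steps 6-9: K[i] is the first simulation in which port i sends invalid
    information, capped at max_sims; it estimates the reciprocal probability."""
    n = len(estimate)
    K = [0] * n
    for sim in range(1, max_sims + 1):
        z_sim = [z + rng.gauss(0.0, sigma) for z in z_prev]               # Step 3, simulated
        v_sim = choose_strategy(estimate, z_sim, explore, k, gamma, rng)  # Step 4, simulated
        for i in range(n):
            if K[i] == 0 and (v_sim[i] == 1 or sim == max_sims):
                K[i] = sim
        if all(K):
            break  # every port already has its estimate
    return K

def run_defense(n, k, rounds, gamma, sigma, max_sims, attacker, seed=0):
    """Loop over Steps 2-11; `attacker(t, v)` is a hypothetical callback that
    returns the attack vector a_t and the reward vector r_t."""
    rng = random.Random(seed)
    explore = exploration_set(n, k)
    estimate = [0.0] * n   # Step 2: cumulative reward estimates
    z = [0.0] * n          # Step 2: perturbation Z_0
    history = []
    for t in range(1, rounds + 1):
        z_prev = z
        z = [p + rng.gauss(0.0, sigma) for p in z_prev]            # Step 3: Z_t = Z_{t-1} + X_t
        v = choose_strategy(estimate, z, explore, k, gamma, rng)   # Step 4
        a, r = attacker(t, v)                                      # Step 5
        K = resample(estimate, z_prev, explore, k, gamma, sigma, max_sims, rng)
        for i in range(n):
            # Step 10 (assumed form): importance-weighted update on observed ports only.
            if v[i] == 1 and a[i] == 1:
                estimate[i] += K[i] * r[i]
        history.append(v)
    return history, estimate

def fixed_attacker(t, v):
    """Hypothetical attacker that always attacks ports 0 and 1."""
    a = [1, 1, 0, 0, 0]
    r = [a_i * (0.3 if v_i == 1 else -0.3) for v_i, a_i in zip(v, a)]
    return a, r

history, estimate = run_defense(n=5, k=2, rounds=200, gamma=0.05, sigma=0.1,
                                max_sims=50, attacker=fixed_attacker)
print(history[-1], [round(e, 2) for e in estimate])
```

Because the perturbation is a random walk rather than a fresh draw each round, consecutive rounds tend to share the same maximizer, which is the mechanism by which the method keeps the number of reallocations low.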
In the above man-in-the-middle attack scenario, with no prior knowledge and limited observability, using the invention to formulate the port allocation strategy keeps the upper bounds on the expected regret and on the expected number of reallocations at a low level, as shown in the following two formulas:
Figure BDA0003433337150000051
and
Figure BDA0003433337150000052
(1) In particular, taking
Figure BDA0003433337150000053
the upper bound on the expected regret is:
Figure BDA0003433337150000054
that is, the upper bound on the expected regret after T rounds does not exceed
Figure BDA0003433337150000055
This means that as T tends to infinity, the average per-round regret approaches 0 and the actual strategy converges to the optimal fixed strategy.
(2) Using
Figure BDA0003433337150000056
and
Figure BDA0003433337150000057
the upper bound on the expected number of reallocations can be approximated as:
Figure BDA0003433337150000058
In general, the exploration probability γ is set to a small value (between 0 and 0.1). When
Figure BDA0003433337150000059
does not exceed the order of k log n, the order of the first term on the right-hand side of the above formula does not exceed that of the second term, and the number of reallocations can be approximated as
Figure BDA00034333371500000510
that is, the number of reallocations grows sublinearly with the number of rounds T.

Claims (1)

1. A man-in-the-middle attack coping method based on a repeated game, characterized in that: information originally transmitted through one port is distributed over a plurality of ports; a fixed number of these ports are selected to transmit invalid information, and the remaining ports transmit the valid information that actually needs to be transmitted; the problem of defending against man-in-the-middle attacks is modeled as a repeated game; at set time intervals a new port allocation is generated according to the port allocation strategy, that is, the game is repeated to determine which ports transmit invalid or valid information, so that the loss caused by man-in-the-middle attacks is reduced as much as possible while the number of port reallocations is also reduced as much as possible;
the port allocation strategy is specifically implemented as follows:
Step 1: generate an exploration strategy set such that, for every port, the set contains at least one allocation strategy under which that port transmits invalid information;
Step 2: initialize a cumulative reward estimate and a strategy perturbation for each port, and execute the subsequent steps at fixed time intervals to generate the port allocation strategy for each fixed time period;
Step 3: at the current time, independently sample a random value for each port from a Gaussian distribution and add it to that port's strategy perturbation;
Step 4: when exploring according to the set probability, randomly select one strategy from the exploration strategy set as the allocation strategy for the current round; otherwise, adopt as the allocation strategy the strategy that maximizes the sum of the cumulative reward estimate and the strategy perturbation;
Step 5: determine, according to the allocation strategy, whether each port transmits invalid or valid information, and, according to the action taken by the man-in-the-middle attacker, observe the reward of the ports that are attacked while transmitting invalid information;
Step 6: use a resampling algorithm to estimate, by simulation, the reciprocal of the probability that each port transmits invalid information at the current time;
Step 7: update the cumulative reward estimate according to the allocation strategy actually adopted, the partial rewards observed, and the reciprocal probabilities obtained by simulation;
Step 8: at the next time, return to Step 2 and continue generating the next round's allocation strategy, until the final round is finished.
CN202111604797.2A 2021-12-24 2021-12-24 Man-in-the-middle attack coping method based on repeated game Active CN114285645B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111604797.2A CN114285645B (en) 2021-12-24 2021-12-24 Man-in-the-middle attack coping method based on repeated game

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111604797.2A CN114285645B (en) 2021-12-24 2021-12-24 Man-in-the-middle attack coping method based on repeated game

Publications (2)

Publication Number Publication Date
CN114285645A (en) 2022-04-05
CN114285645B (en) 2022-09-30

Family

ID=80875523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111604797.2A Active CN114285645B (en) 2021-12-24 2021-12-24 Man-in-the-middle attack coping method based on repeated game

Country Status (1)

Country Link
CN (1) CN114285645B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108833401A (en) * 2018-06-11 2018-11-16 中国人民解放军战略支援部队信息工程大学 Network active defensive strategy choosing method and device based on Bayes's evolutionary Game
CN109194685A (en) * 2018-10-12 2019-01-11 天津大学 Man-in-the-middle attack defence policies based on safe game theory
CN109639729A (en) * 2019-01-16 2019-04-16 北京科技大学 A kind of dynamic game method and device of internet of things oriented intimidation defense resource allocation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12015596B2 (en) * 2015-10-28 2024-06-18 Qomplx Llc Risk analysis using port scanning for multi-factor authentication
US10556179B2 (en) * 2017-06-09 2020-02-11 Performance Designed Products Llc Video game audio controller

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
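Research on optimization of service resource allocation mechanisms based on game theory; 接赢墨; Basic Sciences (《基础科学》); 2020-06-15 (No. 06); full text *
Man-in-the-middle attack defense strategy based on security game theory; 李姝昕; Basic Sciences (《基础科学》); 2019-04-15 (No. 04); full text *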

Also Published As

Publication number Publication date
CN114285645A (en) 2022-04-05

Similar Documents

Publication Publication Date Title
CN113762530B (en) Precision feedback federal learning method for privacy protection
CN107566387A (en) Cyber-defence action decision method based on attacking and defending evolutionary Game Analysis
CN110191120B (en) Vulnerability risk assessment method and device for network system
CN109589607A (en) A kind of game anti-cheating method and game anti-cheating system based on block chain
Pal et al. Might I get pwned: A second generation compromised credential checking service
Foley et al. Autonomous network defence using reinforcement learning
You et al. A kind of network security behavior model based on game theory
CN114285645B (en) Man-in-the-middle attack coping method based on repeated game
CN115580423A (en) CPPS optimal resource allocation method based on game aiming at FDI attack
Aggarwal et al. An exploratory study of a masking strategy of cyberdeception using cybervan
CN108200098A (en) A kind of method for controlling multilevel access and system based on more secret visual passwords
CN113132398B (en) Array honeypot system defense strategy prediction method based on Q learning
KR102388387B1 (en) Electronic device for determinining cyber attack and operating method thereof
Zheng et al. WMDefense: Using watermark to defense Byzantine attacks in federated learning
Werner et al. Uncle traps: Harvesting rewards in a queue-based ethereum mining pool
CN116708042B (en) Strategy space exploration method for network defense game decision
CN114666107B (en) Advanced persistent threat defense method in mobile fog calculation
CN116861994A (en) Privacy protection federal learning method for resisting Bayesian attack
Arora et al. Adaptive selection of cryptographic protocols in wireless sensor networks using evolutionary game theory
Amadi et al. Anti-DDoS firewall; A zero-sum mitigation game model for distributed denial of service attack using Linear programming
Koutiva et al. An Agent-Based Modelling approach to assess risk in Cyber-Physical Systems (CPS)
Pal et al. Might I Get Pwned: A second generation password breach alerting service
Nguyen et al. Heuristic search exploiting non-additive and unit properties for RTS-game unit micromanagement
JP6632796B2 (en) Database evaluation device, method and program, and database division device, method and program
CN115208639B (en) Method for defending blockchain sponsored block interception attack based on workload certification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant