WO2020248306A1 - Collection agent deployment method and device - Google Patents

Collection agent deployment method and device

Info

Publication number
WO2020248306A1
Authority
WO
WIPO (PCT)
Prior art keywords
collection agent
collection
threat event
threat
potential threat
Prior art date
Application number
PCT/CN2019/092999
Other languages
English (en)
French (fr)
Inventor
李凤华
陈黎丽
郭云川
王震
张玲翠
Original Assignee
中国科学院信息工程研究所
Application filed by 中国科学院信息工程研究所

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0281 Proxies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/30 Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information
    • H04L63/302 Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information gathering intelligence information for situation awareness or reconnaissance

Definitions

  • This application belongs to the field of network security technology, and in particular relates to a collection agent deployment method and device.
  • Existing collection agent deployment schemes mainly deploy collection agents on nodes such as data generation nodes and data aggregation nodes.
  • Existing deployment methods mainly consider factors such as network topology or deployment cost, and generally collect data by means such as mirroring.
  • However, this way of deploying collection agents is not suitable for large-scale, complex information networks, because collection agents differ in collection capability and attackers differ in attack capability. If only factors such as network topology or deployment cost are considered during deployment, over-collection or under-collection of data easily results.
  • Over-collection means that a large number of collection agents are deployed in the network, so that too much is collected and the collected content is redundant, which consumes substantial deployment, collection, and maintenance costs. Under-collection means that, because of collection cost constraints, important risk points have no collection agents deployed, or none with the corresponding collection capabilities, so that data closely related to threats cannot be obtained and subsequent analysis of potential threat events cannot be supported.
  • In short, existing collection agent deployment methods consider only factors such as network topology or deployment cost. For collection agents with different collection capabilities and attackers with different attack capabilities, deploying collection agents in this way can easily lead to over-collection or under-collection.
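  • Embodiments of the present application provide a collection agent deployment method and device.
  • In a first aspect, a collection agent deployment method is provided, including:
  • constructing a threat-collection tree of the target network according to the target network-data service library, the data service-threat event library, the threat event-characteristic beacon library, and the collection agent-threat detection atomic data item library; wherein the target network-data service library stores the correspondence between the target network topology and the data services provided by the target network;
  • the data service-threat event library stores the correspondence between each data service and the potential threat events faced by that data service;
  • the threat event-characteristic beacon library stores the correspondence between each potential threat event and the threat event characteristic beacons that can detect it;
  • the collection agent-threat detection atomic data item library stores the correspondence between each collection agent and the threat detection atomic data items that the collection agent can collect for detecting potential threat events;
  • for any one of the potential threat events, obtaining the risk value of the potential threat event according to the confidence that the potential threat event is monitored by the collection agent and the impact of the potential threat event;
  • determining whether each device node is a risk point according to the risk value of each potential threat event and the threat-collection tree;
  • selecting a deployment point and deploying the collection agent according to the risk points in the target network, the collection capability of the collection agent, and preset constraint conditions.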
  • In a second aspect, a collection agent deployment device is provided, including:
  • a building module, used to construct the threat-collection tree of the network according to the target network-data service library, the data service-threat event library, the threat event-characteristic beacon library, and the collection agent-threat detection atomic data item library; wherein the target network-data service library stores the correspondence between the target network topology and the data services provided by the target network, the data service-threat event library stores the correspondence between each data service and the potential threat events faced by that data service, the threat event-characteristic beacon library stores the correspondence between each potential threat event and the threat event characteristic beacons that can detect it, and the collection agent-threat detection atomic data item library stores the correspondence between each collection agent and the threat detection atomic data items that the collection agent can collect for detecting potential threat events;
  • an obtaining module, configured to obtain, for any one of the potential threat events, the risk value of the potential threat event according to the confidence that the potential threat event is monitored by the collection agent and the impact of the potential threat event;
  • a determining module, configured to determine whether each device node is a risk point according to the risk value of each potential threat event and the threat-collection tree;
  • a deployment module, used to select a deployment point and deploy the collection agent according to the risk points in the network, the collection capability of the collection agent, and preset constraint conditions.
  • In a third aspect, an electronic device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor; by calling the program instructions, the processor is able to execute the collection agent deployment method provided by any one of the possible implementations of the first aspect.
  • In a fourth aspect, a non-transitory computer-readable storage medium is provided, storing computer instructions that cause a computer to execute the collection agent deployment method provided by any one of the possible implementations of the first aspect.
  • The embodiments of the application provide a collection agent deployment method and device. The method calculates the risk value of threat events based on network topology, data services, and potential threat events, constructs a threat-collection tree, determines risk points, and then determines the deployment locations of collection agents based on the risk points, the threat-collection tree, collection agent capabilities, and collection constraints, thereby improving data collection capabilities and reducing the resources consumed by data collection and analysis.
  • FIG. 1 is a schematic diagram of the overall flow of a collection agent deployment method provided by an embodiment of the application;
  • FIG. 2 is a schematic diagram of the threat-collection tree structure in the collection agent deployment method provided by an embodiment of the application;
  • FIG. 3 is a schematic diagram of the threat-collection tree structure in the collection agent deployment method provided by another embodiment of this application;
  • FIG. 5 is a schematic flow diagram of the deployment algorithm in a collection agent deployment method provided by an embodiment of the application;
  • FIG. 6 is a schematic flow diagram of the collection agent scheduling strategy in a collection agent deployment method provided by an embodiment of the application;
  • FIG. 7 is a schematic diagram of the overall structure of a collection agent deployment device provided by an embodiment of the application.
  • FIG. 1 is a schematic diagram of the overall flow of the collection agent deployment method provided in an embodiment of this application.
  • As shown in FIG. 1, the method includes: S101, constructing the threat-collection tree of the network according to the target network-data service library, the data service-threat event library, the threat event-characteristic beacon library, and the collection agent-threat detection atomic data item library; wherein the target network-data service library stores the correspondence between the target network topology and the data services provided by the target network, the data service-threat event library stores the correspondence between each data service and the potential threat events faced by that data service, the threat event-characteristic beacon library stores the correspondence between each potential threat event and the threat event characteristic beacons that can detect it, and the collection agent-threat detection atomic data item library stores the correspondence between each collection agent and the threat detection atomic data items that the collection agent can collect for detecting potential threat events;
  • A data service is a business service running on a device node of the target network.
  • Data service types include, but are not limited to, Web services, FTP services, and database services.
  • A threat event is an attack event that may affect, or has already affected, the target network, and can be described by any combination of one or more threat event characteristics.
  • Potential threat event attributes include, but are not limited to, threat event type, threat event level, threat event impact, and the confidence that the potential threat event is monitored.
  • Threat event types include, but are not limited to, DDoS (Distributed Denial of Service) attacks, brute-force cracking, XSS (Cross-Site Scripting) attacks, SQL (Structured Query Language) injection, worm attacks, Trojan horse attacks, traffic hijacking, and spoofing attacks.
  • The threat event level indicates the severity of the threat.
  • Methods for determining the threat event level include, but are not limited to, empirical knowledge and fuzzy statistics. For example, the level can be measured as a discrete value, an integer from 1 to 5.
  • Threat event impact refers to the impact of the threat event on the target network, and can be described in terms of the security attributes of the target network.
  • The security attributes of the target network include, but are not limited to, system integrity, system availability, and system confidentiality.
  • Methods for determining the impact of threat events include, but are not limited to, expert knowledge, probability statistics, and fuzzy statistics.
  • The confidence that a potential threat event is monitored refers to the authenticity of the potential threat event as detected through a minimum characteristic beacon set. Methods for determining this confidence include, but are not limited to, expert knowledge, fuzzy statistics, weighted summation, and probability analysis.
  • the minimum characteristic beacon set consists of one or more threat event characteristic beacons, and each minimum characteristic beacon set is sufficient to detect a potential threat event. It should be noted that the same potential threat event can be detected by one or more minimum characteristic beacon sets.
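  • The minimality property can be illustrated with a small check. This is a minimal sketch under stated assumptions: the function name `is_minimum_beacon_set`, the `detects` callback, and the beacon names `b1` to `b3` are hypothetical and do not come from the application. A set is treated as minimum when it detects the event and no proper subset does.

```python
from itertools import combinations

def is_minimum_beacon_set(beacons, detects):
    """Return True if `beacons` detects the event and no proper subset does."""
    beacons = frozenset(beacons)
    if not detects(beacons):
        return False
    # Enumerate every proper subset (sizes 0 .. len-1) and reject if any detects.
    for size in range(len(beacons)):
        for subset in combinations(beacons, size):
            if detects(frozenset(subset)):
                return False
    return True

# Hypothetical detector: the event is detectable by b1 and b2 together, or by b3 alone.
def detects_example(beacons):
    return {"b1", "b2"} <= beacons or "b3" in beacons
```

  • With this detector, `{"b1", "b2"}` and `{"b3"}` are two different minimum sets for the same event, which matches the remark that one potential threat event can be detected by more than one minimum characteristic beacon set.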
  • A threat event characteristic beacon is a threat detection rule formed by extracting, from the collection item data of the collection agent, threat detection atomic data items that can be used to detect potential threat events, generating atomic predicates for judging potential threat events, and connecting these predicates with logical connectives. For example, "SYN half-connections > θ1 and TCP flow > θ2" is a threat event characteristic beacon used to detect DoS attacks, where θ1 and θ2 are thresholds.
  • Threat event characteristic beacons are shown in the fourth layer of FIG. 2.
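  • A characteristic beacon of this kind can be sketched as atomic predicates joined by logical connectives. The following is an illustrative sketch only: the helper names (`greater_than`, `all_of`, `any_of`), the data item keys, and the threshold values are assumptions chosen for the example, not values from the application.

```python
from typing import Callable, Dict

Observation = Dict[str, float]          # atomic data item name -> observed value
Predicate = Callable[[Observation], bool]

def greater_than(item: str, threshold: float) -> Predicate:
    """Atomic predicate: the named atomic data item exceeds a threshold."""
    return lambda obs: obs.get(item, 0.0) > threshold

def all_of(*predicates: Predicate) -> Predicate:
    """Logical AND over atomic predicates."""
    return lambda obs: all(p(obs) for p in predicates)

def any_of(*predicates: Predicate) -> Predicate:
    """Logical OR over atomic predicates."""
    return lambda obs: any(p(obs) for p in predicates)

# Assumed threshold values for the two atomic predicates of the DoS example.
THETA_1 = 1000.0
THETA_2 = 5000.0

dos_beacon = all_of(
    greater_than("syn_half_connections", THETA_1),
    greater_than("tcp_flow", THETA_2),
)
```

  • Calling `dos_beacon({"syn_half_connections": 1500, "tcp_flow": 6000})` evaluates both atomic predicates and combines them with AND.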
  • The threat-collection tree describes, in the form of a tree, the correspondence between data services, potential threat events, threat event characteristic beacons, and collection agents, as shown in FIG. 2. The scope of potential threat events is not limited to those involved in the embodiments of this patent and can be broader.
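  • One way to picture the construction in S101 is to join the four libraries into a nested mapping. This is a sketch under assumptions: the dictionary layouts, the function name `build_threat_collection_tree`, and all sample entries are hypothetical, and the exact layering of FIG. 2 may differ.

```python
def build_threat_collection_tree(network_services, service_threats,
                                 threat_beacons, agent_items):
    """Join the four libraries into node -> service -> threat -> beacon -> item -> agents."""
    # Invert "agent -> collectable items" into "item -> agents able to collect it".
    item_agents = {}
    for agent, items in agent_items.items():
        for item in items:
            item_agents.setdefault(item, []).append(agent)

    tree = {}
    for node, services in network_services.items():
        tree[node] = {}
        for service in services:
            tree[node][service] = {}
            for threat in service_threats.get(service, []):
                tree[node][service][threat] = {
                    beacon: {item: sorted(item_agents.get(item, [])) for item in items}
                    for beacon, items in threat_beacons.get(threat, {}).items()
                }
    return tree

# Hypothetical library contents.
network_services = {"web01": ["web"]}                      # target network-data service library
service_threats = {"web": ["sql_injection"]}               # data service-threat event library
threat_beacons = {"sql_injection": {"b_sqli": ["http_query_string"]}}  # threat event-characteristic beacon library
agent_items = {"waf_agent": ["http_query_string"],         # collection agent-atomic data item library
               "host_agent": ["cpu_utilization"]}

tree = build_threat_collection_tree(network_services, service_threats,
                                    threat_beacons, agent_items)
```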
  • S102, for any potential threat event, obtaining the risk value of the potential threat event according to the confidence that the potential threat event is monitored by the collection agent and the impact of the potential threat event;
  • The threat event risk value is calculated from the confidence that the potential threat event is monitored and the impact of the potential threat event.
  • Calculation methods include, but are not limited to, the multiplication method, the matrix method, and the weighted sum method.
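  • Two of the calculation methods named above can be sketched as follows. The function names, the assumption that confidence and impact are both normalized to [0, 1], and the default weights are illustrative choices rather than values given in the application.

```python
def risk_by_multiplication(confidence: float, impact: float) -> float:
    """Multiplication method: risk = confidence * impact."""
    _check_unit_interval(confidence, impact)
    return confidence * impact

def risk_by_weighted_sum(confidence: float, impact: float,
                         w_confidence: float = 0.5, w_impact: float = 0.5) -> float:
    """Weighted sum method: risk = w_c * confidence + w_i * impact (weights are assumptions)."""
    _check_unit_interval(confidence, impact)
    return w_confidence * confidence + w_impact * impact

def _check_unit_interval(*values: float) -> None:
    # Both inputs are assumed to be normalized to the range [0, 1].
    for v in values:
        if not 0.0 <= v <= 1.0:
            raise ValueError(f"expected a value in [0, 1], got {v}")
```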
  • Factors that determine the confidence that a potential threat event is monitored include, but are not limited to, the probability that the threat detection atomic data items are monitored by the collection agent and the possibility that devices in the system are attacked.
  • The steps for determining the probability that a threat detection atomic data item is monitored by the collection agent include, but are not limited to: according to the correspondence between the threat detection atomic data item and the collection agent, determining that probability by methods such as random assignment, fixed value selection, Monte Carlo simulation, and probability analysis.
  • The steps for determining the possibility that a device node in the target network is attacked include, but are not limited to: according to the location of the device in the target network system (for example, the number of hops from the external network), determining that possibility by methods such as random assignment, fixed value selection, Monte Carlo simulation, and probability analysis (for example, the fewer the hops from the external network, the greater the possibility of being attacked).
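  • The hop-count example can be sketched with a breadth-first search from the external network. This is a hypothetical illustration: the topology, the node names, and the mapping `1 / (1 + hops)` are assumptions chosen only because they decrease as the hop count grows; the application does not prescribe a specific formula.

```python
from collections import deque

def hops_from(topology, source):
    """Breadth-first search returning the hop count from `source` to every reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in topology.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance

def attack_possibility(hops: int) -> float:
    """Assumed monotone mapping: fewer hops from the external network, higher possibility."""
    if hops < 0:
        raise ValueError("hop count must be non-negative")
    return 1.0 / (1 + hops)

# Hypothetical target network; "internet" stands for the external network.
topology = {
    "internet": ["firewall"],
    "firewall": ["internet", "web", "core_switch"],
    "web": ["firewall"],
    "core_switch": ["firewall", "db"],
    "db": ["core_switch"],
}
hops = hops_from(topology, "internet")
```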
  • S103, determining whether each device node is a risk point according to the risk value of the potential threat event and the threat-collection tree;
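  • A possible reading of S103 is sketched below. The decision rule is an assumption: a device node is flagged as a risk point when any potential threat event reachable from it in the threat-collection tree has a risk value at or above a threshold. The function name, the threshold, and the sample data are hypothetical.

```python
def find_risk_points(node_threats, threat_risk, threshold):
    """Return device nodes with at least one threat event whose risk value reaches the threshold."""
    return sorted(
        node
        for node, threats in node_threats.items()
        if any(threat_risk.get(threat, 0.0) >= threshold for threat in threats)
    )

# Hypothetical inputs: threats per node (read off the threat-collection tree) and risk values.
node_threats = {
    "web01": ["sql_injection", "xss"],
    "ftp01": ["brute_force"],
    "db01": ["sql_injection"],
}
threat_risk = {"sql_injection": 0.72, "xss": 0.30, "brute_force": 0.25}
risk_points = find_risk_points(node_threats, threat_risk, threshold=0.5)
```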
  • S104, selecting a deployment point and deploying a collection agent according to the risk points in the target network, the collection capability of the collection agent, and preset constraint conditions.
  • The deployment point selection algorithm is called to determine the deployment location of the collection agent.
  • The elements describing a risk point include, but are not limited to, location, quantity, and type.
  • The collection agent capability is the ability of the collection agent to obtain collection content from devices and the network system.
  • Deployment constraints can be described in terms of cost constraints and QoS (Quality of Service) constraints.
  • The cost includes, but is not limited to, the acquisition cost, deployment cost, maintenance cost, and resource cost of the collection agent.
  • The resource cost includes, but is not limited to, power, bandwidth, and the current operating status of the device.
  • QoS includes, but is not limited to, the availability of collected data, throughput, the delay of collected data, delay variation, and packet loss rate.
  • This embodiment calculates the risk value of threat events based on the topology, data services, and threat events of the target network, constructs a threat-collection tree, determines risk points, and determines the locations where collection agents are deployed according to the risk points, the threat-collection tree, collection agent capabilities, and collection constraints, which can improve data collection capabilities and reduce the resources consumed by data collection and analysis.
  • The method further includes: obtaining collection item data of the network, the collection item data including network traffic information, device status information, and log information; analyzing the collection item data, extracting key fields from it, and extracting from the key fields the threat detection atomic data items used to detect the potential threat event;
  • wherein the collection item data is historically collected data and/or currently collected data; analyzing the threat detection atomic data items to generate atomic predicates for judging the potential threat event; and connecting the atomic predicates with logical connectives to generate a threat event characteristic beacon capable of detecting the potential threat event.
  • The collection item data includes, but is not limited to, network traffic information (for example, the number of sent data packets and the number of received data packets), device status information (for example, CPU utilization and memory utilization), and log information.
  • Log information includes, but is not limited to, operating system log data (for example, from Windows or Linux systems), log data of transmission equipment such as routers and switches deployed in the target network (for example, bandwidth and traffic), operation log data of specific services recorded on hosts (for example, SSH, MySQL, HTTP, and Web), and security device log data (for example, from firewalls and IDS).
  • Threat detection atomic data items are indicative data related to potential threat events that are directly collected or indirectly extracted from the collection item data.
  • the method of extracting the threat detection atomic data items can be divided into the feature data extraction of known threat events and the feature data extraction of unknown threat events.
  • the methods of extracting characteristic data of known threat events include but are not limited to expert knowledge base, probability statistics, attack sequence template comparison, causality and hierarchical correlation analysis, etc.; methods of extracting characteristic data of unknown threat events include but are not limited to Fuzzy statistics, Bayesian networks, machine learning, etc.
  • the collected item data is historically collected data or currently collected data.
  • A potential threat event refers to an attack event, identified by analyzing the collection item data, that may affect or has already affected the target network. It can also be described by any combination of one or more threat event characteristics.
  • The steps for generating characteristic beacons of potential threats include, but are not limited to: first, analyzing the collection item data to extract key fields (for example, converting unstructured information into structured information), and extracting from the key fields the threat detection atomic data items that can be used to detect potential threat events; second, analyzing the threat detection atomic data items by statistical methods to generate atomic predicates for judging potential threat events; third, connecting the atomic predicates with logical connectives to generate characteristic beacons that can detect potential threats.
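  • The first two steps above can be sketched for a brute-force scenario. Everything specific here is an assumption for illustration: the SSH-style log lines are invented samples, the regular expression only targets that sample format, and the threshold is arbitrary. The sketch turns unstructured log lines into key fields, derives an atomic data item (failed logins per source address), and applies an atomic predicate to it.

```python
import re
from collections import Counter

# Pattern for the hypothetical sample format below; real log formats may differ.
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(\S+) from (\d+\.\d+\.\d+\.\d+)"
)

def extract_key_fields(log_lines):
    """Step 1: convert unstructured log lines into structured key fields."""
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            yield {"user": match.group(1), "src_ip": match.group(2)}

def failed_logins_per_source(log_lines):
    """Atomic data item: number of failed logins per source address."""
    return Counter(fields["src_ip"] for fields in extract_key_fields(log_lines))

def brute_force_suspects(log_lines, threshold):
    """Step 2: atomic predicate 'failed logins > threshold', evaluated per source."""
    counts = failed_logins_per_source(log_lines)
    return sorted(ip for ip, count in counts.items() if count > threshold)

sample_log = [
    "Jan 10 10:00:01 host sshd[1]: Failed password for root from 10.0.0.5 port 22 ssh2",
    "Jan 10 10:00:02 host sshd[1]: Failed password for invalid user admin from 10.0.0.5 port 22 ssh2",
    "Jan 10 10:00:03 host sshd[1]: Accepted password for alice from 10.0.0.7 port 22 ssh2",
    "Jan 10 10:00:04 host sshd[1]: Failed password for root from 10.0.0.5 port 22 ssh2",
    "Jan 10 10:00:05 host sshd[1]: Failed password for bob from 10.0.0.9 port 22 ssh2",
]
```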
  • the corresponding relationship between the threat event feature beacon and the potential threat event can be described by a list or by constructing a threat tree, as shown in layer 3-4 in Figure 2.
  • The step of obtaining the risk value of the potential threat event further includes: according to the correspondence between the collection agent and the threat detection atomic data items that the collection agent can collect for detecting threats, determining the probability that the threat detection atomic data items in the characteristic beacons of the potential threat event are monitored by the collection agent; according to the probability that each threat detection atomic data item is monitored by the collection agent, calculating, based on the probability transfer method, the probability that the set of threat detection atomic data items corresponding to a minimum characteristic beacon set of the potential threat event is monitored by the collection agent; wherein a minimum characteristic beacon set corresponding to the potential threat event is a set of threat event characteristic beacons that can detect the potential threat event and satisfies the following condition: no proper subset of the set can detect the potential threat event; determining the possibility of each device node being attacked according to the location information of each device node in the network system and/or device defense information; calculate the possibility of each
  • Determining the confidence that a potential threat event is monitored mainly includes the following steps. First, determine the probability that the threat detection atomic data items are monitored by the collection agent and the possibility that devices in the system are attacked. Second, according to the possibility that a device in the system is attacked, use methods such as the triangular norm (t-norm) to calculate the authenticity of the collection item data obtained by the collection agent corresponding to that device and the authenticity of the threat detection atomic data items. Third, according to the authenticity of the threat detection atomic data items and the threat event characteristic beacon, calculate the authenticity of the potential threat event corresponding to the monitored data. Finally, based on the probability that the threat detection atomic data items are monitored by the collection agent and the authenticity of the potential threat event corresponding to the monitored data, use the weighted sum method to calculate the confidence that the potential threat event is monitored.
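  • The final combination step can be sketched as follows, with heavy assumptions. The product t-norm is used as one example of a triangular norm, the per-item authenticity values are taken as given inputs (how they are derived from the attack possibility is not fixed here), and the function names and default weights are hypothetical.

```python
from functools import reduce

def product_t_norm(values):
    """Product t-norm over a sequence of values in [0, 1]; one common triangular norm."""
    return reduce(lambda a, b: a * b, values, 1.0)

def event_confidence(item_monitor_probs, item_authenticities,
                     w_monitor=0.5, w_authenticity=0.5):
    """Weighted sum of (probability the item set is monitored) and (authenticity of the event).

    item_monitor_probs: probability that each atomic data item of one beacon set is monitored.
    item_authenticities: assumed, already-derived authenticity of each atomic data item.
    """
    p_monitored = product_t_norm(item_monitor_probs)      # all items must be observed
    authenticity = product_t_norm(item_authenticities)    # all items must be trustworthy
    return w_monitor * p_monitored + w_authenticity * authenticity
```

  • For example, `event_confidence([0.9, 0.8], [1.0, 0.5])` combines a monitoring probability of 0.72 with an authenticity of 0.5 under equal weights.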
  • The confidence that the potential threat event is monitored by the collection agent is determined from the probability that each minimum characteristic beacon set corresponding to the potential threat event is monitored by the collection agent and the authenticity of that minimum characteristic beacon set, where:
  • p_e denotes the confidence that any one potential threat event e is monitored by the collection agent;
  • B_i denotes the i-th minimum characteristic beacon set corresponding to e;
  • B(e) denotes the collection of all minimum characteristic beacon sets corresponding to e.
  • The step of obtaining the risk value of the potential threat event also includes: evaluating the security attributes of the target network involved in the potential threat event, the security attributes including integrity, availability, and confidentiality; and determining the impact of the potential threat event according to the evaluation result.
  • the impact of a potential threat event refers to the impact of a potential threat event on the target network, and the impact of a potential threat event can be described from the security attributes of the target network.
  • the security attributes of the target network include, but are not limited to, system integrity (Integrity), system availability (Availability), and system confidentiality (Confidentiality). Evaluate the security attributes involved in potential threat events, and determine the impact of potential threat events based on the results of the assessment.
  • The step of selecting a deployment point and deploying the collection agent according to the risk points in the target network, the collection capability of the collection agent, and preset constraints specifically includes:
  • 1) constructing a first objective function, determining the constraint conditions of the first objective function, and solving the first objective function to obtain the number of collection agents that need to be deployed; the first objective function includes any one or more of: maximizing the collection utility, minimizing the deployment cost of the collection agents, and minimizing the resource consumption of the collection agents; the constraints of the first objective function include any one or more of: the cost of deploying the collection agents is less than the total deployment budget, the collection utility is not less than the second preset threshold, and the resource consumption of the collection agents does not exceed the third preset threshold;
  • 2) constructing a second objective function, determining the constraint conditions of the second objective function, and solving the second objective function to obtain the locations of the collection agents that need to be deployed;
  • the second objective function includes the attacker's first profit function and/or the monitor's first profit function;
  • the attacker's first profit function includes any one or more of: maximizing the attacker's impact on the device nodes, maximizing the time until the attacker is detected by the collection agent, and maximizing the number of device nodes infected by the time the attacker is detected;
  • the monitor's first profit function includes any one or more of: minimizing the cost of the collection agents, maximizing the effectiveness of the collection item data obtained by the collection agents, and minimizing the attacker's first profit function;
  • the constraints of the second objective function include any one or more of: the number of collection agents is less than the fourth preset threshold, the risk value caused by each potential threat event is less than the fifth preset threshold, and the monitoring time of the collection agent is less than the sixth preset threshold.
  • the deployment of collection agents mainly includes three steps: determining the number of collection agents, determining the deployment point of collection agents, and implementing the deployment of collection agents.
  • the specific process is as follows:
  • (1) Determining the number of collection agents. The specific steps include, but are not limited to, the following. First, construct the first objective function, which includes, but is not limited to, any one or more of: maximizing collection utility, minimizing collection agent deployment cost, and minimizing collection agent resource consumption.
  • Second, select the constraints, which include, but are not limited to, any one or more of: the cost of deploying the collection agents is less than the total deployment budget, the collection utility is not less than the second preset threshold, and the resource consumption of the collection agents does not exceed the third preset threshold. Finally, solve the first objective function; solution methods include, but are not limited to, the knapsack algorithm, multi-objective programming, and local search.
  • Note that an item selected as the optimization objective when constructing the first objective function cannot also appear in the constraint conditions. For example, if the first objective function is to maximize the collection utility, then "the collection utility is not lower than the minimum basic utility value" cannot be used as a constraint condition.
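  • As an illustration of the knapsack approach, the sketch below maximizes total collection utility subject to a deployment budget and reads the number of agents off the chosen set. The candidate list, integer costs, and utilities are hypothetical, and the budget is treated as inclusive (cost at most the budget), which is a simplifying assumption.

```python
def choose_agents(candidates, budget):
    """0/1 knapsack: maximize total utility with total integer cost <= budget.

    candidates: list of (name, cost, utility) tuples.
    Returns (best_utility, chosen_names); the agent count is len(chosen_names).
    """
    # best[c] holds the best (utility, names) achievable with total cost <= c.
    best = [(0.0, [])] * (budget + 1)
    for name, cost, utility in candidates:
        # Iterate capacities downward so each candidate is used at most once.
        for capacity in range(budget, cost - 1, -1):
            candidate_utility = best[capacity - cost][0] + utility
            if candidate_utility > best[capacity][0]:
                best[capacity] = (candidate_utility, best[capacity - cost][1] + [name])
    return best[budget]

# Hypothetical candidates: (name, deployment cost, collection utility).
candidates = [("flow_probe", 4, 6.0), ("log_agent", 3, 5.0), ("host_agent", 2, 3.0)]
best_utility, chosen = choose_agents(candidates, budget=5)
```

  • Because collection utility is the objective here, it is deliberately not repeated as a constraint, in line with the note above.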
  • (2) Determining the locations of collection agents. The specific steps include, but are not limited to, the following. First, construct the second objective function: ① from the attacker's point of view, choose the attacker's first profit function, which includes, but is not limited to: maximizing the attacker's impact on device nodes or network systems, maximizing the time until the attacker is detected, and maximizing the number of device nodes or network systems infected by the time the attacker is detected; ② from the monitor's point of view, choose the monitor's first profit function, which includes, but is not limited to: minimizing the cost of collection, maximizing the effectiveness of the collected information, and minimizing the attacker's profit.
  • Second, select the constraints, which include, but are not limited to, any one or more of: the number of collection agents is less than the fourth preset threshold, the risk value of each potential threat event is less than the fifth preset threshold, and the monitoring time of the collection agent is less than the sixth preset threshold. Then solve the second objective function; solution methods include, but are not limited to, the greedy algorithm, local search, simulated annealing, genetic algorithms, ant colony algorithms, particle swarm algorithms, and the Lagrangian multiplier method. Finally, output the IDs of the locations where collection agents are to be deployed; these are the deployment locations of the collection agents.
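  • A greedy solver for the location step might look like the sketch below. It is only one of the listed solution methods and rests on assumptions: the objective is simplified to the total risk value of the threat events covered by the chosen locations, the only constraint is an upper bound on the number of agents, and the coverage data and names are hypothetical.

```python
def greedy_deployment(coverage, threat_risk, max_agents):
    """Pick up to max_agents location IDs, each time taking the largest marginal covered risk.

    coverage: location ID -> set of threat events monitorable from that location.
    threat_risk: threat event -> risk value.
    """
    chosen, covered = [], set()
    while len(chosen) < max_agents:
        remaining = sorted(loc for loc in coverage if loc not in chosen)
        if not remaining:
            break

        def marginal_gain(location):
            # Risk of threat events this location would newly cover.
            return sum(threat_risk[t] for t in coverage[location] - covered)

        best_location = max(remaining, key=marginal_gain)
        if marginal_gain(best_location) <= 0:
            break  # nothing left to gain
        chosen.append(best_location)
        covered |= coverage[best_location]
    return chosen

# Hypothetical risk points and the threat events observable from each.
coverage = {"n1": {"ddos", "sqli"}, "n2": {"sqli", "xss"}, "n3": {"worm"}}
threat_risk = {"ddos": 0.6, "sqli": 0.5, "xss": 0.3, "worm": 0.2}
deployment_ids = greedy_deployment(coverage, threat_risk, max_agents=2)
```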
  • (3) Deploying collection agents. Deploy the collection agents according to the results of (1) and (2).
  • this embodiment after the step of deploying the collection agent on the risk point according to the risk point in the network, the collection capacity of the collection agent, and preset constraints, It also includes: generating a scheduling strategy of the collection agent according to the deployment location of the collection agent, the capabilities of the collection agent, and the capabilities of the attacker.
  • the attacker's ability refers to the attacker's ability to attack a group of device nodes or data services of the target system.
  • the elements to evaluate the attacker's ability include but are not limited to: the attack relay points that can be selected, The attack range that can be selected, the attack path that can be selected, the attack method that can be selected, and the number of exploitable vulnerabilities.
  • the collection agent scheduling generation algorithm is invoked to generate the collection agent scheduling strategy according to the deployment location of the collection agent, the capacity of the collection agent, and the ability of the attacker.
  • Existing collection agent activation strategies mainly adopt a passive start mode or an active start mode.
  • the passive start mode means that the collection agent waits for the collection start command sent by the manager, and once the start command is received, the data collection is executed according to the collection command.
  • The active start mode means that the collection agent autonomously starts collecting data according to a preset method and the current environmental state.
  • the typical active start mode is timing or periodic collection, that is, data collection is performed according to a preset collection period. For example, the host CPU load is collected every 5 minutes.
  • The attacker can probe the topology of the target network and observe the deployment locations and activation rules of the collection agents (for example, through scanning, infiltration, or social engineering), and then select as attack targets the nodes where no collection agent is deployed or where the collection agent has not been started. This maximizes the attack effect, undermines the effectiveness of the data collected by the collection agents, and prevents the monitor from accurately analyzing the security status of the target network.
  • The monitor activates combinations of collection agents with different probabilities so that the attacker cannot observe any pattern in the activation of the collection agents; this prevents the attacker from evading monitoring and improves the effectiveness of the data collected by the collection agents.
  • The step of generating the scheduling strategy of the collection agent specifically includes:
  • 1) Constructing a third objective function, determining its constraints, and solving it to obtain the number of collection agents that need to be started. The third objective function includes any one or more of: maximizing the utility of starting the collection agents, and minimizing the resources consumed to start the collection agents. Its constraints include any one or more of: the utility of starting the collection agents is not lower than the seventh preset threshold, and the resource consumption of starting the collection agents does not exceed the eighth preset threshold.
  • 2) Constructing the attacker's second revenue function and the monitor's second revenue function, and constructing the fourth objective function according to the attacker's second revenue function and/or the monitor's second revenue function. The constraints of the fourth objective function are constructed according to the attacker strategy set, the monitor strategy set, and the number of scheduled collection agents. The attacker strategy set is the set of actions the attacker can choose, including any one or more of: selecting the source of infection, selecting the attack path, and selecting the attack target. The monitor strategy set is the set of actions the monitor can choose, namely which collection agents the monitor enables for monitoring.
  • According to the fourth objective function and its constraints, the mixed strategy of the monitor and the mixed strategy of the attacker are calculated. The attacker's mixed strategy consists of the attack strategies selected by the attacker and the probability with which each is selected; the monitor's mixed strategy consists of the monitoring strategies selected by the monitor and the probability with which each is selected. The scheduling strategy of the collection agent is generated according to the monitor's mixed strategy.
  • The attacker's second revenue function depends on the time from when the attacker starts the attack until the attacker is detected by the monitor, the total number of device nodes infected in that period, and/or the impact of the attacker on the data service. The monitor's second revenue function depends on the time at which the monitor detects the attacker, the number of nodes infected when the monitor detects the attacker, and/or the impact on the data service at that time.
  • the main steps for determining the collection agent scheduling strategy in this embodiment include but are not limited to:
  • the revenue function of both parties is the revenue that both parties can obtain according to their type and selected actions.
  • the revenue function includes one or two of the attacker's revenue function and the monitor's revenue function.
  • the set of attack strategies is a set of actions that can be selected by an attacker.
  • the set of actions of the attacker includes but is not limited to: selecting an infection source, selecting an attack path, and selecting an attack target.
  • the monitoring party strategy set is a set of actions that the monitoring party can select, and the action set of the monitoring party refers to which collection agents the monitoring party selects to enable for monitoring.
  • the steps of constructing the fourth objective function include but are not limited to: First, determine the respective revenue functions of the two parties involved.
  • The attacker's second revenue function includes but is not limited to: the time from when the attacker starts the attack until it is detected by the monitor; the total number of nodes the attacker has infected from the start of the attack until detection; and the attacker's impact on the data service.
  • The monitor's second revenue function includes but is not limited to: the time at which the monitor detects the attack; the number of nodes infected when the monitor detects the attack; and the impact on the data service when the monitor detects the attack.
  • Then, construct the fourth objective function: from the revenue functions of both parties, the expected revenue is calculated by a weighted sum or another method, and this expected revenue is the system objective function.
  • Constraints include but are not limited to: the number of activated collection agents is less than the preset threshold; the probabilities of the strategies in the attacker's mixed strategy sum to 1; the probabilities of the strategies in the monitor's mixed strategy sum to 1; the resource consumption is less than the seventh preset threshold (for example, the remaining power of the five devices on which collection agents are installed is 20%, 45%, 50%, 75%, and 90%, respectively); the operating cost is less than the eighth preset threshold (starting each collection agent consumes a certain cost, such as manpower, financial resources, and time); and the maintenance cost is less than the ninth preset threshold (keeping the collection agents running normally also consumes a certain cost, such as manpower, financial resources, and time).
  • A mixed strategy is a strategy that each party selects with a certain probability: the monitor selects each monitoring strategy with a certain probability, and the attacker selects each attack strategy with a certain probability.
  • the steps to solve the objective function include but are not limited to: the first step is to initialize one or several strategies of the two parties involved. The initialization methods include: random selection, degree centrality, etc.
  • the second step is to solve the objective function of the current strategy set.
  • The methods for solving the objective function include but are not limited to: linear programming, gradient descent, greedy algorithm, local search, simulated annealing, genetic algorithm, ant colony algorithm, and particle swarm algorithm.
  • Solving the objective function can be divided into three cases:
  • (1) When the strategy sets of both parties are smaller than the preset threshold, the initial strategies of both parties are their entire strategy sets, and linear programming can be used directly to obtain the best objective function value and the mixed strategies of both parties.
  • (2) When the strategy sets of both parties are larger than the preset threshold, the initial strategies of each party are a subset of its entire strategy set. The probability of selecting each current strategy is calculated directly by the objective function solution method and taken as the benchmark for the next step. On this basis, each party selects a new strategy from its set of alternative strategies and adds it to its current strategy set; the solution method is then called again to solve the objective function value with the new strategy. This loops until the sets of alternative strategies of the participating parties are empty, finally yielding the best objective function value and the mixed strategies of both parties.
  • (3) When the strategy set of only one party is larger than the preset threshold, the initial strategies of the party with the smaller strategy set are its entire strategy set, and the initial strategies of the party with the larger strategy set are a subset of its entire strategy set. The probability of selecting each current strategy is calculated directly by the objective function solution method and used as the benchmark for the next step. On this basis, a new strategy is selected from the alternative strategies of the party with the larger strategy set and added to its current strategy set; the solution method is then called again to solve the objective function value with the new strategy. This loops until the set of alternative strategies of the participating parties is empty, finally yielding the best objective function value and the mixed strategies of both parties.
  • s1 represents the firewall, and the data service running on it is the UFW service; s2 and s3 represent the management servers, and the data service running on them is the SSH service; s4 represents the web server, and the data service running on it is the Apache HTTP service; s5 represents the database, and the data service running on it is the MySQL service.
  • the top 4 categories are selected as potential network threat events in this embodiment, where 1 represents brute force cracking, 2 represents DDOS attack, 3 represents XSS attack, and 4 represents SQL injection.
  • Determination of risk points: calculate the risk values of threat events based on the target network topology, data services, and threat events; construct the threat-collection tree; and determine the risk points.
  • Threat event characteristic beacon generation: in this example the data service types are the services running on the devices in the target network topology, namely the UFW service, SSH service, Apache HTTP service, and MySQL service.
  • the available collection item data can be divided into three categories: network traffic information (for example, the number of sent data packets, the number of received data packets, etc.), device status information (for example, CPU utilization, memory utilization, etc.) and Log information.
  • the log information includes but is not limited to: SSH log information, MySQL log information, HTTP log information, Web log information, firewall, IDS, etc.
  • feature data is extracted from the collected item data to form a set of threat event feature beacons.
  • The first step is to analyze the collected item data, extract the key fields, and extract from the key fields the threat detection atomic data item that can be used to detect threats: "failed password".
  • The second step is to extract the feature data of the "brute force cracking" threat event from the SSH connection failure log data of multiple collected items, analyze it with statistical methods, and generate the atomic predicate for judging the potential threat event: "number of failed SSH attempts > threshold".
  • The third step is to connect atomic predicates with logical connectives to form a threat detection rule: "number of failed SSH attempts > threshold" and "number of SSH start attempts > threshold".
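  • The three steps above can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the threshold values, the sample log lines, and the assumption that every "failed/accepted password" line counts as one SSH start attempt are all hypothetical.

```python
import re

# Hypothetical thresholds; the source only refers to a "threshold".
FAILED_THRESHOLD = 5
ATTEMPT_THRESHOLD = 10

def count_matches(log_lines, pattern):
    """Count log lines that match a case-insensitive pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    return sum(1 for line in log_lines if rx.search(line))

def brute_force_beacon(log_lines):
    """Evaluate: failed SSH attempts > threshold AND SSH start attempts > threshold."""
    # Atomic data item extracted from the key field "failed password".
    failed = count_matches(log_lines, r"failed password")
    # Assumption: each authentication outcome line is one start attempt.
    attempts = count_matches(log_lines, r"(failed|accepted) password")
    # Two atomic predicates joined by the logical connective "and".
    return failed > FAILED_THRESHOLD and attempts > ATTEMPT_THRESHOLD

sample_log = (
    ["Failed password for root from 10.0.0.9 port 4242 ssh2"] * 12
    + ["Accepted password for alice from 10.0.0.7 port 5151 ssh2"]
)
alarm = brute_force_beacon(sample_log)
```

  • Keeping each atomic predicate as its own comparison makes it straightforward to recombine predicates with other connectives for other threat events.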
  • The detailed extraction process of the other threat event feature beacons in this embodiment is not described exhaustively; the threat event characteristic beacons of this embodiment are given directly as follows:
  • the corresponding relationship between the threat event feature beacon and the collection agent can be represented by a threat-collection tree, as shown in Figure 3.
  • Risk value calculation: calculate the risk value of each potential threat event based on the confidence with which the threat event is monitored and the impact of the potential threat event.
  • the calculation methods include but are not limited to: multiplication, matrix, weighted sum, etc.
  • Determine the probability that each minimum characteristic beacon set is monitored by the collection agent: according to the relationship between the threat detection atomic data items and the collection agents, the probability that each threat detection atomic data item is monitored by the collection agent is determined by random assignment, and the probability that each minimum characteristic beacon set is monitored is then calculated by probability propagation and probability calculation methods, as shown in Table 2.
  • the relationship between the threat event characteristic beacon and the collection agent is as follows:
  • the number of hops is used as the standard to measure the physical location of the device from the edge of the network.
  • In general, the database is physically located relatively far from the edge of the network and is subject to more restrictions on logical access, so the database server is less likely to be attacked; the firewall sits at the boundary between the internal network and the external network and is exposed to illegal access and attacks, so the firewall is more likely to be attacked.
  • the triangle paradigm is used to determine the authenticity of the collected item data obtained by the corresponding collection agent of the device and the authenticity of the threat detection atomic data item.
  • The authenticity value ranges between 0 and 1, where the authenticity of the data service is 0 by default.
  • the use of 0.1 to 0.3 indicates low authenticity, 0.4 to 0.6 indicates medium authenticity, and 0.7 to 0.9 indicates high authenticity. Therefore, the authenticity of the collection agent deployed on the database server to obtain the threat detection atomic data is 0.9, and the authenticity of the collection agent deployed on the firewall server to obtain the threat detection atomic data is 0.3.
  • the authenticity of each threat detection atomic data item is consistent with the authenticity of the collection agent that generated it, as shown in Table 3.
  • The authenticity of each threat detection atomic data item and threat event characteristic beacon is then determined. Because each threat detection atomic data item is generated from the collected item data of a particular collection agent, its authenticity is consistent with the authenticity of the collection agent that generated it.
  • When a minimum characteristic beacon set contains two or more characteristic beacons, the lowest authenticity among them is taken as the authenticity of the entire set. For example, if one characteristic beacon in a set comes from s2, whose authenticity is 0.3, and another comes from s3, whose authenticity is 0.5, the authenticity of that minimum characteristic beacon set is 0.3.
  • the authenticity of the minimum characteristic beacon set is shown in Table 4.
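  • The lowest-value rule for a minimum characteristic beacon set can be written in a few lines. The sketch below reuses the s2/s3 example values from the text; the function name and the dictionary layout are illustrative assumptions.

```python
def beacon_set_authenticity(beacon_sources, agent_authenticity):
    """Authenticity of a beacon set = lowest authenticity among the
    collection agents (devices) that produced its beacons."""
    return min(agent_authenticity[source] for source in beacon_sources)

# Example values from the text: s2 has authenticity 0.3, s3 has 0.5.
agent_authenticity = {"s2": 0.3, "s3": 0.5}
set_authenticity = beacon_set_authenticity(["s2", "s3"], agent_authenticity)
```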
  • the weighted sum method is used to calculate the confidence that the potential threat event is monitored by the collection agent.
  • the formula is as follows :
  • p ⁇ represents the confidence that any one of the potential threat events ⁇ is monitored by the collection agent
  • ⁇ i represents the i-th smallest feature beacon set corresponding to ⁇
  • ⁇ ( ⁇ ) represents all the smallest features corresponding to ⁇ Collection of beacon collections
  • the confidence levels of potential threat events detected by the collection agent are:
  • the impact of potential threat events is mainly described from the perspective of security attributes, which can be evaluated in three aspects: system confidentiality (Confidentiality), system integrity (Integrity), and system availability (Availability) .
  • The values of the above three aspects lie between 0 and 5, and the impact level lies between I and V: level I represents extremely low impact, level II low impact, level III medium impact, level IV high impact, and level V extremely high impact.
  • P ⁇ represents the confidence that the potential threat event ⁇ is monitored by the collection agent
  • I ⁇ represents the impact value of the potential threat event ⁇ .
  • the risk value of the potential threat event is calculated as follows:
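  • As an illustration of the multiplication method listed among the calculation methods, the sketch below multiplies a confidence by an impact value and ranks the threat events. The numeric confidence and impact values are hypothetical and are not the values of this embodiment.

```python
def risk_value(confidence, impact):
    """Multiplication method: risk = monitored confidence x impact value."""
    return confidence * impact

# Hypothetical (confidence, impact) pairs; impact lies on the 0-5 scale.
threats = {
    "brute force": (0.5, 4.0),
    "DDOS": (0.8, 3.0),
    "XSS": (0.4, 2.0),
    "SQL injection": (0.7, 5.0),
}

risks = {name: risk_value(p, i) for name, (p, i) in threats.items()}
# Sort threat events from highest to lowest risk value.
ranked = sorted(risks, key=risks.get, reverse=True)
```

  • A weighted sum or a risk matrix could be substituted for `risk_value` without changing the ranking step.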
  • the threat characteristic beacon corresponding to the potential threat event ⁇ 1 is with The threat characteristic beacon corresponding to ⁇ 2 is with The threat characteristic beacon corresponding to ⁇ 3 is with The threat characteristic beacon corresponding to ⁇ 4 is with Threat signature beacon
  • the corresponding network device node is v 1
  • threat characteristic beacon with The corresponding network device node is v 2
  • threat characteristic beacon with The corresponding network device node is v 3
  • threat characteristic beacon with The corresponding network device node is v 4
  • threat characteristic beacon The corresponding network device node is v 5.
  • the risk points are network device nodes v 1 , v 2 , v 3 , v 4 , and v 5 .
  • the first objective function is calculated according to the knapsack algorithm.
  • The collection agents are homogeneous embedded collection agents with no difference in capability. Because the types of devices on which they are deployed and the data services running on those devices differ, only differences in the collected item data are considered.
  • The hostile environment should also be taken into account. Therefore, when determining the locations of the collection agents in this embodiment, that is, when optimizing the second objective function, the monitor minimizes the attacker's maximum attack impact.
  • A greedy algorithm is used to determine the locations of the collection agents. The goal is to choose a value z that is as small as possible: for each value of z, the lowest-cost set S_d can be found such that R_i(S_d) ≤ z is satisfied for all potential threats i. For z > 0, the following definition is used:
  • the function obtained by truncating R_i at z is defined first, followed by its average value:
  • First, the maximum value z_max and the minimum value z_min are computed, where z_max is the attacker's maximum utility, obtained when no collection agents are deployed, and z_min is the attacker's minimum utility, obtained when collection agents are deployed on all device nodes. Second, z is set to the mean of z_max and z_min.
  • The corresponding revenue can then be calculated. Next, the greedy algorithm is called: according to the mean z and this revenue, it finds in turn, in each round, the device node ID combination with the largest absolute value and assigns it to S_dbest; if the number of selected collection agents does not satisfy the required number of 3, the current value of z is assigned to z_max or z_min. Finally, the greedy algorithm is called again, and the loop continues until the deployment set that satisfies the objective function is found. Note that every call of the greedy algorithm starts from the empty set. The computed result is the devices labeled 1, 3, and 4; these three points are the deployment positions.
  • the greedy algorithm is shown in Figure 4, and the collection agent deployment algorithm flowchart is shown in Figure 5.
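  • The bisection-plus-greedy loop described above can be sketched as follows. This is an illustrative reading of the text, not the algorithm of Figures 4 and 5: the residual-risk model in `residual`, the `impact` and `detect` tables, and the use of max(R_i, z) as the truncation are all assumptions made so that the example is runnable.

```python
from math import prod

def residual(threat, deployed, impact, detect):
    """Attacker utility R_i(S) for one threat: impact scaled by the chance
    that no deployed agent observes it (hypothetical model)."""
    return impact[threat] * prod(1.0 - detect[v].get(threat, 0.0) for v in deployed)

def avg_truncated(deployed, z, impact, detect):
    """Mean over threats of max(R_i(S), z); equals z once every R_i(S) <= z."""
    total = sum(max(residual(t, deployed, impact, detect), z) for t in impact)
    return total / len(impact)

def greedy_cover(z, nodes, impact, detect):
    """Start from the empty set and add, round by round, the node with the
    largest reduction until R_i(S) <= z holds for every threat."""
    chosen = []
    while avg_truncated(chosen, z, impact, detect) > z + 1e-12:
        rest = [v for v in nodes if v not in chosen]
        if not rest:
            return None  # z cannot be reached even with every node
        best = min(rest, key=lambda v: avg_truncated(chosen + [v], z, impact, detect))
        chosen.append(best)
    return chosen

def deploy(nodes, impact, detect, k, tol=1e-3):
    """Bisect on z between z_min and z_max, keeping the best set of size <= k."""
    z_max = max(residual(t, [], impact, detect) for t in impact)     # no agents
    z_min = max(residual(t, nodes, impact, detect) for t in impact)  # all nodes
    best = []
    while z_max - z_min > tol:
        z = (z_max + z_min) / 2
        candidate = greedy_cover(z, nodes, impact, detect)
        if candidate is not None and len(candidate) <= k:
            best, z_max = candidate, z   # feasible: try a smaller z
        else:
            z_min = z                    # infeasible: relax z
    return sorted(best)

# Hypothetical network: five nodes, three threats, detection probabilities.
nodes = ["v1", "v2", "v3", "v4", "v5"]
impact = {"t1": 4.0, "t2": 3.0, "t3": 2.0}
detect = {"v1": {"t1": 0.9}, "v2": {"t1": 0.5}, "v3": {"t2": 0.9},
          "v4": {"t3": 0.9}, "v5": {"t3": 0.5}}
placement = deploy(nodes, impact, detect, k=3)
```

  • As in the text, every call of the greedy step starts from the empty set, and the current z replaces z_max or z_min depending on whether the selected set fits the allowed number of agents.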
  • the following is an example of the collection agent scheduling method.
  • each node represents a deployed collection agent.
  • the ability of the collection agent is the ability of the collection agent itself to obtain the collected item data, and the attacker's ability is that the attacker can select any node in the target network as the source of infection for spreading the virus.
  • the strategy of the monitor is to select k collection agents from the 7 device nodes of the target network to start, and the monitor has a total of c(k,n) alternative strategies.
  • the attacker's strategy is to select a point from the 7 nodes in the target network as the source of infection.
  • the attacker has 7 alternative strategies.
  • the policy space collection threshold is set to 20.
  • The number k of collection agents to be started must be less than the preset threshold, which can be determined by constructing the third objective function and its constraints.
  • the value of k is set to 3.
  • The probabilities with which the monitor selects each strategy from the alternative strategies sum to 1.
  • The monitor's second revenue function is chosen to minimize the time until the attacker is detected by the monitor, that is, the monitor detects the attacker as early as possible; the attacker's second revenue function is chosen to maximize that same quantity.
  • the weighted sum method is used to calculate the expected income, and the fourth objective function of the whole system is constructed.
  • A represents any attacker strategy
  • D represents any monitoring strategy
  • represents the time the attacker is monitored by the monitor when the monitor chooses D and the attacker chooses A.
  • the fourth objective function of the whole system is as follows:
  • the restrictive conditions of the two-party strategy are the equations and inequalities in the fourth objective function.
  • A is the attack strategy selected by the attacker;
  • D is the monitoring strategy selected by the monitor;
  • U is the system objective function;
  • U d is the monitor’s revenue function;
  • x is the monitor's mixed strategy, under which strategy D in the alternative strategy set is selected with probability x_D.
  • the scale of the strategy space set is judged.
  • the attacker strategy set scale is 7, and the monitor strategy set scale is 35. Therefore, it is in line with the third case of solving the objective function: the strategy set scale of one of the two parties is greater than the preset threshold.
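  • The strategy-set sizes quoted above can be checked by enumeration. The sketch assumes the node names v0 to v6 and treats the threshold of 20 as a plain size comparison; the variable names are illustrative.

```python
from itertools import combinations
from math import comb

nodes = [f"v{i}" for i in range(7)]   # 7 device nodes, v0..v6
k = 3                                 # collection agents started per strategy
threshold = 20                        # strategy space threshold from the text

# Monitor: every way of choosing k of the 7 collection agents.
monitor_strategies = list(combinations(nodes, k))
# Attacker: one infection source chosen from the 7 nodes.
attacker_strategies = [(v,) for v in nodes]

monitor_large = len(monitor_strategies) > threshold
attacker_large = len(attacker_strategies) > threshold
```

  • Only the monitor's set exceeds the threshold, which corresponds to the third case of solving the objective function.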
  • the steps to solve the fourth objective function are as follows:
  • The attacker has 7 alternative strategies, and all of them, {v0}, {v1}, {v2}, {v3}, {v4}, {v5}, {v6}, are used as the attacker's initial strategies. The monitor is initialized by random selection: one strategy, {v4,v5,v3}, is randomly selected from all c(3,7) alternative strategies of the monitor as the monitor's initial strategy.
  • Linear programming can be used to calculate, from the initial strategies, the current objective function value, the monitor's current mixed strategy, and the attacker's current mixed strategy, and these three are used as benchmarks. On this basis, the greedy algorithm is used to find a new monitor strategy that improves the objective function value. This loops until the sets of alternative strategies of both parties are empty, finally yielding the value of the fourth objective function and the monitor's mixed strategy for scheduling the collection agents.
  • the collection agent scheduling strategy process is shown in Figure 6.
  • The monitor's mixed strategy is: select {v2,v5,v6} with probability 0.278624, {v3,v5,v6} with probability 0.0248471, {v0,v3,v6} with probability 0.246089, {v2,v3,v6} with probability 0.029415, {v2,v3,v5} with probability 0.162656, {v1,v3,v4} with probability 0.230108, and {v3,v4,v6} with probability 0.0282604.
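  • To show how such a mixed strategy turns into a concrete schedule, the sketch below draws one combination of collection agents per scheduling round using the probabilities listed above. The function name and the fixed random seed are illustrative assumptions; a real scheduler would not fix the seed.

```python
import random

# Monitor's mixed strategy from the example: agent combination -> probability.
mixed_strategy = {
    ("v2", "v5", "v6"): 0.278624,
    ("v3", "v5", "v6"): 0.0248471,
    ("v0", "v3", "v6"): 0.246089,
    ("v2", "v3", "v6"): 0.029415,
    ("v2", "v3", "v5"): 0.162656,
    ("v1", "v3", "v4"): 0.230108,
    ("v3", "v4", "v6"): 0.0282604,
}

def sample_schedule(strategy, rng):
    """Pick the set of collection agents to start in the next round."""
    combos = list(strategy)
    weights = list(strategy.values())
    return rng.choices(combos, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed only so the example is repeatable
schedule = sample_schedule(mixed_strategy, rng)
```

  • Because each round is drawn independently, an attacker who observes past rounds learns only the probabilities, not which agents will be active next.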
  • a collection agent deployment device is provided, which is used to implement the methods in the foregoing embodiments. Therefore, the descriptions and definitions in each embodiment of the foregoing collection agent deployment method can be used for the understanding of each execution module in the embodiments of the present application.
  • Fig. 7 is a schematic diagram of the overall structure of a collection agent deployment device. The device includes a construction module 701, an acquisition module 702, a determination module 703, and a deployment module 704; among them:
  • The construction module 701 is used to construct the threat-collection tree of the network according to the target network-data service database, the data service-threat event database, the threat event-feature beacon database, and the collection agent-threat detection atomic data item database. The target network-data service database stores the correspondence between the target network topology and the data services provided by the target network; the data service-threat event database stores the correspondence between data services and the potential threat events they face; the threat event-feature beacon database stores the correspondence between potential threat events and the threat event feature beacons that can discover them; and the collection agent-threat detection atomic data item database stores the correspondence between collection agents and the threat detection atomic data items that they can collect for detecting potential threat events.
  • The obtaining module 702 is used to obtain, for any one of the potential threat events, the risk value of that potential threat event according to the confidence with which it is monitored by the collection agent and its impact.
  • The determining module 703 is used to determine whether each device node is a risk point according to the risk value of each potential threat event and the threat-collection tree.
  • The deployment module 704 is used to select deployment points and deploy the collection agents according to the risk points in the network, the collection capability of the collection agents, and the preset constraints.
  • This embodiment calculates the risk values of threat events based on the target network topology, data services, and threat events, constructs a threat-collection tree, determines risk points, and determines the deployment locations of the collection agents based on the risk points, the threat-collection tree, the collection agent capabilities, and the collection constraints, thereby improving data collection capability and reducing the resources consumed by data collection and analysis.

Abstract

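This application provides a collection agent deployment method and device. The method includes: constructing a threat-collection tree of the network according to a target network-data service database, a data service-potential threat event attribute database, a threat event-feature beacon database, and a collection agent-threat detection atomic data item database; for any potential threat event, obtaining the risk value of the potential threat event according to the confidence with which the potential threat event is monitored by the collection agent and the impact of the potential threat event; determining whether each device node is a risk point according to the risk value of each potential threat event and the threat-collection tree; and deploying the collection agent on the risk points according to the risk points in the network, the collection capability of the collection agent, and preset constraints. This application improves data collection capability and reduces the resources consumed by data collection and analysis.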

Description

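Collection Agent Deployment Method and Device
Cross-Reference to Related Applications
This application claims priority to Chinese patent application No. 201910509683.6, filed on June 13, 2019 and entitled "Collection Agent Deployment Method and Device", which is incorporated herein by reference in its entirety.
Technical Field
This application belongs to the technical field of network security, and in particular relates to a collection agent deployment method and device.
Background
Large-scale complex information networks contain a large number of important devices and systems. To monitor the operating status of these devices and systems and discover potential threats in time, collection agents need to be deployed to collect the operating status of the devices and systems and the massive data and logs they generate.
Existing collection agent deployment schemes mainly deploy collection agents on nodes where data is generated and aggregated. Existing deployment approaches mainly consider factors such as network topology or deployment cost, and generally use methods such as mirroring to collect data. However, this deployment approach is not suitable for large-scale complex information networks, because different collection agents have different collection capabilities and different attackers have different capabilities. For collection agents with different collection capabilities and attackers with different attack capabilities, considering only factors such as network topology or deployment cost during deployment easily leads to over-collection or under-collection of data. Over-collection means that a large number of collection agents are deployed in the network, so that too much is collected and the collected content is redundant, which consumes substantial deployment, collection, and maintenance costs. Under-collection means that, under collection cost constraints, no collection agent, or no collection agent with the corresponding collection capability, is deployed at important risk points, so that data closely related to threats cannot be obtained and the subsequent analysis of potential threat events cannot be supported.
In summary, existing collection agent deployment methods consider only factors such as network topology or deployment cost; for collection agents with different collection capabilities and attackers with different attack capabilities, deploying collection agents in this way easily causes over-collection or under-collection.
Summary
To overcome the problem that existing collection agent deployment methods easily cause over-collection or under-collection, or to at least partially solve that problem, embodiments of this application provide a collection agent deployment method and device.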
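According to a first aspect of the embodiments of this application, a collection agent deployment method is provided, including:
constructing a threat-collection tree of the network according to a target network-data service database, a data service-threat event database, a threat event-feature beacon database, and a collection agent-threat detection atomic data item database; wherein the target network-data service database stores the correspondence between the target network topology and the data services provided by the target network, the data service-threat event database stores the correspondence between data services and the potential threat events they face, the threat event-feature beacon database stores the correspondence between potential threat events and the threat event feature beacons that can discover those potential threat events, and the collection agent-threat detection atomic data item database stores the correspondence between collection agents and the threat detection atomic data items that the collection agents can collect for detecting potential threat events;
for any one of the potential threat events, obtaining the risk value of the potential threat event according to the confidence with which the potential threat event is monitored by the collection agent and the impact of the potential threat event;
determining whether a device node is a risk point according to the risk values of the potential threat events and the threat-collection tree;
selecting deployment points and deploying collection agents according to the risk points in the network, the collection capability of the collection agents, and preset constraints.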
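According to a second aspect of the embodiments of this application, a collection agent deployment device is provided, including:
a construction module, configured to construct a threat-collection tree of the network according to a target network-data service database, a data service-threat event database, a threat event-feature beacon database, and a collection agent-threat detection atomic data item database; wherein the target network-data service database stores the correspondence between the target network topology and the data services provided by the target network, the data service-threat event database stores the correspondence between data services and the potential threat events they face, the threat event-feature beacon database stores the correspondence between potential threat events and the threat event feature beacons that can discover those potential threat events, and the collection agent-threat detection atomic data item database stores the correspondence between collection agents and the threat detection atomic data items that the collection agents can collect for detecting potential threat events;
an obtaining module, configured to obtain, for any one of the potential threat events, the risk value of the potential threat event according to the confidence with which the potential threat event is monitored by the collection agent and the impact of the potential threat event;
a determining module, configured to determine whether each device node is a risk point according to the risk value of each potential threat event and the threat-collection tree;
a deployment module, configured to select deployment points and deploy collection agents according to the risk points in the network, the collection capability of the collection agents, and preset constraints.
According to a third aspect of the embodiments of this application, an electronic device is further provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, by invoking the program instructions, is able to execute the collection agent deployment method provided by any one of the possible implementations of the first aspect.
According to a fourth aspect of the embodiments of this application, a non-transitory computer-readable storage medium is further provided, the non-transitory computer-readable storage medium storing computer instructions that cause the computer to execute the collection agent deployment method provided by any one of the possible implementations of the first aspect.
Embodiments of this application provide a collection agent deployment method and device. The method calculates the risk values of threat events according to the network topology, data services, and potential threat events, constructs a threat-collection tree, determines risk points, and determines the deployment locations of collection agents according to the risk points, the threat-collection tree, the collection agent capabilities, and the collection constraints, thereby improving data collection capability and reducing the resources consumed by data collection and analysis.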
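Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of this application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the overall collection agent deployment method provided by an embodiment of this application;
Fig. 2 is a schematic structural diagram of the threat-collection tree in the collection agent deployment method provided by an embodiment of this application;
Fig. 3 is a schematic structural diagram of the threat-collection tree in the collection agent deployment method provided by another embodiment of this application;
Fig. 4 is a schematic flowchart of the greedy algorithm in the collection agent deployment method provided by an embodiment of this application;
Fig. 5 is a schematic flowchart of the deployment algorithm in the collection agent deployment method provided by an embodiment of this application;
Fig. 6 is a schematic flowchart of the collection agent scheduling strategy in the collection agent deployment method provided by an embodiment of this application;
Fig. 7 is a schematic diagram of the overall structure of the collection agent deployment device provided by an embodiment of this application.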
具体实施方式
为了更清楚地说明本申请实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作一简单地介绍,显而易见地,下面描述中的附图是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
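One embodiment of the present application provides a collection agent deployment method. FIG. 1 is a schematic diagram of the overall flow of the method, which includes: S101, constructing a threat-collection tree of the network according to a target network-data service library, a data service-threat event library, a threat event-feature beacon library and a collection agent-threat detection atomic data item library; wherein the target network-data service library stores the correspondence between the target network topology and the data services provided by the target network, the data service-threat event library stores the correspondence between data services and the potential threat events they face, the threat event-feature beacon library stores the correspondence between potential threat events and the threat event feature beacons capable of discovering them, and the collection agent-threat detection atomic data item library stores the correspondence between collection agents and the threat detection atomic data items that they can collect for detecting potential threat events;
Here, a data service is a business service running on a device node of the target network; data service types include, but are not limited to, Web services, FTP services and database services. A threat event is an attack event that may affect the target network and/or an attack event that has already affected the target network, and can be described by any combination of one or more threat event feature attributes.
The attributes of a potential threat event include, but are not limited to, the threat event type, the threat event level, the threat event impact, and the confidence with which the potential threat event is detected. Threat event types include, but are not limited to, DDoS (Distributed Denial of Service) attacks, brute-force cracking, XSS (Cross-Site Scripting) attacks, SQL (Structured Query Language) injection, worm attacks, Trojan attacks, traffic hijacking and spoofing attacks. The threat event level indicates the severity of the threat; methods for determining it include, but are not limited to, empirical knowledge and fuzzy statistics. For example, it can be measured with discrete values, using integers from 1 to 5, where a larger number indicates a more severe threat. The threat event impact is the impact of the threat event on the target network and can be described in terms of the security attributes of the target network, which include, but are not limited to, system integrity, system availability and system confidentiality. Methods for determining the threat event impact include, but are not limited to, expert knowledge, probability statistics and fuzzy statistics. The confidence with which a potential threat event is detected refers to the authenticity with which the potential threat event is detected by a minimal feature beacon set. Methods for determining this confidence include, but are not limited to, expert knowledge, fuzzy statistics, weighted summation and probability analysis.
A minimal feature beacon set consists of one or more threat event feature beacons, and each minimal feature beacon set is sufficient to detect one potential threat event. Note that the same potential threat event can be detected by one or more minimal feature beacon sets. A threat event feature beacon is a threat detection rule obtained by extracting, from the collection item data of the collection agents, the threat detection atomic data items usable for detecting potential threat events, generating atomic predicates for judging potential threat events, and joining these predicates with logical connectives. For example, "number of SYN half-open connections > Φ1 and TCP traffic > Φ2" is a threat event feature beacon for detecting DoS attacks, where Φ1 and Φ2 are thresholds, "number of SYN half-open connections" and "TCP traffic" are threat detection atomic data items obtainable from the collection item data, and "and" is the logical connective. Threat event feature beacons are shown in layer 4 of FIG. 2. The threat-collection tree describes, in the form of a tree, the correspondence among data services, potential threat events, threat event feature beacons and collection agents, as shown in FIG. 2. The scope of potential threat events is not limited to those involved in the embodiments of this patent and can be broader.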
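The beacon definition above (atomic predicates over atomic data items, joined by a logical connective) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the class name, the field names `syn_half_open` and `tcp_bytes`, and the threshold values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Sample = Dict[str, float]  # one observation of atomic data items (hypothetical shape)


@dataclass(frozen=True)
class AtomicPredicate:
    """Atomic predicate of the form `item > threshold`."""
    item: str
    threshold: float

    def holds(self, sample: Sample) -> bool:
        # A missing data item is treated as 0, i.e. the predicate does not hold.
        return sample.get(self.item, 0.0) > self.threshold


def beacon_and(predicates: List[AtomicPredicate]) -> Callable[[Sample], bool]:
    """Join atomic predicates with the logical connective `and`."""
    def rule(sample: Sample) -> bool:
        return all(p.holds(sample) for p in predicates)
    return rule


# Hypothetical thresholds standing in for Φ1 and Φ2.
PHI_1, PHI_2 = 100.0, 5_000_000.0
dos_beacon = beacon_and([
    AtomicPredicate("syn_half_open", PHI_1),
    AtomicPredicate("tcp_bytes", PHI_2),
])

print(dos_beacon({"syn_half_open": 250, "tcp_bytes": 9_000_000}))
print(dos_beacon({"syn_half_open": 250, "tcp_bytes": 10}))
```

Keeping predicates as data rather than hard-coded conditions makes it straightforward to store beacons in a library and to list which atomic data items (and therefore which collection agents) each beacon depends on.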
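S102, for any potential threat event, obtaining the risk value of the potential threat event according to the confidence with which it is detected by the collection agents and its impact;
When calculating the risk value, the threat event risk value is computed from the confidence with which the potential threat event is detected and the impact of the potential threat event; calculation methods include, but are not limited to, the multiplication method, the matrix method and the weighted-sum method. Factors that determine the confidence with which a potential threat event is detected include, but are not limited to, the probability that the threat detection atomic data items are detected by the collection agents and the likelihood that devices in the system are attacked. The steps for determining the probability that a threat detection atomic data item is detected by a collection agent include, but are not limited to: according to the correspondence between threat detection atomic data items and collection agents, determining this probability by random assignment, fixed-value selection, Monte Carlo simulation, probability analysis or similar methods. The steps for determining the likelihood that a device node in the target network is attacked include, but are not limited to: according to the position of the device in the target network system (for example, the number of hops from the external network), determining this likelihood by random assignment, fixed-value selection, Monte Carlo simulation, probability analysis or similar methods (for example, the fewer the hops from the external network, the more likely the device is to be attacked).
S103, determining whether a device node is a risk point according to the risk values of the potential threat events and the threat-collection tree;
The risk points are determined from the risk values of the potential threat events, the correspondence between threat event feature beacons and potential threat events in the threat-collection tree, and the relationship between threat detection atomic data items and collection agents. First, the risk values of all potential threat events are sorted, and the potential threat events whose risk value exceeds a first preset threshold are selected. Second, using the correspondence between potential threat events and threat feature beacons and the relationship between threat detection atomic data items and target network device nodes in the threat-collection tree (layers 3 to 5 of FIG. 2), the device nodes on which the threat detection atomic data items can be collected are determined; these device nodes are the locations of the risk points.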
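The two-stage selection in S103 can be sketched as a filter followed by a lookup through the tree. This is a minimal sketch: the dictionary layout of the threat-collection tree, the identifiers and the toy numbers are assumptions for illustration only.

```python
from typing import Dict, Iterable, List


def risk_points(
    risk: Dict[str, float],
    threat_to_beacons: Dict[str, Iterable[str]],
    beacon_to_nodes: Dict[str, Iterable[str]],
    threshold: float,
) -> List[str]:
    """Return the device nodes able to collect data for high-risk threats."""
    # Step 1: sort threats by risk and keep those above the first preset threshold.
    ranked = sorted(risk.items(), key=lambda kv: kv[1], reverse=True)
    selected = [threat for threat, value in ranked if value > threshold]

    # Step 2: follow threat -> beacon -> device node (layers 3 to 5 of the tree).
    nodes = set()
    for threat in selected:
        for beacon in threat_to_beacons.get(threat, ()):
            nodes.update(beacon_to_nodes.get(beacon, ()))
    return sorted(nodes)


# Hypothetical toy tree, not taken from the patent figures.
toy_risk = {"t1": 9.0, "t2": 1.0}
toy_threat_to_beacons = {"t1": ["b1", "b2"], "t2": ["b3"]}
toy_beacon_to_nodes = {"b1": ["v1"], "b2": ["v1", "v2"], "b3": ["v5"]}

print(risk_points(toy_risk, toy_threat_to_beacons, toy_beacon_to_nodes, 1.5))
```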
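S104, selecting deployment points and deploying collection agents according to the risk points in the target network, the collection capability of the collection agents and preset constraints.
According to the risk points, the threat-collection tree, the collection agent capability and the collection constraints (including cost constraints, QoS constraints and so on), a deployment point selection algorithm is invoked to determine the deployment positions of the collection agents. The elements describing a risk point include, but are not limited to, its location, quantity and type. By data service, risk point types can be divided into Web services, FTP services, database services, application services and so on. The collection agent capability is the ability of a collection agent to obtain collected content from devices and network systems. The deployment constraints can be described in terms of cost constraints and QoS (Quality of Service) constraints. Cost includes, but is not limited to, the purchase cost, deployment cost, maintenance cost and resource cost of the collection agents, where resource cost includes, but is not limited to, battery power, bandwidth and the current running state of the device. QoS includes, but is not limited to, the availability of the collected data, throughput, the delay of the collected data, delay variation and packet loss rate.
This embodiment calculates the risk values of threat events from the topology of the target network, the data services and the threat events, constructs a threat-collection tree, determines risk points, and determines the deployment positions of collection agents according to the risk points, the threat-collection tree, the collection agent capability and the collection constraints, thereby improving data collection capability and reducing the resources consumed by data collection and analysis.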
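On the basis of the above embodiment, in this embodiment the following is further included before the step of constructing the threat-collection tree of the target network: obtaining the collection item data of the network, the collection item data including network traffic information, device status information and log information; analyzing the collection item data, extracting key fields from it, and extracting from the key fields the threat detection atomic data items used for detecting the potential threat events, wherein the collection item data is historically collected data and/or currently collected data; analyzing the threat detection atomic data items to generate atomic predicates for judging the potential threat events; and joining the atomic predicates with logical connectives to generate threat event feature beacons capable of detecting the potential threat events.
The collection item data includes, but is not limited to, network traffic information (for example, the number of packets sent and the number of packets received), device status information (for example, CPU utilization and memory utilization) and log information. Log information includes, but is not limited to, operating system log data (for example, Windows systems and Linux systems), log data of transmission devices such as routers and switches deployed in the target network (for example, bandwidth and traffic), run logs of specific services recorded on hosts (for example, SSH, MySQL, HTTP and Web) and security device log data (for example, firewalls and IDS).
A threat detection atomic data item is a piece of marker data related to a potential threat event that is collected directly from, or extracted indirectly from, the collection item data. Extraction of threat detection atomic data items can be divided into extraction of feature data for known threat events and extraction of feature data for unknown threat events. Extraction methods for known threat events include, but are not limited to, expert knowledge bases, probability statistics, comparison against attack sequence templates, causal relationships and hierarchical correlation analysis; extraction methods for unknown threat events include, but are not limited to, fuzzy statistics, Bayesian networks and machine learning. The collection item data is historically collected data and/or currently collected data.
A potential threat event is an attack event, identified by analyzing the collection item data, that may affect the target network and/or has already affected the target network; it can also be described by any combination of one or more threat event feature attributes. The steps for generating a potential threat event feature beacon include, but are not limited to: first, analyzing the collection item data, extracting key fields (for example, converting unstructured information into structured information), and extracting from the key fields the threat detection atomic data items usable for detecting potential threat events; second, analyzing the threat detection atomic data items with statistical methods or the like to generate atomic predicates for judging potential threat events; third, joining these atomic predicates with logical connectives to generate feature beacons capable of detecting the potential threat events. The correspondence between threat event feature beacons and potential threat events can be described with a list or by building a threat tree, as shown in layers 3 and 4 of FIG. 2.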
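On the basis of the above embodiment, in this embodiment the following is further included before the step of obtaining the risk value of the potential threat event according to the confidence with which it is detected by the collection agents and its impact: according to the correspondence between collection agents and the threat detection atomic data items they can collect for detecting threats, determining the probability that each threat detection atomic data item in the feature beacons of the potential threat event is detected by the collection agents; according to these probabilities and based on a probability propagation method, calculating the probability that the set of threat detection atomic data items corresponding to each minimal feature beacon set of the potential threat event is detected by the collection agents, wherein a minimal feature beacon set of the potential threat event is a set of threat event feature beacons that can detect the potential threat event and of which no proper subset can detect it; determining the likelihood that each device node is attacked according to its position in the network system and/or its defense level; according to the likelihood that a device node is attacked, calculating the authenticity of the threat detection atomic data items obtained by the collection agent on that device node; according to the authenticity of the threat detection atomic data items, calculating the authenticity of the corresponding minimal feature beacon set; and according to the probability that the minimal feature beacon set is detected and the authenticity of the minimal feature beacon set, determining the confidence of the potential threat event corresponding to the minimal feature beacon set detected by the collection agents.
Specifically, determining the confidence with which a potential threat event is detected mainly includes the following steps. First, determine the probability that the threat detection atomic data items are detected by the collection agents and the likelihood that devices in the system are attacked. Second, according to the likelihood that devices in the system are attacked, use a triangular norm or similar method to calculate the authenticity of the collection item data obtained by the collection agent of each device and the authenticity of the threat detection atomic data items. Third, from the authenticity of the threat detection atomic data items and the threat event feature beacons, calculate the authenticity of the potential threat event corresponding to the detected data. Finally, from the probability that the threat detection atomic data items are detected by the collection agents and the authenticity of the potential threat event corresponding to the detected data, calculate the confidence with which the potential threat event is detected using a weighted-sum method.
On the basis of the above embodiment, in this embodiment the confidence with which the potential threat event is detected by the collection agents is determined from the probability that each of its minimal feature beacon sets is detected by the collection agents and the authenticity of each of its minimal feature beacon sets by the following formula: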
Figure PCTCN2019092999-appb-000001
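where p_ψ denotes the confidence with which any potential threat event ψ is detected by the collection agents, τ_i denotes the i-th minimal feature beacon set corresponding to ψ, γ(ψ) denotes the set of all minimal feature beacon sets corresponding to ψ,
Figure PCTCN2019092999-appb-000002
denotes the probability that τ_i is detected by the collection agents, and
Figure PCTCN2019092999-appb-000003
denotes the authenticity of τ_i.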
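On the basis of the above embodiment, in this embodiment the following is further included before the step of obtaining the risk value of the potential threat event according to the confidence with which it is detected by the collection agents and its impact: evaluating the security attributes of the target network that are involved in the potential threat event, the security attributes including integrity, availability and confidentiality; and determining the impact of the potential threat event according to the evaluation result.
The impact of a potential threat event is its impact on the target network and can be described in terms of the security attributes of the target network, which include, but are not limited to, system integrity, system availability and system confidentiality. The security attributes involved in the potential threat event are evaluated, and the impact of the potential threat event is determined according to the evaluation result.
On the basis of the above embodiment, in this embodiment the step of deploying the collection agents on the risk points according to the risk points in the target network, the collection capability of the collection agents and preset constraints specifically includes:
1) constructing a first objective function, determining its constraints, and solving it to obtain the number of collection agents to be deployed; the first objective function includes any one or more of: maximizing the collection utility, minimizing the deployment cost of the collection agents, and minimizing the resource consumption of the collection agents; the constraints of the first objective function include any one or more of: the cost of deploying the collection agents is less than the total deployment budget, the collection utility is not lower than a second preset threshold, and the resource consumption of the collection agents does not exceed a third preset threshold;
2) constructing a second objective function, determining its constraints, and solving it to obtain the positions at which collection agents are to be deployed; the second objective function includes an attacker first payoff function and/or a monitor first payoff function; the attacker first payoff function includes any one or more of: maximizing the impact of the attacker on the device nodes, maximizing the time until the attacker is detected by the collection agents, and maximizing the number of infected device nodes at the time the attacker is detected; the monitor first payoff function includes any one or more of: minimizing the cost of the collection agents, maximizing the validity of the collection item data obtained by the collection agents, and minimizing the attacker first payoff function; the constraints of the second objective function include any one or more of: the number of collection agents is less than a fourth preset threshold, the risk value caused by each potential threat event is less than a fifth preset threshold, and the monitoring time of the collection agents is less than a sixth preset threshold; the deployment positions of the collection agents are obtained from the second objective function and its constraints using a heuristic or non-heuristic algorithm.
Specifically, collection agent deployment mainly includes three steps: determining the number of collection agents, determining the deployment points of the collection agents, and carrying out the deployment. The flow is as follows:
(1) Determining the number of collection agents: the number of collection agents to deploy is determined according to the cost constraints and QoS constraints.
The specific steps for determining the number of collection agents include, but are not limited to: first, constructing the first objective function, which includes, but is not limited to, any one or more of maximizing the collection utility, minimizing the deployment cost of the collection agents, and minimizing the resource consumption of the collection agents; second, selecting the constraints, which include, but are not limited to, any one or more of: the cost of deploying the collection agents is less than the total deployment budget, the collection utility is not lower than the second preset threshold, and the resource consumption of the collection agents does not exceed the third preset threshold; finally, solving the first objective function, for which methods include, but are not limited to, the knapsack algorithm, multi-objective programming and local search. Note that a quantity chosen as the optimization objective of the first objective function must not also appear in the constraints. For example, if the first objective function is to maximize the collection utility, then "the collection utility is not lower than a minimum basic utility value" cannot be used as a constraint.
(2) Determining the collection agent positions: the monitor objective function is constructed according to the risk points and the number of collection agents, and the deployment points of the collection agents are determined.
The specific steps for determining the collection agent positions include, but are not limited to the following. First, construct the second objective function: (i) from the attacker's perspective, select the attacker first payoff function, which includes, but is not limited to, maximizing the impact of the attacker on the device nodes or network system, maximizing the time until the attacker is detected, and maximizing the number of device nodes or network systems infected at the time the attacker is detected; (ii) from the monitoring perspective, select the monitor first payoff function, which includes, but is not limited to, minimizing the collection cost, maximizing the validity of the collected information, and minimizing the attacker's payoff. Second, select the constraints, which include, but are not limited to, any one or more of: the number of collection agents is less than the fourth preset threshold, the risk value of each potential threat event is less than the fifth preset threshold, and the monitoring time of the collection agents is less than the sixth preset threshold. Then, solve the second objective function; methods include, but are not limited to, the greedy algorithm, local search, simulated annealing, genetic algorithms, ant colony algorithms, particle swarm algorithms and the Lagrange multiplier method. Finally, output the IDs of the positions at which collection agents are deployed; these are the deployment positions of the collection agents.
(3) Collection agent deployment: the collection agents are deployed according to the requirements of (1) and (2).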
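Step (1) above names the knapsack algorithm as one way to solve the first objective function. The sketch below is a standard 0/1 knapsack that maximizes collection utility under a total budget; the integer costs, the utilities and the function name are assumptions for illustration, and the resource-consumption constraint is left out for brevity.

```python
from typing import List, Tuple


def choose_agents(
    costs: List[int], utilities: List[float], budget: int
) -> Tuple[float, Tuple[int, ...]]:
    """0/1 knapsack: maximize total utility with total cost <= budget.

    Returns (best utility, indices of the chosen candidate agents).
    """
    # best[b] holds the best (utility, chosen indices) using total cost <= b.
    best: List[Tuple[float, Tuple[int, ...]]] = [(0.0, ())] * (budget + 1)
    for i, (cost, utility) in enumerate(zip(costs, utilities)):
        # Iterate budgets downwards so each candidate is used at most once.
        for b in range(budget, cost - 1, -1):
            candidate = best[b - cost][0] + utility
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] + (i,))
    return best[budget]


# Hypothetical candidate agents: integer deployment costs and utilities.
toy_costs = [4, 3, 2, 5, 3]
toy_utilities = [6.0, 5.0, 3.0, 7.0, 4.0]

utility, chosen = choose_agents(toy_costs, toy_utilities, budget=9)
print(utility, chosen, "agents to deploy:", len(chosen))
```

The number of collection agents to deploy is then simply the size of the chosen set; because utility is the objective here, it is not also used as a constraint, in line with the note in the text.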
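On the basis of the above embodiments, in this embodiment the following is further included after the step of deploying the collection agents on the risk points according to the risk points in the network, the collection capability of the collection agents and preset constraints: generating a scheduling policy for the collection agents according to the deployment positions of the collection agents, the capability of the collection agents and the capability of the attacker.
The attacker capability is the capability of the attacker when attacking a group of device nodes or data services of the target system. Elements for evaluating attacker capability include, but are not limited to, the attack relay points the attacker can choose, the attack range it can choose, the attack paths it can choose, the attack methods it can choose, and the number of exploitable vulnerabilities. This embodiment invokes a collection agent scheduling generation algorithm, based on the deployment positions of the collection agents, the collection agent capability and the attacker capability, to generate the collection agent scheduling policy.
Existing collection agent activation strategies mainly use a passive activation mode or an active activation mode. In passive mode, the collection agent waits for a start command from the administrator and, once the command is received, performs data collection as instructed. In active mode, the collection agent activates itself according to a preset scheme and the current environment state and then collects data. A typical active mode is timed or periodic collection, that is, collecting data at a preset collection period, for example collecting the host CPU load every 5 minutes. These activation strategies do not effectively take into account factors such as attacker capability and attack timing, and therefore cannot collect data effectively. For example, an attacker can probe the topology of the target network and observe the deployment positions and activation patterns of the collection agents (for example, through scanning, penetration or social engineering), and then choose nodes where no collection agent is deployed or activated as attack targets. This maximizes the effect of the attack, undermines the validity of the data collected by the collection agents, and prevents the monitor from accurately analyzing the security state of the target network.
In this embodiment, the monitor chooses which combination of collection agents to activate with different probabilities, so that the attacker cannot observe an activation pattern. This prevents the attacker from evading monitoring and improves the validity of the data collected by the collection agents.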
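On the basis of the above embodiment, in this embodiment the step of generating the scheduling policy of the collection agents according to the deployment positions of the collection agents, the capability of the collection agents and the capability of the attacker specifically includes:
1) constructing a third objective function, determining its constraints, and solving it to obtain the number of collection agents to be activated; the third objective function includes any one or more of: maximizing the activation utility of the collection agents and minimizing the resources consumed by activating the collection agents; the constraints of the third objective function include any one or more of: the activation utility of the collection agents is not lower than a seventh preset threshold, and the resource consumption of activating the collection agents does not exceed an eighth preset threshold;
2) constructing an attacker second payoff function and a monitor second payoff function, and constructing a fourth objective function from the attacker second payoff function and/or the monitor second payoff function; constructing the constraints of the fourth objective function from the attacker strategy set, the monitor strategy set and the number of collection agents to be scheduled; the attacker strategy set is the set of actions the attacker can choose, which includes any one or more of: choosing an infection source, choosing an attack path, and choosing an attack target; the monitor strategy set is the set of actions the monitor can choose, where a monitor action is the choice of which collection agents to activate for monitoring; calculating the mixed strategy of the monitor and the mixed strategy of the attacker from the fourth objective function and its constraints, where the attacker's mixed strategy comprises the attack strategies chosen by the attacker and the probability with which each is chosen, and the monitor's mixed strategy comprises the monitoring strategies chosen by the monitor and the probability with which each is chosen; and generating the scheduling policy of the collection agents from the monitor's mixed strategy; wherein the attacker second payoff function depends on the time from the start of the attack until the attacker is detected by the monitor, the total number of device nodes infected by the attacker from the start of the attack until it is detected by the monitor, and/or the impact of the attacker on the data service; the monitor second payoff function depends on the time at which the monitor detects the attacker, the number of nodes infected when the monitor detects the attacker, and the impact on the data service when the monitor detects the attacker; the constraints of the fourth objective function include any one or more of: the number of activated collection agents is less than a ninth preset threshold, the probabilities with which the strategies in the attacker's mixed strategy are chosen sum to 1, the probabilities with which the strategies in the monitor's mixed strategy are chosen sum to 1, the resource consumption of the collection agents is less than a tenth preset threshold, the running cost of the collection agents is less than an eleventh preset threshold, and the maintenance cost of the collection agents is less than a twelfth preset threshold.
Specifically, the main steps of determining the collection agent scheduling policy in this embodiment include, but are not limited to:
(1) Determining the fourth objective function and its constraints: the objective function of the whole system is constructed from the monitor payoff function and the attacker payoff function; the constraints are constructed from the attacker's strategy set, the monitor's strategy set and the number of collection agents to be scheduled. The payoff function of each party is the payoff that party can obtain given its type and chosen action; the payoff functions include one or both of the attacker payoff function and the monitor payoff function.
The attacker strategy set is the set of actions the attacker can choose, which includes, but is not limited to, choosing an infection source, choosing an attack path and choosing an attack target. The monitor strategy set is the set of actions the monitor can choose, where a monitor action is the choice of which collection agents to activate for monitoring.
The steps for constructing the fourth objective function include, but are not limited to the following. First, determine the payoff functions of the two parties: (i) the attacker second payoff function includes, but is not limited to, the time from the start of the attack until the attacker is discovered by the monitor, the total number of nodes infected by the attacker from the start of the attack until it is discovered, and the impact of the attacker on the data service; (ii) the monitor second payoff function includes, but is not limited to, the time at which the monitor detects the attack, the number of nodes infected when the monitor detects the attack, and the impact on the data service when the monitor detects the attack. Second, construct the fourth objective function: from the payoff functions of the two parties, compute the expected payoff using weighted summation or a similar method; this expected payoff is the system objective function.
The constraints include, but are not limited to: the number of activated collection agents is less than the ninth preset threshold; the probabilities of the strategies in the attacker's mixed strategy sum to 1; the probabilities of the strategies in the monitor's mixed strategy sum to 1; the resource consumption is less than the tenth preset threshold (for example, if the remaining battery levels of the five devices on which collection agents are installed are 20%, 45%, 50%, 75% and 90%, then, to extend the running time of the collection agents, one or more of the five device nodes are selectively activated according to their current battery levels); the running cost is less than the eleventh preset threshold (activating each collection agent consumes a certain cost, such as manpower, money and time); and the maintenance cost is less than the twelfth preset threshold (keeping the collection agents running normally also consumes a certain cost, such as manpower, money and time).
(2) Generating the scheduling policy: the objective function is solved subject to the constraints to obtain the mixed strategy, that is, the probabilities of activating different combinations of collection agents.
A mixed strategy is a strategy in which each party chooses with certain probabilities: the monitor chooses a monitoring strategy with a certain probability, and the attacker chooses an attack strategy with a certain probability. The steps for solving the objective function include, but are not limited to the following. Step one: initialize one or several strategies for each party; initialization methods include random selection, degree centrality and so on. Step two: solve the objective function for the current strategy sets; methods include, but are not limited to, linear programming, gradient descent, the greedy algorithm, local search, simulated annealing, genetic algorithms, ant colony algorithms and particle swarm algorithms. Solving the objective function falls into three cases. Case 1: when the strategy sets of both parties are smaller than a preset threshold, the initial strategies of both parties are their full strategy sets, and the problem can be solved directly by linear programming to obtain the optimal objective value and the mixed strategies of both parties. Case 2: when the strategy sets of both parties are larger than the preset threshold, the initial strategies of each party are a subset of its full strategy set; the probabilities of choosing the current strategies are obtained directly with the objective function solving method and used as the baseline for the next step; on this baseline, each party selects a new strategy from its own strategy set and adds it to its current strategy set; the objective function solving method is then invoked again to obtain the objective value for the new strategy sets; this loop continues until the candidate strategy sets of both parties are empty, finally yielding the optimal objective value and the mixed strategies of both parties. Case 3: when the strategy set of only one party is larger than the preset threshold, the initial strategies of the party with the smaller strategy set are its full strategy set, and the initial strategies of the party with the larger strategy set are a subset of its full strategy set; the probabilities of choosing the current strategies are obtained directly with the objective function solving method and used as the baseline for the next step; on this baseline, a new strategy is selected from the candidate strategy set of the party with the larger strategy set and added to its current strategy set; the objective function solving method is then invoked again to obtain the objective value for the new strategy sets; this loop continues until the candidate strategy sets are empty, finally yielding the optimal objective value and the mixed strategies of both parties.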
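The following is an example of the collection agent deployment method. The symbols used in this embodiment and their meanings are shown in Table 1.
Table 1 Symbols and their meanings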
Figure PCTCN2019092999-appb-000004
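Suppose that the target network topology contains 5 devices on which collection agents can be deployed: s1 is a firewall running the UFW service; s2 and s3 are management servers, both running the SSH service; s4 is a web server running the Apache HTTP service; and s5 is a database running the MySQL service. Four highly ranked categories from the OWASP Top 10 for web applications are selected as the potential threat events of this embodiment, where 1 denotes brute-force cracking, 2 denotes a DDoS attack, 3 denotes an XSS attack and 4 denotes SQL injection.
1. Risk point determination: calculate the threat event risk values from the target network topology, data services and threat events, construct the threat-collection tree, and determine the risk points.
(1) Generating threat event feature beacons: the data service types are taken from the services running on the devices in the target network topology, namely the UFW service, the SSH service, the Apache HTTP service and the MySQL service. The available collection item data falls into three categories: network traffic information (for example, the number of packets sent and the number of packets received), device status information (for example, CPU utilization and memory utilization) and log information. Log information includes, but is not limited to, SSH logs, MySQL logs, HTTP logs, Web logs, firewall logs and IDS logs. Feature data is extracted from the collection item data using the extraction methods described above to form the set of threat event feature beacons.
Taking the application log (SSH log) as an example, the feature beacon of the potential threat event "brute-force cracking" is generated as follows:
Step one: analyze the collection item data, extract the key fields, and extract from the key fields the threat detection atomic data item usable for detecting the threat: "failed password".
Step two: extract the marker data characteristic of the "brute-force cracking" threat event from multiple SSH connection failure log records in the collection item data, analyze it with statistical methods, and generate the atomic predicate for judging the potential threat event: "number of failed SSH attempts > threshold".
Step three: join the predicates with logical connectives into a threat detection rule: "number of failed SSH attempts > threshold" and "number of SSH attempts started > threshold". The detailed extraction of the other threat event feature beacons of this embodiment is not repeated here; the threat event feature beacons of this embodiment are given directly as follows: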
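Figure PCTCN2019092999-appb-000005
Number of failed SSH attempts > threshold
Figure PCTCN2019092999-appb-000006
Number of SSH attempts started > threshold
Figure PCTCN2019092999-appb-000007
Number of SYN half-open connections > threshold
Figure PCTCN2019092999-appb-000008
XSS attempt via the URL string /logfile/index.php?page=capture_data.php on the resource
Figure PCTCN2019092999-appb-000009
XSS attempt via injection into the table NET_STAT_INFO
Figure PCTCN2019092999-appb-000010
XSS attempt via the URL string /logfile/index.php on the resource
Figure PCTCN2019092999-appb-000011
String containing the MySQL version
Figure PCTCN2019092999-appb-000012
Number of received network packets > normal value
Figure PCTCN2019092999-appb-000013
HTTP POST request to a PHP file
Figure PCTCN2019092999-appb-000014
MySQL injection HTTP GET attempt
Figure PCTCN2019092999-appb-000015
CPU utilization > normal value
Figure PCTCN2019092999-appb-000016
SQL injection attempt on the table NET_STAT_INFO
Figure PCTCN2019092999-appb-000017
MySQL injection type query
The correspondence between threat event feature beacons and collection agents can be represented by a threat-collection tree, as shown in FIG. 3.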
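The three-step SSH example above (extract the "failed password" field, count failures, compare with a threshold) can be sketched as follows. This is an illustrative sketch only: the log lines follow the common OpenSSH `sshd` message wording, the sample records and the threshold are invented, and only the first predicate ("number of failed SSH attempts > threshold") is implemented.

```python
import re
from collections import Counter
from typing import Iterable, List

# Step one: key-field extraction. Matches e.g.
# "Failed password for root from 203.0.113.7 port 50122 ssh2"
FAILED = re.compile(
    r"failed password for (?:invalid user )?\S+ from (\S+)", re.IGNORECASE
)


def failed_attempts_by_source(lines: Iterable[str]) -> Counter:
    """Step two (data side): count failed SSH logins per source address."""
    counts: Counter = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def brute_force_sources(lines: Iterable[str], threshold: int) -> List[str]:
    """Step two (predicate side): sources with failed attempts > threshold."""
    counts = failed_attempts_by_source(lines)
    return sorted(ip for ip, n in counts.items() if n > threshold)


# Hypothetical log records for illustration.
SAMPLE = [
    "Jun 13 10:00:01 s2 sshd[101]: Failed password for root from 203.0.113.7 port 50122 ssh2",
    "Jun 13 10:00:02 s2 sshd[101]: Failed password for root from 203.0.113.7 port 50123 ssh2",
    "Jun 13 10:00:03 s2 sshd[101]: Failed password for invalid user admin from 203.0.113.7 port 50124 ssh2",
    "Jun 13 10:00:04 s2 sshd[101]: Failed password for root from 203.0.113.7 port 50125 ssh2",
    "Jun 13 10:00:05 s2 sshd[102]: Failed password for alice from 198.51.100.9 port 4000 ssh2",
    "Jun 13 10:00:09 s2 sshd[103]: Accepted password for alice from 198.51.100.9 port 4001 ssh2",
]

print(brute_force_sources(SAMPLE, threshold=3))
```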
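(2) Risk value calculation: the risk value of a potential threat event is calculated from the confidence with which the threat event features are detected and the impact of the potential threat event; calculation methods include, but are not limited to, the multiplication method, the matrix method and the weighted-sum method.
The confidence with which a potential threat event is detected is calculated in the following steps:
First, determine the probability that each minimal feature beacon set is detected by the collection agents: according to the relationship between threat detection atomic data items and collection agents, the probability that each threat detection atomic data item is detected by a collection agent is determined by random assignment; probability propagation and probability calculation are then used to compute the probability that each minimal feature beacon set is detected, as shown in Table 2.
From FIG. 3, the relationship between threat event feature beacons and collection agents is as follows:
Figure PCTCN2019092999-appb-000018
Table 2 Probability that each minimal feature beacon set is detected by the collection agents
Figure PCTCN2019092999-appb-000019
Second, the hop count is used as the measure of how far a device's physical location is from the network edge. A database is generally located far from the network edge and is subject to more restrictions on logical access, so the database server is less likely to be attacked; a firewall generally sits at the boundary between the internal and external networks and is easily exposed to illegal access and attacks, so the firewall is more likely to be attacked. According to the likelihood that devices in the system are attacked, a triangular norm is used to determine the authenticity of the collection item data obtained by the collection agent of each device and the authenticity of the threat detection atomic data items. Authenticity takes values between 0 and 1; when no valid beacon can be generated from the collection item data according to the threat feature beacons, the authenticity of that data service defaults to 0. Note that 0.1 to 0.3 denotes low authenticity, 0.4 to 0.6 medium authenticity, and 0.7 to 0.9 high authenticity. Accordingly, the authenticity of the threat detection atomic data obtained by the collection agent deployed on the database server is 0.9, and that of the collection agent deployed on the firewall server is 0.3. The authenticity of each threat detection atomic data item is the same as that of the collection agent that produced it, as shown in Table 3.
Table 3 Authenticity of the collection agents
Figure PCTCN2019092999-appb-000020
Third, from the authenticity of the threat detection atomic data items and the threat event feature beacons, the authenticity of the potential threat event corresponding to the detected data is determined by fuzzy statistics, probability analysis or similar methods. Because each threat detection atomic data item is generated from collection item data collected by a particular collection agent, its authenticity is the same as that of the collection agent that produced it. When a minimal feature beacon set contains two or more feature beacons, the lowest authenticity among them is taken as the authenticity of the whole set. For example, in the minimal feature beacon set
Figure PCTCN2019092999-appb-000021
Figure PCTCN2019092999-appb-000022
comes from s2, whose authenticity is 0.3, and
Figure PCTCN2019092999-appb-000023
comes from s3, whose authenticity is 0.5, so the authenticity of the minimal feature beacon set
Figure PCTCN2019092999-appb-000024
is 0.3. The authenticity of the minimal feature beacon sets is shown in Table 4.
Table 4 Authenticity of the minimal feature beacon sets detected by the collection agents
Figure PCTCN2019092999-appb-000025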
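The "lowest authenticity wins" rule stated above is a minimum over the source devices of the beacons in a set. A minimal sketch, using only the two values quoted in the text (s2 = 0.3, s3 = 0.5); the function and variable names are illustrative.

```python
from typing import Dict, Iterable

# Authenticity of the collection agents quoted in the text for s2 and s3.
NODE_AUTHENTICITY: Dict[str, float] = {"s2": 0.3, "s3": 0.5}


def set_authenticity(
    beacon_sources: Iterable[str], node_authenticity: Dict[str, float]
) -> float:
    """Authenticity of a minimal feature beacon set: the minimum over the
    devices whose collection agents produced its beacons."""
    return min(node_authenticity[node] for node in beacon_sources)


# A set with one beacon from s2 and one from s3, as in the example.
print(set_authenticity(["s2", "s3"], NODE_AUTHENTICITY))  # 0.3
```

Taking the minimum is one particular triangular norm (the Gödel t-norm); other t-norms, such as the product, would give a different and generally lower value.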
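Finally, from the probability that the threat detection atomic data items are detected and the authenticity of the potential threat event corresponding to the detected data, the confidence with which the potential threat event is detected by the collection agents is calculated using the weighted-sum method, with the following formula:
Figure PCTCN2019092999-appb-000026
where p_ψ denotes the confidence with which any potential threat event ψ is detected by the collection agents, τ_i denotes the i-th minimal feature beacon set corresponding to ψ, γ(ψ) denotes the set of all minimal feature beacon sets corresponding to ψ,
Figure PCTCN2019092999-appb-000027
denotes the probability that τ_i is detected by the collection agents, and
Figure PCTCN2019092999-appb-000028
denotes the authenticity of τ_i.
The confidences with which the potential threat events are detected by the collection agents are:
P_ψ1 = (1 - 0.3×0.3)(1 - 0.3×0.8) = 0.91 × 0.76 = 0.6916
P_ψ2 = (1 - 0.5×0.5)(1 - 0.7×0.5)(1 - 0.5×0.8) = 0.75 × 0.65 × 0.6 = 0.2925
P_ψ3 = (1 - 0.3×0.5)(1 - 0.3×1)(1 - 0.3×0.3) = 0.85 × 0.7 × 0.91 = 0.54145
P_ψ4 = (1 - 0.3×0.9)(1 - 0.5×0.3)(1 - 0.9×0.8) = 0.73 × 0.85 × 0.28 = 0.17374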
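The four values above can be re-derived mechanically. The sketch below assumes the product form that the worked numbers follow, namely one factor (1 - a × b) per minimal feature beacon set; the formula itself is only available as a figure, so this form, and the reading of each pair as (detection probability, authenticity), are assumptions inferred from the arithmetic. The order within a pair does not affect the result.

```python
from math import prod
from typing import Iterable, Tuple


def detection_confidence(pairs: Iterable[Tuple[float, float]]) -> float:
    """Product over minimal feature beacon sets of (1 - a * b),
    as used in the worked example (assumed form)."""
    return prod(1 - a * b for a, b in pairs)


# Factor pairs copied from the worked example.
EXAMPLE = {
    "psi1": [(0.3, 0.3), (0.3, 0.8)],
    "psi2": [(0.5, 0.5), (0.7, 0.5), (0.5, 0.8)],
    "psi3": [(0.3, 0.5), (0.3, 1.0), (0.3, 0.3)],
    "psi4": [(0.3, 0.9), (0.5, 0.3), (0.9, 0.8)],
}

for name, pairs in EXAMPLE.items():
    print(name, round(detection_confidence(pairs), 5))
```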
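In this embodiment, the impact of a potential threat event is described mainly from the perspective of security attributes and can be evaluated in three respects: system confidentiality, system integrity and system availability. Each of the three takes a value between 0 and 5, and the impact level ranges from level I to level V: level I denotes very low impact, level II low impact, level III medium impact, level IV high impact and level V very high impact. Considering these three respects and referring to the information in the OWASP Top 10 list, the impact value of each potential threat event in this example is given in Table 5.
Table 5 Impact values of the potential threat events
Figure PCTCN2019092999-appb-000029
The risk value of a potential threat event ψ is calculated by the following formula:
Utility_attacker = Risk = P_ψ × I_ψ
where P_ψ denotes the confidence with which the potential threat event ψ is detected by the collection agents and I_ψ denotes the impact value of the potential threat event ψ.
The risk values calculated from the confidence with which each potential threat event is detected and its impact are as follows:
Figure PCTCN2019092999-appb-000030
Risk value of potential threat event ψ1: Risk_ψ1 = 0.6916 × 14 = 9.6824
Risk value of potential threat event ψ2: Risk_ψ2 = 0.2925 × 20 = 5.85
Risk value of potential threat event ψ3: Risk_ψ3 = 0.54145 × 5 = 2.70725
Risk value of potential threat event ψ4: Risk_ψ4 = 0.17374 × 10 = 1.7374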
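The multiplication method used above, followed by the threshold filter of the next step, can be sketched in a few lines. The confidences and impact values are those of the worked example; the threshold 1.5 is the one the example uses when selecting threats, and the dictionary keys are illustrative names.

```python
# Confidence and impact values from the worked example.
confidence = {"psi1": 0.6916, "psi2": 0.2925, "psi3": 0.54145, "psi4": 0.17374}
impact = {"psi1": 14, "psi2": 20, "psi3": 5, "psi4": 10}

# Multiplication method: Risk = P_psi * I_psi.
risk = {name: confidence[name] * impact[name] for name in confidence}

# Keep the threats whose risk exceeds the first preset threshold.
RISK_THRESHOLD = 1.5
selected = sorted(name for name, value in risk.items() if value > RISK_THRESHOLD)

for name in sorted(risk):
    print(name, round(risk[name], 5))
print("selected:", selected)
```

With this threshold all four threats are selected, which is why every device node ends up as a risk point in the example.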
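(3) Risk point determination
First, from the potential threat event risk values calculated in (2), select the potential threat events whose risk value is greater than the threshold 1.5. To simplify the equations below, the function R is used in place of Risk_ψ, and S_d denotes the deployment set of the collection agents.
Figure PCTCN2019092999-appb-000031
Second, layers 3 to 5 of the threat-collection tree give the correspondence between potential threat events and threat feature beacons and the relationship between threat feature beacons and target network device nodes. Accordingly, the threat feature beacons corresponding to potential threat event ψ1 are
Figure PCTCN2019092999-appb-000032
Figure PCTCN2019092999-appb-000033
the threat feature beacons corresponding to ψ2 are
Figure PCTCN2019092999-appb-000034
Figure PCTCN2019092999-appb-000035
Figure PCTCN2019092999-appb-000036
the threat feature beacons corresponding to ψ3 are
Figure PCTCN2019092999-appb-000037
Figure PCTCN2019092999-appb-000038
the threat feature beacons corresponding to ψ4 are
Figure PCTCN2019092999-appb-000039
Figure PCTCN2019092999-appb-000040
Figure PCTCN2019092999-appb-000041
The network device node corresponding to threat feature beacon
Figure PCTCN2019092999-appb-000042
is v1; the network device node corresponding to threat feature beacons
Figure PCTCN2019092999-appb-000043
Figure PCTCN2019092999-appb-000044
is v2; the network device node corresponding to threat feature beacons
Figure PCTCN2019092999-appb-000045
Figure PCTCN2019092999-appb-000046
Figure PCTCN2019092999-appb-000047
is v3; the network device node corresponding to threat feature beacons
Figure PCTCN2019092999-appb-000048
Figure PCTCN2019092999-appb-000049
is v4; and the network device node corresponding to threat feature beacon
Figure PCTCN2019092999-appb-000050
is v5. Finally, the risk points are determined to be the network device nodes v1, v2, v3, v4 and v5.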
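2. Collection agent deployment:
(1) Determining the number of collection agents
First, maximizing the collection utility is chosen as the objective, that is, the threat detection atomic data items obtained by the collection agents should detect as many potential threat events as possible. The constraints chosen are that the total cost of all deployed collection agents is less than the total budget and that the resource consumption of the collection agents does not exceed a preset value. The first objective function is then solved with the knapsack algorithm.
(2) Determining the collection agent positions: homogeneous embedded collection agents with identical capability are chosen; because the device types and the data services running on the devices differ, only differences in the collection item data are considered.
This embodiment takes an adversarial setting into account. Therefore, when determining the collection agent positions, that is, when optimizing the second objective function, the monitor minimizes the attacker's maximum attack impact. In this embodiment the greedy algorithm is used to determine the positions. A value z as small as possible is chosen; for each value of z, a lowest-cost set S_d can be found such that R_i(S_d) ≤ z holds for all potential threat events i. For z > 0, the following definition is made:
Figure PCTCN2019092999-appb-000051
The original function R_i is truncated at z, and its mean is:
Figure PCTCN2019092999-appb-000052
First, compute the maximum value z_max and the minimum value z_min attainable in this problem: z_max is the value when no collection agent is deployed, at which the attacker's utility is largest, and z_min is the value when collection agents are deployed on all device nodes, at which the attacker's utility is smallest. Second, compute the mean z of z_max and z_min; for any set S_d of collection agents, the corresponding payoff
Figure PCTCN2019092999-appb-000053
can be computed. Third, invoke the greedy algorithm: according to the mean z and
Figure PCTCN2019092999-appb-000054
find in each round the device node ID with the largest absolute increment, add it to the combination, and assign the combination to S_dbest; if the number of selected collection agents is not 3, assign the current value of z to z_max or z_min. Finally, invoke the greedy algorithm again, looping in this way until a deployment set satisfying the objective function is found. Note that every invocation of the greedy algorithm starts from the empty set. The result is the devices numbered 1, 3 and 4; these three points are the deployment positions. The greedy algorithm is shown in FIG. 4, and the flow of the collection agent deployment algorithm is shown in FIG. 5.
(3) Collection agent deployment: according to the calculation in (2), the collection agents are deployed on the device nodes v1, v3 and v4.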
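The bisection-plus-greedy procedure of step (2) can be sketched as below. The truncated functions are defined in figures that are not available here, so this sketch assumes a common form: each risk R_i(S) is truncated from below at z, and the greedy step adds the node that most reduces the mean of the truncated values. The threat coverage map, the 0.1 residual factor and all function names are invented for illustration and are not the patent's data.

```python
from typing import Callable, List, Sequence, Set

EPS = 1e-12

# Hypothetical toy model: (base risk, nodes whose agents observe the threat).
THREATS = [(9.7, {1, 2}), (5.9, {3}), (2.7, {4}), (1.7, {3, 5})]


def toy_risk(deployed: Set[int]) -> List[float]:
    """R_i(S): residual risk of each threat given the deployment set S."""
    return [base * (0.1 if deployed & cover else 1.0) for base, cover in THREATS]


def truncated_mean(R: Callable, S: Set[int], z: float) -> float:
    values = R(S)
    return sum(max(v, z) for v in values) / len(values)


def greedy_cover(R: Callable, nodes: Sequence[int], z: float) -> Set[int]:
    """Greedily add nodes until every R_i(S) <= z; always starts from the empty set."""
    S: Set[int] = set()
    while truncated_mean(R, S, z) > z + EPS:
        best = min((n for n in nodes if n not in S),
                   key=lambda n: truncated_mean(R, S | {n}, z), default=None)
        if best is None or truncated_mean(R, S | {best}, z) >= truncated_mean(R, S, z):
            break  # no remaining node lowers the truncated mean
        S.add(best)
    return S


def deploy(R: Callable, nodes: Sequence[int], k: int, rounds: int = 30) -> Set[int]:
    """Bisect on z between z_min (all nodes deployed) and z_max (none deployed)."""
    z_min, z_max = max(R(set(nodes))), max(R(set()))
    best: Set[int] = set()
    for _ in range(rounds):
        z = (z_min + z_max) / 2
        S = greedy_cover(R, nodes, z)
        if len(S) <= k and max(R(S)) <= z + EPS:
            best, z_max = S, z   # feasible with at most k agents: try a smaller z
        else:
            z_min = z            # infeasible: relax z
    return best


print(sorted(deploy(toy_risk, [1, 2, 3, 4, 5], k=3)))
```

The bisection shrinks z only while a deployment of at most k agents keeps every threat's risk at or below z, which is the min-max behaviour the text describes for the monitor.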
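The following is an example of the collection agent scheduling method.
Suppose that the target network topology has 7 nodes, V = {v0, ..., v6}, each representing an already deployed collection agent. The capability of a collection agent is its own ability to obtain collection item data; the capability of the attacker is that it can choose any node in the target network as the infection source from which a virus spreads. The monitor's strategy is to choose k of the 7 device nodes of the target network whose collection agents are activated, so the monitor has C(n, k) candidate strategies, with n = 7. The attacker's strategy is to choose one of the 7 nodes as the infection source, so the attacker has 7 candidate strategies. The threshold on the size of the strategy space is set to 20. The number k of activated collection agents must be smaller than a preset threshold and can be determined by constructing and solving the third objective function with its constraints; for ease of calculation, k is set to 3 in this embodiment. The probabilities with which the monitor chooses each of its candidate strategies sum to 1.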
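The sizes of the two strategy spaces in this scenario, and which of the three solving cases described earlier applies, can be checked directly. A small sketch; the function name `solving_case` is illustrative, and the case numbering follows the three cases listed for solving the objective function.

```python
from itertools import combinations
from math import comb

nodes = [f"v{i}" for i in range(7)]
k = 3  # number of collection agents activated at a time

# Monitor: every k-subset of the deployed agents; attacker: one infection source.
monitor_strategies = list(combinations(nodes, k))
attacker_strategies = [(v,) for v in nodes]

SPACE_THRESHOLD = 20  # threshold on the strategy-space size from the scenario


def solving_case(n_attacker: int, n_monitor: int, threshold: int) -> int:
    """1: both spaces small, 2: both large, 3: exactly one large."""
    large_a, large_m = n_attacker > threshold, n_monitor > threshold
    if not large_a and not large_m:
        return 1
    if large_a and large_m:
        return 2
    return 3


print(len(monitor_strategies), len(attacker_strategies))
print(solving_case(len(attacker_strategies), len(monitor_strategies), SPACE_THRESHOLD))
```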
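The above scenario is used as the example:
(1) Determining the fourth objective function and its constraints:
In this embodiment, the monitor second payoff function is chosen to minimize the time until the attacker is detected by the monitor, that is, the monitor detects the attacker as early as possible; the attacker second payoff function is chosen to maximize the monitor's payoff function. From the payoff functions of the monitor and the attacker, the expected payoff is computed by weighted summation to construct the fourth objective function of the whole system.
The monitor second payoff function is P_D = τ(A, D), and the attacker second payoff function is P_A = -(P_D), where A denotes any attacker strategy, D denotes any monitoring strategy, and τ denotes the time until the attacker is detected by the monitor when the monitor chooses D and the attacker chooses A.
Given the monitor's mixed strategy x and the attack strategy A chosen by the attacker, the attacker's expected payoff is:
Figure PCTCN2019092999-appb-000055
where
Figure PCTCN2019092999-appb-000056
is an indicator variable: if
Figure PCTCN2019092999-appb-000057
that is, the monitor has not detected the attack infection event, then z_{D,A} = 1; otherwise z_{D,A} = 0.
Similarly, given the attacker's mixed strategy y and the monitor strategy D, the attacker's expected payoff is:
Figure PCTCN2019092999-appb-000058
When both parties use mixed strategies, the attacker's expected payoff is:
Figure PCTCN2019092999-appb-000059
The fourth objective function of the whole system is as follows:
Figure PCTCN2019092999-appb-000060
The restrictions on the strategies of both parties are the equalities and inequalities in the fourth objective function, where A is the attack strategy chosen by the attacker; D is the monitoring strategy chosen by the monitor; U is the system objective function; U_d is the monitor's payoff function; and x is the monitor's mixed strategy, which selects with probability x_D a strategy D from the candidate strategy set
Figure PCTCN2019092999-appb-000061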
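(2) Generating the scheduling policy:
The sizes of the strategy spaces are compared with the preset strategy-space threshold of 20. The attacker's strategy set has size 7 and the monitor's strategy set has size 35, so this corresponds to case 3 of solving the objective function: the strategy set of one of the two parties is larger than the preset threshold.
The steps for solving the fourth objective function are as follows. Step one: the attacker has 7 candidate strategies in total, so all of them, {v0}, {v1}, {v2}, {v3}, {v4}, {v5}, {v6}, are used as the attacker's initial strategies; the monitor is initialized by random selection, choosing one strategy, {v4, v5, v3}, at random from the monitor's C(7, 3) candidate strategies as its initial strategy. Step two: from the initial strategies, linear programming is used to compute the current objective value, the monitor's current mixed strategy and the attacker's current mixed strategy, which serve as the baseline; on this baseline, the greedy algorithm is used to search for new monitor strategies that improve the objective value, looping until the candidate strategy sets are empty, to obtain the final value of the fourth objective function and the monitor's mixed strategy for scheduling the collection agents. The flow of the collection agent scheduling policy is shown in FIG. 6.
The monitor's mixed strategy is: choose strategy {v2, v5, v6} with probability 0.278624, {v3, v5, v6} with probability 0.0248471, {v0, v3, v6} with probability 0.246089, {v2, v3, v6} with probability 0.029415, {v2, v3, v5} with probability 0.162656, {v1, v3, v4} with probability 0.230108, and {v3, v4, v6} with probability 0.0282604.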
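Turning the mixed strategy above into an actual schedule amounts to sampling one agent combination per collection round with the listed probabilities. A minimal sketch using the probabilities quoted in the text; the function name and the idea of one draw per round are assumptions about how the policy would be applied.

```python
import random
from typing import Dict, Tuple

# Monitor's mixed strategy as listed in the example.
MIXED_STRATEGY: Dict[Tuple[str, ...], float] = {
    ("v2", "v5", "v6"): 0.278624,
    ("v3", "v5", "v6"): 0.0248471,
    ("v0", "v3", "v6"): 0.246089,
    ("v2", "v3", "v6"): 0.029415,
    ("v2", "v3", "v5"): 0.162656,
    ("v1", "v3", "v4"): 0.230108,
    ("v3", "v4", "v6"): 0.0282604,
}


def next_schedule(rng: random.Random) -> Tuple[str, ...]:
    """Draw the set of collection agents to activate in the next round."""
    combos = list(MIXED_STRATEGY)
    weights = list(MIXED_STRATEGY.values())
    return rng.choices(combos, weights=weights, k=1)[0]


rng = random.Random(2019)  # seeded only to make the illustration repeatable
for round_no in range(3):
    print(round_no, next_schedule(rng))
```

In a real deployment the random source should not be predictable to the attacker, since the point of the mixed strategy is that the activation pattern cannot be observed and anticipated.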
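Another embodiment of the present application provides a collection agent deployment device, which is used to implement the methods of the foregoing embodiments. The descriptions and definitions in the foregoing embodiments of the collection agent deployment method therefore apply to the understanding of the execution modules in this embodiment. FIG. 7 is a schematic diagram of the overall structure of the collection agent deployment device, which includes a construction module 701, an acquisition module 702, a determination module 703 and a deployment module 704, wherein:
the construction module 701 is configured to construct a threat-collection tree of the network according to a target network-data service library, a data service-threat event library, a threat event-feature beacon library and a collection agent-threat detection atomic data item library; wherein the target network-data service library stores the correspondence between the target network topology and the data services provided by the target network, the data service-threat event library stores the correspondence between data services and the potential threat events they face, the threat event-feature beacon library stores the correspondence between potential threat events and the threat event feature beacons capable of discovering them, and the collection agent-threat detection atomic data item library stores the correspondence between collection agents and the threat detection atomic data items that they can collect for detecting potential threat events; the acquisition module 702 is configured to, for any of the potential threat events, obtain the risk value of the potential threat event according to the confidence with which it is detected by the collection agents and its impact; the determination module 703 is configured to determine whether a device node is a risk point according to the risk values of the potential threat events and the threat-collection tree; and the deployment module 704 is configured to select deployment points and deploy collection agents according to the risk points in the network, the collection capability of the collection agents and preset constraints.
This embodiment calculates the risk values of threat events from the target network topology, the data services and the threat events, constructs a threat-collection tree, determines risk points, and determines the deployment positions of collection agents according to the risk points, the threat-collection tree, the collection agent capability and the collection constraints, thereby improving data collection capability and reducing the resources consumed by data collection and analysis.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.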

Claims (10)

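  1. A collection agent deployment method, characterized by comprising:
    constructing a threat-collection tree of the network according to a target network-data service library, a data service-threat event library, a threat event-feature beacon library and a collection agent-threat detection atomic data item library; wherein the target network-data service library stores the correspondence between the target network topology and the data services provided by the target network, the data service-threat event library stores the correspondence between data services and the potential threat events they face, the threat event-feature beacon library stores the correspondence between potential threat events and the threat event feature beacons capable of discovering them, and the collection agent-threat detection atomic data item library stores the correspondence between collection agents and the threat detection atomic data items that they can collect for detecting potential threat events;
    for any of the potential threat events, obtaining the risk value of the potential threat event according to the confidence with which the potential threat event is detected by the collection agents and the impact of the potential threat event;
    determining whether a device node is a risk point according to the risk values of the potential threat events and the threat-collection tree;
    selecting deployment points and deploying collection agents according to the risk points in the network, the collection capability of the collection agents and preset constraints.
  2. The collection agent deployment method according to claim 1, characterized in that, before the step of constructing the threat-collection tree of the target network, the method further comprises:
    obtaining the collection item data of the network, the collection item data including network traffic information, device status information and log information;
    analyzing the collection item data, extracting key fields from the collection item data, and extracting from the key fields the threat detection atomic data items used for detecting the potential threat events; wherein the collection item data is historically collected data and/or currently collected data;
    analyzing the threat detection atomic data items to generate atomic predicates for judging the potential threat events;
    joining the atomic predicates with logical connectives to generate threat event feature beacons capable of detecting the potential threat events.
  3. The collection agent deployment method according to claim 2, characterized in that, before the step of obtaining the risk value of the potential threat event according to the confidence with which it is detected by the collection agents and its impact, the method further comprises:
    according to the correspondence between collection agents and the threat detection atomic data items they can collect for detecting threats, determining the probability that each threat detection atomic data item in the feature beacons of the potential threat event is detected by the collection agents;
    according to the probability that the threat detection atomic data items are detected by the collection agents and based on a probability propagation method, calculating the probability that the set of threat detection atomic data items corresponding to each minimal feature beacon set of the potential threat event is detected by the collection agents; wherein a minimal feature beacon set corresponding to the potential threat event is a set of threat event feature beacons that can detect the potential threat event and satisfies the following condition: no proper subset of the set can detect the potential threat event;
    determining the likelihood that each device node is attacked according to its position information in the network system and/or its device defense level information; according to the likelihood that a device node is attacked, calculating the authenticity of the threat detection atomic data items obtained by the collection agent on that device node;
    according to the authenticity of the threat detection atomic data items, calculating the authenticity of the corresponding minimal feature beacon set;
    according to the probability that the minimal feature beacon set is detected and the authenticity of the minimal feature beacon set, determining the confidence of the potential threat event corresponding to the minimal feature beacon set detected by the collection agents.
  4. The collection agent deployment method according to claim 3, characterized in that the confidence with which the potential threat event is detected by the collection agents is determined from the probability that each of its minimal feature beacon sets is detected by the collection agents and the authenticity of each of its minimal feature beacon sets by the following formula:
    Figure PCTCN2019092999-appb-100001
    where p_ψ denotes the confidence with which any potential threat event ψ is detected by the collection agents, τ_i denotes the i-th minimal feature beacon set corresponding to ψ, γ(ψ) denotes the set of all minimal feature beacon sets corresponding to ψ,
    Figure PCTCN2019092999-appb-100002
    denotes the probability that τ_i is detected by the collection agents, and
    Figure PCTCN2019092999-appb-100003
    denotes the authenticity of τ_i.
  5. The collection agent deployment method according to claim 1, characterized in that, before the step of obtaining the risk value of the potential threat event according to the confidence with which it is detected by the collection agents and its impact, the method further comprises:
    evaluating the security attributes of the network that are involved in the potential threat event, the security attributes including integrity, availability and confidentiality;
    determining the impact of the potential threat event according to the evaluation result.
  6. The collection agent deployment method according to claim 1, characterized in that the step of determining whether each device node is a risk point according to the risk value of each potential threat event and the threat-collection tree specifically comprises:
    selecting, from all the potential threat events, the potential threat events whose risk value is greater than a first preset threshold;
    according to the threat-collection tree, determining the threat event feature beacons corresponding to the selected potential threat events and the collection agents capable of collecting the threat detection atomic data items corresponding to those threat event feature beacons, and taking the device nodes on which those collection agents are located as the risk points.
  7. The collection agent deployment method according to claim 1, characterized in that the step of selecting deployment points and deploying collection agents according to the risk points in the network, the collection capability of the collection agents and preset constraints specifically comprises:
    1) constructing a first objective function, determining the constraints of the first objective function, and solving the first objective function to obtain the number of collection agents to be deployed;
    the first objective function includes any one or more of: maximizing the collection utility, minimizing the deployment cost of the collection agents, and minimizing the resource consumption of the collection agents;
    the constraints of the first objective function include any one or more of: the cost of deploying the collection agents is less than the total deployment budget, the collection utility is not lower than a second preset threshold, and the resource consumption of the collection agents does not exceed a third preset threshold;
    2) constructing a second objective function, determining the constraints of the second objective function, and solving the second objective function to obtain the positions at which collection agents are to be deployed;
    the second objective function includes an attacker first payoff function and/or a monitor first payoff function;
    the attacker first payoff function includes any one or more of: maximizing the impact of the attacker on the device nodes, maximizing the time until the attacker is detected by the collection agents, and maximizing the number of infected device nodes at the time the attacker is detected;
    the monitor first payoff function includes any one or more of: minimizing the cost of the collection agents, maximizing the validity of the collection item data obtained by the collection agents, and minimizing the attacker first payoff function; the constraints of the second objective function include any one or more of: the number of collection agents is less than a fourth preset threshold, the risk value caused by each potential threat event is less than a fifth preset threshold, and the monitoring time of the collection agents is less than a sixth preset threshold;
    obtaining the deployment positions of the collection agents from the second objective function and the constraints of the second objective function, based on a heuristic or non-heuristic algorithm.
  8. The collection agent deployment method according to any one of claims 1 to 7, characterized in that, after the step of deploying the collection agents on the risk points according to the risk points in the network, the collection capability of the collection agents and preset constraints, the method comprises:
    generating a scheduling policy for the collection agents according to the deployment positions of the collection agents, the capability of the collection agents and the capability of the attacker.
  9. The collection agent deployment method according to claim 8, characterized in that the step of generating the scheduling policy of the collection agents according to the deployment positions of the collection agents, the capability of the collection agents and the capability of the attacker specifically comprises:
    1) constructing a third objective function, determining the constraints of the third objective function, and solving the third objective function to obtain the number of collection agents to be activated;
    the third objective function includes any one or more of: maximizing the activation utility of the collection agents and minimizing the resources consumed by activating the collection agents;
    the constraints of the third objective function include any one or more of: the activation utility of the collection agents is not lower than a seventh preset threshold, and the resource consumption of activating the collection agents does not exceed an eighth preset threshold;
    2) constructing an attacker second payoff function and a monitor second payoff function, and constructing a fourth objective function from the attacker second payoff function and/or the monitor second payoff function;
    constructing the constraints of the fourth objective function from the attacker strategy set, the monitor strategy set and the number of collection agents to be scheduled;
    the attacker strategy set is the set of actions the attacker can choose, where an attacker action consists of any one or more of: choosing an infection source, choosing an attack path, and choosing an attack target;
    the monitor strategy set is the set of actions the monitor can choose, where a monitor action is the monitor's choice of the collection agents to activate for monitoring; calculating the mixed strategy of the monitor and the mixed strategy of the attacker from the fourth objective function and the constraints of the fourth objective function; wherein the attacker's mixed strategy comprises the attack strategies chosen by the attacker and the probability with which each attack strategy is chosen, and the monitor's mixed strategy comprises the monitoring strategies chosen by the monitor and the probability with which each monitoring strategy is chosen;
    generating the scheduling policy of the collection agents from the monitor's mixed strategy;
    wherein the attacker second payoff function depends on the time from the start of the attack until the attacker is detected by the monitor, the total number of device nodes infected by the attacker from the start of the attack until it is detected by the monitor, and/or the impact of the attacker on the data service;
    the monitor second payoff function depends on the time at which the monitor detects the attacker, the number of nodes infected when the monitor detects the attacker, and the impact on the data service when the monitor detects the attacker;
    the constraints of the fourth objective function include any one or more of: the number of activated collection agents is less than a ninth preset threshold, the probabilities with which the strategies in the attacker's mixed strategy are chosen sum to 1, the probabilities with which the strategies in the monitor's mixed strategy are chosen sum to 1, the resource consumption of the collection agents is less than a tenth preset threshold, the running cost of the collection agents is less than an eleventh preset threshold, and the maintenance cost of the collection agents is less than a twelfth preset threshold.
  10. A collection agent deployment device, characterized by comprising:
    a construction module, configured to construct a threat-collection tree of the network according to a target network-data service library, a data service-threat event library, a threat event-feature beacon library and a collection agent-threat detection atomic data item library; wherein the target network-data service library stores the correspondence between the target network topology and the data services provided by the target network, the data service-threat event library stores the correspondence between data services and the potential threat events they face, the threat event-feature beacon library stores the correspondence between potential threat events and the threat event feature beacons capable of discovering them, and the collection agent-threat detection atomic data item library stores the correspondence between collection agents and the threat detection atomic data items that they can collect for detecting potential threat events;
    an acquisition module, configured to, for any of the potential threat events, obtain the risk value of the potential threat event according to the confidence with which the potential threat event is detected by the collection agents and the impact of the potential threat event;
    a determination module, configured to determine whether a device node is a risk point according to the risk values of the potential threat events and the threat-collection tree;
    a deployment module, configured to select deployment points and deploy collection agents according to the risk points in the network, the collection capability of the collection agents and preset constraints.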
PCT/CN2019/092999 2019-06-13 2019-06-26 采集代理部署方法及装置 WO2020248306A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910509683.6A CN110430158B (zh) 2019-06-13 2019-06-13 采集代理部署方法及装置
CN201910509683.6 2019-06-13

Publications (1)

Publication Number Publication Date
WO2020248306A1 true WO2020248306A1 (zh) 2020-12-17

Family

ID=68407610

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/092999 WO2020248306A1 (zh) 2019-06-13 2019-06-26 采集代理部署方法及装置

Country Status (2)

Country Link
CN (1) CN110430158B (zh)
WO (1) WO2020248306A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112347484A (zh) * 2020-10-27 2021-02-09 杭州安恒信息技术股份有限公司 软件漏洞检测方法、装置、设备及计算机可读存储介质
CN113536678B (zh) * 2021-07-19 2022-04-19 中国人民解放军国防科技大学 基于贝叶斯网络及stride模型的xss风险分析方法及装置
CN114448660B (zh) * 2021-12-16 2024-06-04 国网江苏省电力有限公司电力科学研究院 一种物联网数据接入方法

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180013775A1 (en) * 2016-07-08 2018-01-11 Nec Laboratories America, Inc. Host level detect mechanism for malicious dns activities
CN109413088A (zh) * 2018-11-19 2019-03-01 中国科学院信息工程研究所 一种网络中的威胁处置策略分解方法及***
CN109639648A (zh) * 2018-11-19 2019-04-16 中国科学院信息工程研究所 一种基于采集数据异常的采集策略生成方法及***
CN109714312A (zh) * 2018-11-19 2019-05-03 中国科学院信息工程研究所 一种基于外部威胁的采集策略生成方法及***
CN109787943A (zh) * 2017-11-14 2019-05-21 华为技术有限公司 一种抵御拒绝服务攻击的方法及设备

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101436967A (zh) * 2008-12-23 2009-05-20 北京邮电大学 一种网络安全态势评估方法及其***
CN101888380A (zh) * 2010-07-07 2010-11-17 南京烽火星空通信发展有限公司 一种传感器与采集代理之间数据交互的通用通信方法
CN103731298A (zh) * 2013-11-15 2014-04-16 中国航天科工集团第二研究院七〇六所 一种大规模分布式网络安全数据采集方法与***
US9602530B2 (en) * 2014-03-28 2017-03-21 Zitovault, Inc. System and method for predicting impending cyber security events using multi channel behavioral analysis in a distributed computing environment
CN104111983B (zh) * 2014-06-30 2017-12-19 中国科学院信息工程研究所 一种开放式的多源数据采集***及方法
CN105376085A (zh) * 2014-08-27 2016-03-02 中兴通讯股份有限公司 一种升级数据采集代理的方法、装置及***
CN108494787B (zh) * 2018-03-29 2019-12-06 北京理工大学 一种基于资产关联图的网络风险评估方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BANIK SHANKAR M; PENA LUIS: "Deploying Agents in the Network to Detect Intrusions", 2015 IEEE/ACIS 14TH INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION SCIENCE (ICIS), 28 June 2015 (2015-06-28), pages 83 - 87, XP033181828, DOI: 10.1109/ICIS.2015.7166574 *

Also Published As

Publication number Publication date
CN110430158A (zh) 2019-11-08
CN110430158B (zh) 2020-07-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19932741

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19932741

Country of ref document: EP

Kind code of ref document: A1