Mining Trojan detection system based on traffic analysis
Technical Field
The invention relates to the field of computer network security, and in particular to a mining Trojan detection system based on traffic analysis.
Background
A blockchain is a large decentralized ledger. In a decentralized system all participating nodes have equal status, so to keep the blockchain consistent across nodes, every node must follow the same consensus mechanism. One widely adopted consensus mechanism is proof of work (PoW), proposed by the inventor of Bitcoin in 2009. Nodes participating in the blockchain network use the computing power of their machines (hereinafter, hash power) to solve a computational puzzle, searching for a random number that meets the difficulty requirement. If a node finds an answer and the whole network verifies it, the node gains the right to generate and record the block and receives a cryptocurrency reward. This reward motivates miners to invest large amounts of hash power in the blockchain system, which maintains the consistency and tamper resistance of the proof-of-work blockchain.
Malicious mining occupies the hardware and software resources of a user's terminal device to mine without the user's knowledge or permission, so that the attacker profits from the resulting cryptocurrency. Malicious mining launched by an attacker who implants a mining Trojan typically occurs on personal computers, enterprise websites or servers, personal mobile phones, network routers, and similar devices. With the growth of the cryptocurrency market and the rise in its value in recent years, malicious mining has become among the most widespread attacks, threatening enterprises, institutions, and individual Internet users alike. Mining Trojans fall into two main types. The first type invades a host or cloud server by various means; it is highly concealed and spreads readily, so it can survive on the server for a long time while mining. Attacks of this type have grown steadily more serious since 2017, and large-scale mining botnets such as WannaMine and Mykings have appeared. The second type is the web page mining script, which hides a link to the target mining program in the code of a web page. When a user visits the site, the mining program is loaded and runs without the user noticing, and it is active only while the user is visiting the compromised page. At present, mining Trojan attack techniques have advanced markedly, the malicious mining industry is maturing, and malicious mining families cooperate with one another to extract ever more value from compromised computers and network devices.
Traditional detection methods include identifying abnormally high CPU usage, identifying abnormally high volumes of hash computation, identifying malicious processes or the code of malicious files, monitoring hardware for abnormal sustained high temperature, using active defense software to monitor a process's calls to sensitive system resources and functions, and using a blacklist to block known mining pool addresses used by mining Trojans. These methods can judge mining Trojan behavior fairly accurately at the host level. However, host-level judgment still has shortcomings: a mining Trojan can hide its mining process with Rootkit and similar techniques, hide its characteristics by limiting CPU usage or running time, evade matching against code segments in a virus library through packing and code obfuscation, and hide its files through "fileless" techniques or process image conversion. Host-based identification is therefore inherently weak against these hiding methods.
Accordingly, those skilled in the art have endeavored to develop a mining Trojan detection system that does not rely on host signature detection but is instead based on traffic analysis, to overcome the shortcomings of existing systems.
Disclosure of Invention
In view of the above-mentioned defects of the prior art, the technical problem to be solved by the present invention is how to detect a mining trojan based on the communication traffic between the host and other nodes on the network.
To achieve this purpose, the invention provides a mining Trojan detection system based on traffic analysis, which comprises a pool-connected mining Trojan behavior detection subsystem and a p2p mining network Trojan behavior detection subsystem.
Further, the pool-connected mining Trojan behavior detection subsystem comprises a static-traffic mining pool Trojan detection module and a dynamic-traffic mining pool Trojan detection module. Using a signature detection method, it filters the traffic, locates the data field, and matches special character strings, namely method, params, mining, difficulty, blob and nonce, then generates a detection report, issues an alarm, and records the result in a log file.
Further, the input of the static-traffic mining pool Trojan detection module is a static traffic pcap file.
Further, the input of the dynamic-traffic mining pool Trojan detection module is a pcap file captured by starting packet capture on the selected network card.
Further, the pool-connected mining Trojan behavior detection subsystem works in the following steps:
step 101, filtering out data packets whose total length is less than 80 bytes;
step 102, filtering the data packets again, keeping those that meet the length requirement and use the TCP/IP protocol, reading the length of the TCP header, and locating the start of the data segment according to the lengths of the three protocol headers;
step 103, extracting the data field of each data packet, preprocessing the character string according to a JSON matching rule, and searching for the target fields with string functions, wherein the target fields comprise method, params, mining, difficulty, blob and nonce;
and step 104, counting the detected mining IP addresses and their traffic, generating a statistical report that lists the IP addresses exhibiting mining behavior together with the traffic statistics, and recommending that those IP addresses be blocked or inspected.
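Steps 101 to 103 can be illustrated with a minimal Python sketch. It is not the implementation of the invention. It assumes untagged Ethernet II frames carrying IPv4, and the function names and the `MIN_MATCHES` decision threshold are hypothetical, since the text does not state how many target fields must match.

```python
import struct

# Target fields listed in step 103.
TARGET_FIELDS = ("method", "params", "mining", "difficulty", "blob", "nonce")
MIN_TOTAL_LEN = 80   # step 101: packets shorter than 80 bytes are dropped
ETH_HEADER_LEN = 14  # assumption: untagged Ethernet II link layer
MIN_MATCHES = 2      # hypothetical threshold, not specified in the text


def extract_tcp_payload(frame: bytes):
    """Return (src_ip, dst_ip, payload) for an IPv4/TCP frame, else None."""
    if len(frame) < MIN_TOTAL_LEN:
        return None
    if struct.unpack("!H", frame[12:14])[0] != 0x0800:  # EtherType is not IPv4
        return None
    ip_start = ETH_HEADER_LEN
    ip_header_len = (frame[ip_start] & 0x0F) * 4        # IHL field, in bytes
    if frame[ip_start + 9] != 6:                        # protocol 6 = TCP
        return None
    tcp_start = ip_start + ip_header_len
    if len(frame) < tcp_start + 20:                     # truncated TCP header
        return None
    tcp_header_len = (frame[tcp_start + 12] >> 4) * 4   # data offset, in bytes
    src_ip = ".".join(str(b) for b in frame[ip_start + 12:ip_start + 16])
    dst_ip = ".".join(str(b) for b in frame[ip_start + 16:ip_start + 20])
    # Step 102: the data segment starts after the three protocol headers.
    return src_ip, dst_ip, frame[tcp_start + tcp_header_len:]


def matched_fields(payload: bytes):
    """Step 103: return the target fields that occur in the payload text."""
    text = payload.decode("utf-8", errors="ignore").lower()
    return [field for field in TARGET_FIELDS if field in text]


def is_mining_packet(frame: bytes, min_matches: int = MIN_MATCHES) -> bool:
    """Flag a frame whose payload contains at least min_matches target fields."""
    parsed = extract_tcp_payload(frame)
    if parsed is None:
        return False
    return len(matched_fields(parsed[2])) >= min_matches
```

Counting the source addresses of flagged frames, for example with `collections.Counter`, would give the per-IP statistics needed for the report in step 104.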
Further, the p2p mining network Trojan behavior detection subsystem comprises a static-traffic p2p network mining Trojan detection module and a dynamic-traffic p2p network mining Trojan detection module. pcap data is input and filtered, then aggregated into network flow data according to the definition of a network flow, and the features of the flow data are computed. The Affinity Propagation clustering algorithm is run on these features to obtain p2p network clusters, the dynamic behavior features of each p2p network are extracted and input into a pre-trained support vector machine, and the support vector machine judges whether each network is a p2p mining network. Finally, the nodes are filtered again to screen mining nodes out of the p2p mining network; if a mining node is found, a warning is issued and recorded in a log.
Further, the input of the static-traffic p2p network mining Trojan detection module is a static traffic pcap file.
Further, the input of the dynamic-traffic p2p network mining Trojan detection module is a pcap file captured by starting packet capture on the selected network card.
Further, the p2p mining network Trojan behavior detection subsystem works in the following steps:
step 105, filtering out the traffic of the common ports (80, 443), computing connection failure rate statistics for each target node, and judging a node to be a p2p node if its connection failure rate is higher than a set threshold;
step 106, converting the raw traffic data into statistical flow data according to the definition of a network flow, and extracting the traffic features of each p2p node obtained in step 105, including the number of bytes per flow, the number of packets per flow, the flow arrival interval, the flow density, and the number of flows in a concentrated time period;
step 107, clustering with the Affinity Propagation (AP) algorithm using the features extracted in step 106 to obtain p2p network clusters, which include possible mining p2p networks and normal p2p application networks, and extracting the communication traffic features of each p2p network, including the cluster connection feature, the cluster shared neighbor feature, the cluster hottest connection feature and the cluster changeability feature;
step 108, inputting the features extracted in step 107 into a support vector machine to identify mining p2p networks, and, for a network judged to be a mining p2p network, taking the average value of the features over all of its nodes;
step 109, comparing the features of each node with the average value obtained in step 108 to further screen out p2p mining nodes;
and step 110, for the p2p mining nodes detected in step 109, listing the IP addresses exhibiting mining behavior together with the traffic statistics, generating a statistical report, and recommending that those IP addresses be blocked or inspected.
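The p2p node judgment of step 105 can be sketched in Python as follows. This is only an illustration under stated assumptions: connection attempts are taken to be already summarized as `(src_ip, dst_port, succeeded)` records, "filtering" is read as excluding ports 80 and 443, and the threshold value of 0.3 is hypothetical because the text does not give one.

```python
from collections import defaultdict

COMMON_PORTS = {80, 443}      # ports excluded in step 105
FAILURE_RATE_THRESHOLD = 0.3  # hypothetical value, not given in the text


def find_p2p_candidates(attempts, threshold=FAILURE_RATE_THRESHOLD):
    """Return {src_ip: failure_rate} for nodes whose rate exceeds threshold.

    attempts is an iterable of (src_ip, dst_port, succeeded) records,
    an assumed input format.
    """
    total = defaultdict(int)
    failed = defaultdict(int)
    for src_ip, dst_port, succeeded in attempts:
        if dst_port in COMMON_PORTS:
            continue  # common web traffic is not considered
        total[src_ip] += 1
        if not succeeded:
            failed[src_ip] += 1
    rates = {ip: failed[ip] / total[ip] for ip in total}
    return {ip: rate for ip, rate in rates.items() if rate > threshold}
```

The intuition is that a p2p client repeatedly tries peers that have gone offline, so its failure rate is usually higher than that of a host talking to a few stable servers.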
Further, the detected mining trojans are added into a blacklist.
The mining Trojan detection system based on traffic analysis detects mining Trojans by traffic analysis. According to the communication peer of the mining Trojan, it extracts and identifies the characteristics of two kinds of Trojan: those that communicate with a mining pool, and those that join a p2p network and communicate with neighboring nodes.
The system can be deployed on a gateway or router at the boundary of a large network for real-time dynamic monitoring, and can also detect mining traffic in static traffic data.
Compared with traditional mining Trojan detection software, the system has the following advantages. First, the system handles both plaintext and ciphertext transmission. For mining that uses plaintext transmission, such as communication with a mining pool over the Stratum protocol, the pool-connected mining Trojan behavior detection subsystem accurately matches the content of the data field. For mining that uses ciphertext transmission, such as a Trojan participating in p2p network mining, the p2p mining network Trojan behavior detection subsystem identifies it by analyzing flow characteristics. Second, the system serves both personal hosts and enterprise-level users: because it filters the raw data several times and uses fast string matching and machine learning algorithms, it runs efficiently and in real time, meeting the application requirements of both cases. Finally, the system produces an analysis report and log information, from which a network administrator can promptly find and stop mining behavior, analyze the type of mining Trojan that infected the system, apply corresponding protection, and effectively protect the network from mining Trojan attacks.
The conception, the specific structure and the technical effects of the present invention will be further described with reference to the accompanying drawings to fully understand the objects, the features and the effects of the present invention.
Drawings
FIG. 1 is a software architecture diagram of the mining Trojan detection system based on traffic analysis;
FIG. 2 is a flow chart of the p2p mining network detection model.
Detailed Description
The technical contents of the preferred embodiments of the present invention will be more clearly and easily understood by referring to the drawings attached to the specification. The present invention may be embodied in many different forms of embodiments and the scope of the invention is not limited to the embodiments set forth herein.
In the drawings, structurally identical elements are denoted by the same reference numerals, and structurally or functionally similar elements are denoted by similar reference numerals throughout. The size and thickness of each component shown in the drawings are arbitrary, and the present invention is not limited to them. The thickness of components may be exaggerated in the figures where appropriate to improve clarity.
The invention provides a mining Trojan detection system based on traffic analysis, which identifies mining Trojans by traffic analysis. According to the communication peer of the mining Trojan, the traffic is divided into two types: communication with a mining pool, and communication with neighboring nodes in a p2p network. The pool-connected mining Trojan behavior detection subsystem is built on matching special fields in the communication traffic; the p2p mining network Trojan behavior detection subsystem is built on communication traffic feature extraction and a machine learning algorithm. The system is shown in FIG. 1 and comprises the following modules:
1) static-traffic mining pool Trojan detection module: using a signature detection method, static traffic pcap data is input, each packet is filtered and its data segment located, and the special character strings method, params, mining, difficulty, blob and nonce are matched; finally a detection report is generated, an alarm is issued, and the result is recorded in a log file;
2) dynamic-traffic mining pool Trojan detection module: using a signature detection method, packet capture is started automatically on the selected network card, each captured packet is filtered and its data segment located, and the special character strings method, params, mining, difficulty, blob and nonce are matched; finally a detection report is generated, an alarm is issued, and the result is recorded in a log file;
3) static-traffic p2p network mining Trojan detection module: static traffic pcap data is input and filtered, then aggregated into network flow data according to the definition of a network flow, and its features are computed; the Affinity Propagation clustering algorithm is run on these features to obtain p2p network clusters; the dynamic behavior features of each p2p network are extracted and input into a pre-trained support vector machine, which judges whether the network is a p2p mining network; finally the nodes are filtered again and mining nodes are screened out of the p2p mining network; if a mining node is found, a warning is issued and recorded in a log;
4) dynamic-traffic p2p network mining Trojan detection module: data is captured on the selected network card for 1 hour to obtain a pcap file of one hour of real-time traffic, which is input into the system, filtered, and aggregated into network flow data according to the definition of a network flow, and its features are computed; the Affinity Propagation clustering algorithm is run on these features to obtain p2p network clusters; the dynamic behavior features of each p2p network are extracted and input into a pre-trained support vector machine, which judges whether the network is a p2p mining network; finally the nodes are filtered again and mining nodes are screened out of the p2p mining network; if a mining node is found, a warning is issued and recorded in a log.
After detecting mining traffic, the system summarizes the detection results in a report that shows the distribution of mining traffic in the input data and the IP addresses and port numbers that need to be inspected or blocked, and records the traffic in a log for the network administrator to review. This information is compared with an established mining Trojan blacklist. If it is in the blacklist, a known mining Trojan has been matched; if it is not, the detected mining Trojan can be recorded in the mining Trojan threat intelligence as a new threat.
Fig. 2 explains the flow of the p2p network mining trojan detection module in detail:
The upper half of FIG. 2 shows the training process. Supervised learning is used to train a support vector machine classification model. The training set comprises labeled data from p2p mining networks and normal p2p application networks, and training uses the connection features that represent dynamic behavior, so that the model captures the dynamic characteristics of the networks.
The lower half of FIG. 2 shows the detection process: real-time flow data is input, p2p clusters are obtained through unsupervised clustering, and dynamic detection is performed with the trained model. Both training and test data are represented as flow data, where a flow is the set of packets sharing the same five-tuple (source IP address, destination IP address, source port, destination port, protocol number). Feature extraction is performed on the flow data by computing the number of bytes per flow, the number of packets per flow, the flow arrival interval, the flow density, and the number of flows in a concentrated time period, and the results are stored in csv files. The csv files are input into the Affinity Propagation clustering algorithm to obtain p2p network clusters, and the dynamic behavior features of each detected p2p network are extracted and input into the support vector machine for detection. The dynamic behavior features include: the connection feature of the cluster (the sum of the number of connections between the nodes), the shared neighbor feature of the cluster (the ratio of the number of shared connections to the total number of connections), the hottest connection feature of the cluster (the connections between certain nodes that contribute the most flows), and the changeability feature of the cluster (whether a given p2p network keeps the same hottest connection set). The support vector machine must be trained before detection. The system contains 24 models, one trained for each hour, to accommodate the dynamic nature of p2p networks and the real-time requirement of detection; during detection, the model for the appropriate time period is selected according to the input data. The support vector machine outputs a judgment of whether each p2p network is a mining network.
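The five-tuple aggregation and the first flow features described above can be sketched as follows. This is a minimal illustration, not the system's code: the packet record layout and function names are assumptions, the flow arrival interval is read here as the mean gap between the start times of successive flows from the same source, and flow density and the concentrated-period flow count are omitted because the text does not define them precisely.

```python
from collections import defaultdict


def aggregate_flows(packets):
    """Group packets into flows keyed by the five-tuple.

    packets is an iterable of
    (timestamp, src_ip, dst_ip, src_port, dst_port, proto, length),
    an assumed record layout.
    """
    flows = {}
    for ts, src, dst, sport, dport, proto, length in packets:
        key = (src, dst, sport, dport, proto)
        flow = flows.setdefault(key, {"bytes": 0, "packets": 0, "start": ts})
        flow["bytes"] += length            # number of bytes per flow
        flow["packets"] += 1               # number of packets per flow
        flow["start"] = min(flow["start"], ts)
    return flows


def mean_flow_arrival_interval(flows):
    """Mean gap between successive flow start times, per source IP.

    Returns None for a source with a single flow.
    """
    starts = defaultdict(list)
    for (src, _dst, _sport, _dport, _proto), flow in flows.items():
        starts[src].append(flow["start"])
    result = {}
    for src, times in starts.items():
        times.sort()
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        result[src] = sum(gaps) / len(gaps) if gaps else None
    return result
```

Each flow's feature row could then be written out with the standard `csv` module before clustering.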
For a network judged to be a mining p2p network, the average value of the dynamic features is taken, and the dynamic features of each node under consideration are compared with this average to screen out the p2p mining nodes.
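The text does not state how a node's features are compared with the average, so the following Python sketch adopts one possible rule purely as an assumption: a node is kept as a mining node when every one of its features lies within a relative tolerance of the network mean. The function name, the feature names in the usage note, and the tolerance of 0.5 are all hypothetical.

```python
def screen_mining_nodes(node_features, tolerance=0.5):
    """Return (flagged_nodes, means) for one network judged to be mining.

    node_features maps node -> {feature_name: value}; every node is assumed
    to carry the same feature names. The tolerance rule is an assumption,
    not the comparison defined by the invention.
    """
    names = sorted({name for feats in node_features.values() for name in feats})
    count = len(node_features)
    means = {
        name: sum(feats[name] for feats in node_features.values()) / count
        for name in names
    }
    flagged = [
        node
        for node, feats in node_features.items()
        if all(
            abs(feats[name] - means[name]) <= tolerance * abs(means[name])
            for name in names
        )
    ]
    return sorted(flagged), means
```

For example, with per-node values such as `{"connections": 10, "flows": 100}`, a peripheral node with far fewer connections and flows than the mean would be excluded under this rule.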
Finally, the system obtains the IP addresses and other traffic characteristics of the nodes that connect to a mining pool or to a mining p2p network. These can be compared with the built-in mining Trojan blacklist: a node that appears in the blacklist is identified as a known mining Trojan, and a node that does not appear can be added to the blacklist.
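The blacklist comparison can be sketched in a few lines of Python. It assumes, for illustration only, that the blacklist is held as a set of IP address strings; the function name is hypothetical, and a real blacklist may also hold ports or other traffic characteristics.

```python
def triage_against_blacklist(detected_ips, blacklist):
    """Split detected nodes into known and new, then extend the blacklist.

    blacklist is a set of IP strings (an assumed representation) and is
    updated in place with the newly detected addresses.
    """
    known = sorted(ip for ip in detected_ips if ip in blacklist)
    new = sorted(ip for ip in detected_ips if ip not in blacklist)
    blacklist.update(new)
    return known, new
```

The `known` list corresponds to matches against already recorded mining Trojans, and the `new` list to nodes that would be recorded as new threats.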
At present, mining Trojan attacks are very frequent, and every ordinary computer or cloud server is at risk of infection. Regularly scanning for and removing mining Trojans is an important means of keeping computers secure, yet mining Trojans can conceal themselves with sophisticated hiding techniques. The system adopts a modular design in which the trained model, the machine learning algorithm, the clustering algorithm, the signature matching rules and the Trojan blacklist are all pluggable. It covers the mining behavior of mining Trojans more completely, and its timeliness, efficiency and high detection success rate suit the needs of both individual and enterprise users.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.