CN116016220A - Method, device and equipment for predicting service traffic based on DNS traffic - Google Patents

Info

Publication number
CN116016220A
Authority
CN
China
Prior art keywords
dns
traffic
flow
influence
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211662873.XA
Other languages
Chinese (zh)
Other versions
CN116016220B (en)
Inventor
边占朝
周涛
贾晋康
张敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianyi Safety Technology Co Ltd
Original Assignee
Tianyi Safety Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianyi Safety Technology Co Ltd filed Critical Tianyi Safety Technology Co Ltd
Priority to CN202211662873.XA priority Critical patent/CN116016220B/en
Publication of CN116016220A publication Critical patent/CN116016220A/en
Application granted granted Critical
Publication of CN116016220B publication Critical patent/CN116016220B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The application discloses a method, a device and equipment for predicting service traffic based on DNS traffic, wherein in the method, distributed traffic of a plurality of position points in a target network is obtained, and the distributed traffic comprises DNS traffic and service traffic; constructing a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic; the DNS influence model is used for indicating the association relation between DNS servers included in the target network and influence information of each DNS server; and obtaining a prediction result of a service flow model of the target network according to the DNS query behavior of the first node and the DNS influence model, wherein the service flow model is used for indicating the service flow generated by each DNS server. Thus, based on distributed global traffic collection, more accurate traffic prediction can be realized.

Description

Method, device and equipment for predicting service traffic based on DNS traffic
Technical Field
The present disclosure relates to the field of network security technologies, and in particular, to a method, an apparatus, and a device for predicting service traffic based on DNS traffic.
Background
The Domain Name System (DNS) is an Internet service. It serves as a distributed database that maps domain names and IP addresses to each other, making it more convenient for people to access the Internet. The DNS protocol is an application-layer network protocol that communicates over port 53 of the Transmission Control Protocol (TCP) or the User Datagram Protocol (UDP) and defines the standard by which network devices map between IP addresses and domain names. In some scenarios, a domain name server (Domain Name Server, also abbreviated DNS) refers to the devices and systems that provide DNS services for users.
DNS traffic refers to traffic of the DNS protocol, typically but not exclusively a byte stream carried over UDP or TCP with port number 53. According to its direction, DNS traffic sent from a user to a DNS server as a query may be referred to as DNS query traffic, and the response or reply information returned from the DNS server to the DNS user's terminal may be referred to as DNS response traffic. Service traffic refers to non-DNS communication between an end user and a service server. It may use any protocol, and it is the IP traffic exchanged with the server after the user has learned the server's Internet Protocol (IP) address. Service traffic and DNS query traffic are typically in a many-to-one relationship, that is, multiple service accesses may originate from one DNS query.
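As a non-limiting illustration of the definitions above, the following Python sketch labels a flow record as DNS query traffic, DNS response traffic, or service traffic using only the port number. The `Flow` record and its field names are assumptions made for this example; since DNS traffic is not strictly limited to port 53, a practical classifier would likely need more than this check.

```python
from dataclasses import dataclass

DNS_PORT = 53  # standard DNS port for both UDP and TCP


@dataclass(frozen=True)
class Flow:
    """Hypothetical flow record; the field names are illustrative only."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str  # "udp" or "tcp"


def classify_flow(flow: Flow) -> str:
    """Label a flow by its direction relative to port 53."""
    if flow.protocol not in ("udp", "tcp"):
        return "service"
    if flow.dst_port == DNS_PORT:
        return "dns_query"      # user -> DNS server
    if flow.src_port == DNS_PORT:
        return "dns_response"   # DNS server -> user
    return "service"            # any non-DNS communication
```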
Because DNS traffic is closely associated with service traffic, the service traffic of a live network can be predicted from DNS traffic. Currently, one approach estimates the future total traffic of a specific area from the ratio of DNS traffic to total traffic; another predicts service traffic by monitoring DNS services and analyzing DNS query behavior. However, neither approach can predict service traffic accurately. Therefore, how to improve the accuracy of predicting service traffic based on DNS traffic is worth studying.
Disclosure of Invention
The application provides a method, a device, equipment and a medium for predicting service traffic based on DNS traffic, which are used for improving accuracy of predicting service traffic based on DNS traffic.
In a first aspect, the present application provides a method for predicting service traffic based on DNS traffic, the method comprising:
acquiring distributed traffic of a plurality of location points in a target network, wherein the distributed traffic comprises DNS traffic and service traffic;
constructing a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic, wherein the DNS influence model indicates the association relation between the DNS servers included in the target network and the influence information of each DNS server, the influence information indicates the probability that DNS traffic from a first node influences the service traffic of the DNS server, and the first node is any device accessing the target network;
and acquiring a prediction result of a service traffic model of the target network according to the DNS query behavior of the first node and the DNS influence model, wherein the service traffic model indicates the service traffic generated by each DNS server.
With this method, global traffic analysis of the target network can be performed on the basis of globally distributed traffic collection, so that a DNS influence model reflecting the global traffic condition of the target network can be constructed, and more accurate service traffic prediction can be achieved based on that model. The method provided by the application can therefore improve the accuracy of predicting service traffic based on DNS traffic.
In an alternative embodiment, before the constructing a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic, the method further includes:
acquiring an initialized DNS influence model configured by a user, wherein the initialized DNS influence model includes an initialized DNS role and initialized influence information of each DNS server, and the initialized DNS role indicates the role of a DNS server in the target network;
the constructing a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic includes:
updating the initialized DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic to obtain the DNS influence model.
In this embodiment, starting from an initialized DNS influence model improves the efficiency of constructing the DNS influence model.
In an alternative embodiment, the constructing a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic includes:
counting the number of queries sent by a second node to a first DNS server;
determining, according to the number of queries, the probability that DNS query behavior from the second node influences the first DNS server, wherein the probability is a first probability value when the number of queries is a first number and a second probability value when the number of queries is a second number, the second number being larger than the first number and the second probability value being larger than the first probability value.
In an alternative embodiment, the behavior characteristics include one or more of the following information: timestamp, network area identification, Internet Protocol (IP) address, DNS query source device identification, and service traffic source device identification.
In an optional embodiment, after the acquiring a prediction result of the service traffic model of the target network, the method further includes:
when the absolute difference between the prediction result and the service traffic included in the distributed traffic is greater than or equal to a first traffic threshold, returning to the step of constructing the DNS influence model.
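As a non-limiting sketch of the deviation check described above, the following Python function decides whether the DNS influence model should be reconstructed. The function name and the use of a single scalar traffic value are assumptions made for illustration; the application does not specify how the prediction result and the observed service traffic are represented.

```python
def needs_rebuild(predicted: float, observed: float, threshold: float) -> bool:
    """Return True when |predicted - observed| >= threshold.

    `predicted` is the predicted service traffic, `observed` is the service
    traffic found in the collected distributed traffic, and `threshold`
    plays the role of the first traffic threshold (same unit, e.g. bytes).
    """
    return abs(predicted - observed) >= threshold
```

Under these assumptions, a caller would return to the model construction step whenever `needs_rebuild(...)` is true and keep the current model otherwise.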
In an alternative embodiment, the acquiring distributed traffic of a plurality of location points in the target network includes:
acquiring traffic information collected by traffic collection components at different location points;
screening out, from the traffic information, target traffic information belonging to a preset domain name of interest;
and separating DNS traffic from the target traffic information, and determining the remaining traffic information in the target traffic information as the service traffic.
In an alternative embodiment, the method further comprises:
identifying an associated domain name based on the behavior characteristics of the service traffic;
and updating the domain name of interest according to the associated domain name.
In an alternative embodiment, the influence information includes one or more of the following:
a client IP list of traffic sources, a router IP list, a proxy IP list, and a load balancing device IP list.
In an alternative embodiment, the prediction result of the service traffic model includes one or more of the following indexes: traffic trend, rise and fall amplitude, traffic distribution prediction, traffic period information, traffic offset, and normal traffic oscillation interval.
In a second aspect, the present application provides an apparatus for predicting service traffic based on DNS traffic, the apparatus comprising:
an acquisition module, configured to acquire distributed traffic of a plurality of location points in a target network, wherein the distributed traffic comprises DNS traffic and service traffic;
a construction module, configured to construct a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic, wherein the DNS influence model indicates the association relation between the DNS servers included in the target network and the influence information of each DNS server, the influence information indicates the probability that DNS traffic from a first node influences the service traffic of the DNS server, and the first node is any device accessing the target network;
and a prediction module, configured to acquire a prediction result of a service traffic model of the target network according to the DNS query behavior of the first node and the DNS influence model, wherein the service traffic model indicates the service traffic generated by each DNS server.
In an alternative embodiment, the acquisition module is further configured to, before the DNS influence model is constructed based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic:
acquire an initialized DNS influence model configured by a user, wherein the initialized DNS influence model includes an initialized DNS role and initialized influence information of each DNS server, and the initialized DNS role indicates the role of a DNS server in the target network;
the construction module, when constructing the DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic, is specifically configured to:
update the initialized DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic to obtain the DNS influence model.
In an alternative embodiment, the construction module, when constructing the DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic, is specifically configured to:
count the number of queries sent by a second node to a first DNS server;
determine, according to the number of queries, the probability that DNS query behavior from the second node influences the first DNS server, wherein the probability is a first probability value when the number of queries is a first number and a second probability value when the number of queries is a second number, the second number being larger than the first number and the second probability value being larger than the first probability value.
In an alternative embodiment, the behavior characteristics include one or more of the following information: timestamp, network area identification, Internet Protocol (IP) address, DNS query source device identification, and service traffic source device identification.
In an alternative embodiment, the apparatus further comprises:
a prediction deviation correction module, configured to, after the prediction result of the service traffic model of the target network is acquired, return to constructing the DNS influence model when it is determined that the absolute difference between the prediction result and the service traffic included in the distributed traffic is greater than or equal to a first traffic threshold.
In an alternative embodiment, the acquisition module, when acquiring distributed traffic of a plurality of location points in the target network, is specifically configured to:
acquire traffic information collected by traffic collection components at different location points;
screen out, from the traffic information, target traffic information belonging to a preset domain name of interest;
and separate DNS traffic from the target traffic information, and determine the remaining traffic information in the target traffic information as the service traffic.
In an alternative embodiment, the apparatus further comprises:
an identification module, configured to identify an associated domain name based on the behavior characteristics of the service traffic;
and an updating module, configured to update the domain name of interest according to the associated domain name.
In an alternative embodiment, the influence information includes one or more of the following:
a client IP list of traffic sources, a router IP list, a proxy IP list, and a load balancing device IP list.
In an alternative embodiment, the prediction result of the service traffic model includes one or more of the following indexes: traffic trend, rise and fall amplitude, traffic distribution prediction, traffic period information, traffic offset, and normal traffic oscillation interval.
In a third aspect, the present application provides an electronic device, including a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
the memory stores a computer program which, when executed by the processor, causes the processor to implement the steps of any one of the methods for predicting service traffic based on DNS traffic described above.
Accordingly, the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any one of the methods for predicting service traffic based on DNS traffic described above.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a method for predicting service traffic based on DNS traffic according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of obtaining distributed traffic according to an embodiment of the present application;
Fig. 3 is a schematic flow chart of constructing a DNS influence model according to an embodiment of the present application;
Fig. 4 is another flow chart of a method for predicting service traffic based on DNS traffic according to an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an apparatus for predicting service traffic based on DNS traffic according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail below with reference to the accompanying drawings, wherein it is apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
Currently, because DNS traffic is closely correlated with service traffic, the service traffic of a live network can be predicted from DNS traffic. A first possible implementation estimates the future total traffic of a specific area from the ratio of DNS traffic to total traffic. However, this implementation uses only the traffic ratio as the basis for prediction and cannot achieve accurate prediction; in addition, it is even less able to predict traffic within a limited domain name space, and can only roughly estimate the overall traffic of a local area.
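For illustration only, the ratio-based estimate of the first implementation can be sketched as a single division. The function name and the example share are assumptions; the application does not give a formula, and the sketch also makes the limitation visible, since a single ratio carries no per-domain or per-server information.

```python
def estimate_total_traffic(dns_bytes: float, dns_share: float) -> float:
    """Estimate total traffic from DNS traffic and a historical DNS share.

    `dns_share` is the assumed fraction of total traffic that is DNS
    traffic (0 < dns_share <= 1), measured over some earlier period.
    """
    if not 0 < dns_share <= 1:
        raise ValueError("dns_share must be in the interval (0, 1]")
    return dns_bytes / dns_share
```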
A second possible implementation predicts service traffic by monitoring DNS services and analyzing DNS query behavior. Compared with the first implementation, it performs some data processing on DNS service behavior. However, in a DNS system with multi-level caching it cannot perceive traffic intercepted by intermediate caching devices, cannot treat different roles differently, and is confined to the DNS server, so it cannot perceive the full life cycle of the service traffic and therefore cannot predict service traffic accurately.
A DNS system with multi-level caching may have DNS servers in many different roles, which may include, but are not limited to, the following:
(1) Root DNS server: stores top-level domain name information.
(2) Authoritative DNS server: a domain name resolution server authorized by the upper level; the authorization can be delegated downwards.
(3) Recursive DNS server: receives DNS queries from users and returns the results to them, caching query results to avoid repeated upward queries. It is usually open to the public, provided by a network operator, and often configured in the network address information of personal computers.
(4) Forwarding DNS (or DNS relay): accepts a user query and returns the result, but only relays the query to, and the result from, a recursive DNS rather than resolving on the user's behalf; this role also has DNS caching capability. For example, an ordinary home wireless router is typically a forwarding DNS.
(5) DNS user: a device or system that actually needs to convert domain names to IP addresses, such as a personal computer.
(6) DNS cache: generally refers to a DNS service role with DNS caching capability.
In view of this, embodiments of the present application provide a method for predicting service traffic based on DNS traffic. In the method, a DNS influence model of the target network is constructed by collecting distributed traffic in the target network, so that service traffic can be predicted through the DNS influence model. Constructing the influence model from the collected traffic information enables more accurate prediction of service traffic, and accurate prediction in turn helps guard against abnormal traffic attacks and the like in the network.
Referring to fig. 1, a flowchart of a method for predicting service traffic based on DNS traffic is provided in an embodiment of the present application. The process may include the following steps:
Step 101, obtaining distributed traffic of a plurality of location points in a target network, wherein the distributed traffic comprises DNS traffic and service traffic. The plurality of location points of the target network include, but are not limited to: customer access layer core network egress, carrier access layer network, carrier convergence layer network, client device, etc.
In an alternative embodiment, the distributed traffic may be obtained through existing traffic collection components, systems or devices; alternatively, it may be obtained through dedicated traffic collection components, systems or devices. By way of example, the traffic collection component, system or device may include, but is not limited to, an existing network device with a network traffic monitoring function (NetFlow), or a device or probe dedicated to collecting network traffic. Thus, in the embodiment of the application, cost can be reduced during traffic collection by reusing existing traffic collection equipment. Moreover, collecting traffic does not intrude on the service, so the workload of DNS operation and maintenance personnel is not increased and the actual service is not affected.
Exemplarily, referring to fig. 2, a schematic flow chart of obtaining distributed traffic is provided in an embodiment of the present application. The process may include the following steps:
Step 1011, acquiring traffic information collected by the traffic collection components at different location points. The traffic information at this point may include statistics on the traffic data received and transmitted over the communication links, as collected by the traffic collection components.
Step 1012, judging whether the traffic information is target traffic information belonging to a preset domain name of interest. If so, execution continues with step 1013. In this way, step 1012 screens out, from the traffic information, the target traffic information belonging to the preset domain name of interest. The preset domain name of interest indicates a domain name that needs attention; for example, it may be an initialized domain name of interest configured by a user, or the currently maintained domain name of interest obtained after the initialized domain name of interest has been updated, which is not limited in the application.
Step 1013, determining whether the target traffic information is DNS traffic. If yes, proceed to step 1014a; if not, proceed to step 1014b. The target traffic information may be a plurality of pieces of traffic information collected by different traffic collection devices at different times, for example, behavior log information of a plurality of flows.
Step 1014a, obtaining DNS traffic information.
Step 1014b, obtaining service traffic information.
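Steps 1011 to 1014b can be sketched, in a non-limiting way, as a filter followed by a split. The `FlowRecord` type, its fields, and the assumption that every record already carries the domain name it relates to are hypothetical; the application does not describe the record layout produced by the collection components.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical record emitted by a traffic collection component."""
    domain: str      # domain name the flow is associated with (assumed known)
    is_dns: bool     # True if the record is DNS protocol traffic
    byte_count: int


def split_distributed_traffic(
    records: Iterable[FlowRecord], domains_of_interest: Set[str]
) -> Tuple[List[FlowRecord], List[FlowRecord]]:
    """Return (dns_traffic, service_traffic) for the domains of interest."""
    dns_traffic: List[FlowRecord] = []
    service_traffic: List[FlowRecord] = []
    for record in records:
        # Step 1012: keep only records for a domain name of interest.
        if record.domain not in domains_of_interest:
            continue
        # Step 1013: separate DNS traffic from the remaining traffic.
        if record.is_dns:
            dns_traffic.append(record)        # Step 1014a
        else:
            service_traffic.append(record)    # Step 1014b
    return dns_traffic, service_traffic
```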
In addition, an associated domain name can be identified based on the behavior characteristics of the service traffic, and the domain name of interest can be updated according to the associated domain name. It will be appreciated that the source IP address information, destination IP address information and the like recorded in the service traffic may reveal domain name information that does not yet belong to the domain name of interest. Optionally, such domain name information may be added to the domain name of interest directly. Alternatively, the domain name information may be added to the domain name of interest only when its number of occurrences is greater than or equal to a preset threshold, so that DNS traffic and service traffic can be separated promptly and accurately and the accuracy of predicting service traffic based on DNS traffic can be improved. By maintaining the domain name of interest in this way, service changes in the target network can be tracked in time, which helps ensure the accuracy of service traffic prediction.
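The threshold-based variant of this update can be sketched as follows. This is a minimal illustration under the assumption that the associated domain names have already been extracted from the service traffic as a list of strings; the function and parameter names are hypothetical.

```python
from collections import Counter
from typing import Iterable, Set


def update_domains_of_interest(
    domains_of_interest: Set[str],
    observed_domains: Iterable[str],
    min_occurrences: int,
) -> Set[str]:
    """Add associated domains seen at least `min_occurrences` times."""
    # Count only domains that are not yet domains of interest.
    counts = Counter(d for d in observed_domains if d not in domains_of_interest)
    promoted = {domain for domain, n in counts.items() if n >= min_occurrences}
    # Return a new set so the caller's original set is left unchanged.
    return domains_of_interest | promoted
```

Setting `min_occurrences` to 1 corresponds to the first option in the text, where every newly seen domain name is added directly.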
In one possible implementation, after the DNS traffic information is obtained in step 1014a and the service traffic information is obtained in step 1014b, the DNS traffic information and the service traffic information may each be organized into structured DNS traffic information and structured service traffic information to facilitate subsequent data processing and analysis.
In the embodiment of the application, traffic information is collected in a distributed manner in the target network, and the collected traffic information is filtered with reference to the domain name of interest and then separated into DNS traffic and service traffic. This yields a more accurate picture of the data transmission state of the DNS traffic and service traffic of interest in the target network, so that a more accurate DNS influence model can subsequently be constructed from the processed distributed traffic and more accurate service traffic prediction can be performed.
In addition, the embodiment of the application pays attention not only to DNS traffic but also to service traffic, so that the full life cycle of DNS traffic and service traffic can be perceived, which supports more accurate prediction of service traffic based on DNS traffic.
Step 102, constructing a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic. The DNS influence model indicates the association relation between the DNS servers included in the target network and the influence information of each DNS server; the influence information indicates the probability that DNS traffic from a first node influences the service traffic of the DNS server; and the first node is any device accessing the target network.
In an alternative embodiment, before the DNS influence model is constructed, an initialized DNS influence model configured by the user may also be obtained in order to improve construction efficiency. The initialized DNS influence model includes an initialized DNS role and initialized influence information of each DNS server, where the initialized DNS role indicates the role of a DNS server in the target network. For example, role information about the DNS servers in the target network uploaded by the user may be obtained and stored, which may speed up model convergence when the DNS influence model is constructed. The influence information may include, but is not limited to, one or more of the following: a client IP list of traffic sources, a router IP list, a proxy IP list, and a load balancing device IP list. With the influence information, the association relation between DNS servers can be determined more effectively on the basis of the DNS roles, which makes it easier to obtain the deployment architecture of the DNS servers in the target network.
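One possible in-memory shape for such a user-configured initial model is sketched below. The class names, field names, role strings, and the example addresses (taken from documentation address ranges) are all assumptions made for illustration; the application does not prescribe a configuration format.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InfluenceInfo:
    """Initialized influence information for one DNS server (hypothetical)."""
    client_ips: List[str] = field(default_factory=list)
    router_ips: List[str] = field(default_factory=list)
    proxy_ips: List[str] = field(default_factory=list)
    load_balancer_ips: List[str] = field(default_factory=list)


@dataclass
class InitialDnsInfluenceModel:
    """User-configured starting point for the DNS influence model."""
    roles: Dict[str, str] = field(default_factory=dict)            # server IP -> role
    influence: Dict[str, InfluenceInfo] = field(default_factory=dict)


# Example configuration a user might upload (values are made up).
initial_model = InitialDnsInfluenceModel(
    roles={"192.0.2.53": "recursive", "192.0.2.1": "forwarding"},
    influence={
        "192.0.2.53": InfluenceInfo(
            client_ips=["198.51.100.10"], router_ips=["192.0.2.1"]
        )
    },
)
```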
Based on the above embodiment, constructing the DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic may be implemented as updating the initialized DNS influence model based on those behavior characteristics to obtain the DNS influence model. For example, referring to fig. 3, a schematic flow chart of constructing a DNS influence model according to an embodiment of the present application is provided. The process may include the following steps:
Step 1021, obtaining an initialized DNS influence model.
Step 1022a, extracting the behavior characteristics of the DNS traffic. The behavior characteristics of the DNS traffic may include, but are not limited to, one or more of the following information: timestamp, network area identification, Internet Protocol (IP) address, DNS query source device identification, and service traffic source device identification.
Step 1022b, extracting the behavior characteristics of the service traffic. The behavior characteristics of the service traffic may also include, but are not limited to, one or more of the following information: timestamp, network area identification, Internet Protocol (IP) address, DNS query source device identification, and service traffic source device identification. It can be understood that, based on the behavior characteristics of the DNS traffic and of the service traffic, the DNS traffic can be matched with the service traffic and the DNS servers involved in both can be identified, so that the DNS influence model reflects the service state in the target network more accurately when it is constructed, and the service traffic can in turn be predicted more accurately based on the DNS traffic.
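The matching of DNS traffic to service traffic is not spelled out in the text. One simple way to sketch it, under stated assumptions, is to attribute a service flow to the most recent DNS answer that gave the same client the same server IP within a time window. The record types, the field names, and the 300-second window below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class DnsAnswer:
    timestamp: float   # seconds
    client_ip: str     # DNS query source device
    domain: str
    resolved_ip: str


@dataclass(frozen=True)
class ServiceFlow:
    timestamp: float   # seconds
    client_ip: str     # service traffic source device
    server_ip: str
    byte_count: int


def match_service_to_dns(
    answers: Iterable[DnsAnswer],
    flows: Iterable[ServiceFlow],
    window_s: float = 300.0,
) -> Dict[str, int]:
    """Sum service bytes per domain for flows that follow a matching answer."""
    # Index DNS answers by (client, resolved IP) for quick lookup.
    index: Dict[Tuple[str, str], List[DnsAnswer]] = {}
    for answer in answers:
        index.setdefault((answer.client_ip, answer.resolved_ip), []).append(answer)

    bytes_per_domain: Dict[str, int] = {}
    for flow in flows:
        candidates = [
            a for a in index.get((flow.client_ip, flow.server_ip), [])
            if 0 <= flow.timestamp - a.timestamp <= window_s
        ]
        if not candidates:
            continue  # no DNS answer explains this flow
        latest = max(candidates, key=lambda a: a.timestamp)
        bytes_per_domain[latest.domain] = (
            bytes_per_domain.get(latest.domain, 0) + flow.byte_count
        )
    return bytes_per_domain
```

Several flows can match one answer, which reflects the many-to-one relationship between service traffic and DNS query traffic.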
Step 1023a, updating DNS role information based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic. For example, cluster analysis may be performed on these behavior characteristics, and the DNS roles of nodes in the target network may be determined according to the clustering result. For example, if the clustering result shows that a node receives many DNS queries but initiates few DNS queries itself, the node may be regarded as a DNS server with a DNS caching function. It can be understood that if the DNS role information obtained from the clustering result is inconsistent with the initialized DNS role information, the initialized DNS role may be updated according to the clustering result.
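The application relies on cluster analysis for this step without naming an algorithm. The sketch below deliberately substitutes a much simpler threshold rule for clustering, only to illustrate the stated intuition (many received queries, few initiated queries, hence a caching role) and the subsequent role update. The role labels, the ratio of 2.0, and the fallback label are assumptions.

```python
from typing import Dict, Tuple


def infer_dns_role(received: int, initiated: int, cache_ratio: float = 2.0) -> str:
    """Guess a node's DNS role from its query counts (threshold heuristic)."""
    if received == 0:
        return "dns_user" if initiated > 0 else "unknown"
    # Receives far more queries than it sends upstream: likely answers from cache.
    if received >= cache_ratio * max(initiated, 1):
        return "caching_dns_server"
    return "forwarding_dns"


def update_roles(
    initial_roles: Dict[str, str],
    query_stats: Dict[str, Tuple[int, int]],
) -> Dict[str, str]:
    """Overwrite initialized roles that disagree with the inferred roles."""
    updated = dict(initial_roles)
    for node, (received, initiated) in query_stats.items():
        inferred = infer_dns_role(received, initiated)
        if inferred != "unknown" and updated.get(node) != inferred:
            updated[node] = inferred
    return updated
```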
Step 1023b, updating the association relation between DNS servers based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic. Illustratively, based on these behavior characteristics, the DNS query source node, query target node, intermediate forwarding nodes and the like of each piece of DNS traffic may be analyzed, and the source node, target node, intermediate forwarding nodes and the like of the service traffic may also be analyzed. In this way, for each node in the target network, the traffic information the node has to process and the direction of that traffic can be counted, so that the traffic influence of the node can be determined and abnormal traffic scenarios can be predicted in time; an abnormal traffic scenario may be, for example, traffic exceeding the expected amplitude.
Step 1024, constructing a DNS influence model based on the DNS role information and the association relation between DNS servers. It is understood that the roles of the DNS servers in the target network may be determined based on the DNS role information, and the direction in which traffic flows across multiple DNS servers may be determined based on the association relation between DNS servers. In this way, the total amount of traffic handled by each DNS server in the target network can be determined from the roles of the DNS servers and the traffic direction, and thus the influence of each DNS server can be determined.
Illustratively, constructing the DNS influence model may be implemented as: counting the number of queries from the second node to the first DNS server; and determining, according to the number of queries, the probability that the DNS query behavior from the second node affects the first DNS server. The second node may be, for example, each DNS query source device that queries the first DNS server. It will be appreciated that the higher the number of queries, the greater the probability of influence.
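One possible way to turn query counts into probabilities is sketched below. The application only requires that a larger query count yield a larger probability; normalizing each source's count by the total number of queries received by the server is an assumption of this example, as are the function name and the (source, server) log format.

```python
from collections import Counter


def influence_probabilities(query_log, server):
    """Map each query source to its share of the queries sent to `server`.

    query_log is a list of (source, server) tuples, one per observed query.
    """
    counts = Counter(src for src, dst in query_log if dst == server)
    total = sum(counts.values())
    if total == 0:
        return {}
    # A higher count always gives a higher share, so the mapping is monotonic.
    return {src: n / total for src, n in counts.items()}
```

Any other monotonically increasing mapping from count to probability would satisfy the same requirement.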
Illustratively, constructing the DNS influence model may be implemented as constructing a DNS influence weighted graph, which may be represented, for example, as G = (V, E, W). A DNS query source device may be taken as a vertex Vi and a DNS server as a vertex Vj; the directed edge Eij indicates that the DNS query source device performs DNS queries to the DNS server, and W indicates the query count. The weight distribution may be denoted as Q(w), which represents weights determined based on the traffic status, with a greater weight indicating a greater DNS influence. In this way, the traffic condition of the DNS servers in the target network can be determined through the DNS influence weighted graph, and abnormal traffic scenes can be predicted through Q(w).
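The weighted graph G = (V, E, W) can be sketched with a plain dictionary keyed by directed edge. The application does not define Q(w) precisely; here it is assumed, purely for illustration, to be each edge's share of the total query count. The class and method names are hypothetical.

```python
from collections import defaultdict


class DnsInfluenceGraph:
    """Directed graph: edge (Vi, Vj) means source Vi queried DNS server Vj."""

    def __init__(self):
        self.weights = defaultdict(int)  # W: (Vi, Vj) -> query count

    def add_query(self, source, server, count=1):
        self.weights[(source, server)] += count

    def vertices(self):
        # V is implied by the endpoints of the recorded edges.
        return {v for edge in self.weights for v in edge}

    def q(self):
        """Assumed form of Q(w): each edge's share of all recorded queries."""
        total = sum(self.weights.values())
        if total == 0:
            return {}
        return {edge: w / total for edge, w in self.weights.items()}

    def server_load(self, server):
        """Total query count arriving at one DNS server."""
        return sum(w for (_, dst), w in self.weights.items() if dst == server)
```

A usage example: after recording the observed queries with add_query, server_load gives the per-server traffic condition and q gives the relative weight of each source-to-server relation.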
Step 103, obtaining a prediction result of a service flow model of the target network according to the DNS query behavior of the first node and the DNS influence model, where the service flow model is used to indicate the service traffic generated by each DNS server. The prediction result of the service flow model includes one or more of the following indexes: flow situation, up-down amplitude, flow distribution prediction, flow period information, flow offset and flow normal oscillation interval.
For example, the DNS query behavior of the first node may be used as the input of the DNS influence model, and the influence probability of each DNS server in the DNS influence model is determined, so that the prediction result of the service flow model can be output.
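The application does not give a formula for this step. As a hedged sketch only, the example below assumes that the predicted service traffic for a server is the product of the node's query count, the server's influence probability, and a historical bytes-per-query ratio; all three inputs and the function name are hypothetical.

```python
def predict_service_traffic(query_counts, influence, bytes_per_query):
    """Estimate service traffic per DNS server from DNS query behavior.

    query_counts:    server -> number of DNS queries issued by the first node
    influence:       server -> influence probability from the DNS influence model
    bytes_per_query: server -> assumed historical service bytes per effective query
    """
    predicted = {}
    for server, queries in query_counts.items():
        probability = influence.get(server, 0.0)
        ratio = bytes_per_query.get(server, 0.0)
        predicted[server] = queries * probability * ratio
    return predicted
```

Indexes such as the up-down amplitude or the normal oscillation interval would then be derived from a time series of such per-server estimates.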
In addition, in the embodiment of the application, prediction deviation correction processing can be performed based on the obtained prediction result of the service flow model. Optionally, when it is determined that the absolute difference between the prediction result and the service traffic included in the distributed traffic is greater than or equal to a first traffic threshold, the process returns to constructing the DNS influence model. It can be understood that the service traffic in the distributed traffic is collected in the target network; if the deviation between the prediction result and the collected result is large, the constructed DNS influence model may be inaccurate, and the DNS influence model can be rebuilt based on the deviation between the prediction result and the collected result. Therefore, through the prediction deviation correction processing, the accuracy of predicting service traffic based on DNS traffic can be improved.
According to the method provided by the embodiment of the application, the route of a DNS query can be reflected more accurately based on traffic collected in a distributed manner in the target network; in particular, when a DNS query needs to be forwarded by an intermediate node, the route can be reflected more accurately through the distributed traffic collection and the DNS influence model. Thus, DNS traffic information can be collected more comprehensively, and the influence of the DNS servers can be estimated more accurately. In addition, by constructing the DNS influence model, the influence of DNS query behavior in the target network on the DNS servers in the network can be predicted in a more timely manner, so that abnormal traffic scenes can be handled more quickly.
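The threshold test described above can be written as a short check. The sketch assumes per-server predicted and collected values held in dictionaries; the function name and data layout are illustrative only.

```python
def servers_needing_rebuild(predicted, observed, threshold):
    """Return servers whose absolute prediction error reaches the threshold.

    The comparison is inclusive (>=), matching the condition
    'greater than or equal to a first traffic threshold'.
    """
    return sorted(
        server for server in predicted
        if abs(predicted[server] - observed.get(server, 0.0)) >= threshold
    )
```

A non-empty result would trigger a return to the model construction step.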
In order to facilitate understanding of the method provided in the embodiments of the present application, referring to fig. 4, another flowchart of a method for predicting traffic based on DNS traffic provided in the embodiments of the present application is shown. The process may include the steps of:
Step 400a, obtaining an initialized DNS role. Illustratively, the initialized DNS role may be configured by a user, which may improve the efficiency of constructing the DNS influence model.
Step 400b, obtaining the initialized domain name of interest. The target flow can be focused more accurately through the domain name of interest, so that the prediction efficiency can be improved.
Step 401, obtaining flow information of a target network. In the embodiment of the application, distributed acquisition can be performed on the basis of a plurality of position points of the distributed flow probe in the target network. Wherein the flow probe can multiplex existing flow acquisition devices or systems, etc.
Step 402a, separating DNS traffic information from the traffic information of the target network. By way of example, by screening and separating the collected traffic information, the DNS traffic information and the service traffic information corresponding to the domain name of interest in the application can be obtained.
Step 402b, separating service traffic information from the traffic information of the target network.
Step 402c, based on the traffic flow information, performing discovery of the associated domain name, and updating the domain name of interest based on the associated domain name.
Step 403, constructing a DNS influence model based on the initialized DNS roles, the DNS traffic information and the service traffic information. In the embodiment of the application, in addition to the DNS role, the DNS influence model itself can also be initialized. By way of example, the initialized DNS influence model can be updated based on the DNS traffic information and the service traffic information, so that the DNS influence model reflects the direction of DNS traffic and of service traffic in the target network more accurately, and the relation between the DNS traffic and the service traffic can be obtained.
Step 404, predicting the service flow model based on the DNS influence model.
Step 405, performing prediction deviation correction processing. For example, this may be performed based on the prediction result obtained in step 404 and the service traffic information obtained in step 402b; if the deviation is large, the process returns to step 403 to adjust the constructed DNS influence model. Thus, the prediction accuracy can be improved through the prediction deviation correction processing.
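The return from step 405 to step 403 can be sketched as a bounded loop. The callables build_model and predict stand in for steps 403 and 404 and are hypothetical, as is the max_rounds limit, which the application does not mention and which is added here only to guarantee termination.

```python
def predict_with_correction(build_model, predict, observed, threshold, max_rounds=5):
    """Rebuild the model until the prediction error drops below the threshold.

    build_model(previous) -> model   (step 403; previous is None on the first call)
    predict(model)        -> float   (step 404)
    Returns the final prediction and the number of rounds used.
    """
    model = build_model(None)
    predicted = predict(model)
    rounds = 1
    # Step 405: return to step 403 while the deviation is too large.
    while abs(predicted - observed) >= threshold and rounds < max_rounds:
        model = build_model(model)
        predicted = predict(model)
        rounds += 1
    return predicted, rounds
```

For instance, with a toy model that is a single scaling factor increased by 0.25 on each rebuild, the loop stops as soon as the scaled prediction is within the threshold of the collected value.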
By the method provided by the embodiment of the application, global acquisition and global prediction for the target network can be realized. Based on global prediction, misjudgment of the traffic of a single network, a local network or a local device in the target network can be avoided. The method can also be applied to scenes such as DNS poisoning treatment, so that DNS poisoning can be cleaned accurately and quickly. The method can also be applied to the precise, targeted delivery of domain name blacklists and whitelists, which reduces the impact on normal users.
Based on similar inventive concepts, fig. 5 is a schematic structural diagram of an apparatus for predicting traffic based on DNS traffic according to an embodiment of the present application, as shown in fig. 5, where the apparatus includes:
an obtaining module 501, configured to obtain distributed traffic of a plurality of location points in a target network, where the distributed traffic includes DNS traffic and service traffic;
a construction module 502, configured to construct a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic; the DNS influence model is used for indicating the association relation between DNS servers included in the target network and influence information of each DNS server, and the influence information is used for indicating the probability that the DNS traffic from a first node influences the traffic of the DNS servers; the first node is any device accessed to the target network;
a prediction module 503, configured to obtain a prediction result of a service flow model of the target network according to the DNS query behavior of the first node and the DNS influence model, where the service flow model is used to indicate the service traffic generated by each DNS server.
In an alternative embodiment, the obtaining module 501 is further configured to perform the following before the DNS influence model is constructed:
acquiring an initialized DNS influence model configured by a user, wherein the initialized DNS influence model comprises: initializing a DNS role and initializing influence information of each DNS server; the initialization DNS role is used for indicating the role of a DNS server in the target network;
the building module 502 is configured to build a DNS influence model based on the behavior feature of the DNS traffic and the behavior feature of the traffic, and is specifically configured to:
and constructing a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic, and updating the initialized DNS influence model to obtain the DNS influence model.
In an alternative embodiment, the building module 502 is configured to build a DNS influence model based on the behavior feature of the DNS traffic and the behavior feature of the traffic, specifically configured to:
Counting the query times of the second node to the first DNS server;
determining the probability of influence of DNS query behaviors from the second node on the first DNS server according to the query times; when the number of times of inquiry is a first number, the probability is a first probability value, and when the number of times of inquiry is a second number, the probability is a second probability value, the second number is larger than the first number, and the second probability value is larger than the first probability value.
In an alternative embodiment, the behavioral characteristics include one or more of the following information: timestamp, network area identification, Internet Protocol (IP) address, DNS query source device identification, traffic source device identification.
In an alternative embodiment, the apparatus further comprises:
and the prediction deviation correcting module is used for returning to construct the DNS influence model after obtaining the prediction result of the service flow model of the target network and when determining that the absolute difference value between the prediction result and the service flow included in the distributed flow is greater than or equal to a first flow threshold value.
In an alternative embodiment, the obtaining module 501 is configured to, when obtaining distributed traffic of a plurality of location points in the target network, specifically:
Acquiring flow information acquired by flow acquisition components from different position points;
screening out target flow information belonging to a preset domain name of interest in the flow information;
and separating DNS traffic from the target traffic information, and determining the residual traffic information in the target traffic information as the service traffic.
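The screening and separation performed by the obtaining module can be sketched as follows. The record layout (dictionaries with "domain" and "dst_port" keys) and the use of destination port 53 to recognize DNS traffic are assumptions of this example; the application does not specify how DNS traffic is identified.

```python
DNS_PORT = 53  # assumed marker for DNS traffic in this sketch


def split_traffic(records, domains_of_interest):
    """Keep records for domains of interest, then split DNS from service traffic."""
    dns, service = [], []
    for rec in records:
        if rec["domain"] not in domains_of_interest:
            continue  # screened out: not a domain of interest
        if rec["dst_port"] == DNS_PORT:
            dns.append(rec)
        else:
            service.append(rec)  # remaining target traffic is service traffic
    return dns, service
```

In practice the records would come from flow probes at different location points and would carry the other behavior characteristics as well, such as timestamps and source device identifiers.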
In an alternative embodiment, the apparatus further comprises:
the identification module is used for identifying the associated domain name based on the behavior characteristics of the service flow;
and the updating module is used for updating the domain name of interest according to the associated domain name.
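The application does not state how associated domain names are identified. One hypothetical heuristic, shown below purely as an illustration, treats a domain as associated when the same source accesses it within a short time window of accessing a domain of interest, at least a minimum number of times. The flow tuple format, window, and min_hits parameters are all assumptions.

```python
from collections import Counter


def find_associated_domains(flows, domains_of_interest, window=5.0, min_hits=2):
    """Find domains that co-occur with domains of interest for the same source.

    flows is a list of (timestamp, source, domain) tuples.
    """
    seeds = [(t, s) for t, s, d in flows if d in domains_of_interest]
    hits = Counter()
    for t, s, d in flows:
        if d in domains_of_interest:
            continue
        # Count the flow if the same source touched a domain of interest nearby in time.
        if any(s == seed_src and abs(t - seed_t) <= window for seed_t, seed_src in seeds):
            hits[d] += 1
    return {d for d, n in hits.items() if n >= min_hits}


def update_domains_of_interest(domains_of_interest, associated):
    """Extend the set of domains of interest with the associated domains."""
    return set(domains_of_interest) | set(associated)
```

The updated set would then feed back into the screening step on the next collection round.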
In an alternative embodiment, the influence information includes one or more of the following:
a client IP list of traffic sources, a router IP list, a proxy IP list, and a load balancing device IP list.
In an alternative embodiment, the prediction result of the service flow model includes one or more of the following indexes: flow situation, up-down amplitude, flow distribution prediction, flow period information, flow offset and flow normal oscillation interval.
Fig. 6 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. On the basis of the foregoing embodiments, the embodiment of the present application further provides an electronic device, as shown in fig. 6, including: a processor 601, a communication interface 602, a memory 603 and a communication bus 604, wherein the processor 601, the communication interface 602 and the memory 603 communicate with one another through the communication bus 604.
The memory 603 has stored therein a computer program which, when executed by the processor 601, causes the processor 601 to perform the steps of the method of predicting traffic based on DNS traffic in the above embodiments.
The communication bus mentioned above for the electronic device may be a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, etc. The communication bus may be classified as an address bus, a data bus, a control bus, or the like. For ease of illustration, the bus is represented in the figure by only one bold line, but this does not mean that there is only one bus or only one type of bus.
The communication interface 602 is used for communication between the above electronic device and other devices.
The Memory may include random access Memory (Random Access Memory, RAM) or may include Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit, a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
On the basis of the above embodiments, the present application further provides a computer readable storage medium, in which a computer program executable by a processor is stored, which when executed on the processor, causes the processor to implement the steps of the method for predicting traffic based on DNS traffic in the above embodiments.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (20)

1. A method for predicting traffic based on DNS traffic, the method comprising:
acquiring distributed traffic of a plurality of position points in a target network, wherein the distributed traffic comprises DNS traffic and service traffic;
constructing a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic; the DNS influence model is used for indicating the association relation between DNS servers included in the target network and influence information of each DNS server, and the influence information is used for indicating the probability that the DNS traffic from a first node influences the traffic of the DNS servers; the first node is any device accessed to the target network;
and acquiring a prediction result of a service flow model of the target network according to the DNS query behavior of the first node and the DNS influence model, wherein the service flow model is used for indicating the service flow generated by each DNS server.
2. The method of claim 1, wherein prior to constructing a DNS influence model based on the behavioral characteristics of the DNS traffic and the behavioral characteristics of the traffic, the method further comprises:
Acquiring an initialized DNS influence model configured by a user, wherein the initialized DNS influence model comprises: initializing a DNS role and initializing influence information of each DNS server; the initialization DNS role is used for indicating the role of a DNS server in the target network;
the constructing a DNS influence model based on the behavior feature of the DNS traffic and the behavior feature of the service traffic includes:
and constructing a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic, and updating the initialized DNS influence model to obtain the DNS influence model.
3. The method according to claim 1 or 2, wherein said constructing a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the traffic comprises:
counting the query times of the second node to the first DNS server;
determining the probability of influence of DNS query behaviors from the second node on the first DNS server according to the query times; when the number of times of inquiry is a first number, the probability is a first probability value, and when the number of times of inquiry is a second number, the probability is a second probability value, the second number is larger than the first number, and the second probability value is larger than the first probability value.
4. A method according to claim 3, wherein the behavioral characteristics include one or more of the following information: timestamp, network area identification, Internet Protocol (IP) address, DNS query source device identification, traffic source device identification.
5. The method according to claim 1 or 2, wherein after the obtaining the prediction result of the traffic flow model of the target network, the method further comprises:
and when the absolute difference value between the predicted result and the service flow included in the distributed flow is greater than or equal to a first flow threshold value, returning to construct the DNS influence model.
6. The method according to claim 1 or 2, wherein the obtaining distributed traffic of a plurality of location points in the target network comprises:
acquiring flow information acquired by flow acquisition components from different position points;
screening out target flow information belonging to a preset domain name of interest in the flow information;
and separating DNS traffic from the target traffic information, and determining the residual traffic information in the target traffic information as the service traffic.
7. The method of claim 6, wherein the method further comprises:
Identifying an associated domain name based on the behavioral characteristics of the traffic flow;
and updating the domain name of interest according to the associated domain name.
8. The method according to claim 1 or 2, wherein the influence information comprises one or more of the following information:
a client IP list of traffic sources, a router IP list, a proxy IP list, and a load balancing device IP list.
9. The method according to claim 1 or 2, wherein the prediction result of the traffic flow model comprises one or more of the following metrics: flow situation, up-down amplitude, flow distribution prediction, flow period information, flow offset and flow normal oscillation interval.
10. An apparatus for predicting traffic based on DNS traffic, the apparatus comprising:
an obtaining module, configured to obtain distributed traffic of a plurality of position points in a target network, wherein the distributed traffic comprises DNS traffic and service traffic;
the construction module is used for constructing a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic; the DNS influence model is used for indicating the association relation between DNS servers included in the target network and influence information of each DNS server, and the influence information is used for indicating the probability that the DNS traffic from a first node influences the traffic of the DNS servers; the first node is any device accessed to the target network;
And the prediction module is used for acquiring a prediction result of a service flow model of the target network according to the DNS query behavior of the first node and the DNS influence model, wherein the service flow model is used for indicating the service flow generated by each DNS server.
11. The apparatus of claim 10, wherein the obtaining module is further configured to perform the following before a DNS influence model is constructed based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic:
acquiring an initialized DNS influence model configured by a user, wherein the initialized DNS influence model comprises: initializing a DNS role and initializing influence information of each DNS server; the initialization DNS role is used for indicating the role of a DNS server in the target network;
the construction module is configured to construct a DNS influence model based on the behavior feature of the DNS traffic and the behavior feature of the traffic, and is specifically configured to:
and constructing a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the service traffic, and updating the initialized DNS influence model to obtain the DNS influence model.
12. The apparatus according to claim 10 or 11, wherein the construction module is configured to construct a DNS influence model based on the behavior characteristics of the DNS traffic and the behavior characteristics of the traffic, in particular for:
counting the query times of the second node to the first DNS server;
determining the probability of influence of DNS query behaviors from the second node on the first DNS server according to the query times; when the number of times of inquiry is a first number, the probability is a first probability value, and when the number of times of inquiry is a second number, the probability is a second probability value, the second number is larger than the first number, and the second probability value is larger than the first probability value.
13. The apparatus of claim 12, wherein the behavioral characteristics include one or more of the following information: timestamp, network area identification, Internet Protocol (IP) address, DNS query source device identification, traffic source device identification.
14. The apparatus according to claim 10 or 11, characterized in that the apparatus further comprises:
and the prediction deviation correcting module is used for returning to construct the DNS influence model after obtaining the prediction result of the service flow model of the target network and when determining that the absolute difference value between the prediction result and the service flow included in the distributed flow is greater than or equal to a first flow threshold value.
15. The apparatus according to claim 10 or 11, wherein the obtaining module is configured to, when obtaining the distributed traffic of the plurality of location points in the target network, specifically:
acquiring flow information acquired by flow acquisition components from different position points;
screening out target flow information belonging to a preset domain name of interest in the flow information;
and separating DNS traffic from the target traffic information, and determining the residual traffic information in the target traffic information as the service traffic.
16. The apparatus of claim 15, wherein the apparatus further comprises:
the identification module is used for identifying the associated domain name based on the behavior characteristics of the service flow;
and the updating module is used for updating the domain name of interest according to the associated domain name.
17. The apparatus of claim 10 or 11, wherein the influence information comprises one or more of the following:
a client IP list of traffic sources, a router IP list, a proxy IP list, and a load balancing device IP list.
18. The apparatus according to claim 10 or 11, wherein the prediction result of the traffic flow model comprises one or more of the following metrics: flow situation, up-down amplitude, flow distribution prediction, flow period information, flow offset and flow normal oscillation interval.
19. An electronic device, comprising: the device comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
the memory having stored therein a computer program which, when executed by the processor, causes the processor to perform the steps of the method of predicting traffic based on DNS traffic as claimed in any one of claims 1 to 9.
20. A computer readable storage medium, characterized in that it stores a computer program executable by a processor, said program, when run on said processor, causing said processor to perform the steps of the method of predicting traffic based on DNS traffic as claimed in any of claims 1-9.
CN202211662873.XA 2022-12-23 2022-12-23 Method, device and equipment for predicting service traffic based on DNS traffic Active CN116016220B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211662873.XA CN116016220B (en) 2022-12-23 2022-12-23 Method, device and equipment for predicting service traffic based on DNS traffic

Publications (2)

Publication Number Publication Date
CN116016220A true CN116016220A (en) 2023-04-25
CN116016220B CN116016220B (en) 2024-06-18

Family

ID=86031009

Country Status (1)

Country Link
CN (1) CN116016220B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110145316A1 (en) * 2009-12-14 2011-06-16 Verizon Patent And Licensing, Inc. Proactive dns query system based on call flow analysis
CN106911536A (en) * 2017-04-14 2017-06-30 四川大学 A kind of DNS health degree appraisal procedures based on model of fuzzy synthetic evaluation
CN108075909A (en) * 2016-11-11 2018-05-25 阿里巴巴集团控股有限公司 A kind of method for predicting and device
CN110912749A (en) * 2019-11-29 2020-03-24 北京工业大学 Method for predicting DNS data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李雨泰;来风刚;尚智婕;董希杰;: "基于智能DNS的网络流量负载均衡控制研究", 自动化与仪器仪表, no. 11, 25 November 2018 (2018-11-25) *

Also Published As

Publication number Publication date
CN116016220B (en) 2024-06-18

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant