WO2024036822A1 - Method and apparatus for determining malicious domain name, device, and medium - Google Patents

Info

Publication number
WO2024036822A1
Authority
WO
WIPO (PCT)
Prior art keywords
address
candidate
domain name
malicious
blacklist
Prior art date
Application number
PCT/CN2022/136819
Other languages
French (fr)
Chinese (zh)
Inventor
马晨
任毅
崔乾
李世辰
冯晓冬
曹亮
徐涛
孙琦瑞
吴同
***
Original Assignee
天翼安全科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 天翼安全科技有限公司
Publication of WO2024036822A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/10Network architectures or network communication protocols for network security for controlling access to devices or network resources
    • H04L63/101Access control lists [ACL]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/14Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1441Countermeasures against malicious traffic

Definitions

  • This application relates to the field of data security technology, and in particular to a method, device, equipment and medium for determining malicious domain names.
  • Malicious domain names refer to a type of URL with malicious links.
  • Such URLs usually exploit vulnerabilities in application software or browsers to implant Trojans, virus programs and other malicious code into a website, and use disguised website service content to induce users to access these websites. If users access these websites, they may be tricked, causing their computers to be infected by malicious code and leading to security problems.
  • Phishing websites refer to a type of website that pretends to be the website of a legitimate institution such as a bank or online store. Such a website attempts to trick users into entering user names, passwords or other private information, and can therefore threaten personal privacy and property security. Malware websites contain malicious code that hackers can use to obtain and transmit a user's private or sensitive information by installing malware on the user's computer.
  • IP: Internet Protocol
  • Embodiments of the present application provide a method, device, equipment and medium for determining malicious domain names to improve the accuracy of determining malicious domain names.
  • Embodiments of the present application provide a method for determining malicious domain names.
  • the method includes:
  • For each candidate IP address, obtain the sub-data of the preset type corresponding to the network flow (Netflow) data of the candidate IP address within the preset time period, extract, through the Embedded method, the feature type corresponding to the abnormal sub-data in the sub-data of the preset type, input the feature type into the pre-trained recognition model, and obtain the recognition model's output indicating whether the candidate IP address is a malicious IP address;
  • each domain name corresponding to each determined malicious IP address is determined as a malicious domain name.
  • determining each candidate IP address based on the IP addresses in the pre-saved blacklist includes:
  • DNS: Domain Name System
  • determining each candidate IP address based on the domain names in the pre-saved blacklist includes:
  • each candidate IP address corresponding to the domain name in the blacklist is determined.
  • the method also includes:
  • the method after determining each candidate IP address, before inputting the feature type into the recognition model and obtaining whether the candidate IP address is a malicious IP address output by the recognition model, the method also includes:
  • For each candidate IP address, determine the number of domain names corresponding to the candidate IP address as the first number, take each domain name corresponding to the candidate IP address as a target domain name, determine each IP address corresponding to the target domain name, and count, as the second number, how many of those IP addresses are present in the blacklist; according to the ratio of the second number to the first number, determine the confidence that the candidate IP address is a malicious IP address;
  • the step of inputting the feature type into the recognition model and obtaining whether the candidate IP address output by the recognition model is a malicious IP address includes:
  • the confidence level corresponding to the candidate IP address and the feature type are input into the recognition model, and the recognition model's output indicating whether the candidate IP address is a malicious IP address is obtained.
  • Embodiments of the present application also provide a device for determining malicious domain names.
  • the device includes:
  • a determination module used to determine each candidate IP address based on the IP addresses in the pre-saved blacklist and the domain names in the blacklist;
  • a processing module configured to obtain, for each candidate IP address, sub-data of a preset type corresponding to the network flow (Netflow) data of the candidate IP address within a preset time period, extract, through the Embedded method, the feature type corresponding to the abnormal sub-data in the sub-data of the preset type, input the feature type into the pre-trained recognition model, and obtain the recognition model's output indicating whether the candidate IP address is a malicious IP address;
  • the determination module is also configured to determine each domain name corresponding to each determined malicious IP address as a malicious domain name based on the pre-saved correspondence between the domain name and the IP address.
  • the determination module is specifically configured to determine each candidate domain name corresponding to the IP address in the pre-saved blacklist based on the correspondence between the domain name and the IP address in the pre-saved DNS log, and determine each IP address corresponding to each candidate domain name as a candidate IP address.
  • the determination module is specifically configured to determine each candidate IP address corresponding to the domain name in the blacklist based on the corresponding relationship between the domain name and the IP address in the pre-saved DNS log.
  • the determination module is also configured to cyclically perform the following steps for each candidate IP address: determine each domain name corresponding to each candidate IP address; determine the IP address corresponding to each domain name as a candidate IP address; until the IP address corresponding to each obtained domain name is a candidate IP address, or the domain name corresponding to each candidate IP address has been obtained.
  • the processing module is further configured to, for each candidate IP address, determine the number of domain names corresponding to the candidate IP address as the first number, take each domain name corresponding to the candidate IP address as a target domain name, determine each IP address corresponding to the target domain name, and count, as the second number, how many of those IP addresses exist in the blacklist; and, according to the ratio of the second number to the first number, determine the confidence that the candidate IP address is a malicious IP address;
  • the processing module is specifically configured to input the confidence degree corresponding to the candidate IP address and the feature type into the recognition model, and obtain whether the candidate IP address output by the recognition model is a malicious IP address.
  • Embodiments of the present application further provide an electronic device.
  • the electronic device at least includes a processor and a memory.
  • the processor is configured to perform the steps of any one of the above malicious domain name determination methods when executing a computer program stored in the memory.
  • embodiments of the present application further provide a computer-readable storage medium that stores a computer program that, when executed by a processor, performs the steps of any one of the above malicious domain name determination methods.
  • In the embodiments of the present application, the electronic device determines each candidate IP address based on the IP addresses in the pre-saved blacklist and the domain names in the blacklist. After determining each candidate IP address, the electronic device, for each candidate IP address, obtains the preset type of sub-data corresponding to the Netflow data of the candidate IP address within the preset time period, extracts, through the Embedded method, the feature type corresponding to the abnormal sub-data among that sub-data, inputs the feature type into the pre-trained recognition model, and obtains the recognition model's output indicating whether the candidate IP address is a malicious IP address, thereby avoiding misidentification of malicious IP addresses.
  • After obtaining each malicious IP address, the electronic device determines each domain name corresponding to each determined malicious IP address as a malicious domain name based on the pre-saved correspondence between the domain name and the IP address, thereby improving the accuracy of determining the malicious domain name.
  • Figure 1 is a schematic diagram of a malicious domain name determination process provided by an embodiment of the present application.
  • Figure 2 is a schematic diagram of IP addresses in a blacklist provided by an embodiment of the present application.
  • Figure 3 is a schematic diagram of a process for determining candidate IP addresses provided by an embodiment of the present application.
  • Figure 4 is a schematic diagram of a process of training an original recognition model provided by an embodiment of the present application.
  • Figure 5 is a detailed schematic diagram of determining a malicious domain name provided by an embodiment of the present application.
  • Figure 6 is a schematic structural diagram of a malicious domain name determination device provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device determines each candidate IP address based on the IP addresses in the pre-saved blacklist and the domain names in the blacklist. After determining each candidate IP address, the electronic device, for each candidate IP address, obtains the preset type of sub-data corresponding to the Netflow data of the candidate IP address within the preset time period, extracts, through the Embedded method, the feature type corresponding to the abnormal sub-data among the preset type of sub-data corresponding to the candidate IP address, inputs this feature type into the pre-trained recognition model, and obtains the recognition model's output indicating whether the candidate IP address is a malicious IP address. After obtaining each malicious IP address, the electronic device determines each domain name corresponding to each determined malicious IP address as a malicious domain name based on the pre-saved correspondence between the domain name and the IP address.
  • embodiments of the present application provide a method, device, equipment and medium for determining malicious domain names.
  • Figure 1 is a schematic diagram of a malicious domain name determination process provided by an embodiment of this application. The process includes the following steps:
  • S101 Determine each candidate IP address based on the IP addresses in the pre-saved blacklist and the domain names in the blacklist.
  • the method for determining malicious domain names provided in the embodiments of this application is applied to electronic devices, which may be PCs, servers, and other devices.
  • a blacklist is pre-stored in the electronic device.
  • the blacklist includes IP addresses and domain names.
  • the IP addresses and domain names included in the blacklist are well-known and relatively fixed malicious IP addresses at home and abroad, and well-known and relatively fixed malicious domain names at home and abroad.
  • the electronic device can determine each candidate IP address that may be a malicious IP address based on the IP addresses in the pre-saved blacklist and the domain names in the blacklist.
  • the number of IP addresses included in the blacklist is not fixed and may include one or more than one.
  • the number of domain names included in the blacklist is also not fixed.
  • the IP addresses and domain names in the blacklist can be called precise threat intelligence seed data.
  • the electronic device can locally store the corresponding relationship between the IP address and the domain name.
  • the electronic device can determine each candidate IP address corresponding to the IP addresses and domain names in the blacklist based on the locally stored correspondence between the IP address and the domain name.
  • it is worth explaining that a certain IP address may correspond to multiple domain names or to no domain name at all, and a certain domain name may correspond to multiple IP addresses or to no IP address at all.
  • the electronic device can determine each IP address corresponding to a domain name in the blacklist as a candidate IP address, determine each domain name corresponding to an IP address in the blacklist, and determine the IP address corresponding to each such domain name as a candidate IP address.
  • For example, if the domain names in the blacklist include a.b.com, and the IP addresses corresponding to a.b.com are 1.1.1.2 and 1.1.1.3, then the two IP addresses 1.1.1.2 and 1.1.1.3 can be determined as candidate IP addresses.
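  • The forward lookup in this example can be sketched as follows. This is a minimal illustration only: the function name `candidates_from_domains` and the dictionary `domain_to_ips` (standing in for the locally stored correspondence between domain names and IP addresses) are assumptions, not names from the application.

```python
from typing import Dict, Iterable, Set


def candidates_from_domains(
    blacklisted_domains: Iterable[str],
    domain_to_ips: Dict[str, Set[str]],
) -> Set[str]:
    """Collect every IP address that a blacklisted domain name corresponds to."""
    candidates: Set[str] = set()
    for domain in blacklisted_domains:
        # A domain name with no recorded IP address contributes nothing.
        candidates |= domain_to_ips.get(domain, set())
    return candidates


# Hypothetical correspondence table mirroring the example in the text.
domain_to_ips = {"a.b.com": {"1.1.1.2", "1.1.1.3"}}
result = candidates_from_domains(["a.b.com"], domain_to_ips)
print(sorted(result))
```

  • Using sets keeps the candidate list free of duplicates when several blacklisted domain names share an IP address.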
  • S102 For each candidate IP address, obtain the preset type of sub-data corresponding to the Netflow data of the candidate IP address within the preset time period, extract, through the Embedded method, the feature type corresponding to the abnormal sub-data among the preset type of sub-data, input the feature type into the pre-trained recognition model, and obtain the recognition model's output indicating whether the candidate IP address is a malicious IP address.
  • the electronic device determines whether each candidate IP address is a malicious IP address. Specifically, for each obtained candidate IP address, the electronic device obtains the Netflow data of the candidate IP address within a preset time period, and obtains the preset type of sub-data corresponding to the Netflow data, where the obtained preset type of sub-data can be one or more of the ratio of uplink to downlink traffic packets, commonly used ports, the peer communication port range, and the number of bytes of uplink data packets.
  • how to obtain Netflow data of a certain IP address within a preset time period is an existing technology and will not be described again here.
  • the Netflow data includes different feature types and sub-data corresponding to each feature type.
  • the electronic device can obtain the sub-data corresponding to the preset type in the Netflow data.
  • If the preset type is the uplink and downlink traffic packet ratio, the electronic device can obtain the sub-data corresponding to the uplink and downlink traffic packet ratio from the acquired Netflow data, and the sub-data can be a specific ratio value.
  • If the preset type is a commonly used port, the electronic device can obtain the sub-data corresponding to the commonly used port type from the obtained Netflow data, and the sub-data is a specific port.
  • If the preset type is the peer communication port range, the electronic device can obtain the sub-data corresponding to the peer communication port range from the obtained Netflow data, and the sub-data is the specific port range.
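  • As a rough illustration of how such sub-data could be derived from flow records, the sketch below computes the four preset types for one IP address. Everything about the record layout is an assumption: the field names (`src`, `dst`, `src_port`, `dst_port`, `packets`, `bytes`), the reading of "uplink" as traffic sent by the candidate IP address, and the function name `netflow_sub_data` are hypothetical, and the application itself treats Netflow collection as existing technology.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def netflow_sub_data(ip: str, flows: List[dict]) -> Dict[str, object]:
    """Summarise flow records for one IP address into the preset sub-data types."""
    up = [f for f in flows if f["src"] == ip]    # assumed: uplink = sent by ip
    down = [f for f in flows if f["dst"] == ip]  # assumed: downlink = received by ip

    up_packets = sum(f["packets"] for f in up)
    down_packets = sum(f["packets"] for f in down)
    ratio: Optional[float] = up_packets / down_packets if down_packets else None

    # Ports used on the candidate IP's own side of each flow.
    local_ports = Counter(
        [f["src_port"] for f in up] + [f["dst_port"] for f in down]
    )
    # Ports used by the peers that the candidate IP talks to.
    peer_ports = [f["dst_port"] for f in up] + [f["src_port"] for f in down]
    port_range: Optional[Tuple[int, int]] = (
        (min(peer_ports), max(peer_ports)) if peer_ports else None
    )

    return {
        "up_down_packet_ratio": ratio,
        "common_port": local_ports.most_common(1)[0][0] if local_ports else None,
        "peer_port_range": port_range,
        "uplink_bytes": sum(f["bytes"] for f in up),
    }


# Hypothetical flow records for the candidate IP address 1.1.1.2.
flows = [
    {"src": "1.1.1.2", "src_port": 443, "dst": "10.0.0.5", "dst_port": 50000,
     "packets": 30, "bytes": 4000},
    {"src": "10.0.0.5", "src_port": 50000, "dst": "1.1.1.2", "dst_port": 443,
     "packets": 10, "bytes": 900},
    {"src": "1.1.1.2", "src_port": 443, "dst": "10.0.0.6", "dst_port": 51000,
     "packets": 10, "bytes": 1000},
]
sub_data = netflow_sub_data("1.1.1.2", flows)
print(sub_data)
```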
  • For each candidate IP address, the electronic device can extract, through the Embedded method, the feature type corresponding to the abnormal sub-data among the preset type of sub-data corresponding to the candidate IP address.
  • the feature type corresponding to the abnormal sub-data may be one or more of the ratio of uplink to downlink traffic packets, commonly used ports, the peer communication port range, and the number of bytes of uplink data packets.
  • For example, the extracted feature type can be the ratio of uplink to downlink traffic packets.
  • Extracting the feature type corresponding to the abnormal sub-data from several types of sub-data through the Embedded method is an existing technology and will not be described again here.
  • a pre-trained recognition model is pre-stored in the electronic device. For each candidate IP address, after determining the feature type corresponding to the abnormal sub-data of the candidate IP address, the electronic device inputs the feature type into the recognition model and obtains the output of the recognition model, which indicates whether the candidate IP address is a malicious IP address. In this way, the electronic device can determine the malicious IP addresses among the candidate IP addresses.
  • For example, if the obtained candidate IP addresses include 1.1.1.1, 1.1.1.2, and 1.1.1.3, the electronic device obtains the proportion of uplink and downlink traffic packets corresponding to 1.1.1.1, 1.1.1.2, and 1.1.1.3 within the preset time period. The Embedded method performs feature screening on this sub-data to obtain the corresponding feature type, and the pre-trained recognition model then identifies whether each candidate IP address is a malicious IP address, thereby improving the accuracy of malicious IP address identification.
  • the correspondence between the domain name and the IP address is pre-stored in the electronic device. After determining each malicious IP address, the electronic device determines, for each malicious IP address and according to the pre-stored correspondence between the domain name and the IP address, that each domain name corresponding to the malicious IP address is a malicious domain name. In this way, the electronic device can determine the malicious domain name corresponding to each malicious IP address.
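  • This final mapping from malicious IP addresses back to domain names can be sketched as a reverse lookup over the stored correspondence. The names `malicious_domains` and `dns_table`, and the sample entries, are illustrative assumptions.

```python
from typing import Dict, Iterable, Set


def malicious_domains(
    malicious_ips: Iterable[str],
    domain_to_ips: Dict[str, Set[str]],
) -> Set[str]:
    """Return every domain name that corresponds to at least one malicious IP."""
    bad = set(malicious_ips)
    return {domain for domain, ips in domain_to_ips.items() if ips & bad}


# Hypothetical stored correspondence between domain names and IP addresses.
dns_table = {
    "a.b.com": {"1.1.1.1", "1.1.1.2"},
    "c.b.com": {"1.1.1.2", "1.1.1.3"},
    "d.e.com": {"2.2.2.2"},
}
flagged = malicious_domains(["1.1.1.3"], dns_table)
print(sorted(flagged))
```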
  • the electronic device determines each candidate IP address based on the IP addresses in the pre-saved blacklist and the domain names in the blacklist. After determining each candidate IP address, the electronic device obtains the preset type of sub-data corresponding to the Netflow data of the candidate IP address within the preset time period, extracts, through the Embedded method, the feature type corresponding to the abnormal sub-data from the preset type of sub-data corresponding to the candidate IP address, inputs the feature type into the pre-trained recognition model, and obtains the recognition model's output indicating whether the candidate IP address is a malicious IP address, thereby avoiding misidentification of malicious IP addresses. According to the pre-saved correspondence between the domain name and the IP address, each domain name corresponding to each malicious IP address is then determined to be a malicious domain name, thereby improving the accuracy of determining the malicious domain name.
  • determining each candidate IP address based on the IP addresses in the pre-saved blacklist includes:
  • each candidate domain name corresponding to the IP address in the pre-saved blacklist is determined, and the IP address corresponding to each candidate domain name is determined as the candidate IP address.
  • a certain domain name may correspond to multiple IP addresses, and a certain IP address may correspond to multiple domain names.
  • If a malicious domain name corresponds to a single IP address, that IP address may be intercepted and the malicious domain name can then no longer be accessed. A malicious domain name will therefore usually correspond to multiple IP addresses. In addition, to prevent the malicious domain name from being detected, the offending party will usually keep generating new domain names through a Domain Generation Algorithm (DGA) and other methods, so that multiple domain names corresponding to a certain IP address are all malicious domain names.
  • the electronic device can determine each candidate domain name corresponding to the IP address in the blacklist, and determine the IP address corresponding to each candidate domain name as the candidate IP address.
  • a DNS log is pre-stored in the electronic device, and the correspondence between the domain name and the IP address is stored in the DNS log.
  • Based on the correspondence between the domain name and the IP address in the DNS log, the electronic device determines each domain name corresponding to an IP address in the blacklist as a candidate domain name. This step can be called IP DNS log reverse analysis.
  • After obtaining each candidate domain name, the electronic device can, according to the correspondence between the domain name and the IP address saved in the DNS log, determine each IP address corresponding to the candidate domain name as a candidate IP address. In this way, the electronic device can determine each candidate IP address corresponding to each candidate domain name. This step can be called domain name DNS log parsing. Each candidate IP address is potentially a malicious IP address.
  • the electronic device can delete the IP addresses that exist in the blacklist among the candidate IP addresses.
  • For example, suppose the IP addresses in the blacklist include 1.1.1.1, and the domain names obtained for this IP address are a.b.com and c.b.com. If the IP addresses corresponding to a.b.com are 1.1.1.1 and 1.1.1.2, and the IP addresses corresponding to c.b.com are 1.1.1.1 and 1.1.1.3, then the corresponding candidate IP addresses are 1.1.1.2 and 1.1.1.3.
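  • The two steps above (reverse analysis from a blacklisted IP address to candidate domain names, then parsing those domain names into candidate IP addresses and dropping the ones already in the blacklist) can be sketched as follows. The function name and the in-memory `dns_log` dictionary are assumptions standing in for the pre-saved DNS log.

```python
from typing import Dict, Iterable, Set, Tuple


def candidates_from_blacklisted_ips(
    blacklisted_ips: Iterable[str],
    dns_log: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str]]:
    """Return (candidate domain names, candidate IP addresses)."""
    blacklisted = set(blacklisted_ips)

    # Reverse analysis: domain names that correspond to a blacklisted IP address.
    candidate_domains = {d for d, ips in dns_log.items() if ips & blacklisted}

    # Forward parsing: all IP addresses those candidate domain names correspond to.
    candidate_ips: Set[str] = set()
    for domain in candidate_domains:
        candidate_ips |= dns_log[domain]

    # Blacklisted IP addresses are already known to be malicious, so drop them.
    return candidate_domains, candidate_ips - blacklisted


# Hypothetical DNS log matching the example in the text.
dns_log = {
    "a.b.com": {"1.1.1.1", "1.1.1.2"},
    "c.b.com": {"1.1.1.1", "1.1.1.3"},
}
domains, ips = candidates_from_blacklisted_ips(["1.1.1.1"], dns_log)
print(sorted(domains), sorted(ips))
```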
  • Figure 2 is a schematic diagram of IP addresses in a blacklist provided by an embodiment of the present application.
  • IP addresses are stored in the blacklist, and more than one IP address can be stored in the blacklist.
  • the electronic device can determine that each domain name corresponding to each malicious IP address is a malicious domain name based on the correspondence between the domain name and the IP address in the DNS log.
  • determining each candidate IP address based on the domain name in the pre-saved blacklist includes:
  • each candidate IP address corresponding to the domain name in the blacklist is determined.
  • the electronic device can determine each IP address corresponding to the domain name as a candidate IP address based on the correspondence between the domain name and the IP address in the pre-saved DNS log. In this way, the device can determine the candidate IP address corresponding to each domain name in the blacklist.
  • Among the candidate IP addresses corresponding to the domain names in the blacklist, there may be an IP address that is the same as an IP address in the blacklist. Since the IP addresses in the blacklist are themselves malicious IP addresses, there is no need to determine again whether they are malicious, which saves time and improves efficiency. Therefore, after obtaining each candidate IP address, the electronic device can check each candidate IP address, and if the candidate IP address is the same as an IP address in the blacklist, delete it from the candidate IP addresses.
  • the method further includes:
  • the domain names corresponding to some malicious IP addresses may not be malicious domain names, so DNS logs need to be repeatedly parsed to obtain each candidate IP address.
  • The following steps are executed in a loop for each determined candidate IP address: determine each domain name corresponding to each candidate IP address based on the correspondence between domain names and IP addresses in the pre-saved DNS log, and determine the IP address corresponding to each domain name, based on the same correspondence, as a candidate IP address.
  • The loop ends when the IP addresses corresponding to each obtained domain name are all candidate IP addresses, or when the domain names corresponding to each candidate IP address have all been obtained; in either case there is no need to continue determining candidate IP addresses. If any IP address corresponding to an obtained domain name is not yet a candidate IP address, that IP address is determined as a candidate IP address and each domain name corresponding to it is determined. If a domain name corresponding to a determined candidate IP address has not yet been obtained, each IP address corresponding to that domain name is determined, and it is checked whether any of those IP addresses is not yet a candidate IP address.
  • For example, the domain name is repeatedly parsed through the DNS log. The first time a.b.com is parsed, the IP addresses corresponding to the domain name are obtained as 1.1.1.1 and 1.1.1.2.
  • The corresponding domain names obtained by IP reverse analysis are a.b.com and c.b.com. Since c.b.com is a new domain name, c.b.com is parsed through the DNS log, giving 1.1.1.2 and 1.1.1.3. Since 1.1.1.3 is a new IP address, reverse analysis is performed on it, giving the corresponding domain names a.b.com and c.b.com. Since no new domain name appears, there is no need to continue.
  • The determined candidate IP addresses are 1.1.1.1, 1.1.1.2, and 1.1.1.3.
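  • The repeated parsing amounts to expanding from a seed domain name until neither new IP addresses nor new domain names appear. A minimal sketch, assuming the DNS log is available as an in-memory dictionary; the name `expand_candidates` and the table contents (chosen to agree with the forward lookups in the example) are hypothetical.

```python
from collections import deque
from typing import Dict, Set, Tuple


def expand_candidates(
    seed_domain: str,
    dns_log: Dict[str, Set[str]],
) -> Tuple[Set[str], Set[str]]:
    """Alternate forward parsing and reverse analysis until nothing new appears."""
    # Build the reverse index (IP address -> domain names) once.
    ip_to_domains: Dict[str, Set[str]] = {}
    for domain, ips in dns_log.items():
        for ip in ips:
            ip_to_domains.setdefault(ip, set()).add(domain)

    seen_domains = {seed_domain}
    candidate_ips: Set[str] = set()
    queue = deque([seed_domain])

    while queue:
        domain = queue.popleft()
        for ip in dns_log.get(domain, set()):        # forward parsing
            if ip in candidate_ips:
                continue
            candidate_ips.add(ip)
            for other in ip_to_domains.get(ip, set()):  # reverse analysis
                if other not in seen_domains:
                    seen_domains.add(other)
                    queue.append(other)
    return candidate_ips, seen_domains


# Hypothetical DNS log consistent with the forward lookups in the example.
dns_log = {
    "a.b.com": {"1.1.1.1", "1.1.1.2"},
    "c.b.com": {"1.1.1.2", "1.1.1.3"},
}
ips, domains = expand_candidates("a.b.com", dns_log)
print(sorted(ips), sorted(domains))
```

  • Tracking the domain names and IP addresses already seen is what guarantees the loop terminates, since each one is processed at most once.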
  • Figure 3 is a schematic diagram of a process for determining candidate IP addresses provided by an embodiment of the present application. The process includes the following steps:
  • Figure 3 is a schematic diagram of the process of determining each candidate IP address corresponding to a certain domain name in the blacklist.
  • S306 Determine each IP address as a candidate IP address, and obtain each new candidate IP address.
  • S308 Determine whether there is a new domain name for each domain name. If so, execute S303. If not, execute S305.
  • the method further includes:
  • For each candidate IP address, determine the number of domain names corresponding to the candidate IP address as the first number, take each domain name corresponding to the candidate IP address as a target domain name, determine each IP address corresponding to the target domain name, and count, as the second number, how many of those IP addresses are present in the blacklist; according to the ratio of the second number to the first number, determine the confidence that the candidate IP address is a malicious IP address;
  • the step of inputting the feature type into the recognition model and obtaining whether the candidate IP address output by the recognition model is a malicious IP address includes:
  • the confidence corresponding to the candidate IP address and the feature type are input into the recognition model, and the recognition model's output indicating whether the candidate IP address is a malicious IP address is obtained.
  • the electronic device determines the confidence corresponding to the candidate IP address according to the number of domain names corresponding to the candidate IP address, and inputs the confidence together with the feature type corresponding to the candidate IP address into the recognition model, thereby further improving the accuracy of the recognition model in determining whether the IP address is a malicious IP address.
  • For each candidate IP address, the electronic device determines each domain name corresponding to the candidate IP address based on the correspondence between the domain name and the IP address in the DNS log, and determines the number of those domain names as the first number.
  • the electronic device also takes each of those domain names as a target domain name, determines each IP address corresponding to each target domain name according to the correspondence between the domain name and the IP address in the DNS log, and determines the number of those IP addresses that are in the blacklist as the second number. After determining the first number and the second number, the electronic device can obtain the ratio of the second number to the first number and determine the ratio as the confidence level corresponding to the candidate IP address.
  • the product of the ratio and the preset value may also be used to determine the confidence level corresponding to the candidate IP address.
  • the greater the second number, the more of the IP addresses corresponding to the target domain names of the candidate IP address are in the blacklist, and the more likely it is that the candidate IP address is a malicious IP address.
  • the formula by which the electronic device determines the confidence corresponding to a certain candidate IP address is:
  • $$\mathrm{Score} = 100 \times \frac{C_{evil}}{C_{total}}$$
  • where Score is the confidence corresponding to the candidate IP address, 100 is the preset value, $C_{evil}$ is the second number, and $C_{total}$ is the first number.
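  • A minimal sketch of this confidence calculation follows. Two points are assumptions rather than statements from the application: the second number is counted over distinct IP addresses, and an IP address with no corresponding domain name is given a confidence of 0 to avoid dividing by zero. The function name `confidence_score` and the sample DNS log are also hypothetical.

```python
from typing import Dict, Set


def confidence_score(
    candidate_ip: str,
    dns_log: Dict[str, Set[str]],
    blacklisted_ips: Set[str],
    preset_value: float = 100.0,
) -> float:
    """Score = preset_value * C_evil / C_total for one candidate IP address."""
    # Target domain names: every domain name corresponding to the candidate IP.
    target_domains = [d for d, ips in dns_log.items() if candidate_ip in ips]
    c_total = len(target_domains)  # the first number
    if c_total == 0:
        return 0.0  # assumption: no domain names means no evidence

    # All IP addresses that the target domain names correspond to.
    related_ips: Set[str] = set()
    for domain in target_domains:
        related_ips |= dns_log[domain]

    # The second number: distinct related IP addresses found in the blacklist.
    c_evil = len(related_ips & blacklisted_ips)
    return preset_value * c_evil / c_total


# Hypothetical DNS log and blacklist.
dns_log = {
    "a.b.com": {"1.1.1.1", "1.1.1.2"},
    "c.b.com": {"1.1.1.2", "1.1.1.3"},
}
blacklist = {"1.1.1.1"}
score = confidence_score("1.1.1.2", dns_log, blacklist)
print(score)
```

  • Here 1.1.1.2 corresponds to two domain names and one of the related IP addresses is blacklisted, so the score is 100 × 1 / 2.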
  • For each candidate IP address, after obtaining the confidence that the candidate IP address is a malicious IP address and the feature type corresponding to the candidate IP address, the electronic device inputs the confidence and the feature type into the pre-trained recognition model and obtains the output of the recognition model, which indicates whether the candidate IP address is a malicious IP address.
  • When the electronic device determines whether the candidate IP address is a malicious IP address through the pre-trained recognition model, the determination is made based on both the confidence and the feature type corresponding to the candidate IP address, which increases the diversity of the input information and further improves the accuracy of model identification.
  • a sample set is pre-stored in the electronic device. Multiple IP addresses are stored in the sample set, and each IP address is labeled as malicious or not.
  • For each IP address in the sample set, the electronic device obtains the preset type of sub-data corresponding to the IP address within the preset time period and uses the Embedded method to extract the feature type corresponding to the abnormal sub-data in that sub-data. It also determines the third number of domain names corresponding to the IP address, determines each domain name corresponding to the IP address and each IP address corresponding to each of those domain names, and counts the fourth number of those IP addresses that are present in the blacklist.
  • the electronic device determines the confidence that the IP address is a malicious IP address based on the ratio of the fourth number to the third number, inputs the confidence and the feature type into the original recognition model, and obtains the model's output indicating whether the IP address is malicious.
  • the original recognition model is trained based on its output and on whether the IP address was labeled in advance as a malicious IP address.
  • the recognition model is trained in the above manner; when a preset condition is met, the trained recognition model is obtained.
  • the preset condition may be that, after the feature types and confidence levels corresponding to the IP addresses in the sample set are used to train the original recognition model, the number of outputs that are consistent with the labels (malicious or not) is greater than a set number; it may also be that the number of iterations for training the original recognition model reaches a set maximum number of iterations, and so on.
  • the embodiments of this application do not limit this.
  • Figure 4 is a schematic diagram of a process of training an original recognition model provided by an embodiment of the present application.
  • a sample set is saved in which each IP address is stored together with a label indicating whether it is a malicious IP address.
  • For each IP address in the sample set, the electronic device determines the corresponding feature type and confidence level, inputs them into the original recognition model, and obtains the output of the original recognition model. It then trains the original recognition model according to that output and the saved label indicating whether the IP address is malicious.
  • FIG. 5 is a detailed schematic diagram of determining a malicious domain name provided by an embodiment of the present application. The process includes the following steps:
  • Figure 5 takes as an example that the confidence level corresponding to the candidate IP address is determined first, and then the feature type corresponding to the IP address is determined.
  • S501 Determine each candidate IP address based on the IP addresses in the pre-saved blacklist and the domain names in the blacklist.
  • S502 For each candidate IP address, determine the number of domain names corresponding to the candidate IP address as the first number, obtain the domain names corresponding to the candidate IP address as target domain names, determine each IP address corresponding to the target domain names, and count the second number of those IP addresses that are present in the blacklist.
  • S503 For each candidate IP address, determine the ratio of the second number to the first number as the confidence level corresponding to the candidate IP address.
  • S504 Obtain the preset type of subdata corresponding to each candidate IP address within the preset time period.
  • S505 Use the Embedded method to extract the feature type corresponding to the abnormal sub-data in the sub-data of the preset feature type corresponding to each candidate IP address.
  • S506 For each candidate IP address, input the feature type corresponding to the candidate IP address into the pre-trained recognition model, and obtain whether the candidate IP address output by the recognition model is a malicious IP address.
  • S507 Based on the correspondence between domain names and IP addresses in the DNS log, determine each malicious domain name corresponding to each malicious IP address.
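Step S507 above amounts to a reverse lookup over the DNS log. A minimal sketch, assuming the log has already been reduced to (domain, IP) pairs; the function name `malicious_domains` and its parameters are hypothetical and do not come from the application:

```python
def malicious_domains(malicious_ips, dns_pairs):
    """Return every domain that the DNS log maps to at least one malicious IP."""
    malicious_ips = set(malicious_ips)
    # A domain is flagged as soon as any of its logged IPs is malicious.
    return {domain for domain, ip in dns_pairs if ip in malicious_ips}
```

In this sketch a domain is flagged if any one of its logged IP addresses was judged malicious, which is one reading of "each domain name corresponding to each malicious IP address".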
  • Figure 6 is a schematic structural diagram of a malicious domain name determination device provided by an embodiment of the present application.
  • the device includes:
  • the determination module 601 is used to determine each candidate IP address based on the IP addresses in the pre-saved blacklist and the domain names in the blacklist;
  • the processing module 602 is configured to obtain, for each candidate IP address, the preset type of sub-data corresponding to the network flow (Netflow) data of the candidate IP address within a preset time period, extract through the Embedded method the feature type corresponding to the abnormal sub-data in the sub-data of the preset feature type, input the feature type into the pre-trained recognition model, and obtain the recognition model's output indicating whether the candidate IP address is a malicious IP address;
  • the determination module 601 is also configured to determine each domain name corresponding to each determined malicious IP address as a malicious domain name based on the pre-saved correspondence between the domain name and the IP address.
  • the determination module 601 is specifically configured to determine each candidate domain name corresponding to the IP addresses in the pre-saved blacklist based on the correspondence between domain names and IP addresses in the pre-saved DNS log, and to determine each candidate IP address corresponding to each candidate domain name.
  • the determination module 601 is specifically configured to determine each candidate IP address corresponding to the domain name in the blacklist based on the correspondence between the domain name and the IP address in the pre-saved DNS log.
  • the determination module 601 is also configured to cyclically perform the following steps for each candidate IP address: determine each domain name corresponding to each candidate IP address; determine the IP address corresponding to each such domain name as a candidate IP address; until the IP address corresponding to every obtained domain name is already a candidate IP address, or the domain names corresponding to every candidate IP address have been obtained.
  • the processing module 602 is further configured to, for each candidate IP address, determine the number of domain names corresponding to the candidate IP address as the first number, obtain the domain names corresponding to the candidate IP address as target domain names, determine each IP address corresponding to the target domain names, and count the second number of those IP addresses that are present in the blacklist; and to determine, according to the ratio of the second number to the first number, the confidence that the candidate IP address is a malicious IP address;
  • the processing module 602 is specifically configured to input the confidence level corresponding to the candidate IP address and the feature type into the recognition model, and obtain whether the candidate IP address output by the recognition model is a malicious IP address.
  • Figure 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in Figure 7, the electronic device includes a processor 701, a communication interface 702, a memory 703 and a communication bus 704, where the processor 701, the communication interface 702, and the memory 703 communicate with each other through the communication bus 704.
  • the memory 703 stores a computer program. When the program is executed by the processor 701, the processor 701 performs the following steps:
  • Determine each candidate IP address based on the IP addresses in the pre-saved blacklist and the domain names in the blacklist;
  • For each candidate IP address, obtain the preset type of sub-data corresponding to the Netflow data of the candidate IP address within the preset time period, extract through the Embedded method the feature type corresponding to the abnormal sub-data in the sub-data of the preset feature type, input the feature type into the pre-trained recognition model, and obtain the recognition model's output indicating whether the candidate IP address is a malicious IP address;
  • According to the pre-saved correspondence between domain names and IP addresses, determine each domain name corresponding to each determined malicious IP address as a malicious domain name.
  • the processor 701 is specifically configured to determine each candidate domain name corresponding to the IP addresses in the pre-saved blacklist based on the correspondence between domain names and IP addresses in the pre-saved DNS log, and to determine each candidate IP address corresponding to each candidate domain name.
  • the processor 701 is specifically configured to determine each candidate IP address corresponding to the domain name in the blacklist according to the corresponding relationship between the domain name and the IP address in the pre-saved DNS log.
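  • the processor 701 is also configured to cyclically perform the following steps for each candidate IP address: determine each domain name corresponding to each candidate IP address; determine the IP address corresponding to each such domain name as a candidate IP address; until the IP address corresponding to every obtained domain name is already a candidate IP address, or the domain names corresponding to every candidate IP address have been obtained.
  • the processor 701 is also configured to, for each candidate IP address, determine the number of domain names corresponding to the candidate IP address as the first number, obtain the domain names corresponding to the candidate IP address as target domain names, determine each IP address corresponding to the target domain names, and count the second number of those IP addresses that are present in the blacklist; and to determine, according to the ratio of the second number to the first number, the confidence that the candidate IP address is a malicious IP address;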
  • the processor 701 is specifically configured to input the confidence corresponding to the candidate IP address and the feature type into the recognition model, and obtain whether the candidate IP address output by the recognition model is a malicious IP address.
  • the communication bus mentioned in the above-mentioned server can be the Peripheral Component Interconnect (PCI) bus or the Extended Industry Standard Architecture (EISA) bus, etc.
  • the communication bus can be divided into address bus, data bus, control bus, etc. For ease of presentation, only one thick line is used in the figure, but it does not mean that there is only one bus or one type of bus.
  • the communication interface 702 is used for communication between the above-mentioned electronic device and other devices.
  • the memory may include random access memory (Random Access Memory, RAM) or non-volatile memory (Non-Volatile Memory, NVM), such as at least one disk memory.
  • the memory may also be at least one storage device located remotely from the aforementioned processor.
  • the above-mentioned processor can be a general-purpose processor, including a central processing unit, a network processor (Network Processor, NP), etc.; it can also be a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit, a field programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • embodiments of the present application also provide a computer-readable storage medium.
  • the computer-readable storage medium stores a computer program that can be executed by an electronic device. When the program runs on the electronic device, the electronic device performs the following steps:
  • a computer program is stored in the memory, and when the program is executed by the processor, the processor performs the following steps:
  • Determine each candidate IP address based on the IP addresses in the pre-saved blacklist and the domain names in the blacklist;
  • For each candidate IP address, obtain the preset type of sub-data corresponding to the Netflow data of the candidate IP address within the preset time period, extract through the Embedded method the feature type corresponding to the abnormal sub-data in the sub-data of the preset feature type, input the feature type into the pre-trained recognition model, and obtain the recognition model's output indicating whether the candidate IP address is a malicious IP address;
  • According to the pre-saved correspondence between domain names and IP addresses, determine each domain name corresponding to each determined malicious IP address as a malicious domain name.
  • determining each candidate IP address based on the IP addresses in a pre-saved blacklist includes:
  • according to the correspondence between domain names and IP addresses in the pre-saved DNS log, each candidate domain name corresponding to the IP addresses in the pre-saved blacklist is determined, and each candidate IP address corresponding to each candidate domain name is determined.
  • determining each candidate IP address based on domain names in a pre-saved blacklist includes:
  • according to the correspondence between domain names and IP addresses in the pre-saved DNS log, each candidate IP address corresponding to the domain names in the blacklist is determined.
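  • the method further includes cyclically performing the following steps for each candidate IP address: determining each domain name corresponding to each candidate IP address; determining the IP address corresponding to each such domain name as a candidate IP address; until the IP address corresponding to every obtained domain name is already a candidate IP address, or the domain names corresponding to every candidate IP address have been obtained.
  • after determining each candidate IP address, and before inputting the feature type into the recognition model to obtain the model's output indicating whether the candidate IP address is a malicious IP address, the method also includes:
  • For each candidate IP address, determine the number of domain names corresponding to the candidate IP address as the first number, obtain the domain names corresponding to the candidate IP address as target domain names, determine each IP address corresponding to the target domain names, and count the second number of those IP addresses that are present in the blacklist; determine, according to the ratio of the second number to the first number, the confidence that the candidate IP address is a malicious IP address;
  • the step of inputting the feature type into the recognition model and obtaining whether the candidate IP address output by the recognition model is a malicious IP address includes:
  • the confidence corresponding to the candidate IP address and the feature type are input into the recognition model, and the recognition model's output indicating whether the candidate IP address is a malicious IP address is obtained.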
  • embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory that causes a computer or other programmable data processing apparatus to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in a process or processes of the flowchart and/or a block or blocks of the block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operating steps to be performed on the computer or other programmable device to produce computer-implemented processing, so that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in a process or processes of the flowchart and/or a block or blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

A method and apparatus for determining a malicious domain name, a device, and a medium, used to improve the accuracy of determining a malicious domain name. An electronic device determines each candidate IP address according to the IP addresses and the domain names in a pre-stored blacklist. The electronic device also acquires sub-data of a preset type corresponding to the Netflow data of a candidate IP address within a preset period of time, extracts, by means of an Embedded method, a feature type corresponding to abnormal sub-data in that sub-data, and inputs the feature type into a recognition model to obtain the model's output indicating whether the candidate IP address is a malicious IP address. Mistaken recognition of malicious IP addresses can thus be avoided. According to a pre-stored correspondence between domain names and IP addresses, each domain name corresponding to each malicious IP address is determined to be a malicious domain name, thereby improving the accuracy of determining malicious domain names.

Description

A method, apparatus, device, and medium for determining malicious domain names
Cross-reference to related applications
This application claims priority to Chinese patent application No. 202210978392.3, filed with the China Patent Office on August 16, 2022, and entitled "A method, device, equipment and medium for determining malicious domain names", the entire content of which is incorporated herein by reference.
Technical field
This application relates to the field of data security technology, and in particular to a method, apparatus, device, and medium for determining malicious domain names.
Background
With the development of society, malicious domain names appear more and more frequently. A malicious domain name refers to a type of website address with malicious links. Such sites usually exploit vulnerabilities in application software or browsers to implant malicious code such as Trojans and virus programs, and use disguised website content to lure users into visiting. If users visit these websites, their computers may be infected by the malicious code, which in turn causes security problems.
By attack method, the web pages linked by malicious domain names fall into two categories: phishing websites and malware websites. A phishing website pretends to be the website of a legitimate institution such as a bank or online store and attempts to trick users into entering user names, passwords, or other private information; such websites threaten personal privacy and property security. A malware website contains malicious code; by installing malware on the user's computer, hackers can use that software to obtain and transmit the user's private or sensitive information.
At present, only a few well-known malicious Internet Protocol (IP) addresses and well-known malicious domain names, at home and abroad, are known. In the related art, malicious domain names are determined only by matching against these well-known malicious IP addresses and domain names, and the malicious domain names determined in this way are not accurate.
Summary of the invention
Embodiments of the present application provide a method, apparatus, device, and medium for determining malicious domain names, to improve the accuracy of malicious domain name determination.
In a first aspect, embodiments of the present application provide a method for determining malicious domain names. The method includes:
determining each candidate IP address based on the IP addresses in a pre-saved blacklist and the domain names in the blacklist;
for each candidate IP address, obtaining the sub-data of a preset type corresponding to the network flow (Netflow) data of the candidate IP address within a preset time period, extracting through the Embedded method the feature type corresponding to the abnormal sub-data in the sub-data of the preset feature type, inputting the feature type into a pre-trained recognition model, and obtaining the recognition model's output indicating whether the candidate IP address is a malicious IP address;
according to the pre-saved correspondence between domain names and IP addresses, determining each domain name corresponding to each determined malicious IP address as a malicious domain name.
Further, determining each candidate IP address based on the IP addresses in the pre-saved blacklist includes:
according to the correspondence between domain names and IP addresses in pre-saved Domain Name System (DNS) logs, determining each candidate domain name corresponding to the IP addresses in the pre-saved blacklist, and determining each candidate IP address corresponding to each candidate domain name.
Further, determining each candidate IP address based on the domain names in the pre-saved blacklist includes:
according to the correspondence between domain names and IP addresses in the pre-saved DNS logs, determining each candidate IP address corresponding to the domain names in the blacklist.
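The two lookups described above (blacklisted IP addresses to candidate domain names to candidate IP addresses, and blacklisted domain names to candidate IP addresses) could be sketched as follows. This is only an illustrative sketch: it assumes the DNS log is available as (domain, IP) pairs, and the names `initial_candidates`, `blacklist_ips`, and `blacklist_domains` are hypothetical.

```python
def initial_candidates(blacklist_ips, blacklist_domains, dns_pairs):
    """Derive the initial candidate IP set from a blacklist and a DNS log."""
    blacklist_ips = set(blacklist_ips)

    # Candidate domains: every domain the log maps to a blacklisted IP.
    candidate_domains = {d for d, ip in dns_pairs if ip in blacklist_ips}

    # Domains to resolve: candidate domains plus the blacklisted domains.
    lookup = candidate_domains | set(blacklist_domains)

    # Candidate IPs: every IP the log maps to one of those domains.
    return {ip for d, ip in dns_pairs if d in lookup}
```

Note that in this sketch a blacklisted IP address that never appears in the DNS log contributes no candidates, since there is no domain through which to expand it.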
Further, the method also includes:
cyclically performing the following steps for each candidate IP address:
determining each domain name corresponding to each candidate IP address;
determining the IP address corresponding to each such domain name as a candidate IP address;
until the IP address corresponding to every obtained domain name is already a candidate IP address, or the domain names corresponding to every candidate IP address have been obtained.
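The loop above behaves like a closure (fixed-point) computation over the domain/IP correspondence: it stops once no new domain name or IP address can be reached. A minimal sketch under that interpretation, assuming (domain, IP) pairs taken from a DNS log; `expand_candidates` and its parameters are hypothetical names:

```python
from collections import defaultdict


def expand_candidates(seed_ips, dns_pairs):
    """Grow the candidate IP set until no new domain or IP is discovered."""
    ip_to_domains = defaultdict(set)
    domain_to_ips = defaultdict(set)
    for domain, ip in dns_pairs:
        ip_to_domains[ip].add(domain)
        domain_to_ips[domain].add(ip)

    candidates = set(seed_ips)
    seen_domains = set()
    frontier = list(candidates)  # candidate IPs whose domains are not yet fetched

    while frontier:
        ip = frontier.pop()
        for domain in ip_to_domains.get(ip, ()):
            if domain in seen_domains:
                continue  # this domain's IPs were already added
            seen_domains.add(domain)
            for other_ip in domain_to_ips[domain]:
                if other_ip not in candidates:
                    candidates.add(other_ip)
                    frontier.append(other_ip)

    return candidates, seen_domains
```

The loop terminates because each IP address and each domain name is processed at most once, which corresponds to the two stopping conditions stated above.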
Further, after determining each candidate IP address, and before inputting the feature type into the recognition model to obtain the model's output indicating whether the candidate IP address is a malicious IP address, the method also includes:
for each candidate IP address, determining the number of domain names corresponding to the candidate IP address as the first number, obtaining the domain names corresponding to the candidate IP address as target domain names, determining each IP address corresponding to the target domain names, and counting the second number of those IP addresses that are present in the blacklist;
determining, according to the ratio of the second number to the first number, the confidence that the candidate IP address is a malicious IP address;
the step of inputting the feature type into the recognition model and obtaining whether the candidate IP address output by the recognition model is a malicious IP address includes:
inputting the confidence corresponding to the candidate IP address and the feature type into the recognition model, and obtaining the recognition model's output indicating whether the candidate IP address is a malicious IP address.
In a second aspect, embodiments of the present application also provide an apparatus for determining malicious domain names. The apparatus includes:
a determination module, configured to determine each candidate IP address based on the IP addresses in a pre-saved blacklist and the domain names in the blacklist;
a processing module, configured to obtain, for each candidate IP address, the sub-data of a preset type corresponding to the network flow (Netflow) data of the candidate IP address within a preset time period, extract through the Embedded method the feature type corresponding to the abnormal sub-data in the sub-data of the preset feature type, input the feature type into a pre-trained recognition model, and obtain the recognition model's output indicating whether the candidate IP address is a malicious IP address;
the determination module is also configured to determine each domain name corresponding to each determined malicious IP address as a malicious domain name, based on the pre-saved correspondence between domain names and IP addresses.
Further, the determination module is specifically configured to determine each candidate domain name corresponding to the IP addresses in the pre-saved blacklist based on the correspondence between domain names and IP addresses in the pre-saved DNS log, and to determine each candidate IP address corresponding to each candidate domain name.
Further, the determination module is specifically configured to determine each candidate IP address corresponding to the domain names in the blacklist based on the correspondence between domain names and IP addresses in the pre-saved DNS log.
Further, the determination module is also configured to cyclically perform the following steps for each candidate IP address: determine each domain name corresponding to each candidate IP address; determine the IP address corresponding to each such domain name as a candidate IP address; until the IP address corresponding to every obtained domain name is already a candidate IP address, or the domain names corresponding to every candidate IP address have been obtained.
Further, the processing module is also configured to, for each candidate IP address, determine the number of domain names corresponding to the candidate IP address as the first number, obtain the domain names corresponding to the candidate IP address as target domain names, determine each IP address corresponding to the target domain names, and count the second number of those IP addresses that are present in the blacklist; and to determine, according to the ratio of the second number to the first number, the confidence that the candidate IP address is a malicious IP address;
the processing module is specifically configured to input the confidence corresponding to the candidate IP address and the feature type into the recognition model, and obtain the recognition model's output indicating whether the candidate IP address is a malicious IP address.
In a third aspect, embodiments of the present application also provide an electronic device. The electronic device includes at least a processor and a memory, and the processor performs the steps of any one of the above methods for determining malicious domain names when executing a computer program stored in the memory.
In a fourth aspect, embodiments of the present application also provide a computer-readable storage medium that stores a computer program. When executed by a processor, the computer program performs the steps of any one of the above methods for determining malicious domain names.
In the embodiments of the present application, the electronic device determines each candidate IP address based on the IP addresses in the pre-saved blacklist and the domain names in the blacklist. After determining the candidate IP addresses, the electronic device obtains, for each of them, the sub-data of a preset type corresponding to the Netflow data of the candidate IP address within a preset time period, extracts through the Embedded method the feature type corresponding to the abnormal sub-data in that sub-data, inputs the feature type into the pre-trained recognition model, and obtains the recognition model's output indicating whether the candidate IP address is a malicious IP address. After obtaining each malicious IP address, the electronic device determines each domain name corresponding to each determined malicious IP address as a malicious domain name, based on the pre-saved correspondence between domain names and IP addresses.
Because the electronic device does not rely on blacklist matching alone, but derives candidate IP addresses from the blacklisted IP addresses and domain names, extracts through the Embedded method the feature type corresponding to the abnormal sub-data in the preset type of sub-data of each candidate's Netflow data, and inputs that feature type into the pre-trained recognition model to decide whether the candidate IP address is malicious, misidentification of malicious IP addresses is avoided. Each domain name corresponding to each malicious IP address is then determined to be a malicious domain name according to the pre-saved correspondence between domain names and IP addresses, which improves the accuracy of malicious domain name determination.
Description of drawings
Figure 1 is a schematic diagram of a malicious domain name determination process provided by an embodiment of the present application;
Figure 2 is a schematic diagram of IP addresses in a blacklist provided by an embodiment of the present application;
Figure 3 is a schematic diagram of a process for determining candidate IP addresses provided by an embodiment of the present application;
Figure 4 is a schematic diagram of a process of training an original recognition model provided by an embodiment of the present application;
Figure 5 is a detailed schematic diagram of determining a malicious domain name provided by an embodiment of the present application;
Figure 6 is a schematic structural diagram of a malicious domain name determination apparatus provided by an embodiment of the present application;
Figure 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
Detailed description
To make the above objects, features, and advantages of the embodiments of the present application clearer and easier to understand, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
In this embodiment of the present application, the electronic device determines each candidate IP address based on the IP addresses and the domain names in a pre-saved blacklist. For each candidate IP address so determined, the electronic device obtains the sub-data of preset types corresponding to the Netflow data of that candidate IP address within a preset time period, uses the Embedded method to extract the feature type corresponding to the abnormal sub-data among that sub-data, inputs the feature type into a pre-trained recognition model, and obtains the recognition model's output indicating whether the candidate IP address is a malicious IP address. After obtaining each malicious IP address, the electronic device determines each domain name corresponding to each malicious IP address as a malicious domain name, based on a pre-saved correspondence between domain names and IP addresses.
To determine malicious domain names accurately, embodiments of the present application provide a method, apparatus, device, and medium for determining malicious domain names.
Embodiment 1:
Figure 1 is a schematic diagram of a malicious domain name determination process provided by an embodiment of the present application. The process includes the following steps:
S101: Determine each candidate IP address based on the IP addresses and the domain names in a pre-saved blacklist.
The method for determining malicious domain names provided in the embodiments of this application is applied to an electronic device, which may be a PC, a server, or a similar device.
In this embodiment of the present application, to determine malicious domain names, a blacklist is pre-saved in the electronic device. The blacklist includes IP addresses and domain names, namely malicious IP addresses and malicious domain names that are well known at home and abroad and relatively stable. Based on the IP addresses and the domain names in the blacklist, the electronic device can determine each candidate IP address, that is, each IP address that may be malicious. The number of IP addresses in the blacklist is not fixed: it may be one or more than one. The number of domain names in the blacklist is likewise not fixed. In this embodiment, the IP addresses and domain names in the blacklist may be referred to as precise threat intelligence seed data.
In this embodiment of the present application, the electronic device may locally store a correspondence between IP addresses and domain names, and may use it to determine each candidate IP address corresponding to the IP addresses and domain names in the blacklist. Note that an IP address may correspond to multiple domain names or to none, and a domain name may correspond to multiple IP addresses or to none. Specifically, the electronic device may determine each IP address corresponding to a domain name in the blacklist as a candidate IP address. It may also determine each domain name corresponding to an IP address in the blacklist, and determine the IP addresses corresponding to each of those domain names as candidate IP addresses.
For example, if the domain names in the blacklist include a.b.com, and the IP addresses corresponding to a.b.com are 1.1.1.2 and 1.1.1.3, then 1.1.1.2 and 1.1.1.3 are determined as candidate IP addresses.
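Step S101 can be illustrated with a minimal Python sketch. The representation of the stored correspondence as a list of (domain, IP) pairs and the function names `build_indexes` and `candidate_ips` are assumptions made for illustration; the application does not prescribe a data structure.

```python
from collections import defaultdict

def build_indexes(dns_pairs):
    """Build domain->IPs and IP->domains lookups from (domain, ip) pairs."""
    domain_to_ips = defaultdict(set)
    ip_to_domains = defaultdict(set)
    for domain, ip in dns_pairs:
        domain_to_ips[domain].add(ip)
        ip_to_domains[ip].add(domain)
    return domain_to_ips, ip_to_domains

def candidate_ips(blacklist_ips, blacklist_domains, dns_pairs):
    """Return the candidate IP addresses derived from the blacklist (S101)."""
    domain_to_ips, ip_to_domains = build_indexes(dns_pairs)
    candidates = set()
    # A blacklisted domain contributes every IP address it corresponds to.
    for domain in blacklist_domains:
        candidates |= domain_to_ips.get(domain, set())
    # A blacklisted IP contributes the IPs of every domain it corresponds to.
    for ip in blacklist_ips:
        for domain in ip_to_domains.get(ip, set()):
            candidates |= domain_to_ips.get(domain, set())
    return candidates

# The example from the text: a.b.com is blacklisted and maps to two IPs.
pairs = [("a.b.com", "1.1.1.2"), ("a.b.com", "1.1.1.3")]
print(sorted(candidate_ips(set(), {"a.b.com"}, pairs)))  # ['1.1.1.2', '1.1.1.3']
```

A blacklist entry with no match in the stored correspondence simply contributes nothing, which reflects the remark that an IP address or domain name may have no counterpart.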
S102: For each candidate IP address, obtain the sub-data of preset types corresponding to the Netflow data of the candidate IP address within a preset time period, extract, through the Embedded method, the feature type corresponding to the abnormal sub-data among the sub-data of the preset types, input the feature type into a pre-trained recognition model, and obtain the recognition model's output indicating whether the candidate IP address is a malicious IP address.
After obtaining each candidate IP address, the electronic device determines whether each candidate IP address is a malicious IP address. Specifically, for each obtained candidate IP address, the electronic device obtains the Netflow data of the candidate IP address within a preset time period and obtains the sub-data of preset types corresponding to that Netflow data. The preset types of sub-data may be one or more of: the ratio of upstream to downstream traffic packets, commonly used ports, the peer communication port range, and the number of bytes in upstream data packets. How to obtain the Netflow data of an IP address within a preset time period is prior art and is not described further here.
The Netflow data includes different feature types and the sub-data corresponding to each feature type. In this embodiment of the present application, the electronic device obtains from the Netflow data the sub-data whose type is a preset type. For example, if the preset type is the ratio of upstream to downstream traffic packets, the electronic device obtains from the Netflow data the sub-data of that type, which may be a specific ratio value. If the preset type is commonly used ports, the electronic device obtains the sub-data of that type, which is the specific ports. If the preset type is the peer communication port range, the electronic device obtains the sub-data of that type, which is the specific port range.
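As a rough illustration of the four preset types of sub-data, the sketch below derives them from a list of flow records. The record layout (dictionaries with `src_ip`, `dst_ip`, `src_port`, `dst_port`, `packets`, `bytes`), the function name `preset_sub_data`, and the reading of "upstream" as traffic sent by the candidate IP address are all assumptions; the application treats Netflow collection as prior art and does not define these details.

```python
from collections import Counter

def preset_sub_data(candidate_ip, flows, top_n=3):
    """Summarise hypothetical flow records into the four preset sub-data types."""
    up_packets = down_packets = up_bytes = 0
    local_ports = Counter()   # ports used on the candidate's side
    peer_ports = []           # ports used on the remote side
    for flow in flows:
        if flow["src_ip"] == candidate_ip:      # assumed "upstream": candidate sends
            up_packets += flow["packets"]
            up_bytes += flow["bytes"]
            local_ports[flow["src_port"]] += 1
            peer_ports.append(flow["dst_port"])
        elif flow["dst_ip"] == candidate_ip:    # assumed "downstream": candidate receives
            down_packets += flow["packets"]
            local_ports[flow["dst_port"]] += 1
            peer_ports.append(flow["src_port"])
    return {
        "up_down_packet_ratio": up_packets / down_packets if down_packets else None,
        "common_ports": [port for port, _ in local_ports.most_common(top_n)],
        "peer_port_range": (min(peer_ports), max(peer_ports)) if peer_ports else None,
        "upstream_bytes": up_bytes,
    }

flows = [
    {"src_ip": "1.1.1.2", "dst_ip": "10.0.0.5", "src_port": 443,
     "dst_port": 50000, "packets": 30, "bytes": 4000},
    {"src_ip": "10.0.0.5", "dst_ip": "1.1.1.2", "src_port": 50000,
     "dst_port": 443, "packets": 10, "bytes": 900},
    {"src_ip": "1.1.1.2", "dst_ip": "10.0.0.6", "src_port": 443,
     "dst_port": 51000, "packets": 20, "bytes": 2500},
]
result = preset_sub_data("1.1.1.2", flows)
print(result)
```

A production system would read these values from its Netflow collector rather than recompute them, so the sketch only shows what each sub-data value might look like.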
After obtaining the sub-data of preset types corresponding to the Netflow data of each candidate IP address within the preset time period, the electronic device may, for each candidate IP address, use the Embedded method to extract the feature type corresponding to the abnormal sub-data among the sub-data of preset types for that candidate IP address. The extracted feature type may be one or more of: the ratio of upstream to downstream traffic packets, commonly used ports, the peer communication port range, and the number of bytes in upstream data packets. For example, the extracted feature type may be the ratio of upstream to downstream traffic packets. Extracting, through the Embedded method, the feature type corresponding to abnormal sub-data from several types of sub-data is prior art and is not described further here.
In this embodiment of the present application, to further determine whether a candidate IP address is a malicious IP address, a pre-trained recognition model is pre-saved in the electronic device. For each candidate IP address, after determining the feature type corresponding to its abnormal sub-data, the electronic device inputs that feature type into the recognition model and obtains the model's output, which indicates whether the candidate IP address is a malicious IP address. In this way the electronic device can identify the malicious IP addresses among the candidate IP addresses.
For example, if the obtained candidate IP addresses are 1.1.1.1, 1.1.1.2, and 1.1.1.3, the electronic device obtains, for each of them, the sub-data for the ratio of upstream to downstream traffic packets, commonly used ports, the peer communication port range, and the number of bytes in upstream data packets within the preset time period, and uses the Embedded method to extract the feature type corresponding to the abnormal sub-data among the obtained sub-data. For each candidate IP address, the feature type determined for that candidate IP address is input into the pre-trained recognition model, and the model's output indicates whether the candidate IP address is a malicious IP address.
In this embodiment of the present application, obtaining the candidate IP addresses alone is not enough; the sub-data of preset types is needed to support the judgment. The sub-data of preset types corresponding to each candidate IP address is therefore obtained, the Embedded method is used for feature selection to obtain the corresponding feature type, and the pre-trained recognition model then determines whether the candidate IP address is a malicious IP address, which improves the accuracy of malicious IP address identification.
S103: Based on the pre-saved correspondence between domain names and IP addresses, determine each domain name corresponding to each determined malicious IP address as a malicious domain name.
In this embodiment of the present application, the correspondence between domain names and IP addresses is pre-saved in the electronic device. After determining each malicious IP address, the electronic device determines, for each malicious IP address and based on the pre-saved correspondence, each domain name corresponding to that malicious IP address as a malicious domain name. In this way the electronic device determines the malicious domain names corresponding to every malicious IP address.
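Step S103 amounts to a lookup in the saved correspondence. The sketch below assumes the correspondence is available as a dictionary from IP address to a set of domain names; the name `malicious_domains` and the sample data are illustrative only.

```python
def malicious_domains(malicious_ips, ip_to_domains):
    """Collect every domain name corresponding to any malicious IP address."""
    domains = set()
    for ip in malicious_ips:
        # An IP address with no recorded domain name contributes nothing.
        domains |= ip_to_domains.get(ip, set())
    return domains

# Hypothetical saved correspondence, keyed by IP address.
ip_to_domains = {
    "1.1.1.2": {"a.b.com", "c.b.com"},
    "1.1.1.3": {"c.b.com"},
}
print(sorted(malicious_domains({"1.1.1.2", "1.1.1.4"}, ip_to_domains)))
# ['a.b.com', 'c.b.com']
```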
In this embodiment of the present application, the electronic device determines each candidate IP address based on the IP addresses and the domain names in the pre-saved blacklist, obtains the sub-data of preset types corresponding to the Netflow data of each candidate IP address within the preset time period, uses the Embedded method to extract the feature type corresponding to the abnormal sub-data among that sub-data, and inputs the feature type into the pre-trained recognition model to obtain the model's output indicating whether the candidate IP address is a malicious IP address. This avoids misidentification of malicious IP addresses. Each domain name corresponding to each malicious IP address is then determined as a malicious domain name based on the pre-saved correspondence between domain names and IP addresses, which improves the accuracy of malicious domain name determination.
Embodiment 2:
To determine each candidate IP address, on the basis of the above embodiments, in this embodiment of the present application, determining each candidate IP address based on the IP addresses in the pre-saved blacklist includes:
determining, based on the correspondence between domain names and IP addresses in a pre-saved DNS log, each candidate domain name corresponding to the IP addresses in the pre-saved blacklist, and determining the IP addresses corresponding to each candidate domain name as candidate IP addresses.
In practical scenarios, a domain name may correspond to multiple IP addresses, and an IP address may correspond to multiple domain names. If a malicious domain name corresponded to only a single IP address, that IP address might be blocked, making the malicious domain name unreachable, so a malicious domain name usually corresponds to multiple IP addresses. In addition, to keep malicious domain names from being detected, the offending party usually keeps generating new domain names, for example with a domain generation algorithm (DGA), so that multiple domain names corresponding to one IP address are all malicious. In this embodiment of the present application, the electronic device can determine each candidate domain name corresponding to the IP addresses in the blacklist, and determine every IP address corresponding to each candidate domain name as a candidate IP address.
Specifically, a DNS log is pre-saved in the electronic device, and the DNS log stores the correspondence between domain names and IP addresses. Based on this correspondence, the electronic device determines each domain name corresponding to an IP address in the blacklist as a candidate domain name; this step may be called IP DNS log reverse resolution. Then, for each candidate domain name, the electronic device determines, based on the same correspondence, each IP address corresponding to that candidate domain name as a candidate IP address; this step may be called domain name DNS log resolution. Every candidate IP address obtained in this way may be a malicious IP address.
After the candidate IP addresses are obtained, the IP addresses in the blacklist are already known to be malicious and need not be checked again. The electronic device can therefore delete from the candidate IP addresses any IP address that is present in the blacklist.
For example, suppose the blacklist includes the IP address 1.1.1.1, the domain names corresponding to that IP address are a.b.com and c.b.com, the IP addresses corresponding to a.b.com are 1.1.1.1 and 1.1.1.2, and the IP addresses corresponding to c.b.com are 1.1.1.1 and 1.1.1.3. The candidate IP addresses are then 1.1.1.2 and 1.1.1.3.
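The two resolution steps and the removal of blacklisted addresses can be sketched as follows. Modelling the DNS log as (domain, IP) pairs and the function name `candidates_from_blacklisted_ips` are assumptions made for illustration.

```python
def candidates_from_blacklisted_ips(blacklist_ips, dns_log):
    """dns_log is an iterable of (domain, ip) pairs taken from the DNS log."""
    blacklist_ips = set(blacklist_ips)
    dns_log = list(dns_log)
    # IP DNS log reverse resolution: domains that correspond to a blacklisted IP.
    candidate_domains = {domain for domain, ip in dns_log if ip in blacklist_ips}
    # Domain name DNS log resolution: every IP of those candidate domains.
    candidate_ips = {ip for domain, ip in dns_log if domain in candidate_domains}
    # Blacklisted IPs are already known to be malicious, so drop them.
    return candidate_ips - blacklist_ips

dns_log = [
    ("a.b.com", "1.1.1.1"), ("a.b.com", "1.1.1.2"),
    ("c.b.com", "1.1.1.1"), ("c.b.com", "1.1.1.3"),
]
print(sorted(candidates_from_blacklisted_ips({"1.1.1.1"}, dns_log)))
# ['1.1.1.2', '1.1.1.3']
```

The output matches the worked example: 1.1.1.1 is found through both domain names but is removed because it is already in the blacklist.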
Figure 2 is a schematic diagram of IP addresses in a blacklist provided by an embodiment of the present application.
As Figure 2 shows, the blacklist stores IP addresses, and it may store more than one IP address.
In this embodiment of the present application, after determining each malicious IP address, the electronic device can determine each domain name corresponding to each malicious IP address as a malicious domain name, based on the correspondence between domain names and IP addresses in the DNS log.
Embodiment 3:
To determine each candidate IP address accurately, on the basis of the above embodiments, in this embodiment of the present application, determining each candidate IP address based on the domain names in the pre-saved blacklist includes:
determining, based on the correspondence between domain names and IP addresses in the pre-saved DNS log, each candidate IP address corresponding to the domain names in the blacklist.
In this embodiment of the present application, for each domain name in the blacklist, the electronic device can look up the correspondence between domain names and IP addresses in the pre-saved DNS log and determine each IP address corresponding to that domain name as a candidate IP address. In this way the electronic device determines the candidate IP addresses corresponding to every domain name in the blacklist.
The candidate IP addresses corresponding to the domain names in the blacklist may include IP addresses that are themselves in the blacklist. Because the IP addresses in the blacklist are already known to be malicious, checking them again would waste time. Therefore, after obtaining the candidate IP addresses, the electronic device checks each candidate IP address and, if it is identical to an IP address in the blacklist, deletes it from the candidate IP addresses.
Embodiment 4:
To improve the accuracy of malicious IP address determination, on the basis of the above embodiments, in this embodiment of the present application, the method further includes:
performing the following steps in a loop for each candidate IP address:
determining each domain name corresponding to each candidate IP address;
determining the IP addresses corresponding to each of those domain names as candidate IP addresses;
until the IP address corresponding to every obtained domain name is already a candidate IP address, or the domain name corresponding to every candidate IP address has already been obtained.
In practical scenarios, the domain names corresponding to some malicious IP addresses may not themselves be malicious domain names, so the DNS log has to be resolved repeatedly to obtain every candidate IP address. To avoid missing a candidate IP address, and with it a malicious IP address, in this embodiment of the present application the following steps are performed in a loop after the candidate IP addresses are obtained. Based on the correspondence between domain names and IP addresses in the pre-saved DNS log, each domain name corresponding to each candidate IP address is determined, and the IP addresses corresponding to each of those domain names are determined as candidate IP addresses. The electronic device then checks whether the IP address corresponding to every obtained domain name is already a candidate IP address, or whether the domain name corresponding to every candidate IP address has already been obtained. If so, no further candidate IP addresses need to be determined. If any IP address corresponding to an obtained domain name is not yet a candidate IP address, that IP address is determined as a candidate IP address and each domain name corresponding to it is determined. If a domain name corresponding to some candidate IP address has not yet been obtained, each IP address corresponding to that domain name is determined, and the electronic device checks whether any of those IP addresses is not yet a candidate IP address.
Take the blacklisted domain name a.b.com as an example, resolved repeatedly through the DNS log. The first resolution of a.b.com yields the IP addresses 1.1.1.1 and 1.1.1.2. Reverse resolution of these two IP addresses yields the domain names a.b.com and c.b.com. Because c.b.com is a newly added domain name, it is resolved through the DNS log, yielding 1.1.1.2 and 1.1.1.3. Because 1.1.1.3 is a newly added IP address, it is reverse-resolved in turn; this returns no domain name other than a.b.com and c.b.com, so no new domain name appears and the process stops. The candidate IP addresses determined are 1.1.1.1, 1.1.1.2, and 1.1.1.3.
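The repeated resolution described above is a fixed-point expansion over the DNS log. The following sketch implements it with a work queue, again assuming the DNS log is a list of (domain, IP) pairs; the function name `expand_candidates` is hypothetical.

```python
from collections import defaultdict, deque

def expand_candidates(seed_domain, dns_log):
    """Alternate forward and reverse resolution until nothing new appears."""
    domain_to_ips = defaultdict(set)
    ip_to_domains = defaultdict(set)
    for domain, ip in dns_log:
        domain_to_ips[domain].add(ip)
        ip_to_domains[ip].add(domain)

    seen_domains = {seed_domain}
    candidate_ips = set()
    pending = deque([seed_domain])          # domains still to be resolved
    while pending:
        domain = pending.popleft()
        for ip in domain_to_ips.get(domain, ()):
            if ip in candidate_ips:
                continue                    # already a candidate, nothing new
            candidate_ips.add(ip)
            # Reverse-resolve the new IP and queue any domain not seen before.
            for other in ip_to_domains.get(ip, ()):
                if other not in seen_domains:
                    seen_domains.add(other)
                    pending.append(other)
    return candidate_ips, seen_domains

dns_log = [
    ("a.b.com", "1.1.1.1"), ("a.b.com", "1.1.1.2"),
    ("c.b.com", "1.1.1.2"), ("c.b.com", "1.1.1.3"),
]
ips, domains = expand_candidates("a.b.com", dns_log)
print(sorted(ips))      # ['1.1.1.1', '1.1.1.2', '1.1.1.3']
print(sorted(domains))  # ['a.b.com', 'c.b.com']
```

The loop terminates because each domain name and each IP address is processed at most once, which corresponds to the stopping condition that no new domain name or IP address appears.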
Figure 3 is a schematic diagram of a process for determining candidate IP addresses provided by an embodiment of the present application. It shows the process of determining each candidate IP address corresponding to one domain name in the blacklist, and includes the following steps:
S301: Determine each IP address corresponding to the domain name as a candidate IP address.
S302: Determine the domain names corresponding to each candidate IP address.
S303: Determine each IP address corresponding to each of those domain names.
S304: Determine whether every one of these IP addresses is already a candidate IP address. If so, perform S305; if not, perform S306.
S305: End.
S306: Determine each of these IP addresses as a candidate IP address, and obtain each newly added candidate IP address.
S307: Determine each domain name corresponding to each newly added candidate IP address.
S308: Determine whether these domain names include any newly added domain name. If so, perform S303; if not, perform S305.
Embodiment 5:
To improve the accuracy of malicious IP address determination, on the basis of the above embodiments, in this embodiment of the present application, after each candidate IP address is determined and before the feature type is input into the recognition model to obtain the model's output indicating whether the candidate IP address is a malicious IP address, the method further includes:
for each candidate IP address, determining the number of domain names corresponding to the candidate IP address as a first number, taking the domain names corresponding to the candidate IP address as target domain names, determining each IP address corresponding to the target domain names, and counting, as a second number, how many of those IP addresses are present in the blacklist;
determining, based on the ratio of the second number to the first number, the confidence that the candidate IP address is a malicious IP address.
Inputting the feature type into the recognition model and obtaining the model's output indicating whether the candidate IP address is a malicious IP address then includes:
inputting the confidence corresponding to the candidate IP address together with the feature type into the recognition model, and obtaining the model's output indicating whether the candidate IP address is a malicious IP address.
In this embodiment of the present application, if only the feature type corresponding to a candidate IP address were input into the recognition model, the model's output might not be accurate enough. Therefore, for each candidate IP address, the electronic device determines a confidence for the candidate IP address from the number of domain names corresponding to it, and inputs both the confidence and the feature type corresponding to the candidate IP address into the recognition model, which further improves the accuracy with which the model determines whether the IP address is malicious.
After obtaining each candidate IP address, the electronic device determines, for each candidate IP address and based on the correspondence between domain names and IP addresses in the DNS log, each domain name corresponding to the candidate IP address, and takes the number of those domain names as the first number. The electronic device also takes each of those domain names as a target domain name, determines, based on the same correspondence, each IP address corresponding to each target domain name, and takes the number of those IP addresses that are present in the blacklist as the second number. It then computes the ratio of the second number to the first number. The electronic device may use this ratio directly as the confidence corresponding to the candidate IP address, or may use the product of the ratio and a preset value. The larger the second number, the more blacklisted IP addresses correspond to the target domain names of the candidate IP address, the more likely the candidate IP address is malicious, and the higher the corresponding confidence. The confidence therefore helps the recognition model identify more accurately whether an IP address is malicious.
The formula by which the electronic device computes the confidence level of a candidate IP address is:
$$\mathrm{Score} = 100 \times \frac{C_{evil}}{C_{total}}$$
where Score is the confidence level of the candidate IP address, 100 is the preset value, $C_{evil}$ is the second number, and $C_{total}$ is the first number.
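The confidence computation described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function names `build_maps` and `confidence`, the representation of the DNS log as `(domain, ip)` pairs, the choice to count distinct IP addresses, and the return value of 0 when a candidate has no domain names are all assumptions that the source does not specify.

```python
from collections import defaultdict


def build_maps(dns_records):
    """Build ip -> domains and domain -> ips maps from (domain, ip) pairs."""
    ip_to_domains = defaultdict(set)
    domain_to_ips = defaultdict(set)
    for domain, ip in dns_records:
        ip_to_domains[ip].add(domain)
        domain_to_ips[domain].add(ip)
    return ip_to_domains, domain_to_ips


def confidence(candidate_ip, ip_to_domains, domain_to_ips, blacklist_ips, preset=100):
    """Score = preset * C_evil / C_total for one candidate IP address."""
    # First number (C_total): domain names that resolve to the candidate.
    target_domains = ip_to_domains.get(candidate_ip, set())
    c_total = len(target_domains)
    if c_total == 0:
        return 0.0  # assumption: the source does not cover this case

    # Second number (C_evil): IPs of the target domains that are blacklisted.
    # Assumption: each IP address is counted once, even if several
    # target domains resolve to it.
    related_ips = set()
    for domain in target_domains:
        related_ips |= domain_to_ips.get(domain, set())
    c_evil = len(related_ips & set(blacklist_ips))

    return preset * c_evil / c_total
```

Because one domain name can resolve to several blacklisted addresses, the score under these assumptions is not bounded by the preset value.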
For each candidate IP address, after obtaining the confidence level that the candidate IP address is malicious and the feature type corresponding to the candidate IP address, the electronic device inputs the confidence level and the feature type into the pre-trained recognition model and obtains the model's output, which indicates whether the candidate IP address is a malicious IP address.
In this embodiment of the present application, when the electronic device uses the pre-trained recognition model to determine whether a candidate IP address is malicious, it makes the determination from both the confidence level and the feature type corresponding to the candidate IP address. This increases the diversity of the input information and further improves the accuracy of the model's identification.
In this embodiment of the present application, a sample set is stored in advance in the electronic device for training the recognition model. The sample set contains multiple IP addresses, each labeled as malicious or not. For each IP address in the sample set, the electronic device obtains the sub-data of the preset type corresponding to that IP address within a preset time period and, using the Embedding method, extracts the feature type corresponding to the abnormal sub-data among the sub-data of the preset type. It also determines the third number, which is the number of domain names corresponding to the IP address; determines each domain name corresponding to the IP address and each IP address corresponding to each of those domain names; and counts the fourth number, which is the number of those IP addresses present in the blacklist. It determines the confidence level that the IP address is malicious according to the ratio of the fourth number to the third number, inputs the confidence level and the feature type into the original recognition model, and obtains the original recognition model's output on whether the IP address is malicious. The original recognition model is then trained according to that output and the label previously assigned to the IP address.
The recognition model is trained in the above manner, and the trained recognition model is obtained when a preset condition is met. The preset condition may be that the number of IP addresses in the sample set for which the result produced by the original recognition model from the feature type and confidence level agrees with the label is greater than a set number; it may also be that the number of training iterations of the original recognition model reaches a set maximum number of iterations, and so on. The embodiments of the present application do not limit this.
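The two stopping conditions can be illustrated with a small training loop. This is only a sketch under stated assumptions: `train_until_done` and `ThresholdModel` are hypothetical names, and the toy threshold model stands in for the recognition model, whose architecture the source does not disclose.

```python
def train_until_done(train_step, predict, samples, labels, min_matches, max_iters):
    """Train until predictions agree with more than `min_matches` labels,
    or until `max_iters` training rounds have run."""
    matches = 0
    for iteration in range(1, max_iters + 1):
        train_step(samples, labels)
        # Count samples whose predicted result agrees with the label.
        matches = sum(1 for x, y in zip(samples, labels) if predict(x) == y)
        if matches > min_matches:
            return iteration, matches
    return max_iters, matches


class ThresholdModel:
    """Toy stand-in: flags a sample as malicious when its score reaches a threshold."""

    def __init__(self):
        self.threshold = 100.0

    def train_step(self, samples, labels):
        # Hypothetical update rule, chosen only to make the loop observable.
        self.threshold -= 10.0

    def predict(self, score):
        return score >= self.threshold
```

The function returns the number of rounds actually run together with the final agreement count, so a caller can tell which of the two conditions ended training.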
Figure 4 is a schematic diagram of a process for training the original recognition model provided by an embodiment of the present application.
As shown in Figure 4, a sample set is stored for training the original recognition model, in which each IP address is stored together with whether it is a malicious IP address. For each IP address in the sample set, the electronic device determines the corresponding feature type and confidence level, inputs them into the original recognition model, obtains the result output by the original recognition model, and trains the original recognition model according to that result and the stored label indicating whether the IP address is malicious.
Figure 5 is a detailed schematic diagram of determining a malicious domain name provided by an embodiment of the present application. Figure 5 takes as an example the case in which the confidence level corresponding to each candidate IP address is determined first and the feature type corresponding to the IP address is determined afterwards. The process includes the following steps:
S501: Determine each candidate IP address according to the IP addresses in the pre-saved blacklist and the domain names in the blacklist.
S502: For each candidate IP address, determine the number of domain names corresponding to the candidate IP address as the first number, take the domain names corresponding to the candidate IP address as target domain names, determine each IP address corresponding to the target domain names, and count the second number, which is the number of those IP addresses present in the blacklist.
S503: For each candidate IP address, determine the ratio of the second number to the first number corresponding to the candidate IP address as the confidence level corresponding to the candidate IP address.
S504: Obtain the sub-data of the preset type corresponding to each candidate IP address within the preset time period.
S505: Using the Embedded method, extract the feature type corresponding to the abnormal sub-data among the sub-data of the preset feature type corresponding to each candidate IP address.
S506: For each candidate IP address, input the feature type corresponding to the candidate IP address into the pre-trained recognition model, and obtain the recognition model's output on whether the candidate IP address is a malicious IP address.
S507: Determine each malicious domain name corresponding to each malicious IP address according to the correspondence between domain names and IP addresses in the DNS log.
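The last two steps, classifying each candidate IP address and then mapping the malicious ones back to domain names through the DNS log, can be sketched as below. The function name `malicious_domains`, the `(domain, ip)` record format, and the `is_malicious` callable that stands in for the trained recognition model are assumptions made for illustration.

```python
def malicious_domains(candidate_ips, is_malicious, dns_records):
    """Return every domain that resolves to an IP the model flags as malicious.

    `is_malicious` is a placeholder for the trained recognition model;
    it takes an IP address and returns True or False.
    """
    # Step S506: keep the candidates the model flags.
    malicious_ips = {ip for ip in candidate_ips if is_malicious(ip)}
    # Step S507: look up the domains of those IPs in the DNS log.
    return {domain for domain, ip in dns_records if ip in malicious_ips}
```

In a fuller pipeline the callable would wrap feature extraction and the confidence level for each address; here it is left abstract so the mapping step stays visible.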
Embodiment 6:
Figure 6 is a schematic structural diagram of a malicious domain name determination apparatus provided by an embodiment of the present application. The apparatus includes:
a determination module 601, configured to determine each candidate IP address according to the IP addresses in the pre-saved blacklist and the domain names in the blacklist;
a processing module 602, configured to, for each candidate IP address, obtain the sub-data of the preset type corresponding to the network flow (Netflow) data of the candidate IP address within a preset time period, extract, using the Embedded method, the feature type corresponding to the abnormal sub-data among the sub-data of the preset feature type, input the feature type into the pre-trained recognition model, and obtain the recognition model's output on whether the candidate IP address is a malicious IP address;
the determination module 601 is further configured to determine, according to the pre-saved correspondence between domain names and IP addresses, each domain name corresponding to each determined malicious IP address as a malicious domain name.
In a possible implementation, the determination module 601 is specifically configured to determine, according to the correspondence between domain names and IP addresses in the pre-saved DNS log, each candidate domain name corresponding to the IP addresses in the pre-saved blacklist, and to determine each candidate IP address corresponding to each candidate domain name.
In a possible implementation, the determination module 601 is specifically configured to determine, according to the correspondence between domain names and IP addresses in the pre-saved DNS log, each candidate IP address corresponding to the domain names in the blacklist.
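The two ways of seeding candidate IP addresses described above can be sketched together. The function name `seed_candidates` and the `(domain, ip)` record format are assumptions; note also that, as written, the blacklisted IP addresses themselves end up among the candidates, since the source does not say to exclude them.

```python
def seed_candidates(dns_records, blacklist_ips, blacklist_domains):
    """Derive initial candidate IPs from a blacklist using (domain, ip) DNS records."""
    # Blacklisted IP -> candidate domains -> candidate IPs of those domains.
    candidate_domains = {d for d, ip in dns_records if ip in blacklist_ips}
    from_ips = {ip for d, ip in dns_records if d in candidate_domains}
    # Blacklisted domain -> candidate IPs it resolves to.
    from_domains = {ip for d, ip in dns_records if d in blacklist_domains}
    return from_ips | from_domains
```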
In a possible implementation, the determination module 601 is further configured to perform the following steps in a loop for the candidate IP addresses: determine each domain name corresponding to each candidate IP address; determine the IP addresses corresponding to each of those domain names as candidate IP addresses; and repeat until the IP addresses corresponding to every obtained domain name are all candidate IP addresses, or the domain names corresponding to every candidate IP address have all been obtained.
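The loop above amounts to growing the candidate set until it stops changing. A minimal sketch using a breadth-first traversal follows; the traversal strategy, the function name `expand_candidates`, and the `(domain, ip)` record format are assumptions, since the source specifies only the loop and its termination conditions.

```python
from collections import deque


def expand_candidates(seed_ips, dns_records):
    """Grow the candidate IP set through shared domains until nothing new appears."""
    ip_to_domains, domain_to_ips = {}, {}
    for domain, ip in dns_records:
        ip_to_domains.setdefault(ip, set()).add(domain)
        domain_to_ips.setdefault(domain, set()).add(ip)

    candidates = set(seed_ips)
    seen_domains = set()
    queue = deque(candidates)
    while queue:
        ip = queue.popleft()
        for domain in ip_to_domains.get(ip, ()):
            if domain in seen_domains:
                continue  # this domain's IPs were already added
            seen_domains.add(domain)
            for other_ip in domain_to_ips[domain]:
                if other_ip not in candidates:
                    candidates.add(other_ip)
                    queue.append(other_ip)
    return candidates
```

The queue empties exactly when every domain of every candidate has been visited and contributes no new address, which corresponds to the termination conditions stated above.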
In a possible implementation, the processing module 602 is further configured to, for each candidate IP address, determine the number of domain names corresponding to the candidate IP address as the first number, take the domain names corresponding to the candidate IP address as target domain names, determine each IP address corresponding to the target domain names, and count the second number, which is the number of those IP addresses present in the blacklist; and to determine, according to the ratio of the second number to the first number, the confidence level that the candidate IP address is a malicious IP address;
the processing module 602 is specifically configured to input the confidence level corresponding to the candidate IP address and the feature type into the recognition model, and obtain the recognition model's output on whether the candidate IP address is a malicious IP address.
Embodiment 7:
On the basis of the above embodiments, Figure 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in Figure 7, the electronic device includes a processor 701, a communication interface 702, a memory 703 and a communication bus 704, where the processor 701, the communication interface 702 and the memory 703 communicate with one another through the communication bus 704.
The memory 703 stores a computer program which, when executed by the processor 701, causes the processor 701 to perform the following steps:
determining each candidate IP address according to the IP addresses in the pre-saved blacklist and the domain names in the blacklist;
for each candidate IP address, obtaining the sub-data of the preset type corresponding to the Netflow data of the candidate IP address within a preset time period, extracting, using the Embedded method, the feature type corresponding to the abnormal sub-data among the sub-data of the preset feature type, inputting the feature type into the pre-trained recognition model, and obtaining the recognition model's output on whether the candidate IP address is a malicious IP address;
determining, according to the pre-saved correspondence between domain names and IP addresses, each domain name corresponding to each determined malicious IP address as a malicious domain name.
Further, the processor 701 is specifically configured to determine, according to the correspondence between domain names and IP addresses in the pre-saved DNS log, each candidate domain name corresponding to the IP addresses in the pre-saved blacklist, and to determine each candidate IP address corresponding to each candidate domain name.
Further, the processor 701 is specifically configured to determine, according to the correspondence between domain names and IP addresses in the pre-saved DNS log, each candidate IP address corresponding to the domain names in the blacklist.
Further, the processor 701 is also configured to perform the following steps in a loop for the candidate IP addresses:
determining each domain name corresponding to each candidate IP address;
determining the IP addresses corresponding to each of those domain names as candidate IP addresses;
until the IP addresses corresponding to every obtained domain name are all candidate IP addresses, or the domain names corresponding to every candidate IP address have all been obtained.
Further, the processor 701 is also configured to, for each candidate IP address, determine the number of domain names corresponding to the candidate IP address as the first number, take the domain names corresponding to the candidate IP address as target domain names, determine each IP address corresponding to the target domain names, and count the second number, which is the number of those IP addresses present in the blacklist;
and to determine, according to the ratio of the second number to the first number, the confidence level that the candidate IP address is a malicious IP address;
the processor 701 is specifically configured to input the confidence level corresponding to the candidate IP address and the feature type into the recognition model, and obtain the recognition model's output on whether the candidate IP address is a malicious IP address.
The communication bus mentioned above may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is used in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface 702 is used for communication between the above electronic device and other devices.
The memory may include random access memory (RAM), and may also include non-volatile memory (NVM), for example at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The above processor may be a general-purpose processor, including a central processing unit, a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
Embodiment 8:
On the basis of the above embodiments, an embodiment of the present application further provides a computer-readable storage medium storing a computer program executable by an electronic device. When the program runs on the electronic device, it causes the electronic device to implement the following steps:
A computer program is stored in the memory, and when the program is executed by the processor, it causes the processor to perform the following steps:
determining each candidate IP address according to the IP addresses in the pre-saved blacklist and the domain names in the blacklist;
for each candidate IP address, obtaining the sub-data of the preset type corresponding to the Netflow data of the candidate IP address within a preset time period, extracting, using the Embedded method, the feature type corresponding to the abnormal sub-data among the sub-data of the preset feature type, inputting the feature type into the pre-trained recognition model, and obtaining the recognition model's output on whether the candidate IP address is a malicious IP address;
determining, according to the pre-saved correspondence between domain names and IP addresses, each domain name corresponding to each determined malicious IP address as a malicious domain name.
In a possible implementation, determining each candidate IP address according to the IP addresses in the pre-saved blacklist includes:
determining, according to the correspondence between domain names and IP addresses in the pre-saved DNS log, each candidate domain name corresponding to the IP addresses in the pre-saved blacklist, and determining each candidate IP address corresponding to each candidate domain name.
In a possible implementation, determining each candidate IP address according to the domain names in the pre-saved blacklist includes:
determining, according to the correspondence between domain names and IP addresses in the pre-saved DNS log, each candidate IP address corresponding to the domain names in the blacklist.
In a possible implementation, the method further includes:
performing the following steps in a loop for the candidate IP addresses:
determining each domain name corresponding to each candidate IP address;
determining the IP addresses corresponding to each of those domain names as candidate IP addresses;
until the IP addresses corresponding to every obtained domain name are all candidate IP addresses, or the domain names corresponding to every candidate IP address have all been obtained.
In a possible implementation, after determining each candidate IP address and before inputting the feature type into the recognition model and obtaining the recognition model's output on whether the candidate IP address is a malicious IP address, the method further includes:
for each candidate IP address, determining the number of domain names corresponding to the candidate IP address as the first number, taking the domain names corresponding to the candidate IP address as target domain names, determining each IP address corresponding to the target domain names, and counting the second number, which is the number of those IP addresses present in the blacklist;
determining, according to the ratio of the second number to the first number, the confidence level that the candidate IP address is a malicious IP address;
inputting the feature type into the recognition model and obtaining the recognition model's output on whether the candidate IP address is a malicious IP address includes:
inputting the confidence level corresponding to the candidate IP address and the feature type into the recognition model, and obtaining the recognition model's output on whether the candidate IP address is a malicious IP address.
Those skilled in the art will understand that embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to the present application. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operational steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present application without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present application and their technical equivalents, the present application is also intended to include them.

Claims (10)

  1. 一种恶意域名确定方法,所述方法包括:A method for determining malicious domain names, the method includes:
    根据预先保存的黑名单中的IP地址以及黑名单中的域名,确定每个候选IP地址;Determine each candidate IP address based on the IP addresses in the pre-saved blacklist and the domain names in the blacklist;
    针对所述每个候选IP地址,获取预设时间段内该候选IP地址的网络流Netflow数据对应的预设类型的子数据,通过嵌入式Embedded方法,提取该预设特征类型的子数据中异常的子数据对应的特征类型,将所述特征类型输入预先训练完成的识别模型中,获取所述识别模型输出的该候选IP地址是否为恶意IP地址;For each candidate IP address, obtain the preset type of sub-data corresponding to the network flow Netflow data of the candidate IP address within the preset time period, and extract anomalies in the sub-data of the preset feature type through the embedded Embedded method The feature type corresponding to the sub-data, input the feature type into the pre-trained recognition model, and obtain whether the candidate IP address output by the recognition model is a malicious IP address;
    根据预先保存的域名与IP地址的对应关系,将所确定的每个恶意IP地址对应的每个域名,确定为恶意域名。According to the pre-saved correspondence between domain names and IP addresses, each domain name corresponding to each determined malicious IP address is determined as a malicious domain name.
  2. 根据权利要求1所述的方法,所述根据预先保存的黑名单中的IP地址,确定每个候选IP地址包括:The method according to claim 1, wherein determining each candidate IP address based on the IP addresses in a pre-saved blacklist includes:
    根据预先保存的域名解析协议DNS日志中的域名与IP地址的对应关系,确定预先保存的黑名单中的IP地址对应的每个候选域名,确定所述每个候选域名对应的每个候选IP地址。Determine each candidate domain name corresponding to the IP address in the pre-saved blacklist according to the corresponding relationship between the domain name and the IP address in the pre-saved Domain Name Resolution Protocol DNS log, and determine each candidate IP address corresponding to each candidate domain name. .
  3. 根据权利要求1所述的方法,所述根据预先保存的黑名单中的域名,确定每个候选IP地址包括:The method according to claim 1, wherein determining each candidate IP address based on domain names in a pre-saved blacklist includes:
    根据预先保存的DNS日志中的域名与IP地址的对应关系,确定黑名单中的域名对应的每个候选IP地址。Based on the correspondence between domain names and IP addresses in the pre-saved DNS logs, each candidate IP address corresponding to the domain name in the blacklist is determined.
  4. 根据权利要求2或3所述的方法,所述方法还包括:The method according to claim 2 or 3, further comprising:
    针对每个候选IP地址循环执行以下步骤:Cycle through the following steps for each candidate IP address:
    确定每个候选IP地址,对应的每个域名;Determine each candidate IP address and each corresponding domain name;
    将所述每个域名对应的IP地址确定为候选IP地址;Determine the IP address corresponding to each domain name as a candidate IP address;
    直至获取到的每个域名对应的IP地址均为候选IP地址,或每个候选IP地址对应的域名均被获取到。Until the IP address corresponding to each obtained domain name is a candidate IP address, or the domain name corresponding to each candidate IP address is obtained.
  5. 根据权利要求1所述的方法,所述确定每个候选IP地址之后,所述将所述特征类型输入识别模型中,获取所述识别模型输出的,该候选IP地址是否为恶意IP地址之前,所述方法还包括:The method according to claim 1, after determining each candidate IP address, before inputting the feature type into the recognition model and obtaining whether the candidate IP address is a malicious IP address output by the recognition model, The method also includes:
    针对每个候选IP地址,将该候选IP地址对应的域名的数量,确定为第一数量,并获取该候选IP地址对应的域名作为目标域名,确定所述目标域名对应的每个IP地址,统计所述每个IP地址存在于所述黑名单中的第二数量;For each candidate IP address, determine the number of domain names corresponding to the candidate IP address as the first number, obtain the domain name corresponding to the candidate IP address as the target domain name, determine each IP address corresponding to the target domain name, and collect statistics a second number of each IP address present in the blacklist;
    根据所述第二数量与所述第一数量的比值,确定该候选IP地址为恶意IP地址的置信度;Determine the confidence that the candidate IP address is a malicious IP address based on the ratio of the second number to the first number;
    所述将所述特征类型输入识别模型中,获取所述识别模型输出的该候选IP地址是否为恶意IP地址包括:The step of inputting the feature type into the recognition model and obtaining whether the candidate IP address output by the recognition model is a malicious IP address includes:
    将该候选IP地址对应的置信度及所述特征类型输入识别模型中,获取所述识别模型输出的该候选IP地址是否为恶意IP地址。The confidence corresponding to the candidate IP address and the feature type are input into the recognition model, and whether the candidate IP address output by the recognition model is obtained is a malicious IP address.
  6. 一种恶意域名确定装置,所述装置包括:A malicious domain name determination device, the device includes:
    a determination module, configured to determine each candidate IP address based on the IP addresses and the domain names in a pre-saved blacklist;
    a processing module, configured to, for each candidate IP address: obtain sub-data of a preset type corresponding to the network flow (Netflow) data of the candidate IP address within a preset time period; extract, by an embedded (Embedded) method, the feature type corresponding to the abnormal sub-data among the sub-data of the preset feature type; input the feature type into a pre-trained recognition model; and obtain the recognition model's output indicating whether the candidate IP address is a malicious IP address;
    the determination module being further configured to determine, based on a pre-saved correspondence between domain names and IP addresses, each domain name corresponding to each determined malicious IP address as a malicious domain name.
  7. The apparatus according to claim 6, wherein the determination module is specifically configured to determine, based on the correspondence between domain names and IP addresses in a pre-saved domain name resolution protocol (DNS) log, each candidate domain name corresponding to an IP address in the pre-saved blacklist, and to determine each candidate IP address corresponding to each candidate domain name.
  8. The apparatus according to claim 6, wherein the determination module is specifically configured to determine, based on the correspondence between domain names and IP addresses in a pre-saved DNS log, each candidate IP address corresponding to a domain name in the blacklist.
  9. An electronic device, comprising at least a processor and a memory, wherein the processor is configured to perform the steps of the malicious domain name determination method according to any one of claims 1-5 when executing a computer program stored in the memory.
  10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, performs the steps of the malicious domain name determination method according to any one of claims 1-5.
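
The data flow recited in claims 6-8 can be illustrated with a minimal Python sketch. It is not the applicant's implementation. It assumes the DNS log is available as (domain, IP) pairs, and it replaces the Netflow feature extraction and the pre-trained recognition model with a caller-supplied predicate `is_malicious_ip`. All function and variable names here are hypothetical.

```python
from collections import defaultdict


def build_indexes(dns_log):
    """Build both lookup directions from (domain, ip) pairs in a DNS log."""
    domain_to_ips = defaultdict(set)
    ip_to_domains = defaultdict(set)
    for domain, ip in dns_log:
        domain_to_ips[domain].add(ip)
        ip_to_domains[ip].add(domain)
    return domain_to_ips, ip_to_domains


def candidate_ips(blacklist_ips, blacklist_domains, domain_to_ips, ip_to_domains):
    """Collect candidate IP addresses from a blacklist of IPs and domains."""
    candidates = set()
    # Claim 7 path: blacklisted IP -> candidate domains -> candidate IPs.
    for ip in blacklist_ips:
        for domain in ip_to_domains.get(ip, ()):
            candidates |= domain_to_ips.get(domain, set())
    # Claim 8 path: blacklisted domain -> candidate IPs.
    for domain in blacklist_domains:
        candidates |= domain_to_ips.get(domain, set())
    return candidates


def malicious_domains(candidates, is_malicious_ip, ip_to_domains):
    """Map every candidate IP judged malicious back to its domain names.

    is_malicious_ip is a stand-in (assumption) for the feature extraction
    and recognition model described in claim 6.
    """
    result = set()
    for ip in candidates:
        if is_malicious_ip(ip):
            result |= ip_to_domains.get(ip, set())
    return result


if __name__ == "__main__":
    # Illustrative data only; the addresses come from documentation ranges.
    log = [
        ("bad.example", "203.0.113.5"),
        ("bad.example", "203.0.113.9"),
        ("shop.example", "203.0.113.9"),
        ("ok.example", "198.51.100.7"),
    ]
    d2i, i2d = build_indexes(log)
    cands = candidate_ips({"203.0.113.5"}, set(), d2i, i2d)
    flagged = malicious_domains(cands, lambda ip: ip == "203.0.113.9", i2d)
    print(sorted(flagged))
```

Keeping both lookup directions in memory lets the blacklist expansion and the final IP-to-domain mapping share one pass over the log. A deployment with large DNS logs would likely need a database or streaming join instead.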
PCT/CN2022/136819 2022-08-16 2022-12-06 Method and apparatus for determining malicious domain name, device, and medium WO2024036822A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210978392.3 2022-08-16
CN202210978392.3A CN115412312A (en) 2022-08-16 2022-08-16 Malicious domain name determination method, device, equipment and medium

Publications (1)

Publication Number Publication Date
WO2024036822A1 true WO2024036822A1 (en) 2024-02-22

Family

ID=84159721

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/136819 WO2024036822A1 (en) 2022-08-16 2022-12-06 Method and apparatus for determining malicious domain name, device, and medium

Country Status (2)

Country Link
CN (1) CN115412312A (en)
WO (1) WO2024036822A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115412312A (en) * 2022-08-16 2022-11-29 天翼安全科技有限公司 Malicious domain name determination method, device, equipment and medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101702660A (en) * 2009-11-12 2010-05-05 中国科学院计算技术研究所 Abnormal domain name detection method and system
CN102523210A (en) * 2011-12-06 2012-06-27 中国科学院计算机网络信息中心 Phishing website detection method and device
CN102882889A (en) * 2012-10-18 2013-01-16 珠海市君天电子科技有限公司 Method and system for concentrated IP (Internet Protocol) collection and identification of phishing websites
CN104994117A (en) * 2015-08-07 2015-10-21 国家计算机网络与信息安全管理中心江苏分中心 Malicious domain name detection method and system based on DNS (Domain Name Server) resolution data
CN105959294A (en) * 2016-06-17 2016-09-21 北京网康科技有限公司 Malicious domain name identification method and device
CN107517193A (en) * 2016-06-17 2017-12-26 百度在线网络技术(北京)有限公司 Malicious website identification method and device
CN108540490A (en) * 2018-04-26 2018-09-14 四川长虹电器股份有限公司 Phishing website detection and domain name filing storage method
CN110431817A (en) * 2017-03-10 2019-11-08 维萨国际服务协会 Identify malicious network device
US20210014247A1 (en) * 2019-07-09 2021-01-14 Mcafee, Llc Methods, systems, articles of manufacture and apparatus for producing generic ip reputation through cross-protocol analysis
CN115412312A (en) * 2022-08-16 2022-11-29 天翼安全科技有限公司 Malicious domain name determination method, device, equipment and medium

Also Published As

Publication number Publication date
CN115412312A (en) 2022-11-29

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22955591; Country of ref document: EP; Kind code of ref document: A1)