WO2018157336A1 - Data processing apparatus and method - Google Patents

Data processing apparatus and method

Info

Publication number
WO2018157336A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
content
mapping database
historical
unknown attack
Prior art date
Application number
PCT/CN2017/075349
Other languages
English (en)
French (fr)
Inventor
郭代飞
刘锡峰
Original Assignee
西门子公司
郭代飞
刘锡峰
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 西门子公司, 郭代飞, 刘锡峰
Priority to US16/490,150 (US11405358B2)
Priority to CN201780087242.0A (CN110574348B)
Priority to ES17898759T (ES2931991T3)
Priority to PCT/CN2017/075349 (WO2018157336A1)
Priority to EP17898759.0A (EP3576365B1)
Publication of WO2018157336A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227 Filtering policies
    • H04L63/0245 Filtering by information in the payload
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/16 Threshold monitoring
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/02 Network architectures or network communication protocols for network security for separating internal from external traffic, e.g. firewalls
    • H04L63/0227 Filtering policies
    • H04L63/0263 Rule management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1416 Event detection, e.g. attack signature detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1433 Vulnerability analysis

Definitions

  • the present invention relates to a data processing apparatus and method.
  • network traffic collection devices can be deployed at the objects that need to be protected to collect network traffic from the network.
  • a Network Security Monitor (NSM) can be deployed on a customer's network to obtain network traffic.
  • NSM Network Security Monitor
  • IDS Intrusion Detection System
  • the NSM can be configured not only as a detection sensor disposed at the front end in the network environment, but also as a raw data collector.
  • the NSM can capture network data streams into unstructured files such as pcap files, preprocess them, and then send them to the central network security monitoring center.
  • NSM can be used to help correlate cybersecurity threats.
  • the present invention is directed to a data processing apparatus and method that solves the above and/or other technical problems.
  • a data processing apparatus includes: a data collection unit configured to collect data transmitted in a network and divide the collected data into known attack data and unknown attack data according to predetermined characteristics; and a data conversion unit configured to replace at least a portion of the content included in the unknown attack data with the corresponding identification code according to the mapping database. Therefore, the data to be transmitted to the central network security monitoring center can be reduced.
  • the data conversion unit includes: a data identification unit configured to identify content included in the unknown attack data; and a data classification unit configured to classify, according to the recognition result of the data identification unit, the content of the unknown attack data that has been identified by the data identification unit. Therefore, the speed and accuracy of the security analysis can be improved.
  • the data conversion unit includes: a data matching unit configured to determine whether the content in the unknown attack data is the same as historical data previously transmitted in the network and included in the mapping database; and a data replacement unit configured to replace the same content, when the content in the unknown attack data is determined to be the same as the historical data, with the identification code corresponding to the historical data in the mapping database.
  • the mapping database includes an identification code corresponding to the historical data and information related to the historical data, and the data matching unit is configured to determine whether the content in the unknown attack data is identical to the historical data according to the information related to the historical data in the mapping database.
  • the information related to the historical data includes a message digest of the historical data, and the data matching unit is configured to obtain a message digest of the content in the unknown attack data and to determine whether that content is the same as the historical data according to whether its message digest is the same as the message digest of the historical data.
  • the information related to the historical data includes the starting position and length of the historical data, and the data matching unit is configured to select, in the unknown attack data, the content to be compared according to the starting position and length of the historical data.
  • the data processing apparatus further includes: a mapping database generating unit configured to generate the mapping database based on historical data previously transmitted in the network.
  • the mapping database generating unit generates the mapping database from those historical data, among the data previously transmitted in the network, whose frequency of occurrence is greater than a predetermined threshold.
  • the data processing apparatus further includes: a communication unit configured to transmit the data converted by the data conversion unit to the outside.
  • a data processing method includes: collecting data transmitted in a network and dividing the collected data into known attack data and unknown attack data according to predetermined characteristics; and replacing at least a portion of the content included in the unknown attack data with a corresponding identification code according to the mapping database. Therefore, the data to be transmitted to the central network security monitoring center can be reduced.
  • the step of converting includes: identifying content included in the unknown attack data; and classifying, according to the recognition result, the content of the unknown attack data that has been identified by the data identification unit. Therefore, the speed and accuracy of the security analysis can be improved.
  • the step of converting includes: determining whether the content in the unknown attack data is the same as historical data previously transmitted in the network and included in the mapping database; and, when the content in the unknown attack data is determined to be the same as the historical data, replacing the same content with the identification code corresponding to the historical data in the mapping database.
  • the mapping database includes an identifier corresponding to the historical data and information related to the historical data, and the converting comprises: determining whether the content in the unknown attack data is the same as the historical data according to the information related to the historical data in the mapping database.
  • the information related to the historical data includes a message digest of the historical data
  • the step of converting includes: obtaining a message digest of the content in the unknown attack data, and determining whether that content is the same as the historical data according to whether its message digest is the same as the message digest of the historical data.
  • the information related to the historical data includes the starting position and length of the historical data, and the step of converting includes: selecting, in the unknown attack data, the content to be compared according to the starting position and length of the historical data.
  • the method also includes generating a mapping database based on historical data previously transmitted in the network.
  • the step of generating a mapping database includes generating the mapping database from those historical data, among the data previously transmitted in the network, whose frequency of occurrence is greater than a predetermined threshold.
  • the method also includes transmitting the converted data to the outside.
  • a data processing apparatus and method may perform correlation analysis on data transmitted in an industrial control network, construct a mapping database, and replace the same portion of the data with an identification code, thereby reducing the data to be transmitted to the central network security monitoring center.
  • in addition, the data transmitted in the network can be classified, which can improve the speed and accuracy of security analysis.
  • FIG. 1 is a schematic block diagram showing a data processing device according to an exemplary embodiment
  • FIG. 2 is a diagram illustrating an exemplary application of a data processing device, according to an exemplary embodiment
  • FIG. 3 is a flowchart illustrating a data processing method according to an exemplary embodiment.
  • FIG. 1 is a schematic block diagram illustrating a data processing device according to an exemplary embodiment
  • FIG. 2 is a diagram illustrating an exemplary application of a data processing device according to an exemplary embodiment
  • the data processing apparatus may collect data transmitted in, for example, an industrial control network and process the data to reduce its size, so that the reduced, processed data can be transmitted over a smaller bandwidth; the data processing apparatus is therefore also referred to hereinafter as a Data Collecting and Preprocessing Agent.
  • a data processing apparatus may include a data collection unit 100 and a data conversion unit 300.
  • the data collection unit 100 can be deployed in a network environment, such as an industrial control network, in need of protection to collect data transmitted in the network to be protected.
  • industrial control networks can use the Modbus industrial control protocol, the FTP protocol, and so on.
  • the data collection unit 100 may divide the collected data into known data and unknown attack data according to predetermined characteristics.
  • the data collection unit 100 may perform a basic security scan on the collected data according to predetermined characteristics, thereby determining which of the collected data is data corresponding to a security attack that may threaten the network to be protected.
  • the data collection unit 100 may divide the collected data into known attack data corresponding to a known attack and unknown attack data corresponding to an unknown attack, using feature-string matching against a library of known attack signatures. Such methods are known, so a description of them is omitted here to avoid redundancy.
  • the data collection unit 100 can filter known attack data transmitted in the network.
  • the data collection unit 100 may transmit the determined unknown attack data to the data conversion unit 300.
  • the data conversion unit (300) may replace the content included in the unknown attack data with the corresponding identification code according to the mapping database.
  • the data conversion unit 300 may include a data identification unit 310 and a data classification unit 330.
  • the data identification unit 310 can identify the content included in the unknown attack data.
  • the data identification unit 310 can analyze the protocol used for transmission of unknown attack data in the network to obtain header data and load data of unknown attack data.
  • a Modbus protocol label can be obtained.
  • the data classification unit 330 can classify the content of the unknown attack data that has been recognized by the data identification unit according to the recognition result of the data identification unit 310.
  • data classification unit 330 may classify unknown attack data into different categories based on a category database.
  • the data classification database may include category rule information related to different network protocols, such as protocol categories, application categories, and command categories.
  • a category database can be used to classify data based on an application scenario. In industrial control networks, more and more applications are combined with traditional network protocols such as HTTP, FTP, Telnet, SSH, and the like. For example, in the industrial control network of the Siemens PCS7 series, the PROFINET, OPC and S7 protocols are used.
  • the category database can store industrial control protocol types and important commands based on the data transmitted in the network and the construction of the network.
  • the category database may include protocol labels, command categories, and the like of the Modbus protocol.
  • the data conversion unit 300 may further include a data matching unit 350 and a data replacement unit 370.
  • the data matching unit 350 may determine whether the content classified into the different categories in the unknown attack data is partially or entirely the same as the data included in the mapping database.
  • the mapping database may store information related to historical data and an identification code corresponding to the historical data, wherein the information related to the historical data may include a message digest of the data, related category information, and the starting position and length of the data.
  • historical data is those packets that frequently appear in data previously transmitted in the network.
  • the message digest may include a hash calculation result of the historical data, for example, MD5, SHA, and the like.
  • the data matching unit 350 may query information related to historical data in the mapping database, for example, a message digest of the data, related category information, the starting position and length of the data, and the like. The data matching unit 350 can then perform a correlation analysis to find out whether the mapping database contains content identical to the content in the unknown attack data. For example, the data matching unit 350 may locate the content in the unknown attack data using the starting position stored in the mapping database, and then compute the hash of the data segment in the unknown attack data that begins at that starting position and has the same length as the length stored in the mapping database. By checking whether the computed hash is the same as the message digest in the mapping database, it determines whether the content in the unknown attack data is the same as the content in the mapping database.
  • for content in the mapping database that shares the same starting position, the data matching unit 350 may first compute the hash of the shortest content and compare it with the message digest. When they are the same, the data matching unit 350 can compute and compare the second-shortest content. As such, when the data matching unit 350 determines that the message digest and the hash result differ for content of a given length, the data matching unit 350 can stop checking, because the mismatch means that the subsequent content of the unknown attack data will also differ from the longer historical data in the mapping database.
  • when the data matching unit 350 has determined which content in the unknown attack data is the same as historical data in the mapping database, it may transmit the starting position and length of that content in the unknown attack data to the data replacement unit 370.
  • the data replacement unit 370 can replace the same content with the identification code in the mapping database that has a mapping relationship with that content. For example, the data replacement unit 370 can replace the same content with the identification code beginning at the starting position. As described above, the size of the identification code in the mapping database may be smaller than the size of the data corresponding to the identification code. Therefore, the data obtained after replacement by the data replacement unit 370 may be smaller, for example much smaller, than the original unknown attack data.
  • the data processing device may include a mapping database generating unit 500.
  • the mapping database generating unit 500 is configured to perform correlation analysis on historical data transmitted in the network and can extract frequently occurring common or overlapping data.
  • the mapping database generating unit 500 may first compile statistics on the historical data according to category information such as the protocol category, application category, and command category, thereby obtaining common or overlapping data with a high frequency of occurrence (e.g., above a predetermined threshold).
  • the mapping database generating unit 500 can then assign an identification code to the common or overlapping data and construct the mapping database from the identification code and the information related to the common or overlapping data.
  • the mapping database generating unit 500 can perform a maximum-matching association scan based on the historical data.
  • the mapping database generating unit 500 can determine which data occur frequently in the network based on predetermined thresholds T1 and T2. If the number of occurrences of data of the same category is greater than the first threshold T1, the mapping database generating unit 500 performs matching calculations on data having the same protocol, application, and command.
  • the mapping database generating unit 500 selects two data items with the same protocol information, finds the longest common or overlapping portion between them, and records the starting position and length of that portion.
  • the mapping database generating unit 500 can then compare that portion with other data and store the number of data items that contain the same portion. If that number is greater than the second threshold T2, the mapping database generating unit 500 can build the mapping database with that portion.
  • the data processing apparatus may further include a communication unit 700.
  • the communication unit 700 can transmit the processed data to an external central network security monitoring center.
  • the central network security monitoring center may process the processed data according to the mapping database to restore the unknown attack data, and perform security analysis on the restored unknown attack data.
  • FIG. 3 is a flowchart illustrating a data processing method according to an exemplary embodiment.
  • at operation S310, data transmitted in the network may be collected, and the collected data may be divided into known attack data and unknown attack data according to predetermined characteristics. Then, at least a part of the content included in the unknown attack data may be replaced with a corresponding identification code according to the mapping database (S330). Further, the converted data may be transmitted to the outside at operation S350.
  • the content included in the unknown attack data may be identified, and then the content of the unknown attack data that has been identified by the data identification unit may be classified according to the recognition result.
  • it may be determined whether the content in the unknown attack data is the same as historical data previously transmitted in the network and included in the mapping database, and, when it is, the same content may be replaced with the identification code corresponding to the historical data in the mapping database.
  • the mapping database may include an identification code corresponding to the historical data and information related to the historical data, so that whether the content in the unknown attack data is the same as the historical data is determined according to the information related to the historical data in the mapping database.
  • Information related to historical data includes a message digest of historical data.
  • the message digest of the content in the unknown attack data can be obtained, and whether that content is the same as the historical data is determined according to whether its message digest is the same as the message digest of the historical data.
  • information related to historical data includes the starting position and length of historical data.
  • the content to be compared is selected in the unknown attack data according to the starting position and length of the historical data.
  • the mapping database can be generated from historical data previously transmitted in the network. For example, the mapping database is generated from those historical data, among the data previously transmitted in the network, whose frequency of occurrence is greater than a predetermined threshold.
  • a data processing apparatus and method may perform correlation analysis on data transmitted in an industrial control network, construct a mapping database, and replace the same portion of the data with an identification code, thereby reducing the data to be transmitted to the central network security monitoring center.
  • in addition, the data transmitted in the network can be classified, which can improve the speed and accuracy of security analysis.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Computer And Data Communications (AREA)

Abstract

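A data processing apparatus and method. The data processing apparatus comprises: a data collection unit (100) configured to collect data transmitted in a network and to divide the collected data into known attack data and unknown attack data according to predetermined features; and a data conversion unit (300) configured to replace at least part of the content included in the unknown attack data with a corresponding identification code according to a mapping database. The size of the data transmitted in the network can thereby be reduced.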

Description

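Data processing apparatus and method
Technical field
The present invention relates to a data processing apparatus and method.
Background
In central network security monitoring, network traffic collection devices can be deployed at the objects to be protected in order to collect traffic from the network. For example, a Network Security Monitor (NSM) can be deployed in a customer's network to obtain network traffic. An NSM works in a manner similar to an Intrusion Detection System (IDS): it can monitor security events such as Denial of Service, network scans, and other network or application attacks triggered by malware.
In central network security monitoring, the NSM can serve not only as a detection sensor deployed at the front end of the network environment but also as a raw data collector. In this role, the NSM can capture network data streams into unstructured files such as pcap files, preprocess these files, and then send them to the central network security monitoring center. In this way, the NSM can help with correlation analysis of network security threats.
However, when the monitored network data streams become very large, transmitting files such as pcap files requires very high bandwidth. To address this problem in industrial control network applications, a data preprocessing method based on correlation analysis is proposed, which is applied to the collected data before it is sent to the central network security monitoring center. In an industrial control network environment, the network traffic related to the control and monitoring of automated production processes is relatively fixed. It is therefore necessary to identify and simplify known data and to process only unknown data, so as to reduce the data that needs to be sent and relieve the pressure on bandwidth.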
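Summary of the invention
The present invention aims to provide a data processing apparatus and method that solve the above and/or other technical problems.
In one embodiment, a data processing apparatus comprises: a data collection unit configured to collect data transmitted in a network and to divide the collected data into known attack data and unknown attack data according to predetermined features; and a data conversion unit configured to replace at least part of the content included in the unknown attack data with a corresponding identification code according to a mapping database. The data to be sent to the central network security monitoring center can thereby be reduced.
The data conversion unit comprises: a data identification unit configured to identify the content included in the unknown attack data; and a data classification unit configured to classify, according to the identification result of the data identification unit, the content of the unknown attack data that has been identified by the data identification unit. The speed and accuracy of security analysis can thereby be improved.
The data conversion unit comprises: a data matching unit configured to determine whether content in the unknown attack data is the same as historical data, previously transmitted in the network, that is included in the mapping database; and a data replacement unit configured to replace the same content, when the content in the unknown attack data is determined to be the same as the historical data, with the identification code corresponding to the historical data in the mapping database.
The mapping database includes identification codes corresponding to the historical data and information related to the historical data, and the data matching unit is configured to determine, according to the information related to the historical data in the mapping database, whether the content in the unknown attack data is the same as the historical data. The information related to the historical data includes a message digest of the historical data, and the data matching unit is configured to obtain a message digest of the content in the unknown attack data and to determine whether that content is the same as the historical data according to whether its message digest is the same as the message digest of the historical data. The information related to the historical data includes the starting position and length of the historical data, and the data matching unit is configured to select, in the unknown attack data, the content to be compared according to the starting position and length of the historical data.
The data processing apparatus further comprises: a mapping database generating unit configured to generate the mapping database from historical data previously transmitted in the network. The mapping database generating unit generates the mapping database from those historical data, among the data previously transmitted in the network, whose frequency of occurrence is greater than a predetermined threshold.
The data processing apparatus further comprises: a communication unit configured to send the data converted by the data conversion unit to the outside.
In another embodiment, a data processing method comprises: collecting data transmitted in a network and dividing the collected data into known attack data and unknown attack data according to predetermined features; and replacing at least part of the content included in the unknown attack data with a corresponding identification code according to a mapping database. The data to be sent to the central network security monitoring center can thereby be reduced.
The converting step comprises: identifying the content included in the unknown attack data; and classifying, according to the identification result, the content of the unknown attack data that has been identified by the data identification unit. The speed and accuracy of security analysis can thereby be improved.
The converting step comprises: determining whether content in the unknown attack data is the same as historical data, previously transmitted in the network, that is included in the mapping database; and, when the content in the unknown attack data is determined to be the same as the historical data, replacing the same content with the identification code corresponding to the historical data in the mapping database. The mapping database includes identification codes corresponding to the historical data and information related to the historical data, and the converting step comprises: determining, according to the information related to the historical data in the mapping database, whether the content in the unknown attack data is the same as the historical data. The information related to the historical data includes a message digest of the historical data, and the converting step comprises: obtaining a message digest of the content in the unknown attack data, and determining whether that content is the same as the historical data according to whether its message digest is the same as the message digest of the historical data. The information related to the historical data includes the starting position and length of the historical data, and the converting step comprises: selecting, in the unknown attack data, the content to be compared according to the starting position and length of the historical data.
The method further comprises: generating the mapping database from historical data previously transmitted in the network. The step of generating the mapping database comprises: generating the mapping database from those historical data, among the data previously transmitted in the network, whose frequency of occurrence is greater than a predetermined threshold.
The method further comprises: sending the converted data to the outside.
According to the exemplary embodiments, the data processing apparatus and method can perform correlation analysis on data transmitted in an industrial control network, construct a mapping database, and replace identical portions of the data with identification codes, thereby reducing the data to be sent to the central network security monitoring center. In addition, the data transmitted in the network can be classified, which can improve the speed and accuracy of security analysis.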
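Brief description of the drawings
The following drawings are intended only to illustrate and explain the present invention schematically and do not limit its scope. In the drawings:
FIG. 1 is a schematic block diagram showing a data processing apparatus according to an exemplary embodiment;
FIG. 2 is a diagram showing an exemplary application of the data processing apparatus according to an exemplary embodiment;
FIG. 3 is a flowchart showing a data processing method according to an exemplary embodiment.
Description of reference numerals:
100 data collection unit  300 data conversion unit  500 mapping database generating unit  700 communication unit
310 data identification unit  330 data classification unit  350 data matching unit  370 data replacement unit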
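Detailed description
For a clearer understanding of the technical features, objects, and effects of the present invention, specific embodiments are now described with reference to the drawings.
FIG. 1 is a schematic block diagram showing a data processing apparatus according to an exemplary embodiment, and FIG. 2 is a diagram showing an exemplary application of the data processing apparatus according to an exemplary embodiment. Here, the data processing apparatus according to the exemplary embodiment can collect data transmitted in, for example, an industrial control network and process the data to reduce its size, so that the reduced, processed data can be transmitted over a smaller bandwidth. The data processing apparatus is therefore also referred to below as a Data Collecting and Preprocessing Agent.
As shown in FIG. 1, the data processing apparatus according to the exemplary embodiment may include a data collection unit 100 and a data conversion unit 300.
The data collection unit 100 can be deployed in a network environment that needs protection, such as an industrial control network, to collect the data transmitted in that network. For example, an industrial control network may use the Modbus industrial control protocol, the FTP protocol, and so on.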
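After collecting the data transmitted in the network to be protected, the data collection unit 100 can divide the collected data into known data and unknown attack data according to predetermined features. Specifically, the data collection unit 100 can perform a basic security scan on the collected data according to the predetermined features, thereby determining which of the collected data correspond to attacks that may threaten the security of the network to be protected. Here, the data collection unit 100 can use feature-string matching against a library of known attack signatures to divide the collected data into known attack data, corresponding to known attacks, and unknown attack data, corresponding to unknown attacks. Such methods are known, so a description of them is omitted here to avoid redundancy.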
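The signature-based split described above can be sketched as follows. This is only a minimal illustration, assuming that the "predetermined features" are plain byte strings and that every packet without a signature hit is treated as unknown attack data; the signature list and the name `split_by_signature` are hypothetical and do not come from the patent.

```python
from typing import Iterable, List, Sequence, Tuple

# Hypothetical signature library: byte strings taken to characterize known attacks.
KNOWN_ATTACK_SIGNATURES: Sequence[bytes] = (b"/etc/passwd", b"\x90\x90\x90\x90")


def split_by_signature(
    packets: Iterable[bytes],
    signatures: Sequence[bytes] = KNOWN_ATTACK_SIGNATURES,
) -> Tuple[List[bytes], List[bytes]]:
    """Return (known, unknown): packets containing a signature, and all others."""
    known: List[bytes] = []
    unknown: List[bytes] = []
    for packet in packets:
        # Plain substring search stands in for feature-string matching.
        if any(signature in packet for signature in signatures):
            known.append(packet)
        else:
            unknown.append(packet)
    return known, unknown
```

Only the second list would be handed on to the conversion step; the first can be filtered locally.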
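The data collection unit 100 can filter the known attack data transmitted in the network. The data collection unit 100 can send the determined unknown attack data to the data conversion unit 300. The data conversion unit 300 can replace content included in the unknown attack data with corresponding identification codes according to the mapping database.
Specifically, the data conversion unit 300 may include a data identification unit 310 and a data classification unit 330. The data identification unit 310 can identify the content included in the unknown attack data. For example, the data identification unit 310 can analyze the protocol used to transmit the unknown attack data in the network to obtain the header data and payload data of the unknown attack data. When, for example, the Modbus protocol is used for data transmission, the Modbus protocol label can be obtained.
Then, the data classification unit 330 can classify, according to the identification result of the data identification unit 310, the content of the unknown attack data that has been identified by the data identification unit. Specifically, the data classification unit 330 can classify the unknown attack data into different categories based on a category database. The category database may include category rule information related to different network protocols, for example protocol categories, application categories, and command categories. The category database can be used to classify data based on the application scenario. In industrial control networks, more and more applications are combined with traditional network protocols such as HTTP, FTP, Telnet, and SSH. For example, the industrial control networks of the Siemens PCS7 series use the PROFINET, OPC, and S7 protocols. The category database can therefore store industrial control protocol types and important commands based on the data transmitted in the network and the structure of the network. For example, when the Modbus protocol is used, the category database may include the protocol label, command categories, and so on of the Modbus protocol.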
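As a rough illustration of classification by protocol, application, and command category, the sketch below labels a Modbus/TCP payload by its function code. It assumes the standard Modbus/TCP framing (a 7-byte MBAP header followed by a one-byte function code); the `Category` type, the rule table, and the "process_control" application label are hypothetical stand-ins for the category database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    protocol: str
    application: str
    command: str


# Hypothetical category rules: a few standard Modbus function codes.
MODBUS_COMMANDS = {
    3: "read_holding_registers",
    6: "write_single_register",
    16: "write_multiple_registers",
}

UNKNOWN = Category("unknown", "unknown", "unknown")


def classify_modbus_tcp(payload: bytes) -> Category:
    """Classify a TCP payload assumed to carry one Modbus/TCP frame."""
    if len(payload) < 8:
        return UNKNOWN  # too short for MBAP header plus function code
    protocol_id = int.from_bytes(payload[2:4], "big")
    if protocol_id != 0:
        return UNKNOWN  # Modbus/TCP uses protocol identifier 0
    command = MODBUS_COMMANDS.get(payload[7], "other")
    return Category("modbus", "process_control", command)
```

Grouping data by such a category before matching keeps the later comparisons within data that share a protocol, application, and command.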
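The data conversion unit 300 may further include a data matching unit 350 and a data replacement unit 370. The data matching unit 350 can determine whether the content of the unknown attack data, which has been divided into different categories, is partly or wholly the same as data included in the mapping database. Specifically, the mapping database may store information related to historical data and identification codes corresponding to the historical data, where the information related to the historical data may include a message digest of the data, related category information, and the starting position and length of the data. Here, historical data are those packets that occur frequently in the data previously transmitted in the network. The message digest may include the result of a hash calculation over the historical data, for example MD5 or SHA.
The data matching unit 350 can query the information related to historical data in the mapping database, for example the message digest of the data, related category information, and the starting position and length of the data. The data matching unit 350 can then perform a correlation analysis to find out whether the mapping database contains content identical to the content in the unknown attack data. For example, the data matching unit 350 can locate the content in the unknown attack data using the starting position stored in the mapping database, and then compute the hash of the data segment in the unknown attack data that begins at that starting position and has the same length as the length stored in the mapping database. By checking whether the computed hash is the same as the message digest in the mapping database, it determines whether the content in the unknown attack data is the same as the content in the mapping database.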
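The digest comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: SHA-256 is chosen here as one member of the SHA family mentioned in the text, and the `MappingEntry` record and both function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class MappingEntry:
    code: bytes    # identification code
    start: int     # starting position of the historical segment
    length: int    # length of the historical segment
    digest: str    # hex SHA-256 digest of the historical segment


def make_entry(code: bytes, start: int, segment: bytes) -> MappingEntry:
    """Build a mapping entry from a historical segment (illustrative helper)."""
    return MappingEntry(code, start, len(segment), hashlib.sha256(segment).hexdigest())


def matches(entry: MappingEntry, data: bytes) -> bool:
    """True if data holds, at entry.start, a segment whose digest equals the entry's."""
    end = entry.start + entry.length
    if end > len(data):
        return False  # data is too short to contain the segment
    return hashlib.sha256(data[entry.start:end]).hexdigest() == entry.digest
```

Storing only the digest, position, and length keeps the lookup table small, at the cost of one hash computation per candidate segment.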
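In addition, for content in the mapping database that shares the same starting position, the data matching unit 350 may first compute the hash of the shortest content and compare it with the message digest. When they are the same, the data matching unit 350 can compute and compare the second-shortest content. In this way, when the data matching unit 350 determines that the message digest and the hash result differ for content of a given length, the data matching unit 350 can stop checking, because the mismatch means that the subsequent content of the unknown attack data will also differ from the longer historical data in the mapping database.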
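The shortest-first check with early stopping might look like the sketch below. It assumes, as the early-stop reasoning implies, that longer entries at the same starting position extend the shorter ones; the function name and the `candidates` layout (length mapped to hex SHA-256 digest) are hypothetical.

```python
import hashlib
from typing import Dict, Optional


def longest_match(data: bytes, start: int, candidates: Dict[int, str]) -> Optional[int]:
    """Return the length of the longest matching entry at `start`, or None.

    `candidates` maps segment length to the hex SHA-256 digest of the
    historical segment of that length beginning at `start`.
    """
    best: Optional[int] = None
    for length in sorted(candidates):  # shortest candidate first
        segment = data[start:start + length]
        if len(segment) < length:
            break  # data ends before this candidate could fit
        if hashlib.sha256(segment).hexdigest() != candidates[length]:
            break  # a mismatch here rules out every longer candidate
        best = length
    return best
```

If the nesting assumption does not hold for a given mapping database, the loop would have to continue past a mismatch instead of stopping.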
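Then, when the data matching unit 350 has determined which content in the unknown attack data is the same as historical data in the mapping database, it can send the starting position and length of that content in the unknown attack data to the data replacement unit 370.
The data replacement unit 370 can replace the same content with the identification code in the mapping database that has a mapping relationship with that content. For example, the data replacement unit 370 can replace the same content with the identification code beginning at the starting position. As described above, the size of an identification code in the mapping database can be smaller than the size of the data corresponding to that identification code. The data obtained after replacement by the data replacement unit 370 can therefore be smaller, for example much smaller, than the original unknown attack data.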
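The replacement step itself is a simple splice. The sketch below assumes the matched segment is given by its starting position and length and that the identification code is a short byte string; the function name is hypothetical, and the patent does not specify how codes are framed inside the converted data.

```python
def replace_segment(data: bytes, start: int, length: int, code: bytes) -> bytes:
    """Replace data[start:start+length] with the identification code."""
    if start < 0 or length < 0 or start + length > len(data):
        raise ValueError("segment lies outside the data")
    return data[:start] + code + data[start + length:]
```

For instance, replacing a 10-byte segment with a 1-byte code shortens the data by 9 bytes; the saving grows with the length of the matched segment.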
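In addition, the data processing apparatus according to the exemplary embodiment may include a mapping database generating unit 500. The mapping database generating unit 500 performs correlation analysis on historical data transmitted in the network and can extract frequently occurring common or overlapping data. Specifically, the mapping database generating unit 500 may first compile statistics on the historical data according to category information such as the protocol category, application category, and command category, thereby obtaining common or overlapping data with a high frequency of occurrence (for example, above a predetermined threshold). The mapping database generating unit 500 can then assign identification codes to the common or overlapping data and construct the mapping database from the identification codes and the information related to the common or overlapping data.
More specifically, the mapping database generating unit 500 can perform a maximum-matching association scan based on the historical data. The mapping database generating unit 500 can determine which data occur frequently in the network according to predetermined thresholds T1 and T2. If the number of occurrences of data of the same category is greater than the first threshold T1, the mapping database generating unit 500 performs matching calculations on data having the same protocol, application, and command. The mapping database generating unit 500 selects two data items with the same protocol information, finds the longest common or overlapping portion between them, and records the starting position and length of that portion. The mapping database generating unit 500 can then compare that portion with other data and store the number of data items that contain the same portion. If that number is greater than the second threshold T2, the mapping database generating unit 500 can build the mapping database with that portion.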
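The two-threshold procedure can be sketched as below for one category of historical payloads. Several simplifications are assumptions of this sketch rather than statements of the patent: the common portion is required to sit at the same offset in both samples (so that a single starting position can be recorded), only the first two samples are compared, and SHA-256 is used for the digest. All names are hypothetical.

```python
import hashlib
from typing import Dict, List, Optional, Tuple


def longest_aligned_run(a: bytes, b: bytes) -> Tuple[int, int]:
    """Return (start, length) of the longest run where a and b agree at the same offsets."""
    best_start = best_len = 0
    run_start = run_len = 0
    for i in range(min(len(a), len(b))):
        if a[i] == b[i]:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def build_entry(samples: List[bytes], t1: int, t2: int, code: bytes) -> Optional[Dict]:
    """Build one mapping entry from same-category samples, or None if too rare."""
    if len(samples) < 2 or len(samples) <= t1:
        return None  # category does not occur more than T1 times
    start, length = longest_aligned_run(samples[0], samples[1])
    if length == 0:
        return None
    segment = samples[0][start:start + length]
    # Count how many samples carry the same portion at the same position.
    support = sum(1 for s in samples if s[start:start + length] == segment)
    if support <= t2:
        return None  # portion does not occur more than T2 times
    return {
        "code": code,
        "start": start,
        "length": length,
        "digest": hashlib.sha256(segment).hexdigest(),
    }
```

T1 gates which categories are worth scanning at all, while T2 gates which common portions are frequent enough to earn an identification code.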
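In addition, the data processing apparatus may further include a communication unit 700. Once content in the unknown attack data has been replaced with shorter identification codes, reducing the size of the data, the communication unit 700 can send the processed data to the external central network security monitoring center. On receiving the processed data, the central network security monitoring center can process it according to the mapping database to restore the unknown attack data, and can then perform security analysis on the restored unknown attack data.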
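Restoration at the monitoring center is the inverse splice. The sketch below assumes that the center's copy of the mapping database holds the original bytes for each identification code together with the starting position, and that a code appearing at its recorded position always marks a replacement. The patent does not describe the encoding, so a real design would need an unambiguous marker; all names here are hypothetical.

```python
from typing import Dict, Tuple


def restore(converted: bytes, mapping: Dict[bytes, Tuple[int, bytes]]) -> bytes:
    """Expand the first identification code found at its recorded position.

    `mapping` maps identification code to (start, original segment).
    """
    for code, (start, segment) in mapping.items():
        if converted[start:start + len(code)] == code:
            return converted[:start] + segment + converted[start + len(code):]
    return converted  # no code present: data was sent unmodified
```

The restored bytes can then be passed to whatever security analysis the center runs on raw traffic.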
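FIG. 3 is a flowchart showing a data processing method according to an exemplary embodiment.
As shown in FIG. 3, first, at operation S310, data transmitted in the network can be collected, and the collected data can be divided into known attack data and unknown attack data according to predetermined features. Then, at least part of the content included in the unknown attack data can be replaced with a corresponding identification code according to the mapping database (S330). In addition, at operation S350, the converted data can be sent to the outside.
In one embodiment, the content included in the unknown attack data can be identified, and the content of the unknown attack data that has been identified by the data identification unit can then be classified according to the identification result.
In another embodiment, it can be determined whether content in the unknown attack data is the same as historical data, previously transmitted in the network, that is included in the mapping database, and, when the content in the unknown attack data is determined to be the same as the historical data, the same content can be replaced with the identification code corresponding to the historical data in the mapping database.
Specifically, the mapping database may include identification codes corresponding to the historical data and information related to the historical data, so that whether the content in the unknown attack data is the same as the historical data is determined according to the information related to the historical data in the mapping database. The information related to the historical data includes a message digest of the historical data. A message digest of the content in the unknown attack data can thus be obtained, and whether that content is the same as the historical data is determined according to whether its message digest is the same as the message digest of the historical data. For example, the information related to the historical data includes the starting position and length of the historical data. Here, the content to be compared is selected in the unknown attack data according to the starting position and length of the historical data.
The mapping database can be generated from historical data previously transmitted in the network. For example, the mapping database is generated from those historical data, among the data previously transmitted in the network, whose frequency of occurrence is greater than a predetermined threshold.
According to the exemplary embodiments, the data processing apparatus and method can perform correlation analysis on data transmitted in an industrial control network, construct a mapping database, and replace identical portions of the data with identification codes, thereby reducing the data to be sent to the central network security monitoring center. In addition, the data transmitted in the network can be classified, which can improve the speed and accuracy of security analysis.
It should be understood that although this specification is described in terms of individual embodiments, not every embodiment contains only one independent technical solution. The specification is written this way only for clarity; those skilled in the art should treat the specification as a whole, and the technical solutions of the embodiments may be suitably combined to form other implementations that those skilled in the art can understand.
The above are only illustrative specific embodiments of the present invention and are not intended to limit its scope. Any equivalent changes, modifications, and combinations made by a person skilled in the art without departing from the concept and principles of the present invention shall fall within the scope of protection of the present invention.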

Claims (18)

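  1. A data processing device, characterized in that the data processing device comprises:
    a data collection unit (100), configured to collect data transmitted in a network and to divide the collected data into known attack data and unknown attack data according to predetermined features;
    a data conversion unit (300), configured to replace at least a portion of the content included in the unknown attack data with a corresponding identification code according to a mapping database.
  2. The data processing device according to claim 1, characterized in that the data conversion unit comprises:
    a data identification unit (310), configured to identify the content included in the unknown attack data;
    a data classification unit (330), configured to classify, according to the identification result of the data identification unit, the content of the unknown attack data that has been identified by the data identification unit.
  3. The data processing device according to claim 1, characterized in that the data conversion unit comprises:
    a data matching unit (350), configured to determine whether content in the unknown attack data is identical to historical data that was previously transmitted in the network and is included in the mapping database;
    a data replacement unit (370), configured to replace, when the content in the unknown attack data is determined to be identical to the historical data, the identical content with the identification code in the mapping database that corresponds to the historical data.
  4. The data processing device according to claim 3, characterized in that the mapping database includes identification codes corresponding to the historical data and information related to the historical data, and the data matching unit is configured to determine, according to the information related to the historical data in the mapping database, whether the content in the unknown attack data is identical to the historical data.
  5. The data processing device according to claim 4, characterized in that the information related to the historical data includes a message digest of the historical data, and the data matching unit is configured to obtain a message digest of the content in the unknown attack data and to determine whether the content in the unknown attack data is identical to the historical data according to whether the message digest of the content in the unknown attack data is identical to the message digest of the historical data.
  6. The data processing device according to claim 5, characterized in that the information related to the historical data includes the start position and length of the historical data, and the data matching unit is configured to select, according to the start position and length of the historical data, the content in the unknown attack data that is subjected to the identity check.
  7. The data processing device according to claim 3, characterized in that the data processing device further comprises:
    a mapping database generation unit (500), configured to generate the mapping database from historical data previously transmitted in the network.
  8. The data processing device according to claim 7, characterized in that the mapping database generation unit generates the mapping database from those pieces of historical data, among the historical data previously transmitted in the network, whose frequency of occurrence is greater than a predetermined threshold.
  9. The data processing device according to claim 1, characterized in that the data processing device further comprises:
    a communication unit (700), configured to send the data converted by the data conversion unit to the outside.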
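  10. A data processing method, characterized in that the data processing method comprises:
    collecting data transmitted in a network, and dividing the collected data into known attack data and unknown attack data according to predetermined features;
    replacing at least a portion of the content included in the unknown attack data with a corresponding identification code according to a mapping database.
  11. The method according to claim 10, characterized in that the converting step comprises:
    identifying the content included in the unknown attack data;
    classifying, according to the identification result, the content of the unknown attack data that has been identified by the data identification unit.
  12. The method according to claim 10, characterized in that the converting step comprises:
    determining whether content in the unknown attack data is identical to historical data that was previously transmitted in the network and is included in the mapping database;
    replacing, when the content in the unknown attack data is determined to be identical to the historical data, the identical content with the identification code in the mapping database that corresponds to the historical data.
  13. The method according to claim 12, characterized in that the mapping database includes identification codes corresponding to the historical data and information related to the historical data, and the converting step comprises:
    determining, according to the information related to the historical data in the mapping database, whether the content in the unknown attack data is identical to the historical data.
  14. The method according to claim 13, characterized in that the information related to the historical data includes a message digest of the historical data, and the converting step comprises:
    obtaining a message digest of the content in the unknown attack data, and determining whether the content in the unknown attack data is identical to the historical data according to whether the message digest of the content in the unknown attack data is identical to the message digest of the historical data.
  15. The method according to claim 14, characterized in that the information related to the historical data includes the start position and length of the historical data, and the converting step comprises:
    selecting, according to the start position and length of the historical data, the content in the unknown attack data that is subjected to the identity check.
  16. The method according to claim 12, characterized in that the method further comprises:
    generating the mapping database from historical data previously transmitted in the network.
  17. The method according to claim 16, characterized in that the step of generating the mapping database comprises:
    generating the mapping database from those pieces of historical data, among the historical data previously transmitted in the network, whose frequency of occurrence is greater than a predetermined threshold.
  18. The method according to claim 10, characterized in that the method further comprises: sending the converted data to the outside.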
PCT/CN2017/075349 2017-03-01 2017-03-01 Data processing device and method WO2018157336A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
US16/490,150 US11405358B2 (en) 2017-03-01 2017-03-01 Network security monitoring of network traffic
CN201780087242.0A CN110574348B (zh) 2017-03-01 2017-03-01 Data processing device and method
ES17898759T ES2931991T3 (es) 2017-03-01 2017-03-01 Data processing device and method
PCT/CN2017/075349 WO2018157336A1 (zh) 2017-03-01 2017-03-01 Data processing device and method
EP17898759.0A EP3576365B1 (en) 2017-03-01 2017-03-01 Data processing device and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2017/075349 WO2018157336A1 (zh) 2017-03-01 2017-03-01 Data processing device and method

Publications (1)

Publication Number Publication Date
WO2018157336A1 (zh) 2018-09-07

Family

ID=63369865

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/075349 WO2018157336A1 (zh) 2017-03-01 2017-03-01 Data processing device and method

Country Status (5)

Country Link
US (1) US11405358B2 (zh)
EP (1) EP3576365B1 (zh)
CN (1) CN110574348B (zh)
ES (1) ES2931991T3 (zh)
WO (1) WO2018157336A1 (zh)

Also Published As

Publication number Publication date
US20200007505A1 (en) 2020-01-02
CN110574348A (zh) 2019-12-13
EP3576365A1 (en) 2019-12-04
EP3576365A4 (en) 2020-09-16
CN110574348B (zh) 2022-09-27
EP3576365B1 (en) 2022-10-26
US11405358B2 (en) 2022-08-02
ES2931991T3 (es) 2023-01-05

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17898759

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2017898759

Country of ref document: EP

Effective date: 20190830