CN115840951B - Method and system for realizing network security based on full-flow asset discovery - Google Patents

Info

Publication number
CN115840951B
Authority
CN
China
Prior art keywords
asset
data
flow
metadata
discovery
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211362554.7A
Other languages
Chinese (zh)
Other versions
CN115840951A (en)
Inventor
闫印强
孙俊虎
郭寒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changyang Technology Beijing Co ltd
Original Assignee
Changyang Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changyang Technology Beijing Co ltd
Priority to CN202211362554.7A
Publication of CN115840951A
Application granted
Publication of CN115840951B
Legal status: Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00: Reducing energy consumption in communication networks
    • Y02D30/50: Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Abstract

The invention provides a method and a system for realizing network security based on full-flow asset discovery. The method comprises: configuring a mirror port on the switch where the network is located so as to capture network traffic; parsing the network traffic to generate traffic metadata, and caching the metadata using Kafka message middleware; using different threads to perform deduplication and merging of the traffic metadata and updating and warehousing of the traffic metadata, and aggregating the traffic metadata based on a time window to obtain a final data source from which new assets are discovered; writing the new assets into an es asset discovery temporary table for supplementing, and then converting them into formal assets brought into the management and control scope; and, during asset warehousing, continuously refreshing the update time of the most recent asset discovery and recording the asset state. The method improves data processing efficiency, supports both information (IT) protocols and industrial control protocols, and achieves very high discovery accuracy.

Description

Method and system for realizing network security based on full-flow asset discovery
Technical Field
The invention relates to the technical field of industrial security, and in particular to a method and a system for realizing network security based on full-flow asset discovery.
Background
With the spread of the industrial Internet and its move toward intelligent systems, enterprise assets have become numerous and complex. Because industrial control environments are network-isolated, these assets are difficult to track and manage comprehensively, faulty equipment cannot be found in time, and an attack or equipment failure can cause immeasurable loss to the enterprise.
To take stock of all assets and achieve unified management and global control, every device in an entire plant or station must be registered. For a large enterprise with thousands of devices, registering all of them is a huge and difficult task, and manual registration is prone to omission and carelessness, leading to wrong or missed entries and making asset management difficult.
Automatically discovering known and unknown assets in the network from full-traffic data enriches the means of asset discovery, gradually builds up the enterprise asset ledger, and lays a foundation for asset security management. Once an unauthorized asset connects to the network, an alarm is raised promptly, reducing the risk posed by unauthorized equipment. Industrial control assets are sorted out and discovered automatically, which addresses problems such as non-compliant configuration, unclear assets, and inaccurate ledgers. In addition, each asset's activity, communication relationships, traffic trends, and the like are summarized and analyzed to build an intelligent profile of the asset.
At present, asset discovery mainly comprises active detection, which is an active scanning technique, and Internet web asset discovery based on HTTP requests, which is passive asset discovery at the server side of network requests. Active scanning carries safety risks in industrial control scenarios, and production environments contain zones where scanning is forbidden, which leaves blind spots. Passive asset discovery finds only assets using the specific protocols in which requests occur; requests outside the scope of protocol analysis are ignored, so discovery is incomplete.
Disclosure of Invention
The invention provides a method and a system for realizing network security based on full-flow asset discovery, so as to address the shortcomings of the prior art.
In one aspect, the invention provides a method for realizing network security based on full-flow asset discovery, comprising the following steps:
S1: configuring a mirror port on the switch where the network is located, connecting the mirrored traffic to a collector, and using the collector to configure and start a collection task so as to capture the network traffic;
S2: sending the network traffic to a data analyzer through a forwarding task, the data analyzer parsing the network traffic to generate traffic metadata, and caching the traffic metadata using Kafka message middleware as a data caching layer;
S3: writing an asset discovery program based on funnel design principles, so that the asset discovery program's processing of the traffic metadata is layered into two processes: deduplication and merging of the traffic metadata, and updating and warehousing of the traffic metadata; wherein the funnel design principles comprise: first, building a distributed funnel matrix pool; second, big data distributed matrix computation; and third, stepped MR aggregation computation;
executing the two processes in different threads, aggregating the traffic metadata based on a time window to obtain a final data source, and discovering new assets based on the final data source;
S4: writing the new assets discovered by the asset discovery program into an es asset discovery temporary table, having maintenance personnel edit and save, through a web interface, the additional information required for each new asset, and then converting the new assets into formal assets brought into the management and control scope;
S5: continuously refreshing the update time of the most recent asset discovery, and comparing the difference between the current time and the update time with a preset window time so as to determine and record the asset state.
The method configures a mirror port on the switch where the network is located so as to capture network traffic; parses the network traffic to generate traffic metadata and caches it using Kafka message middleware; uses different threads to perform deduplication and merging of the traffic metadata and updating and warehousing of the traffic metadata, and aggregates the traffic metadata based on a time window to obtain a final data source from which new assets are discovered; writes the new assets found by the asset discovery program into an es asset discovery temporary table for supplementing, and then converts them into formal assets brought into the management and control scope; and, during asset warehousing, continuously refreshes the update time of the most recent asset discovery and compares the difference between the current time and the update time with a preset window time so as to determine and record the asset state. Because different threads separately maintain the merging process and the warehousing process for the cached traffic metadata, data processing efficiency is improved. Full-traffic asset discovery and identification supports protocols ranging from information (IT) protocols to industrial control protocols, with very high discovery accuracy. Discovery and identification of cross-route assets covers scenarios ranging from the simplest single local area network to multi-level cross-route access. A complete ecosystem, from collection and analysis through display and operation and maintenance, bridges the gap between upstream and downstream data and enables data interconnection and sharing. Operation management is emphasized by associating security events with assets, making assets the central subject of the security center.
In a specific embodiment, the forwarding task internally uses a thread pool for multithreaded forwarding. After receiving the traffic data, the data analyzer parses it; detailed information about the assets, such as IP addresses and MAC addresses, is obtained from the parsed traffic metadata. Parsing is a protocol analysis process and takes longer than forwarding and collection, so more processing threads are allocated to it; this avoids a data backlog and meets the real-time requirement, and multithreading also improves sending efficiency.
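As a minimal sketch of thread-pool forwarding, the following Python example forwards several capture files concurrently. The function forward_capture, the file paths, and the pool size are hypothetical; the patent does not specify the transport or the pool configuration.

```python
from concurrent.futures import ThreadPoolExecutor


def forward_capture(path: str) -> str:
    """Hypothetical stand-in for sending one capture file to the data analyzer."""
    # A real forwarder would transmit the file over the network here.
    return f"forwarded:{path}"


def forward_all(paths, max_workers=4):
    """Forward capture files concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(forward_capture, paths))


results = forward_all(["/data/a.pcap", "/data/b.pcap", "/data/c.pcap"])
```

Raising max_workers is one way to give the slower stage more threads, which is the balance the paragraph above describes.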
In a specific embodiment, the network traffic is encrypted before the forwarding task is executed. In view of the higher requirement of industrial control environment data security, the method can improve the security while achieving performance and efficiency.
In a specific embodiment, step S2 specifically comprises:
S201: the parsing program consumes data from Kafka, where each message contains the absolute path of a pcap file; tshark is called to parse the file, and the parsed data is split, standardized, and formatted, then packaged as JSON and written to another Kafka topic;
S202: creating a Kafka consumption table all_kafka_queue in ClickHouse, configured to consume automatically whenever there is data in flow_meta_topic;
S203: creating a view in ClickHouse whose field names and types are consistent with those of the Kafka consumption table; the view itself stores no data;
S204: creating a data table in ClickHouse for storing the traffic metadata obtained by traffic parsing, and using the view to map the data in the Kafka consumption table into the data table. The data to be parsed must be cached before parsing; otherwise, traffic peaks seriously affect the stability of the program. Caching also keeps the programs modular and decoupled from one another, reducing strong dependencies among them and making problems quicker to locate and repair. An ordinary cache can hardly cope with massive data; after various tests, Kafka message middleware was selected as the data caching layer. Its zero-copy technique allows a single node to sustain more than ten thousand operations per second, and the system can be scaled horizontally according to traffic volume, which greatly improves its flexibility and scalability.
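The split, standardize, and package step of S201 can be sketched in Python as follows. This is an illustration only: the field order, the field names in FIELDS, and the sample line are assumptions, chosen to match tab-separated output such as tshark produces with -T fields; the Kafka read and write are omitted.

```python
import json

# Assumed field order, matching a hypothetical call such as:
#   tshark -r <pcap> -T fields -e frame.time_epoch -e eth.src -e ip.src -e eth.dst -e ip.dst
FIELDS = ["ts", "src_mac", "src_ip", "dst_mac", "dst_ip"]


def normalize_line(line: str) -> str:
    """Split one tab-separated line, standardize it, and return a JSON string."""
    parts = line.rstrip("\n").split("\t")
    record = dict(zip(FIELDS, parts))
    record["ts"] = float(record["ts"])                 # numeric timestamp
    record["src_mac"] = record["src_mac"].lower()      # one canonical MAC form
    record["dst_mac"] = record["dst_mac"].lower()
    return json.dumps(record, sort_keys=True)


sample = "1667300000.5\t00:1A:2B:3C:4D:5E\t10.0.0.5\t00:1A:2B:00:00:01\t10.0.0.1"
message = normalize_line(sample)
```

The resulting string is what would be written to the second Kafka topic for ClickHouse to consume.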
In a specific embodiment, step S3 further comprises:
having the different threads take data from different topics so as to share data among them, and sending the deduplicated and merged data to Kafka for use by multiple business programs. In this way the underlying business systems only need to interface with a small volume of data, which saves system resources; in addition, the warehousing thread and the merging thread each maintain their own internal cache objects, so the data can still be processed efficiently even under tens of thousands of records per second.
In a specific embodiment, the number of threads in the method is scalable, which makes the system elastic, extensible, and easy to maintain.
In a specific embodiment, the es asset discovery temporary table stores in advance the IP, MAC, MAC vendor, source collector MAC, security zone, location area, and discovery time of the new asset.
In a specific embodiment, comparing the difference between the current time and the update time with a preset window time so as to determine and record the asset state specifically comprises:
if the difference is greater than the window time, the asset is offline, and the offline time and offline duration are recorded;
if the difference is smaller than the window time, the asset is online, and the online time and online duration are recorded.
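The state decision above can be sketched as a small Python function. The names AssetStatus and judge_status are hypothetical, and treating a difference exactly equal to the window as online is an assumption, since the text does not cover that case.

```python
from dataclasses import dataclass


@dataclass
class AssetStatus:
    state: str       # "online" or "offline"
    elapsed: float   # seconds between the last update time and now


def judge_status(now: float, last_update: float, window: float) -> AssetStatus:
    """Compare (now - last_update) with the preset window time."""
    elapsed = now - last_update
    # Assumption: a difference exactly equal to the window counts as online.
    state = "offline" if elapsed > window else "online"
    return AssetStatus(state=state, elapsed=elapsed)


stale = judge_status(now=1000.0, last_update=400.0, window=300.0)
fresh = judge_status(now=1000.0, last_update=900.0, window=300.0)
```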
In a specific embodiment, when the new asset is discovered, the device vendor information is enriched by looking up the MAC address in a device vendor library, and the location area information is enriched at the same time.
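One plausible way to associate a MAC address with a vendor is to look up its first three octets (the OUI) in a table. The Python sketch below assumes that approach; the table contents, the vendor names, and the function lookup_vendor are placeholders, not data from the patent or from any real registry.

```python
# Hypothetical vendor library keyed by the first three octets (OUI) of a MAC address.
# The prefixes and names are placeholders for illustration only.
VENDOR_LIBRARY = {
    "aa:bb:cc": "Example Vendor A",
    "11:22:33": "Example Vendor B",
}


def lookup_vendor(mac: str, library=VENDOR_LIBRARY) -> str:
    """Return the vendor for a MAC address, or "unknown" if the prefix is not listed."""
    prefix = mac.lower().replace("-", ":")[:8]  # e.g. "aa:bb:cc"
    return library.get(prefix, "unknown")


vendor = lookup_vendor("AA-BB-CC-01-02-03")
```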
In a specific embodiment, collectors are deployed in a distributed manner to mirror network traffic in different areas, and multiple mirrored traffic streams are connected to a single collector so that they are collected simultaneously. Because an enterprise's internal network is complex and divided into many isolated local area network zones, it is very difficult to collect all network traffic on one switch; the collection devices therefore need to be deployed in a distributed manner so that traffic from as many areas as possible is mirrored. Connecting multiple mirrored streams to one collector greatly improves the utilization of system resources and saves construction cost. In addition, full-traffic capture uses the mature tshark tool: filter conditions can be added to remove interfering or unwanted packets, or full traffic can be captured without any filter condition. This flexible capture configuration makes it easy to handle different industrial control scenarios.
In a specific embodiment, discovering new assets based on the final data source specifically comprises:
discovering the new asset through the uniqueness of its MAC address, and identifying routing devices and cross-route access devices. Once the traffic metadata enters the caching layer, the asset discovery program has a data source. Because full traffic is collected, frequent communication among assets in the network produces a large amount of repeated data, which must be aggregated and converged; the data left after time-window aggregation is the final data source for asset discovery, and new assets are discovered through the uniqueness of their MAC addresses. Routing devices and cross-route access devices can also be identified, since they too are among the devices targeted by asset discovery.
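A minimal Python sketch of time-window aggregation followed by MAC-based discovery is given below. The record layout, the 60-second window, and the function names are assumptions made for illustration; identification of routing devices is not shown because the text does not describe how it is done.

```python
def aggregate_window(records, window_seconds):
    """Merge records that share the same (window index, MAC) into one entry."""
    merged = {}
    for rec in records:
        key = (int(rec["ts"] // window_seconds), rec["mac"])
        entry = merged.setdefault(
            key, {"mac": rec["mac"], "ips": set(), "last_seen": rec["ts"]}
        )
        entry["ips"].add(rec["ip"])
        entry["last_seen"] = max(entry["last_seen"], rec["ts"])
    return list(merged.values())


def discover_new_assets(aggregated, known_macs):
    """Return MACs not seen before, in first-seen order, and mark them as known."""
    new = []
    for entry in aggregated:
        if entry["mac"] not in known_macs:
            known_macs.add(entry["mac"])
            new.append(entry["mac"])
    return new


records = [
    {"ts": 0.0, "mac": "aa:bb:cc:00:00:01", "ip": "10.0.0.5"},
    {"ts": 12.0, "mac": "aa:bb:cc:00:00:01", "ip": "10.0.0.5"},
    {"ts": 20.0, "mac": "aa:bb:cc:00:00:02", "ip": "10.0.0.6"},
    {"ts": 70.0, "mac": "aa:bb:cc:00:00:01", "ip": "10.0.0.5"},
]
aggregated = aggregate_window(records, window_seconds=60)
new_assets = discover_new_assets(aggregated, known_macs={"aa:bb:cc:00:00:02"})
```

In this sample, four raw records collapse to three aggregated entries, and only the MAC that is not already known is reported as a new asset.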
According to a second aspect of the present invention, a computer-readable storage medium is presented, on which a computer program is stored, which computer program, when being executed by a computer processor, carries out the above-mentioned method.
According to a third aspect of the present invention, there is provided a system for implementing network security based on full traffic asset discovery, the system comprising:
an acquisition task configuration module, configured to configure a mirror port on the switch where the network is located, connect the mirrored traffic to a collector, and use the collector to configure and start a collection task so as to capture the network traffic;
a traffic analysis module, configured to have a data analyzer parse the network traffic to generate traffic metadata, and to cache the traffic metadata using Kafka message middleware as a data caching layer;
an asset discovery and warehousing module, configured to write an asset discovery program based on funnel design principles, so that the asset discovery program's processing of the traffic metadata is layered into two processes: deduplication and merging of the traffic metadata, and updating and warehousing of the traffic metadata;
the two processes being executed in different threads, the traffic metadata being aggregated based on a time window to obtain a final data source, and new assets being discovered based on the final data source;
an asset completion module, configured to have maintenance personnel edit and save, through a web interface, the additional information required for each new asset, and then convert the new asset into a formal asset brought into the management and control scope;
an asset status recording module, configured to continuously refresh the update time of the most recent asset discovery, and compare the difference between the current time and the update time with a preset window time so as to determine and record the asset state.
The invention configures a mirror port on the switch where the network is located so as to capture network traffic; parses the network traffic to generate traffic metadata and caches it using Kafka message middleware; uses different threads to perform deduplication and merging of the traffic metadata and updating and warehousing of the traffic metadata, and aggregates the traffic metadata based on a time window to obtain a final data source from which new assets are discovered; writes the new assets found by the asset discovery program into an es asset discovery temporary table for supplementing, and then converts them into formal assets brought into the management and control scope; and, during asset warehousing, continuously refreshes the update time of the most recent asset discovery and compares the difference between the current time and the update time with a preset window time so as to determine and record the asset state. Because different threads separately maintain the merging process and the warehousing process for the cached traffic metadata, data processing efficiency is improved. Full-traffic asset discovery and identification supports protocols ranging from information (IT) protocols to industrial control protocols, with very high discovery accuracy. Discovery and identification of cross-route assets covers scenarios ranging from the simplest single local area network to multi-level cross-route access. A complete ecosystem, from collection and analysis through display and operation and maintenance, bridges the gap between upstream and downstream data and enables data interconnection and sharing. Operation management is emphasized by associating security events with assets, making assets the central subject of the security center.
Drawings
The accompanying drawings are included to provide a further understanding of the embodiments and are incorporated in and constitute a part of this specification. The drawings illustrate embodiments and, together with the description, serve to explain the principles of the invention. Other embodiments and many of their intended advantages will be readily appreciated as they become better understood by reference to the following detailed description. Other features, objects, and advantages of the present application will become more apparent upon reading the detailed description of non-limiting embodiments, made with reference to the following drawings, in which:
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of a method of implementing network security based on full traffic asset discovery in accordance with one embodiment of the invention;
FIG. 3 is a diagram of acquisition task configuration information, in accordance with a specific embodiment of the present invention;
FIG. 4 is forwarding task configuration information for one particular embodiment of the present invention;
FIG. 5 is a flow resolution flow chart of a particular embodiment of the present invention;
FIG. 6 is a flow chart of asset merge and warehousing of a particular embodiment of the invention;
FIG. 7 is an asset completion interface of a particular embodiment of the invention;
FIG. 8 is an asset status record flow of a particular embodiment of the invention;
FIG. 9 is an asset discovery architecture diagram of a particular embodiment of the invention;
FIG. 10 is a diagram of an exemplary deployment architecture for an industrial enterprise in accordance with a specific embodiment of the present invention;
FIG. 11 is a framework diagram of a system for realizing network security based on full-flow asset discovery in accordance with one embodiment of the invention;
FIG. 12 is a schematic diagram of a computer system suitable for use in implementing embodiments of the present application;
FIG. 13 is a schematic illustration of a funnel process according to an embodiment of the present invention.
Detailed Description
The present application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
It should be noted that, in the case of no conflict, the embodiments and features in the embodiments may be combined with each other. The present application will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
FIG. 1 illustrates an exemplary system architecture 100 to which a method for realizing network security based on full-flow asset discovery according to embodiments of the present application may be applied.
As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various applications, such as a data processing class application, a data visualization class application, a web browser application, and the like, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices including, but not limited to, smartphones, tablets, laptop and desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the above-listed electronic devices, and may be implemented as multiple software programs or software modules (e.g., software or software modules for providing distributed services) or as a single software program or software module. The present invention is not particularly limited herein.
The server 105 may be a server providing various services, such as a background information processing server providing support for network traffic presented on the terminal devices 101, 102, 103. The background information processing server may process the acquired traffic metadata and generate processing results (e.g., new assets).
It should be noted that, the method provided in the embodiment of the present application may be executed by the server 105, or may be executed by the terminal devices 101, 102, 103, and the corresponding apparatus is generally disposed in the server 105, or may be disposed in the terminal devices 101, 102, 103.
The server may be hardware or software. When the server is hardware, the server may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. When the server is software, it may be implemented as a plurality of software or software modules (e.g., software or software modules for providing distributed services), or as a single software or software module. The present invention is not particularly limited herein.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 2 shows a flowchart of a method for implementing network security based on full traffic asset discovery according to an embodiment of the invention. As shown in fig. 2, the method comprises the steps of:
S1: configuring a mirror port on the switch where the network is located, connecting the mirrored traffic to a collector, and using the collector to configure and start a collection task so as to capture the network traffic.
Fig. 3 illustrates acquisition task configuration information for a particular embodiment of the present invention.
In a specific embodiment, collectors are deployed in a distributed manner to mirror network traffic in different areas, and multiple mirrored traffic streams are connected to a single collector so that they are collected simultaneously.
In a specific embodiment, mirrored traffic from more than four switches can be connected to one collector at the same time.
S2: sending the network traffic to a data analyzer through a forwarding task, the data analyzer parsing the network traffic to generate traffic metadata, and caching the traffic metadata using Kafka message middleware as a data caching layer.
Fig. 4 is forwarding task configuration information for a particular embodiment of the present invention.
In a specific embodiment, the forwarding task internally uses a thread pool for multithreaded forwarding. After receiving the traffic data, the data analyzer parses it; detailed information about the assets, such as IP addresses and MAC addresses, is obtained from the parsed traffic metadata. Parsing is a protocol analysis process and takes longer than forwarding and collection, so more processing threads are allocated to it; this avoids a data backlog and meets the real-time requirement, and multithreading also improves sending efficiency.
In a specific embodiment, the network traffic is encrypted before the forwarding task is executed. In view of the higher requirement of industrial control environment data security, the method can improve the security while achieving performance and efficiency.
In a specific embodiment, step S2 specifically comprises:
S201: the parsing program consumes data from Kafka, where each message contains the absolute path of a pcap file; tshark is called to parse the file, and the parsed data is split, standardized, and formatted, then packaged as JSON and written to another Kafka topic;
S202: creating a Kafka consumption table all_kafka_queue in ClickHouse, configured to consume automatically whenever there is data in flow_meta_topic;
S203: creating a view in ClickHouse whose field names and types are consistent with those of the Kafka consumption table; the view itself stores no data;
S204: creating a data table in ClickHouse for storing the traffic metadata obtained by traffic parsing, and using the view to map the data in the Kafka consumption table into the data table. The data to be parsed must be cached before parsing; otherwise, traffic peaks seriously affect the stability of the program. Caching also keeps the programs modular and decoupled from one another, reducing strong dependencies among them and making problems quicker to locate and repair. An ordinary cache can hardly cope with massive data; after various tests, Kafka message middleware was selected as the data caching layer. Its zero-copy technique allows a single node to sustain more than ten thousand operations per second, and the system can be scaled horizontally according to traffic volume, which greatly improves its flexibility and scalability.
Fig. 5 is a flow resolution flow chart of a specific embodiment of the present invention.
In a specific embodiment, detailed information of the traffic asset including the IP address and Mac address is obtained from the traffic metadata.
S3: writing an asset discovery program based on funnel design principles, so that the asset discovery program's processing of the traffic metadata is layered into two processes: deduplication and merging of the traffic metadata, and updating and warehousing of the traffic metadata;
executing the two processes in different threads, aggregating the traffic metadata based on a time window to obtain a final data source, and discovering new assets based on the final data source.
Fig. 13 is a schematic diagram of a funnel processing procedure according to an embodiment of the present invention. As shown in fig. 13, the funnel processing procedure includes the following three steps:
The first step: building a distributed funnel matrix pool. A traditional design has only one funnel pool, which has weak concurrent processing capacity and overflows easily, seriously limiting the processing capacity of the program. The invention sets up a plurality of data partitions, each of which is equivalent to a pool, and provides sufficient capacity for massive data, thereby constructing a big data distributed computing infrastructure.
The second step: big data distributed matrix computation. A distributed computing program is written in which each partition corresponds to one worker, namely a processing thread, so that the concurrent processing capacity of the program is fundamentally improved. According to the service scenario and the data type, the result produced by each worker thread is funneled again to form layered funnel computation, in which the data volume decreases from layer to layer while the value of the data increases.
The third step: stepped MR aggregation computation. The results produced by the distributed computation of each funnel layer are converged to an MR computing center, where the MR computing nodes aggregate and merge the data and redistribute it to the next funnel matrix pool for more complex and more time-consuming processing. Through progressive funneling and MR aggregation, the data is analyzed and processed layer by layer, ultimately leaving high-quality data. Finally, the ClickHouse and Elasticsearch big data storage components are used to persist the data, providing a data warehouse for subsequent maintenance, analysis and visualization.
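The three steps above can be sketched on a single machine with a thread pool. This is an illustration under stated assumptions, not the patented implementation: the partitioning rule (a byte sum of the MAC), the per-worker reduction (counting records per MAC and IP pair), and all function names are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(records, n):
    """Step 1: spread the records over n pools using a stable hash of the MAC."""
    pools = [[] for _ in range(n)]
    for rec in records:
        pools[sum(rec["mac"].encode()) % n].append(rec)
    return pools


def worker(pool):
    """Step 2: one worker per pool shrinks its pool to per-(mac, ip) counts."""
    return Counter((rec["mac"], rec["ip"]) for rec in pool)


def mr_merge(partials):
    """Step 3: merge the partial results into one smaller data set."""
    total = Counter()
    for part in partials:
        total.update(part)  # adds counts for keys that are already present
    return total


def funnel(records, n_partitions=4):
    pools = partition(records, n_partitions)
    with ThreadPoolExecutor(max_workers=n_partitions) as executor:
        partials = list(executor.map(worker, pools))
    return mr_merge(partials)
```

The merged result could in turn be partitioned again and passed to a further layer of workers, which is how each layer ends up with less data than the one before it.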
FIG. 6 is a flow chart of asset merge and warehousing of one particular embodiment of the invention.
In a specific embodiment, the discovering new assets based on the final data source specifically includes:
the new asset is discovered by the uniqueness of the MAC address, and the routing device and the cross-route access device are identified.
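The source does not spell out how routing devices are recognized. One plausible heuristic, shown here purely as an assumption, is that a MAC address observed together with many different IP addresses belongs to a routing device that forwards traffic for hosts behind it. The threshold max_ips_per_host and the function name are hypothetical.

```python
def classify(observations, known_macs, max_ips_per_host=3):
    """Split observed MACs into new assets and probable routing devices.

    observations: iterable of (mac, ip) pairs taken from the final data source.
    known_macs:   set of MAC addresses that are already managed assets.
    """
    ips_by_mac = {}
    for mac, ip in observations:
        ips_by_mac.setdefault(mac, set()).add(ip)

    new_assets, routers = [], []
    for mac, ips in sorted(ips_by_mac.items()):
        if len(ips) > max_ips_per_host:
            routers.append(mac)      # assumed rule: many IPs behind one MAC
        if mac not in known_macs:
            new_assets.append(mac)   # MAC uniqueness decides what counts as new
    return new_assets, routers
```

Under this assumption, the IP addresses seen behind a flagged MAC would be candidates for cross-route access devices.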
In a specific embodiment, the number of threads in the method is scalable, which makes the system elastic, extensible, and easy to maintain.
In a specific embodiment, the step S3 further includes:
data is taken from different topics so that it can be shared among different threads, and the deduplicated and merged data is sent to Kafka for use by multiple business programs. This ensures that the underlying business systems only need to interface with a small volume of data, thereby saving system resources. In addition, the warehousing thread and the merging thread each maintain their own internal cache objects, so the data can still be processed efficiently even under bursts of tens of thousands of records per second.
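The division of labor between the two threads can be sketched with in-memory queues standing in for the Kafka topics. The queues, the sentinel, and the function names are all assumptions made for this illustration; a real deployment would use Kafka consumers and producers instead.

```python
import queue
import threading

STOP = object()  # sentinel that tells a thread to finish


def merge_worker(raw_topic, merged_topic):
    seen = set()  # cache owned by the merge thread only
    while True:
        item = raw_topic.get()
        if item is STOP:
            merged_topic.put(STOP)
            return
        key = (item["mac"], item["ip"])
        if key not in seen:          # forward each (mac, ip) pair only once
            seen.add(key)
            merged_topic.put(item)


def store_worker(merged_topic, store):
    cache = {}  # cache owned by the warehousing thread only
    while True:
        item = merged_topic.get()
        if item is STOP:
            store.update(cache)      # single flush; real code would batch
            return
        cache[(item["mac"], item["ip"])] = item


def run_pipeline(records):
    raw_topic, merged_topic, store = queue.Queue(), queue.Queue(), {}
    threads = [
        threading.Thread(target=merge_worker, args=(raw_topic, merged_topic)),
        threading.Thread(target=store_worker, args=(merged_topic, store)),
    ]
    for thread in threads:
        thread.start()
    for rec in records:
        raw_topic.put(rec)
    raw_topic.put(STOP)
    for thread in threads:
        thread.join()
    return store
```

Because each thread touches only its own cache, no lock is needed around the caches themselves; the queues are the only shared objects.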
S4: the new asset found by the asset discovery program is written into an es asset discovery temporary table; a maintainer edits and saves, through web visualization, the additional information that needs to be supplemented for the new asset, after which the new asset is converted into a formal asset and brought into the management and control range.
FIG. 7 is an asset completion interface of one particular embodiment of the invention.
In a specific embodiment, the es asset discovery temporary table stores in advance the IP, MAC, MAC vendor, source collector MAC, security zone, location area, and discovery time of the new asset.
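A record in the temporary table could be modeled as follows. The seven fields mirror the list above; the Python field names, the string types, and the helper to_document are assumptions, and no particular Elasticsearch client call is implied.

```python
from dataclasses import asdict, dataclass


@dataclass
class DiscoveredAsset:
    ip: str
    mac: str
    mac_vendor: str
    source_collector_mac: str
    security_zone: str
    location_area: str
    discovery_time: str  # assumed to be an ISO 8601 timestamp string


def to_document(asset):
    """Turn the record into a plain dict that can be serialized as JSON."""
    return asdict(asset)
```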
S5: continuously refreshing the update time of the last new asset discovery, and comparing the difference between the current time and the update time with a preset window time so as to judge and record the asset state.
FIG. 8 is an asset status recording flow of a particular embodiment of the invention.
In a specific embodiment, the data set generated by the deduplication and merging is consumed from Kafka, the update time of the last new asset found is continuously refreshed by using the asset discovery program, and the difference between the current time and the update time is compared with a preset window time, so that the asset state is judged and recorded.
In a specific embodiment, the comparing the difference between the current time and the updated time with a preset window time, so as to determine and record the asset status specifically includes:
if the difference is greater than the window time, the asset is offline, and the offline time and the offline duration are recorded;
and if the difference is smaller than the window time, the asset is online, and the online time and the online duration are recorded.
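The comparison can be written as a small function. Several details are assumptions: times are epoch seconds, the offline time is taken to be the last update time, the online time is taken to be a separately tracked online_since value, and a difference exactly equal to the window is treated as online, a case the text leaves open.

```python
def asset_status(now, last_update, online_since, window):
    """Return (state, since, duration) for one asset.

    now, last_update, online_since are epoch seconds; window is in seconds.
    """
    elapsed = now - last_update
    if elapsed > window:
        # Not seen within the window: offline since the last update (assumed).
        return ("offline", last_update, elapsed)
    # Seen within the window: online since it first came online (assumed).
    return ("online", online_since, now - online_since)
```

For example, with a 60-second window, an asset last updated 100 seconds ago is reported offline with a duration of 100 seconds.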
In a specific embodiment, upon discovery of the new asset, the device vendor information is enriched by associating the MAC address with a device vendor library, and the location area information is enriched at the same time.
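A minimal sketch of this enrichment is shown below. The vendor library and the collector-to-location mapping are hypothetical placeholder tables, as is the idea of deriving the location area from the source collector; a real vendor library would be keyed by the first three octets of the MAC address in the same way.

```python
# Placeholder lookup tables; every entry is invented for illustration.
VENDOR_LIBRARY = {
    "aa:bb:cc": "Example Vendor A",
    "dd:ee:ff": "Example Vendor B",
}
COLLECTOR_LOCATIONS = {
    "aa:bb:cc:00:00:ff": "workshop-a",
}


def normalize_mac(mac):
    return mac.lower().replace("-", ":")


def enrich(asset):
    """Return a copy of the asset with vendor and location area filled in."""
    mac = normalize_mac(asset["mac"])
    collector = normalize_mac(asset["source_collector_mac"])
    enriched = dict(asset)
    # The first 8 characters ("xx:xx:xx") are the vendor prefix of the MAC.
    enriched["mac_vendor"] = VENDOR_LIBRARY.get(mac[:8], "unknown")
    enriched["location_area"] = COLLECTOR_LOCATIONS.get(collector, "unknown")
    return enriched
```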
FIG. 9 is an overall architecture diagram of asset discovery of a specific embodiment of the invention. As shown in FIG. 9, the asset discovery system consists of four parts: acquisition, flow analysis, asset discovery, and asset completion. The acquisition part is handled by the intelligent collector, which captures the full traffic of the switch's mirror port; the flow analysis part parses the traffic into flow metadata and writes it into the message middleware; the asset discovery module consumes the flow metadata to discover in-network assets, then enriches them and puts them into storage; and the completion module is used to check and supplement information such as the size and importance level of the asset through a web page.
Based on research on network traffic in multiple scenarios, the technical solution has the following advantages and functions:
1. Full-flow asset discovery and identification: protocols ranging from information protocols to industrial control protocols are supported, and the discovery accuracy is very high.
2. Cross-route asset discovery and identification: everything from the simplest single LAN to complex multi-level cross-route access is covered.
3. A complete ecosystem: from collection and analysis to display and operation and maintenance, the barriers between upstream and downstream data are removed, and data interconnection and sharing is realized.
4. Emphasis on operation management: security events are associated with assets, making assets the main subject of the security center.
The application scenarios of the invention are as follows:
The invention can be used in an industrial control situational awareness system. It monitors every link of network security through full-flow collection with zero intrusion, and analyzes weak points in security by discovering and sorting out the asset landscape of the whole plant. Meanwhile, risk linkage can be realized by associating security events with assets, so that risky assets are located at the first moment and the losses an enterprise suffers from network security incidents are minimized. Fig. 10 shows a typical deployment architecture diagram for an industrial enterprise according to a specific embodiment of the present invention. As shown in the figure, the first area and the second area are generally production areas, and the third area is an information management area. A collector is deployed in each area to collect log data and full-flow data; after preliminary processing, the data is sent to a full-flow log analysis system. The asset discovery program is deployed there: assets are discovered and identified through flow analysis and automatically put into storage, and on-site operators then complete their information and convert them into controlled assets. The full-flow log analysis system is further provided with CEP rules, flow rules, AI flow learning, an information library, a vulnerability library and the like, so that abnormal traffic can be analyzed in time, security alarm events can be raised, risky assets can be located, and an emergency response mechanism can be started. Risks thus become visible, manageable, locatable and operable, which safeguards enterprise network security.
FIG. 11 illustrates a framework diagram of a system for implementing network security based on full-flow asset discovery in accordance with one embodiment of the invention. The system includes an acquisition task configuration module 1101, a flow analysis module 1102, an asset discovery and warehousing module 1103, an asset completion module 1104, and an asset status recording module 1105.
In a specific embodiment, the acquisition task configuration module 1101 is configured to perform mirror image port configuration on a switch where the network is located, access mirror image traffic to an acquisition device, and configure and start an acquisition task by using the acquisition device so as to capture network traffic;
the flow analysis module 1102 is configured to send the network flow to a data analyzer through a forwarding task, the data analyzer performs flow analysis processing on the network flow to generate flow metadata, and caches the flow metadata by using Kafka message middleware as a data caching layer;
the asset discovery and warehousing module 1103 is configured to write an asset discovery program based on the funnel design principle, thereby layering the processing of the flow metadata by the asset discovery program into two processes: deduplication and merging of the flow metadata, and updating and warehousing of the flow metadata;
the two processes are executed by different threads, the flow metadata is aggregated based on a time window to obtain a final data source, and new assets are discovered based on the final data source;
the asset completion module 1104 is configured to write the new asset found by the asset discovery program into an es asset discovery temporary table, and a maintainer edits and saves the information related to the new asset, which needs additional supplementation, through web visualization, and then converts the new asset into a formal asset to be brought into a management and control range;
the asset status recording module 1105 is configured to continuously refresh the update time of the last discovery of a new asset, compare the difference between the current time and the update time with a preset window time, and thereby determine and record the asset status.
The system configures a mirror port on the switch where the network is located so as to capture network traffic; performs flow analysis processing on the network traffic to generate flow metadata, which is cached using Kafka message middleware; uses different threads to perform the deduplication and merging of the flow metadata and the updating and warehousing of the flow metadata, and aggregates the flow metadata based on a time window to obtain a final data source from which new assets are discovered; writes the new assets found by the asset discovery program into an es asset discovery temporary table for completion, and then converts them into formal assets brought into the management and control range; and, during asset warehousing, continuously refreshes the update time of the last new asset discovery and compares the difference between the current time and the update time with a preset window time so as to determine and record the asset status. Because different threads maintain the merging process and the warehousing process of the cached flow metadata, data processing efficiency is improved. Full-flow asset discovery and identification supports protocols ranging from information protocols to industrial control protocols, with very high discovery accuracy. Cross-route asset discovery and identification covers everything from the simplest single local area network to complex multi-level cross-route access. With a complete ecosystem spanning collection and analysis through display and operation and maintenance, the barriers between upstream and downstream data are removed and data interconnection and sharing is realized. Operation management is emphasized: security events are associated with assets, making assets the main subject of the security center.
Referring now to FIG. 12, there is illustrated a schematic diagram of a computer system 1200 suitable for use in implementing the electronic device of the present application. The electronic device shown in fig. 12 is only an example, and should not impose any limitation on the functionality and scope of use of the embodiments of the present application.
As shown in fig. 12, the computer system 1200 includes a Central Processing Unit (CPU) 1201, which can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1202 or a program loaded from a storage section 1208 into a Random Access Memory (RAM) 1203. In the RAM1203, various programs and data required for the operation of the system 1200 are also stored. The CPU 1201, ROM 1202, and RAM1203 are connected to each other through a bus 1204. An input/output (I/O) interface 1205 is also connected to the bus 1204.
The following components are connected to the I/O interface 1205: an input section 1206 including a keyboard, a mouse, and the like; an output section 1207 including a display such as a Liquid Crystal Display (LCD) and a speaker; a storage section 1208 including a hard disk or the like; and a communication section 1209 including a network interface card such as a LAN card, a modem, or the like. The communication section 1209 performs communication processing via a network such as the internet. The drive 1210 is also connected to the I/O interface 1205 as needed. A removable medium 1211 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed as needed on the drive 1210 so that a computer program read out therefrom is installed into the storage section 1208 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program can be downloaded and installed from a network via the communication section 1209, and/or installed from the removable medium 1211. The above-described functions defined in the method of the present application are performed when the computer program is executed by the Central Processing Unit (CPU) 1201. It should be noted that the computer readable medium described in the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
In the present application, however, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations of the present application may be written in one or more programming languages, including object oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules involved in the embodiments described in the present application may be implemented by software or by hardware. The units described may also be provided in a processor, and the names of these units do not, in some cases, constitute a limitation of the units themselves.
Embodiments of the present invention also relate to a computer readable storage medium having stored thereon a computer program which, when executed by a computer processor, implements the method as described above. The computer program contains program code for performing the method shown in the flow chart. It should be noted that the computer readable medium of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two.
The invention configures a mirror port on the switch where the network is located so as to capture network traffic; performs flow analysis processing on the network traffic to generate flow metadata, which is cached using Kafka message middleware; uses different threads to perform the deduplication and merging of the flow metadata and the updating and warehousing of the flow metadata, and aggregates the flow metadata based on a time window to obtain a final data source from which new assets are discovered; writes the new assets found by the asset discovery program into an es asset discovery temporary table for completion, and then converts them into formal assets brought into the management and control range; and, during asset warehousing, continuously refreshes the update time of the last new asset discovery and compares the difference between the current time and the update time with a preset window time so as to determine and record the asset status. Because different threads maintain the merging process and the warehousing process of the cached flow metadata, data processing efficiency is improved. Full-flow asset discovery and identification supports protocols ranging from information protocols to industrial control protocols, with very high discovery accuracy. Cross-route asset discovery and identification covers everything from the simplest single local area network to complex multi-level cross-route access. With a complete ecosystem spanning collection and analysis through display and operation and maintenance, the barriers between upstream and downstream data are removed and data interconnection and sharing is realized. Operation management is emphasized: security events are associated with assets, making assets the main subject of the security center.
The foregoing description covers only the preferred embodiments of the present application and explains the technical principles that are used. It will be appreciated by persons skilled in the art that the scope of the invention referred to in this application is not limited to the specific combinations of features described above, but is intended to cover other embodiments formed by any combination of the features described above or their equivalents without departing from the spirit of the invention, for example, embodiments in which the features described above are replaced with technical features having similar functions disclosed in (but not limited to) the present application.

Claims (12)

1. A method for implementing network security based on full-flow asset discovery, characterized by comprising the following steps:
S1: configuring a mirror port on the switch where the network is located, connecting the mirrored traffic to a collector, and using the collector to configure and start a collection task so as to capture the network traffic;
S2: sending the network traffic to a data analyzer through a forwarding task, the data analyzer performing flow analysis processing on the network traffic to generate flow metadata, and caching the flow metadata using Kafka message middleware as a data caching layer;
S3: writing an asset discovery program based on funnel design principles, thereby layering the processing of the flow metadata by the asset discovery program into two processes: deduplication and merging of the flow metadata, and updating and warehousing of the flow metadata; the two processes are executed by different threads, the flow metadata is aggregated based on a time window to obtain a final data source, and new assets are discovered based on the final data source; wherein the funnel design principles include: a first step of building a distributed funnel matrix pool; a second step of big data distributed matrix computation; and a third step of stepped MR aggregation computation;
S4: writing the new assets found by the asset discovery program into an es asset discovery temporary table, a maintainer editing and saving, through web visualization, the additional information that needs to be supplemented for the new assets, and then converting the new assets into formal assets brought into the management and control range;
S5: continuously refreshing the update time of the last new asset discovery, and comparing the difference between the current time and the update time with a preset window time so as to determine and record the asset status;
wherein step S2 specifically comprises the following steps:
S201: an analysis program consumes data from Kafka, the data content being the absolute path of a pcap file, and then calls tshark to parse the data; the parsed data is segmented, standardized and formatted, packaged into JSON, and written into another topic of Kafka;
S202: creating a Kafka consumption table all_kafka_queue in ClickHouse, configured to consume automatically when there is data in flow_meta_topic;
S203: creating a view table in ClickHouse, keeping the field names and types in the view table consistent with the Kafka consumption table;
S204: creating a data table in ClickHouse for storing the flow metadata obtained by flow analysis processing, and mapping the data in the Kafka consumption table into the data table by means of the view table.
2. The method of claim 1, wherein the forwarding task performs multi-threaded forwarding through an internal thread pool.
3. The method of claim 1, wherein the network traffic is encrypted prior to execution of the forwarding task.
4. The method of claim 1, wherein S3 further comprises:
taking data from different topics so that it can be shared among different threads, and sending the deduplicated and merged data to Kafka for use by multiple business programs.
5. The method of claim 1, wherein the number of threads in the method is scalable.
6. The method of claim 1, wherein the es asset discovery temporary table stores in advance the IP, MAC, MAC vendor, source collector MAC, security zone, location area, and discovery time of the new asset.
7. The method of claim 1, wherein comparing the difference between the current time and the updated time with a predetermined window time to determine and record asset status comprises:
if the difference is greater than the window time, the asset is offline, and the offline time and the offline duration are recorded;
and if the difference is smaller than the window time, the asset is online, and the online time and the online duration are recorded.
8. The method of claim 1, wherein, upon discovery of the new asset, the device vendor information is enriched by associating the MAC address with a device vendor library, and the location area information is enriched at the same time.
9. The method of claim 1, wherein mirroring is configured in a distributed manner for the network traffic of different areas, so that multiple mirrored traffic streams are connected to a single collector and collected simultaneously.
10. The method according to claim 1, wherein said discovering new assets based on said final data source comprises in particular:
the new asset is discovered by the uniqueness of the MAC address, and the routing device and the cross-route access device are identified.
11. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a computer processor, implements the method of any one of claims 1 to 10.
12. A system for implementing network security based on full traffic asset discovery, comprising:
an acquisition task configuration module: configured to configure a mirror port on the switch where the network is located, connect the mirrored traffic to a collector, and use the collector to configure and start a collection task so as to capture the network traffic;
a flow analysis module: configured to send the network traffic to a data analyzer through a forwarding task, the data analyzer performing flow analysis processing on the network traffic to generate flow metadata, and to cache the flow metadata using Kafka message middleware as a data caching layer;
an asset discovery and warehousing module: configured to write an asset discovery program based on funnel design principles, thereby layering the processing of the flow metadata by the asset discovery program into two processes: deduplication and merging of the flow metadata, and updating and warehousing of the flow metadata;
the two processes are executed by different threads, the flow metadata is aggregated based on a time window to obtain a final data source, and new assets are discovered based on the final data source;
an asset completion module: configured to write the new assets found by the asset discovery program into an es asset discovery temporary table, a maintainer editing and saving, through web visualization, the additional information that needs to be supplemented for the new assets, and then to convert the new assets into formal assets brought into the management and control range;
an asset status recording module: configured to continuously refresh the update time of the last new asset discovery, and compare the difference between the current time and the update time with a preset window time so as to determine and record the asset status;
wherein the flow analysis module is specifically configured such that an analysis program consumes data from Kafka, the data content being the absolute path of a pcap file, and then calls tshark to parse the data; the parsed data is segmented, standardized and formatted, packaged into JSON, and written into another topic of Kafka; a Kafka consumption table all_kafka_queue is created in ClickHouse, configured to consume automatically when there is data in flow_meta_topic; a view table is created in ClickHouse, keeping the field names and types in the view table consistent with the Kafka consumption table; and a data table is created in ClickHouse for storing the flow metadata obtained by flow analysis processing, the data in the Kafka consumption table being mapped into the data table by means of the view table.
CN202211362554.7A 2022-11-02 2022-11-02 Method and system for realizing network security based on full-flow asset discovery Active CN115840951B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211362554.7A CN115840951B (en) 2022-11-02 2022-11-02 Method and system for realizing network security based on full-flow asset discovery

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211362554.7A CN115840951B (en) 2022-11-02 2022-11-02 Method and system for realizing network security based on full-flow asset discovery

Publications (2)

Publication Number Publication Date
CN115840951A CN115840951A (en) 2023-03-24
CN115840951B true CN115840951B (en) 2024-02-13

Family

ID=85576784

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211362554.7A Active CN115840951B (en) 2022-11-02 2022-11-02 Method and system for realizing network security based on full-flow asset discovery

Country Status (1)

Country Link
CN (1) CN115840951B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110100228A (en) * 2017-01-06 2019-08-06 甲骨文国际公司 Utilize the efficient delta backup and recovery of the file system hierarchy structure of cloud object storage
CN111181775A (en) * 2019-12-17 2020-05-19 杭州安恒信息技术股份有限公司 Integrated operation and maintenance management alarm method based on automatic host asset discovery
CN113472580A (en) * 2021-07-01 2021-10-01 交通运输信息安全中心有限公司 Alarm system and alarm method based on dynamic loading mechanism
CN113507461A (en) * 2021-07-01 2021-10-15 交通运输信息安全中心有限公司 Network monitoring system and network monitoring method based on big data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
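Zeng Bin. Design and Implementation of a Big-Data-Driven Comprehensive Network Monitoring ***. Information Technology and Network Security. 2020, full text. *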

Also Published As

Publication number Publication date
CN115840951A (en) 2023-03-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant