CN114037091B - Expert joint evaluation-based network security information sharing system, method, electronic equipment and storage medium - Google Patents

Info

Publication number
CN114037091B
Authority
CN
China
Prior art keywords
expert
risk
network security
data
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111332573.0A
Other languages
Chinese (zh)
Other versions
CN114037091A (en
Inventor
叶麟
胡灵娟
黄洁润
胡振鹏
彭凤杰
杨晓丽
杨立炳
叶甜甜
成燕
梁稚媛
张宏莉
杨语晨
尹公主
孟超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Shanghai Pudong Development Bank Co Ltd
Original Assignee
Harbin Institute of Technology
Shanghai Pudong Development Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology, Shanghai Pudong Development Bank Co Ltd filed Critical Harbin Institute of Technology
Priority to CN202111332573.0A priority Critical patent/CN114037091B/en
Publication of CN114037091A publication Critical patent/CN114037091A/en
Application granted granted Critical
Publication of CN114037091B publication Critical patent/CN114037091B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Software Systems (AREA)
  • Probability & Statistics with Applications (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The application discloses a network security information sharing system and method based on expert joint evaluation, an electronic device, and a storage medium, and belongs to the technical field of network security. The application introduces a dynamically weighted expert committee and active learning into the information sharing and risk assessment process. The central node uses the trained risk classifier to assess the risk category of all security information and feeds the assessment results back to the experts at each node, and the experts improve their own analysis process according to those results. The central node then backs up all security information together with the risk categories output by the risk classifier and uploads the data to the upper-level data service processing center. This improves the accuracy of risk assessment in the network security information sharing mechanism and helps strengthen the analysis capability of its members.

Description

Expert joint evaluation-based network security information sharing system, method, electronic equipment and storage medium
Technical Field
The application relates to a network security information sharing system, a network security information sharing method, electronic equipment and a storage medium, in particular to a network security information sharing system, a network security information sharing method, electronic equipment and a storage medium based on expert joint evaluation, and belongs to the technical field of network security.
Background
In recent years, network attacks have grown stronger, and no organization can defend itself with isolated, thin, and fragmented network security information. Establishing a network security information sharing mechanism is therefore a top priority in current network security work. Such a mechanism can effectively alleviate information asymmetry, mobilize and coordinate the whole of society for real-time collective prevention and control, and improve the effectiveness of network security management based on security big data. Within the mechanism, the security information of each department can be exchanged and aggregated, and the risk assessment model learns the characteristics of security risk categories from the aggregated information, which avoids the one-sidedness of earlier learning based on single-node information.
Information sharing mechanisms have been studied to some extent, but few studies address network security information sharing specifically. Existing work mainly adopts the following schemes:
(1) Information sharing mechanisms based on blockchain. Blockchain technology offers decentralization, anonymity, and joint maintainability, and many systems build information sharing on it. DORRI et al. designed a distributed privacy-preserving security architecture based on a consortium chain, in which transaction data is packaged by block manager nodes authenticated by a trusted third party, and each node connects to the blockchain and generates transactions at time intervals. KANG et al. proposed a two-stage soft security enhancement scheme that separates miner selection from block verification: the first stage uses a reputation-based voting scheme to ensure safe miner selection, and in the second stage new blocks are further verified and audited by standby miners. However, because network security information is highly confidential, it cannot be passed directly over a blockchain.
(2) Information sharing mechanisms based on incentive theory. Information sharing mechanisms depend heavily on the quality of the information each node provides, so many systems apply incentive theory to strengthen the nodes' willingness to share and their analysis capability. HUAN et al. analyzed the random disturbance of market demand caused by emergencies and proposed a dairy supply chain information sharing mechanism based on overall benefit. Zhang et al. proposed an information sharing and credit evaluation mechanism for construction safety production based on incentive theory, applying incentive measures of different strength to construction enterprises at different levels in combination with the analytic hierarchy process.
(3) Information sharing mechanisms based on game theory. In the information sharing process, each node judges information and makes decisions independently, aiming to maximize its own benefit, so sharing is a continuous game. Wu et al. analyzed the total benefit of technological innovation with an evolutionary game model and studied how the benefit distribution ratio affects the information sharing mechanism. Lv Lu et al. used game theory to build an evolutionary game model of information resource sharing between military and civilian enterprises, and analyzed the stability of the equilibrium points of the replicator dynamics equation to obtain the evolutionarily stable strategy. However, in a network security information sharing mechanism, most participating nodes are the security departments of organizations, whose benefits cannot be directly quantified, so game theory is not applicable.
In a network security information sharing mechanism, the transmitted information is highly private, the analysis process depends heavily on expert knowledge, and members differ widely in analysis capability, so the schemes above cannot be applied directly.
Disclosure of Invention
In view of the above, the application provides a network security information sharing system, method, electronic device, and storage medium based on expert joint evaluation, to address the large gap in analysis capability among members and the low risk assessment accuracy of existing network security information sharing mechanisms. The application introduces a dynamically weighted expert committee and active learning into the information sharing and risk assessment process, which improves risk assessment accuracy in the network security information sharing mechanism and helps strengthen the analysis capability of its members.
The technical scheme of the application is realized as follows:
Scheme I: a network security information sharing system based on expert joint evaluation, comprising n end nodes and a central node.
Each end node comprises a data preprocessing module, an expert data labeling module, and an expert weight updating module; the central node comprises an initial risk classifier training module, an expert labeling accuracy calculation module, an update judging module, and a data uploading module.
The data preprocessing module cleans and processes the data and preliminarily labels the risk category of the network security information content.
The initial risk classifier training module receives network security data from each end node, aggregates and organizes it, and trains an initial risk classifier on a small number of samples that carry risk labels; it then uses the initial risk classifier to classify the unlabeled samples in the aggregated information, selects the samples whose uncertainty is higher than a threshold ε, and distributes them to the end nodes, where the threshold ε is set according to the application scenario.
The expert data labeling module has the expert committee of each end node label the distributed samples.
The expert labeling accuracy calculation module obtains the expert weights in the expert committee of each end node and derives the sample labels and their probability distributions by weighted voting.
The expert weight updating module updates the expert weights in the expert committee according to the sample labeling accuracy.
The update judging module adds the expert-labeled samples whose confidence is greater than a constant λ to the initial risk classifier training module and incrementally trains the risk classifier, where the constant λ is set according to the application scenario.
The data uploading module backs up all security information together with the risk categories output by the risk classifier and uploads them to the upper-level data service processing center.
Scheme II: a network security information sharing method based on expert joint evaluation, comprising the following steps:
Step one: each end node preprocesses the network security information to be uploaded.
The preprocessing comprises data cleaning and preliminary labeling of the risk category of the network security information content.
Step two: the central node receives network security data from all end nodes, aggregates and organizes it, and selects a small number of samples with risk labels to train an initial risk classifier.
Step three: the risk classifier classifies the unlabeled samples in the aggregated information, and the samples whose uncertainty is higher than a threshold ε are distributed to the expert committee formed by the nodes for labeling; the threshold ε is set freely according to the specific application scenario.
Step four: obtain the expert weights in the expert committee of each end node, and obtain the sample labels and their probability distributions by weighted voting.
Step five: update the expert weights in the expert committee according to the sample labeling accuracy.
Step six: add the expert-labeled samples whose confidence is greater than a constant λ to the initial risk classifier training module and incrementally train the risk classifier, where the constant λ is set according to the application scenario; repeat steps two to four until one of the following conditions is met:
① No new unlabeled samples can be selected for the expert committee;
② The preset number of iterations is reached.
Step seven: the central node uses the trained risk classifier to assess the risk category of all security information and feeds the assessment results back to the experts at each node, and the experts improve their own analysis process according to those results.
Step eight: the central node backs up all security information together with the risk categories output by the risk classifier and uploads them to the upper-level data service processing center.
Further: in step one, the network security information content comprises security information assets and various element indexes; the data cleaning comprises checking data consistency and handling invalid and missing values.
Further: in step three, the uncertainty is measured by the information content of the class probability distribution that the risk classifier computes for a sample, that is, by its information entropy:
$$H(x) = -\sum_{i=1}^{n} p_i \log p_i$$
where n is the total number of risk categories, p_i is the probability that the classifier assigns sample x to the i-th category, and H(x) is the information entropy of sample x.
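The entropy-based selection in step three can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `entropy` and `select_uncertain`, the sample probabilities, and the choice of base-2 logarithm (which keeps the entropy of a two-class problem in [0, 1]) are not taken from the patent.

```python
import math


def entropy(probs, base=2.0):
    """Shannon entropy of one class-probability vector.

    Zero-probability categories contribute nothing to the sum.
    The base-2 logarithm is an assumption of this sketch.
    """
    return -sum(p * math.log(p, base) for p in probs if p > 0.0)


def select_uncertain(prob_rows, epsilon):
    """Return the indices of samples whose entropy exceeds the threshold epsilon."""
    return [idx for idx, probs in enumerate(prob_rows) if entropy(probs) > epsilon]


# Hypothetical classifier outputs for three samples over two risk categories.
predicted = [[0.95, 0.05], [0.55, 0.45], [0.50, 0.50]]
uncertain = select_uncertain(predicted, epsilon=0.9)
```

With these made-up inputs, only the second and third samples are close enough to an even split to be sent to the expert committee; a larger ε sends fewer samples to the experts.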
Further: the specific operation of step four is as follows:
Step 4.1: if this is the first round of training, initialize the weight of each expert in the expert committee; otherwise, use the expert weights from the previous round.
The initial weight is defined as 1/N, where N is the total number of expert committee members, and W_j denotes the weight of the j-th expert.
Step 4.2: the experts of each node label the sample, and the votes of all experts are aggregated by weighted voting to give the final sample label, computed as:
$$\hat{y}(x) = \arg\max_{i} \sum_{j=1}^{N} W_j \, V(y_{ij})$$
where \hat{y}(x) is the final label of sample x after expert voting, V(y_{ij}) indicates whether the j-th expert classified the sample's risk as category i (1 if so, 0 otherwise), the sum runs over the votes of all experts, W_j is the weight of the j-th expert, and N is the total number of experts in the expert committee.
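A minimal sketch of the weighted vote in step 4.2 follows. The function name `weighted_vote` and the example votes are hypothetical. The patent states that voting yields both a label and a probability distribution but does not spell out how the distribution is formed; normalizing the per-category weight totals, as done here, is an assumption.

```python
def weighted_vote(votes, weights, n_classes):
    """Aggregate expert votes into a final label and a probability distribution.

    votes[j]   -- category index chosen by expert j
    weights[j] -- weight W_j of expert j
    """
    scores = [0.0] * n_classes
    for vote, weight in zip(votes, weights):
        scores[vote] += weight  # adds W_j where V(y_ij) = 1
    total = sum(scores)
    # Assumption: the voted distribution is the normalized weight per category.
    distribution = [score / total for score in scores]
    # Ties are broken in favour of the lowest category index.
    label = max(range(n_classes), key=lambda i: scores[i])
    return label, distribution


# Three experts with the initial weight 1/N vote on a two-category sample.
label, dist = weighted_vote(votes=[1, 0, 1], weights=[1 / 3, 1 / 3, 1 / 3], n_classes=2)
```

Here two of three equally weighted experts choose category 1, so it becomes the final label with roughly two thirds of the voted probability mass.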
Further: the specific operation of step five is as follows:
Step 5.1: compute the Kullback-Leibler (KL) divergence of every label given by each expert in the committee, and evaluate each expert's classification accuracy by the sum of the KL divergences over all of that expert's labels.
Further: the KL divergence of expert j's classification of sample x is computed as:
$$D_{KL}\big(P(x)\,\|\,Q_j(x)\big) = \sum_{i=1}^{n} p_i \log\frac{p_i}{q_i}$$
where P(x) = [p_1, p_2, ..., p_n] is the risk category probability distribution of sample x after voting, and Q_j(x) = [q_1, q_2, ..., q_n] is the risk category probability distribution of sample x computed by expert j.
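The per-sample divergence can be computed as below. This is a sketch, not the patent's implementation: the function name `kl_divergence` and the small floor `eps` that guards against a zero expert probability are assumptions, and the direction D_KL(P ‖ Q_j) follows the conventional reading of the P and Q notation.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i).

    p -- voted distribution P(x); q -- distribution Q_j(x) from expert j.
    Terms with p_i == 0 contribute nothing; q_i is floored at eps
    (an assumption of this sketch) so the ratio stays finite.
    """
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0)


# An expert who matches the committee exactly has zero divergence;
# one who disagrees has a positive divergence.
same = kl_divergence([0.7, 0.3], [0.7, 0.3])
different = kl_divergence([0.7, 0.3], [0.4, 0.6])
```

Summing this value over all samples gives the per-expert total used in the weight update.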
Step 5.2: update the expert weights according to the expert classification accuracy calculated in step 5.1.
Further: the expert weights are updated as follows: first, sum the KL divergences of each expert over all samples; then take the logarithm of the reciprocal of that sum and normalize across experts. The specific calculation formula is:
$$W_j = \frac{\log\dfrac{1}{\sum_{m=1}^{M} D_{KL}\big(P(x_m)\,\|\,Q_j(x_m)\big)}}{\sum_{k=1}^{N} \log\dfrac{1}{\sum_{m=1}^{M} D_{KL}\big(P(x_m)\,\|\,Q_k(x_m)\big)}}$$
where D_{KL}(P(x_m) ‖ Q_j(x_m)) is the KL divergence of expert j on sample x_m, M is the total number of samples, and N is the total number of experts involved in the evaluation.
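The update rule can be illustrated with a short function that takes each expert's KL divergence sum as input. The name `update_weights`, the example sums, and the floor on very small sums are assumptions. Note that the logarithm of the reciprocal is positive only when a sum is below 1; the text does not say how larger sums are handled, and this sketch does not try to handle them either.

```python
import math


def update_weights(kl_sums, floor=1e-12):
    """Turn per-expert KL divergence sums into normalized weights.

    Each raw score is log(1 / S_j); the scores are then divided by their total.
    The floor (an assumption) avoids dividing by a zero sum.
    """
    raw = [math.log(1.0 / max(s, floor)) for s in kl_sums]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical KL sums for three experts; a smaller sum means closer
# agreement with the committee and therefore a larger weight.
weights = update_weights([0.1, 0.2, 0.4])
```

For these made-up sums the weights come out in the same order as the experts' agreement with the committee and add up to 1.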
Scheme III: an electronic device comprising a processor and a memory for storing a computer program capable of running on the processor,
wherein the processor is configured to execute the steps of the method according to Scheme II when running the computer program.
Scheme IV: a storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method according to Scheme II.
The beneficial effects of the application are as follows:
The application provides a network security information sharing mechanism based on expert joint evaluation, which introduces a dynamically weighted expert committee and active learning into the information sharing and risk assessment process. This improves risk assessment accuracy in the network security information sharing mechanism and helps strengthen the analysis capability of its members.
Drawings
Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:
FIG. 1 is a block diagram of a network security information sharing system based on expert joint evaluation according to an embodiment of the present application;
Fig. 2 is a flow chart of a network security information sharing method based on expert joint evaluation according to a second embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device according to the present application.
Detailed Description
The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and not limiting of the application. It should be noted that, for convenience of description, only the portions related to the application are shown in the drawings.
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.
The first embodiment of the application provides a network security information sharing mechanism based on expert joint evaluation. In this mechanism, the upper-level department shares network security information, risk assessment results, and the like with each lower-level department; each lower-level department reports information about threats, attacks, and the like to the upper-level department; and the lower-level departments exchange information that helps network security protection work. By introducing active learning based on an expert committee, the application assesses the risk of shared information more accurately and at the same time evaluates the labeling accuracy of the expert committee members, which improves the risk assessment capability of the system and ultimately the working efficiency of the information sharing system.
Example 1
Fig. 1 shows a block diagram of a network security information sharing system based on expert joint evaluation according to embodiment 1 of the present application.
A network security information sharing system based on expert joint evaluation comprises n end nodes and a central node. Each end node comprises a data preprocessing module, an expert data labeling module, and an expert weight updating module; the central node comprises an initial risk classifier training module, an expert labeling accuracy calculation module, an update judging module, and a data uploading module.
The data preprocessing module cleans and processes the data and preliminarily labels the risk category of the network security information content.
The initial risk classifier training module receives network security data from each end node, aggregates and organizes it, and trains an initial risk classifier on a small number of samples that carry risk labels; it then uses the initial risk classifier to classify the unlabeled samples in the aggregated information, selects the samples whose uncertainty is higher than a threshold ε, and distributes them to the end nodes, where the threshold ε is set according to the application scenario.
The expert data labeling module has the expert committee of each end node label the distributed samples.
The expert labeling accuracy calculation module obtains the expert weights in the expert committee of each end node and derives the sample labels and their probability distributions by weighted voting.
The expert weight updating module updates the expert weights in the expert committee according to the sample labeling accuracy.
The update judging module adds the expert-labeled samples whose confidence is greater than a constant λ to the initial risk classifier training module and incrementally trains the risk classifier, where the constant λ is set according to the application scenario.
The data uploading module backs up all security information together with the risk categories output by the risk classifier and uploads them to the upper-level data service processing center.
Example two
To better explain the purpose and advantages of this embodiment, the second embodiment of the application provides a network security information sharing method based on expert joint evaluation (see Fig. 2), described in further detail below. The simulation experiment was carried out on a computer according to the following steps:
S1: each end node preprocesses the network security information to be uploaded.
The simulation experiment uses database transaction security risk assessment data from a scientific research institution, with 170 pieces of initial training data and 2300 pieces of test data. The input features of each piece of data are the specific factors that influence risk assessment, 12 in total; the output is a risk assessment grade, either low risk or high risk. The training data labels were assessed by security experts, and the test data labels were assessed by real users according to actual conditions. To simulate each end node uploading network security information, different experts labeled different parts of the training data using the analytic hierarchy process (AHP), and the labels from all parties were merged to obtain the complete training set.
S2: the central node receives network security information from each end node, aggregates and cleans it, and selects a small number of samples with risk labels to train an initial risk classifier.
In the experiment, the initial risk classifier is simulated with an SVM whose penalty parameter is set to 0.3. The SVM classifier is trained on the training data labeled by the experts.
S3: classify the unlabeled samples in the aggregated information with the risk classifier, select the samples whose uncertainty is higher than a threshold, and distribute them to the expert committee formed by the nodes for labeling.
The risk classifier classifies the test data, the information entropy of each sample in the classification result is computed, and the samples whose entropy exceeds the threshold ε are added to the set to be labeled by the experts. Here ε is a constant between 0 and 1 that controls the number of samples the experts need to label.
S4: obtain the weight of each expert in the expert committee, and obtain the sample labels and their probability distributions by weighted voting.
S4.1: if this is the first round of training, initialize the weight of each expert in the expert committee; otherwise, use the expert weights from the previous round.
The initial weight is defined as 1/N, where N is the total number of expert committee members. Each expert is also simulated with an SVM classifier, but with a different penalty parameter: in the experiment, the penalty parameters are spaced evenly over the range [0.1, 0.6] according to the number of experts N. In the first round, each expert classifier is trained on the original training data, giving a set of experts with different scoring preferences. In practice, the experts in the committee may also be simulated with different kinds of classifiers, such as XGBoost or LightGBM.
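The committee setup described in S4.1 can be sketched as follows. The helper names `initial_weights` and `expert_penalties` are hypothetical, and placing the first and last experts exactly on the endpoints 0.1 and 0.6 is an assumption, since the text only says the values are evenly spaced within the range.

```python
def initial_weights(n_experts):
    """Every expert starts with the same weight 1/N."""
    return [1.0 / n_experts] * n_experts


def expert_penalties(n_experts, low=0.1, high=0.6):
    """Evenly spaced SVM penalty parameters, one per simulated expert.

    Assumption: both endpoints of [low, high] are included.
    """
    if n_experts == 1:
        return [low]
    step = (high - low) / (n_experts - 1)
    return [low + k * step for k in range(n_experts)]


# A hypothetical committee of six simulated experts.
weights = initial_weights(6)
penalties = expert_penalties(6)
```

Each penalty value would then configure one expert classifier, so that the simulated experts disagree in a controlled way.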
S4.2: each expert labels the samples, and the votes of all experts are aggregated by weighted voting to give the final sample labels.
The expert SVM classifiers predict the test data, and the weighted average of the experts' predictions gives the final label.
S5: update the expert weights in the expert committee according to the sample labeling accuracy.
S5.1: compute the Kullback-Leibler (KL) divergence of every label given by each expert in the committee, and evaluate each expert's classification accuracy by the sum of the KL divergences over all of that expert's labels.
For each sample, compute the KL divergence between the expert's predicted distribution and the final result distribution, then sum over all samples to obtain the difference between the expert's classification and the final distribution.
S5.2: update the expert weights according to the expert classification accuracy calculated in S5.1.
The updated expert weights are obtained by taking the logarithm of the reciprocal of each expert's KL divergence sum and normalizing.
S6: add the higher-confidence part of the expert-labeled samples to the training set, and incrementally train the risk classifier.
The expert committee's labeled data is added to the training set, and the risk classifier is retrained using the committee's final label for each sample as the label.
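The confidence filter in S6 might look like the sketch below. The text does not define how confidence is measured; taking it to be the largest probability in the committee's voted distribution is an assumption, as are the function name `confident_samples` and the example data.

```python
def confident_samples(samples, distributions, lam):
    """Keep committee-labeled samples whose confidence exceeds lam.

    Assumption: confidence is the largest probability in the voted distribution,
    and the kept label is the category holding that probability.
    """
    kept = []
    for sample, dist in zip(samples, distributions):
        confidence = max(dist)
        if confidence > lam:
            kept.append((sample, dist.index(confidence)))
    return kept


# Hypothetical committee outputs for two samples over two risk categories.
new_training_rows = confident_samples(
    samples=["sample_a", "sample_b"],
    distributions=[[0.9, 0.1], [0.55, 0.45]],
    lam=0.8,
)
```

Only the first sample clears the threshold here, so only it would be appended to the training set before the risk classifier is retrained.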
Repeating S2-S4 until one of the following conditions is met:
① No new unlabeled samples to be handed to the expert committee can be selected;
② The preset iteration times are reached.
In the experimental process, the iteration number is set to 3.
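The two stopping conditions can be shown with a small loop skeleton. Everything named here (`run_rounds` and its three callables) is hypothetical scaffolding: the actual selection, labeling, and retraining steps are passed in as functions, and the stand-ins in the usage example do no real learning.

```python
def run_rounds(select_uncertain, label_by_committee, retrain, max_iterations=3):
    """Repeat query, label, retrain until a stopping condition is met.

    Returns the number of rounds in which the committee labeled samples.
    """
    for iteration in range(max_iterations):
        queried = select_uncertain()
        if not queried:
            return iteration  # condition 1: nothing left to hand to the experts
        retrain(label_by_committee(queried))
    return max_iterations  # condition 2: preset iteration count reached


# Stand-in callables: each round pops one hypothetical batch of uncertain samples.
pool = [["s1", "s2"], ["s3"], []]
trained_on = []
rounds = run_rounds(
    select_uncertain=lambda: pool.pop(0) if pool else [],
    label_by_committee=lambda batch: [(sample, 0) for sample in batch],
    retrain=trained_on.extend,
    max_iterations=3,
)
```

In this run the third query comes back empty, so the loop stops after two labeling rounds, before the limit of 3 is reached.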
S7: the central node uses the trained risk classifier to assess the risk category of all security information and feeds the assessment results back to the experts at each node, and the experts improve their own analysis process according to those results.
The final classification results on the test set are fed back to each expert, and each expert adds them to its training data and retrains, simulating the improvement process.
S8: the central node backs up all security information together with the risk categories output by the risk classifier and uploads them to the upper-level data service processing center.
The final classification results on the test set are output and each metric is computed.
To verify the effect of the proposed network security information sharing mechanism on security risk assessment, it is compared with three baselines: no information sharing mechanism, no expert committee, and an expert committee with fixed weights. The precision, recall, F1 score, and accuracy of risk assessment in each case are recorded, and the results are shown in Table 1.
Table 1. Comparison of experimental results with and without the network security information sharing mechanism
Method                  Precision  Recall  F1 score  Accuracy
No sharing              0.548      0.828   0.659     0.572
No expert committee     0.663      0.702   0.682     0.673
Fixed expert weights    0.672      0.711   0.693     0.680
Proposed method         0.678      0.720   0.699     0.690
As Table 1 shows, without a sharing mechanism each risk classifier can access only part of the training data, so the knowledge it learns is limited and its precision and accuracy are low. Without the expert committee mechanism, the expert knowledge of the parties cannot be fused interactively, and the results are worse than with the expert committee. In addition, the dynamic expert weight algorithm adopted by the application improves on fixed weights on all four metrics (for example, accuracy rises from 0.680 to 0.690), which indicates that the weight updating method benefits the information sharing mechanism to some degree.
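For reference, the four metrics reported in Table 1 can be computed from true and predicted labels as sketched below. The function name `binary_metrics`, the toy labels, and the choice of treating one grade (for example, high risk) as the positive class are assumptions; the patent does not state which class it treats as positive.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (precision, recall, f1, accuracy) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(pairs)
    return precision, recall, f1, accuracy


# Toy labels (1 = positive class), not data from the experiment.
precision, recall, f1, accuracy = binary_metrics(
    y_true=[1, 1, 0, 0, 1],
    y_pred=[1, 0, 0, 1, 1],
)
```

The toy input has two true positives, one false positive, one false negative, and one true negative, which gives precision, recall, and F1 of 2/3 and accuracy of 3/5.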
Example III
An electronic device according to a third embodiment of the present application is shown in fig. 3, and is in the form of a general-purpose computing device. Components of an electronic device may include, but are not limited to: one or more processors or processing units, a memory for storing a computer program capable of running on the processor, a bus connecting the different system components (including the memory, the one or more processors or processing units).
Wherein the one or more processors or processing units are configured to execute the steps of the method according to embodiment two when the computer program is run. The processor may be of a type that includes a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
The bus represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA (EISA) bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Example IV
A fourth embodiment of the present application provides a storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method described in the second embodiment.
The storage medium shown in the present application may be a computer readable signal medium or a storage medium, or any combination of the two. The storage medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this patent, a storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, however, the storage medium may include a data signal that propagates in baseband or as part of a carrier wave, in which computer readable program code is carried. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A storage medium may also be any computer-readable medium that can transmit, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The foregoing embodiments have further described the objects, technical solutions and advantageous effects of the present application in detail, and it should be understood that the foregoing embodiments are merely examples of the present application and are not intended to limit the scope of the present application, and any modifications, equivalent substitutions, improvements, etc. made on the basis of the technical solutions of the present application should be included in the scope of the present application.

Claims (10)

1. A network security information sharing system based on expert joint evaluation, comprising: n end nodes, a central node;
Each end node comprises a data preprocessing module, an expert data labeling module and an expert weight updating module; the center node comprises an initial risk classifier training module, an expert annotation accuracy calculating module, an updating judging module and a data uploading module;
The data preprocessing module is used for data cleaning and processing and performing preliminary marking on risk categories of network security information content;
The initial risk classifier training module is used for receiving network security data from each end node, summarizing and sorting, selecting a small number of samples with risk labeling information for training to obtain an initial risk classifier, classifying unlabeled samples in summarized information by using the initial risk classifier, selecting samples with uncertainty higher than a threshold epsilon, and distributing the samples to each end node, wherein the threshold epsilon is set according to application scenes;
The expert data labeling module labels the distributed samples by the expert committee of each end node;
The expert annotation accuracy calculation module is used for acquiring expert weights in the expert committee of each end node and obtaining sample annotation and probability distribution by using a weighted voting mode;
The expert weight updating module is used for updating the expert weight in the expert committee according to the sample labeling accuracy;
The updating judging module is used for adding a part with the confidence coefficient larger than a constant lambda in the sample marked by the expert into the initial risk classifier training module, and incrementally training the risk classifier, wherein the constant lambda is set according to the application scene;
And the data uploading module is used for carrying out data backup on all the safety information and the risk categories output by the risk classifier and then uploading the data together to the upper data service processing center.
2. The network security information sharing method based on expert joint evaluation is characterized by comprising the following steps of:
step one, each end node preprocesses network security information to be uploaded;
The preprocessing operation comprises data cleaning processing and preliminary labeling of risk categories for the network security information content;
step two, the central node receives network security data from all end nodes, gathers and sorts the network security data, and selects a small amount of samples with risk marking information to train an initial risk classifier;
step three, classifying unlabeled samples in the summarized information by using the risk classifier, and selecting samples with uncertainty higher than a threshold epsilon to be distributed to the expert committee formed by the nodes for labeling; the threshold epsilon is freely set according to the specific application scene;
step four, acquiring expert weights in expert committees of all the end nodes and obtaining sample labels and probability distribution thereof by using a weighted voting mode;
step five: updating expert weights in the expert committee according to the sample labeling accuracy;
Step six: adding a part with confidence coefficient larger than a constant lambda in a sample marked by an expert into an initial risk classifier training module, and incrementally training a risk classifier, wherein the constant lambda is set according to an application scene, and repeating the second step to the fourth step until one of the following conditions is met:
① No new unlabeled samples to be handed to the expert committee can be selected;
② Reaching preset iteration times;
Step seven: the central node uses the trained risk classifier to conduct risk category research and judgment on all safety information, the research and judgment results are fed back to each node expert, and the expert improves the self analysis process according to the research and judgment results;
Step eight: and the central node performs data backup on all the security information and the risk categories output by the risk classifier and then uploads the data backup to the upper data service processing center.
3. The network security information sharing method based on expert joint evaluation according to claim 2, wherein in the first step, the network security information content includes security information assets and various element indexes; the data cleansing processing operation includes checking data consistency, processing invalid values and missing values.
4. The network security information sharing method based on expert joint evaluation as claimed in claim 3, wherein in the third step, uncertainty is measured by the information content of the sample class probability distribution calculated by the risk classifier, that is, the uncertainty is calculated as information entropy:
$$H(x) = -\sum_{i=1}^{n} p_i \log p_i$$
where n represents the total number of risk categories, p_i represents the probability that the sample is determined by the classifier to belong to the i-th category, and H(x) represents the information entropy of sample x.
5. The network security information sharing method based on expert joint evaluation as claimed in claim 4, wherein the specific operation of the fourth step is as follows:
Step 4.1: if this is the first round of training, initializing the weight of each expert in the expert committee; otherwise, acquiring the expert weights from the previous round;
The initial weight is defined as 1/N, wherein N is the total number of members of the expert committee; letting W_j represent the weight of the j-th expert, initially
$$W_j = \frac{1}{N}, \quad j = 1, \dots, N$$
Step 4.2: the experts of each node label the sample, and the votes of the experts are aggregated by weighted voting to give the final sample label, calculated as follows:
$$\hat{y}(x) = \arg\max_{i} \sum_{j=1}^{N} W_j \, V(y_{ij})$$
wherein \hat{y}(x) represents the final labeling result of sample x after expert voting, i represents the risk category, V(y_{ij}) indicates whether the j-th expert classifies the sample risk as category i (1 if so, 0 otherwise), the summation runs over the votes of all experts, W_j represents the weight of the j-th expert, and N represents the total number of experts in the expert committee.
6. The network security information sharing method based on expert joint evaluation as claimed in claim 5, wherein the specific operation of the fifth step is as follows:
Step 5.1: calculating the Kullback-Leibler divergence of every data label given by each expert in the expert committee, and evaluating each expert's classification accuracy through the sum of the Kullback-Leibler divergences over all labels;
Step 5.2: updating the expert weights according to the expert classification accuracy calculated in step 5.1.
7. The network security information sharing method based on expert joint evaluation according to claim 6, wherein in the fifth step, the KL divergence of the classification of sample x by expert j is calculated as follows:
$$D_{KL}\big(P(x)\,\|\,Q_j(x)\big) = \sum_{i=1}^{n} p_i \log \frac{p_i}{q_i}$$
where P(x) = [p_1, p_2, ..., p_n] represents the risk category probability distribution of sample x after voting, and Q_j(x) = [q_1, q_2, ..., q_n] represents the risk category probability distribution of sample x calculated by expert j.
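8. The network security information sharing method based on expert joint evaluation according to claim 7, wherein in the fifth step, the expert weights are updated as follows:
firstly, the sum of the KL divergences over all samples is calculated for each expert; then the logarithm of the reciprocal of that sum is taken and normalized; the specific calculation formula is as follows:
$$W_j = \frac{\log\left(1 \Big/ \sum_{m=1}^{M} D_{KL}\big(P(x_m)\,\|\,Q_j(x_m)\big)\right)}{\sum_{k=1}^{N} \log\left(1 \Big/ \sum_{m=1}^{M} D_{KL}\big(P(x_m)\,\|\,Q_k(x_m)\big)\right)}$$
where D_{KL}(P(x_m) \| Q_j(x_m)) is the KL divergence of expert j's classification of the m-th sample x_m, M represents the total number of samples, and N represents the total number of experts involved in the evaluation.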
9. An electronic device, characterized in that: comprising a processor and a memory for storing a computer program capable of running on the processor,
Wherein the processor is adapted to perform the steps of the method of any of claims 2 to 8 when the computer program is run.
10. A storage medium having stored thereon a computer program, which when executed by a processor performs the steps of the method according to any of claims 2 to 8.
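The computational steps recited in claims 4 to 8 (entropy-based sample selection, weighted expert voting, KL divergence, and the expert weight update) can be sketched as follows. This is an illustrative reading of the claim wording, not the applicants' implementation. The function names, the natural logarithm, the small `floor` guard against division by zero, and the example values are all assumptions. The weight update further assumes every expert's KL sum lies strictly between 0 and 1, so that the logarithm of its reciprocal is positive.

```python
import math


def entropy(p):
    """Information entropy H(x) = -sum_i p_i * log(p_i); natural log assumed."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def select_uncertain(probs_by_sample, epsilon):
    """Return ids of samples whose entropy exceeds the threshold epsilon."""
    return [sid for sid, p in probs_by_sample.items() if entropy(p) > epsilon]


def weighted_vote(votes, weights, n_classes):
    """votes[j] is the category index chosen by expert j.

    Returns the winning category and the normalized weighted-vote
    distribution over categories.
    """
    scores = [0.0] * n_classes
    for expert, category in enumerate(votes):
        scores[category] += weights[expert]
    total = sum(scores)
    distribution = [s / total for s in scores]
    label = max(range(n_classes), key=lambda i: scores[i])
    return label, distribution


def kl_divergence(p, q, floor=1e-12):
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i).

    `floor` is an assumed guard for q_i == 0; the source does not specify one.
    """
    return sum(pi * math.log(pi / max(qi, floor))
               for pi, qi in zip(p, q) if pi > 0)


def update_weights(kl_sums):
    """Normalize log(1 / KL sum) across experts.

    Assumes 0 < kl_sum < 1 for every expert so each log term is positive.
    """
    raw = [math.log(1.0 / s) for s in kl_sums]
    total = sum(raw)
    return [r / total for r in raw]


# Illustrative values only.
probs = {"a": [0.5, 0.5], "b": [0.9, 0.1]}
uncertain = select_uncertain(probs, epsilon=0.5)      # sample "a" is selected
weights = [1 / 3] * 3                                 # initial weight 1/N
label, dist = weighted_vote([0, 0, 1], weights, n_classes=2)
new_weights = update_weights([0.1, 0.2, 0.5])         # smaller KL sum, larger weight
```

Under this reading, an expert whose labels diverge less from the committee consensus has a smaller KL sum and therefore receives a larger weight in the next round.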
CN202111332573.0A 2021-11-11 2021-11-11 Expert joint evaluation-based network security information sharing system, method, electronic equipment and storage medium Active CN114037091B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111332573.0A CN114037091B (en) 2021-11-11 2021-11-11 Expert joint evaluation-based network security information sharing system, method, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114037091A CN114037091A (en) 2022-02-11
CN114037091B true CN114037091B (en) 2024-05-28

Family

ID=80137250

Country Status (1)

Country Link
CN (1) CN114037091B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant