CN115146299B - Safety trusteeship service method based on knowledge graph and domain adaptation - Google Patents
Safety trusteeship service method based on knowledge graph and domain adaptation
- Publication number
- CN115146299B (application CN202211083553.9A)
- Authority
- CN
- China
- Prior art keywords
- network security
- client
- representing
- capsule
- domain
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F21/604—Tools and structures for managing or administering access control systems
- G06F16/367—Ontology
- G06F21/554—Detecting local intrusion or implementing counter-measures involving event detection and direct action
- G06F21/606—Protecting data by securing the transmission between two devices or processes
- G06N3/08—Learning methods
- G06N5/04—Inference or reasoning models
Abstract
The invention discloses a security hosting service method based on a knowledge graph and domain adaptation, comprising the following steps: the prepared set of network emergency-response knowledge graphs is input into a knowledge-graph redundancy processing module, which removes the redundant features in the different knowledge graphs and extracts highly correlated features; the processed features serve as the source domain on the server side. A network security event inference model is trained on the source domain at the server side, and the parameters of the trained model are broadcast to each client. Each client takes its network security log file as a target domain, so that different source domain-target domain pairs are constructed for different clients, and trains its own network security event inference model. After a client's training finishes, it uploads the parameters of the reinforced inference module in its local network security event inference model to the server. The method effectively improves the efficiency of handling network security events.
Description
Technical Field
The invention relates to the technical field of network security, and in particular to a security hosting service method based on a knowledge graph and domain adaptation.
Background
With the continuous innovation and development of Internet technology, network security problems have become more severe: network attacks are increasingly organized in scale, and attack means keep changing, growing more diverse and more structured, which makes network emergency-response work ever more important.
At present, traditional network emergency response matches response plans against the content of an alarm event using features such as keywords or indexes. This traditional approach suffers from low matching efficiency and low matching accuracy, and so struggles with network security events that are increasingly complex and comprehensive. It also requires extensive manual data analysis and imposes strict formatting requirements on the feature information, making it hard to meet the accuracy needed for plan matching after a network security incident occurs. A network security emergency-response knowledge graph contains a large amount of feature information about attack means and the corresponding solutions; a knowledge graph stores data in a graph structure, the relationships it stores are not single-valued, and the graph contains considerable redundant information, which hinders subsequent processing. A managed security service (MSS) hands part of the heavy, repetitive security-operations work to a professional cloud service provider, whose security-operations team delivers continuous analysis and operation services. An enterprise that turns to a managed security service provider can relieve its daily information-security pressure and, drawing on the provider's strengths in particular security fields, shore up its weaknesses in security construction or operations management, thereby improving security-management efficiency. A security hosting service method based on a knowledge graph and domain adaptation is therefore urgently needed to solve these problems.
Disclosure of Invention
To achieve the above object, the inventor provides a security hosting service method based on a knowledge graph and domain adaptation, comprising the following steps:
s1: preparing a set of network emergency-response knowledge graphs and inputting it into a knowledge-graph redundancy processing module, which removes redundant features across the different knowledge graphs through a graph capsule neural network with adaptive feature selection, extracts highly correlated features, and fuses them into a new feature set serving as the source domain on the server side;
s2: training a network security event inference model on the source domain at the server side, the model encoding the features in the source domain as sub-capsules and enriching the encoded semantic information through a local reconstruction module;
s3: assembling multiple sub-capsules into part capsules and feeding them into an inference module for decoding; the inference module generates a network security emergency-response plan by decoding the semantic information, and the parameters of the trained network security event inference model are broadcast to each client;
s4: each client taking its network security log file as a target domain, so that a different source domain-target domain pair is constructed for each client, and training its own network security event inference model; during inference a reinforced inference module is added to each client's model, and a suitable network security emergency-response plan is selected according to the result of the reinforced inference module;
s5: after the training of a client's network security event inference model is finished, the client uploads the parameters of the reinforced inference module in its local network security event inference model to the server.
As a preferred embodiment of the present invention, step S1 further includes the following steps:
s101: given a set of network emergency-response knowledge graphs, denoted S_N, take the first knowledge graph S_1 in S_N, perform redundancy processing and relevant-feature selection on the features in the knowledge graph, and construct node capsules u_a^p and u_b^p for node a and node b of the p-th layer of S_1, respectively;
s102: calculating the feature mapping vectors of the node a and the node b through the feature mapping layer of the capsule diagram neural networkThe expression is as follows:
wherein,a node capsule representing the ith neighbor node of the p-th layer node a, m represents the number of neighbor nodes of the node a, and represents the node capsule of the node a when i =0,the node capsule represents the jth neighbor node of the pth layer of node b, n is the number of neighbor nodes of the node b, when j =0, the node capsule represents the node b, and MLP is a multilayer perceptron;
The correlation of two node capsules is measured with a mutual-information function Kinfo:

Kinfo( f_a^p, f_b^p ) = exp( (f_a^p)^T · f_b^p )

where (f_a^p)^T denotes the transpose of f_a^p, and exp denotes the exponential function with the natural constant e as its base;
s103: execute step S102 for any two nodes in the same layer of the knowledge graph S_1, adaptively select the node feature mappings within each layer, and remove feature mappings with high redundancy between layers until all layers have been processed, yielding the compressed node feature-mapping set of S_1, where f_r denotes the feature-mapping set of the r-th layer, f_s that of the s-th layer, and softmax the normalized exponential function used in the selection;
s104: repeat steps S102 to S103 for the remaining knowledge graphs in S_N, yielding the feature-mapping set F_S of all knowledge graphs; F_S is used as the source domain for training the server-side network security event inference model.
As a preferred embodiment of the present invention, step S2 includes the following steps:
s201: the server-side network security event inference model uses two capsule encoders based on a self-attention mechanism to encode the network security events F_S^k in the source domain, generating two sub-capsules K^k and V^k:

K^k = Encoder_key( F_S^k ),  V^k = Encoder_value( F_S^k )

where Encoder_key is the key-capsule feature extractor and Encoder_value the value-capsule feature extractor, each composed of a ResNet-50 residual network;
s202: perform feature reconstruction on the two sub-capsules with a local reconstruction module to enrich their semantic information, where K̂^k and V̂^k denote the key capsule and value capsule after feature reconstruction, and τ and μ denote the feature-reconstruction vectors, learned automatically by the client's network security event inference model during training.
As a preferred embodiment of the present invention, step S3 further includes the following steps:
s301: assemble the output features of the sub-capsules into a part capsule:

P^k = α · K̂^k + β · V̂^k

where P^k denotes the part capsule produced from F_S^k in the source domain, and α and β are two weight parameters, learned automatically by the client's network security event inference model during training, that control the weights of the key capsule and the value capsule within the features;
s302: execute steps S201, S202 and S301 for the remaining subsets of the source domain F_S, splice the part capsules together, and feed them into the inference module for decoding, i.e. convert the semantic-information result of the encoding stage into a three-dimensional embedded representation through upsampling:

F_Decoder = Decoder( Cat( P^1, …, P^K ) )

where Cat denotes the part-capsule splicing operation, Decoder denotes the decoder, composed of four 3×3 convolutions, and F_Decoder is the three-dimensional embedded representation.
As a preferred embodiment of the present invention, step S3 further includes the following steps:
s303: generate the network security emergency plan: the decoded features are skip-connected with the sub-capsule features of the encoding stage, and the plan is constructed according to the time-sequence information; it is denoted Play, and one plan is assumed to contain no more than m events to be handled:

Play = { [a_0, t_0], [a_1, t_1], …, [a_{m-1}, t_{m-1}] }

where, in [a_{m-1}, t_{m-1}], a_{m-1} denotes an event to be handled and t_{m-1} the order of that event in the network security emergency plan; P^j denotes the part capsule produced from the j-th source-domain subset F_S^j; ⊕ denotes the skip connection; FAM denotes a feature-aggregation layer composed of a 3×3 convolution and double upsampling; and PAM denotes a pyramid pooling layer for processing feature vectors of different shapes;
s304: the overall loss function of the server-side network security event inference model combines reconstruction terms and the inference-module loss, with DICE as the similarity-measure function:

DICE( x, y ) = 2 · x^T y / ( ||x||^2 + ||y||^2 )

where K^k and K̂^k denote the key capsule in the k-th source-domain subset F_S^k and its feature-reconstructed counterpart, and V^k and V̂^k denote the value capsule in F_S^k and its reconstructed counterpart;
The inference-module loss function computes, for the k-th source-domain subset F_S^k, the relative entropy between the part capsule P^k produced from F_S^k and the p-th event to be handled [a_p, t_p] in the network security emergency plan;
s305: the parameters of the server-side network security event inference model are sent to each client Client_i, 0 < i < M + 1, i.e. there are M clients in total.
As a preferred embodiment of the present invention, step S4 further includes:
s401: for each client Client_i, fix the parameters of the sub-capsule encoders and the local reconstruction module of the server-side network security event inference model, and train only the inference module and the reinforced inference module;
for the i-th client Client_i, the corresponding target domain is Clog_i; a domain-alignment loss χ based on information entropy is used:

χ = E_{F_S}[ KL( F_S^k ‖ Clog_i ) ]

where E_{F_S} denotes the mathematical expectation over the source domain F_S with respect to the target domain Clog_i on the i-th client, and KL( F_S^k ‖ Clog_i ) is the relative entropy between each subset F_S^k of the source domain and Clog_i:

KL( p ‖ q ) = Σ_x p(x) · log( p(x) / q(x) )

where log denotes the logarithm operation;
s402: a reinforced inference module is introduced into the client's network security event inference model; the parameters of the reinforced inference module differ from client to client, and the module refines the original result of the network security event inference model according to the local configuration, improving the accuracy of network security event handling and the robustness of the model.
As a preferred embodiment of the present invention, step S4 further includes:
s403: when the client is the 1st client, the final network security emergency plan generated on it is:

Play_1 = Refine( G(Play) ⊕ g(Clog_1) )

where Play denotes the network security emergency plan generated by the client-side network security event inference model under the server's guidance, Play_1 the final network security emergency plan on the 1st client, Clog_1 the target domain on the 1st client, G and g are 1×1 convolution layers, and Refine is a reinforced inference layer composed of two groups of activation functions and 3×3 convolutions joined by residual connections;
s404: the loss function of the reinforced inference module is defined on the refined plan Play_1 produced by the Refine layer;
s405: the total loss function for model training on the 1st client combines L_s, the loss of the inference module of the network security event inference model on the server, L_c, the loss of the reinforced inference module of the model on the client, and χ, the entropy-based domain-alignment loss, where ||·||_2 denotes the vector 2-norm operation.
As a preferred embodiment of the present invention, step S5 further includes: when the server-side source domain is updated in the future, the server fine-tunes the network security event inference model using these parameters; the server and the clients exchange only model parameters, and no private information is involved.
Different from the prior art, the technical scheme has the following beneficial effects:
(1) The method treats the network security events of different terminals as multiple target domains, while the server side uses the network security knowledge graph as the source domain. The server trains the model and transmits the model parameters to the clients, thereby guiding inference on each client: only model-parameter information passes between server and clients, and private information such as network security log files is never transmitted, yet each client can analyze and infer network security events and automatically match and execute a network security emergency-response plan.
(2) Conventional network security knowledge graphs contain a large number of redundant features that interfere with model training and degrade the model's generalization; the knowledge-graph redundancy processing module removes these redundant features and retains only highly correlated ones, improving the model's generalization.
Drawings
FIG. 1 is a diagram illustrating the overall architecture of a method according to an embodiment;
FIG. 2 is a diagram of a server side architecture in accordance with an embodiment;
fig. 3 is a diagram of a client architecture in accordance with an embodiment.
Detailed Description
To explain technical contents, structural features, and objects and effects of the technical solutions in detail, the following detailed description is given with reference to the accompanying drawings in conjunction with the embodiments.
Referring to fig. 1 to 3, as shown in the figure, the present embodiment provides a security hosting service method based on a knowledge graph and domain adaptation, including the following steps:
s1: preparing a set of network emergency-response knowledge graphs and inputting it into a knowledge-graph redundancy processing module, which removes redundant features across the different knowledge graphs through a graph capsule neural network with adaptive feature selection, extracts highly correlated features, and fuses them into a new feature set serving as the source domain on the server side;
s2: training a network security event inference model on the source domain at the server side, the model encoding the features in the source domain as sub-capsules and enriching the encoded semantic information through a local reconstruction module;
s3: assembling multiple sub-capsules into part capsules and feeding them into an inference module for decoding; the inference module generates a network security emergency-response plan by decoding the semantic information, and the parameters of the trained network security event inference model are broadcast to each client;
s4: each client taking its network security log file as a target domain, so that a different source domain-target domain pair is constructed for each client, and training its own network security event inference model; during inference a reinforced inference module is added to each client's model, and a suitable network security emergency-response plan is selected according to the result of the reinforced inference module;
s5: after the training of a client's network security event inference model is finished, the client uploads the parameters of the reinforced inference module in its local network security event inference model to the server.
In the above embodiment, S1 further includes the steps of:
s101: given a set of network emergency-response knowledge graphs, denoted S_N, take the first knowledge graph S_1 in S_N, perform redundancy processing and relevant-feature selection on the features in the knowledge graph, and construct node capsules u_a^p and u_b^p for node a and node b of the p-th layer of S_1, respectively;
As shown in the knowledge-graph redundancy process in fig. 2, S102: calculate the feature-mapping vectors f_a^p and f_b^p of node a and node b from S101 through the feature-mapping layer of the graph capsule neural network; that is, the features of the two nodes and the relationship between them are expressed by the feature-mapping vectors, achieving redundancy removal:

f_a^p = MLP( Σ_{i=0}^{m} u_{a_i}^p ),  f_b^p = MLP( Σ_{j=0}^{n} u_{b_j}^p )

where u_{a_i}^p denotes the node capsule of the i-th neighbor of node a at layer p, m is the number of neighbors of node a (i = 0 denotes the node capsule of node a itself); u_{b_j}^p denotes the node capsule of the j-th neighbor of node b at layer p, n is the number of neighbors of node b (j = 0 denotes the node capsule of node b itself). MLP is a multilayer perceptron; feature representations of node capsules with low correlation are discarded and those with high correlation are retained, achieving adaptive feature selection.
Measuring the correlation of two node capsules by using a mutual information function Kinfo, wherein the expression is as follows:
wherein,to representThe transpose of (a) is performed,to representExp represents an exponential function with a natural constant e as a base number;
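As a toy sketch of the correlation measure described above, assuming Kinfo is simply the exponential of the inner product of the two feature-mapping vectors (the function name `kinfo` and the plain-list representation are illustrative):

```python
import math

def kinfo(fa, fb):
    """Hypothetical mutual-information-style correlation between two
    node-capsule feature-mapping vectors: exp of their inner product."""
    dot = sum(x * y for x, y in zip(fa, fb))
    return math.exp(dot)
```

Identical unit vectors yield exp(1), while orthogonal vectors yield exp(0) = 1, so larger values indicate more redundant (more correlated) feature mappings.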
s103: execute step S102 for any two nodes in the same layer of the knowledge graph S_1, adaptively select the node feature mappings within each layer, and remove feature mappings with high redundancy between layers until all layers have been processed, yielding the compressed node feature-mapping set of S_1, where f_r denotes the feature-mapping set of the r-th layer, f_s that of the s-th layer, and softmax the normalized exponential function used in the selection;
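The adaptive selection in S103 can be sketched as follows; the per-map redundancy scores, the softmax normalization, and the `keep_ratio` cutoff are all illustrative assumptions rather than the patent's exact procedure:

```python
import math

def softmax(xs):
    """Normalized exponential function over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def select_feature_maps(feature_maps, redundancy, keep_ratio=0.5):
    """Drop the most redundant feature maps.

    `redundancy[i]` is a hypothetical Kinfo-derived score for
    feature_maps[i]; scores are softmax-normalized and the maps with
    the lowest scores (least redundant) are kept."""
    scores = softmax(redundancy)
    order = sorted(range(len(feature_maps)), key=lambda i: scores[i])
    keep = sorted(order[: max(1, int(len(feature_maps) * keep_ratio))])
    return [feature_maps[i] for i in keep]
```

With four maps and scores [0.1, 5.0, 0.2, 4.0], the two low-redundancy maps survive the cut.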
s104: repeat steps S102 to S103 for the remaining knowledge graphs in S_N, yielding the feature-mapping set F_S of all knowledge graphs; as shown in the network security event inference model of fig. 2, F_S is used as the source domain for training the server-side network security event inference model.
In the above embodiment, the S2 includes the steps of:
s201: the server-side network security event inference model uses two capsule encoders based on a self-attention mechanism to encode the network security events F_S^k in the source domain, generating two sub-capsules K^k and V^k:

K^k = Encoder_key( F_S^k ),  V^k = Encoder_value( F_S^k )

where Encoder_key is the key-capsule feature extractor and Encoder_value the value-capsule feature extractor, each composed of a residual network-50 (ResNet-50);
s202: the local reconstruction module performs feature reconstruction on the two sub-capsules so that they carry richer semantic information, where K̂^k and V̂^k denote the key capsule and value capsule after feature reconstruction, and τ and μ denote the feature-reconstruction vectors, learned automatically by the client's network security event inference model during training.
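A minimal sketch of the local reconstruction step, assuming the learned reconstruction vector acts as an element-wise residual added to the capsule features (the actual learned form is not specified in the text, and the function name is illustrative):

```python
def reconstruct(capsule, recon_vec):
    """Hypothetical local reconstruction: enrich a sub-capsule's
    features with a learned reconstruction vector (tau or mu) via an
    element-wise residual addition."""
    return [c + r for c, r in zip(capsule, recon_vec)]
```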
In the above embodiment, the S3 further includes the following steps:
s301: assemble the output features of the sub-capsules into a part capsule:

P^k = α · K̂^k + β · V̂^k

where P^k denotes the part capsule produced from F_S^k in the source domain, and α and β are two weight parameters, learned automatically by the client's network security event inference model during training, that control the weights of the key capsule and the value capsule within the features;
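The assembly of key and value sub-capsules into a part capsule can be sketched as a weighted combination; the linear form and the function name are assumptions based on the description of the two weight parameters:

```python
def assemble_part_capsule(key_caps, value_caps, alpha, beta):
    """Hypothetical part-capsule assembly: a linear combination of the
    reconstructed key and value capsules, with learned weights alpha
    and beta controlling their respective contributions."""
    return [alpha * k + beta * v for k, v in zip(key_caps, value_caps)]
```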
s302: execute steps S201, S202 and S301 for the remaining subsets of the source domain F_S, splice the part capsules together, and feed them into the inference module for decoding, i.e. convert the semantic-information result of the encoding stage into a three-dimensional embedded representation through upsampling:

F_Decoder = Decoder( Cat( P^1, …, P^K ) )

where Cat denotes the part-capsule splicing operation, Decoder denotes the decoder, composed of four 3×3 convolutions, and F_Decoder is the three-dimensional embedded representation;
in the above embodiment, the S3 further includes the following steps:
s303: the second operation in the inference module is network security emergency plan generation: the decoded features are skip-connected with the sub-capsule features of the encoding stage, and the plan is constructed according to the time-sequence information; it is denoted Play, and one plan is assumed to contain no more than m events to be handled:

Play = { [a_0, t_0], [a_1, t_1], …, [a_{m-1}, t_{m-1}] }

where, in [a_{m-1}, t_{m-1}], a_{m-1} denotes an event to be handled and t_{m-1} the order of that event in the network security emergency plan; P^j denotes the part capsule produced from the j-th source-domain subset F_S^j; ⊕ denotes the skip connection; FAM denotes a feature-aggregation layer composed of a 3×3 convolution and double upsampling; and PAM denotes a pyramid pooling layer, which makes it convenient to process feature vectors of different shapes;
s304: the overall loss function of the server-side network security event inference model combines reconstruction terms and the inference-module loss, with DICE as the similarity-measure function:

DICE( x, y ) = 2 · x^T y / ( ||x||^2 + ||y||^2 )

where K^k and K̂^k denote the key capsule in the k-th source-domain subset F_S^k and its feature-reconstructed counterpart, and V^k and V̂^k denote the value capsule in F_S^k and its reconstructed counterpart;
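A sketch of the DICE similarity measure in its standard vector form, 2·<a,b> / (||a||² + ||b||²), applied here to a capsule and its reconstruction; the `eps` guard against division by zero is an added numerical-stability assumption:

```python
def dice(a, b, eps=1e-8):
    """DICE similarity between two feature vectors: close to 1 when
    the reconstructed capsule matches the original, 0 when orthogonal."""
    num = 2.0 * sum(x * y for x, y in zip(a, b))
    den = sum(x * x for x in a) + sum(y * y for y in b) + eps
    return num / den
```

A reconstruction loss would then maximize DICE (or minimize 1 − DICE) between each capsule and its reconstructed counterpart.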
The inference-module loss function computes, for the k-th source-domain subset F_S^k, the relative entropy between the part capsule P^k produced from F_S^k and the p-th event to be handled [a_p, t_p] in the network security emergency plan;
In this embodiment, the MSS sends the parameters of the server-side network security event inference model to each client Client_i, 0 < i < M + 1, i.e. there are M clients in total.
In the above embodiment, S4 further includes the step of:
for each Client i As shown in fig. 3, the sub-capsule coding and local reconstruction module of the network security event inference model at the server end is fixed, and only the inference module and the reinforced inference module are trained;
client side of the ith station i For example, the corresponding target domain is Clog i To solve the problem of inconsistent content distribution in the source domain and the target domain, the present embodiment uses a domain alignment loss χ based on the information entropy, and the expression is:
wherein,representing a source domain F S For the target domain Clog on the ith client i Mathematical expectation value of, hereIs each subset in the source domainAnd a target domain Clog on the ith client i Relative entropy between, the expression:
wherein, log represents a logarithmic operation;
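The domain alignment loss above, a mathematical expectation of the relative entropy between each source-domain subset and the target domain, can be sketched as a mean KL divergence; the sketch assumes the feature distributions are plain normalized probability lists, which is an illustrative simplification.

```python
import math

# Sketch: information-entropy-based domain alignment loss as the mean
# relative entropy (KL divergence) between each source-domain subset's
# feature distribution and the target-domain distribution.
def kl_divergence(p, q, eps=1e-12):
    # eps keeps the log finite when a probability is zero
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def domain_alignment_loss(source_subsets, target):
    kls = [kl_divergence(p, target) for p in source_subsets]
    return sum(kls) / len(kls)

# identical distributions give a loss near 0; mismatched ones a positive loss
```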
because the hardware and software environments of the clients differ, a reinforced inference module is additionally introduced into the client-side network security event inference model; the parameters of the reinforced inference module differ across clients, and it re-infers over the original output of the network security event inference model according to the local machine's configuration, which improves both the accuracy of network security event handling and the robustness of the model;
taking the 1st client as an example, the final network security emergency plan generated by the 1st client is expressed as:
wherein, represents the network security emergency plan generated by the client-side network security event inference model under the server's guidance, represents the final network security emergency plan on the 1st client, Clog_1 represents the target domain on the 1st client, G and G are 1×1 convolution layers, and Refine is a reinforced inference layer formed by two groups of activation functions and 3×3 convolutions connected through a residual connection;
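The Refine layer just described, two (activation, convolution) groups wrapped in a residual connection, can be sketched in a toy 1-D form; the 3-tap kernel weights below are illustrative assumptions, not values from the patent.

```python
# Toy 1-D sketch of the Refine layer: two (ReLU -> 3-wide conv) groups
# wrapped in a residual connection. Kernel weights are illustrative.
def relu(v):
    return [max(0.0, x) for x in v]

def conv3(v, k):
    # 'same'-size 1-D convolution with a 3-tap kernel and zero padding
    padded = [0.0] + list(v) + [0.0]
    return [sum(k[j] * padded[i + j] for j in range(3)) for i in range(len(v))]

def refine(x, k1=(0.25, 0.5, 0.25), k2=(0.25, 0.5, 0.25)):
    out = conv3(relu(x), k1)
    out = conv3(relu(out), k2)
    return [a + b for a, b in zip(x, out)]  # residual connection

# refine([1.0, 2.0, 3.0]) -> [2.0, 3.75, 4.5]
```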
the loss function of the reinforced inference module is expressed as:
wherein,
thus, the total loss function for model training on the 1st client is:
wherein represents the loss function of the inference module of the network security event inference model on the server, represents the loss function of the reinforced inference module of the network security event inference model on the client, χ is the information-entropy-based domain alignment loss, and ‖·‖₂ represents the vector 2-norm operation.
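The combination of the three terms into a total client-side training loss can be sketched as below; the unweighted sum is an assumption, since the exact combination appears only in the patent's loss-function figure.

```python
import math

# Sketch: total client-side training loss as a sum of the inference-
# module loss, the Refine-module loss, and the 2-norm of the entropy-
# based domain alignment term. Equal weighting is an assumption.
def total_loss(l_inference, l_refine, chi_vec):
    chi_norm = math.sqrt(sum(c * c for c in chi_vec))  # vector 2-norm
    return l_inference + l_refine + chi_norm

# total_loss(1.0, 2.0, [3.0, 4.0]) -> 8.0 (since ||(3,4)||_2 = 5)
```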
After training of the client-side network security event inference model is completed, the parameters of the reinforced inference module are uploaded to the server, as shown in fig. 1. When the server-side source domain is updated in the future, the server can fine-tune the network security event inference model with these parameters; the server and the clients exchange only model parameters and never private information.
To verify the accuracy of the method, the Malware Training Sets and MITRE D3FEND (a network emergency response knowledge graph) were used. The Malware Training Sets are a machine learning dataset intended to provide a useful classification dataset for researchers who wish to apply machine learning techniques to in-depth malware analysis. A comparison experiment was formed with 4 different model structures and the proposed method, computing the accuracy of semantic feature similarity on the dataset. The experimental results are shown in the table below, where the F1 value = 2 × precision × recall / (precision + recall) characterizes the harmonic mean of precision and recall.
Compared with the BiGRU (bidirectional gated recurrent unit), Siamese-BiGRU (twin neural network with bidirectional gated recurrent units), Linkage (hierarchical clustering), and BERT + WMD distance (self-encoding language model) baselines, the proposed method achieves higher accuracy: precision reaches 87.1% and recall reaches 85.1%, indicating that the method can infer more valid samples; the F1 value, the harmonic mean of precision and recall, reaches 86.1%. The experimental results demonstrate that the method can effectively reason over network security events and generate network security emergency plans.
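The reported F1 value can be checked directly from the stated precision and recall using the harmonic-mean formula:

```python
# Check of the reported metrics: F1 = 2 * precision * recall /
# (precision + recall), using the precision and recall from the text.
def f1_score(precision, recall):
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.871, 0.851)  # ~0.861, matching the reported 86.1%
```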
In addition, the method treats the network security events of different terminals as multiple target domains, while the server side uses the network security knowledge graph as the source domain. The model is trained on the server side and its parameters are transmitted to the clients to guide inference over the target domains on the different clients; that is, only model parameter information is transmitted between the server and the clients, never private information such as network security log files, so network security events can be analyzed and inferred on the client side and network security emergency response plans automatically matched and executed. A conventional network security knowledge graph contains many redundant features, which interfere with model training and degrade the model's generalization.
It should be noted that, although the above embodiments have been described herein, the invention is not limited thereto. Therefore, based on the innovative concepts of the present invention, the technical solutions of the present invention can be directly or indirectly applied to other related technical fields by changing and modifying the embodiments described herein or by using the equivalent structures or equivalent processes of the content of the present specification and the attached drawings, and are included in the scope of the present invention.
Claims (6)
1. A safe hosting service method based on knowledge graph and domain adaptation is characterized by comprising the following steps:
s1: preparing a network emergency response knowledge graph set, inputting the network emergency response knowledge graph set into a knowledge graph redundancy processing module, removing redundant features in different knowledge graphs through a graph capsule neural network with self-adaptive feature selection by the knowledge graph redundancy processing module, extracting features with high correlation, and fusing the network emergency response knowledge graph set into a new feature set serving as a source domain of a server side;
s2: training a network security event inference model by using a source domain at a server side, wherein the network security event inference model adopts sub-capsules to encode the characteristics in the source domain, and strengthens the semantic information after encoding through a local reconstruction module;
s3: assembling a plurality of sub-capsules into component capsules, inputting the component capsules into a reasoning module for decoding, generating a network security emergency response plan by the reasoning module through decoding semantic information, and broadcasting parameters of a trained network security event reasoning model to each client;
s4: each client takes the network security log file as a target domain, different source domain-target domain pairs are constructed for different clients, a network security event reasoning model of the client is trained, a reinforced reasoning module is added to the network security event reasoning model of each client during reasoning, and a proper network security emergency response plan is selected according to the result of the reinforced reasoning module;
s5: after the training of the network security event inference model of the client is finished, the client uploads the parameters of the reinforced inference module in the local network security event inference model to the server;
the S1 further comprises the following steps:
s101: given a set of network emergency response knowledge graphs, denoted S_N, for the first knowledge graph S_1 in S_N, perform redundancy processing and correlation feature selection on the features in the knowledge graph; construct node capsules for node a and node b at the pth layer of S_1, respectively, and ;
s102: calculate the feature mapping vectors of node a and node b through the feature mapping layer of the graph capsule neural network, with the expression:
wherein, represents the node capsule of the ith neighbor node of node a at the pth layer, m represents the number of neighbor nodes of node a (when i = 0, it represents the node capsule of node a itself); represents the node capsule of the jth neighbor node of node b at the pth layer, n is the number of neighbor nodes of node b (when j = 0, it represents the node capsule of node b itself); and MLP is a multilayer perceptron;
measuring the correlation of two node capsules by using a mutual information function Kinfo, wherein the expression is as follows:
wherein, represents the transpose of , and exp represents the exponential function with the natural constant e as its base;
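The Kinfo correlation between two node-capsule feature vectors involves a transpose and an exponential, which suggests an exp-of-inner-product form; the sketch below assumes exactly that, since the patent's full Kinfo expression appears only in its figures.

```python
import math

# Hedged sketch of a mutual-information-style correlation score between
# two node-capsule feature vectors: exp(u^T v). The exact Kinfo
# expression in the patent may include further normalization.
def kinfo(u, v):
    return math.exp(sum(a * b for a, b in zip(u, v)))

# aligned capsules score higher than anti-aligned ones
```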
s103: execute step S102 for any two nodes in the same layer of knowledge graph S_1, adaptively select the node feature mappings in each layer, and remove highly redundant feature mappings between layers until all layers have been computed, obtaining the compressed node feature mapping set of S_1, with the expression:
wherein, f_r represents the feature mapping set of the rth layer, f_s represents the feature mapping set of the sth layer, and softmax represents the normalized exponential function;
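The softmax-driven adaptive selection of feature mappings can be sketched as scoring each mapping, normalizing the scores with softmax, and keeping the highest-weighted mappings; the relevance scores and keep ratio below are illustrative assumptions.

```python
import math

# Sketch: adaptive feature-map selection via softmax. Each feature
# mapping gets a relevance score; softmax normalizes the scores and
# only the highest-weighted mappings are kept (the rest are dropped
# as redundant). Scores and keep_ratio are illustrative.
def softmax(scores):
    m = max(scores)                      # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def select_features(feature_maps, scores, keep_ratio=0.5):
    weights = softmax(scores)
    ranked = sorted(zip(weights, feature_maps), reverse=True)
    k = max(1, int(len(feature_maps) * keep_ratio))
    return [fm for _, fm in ranked[:k]]

# keeps the two mappings with the highest relevance scores
```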
s104: cycle through steps S102 to S103 for the remaining knowledge graphs in S_N to obtain the feature mapping set F_S of all knowledge graphs; use F_S as the source domain to train the network security event inference model at the server side;
the S2 comprises the following steps:
s201: the server-side network security event inference model uses two capsule encoders based on a self-attention mechanism to encode the features in the source domain, generating two sub-capsules and , with the expression:
wherein, Encoder_key is a key-capsule feature extractor composed of a ResNet-50, and Encoder_value is a value-capsule feature extractor, also composed of a ResNet-50;
s202: perform feature reconstruction on the two sub-capsules with a local reconstruction module to enrich the semantic information, with the expression:
wherein, and represent the key capsule and the value capsule after feature reconstruction, respectively, and τ and μ represent feature reconstruction vectors, which are obtained through automatic learning by the client's network security event inference model during training.
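The local reconstruction step applies the learned vectors τ and μ to a sub-capsule to enrich its semantics; the elementwise-affine form below is an assumption, since the patent gives the exact expression only in its figures.

```python
# Hedged sketch of local reconstruction: the learned vectors tau and mu
# are applied elementwise to a sub-capsule's features. The affine form
# (tau * c + mu) is an illustrative assumption.
def reconstruct(capsule, tau, mu):
    return [t * c + m for c, t, m in zip(capsule, tau, mu)]

# tau = 1, mu = 0 leaves the capsule unchanged
```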
2. The knowledge-graph and domain-adaptation based secure hosting service method according to claim 1, wherein the S3 further comprises the steps of:
s301: assemble the output features of the sub-capsules into part capsules, with the expression:
wherein, represents the part capsule generated from in the source domain, and represent two weight parameters, which are learned automatically by the client's network security event inference model during training and control the weights of the key capsule and the value capsule in the features;
s302: execute steps S201, S202, and S301 for the remaining subsets of the source domain F_S; splice the part capsules together and input them into the inference module, where decoding is performed, i.e., the semantic information from the encoding stage is converted into a three-dimensional embedded representation through upsampling, with the expression:
wherein, Cat represents the part-capsule splicing operation, Decoder is composed of four 3×3 convolutions, and F_Decoder is the three-dimensional embedded representation.
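The part-capsule assembly of S301, a combination of the key and value capsules under two learned weight parameters, can be sketched as a weighted sum; the fixed weights below are illustrative stand-ins for the weights the model learns during training.

```python
# Sketch: assembling a part capsule as a weighted combination of the
# reconstructed key and value capsules. The two weights are learned
# during training in the method; fixed values here are illustrative.
def assemble_part_capsule(key_caps, value_caps, w_key=0.6, w_value=0.4):
    return [w_key * k + w_value * v for k, v in zip(key_caps, value_caps)]

# each output feature mixes the key and value capsules by the weights
```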
3. The knowledge-graph and domain-adaptation based secure hosting service method according to claim 2, wherein the S3 further comprises the steps of:
s303: generate the network security emergency plan: skip-connect the decoded features with the sub-capsule features from the encoding stage and construct the network security emergency plan, denoted Play, according to the temporal order information, with at most m events to be handled in one network security emergency plan; the expression is:
wherein, in [a_{m-1}, t_{m-1}], a_{m-1} represents an event to be handled and t_{m-1} represents the order of that event in the network security emergency plan; represents the part capsule generated from the jth subset of the source domain; represents skip connection; FAM represents a feature aggregation layer composed of a 3×3 convolution and double upsampling; and PAM represents a pyramid pooling layer for processing feature vectors of different shapes;
s304: the overall loss function of the server-side network security event inference model is expressed as:
wherein, DICE is a similarity measure function, and its expression is:
representing the key capsule in the kth subset of the source domain and the key capsule after feature reconstruction, respectively; representing the value capsule in the kth subset of the source domain and the value capsule after reconstruction, respectively;
the expression of the inference-module loss function is:
wherein, is used for computing the relative entropy between the part capsule generated from the kth subset of the source domain and the pth event to be handled [a_p, t_p] in the network security emergency plan;
s305: send the parameters of the server-side network security event inference model to each client Client_i, 0 < i < M+1, where M is the total number of clients.
4. The knowledge-graph and domain-adaptation based secure hosting service method according to claim 3, wherein the S4 further comprises the steps of:
s401: for each client Client_i, fix the parameters of the sub-capsule encoding and local reconstruction modules of the server-side network security event inference model, and train the inference module and the reinforced inference module;
taking the ith client Client_i as an example, the corresponding target domain is Clog_i; using the information-entropy-based domain alignment loss χ, the expression is:
wherein, represents the mathematical expectation of the source domain F_S with respect to the target domain Clog_i on the ith client; here, is the relative entropy between each subset of the source domain and the target domain Clog_i on the ith client, with the expression:
wherein, log represents a logarithmic operation;
s402: a reinforced inference module is introduced into the client-side network security event inference model; the parameters of the reinforced inference module differ across clients, and it re-infers over the original output of the network security event inference model according to the local configuration, which improves the accuracy of network security event handling and the robustness of the model.
5. The knowledge-graph and domain-adaptation based secure hosting service method of claim 4, wherein the S4 further comprises the steps of:
s403: when the client is the 1st client, the final network security emergency plan generated by the 1st client is expressed as:
wherein, represents the network security emergency plan generated by the client-side network security event inference model under the server's guidance, represents the final network security emergency plan on the 1st client, Clog_1 represents the target domain on the 1st client, G and G are 1×1 convolution layers, and Refine is a reinforced inference layer formed by two groups of activation functions and 3×3 convolutions connected through a residual connection;
s404: the loss function of the reinforced inference module is expressed as:
wherein,
s405: the total loss function for model training on the 1st client is:
wherein represents the loss function of the inference module of the network security event inference model on the server, represents the loss function of the reinforced inference module of the network security event inference model on the client, χ is the information-entropy-based domain alignment loss, and ‖·‖₂ represents the vector 2-norm operation.
6. The knowledge-graph and domain-adaptation based secure hosting service method according to claim 1, wherein the S5 further comprises the following step: when the source domain at the server side is updated in the future, the server fine-tunes the network security event inference model with the uploaded parameters; the server and the client exchange only model parameters without involving private information.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211083553.9A CN115146299B (en) | 2022-09-06 | 2022-09-06 | Safety trusteeship service method based on knowledge graph and domain adaptation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115146299A CN115146299A (en) | 2022-10-04 |
CN115146299B true CN115146299B (en) | 2022-12-09 |
Family
ID=83416090
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211083553.9A Active CN115146299B (en) | 2022-09-06 | 2022-09-06 | Safety trusteeship service method based on knowledge graph and domain adaptation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115146299B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111522965A (en) * | 2020-04-22 | 2020-08-11 | 重庆邮电大学 | Question-answering method and system for entity relationship extraction based on transfer learning |
CN112231489A (en) * | 2020-10-19 | 2021-01-15 | 中国科学技术大学 | Knowledge learning and transferring method and system for epidemic prevention robot |
CN112883200A (en) * | 2021-03-15 | 2021-06-01 | 重庆大学 | Link prediction method for knowledge graph completion |
CN114491541A (en) * | 2022-03-31 | 2022-05-13 | 南京众智维信息科技有限公司 | Safe operation script automatic arrangement method based on knowledge graph path analysis |
Also Published As
Publication number | Publication date |
---|---|
CN115146299A (en) | 2022-10-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112633010B (en) | Aspect-level emotion analysis method and system based on multi-head attention and graph convolution network | |
CN110597991A (en) | Text classification method and device, computer equipment and storage medium | |
CN110489567B (en) | Node information acquisition method and device based on cross-network feature mapping | |
CN116050401B (en) | Method for automatically generating diversity problems based on transform problem keyword prediction | |
CN114926770B (en) | Video motion recognition method, apparatus, device and computer readable storage medium | |
CN111368545A (en) | Named entity identification method and device based on multi-task learning | |
DE102021004562A1 (en) | Modification of scene graphs based on natural language commands | |
WO2023155546A1 (en) | Structure data generation method and apparatus, device, medium, and program product | |
CN116821291A (en) | Question-answering method and system based on knowledge graph embedding and language model alternate learning | |
CN114239675A (en) | Knowledge graph complementing method for fusing multi-mode content | |
CN115146299B (en) | Safety trusteeship service method based on knowledge graph and domain adaptation | |
CN114861907A (en) | Data calculation method, device, storage medium and equipment | |
CN117010494B (en) | Medical data generation method and system based on causal expression learning | |
CN117787343A (en) | Long-sequence prediction method and device for microblog topic trend and computer storage medium | |
CN113377656A (en) | Crowd-sourcing recommendation method based on graph neural network | |
CN112288154A (en) | Block chain service reliability prediction method based on improved neural collaborative filtering | |
CN115422376B (en) | Network security event source tracing script generation method based on knowledge graph composite embedding | |
Wu et al. | Spiking neural P systems with communication on request and mute rules | |
CN116543339A (en) | Short video event detection method and device based on multi-scale attention fusion | |
CN113849641B (en) | Knowledge distillation method and system for cross-domain hierarchical relationship | |
CN115167863A (en) | Code completion method and device based on code sequence and code graph fusion | |
CN114333069A (en) | Object posture processing method, device, equipment and storage medium | |
CN117808944B (en) | Method and device for processing text action data of digital person, storage medium and electronic device | |
CN117808083B (en) | Distributed training communication method, device, system, equipment and storage medium | |
CN113609280B (en) | Multi-domain dialogue generation method, device, equipment and medium based on meta learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
Effective date of registration: 20230825 Address after: Room 3-3, No.1 Guanghua East Street, Qinhuai District, Nanjing City, Jiangsu Province, 210000 Patentee after: Big data Security Technology Co.,Ltd. Address before: 211300 No. 3, Longjing Road, Gaochun District, Nanjing, Jiangsu Patentee before: NANJING ZHONGZHIWEI INFORMATION TECHNOLOGY Co.,Ltd. |