CN115146299B - Safety trusteeship service method based on knowledge graph and domain adaptation - Google Patents

Safety trusteeship service method based on knowledge graph and domain adaptation Download PDF

Info

Publication number
CN115146299B
CN115146299B CN202211083553.9A CN202211083553A CN115146299B CN 115146299 B CN115146299 B CN 115146299B CN 202211083553 A CN202211083553 A CN 202211083553A CN 115146299 B CN115146299 B CN 115146299B
Authority
CN
China
Prior art keywords
network security
client
representing
capsule
domain
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211083553.9A
Other languages
Chinese (zh)
Other versions
CN115146299A (en
Inventor
孙捷
车洵
梁小川
胡牧
金奎�
孙翰墨
程佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Big Data Security Technology Co ltd
Original Assignee
Nanjing Zhongzhiwei Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Zhongzhiwei Information Technology Co ltd filed Critical Nanjing Zhongzhiwei Information Technology Co ltd
Priority to CN202211083553.9A priority Critical patent/CN115146299B/en
Publication of CN115146299A publication Critical patent/CN115146299A/en
Application granted granted Critical
Publication of CN115146299B publication Critical patent/CN115146299B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/604Tools and structures for managing or administering access control systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/55Detecting local intrusion or implementing counter-measures
    • G06F21/554Detecting local intrusion or implementing counter-measures involving event detection and direct action
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/606Protecting data by securing the transmission between two devices or processes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/04Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Bioethics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Databases & Information Systems (AREA)
  • Animal Behavior & Ethology (AREA)
  • Automation & Control Theory (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a safe trusteeship service method based on knowledge graph and domain adaptation, comprising the following steps: inputting the prepared network emergency response knowledge graph set into a knowledge graph redundancy processing module, removing redundant features in different knowledge graphs, extracting features with high correlation, and using the processed features as a source domain of a server side; training a network security event inference model by using a source domain at a server side, and broadcasting parameters of the trained network security event inference model to each client side; each client takes the network security log file as a target domain, different source domain-target domain pairs are constructed for different clients, and a network security event inference model of the client is trained; after the training of the network security event inference model of the client is finished, the client uploads the parameters of the reinforced inference module in the local network security event inference model to the server; the method has the characteristic of effectively improving the processing efficiency of the network security event.

Description

Safety trusteeship service method based on knowledge graph and domain adaptation
Technical Field
The invention relates to the technical field of network security, in particular to a safe hosting service method based on a knowledge graph and domain adaptation.
Background
With the increasing innovation and development of internet technology, the network security problem is also more severe, the scale of network attack is increasingly organized, the attack means is continuously changed, diversification and structuralization are realized, and the network emergency response work is also more important.
At present, the traditional network emergency response carries out plan matching according to the content of an alarm event based on characteristics such as keywords or indexes, and the traditional means has the defects of low matching efficiency and low matching accuracy, so that the network security event which is increasingly complicated and has high comprehensive degree is difficult to solve. Meanwhile, the traditional means needs a large amount of manual work to participate in data analysis, has high requirements on data formatting of characteristic information, and is difficult to meet the accuracy requirement of plan matching after a network security incident occurs. The network security emergency response knowledge graph comprises a large amount of characteristic information of an attack means and a corresponding solution, the knowledge graph stores data through a graph structure, the data relation stored by the structure is a non-single relation, and redundant information in the graph is more, so that the subsequent processing is not facilitated. The safety hosting Service (MSS) submits part of heavy and repeated safety operation work to a professional cloud Service provider, and the professional safety operation team develops continuous analysis and operation Service. The enterprise turning to the security hosting service provider can relieve the pressure of the enterprise on information security every day, and by means of the advantages of the security hosting service provider in some security fields, the short boards of the enterprise in security construction or operation management can be supplemented, so that the security management efficiency is improved, and therefore, a security hosting service method based on knowledge graph and domain adaptation is urgently needed to be provided to solve the problems.
Disclosure of Invention
To achieve the above object, the inventor provides a safe hosting service method based on knowledge graph and domain adaptation, comprising the following steps:
s1: preparing a network emergency response knowledge graph set, inputting the network emergency response knowledge graph set into a knowledge graph redundancy processing module, removing redundant features in different knowledge graphs through a graph capsule neural network with self-adaptive feature selection, extracting features with high correlation, and fusing the network emergency response knowledge graph set into a new feature set serving as a source domain of a server side;
s2: training a network security event inference model by using a source domain at a server side, wherein the network security event inference model adopts sub-capsules to encode the characteristics in the source domain, and strengthens the semantic information after encoding through a local reconstruction module;
s3: assembling a plurality of sub-capsules into component capsules, inputting the component capsules into a reasoning module for decoding, generating a network security emergency response plan by the reasoning module through decoding semantic information, and broadcasting parameters of a trained network security event reasoning model to each client;
s4: each client takes the network security log file as a target domain, different source domain-target domain pairs are constructed for different clients, a network security event reasoning model of the client is trained, a reinforced reasoning module is added to the network security event reasoning model of each client during reasoning, and a proper network security emergency response plan is selected according to the result of the reinforced reasoning module;
s5: after the training of the network security event inference model of the client is finished, the client uploads the parameters of the reinforced inference module in the local network security event inference model to the server.
As a preferred embodiment of the present invention, the S1 further includes the steps of:
s101: giving a set of network emergency response knowledge graphs, denoted as S N For S N The first knowledge-graph S in 1 Performing redundant processing and relevant feature selection of features in the knowledge graph, and performing redundant processing and relevant feature selection on the knowledge graph S 1 Respectively constructing node capsules by using nodes a and nodes b of the middle p-th layer
Figure GDA0003912239630000021
And
Figure GDA0003912239630000022
s102: calculating the feature mapping vectors of the node a and the node b through the feature mapping layer of the capsule diagram neural network
Figure GDA0003912239630000023
The expression is as follows:
Figure GDA0003912239630000031
wherein,
Figure GDA0003912239630000032
a node capsule representing the ith neighbor node of the p-th layer node a, m represents the number of neighbor nodes of the node a, and represents the node capsule of the node a when i =0,
Figure GDA0003912239630000033
the node capsule represents the jth neighbor node of the pth layer of node b, n is the number of neighbor nodes of the node b, when j =0, the node capsule represents the node b, and MLP is a multilayer perceptron;
measuring the correlation of two node capsules by using a mutual information function Kinfo, wherein the expression is as follows:
Figure GDA0003912239630000034
wherein,
Figure GDA0003912239630000035
to represent
Figure GDA0003912239630000036
The transpose of (a) is performed,
Figure GDA0003912239630000037
represent
Figure GDA0003912239630000038
Exp represents an exponential function with a natural constant e as a base number;
s103: for knowledge graph S 1 Executing the step S102 on any two nodes in the same layer, carrying out self-adaptive selection on the node feature mapping in each layer, and removing the feature mapping with high redundancy between layersUntil all layers are calculated, and a compressed node feature mapping set in S1 is obtained
Figure GDA0003912239630000039
The expression is as follows:
Figure GDA00039122396300000310
wherein f is r Set of feature maps representing the r-th layer, f s Representing a feature mapping set of the s-th layer, and softmax representing a normalized exponential function;
s104: for S N The rest knowledge graphs in (1) are circulated from S102 to S103, and a feature mapping set F of all knowledge graphs is obtained S
Figure GDA00039122396300000311
F is to be S The source domain is used for training a network security event inference model of the server side.
As a preferred embodiment of the present invention, the S2 includes the steps of:
s201: the network security event reasoning model at the server side adopts two capsule encoders based on a self-attention mechanism to carry out reasoning on the network security events in the source domain
Figure GDA0003912239630000041
Encoding to generate two sub-capsules
Figure GDA0003912239630000042
And
Figure GDA00039122396300000413
the expression is as follows:
Figure GDA0003912239630000043
Figure GDA0003912239630000044
wherein Encoder key Is a key capsule feature extractor composed of a residual error network-50, encoder value Is a value capsule feature extractor, consisting of a residual error network-50;
s202: and (3) performing characteristic reconstruction on the two sub-capsules by using a local reconstruction module, wherein the characteristic reconstruction is used for abundant semantic information, and the expression is as follows:
Figure GDA0003912239630000045
Figure GDA0003912239630000046
wherein
Figure GDA0003912239630000047
Figure GDA0003912239630000048
And respectively representing the key capsule and the value capsule after the characteristic reconstruction, and representing the characteristic reconstruction vector by tau and mu, wherein the characteristic reconstruction vector is obtained by automatically learning a network security event inference model of the client during training.
As a preferred embodiment of the present invention, the S3 further includes the steps of:
s301: assembling the output characteristics of the sub-capsules into component capsules, wherein the expression is as follows:
Figure GDA0003912239630000049
wherein,
Figure GDA00039122396300000410
representing in the source domain
Figure GDA00039122396300000411
The resulting part-capsule is then ready for use,
Figure GDA00039122396300000412
representing two weight parameters, which are automatically learned by a network security event inference model of a client during training and are used for controlling the weight of a key capsule and a value capsule in the characteristics;
s302: for source field F S The other subsets execute the steps S201, S202 and S301, the component capsules are spliced together and then input into an inference module, and decoding is carried out in the inference module, namely, a semantic information result in a coding stage is converted into a three-dimensional embedded expression through upsampling, and the expression is as follows:
Figure GDA0003912239630000051
where Cat denotes the part capsule splicing operation and Decoder denotes the Decoder, consisting of 4 3 x 3-dimensional convolutions, F Decoder Is a three-dimensional embedded representation.
As a preferred embodiment of the present invention, the S3 further includes the steps of:
s303: generating a network security emergency plan, namely performing jump connection on the decoded features and the sub-capsule features in the encoding stage, constructing the network security emergency plan according to the time sequence information, recording the network security emergency plan as Play, setting that no more than m events to be treated exist in one network security emergency plan, wherein the expression is as follows:
Figure GDA0003912239630000052
wherein, in [ a ] m-1 ,t m-1 ]In (a) m-1 Representing an event to be treated, t m-1 Indicating the order of the events in the network security emergency protocol,
Figure GDA0003912239630000053
representing the jth source domain in the source domain
Figure GDA0003912239630000054
Produced byThe components of the capsule are taken together,
Figure GDA0003912239630000055
representing jump connection, FAM representing a feature aggregation layer which is composed of convolution of 3 x 3 and double upsampling, and PAM representing a pyramid pooling layer for processing feature vectors of different shapes;
s304: the overall loss function expression of the network security event inference model of the server side is as follows:
Figure GDA0003912239630000056
wherein, DICE is a similarity measure function, and its expression is:
Figure GDA0003912239630000061
Figure GDA0003912239630000062
Figure GDA0003912239630000063
representing the kth source domain in the source domain
Figure GDA0003912239630000064
The key capsules in (1) and the key capsules with reconstructed characteristics;
Figure GDA0003912239630000065
representing the kth source domain in the source domain
Figure GDA0003912239630000066
A medium value capsule and a reconstructed value capsule;
the expression of the inference module penalty function is:
Figure GDA0003912239630000067
wherein,
Figure GDA0003912239630000068
for computing the kth source domain in the source domain
Figure GDA0003912239630000069
Resulting component capsules
Figure GDA00039122396300000610
And the p-th event to be disposed [ a ] in the network security emergency plan p ,t p ]Relative entropy between;
s305: sending parameters of a network security event inference model of a server side to each Client side i And i is more than 0 and less than M +1, which indicates that M clients are shared.
As a preferred embodiment of the present invention, the S4 further includes:
s401: for each Client i Fixing the sub-capsule coding of the network security event inference model of the server end and the parameters of the local reconstruction module, and training the inference module and the reinforced inference module;
client of ith station i The corresponding target domain is Clog i Using the domain alignment penalty χ based on the information entropy, the expression is:
Figure GDA0003912239630000071
wherein,
Figure GDA0003912239630000072
representing a source domain F S For the target domain Clog on the ith client i Mathematical expectation value of, here
Figure GDA0003912239630000073
Is each subset in the source domain
Figure GDA0003912239630000074
And target domain Clog on the ith client i Relative entropy between, the expression:
Figure GDA0003912239630000075
wherein, log represents a logarithmic operation;
s402: the method is characterized in that a reinforced reasoning module is introduced into a network security time reasoning model of the client, parameters of the reinforced reasoning module in each client are different, and the reinforced reasoning module reasons an original result of the network security event reasoning model according to local configuration and is used for improving the accuracy rate of network security event handling and the robustness of the model.
As a preferred embodiment of the present invention, the S4 further includes:
s403: when the client is the 1 st client, the final network security emergency plan expression generated by the 1 st client is as follows:
Figure GDA0003912239630000076
wherein,
Figure GDA0003912239630000077
the representation server side guides the network security emergency plan generated by the network security event reasoning model in the client side,
Figure GDA0003912239630000078
represents the final network security emergency plan, clog, on the 1 st client 1 Representing a target domain on the 1 st client, wherein G and G are convolution layers of 1 × 1, and Refine is a reinforced inference layer which is formed by connecting two groups of activation functions and convolution of 3 × 3 through residual errors;
s404: the loss function expression of the reinforced reasoning module is as follows:
Figure GDA0003912239630000081
wherein,
Figure GDA0003912239630000082
s405: the total loss function for model training on the 1 st client is:
Figure GDA0003912239630000083
wherein
Figure GDA0003912239630000084
A loss function representing an inference module of a network security event inference model on the server,
Figure GDA0003912239630000085
a reinforced inference module loss function representing a network security event inference model on a client, a domain alignment loss based on information entropy, | | | | sweet wind 2 Representing a vector 2 norm operation.
As a preferred embodiment of the present invention, the S5 further includes: when the source domain of the server side is updated in the future, the server finely adjusts the network security event inference model according to the parameters, and the server and the client side only carry out model parameter interaction and do not relate to private information.
Different from the prior art, the technical scheme has the following beneficial effects:
(1) The method takes network security events of different terminals as a plurality of target domains, the server side uses a network security knowledge graph as a source domain, the server side trains a model well and transmits model parameters to the client side, so that inference on different client sides is guided, namely, only model parameter information is transmitted between the server and the client side without transmitting privacy information such as network security log files, the network security events can be analyzed and inferred through the client side, and a network security emergency response plan is automatically matched and disposed.
(2) The conventional network security knowledge graph has a large number of redundant features, the redundant features interfere model training, and accordingly the generalization effect of the model is poor.
Drawings
FIG. 1 is a diagram illustrating the overall architecture of a method according to an embodiment;
FIG. 2 is a diagram of a server side architecture in accordance with an embodiment;
fig. 3 is a diagram of a client architecture in accordance with an embodiment.
Detailed Description
To explain technical contents, structural features, and objects and effects of the technical solutions in detail, the following detailed description is given with reference to the accompanying drawings in conjunction with the embodiments.
Referring to fig. 1 to 3, as shown in the figure, the present embodiment provides a security hosting service method based on a knowledge graph and domain adaptation, including the following steps:
s1: preparing a network emergency response knowledge graph set, inputting the network emergency response knowledge graph set into a knowledge graph redundancy processing module, removing redundant features in different knowledge graphs through a graph capsule neural network with self-adaptive feature selection by the knowledge graph redundancy processing module, extracting features with high correlation, and fusing the network emergency response knowledge graph set into a new feature set serving as a source domain of a server side;
s2: training a network security event inference model by using a source domain at a server side, wherein the network security event inference model adopts sub-capsules to encode the characteristics in the source domain, and strengthens the semantic information after encoding through a local reconstruction module;
s3: assembling a plurality of sub-capsules into component capsules, inputting the component capsules into a reasoning module for decoding, generating a network security emergency response plan by the reasoning module through decoding semantic information, and broadcasting parameters of a trained network security event reasoning model to each client;
s4: each client takes the network security log file as a target domain, different source domain-target domain pairs are constructed for different clients, a network security event reasoning model of the client is trained, a reinforced reasoning module is added to the network security event reasoning model of each client during reasoning, and a proper network security emergency response plan is selected according to the result of the reinforced reasoning module;
s5: after the training of the network security event inference model of the client is finished, the client uploads the parameters of the reinforced inference module in the local network security event inference model to the server.
In the above embodiment, S1 further includes the steps of:
s101: giving a set of network emergency response knowledge maps, denoted S N For S N First knowledge-graph S in (1) 1 Performing redundant processing and relevant feature selection of features in the knowledge graph, and performing redundant processing and relevant feature selection on the knowledge graph S 1 Respectively constructing node capsules by nodes a and nodes b of the p-th layer
Figure GDA0003912239630000101
And
Figure GDA0003912239630000102
as shown in the knowledge-graph redundancy process in fig. 2, S102: calculating the feature mapping vectors of the node a and the node b in the S101 through the feature mapping layer of the capsule diagram neural network
Figure GDA0003912239630000103
Namely, the characteristics of two nodes and the relationship between the characteristics are expressed by using the characteristic mapping vector, so that the purpose of removing redundancy is achieved, and the expression is as follows:
Figure GDA0003912239630000104
wherein,
Figure GDA0003912239630000105
a node capsule representing the ith neighbor node of the p-th layer node a, m represents the number of neighbor nodes of the node a, and represents the node capsule of the node a when i =0,
Figure GDA0003912239630000106
the node capsule represents the jth neighbor node of the pth layer node b, n is the number of neighbor nodes of the node b, and represents the node capsule of the node b when j = 0; MLP is a multilayer perceptron, feature representations of node capsules with low correlation are discarded, feature representations of node capsules with high correlation are reserved, and the effect of adaptive feature selection is achieved.
Measuring the correlation of two node capsules by using a mutual information function Kinfo, wherein the expression is as follows:
Figure GDA0003912239630000111
wherein,
Figure GDA0003912239630000112
to represent
Figure GDA0003912239630000113
The transpose of (a) is performed,
Figure GDA0003912239630000114
to represent
Figure GDA0003912239630000115
Exp represents an exponential function with a natural constant e as a base number;
s103: for knowledge graph S 1 Executing the step S102 on any two nodes in the same layer, carrying out self-adaptive selection on the node feature mapping in each layer, removing the high-redundancy feature mapping between layers until all the layers are calculated, and obtaining the compressed S 1 Middle node feature mapping set
Figure GDA0003912239630000116
The expression is as follows:
Figure GDA0003912239630000117
wherein f is r Set of feature maps representing the r-th layer, f s Representing a feature mapping set of the s-th layer, and softmax representing a normalized exponential function;
s104: for S N The rest of the knowledge graphs are circulated from S102 to S103 to obtain the feature mapping set F of all the knowledge graphs S
Figure GDA0003912239630000118
As shown in the network security event inference model of FIG. 2, let F S The source domain is used for training a network security event inference model of the server side.
In the above embodiment, the S2 includes the steps of:
s201: the network security event reasoning model at the server side adopts two capsule encoders based on a self-attention mechanism to carry out reasoning on the network security events in the source domain
Figure GDA0003912239630000119
Encoding to generate two sub-capsules
Figure GDA00039122396300001110
And
Figure GDA00039122396300001111
the expression is as follows:
Figure GDA00039122396300001112
Figure GDA00039122396300001113
wherein Encoder key Is a key capsule feature extractor consisting of a residual error network-50 (ResNet)-50) composition, encoder value Is a value capsule feature extractor, consisting of a residual error network-50 (ResNet-50);
s202: the local reconstruction module is used for reconstructing the characteristics of the two sub-capsules, so that the two sub-capsules have richer semantic information, and the expression is as follows:
Figure GDA0003912239630000121
Figure GDA0003912239630000122
wherein
Figure GDA0003912239630000123
Figure GDA0003912239630000124
The key capsule and the value capsule after characteristic reconstruction are respectively represented, and the tau and the mu represent characteristic reconstruction vectors which are automatically learned by a network security event inference model of a client during training, so that the reconstructed characteristics have richer semantic information.
In the above embodiment, the S3 further includes the following steps:
s301: assembling the output characteristics of the sub-capsules into component capsules, wherein the expression is as follows:
Figure GDA0003912239630000125
wherein,
Figure GDA0003912239630000126
representing in the source domain
Figure GDA0003912239630000127
The resulting part-capsule is then ready for use,
Figure GDA0003912239630000128
representing two weight parameters, which are automatically learned by a network security event inference model of a client during training and are used for controlling the weight of a key capsule and a value capsule in the characteristics;
s302: for source field F S The rest subsets execute the steps S201, S202 and S301, the component capsules are spliced together and then input into an inference module, and decoding is carried out in the inference module, namely, a semantic information result in a coding stage is converted into a three-dimensional embedded expression through upsampling, and the expression is as follows:
Figure GDA0003912239630000129
where Cat denotes the part capsule splicing operation and Decoder denotes the Decoder, consisting of 4 3 x 3-dimensional convolutions, F Decoder Is a three-dimensional embedded representation;
in the above embodiment, the S3 further includes the following steps:
s303: the second operation in the inference module is network security emergency plan generation, namely jump connection is carried out on the decoded features and the sub-capsule features in the encoding stage, the network security emergency plan is constructed according to the time sequence information and recorded as Play, no more than m events to be treated in one network security emergency plan are set, and the expression is as follows:
Figure GDA0003912239630000131
wherein, in [ a ] m-1 ,t m-1 ]In (a) m-1 Representing an event to be treated, t m-1 Indicating the order of the events in the network security emergency protocol,
Figure GDA0003912239630000132
representing the jth source domain in the source domain
Figure GDA0003912239630000133
The resulting part-capsule is then ready for use,
Figure GDA0003912239630000134
representing jump connection, FAM representing a feature aggregation layer which is composed of 3 × 3 convolution and double upsampling, and PAM representing a pyramid pooling layer, so that feature vectors of different shapes can be conveniently processed;
s304: the overall loss function expression of the network security event inference model of the server side is as follows:
Figure GDA0003912239630000135
wherein, DICE is a similarity measure function, and its expression is:
Figure GDA0003912239630000136
Figure GDA0003912239630000141
Figure GDA0003912239630000142
representing the kth source domain in the source domain
Figure GDA0003912239630000143
The key capsules in (1) and the key capsules with reconstructed characteristics;
Figure GDA0003912239630000144
representing the kth source domain in the source domain
Figure GDA0003912239630000145
A medium value capsule and a reconstructed value capsule;
the expression of the inference module penalty function is:
Figure GDA0003912239630000146
wherein,
Figure GDA0003912239630000147
for computing the kth source domain in the source domain
Figure GDA0003912239630000148
Resulting part capsule
Figure GDA0003912239630000149
And the p-th event to be disposed [ a ] in the network security emergency plan p ,t p ]Relative entropy between;
in this embodiment, the MSS sends parameters of a network security event inference model of a server to each Client i And O < i < M +1, which means that M clients are shared.
In the above embodiment, S4 further includes the step of:
for each Client i As shown in fig. 3, the sub-capsule coding and local reconstruction module of the network security event inference model at the server end is fixed, and only the inference module and the reinforced inference module are trained;
client side of the ith station i For example, the corresponding target domain is Clog i To solve the problem of inconsistent content distribution in the source domain and the target domain, the present embodiment uses a domain alignment loss χ based on the information entropy, and the expression is:
Figure GDA00039122396300001410
wherein,
Figure GDA0003912239630000151
representing a source domain F S For the target domain Clog on the ith client i Mathematical expectation value of, here
Figure GDA0003912239630000152
Is each subset in the source domain
Figure GDA0003912239630000158
And a target domain Clog on the ith client i Relative entropy between, the expression:
Figure GDA0003912239630000153
wherein, log represents a logarithmic operation;
because the physical and software environments of each client are inconsistent, a reinforced reasoning module is additionally introduced into the network security time reasoning model of the client, parameters of the reinforced reasoning module in each client are different, and the reinforced reasoning module can reasoned the original result of the network security event reasoning model according to the configuration of the local computer, so that the accuracy rate of handling the network security event and the robustness of the model can be improved;
taking the 1 st client as an example, the final network security emergency plan expression generated by the 1 st client is as follows:
Figure GDA0003912239630000154
wherein,
Figure GDA0003912239630000155
the network security emergency plan generated by the network security event inference model in the client is guided by the presentation server,
Figure GDA0003912239630000156
represents the final network security emergency plan, clog, on the 1 st client 1 Representing a target domain on the 1 st client, wherein G and G are convolution layers of 1 × 1, and Refine is a reinforced inference layer which is formed by connecting two groups of activation functions and convolution of 3 × 3 through residual errors;
the loss function expression of the reinforced reasoning module is as follows:
Figure GDA0003912239630000157
wherein,
Figure GDA0003912239630000161
thus, the total loss function for model training on the 1 st client is:
Figure GDA0003912239630000162
wherein
Figure GDA0003912239630000163
A loss function representing an inference module of a network security event inference model on the server,
Figure GDA0003912239630000164
a reinforced inference module loss function representing a network security event inference model on a client, a domain alignment loss based on information entropy, | | | | sweet wind 2 Representing a vector 2 norm operation.
After the training of the network security event inference model on the client is completed, the parameters of the enhanced inference module are uploaded to the server, as shown in fig. 1, when the source domain of the server end is updated in the future, the server can finely tune the network security event inference model through the parameters, and the server and the client only carry out model parameter interaction and do not relate to private information.
In order to verify the accuracy of the method, using Malware Training Sets and mistre D3fend (network emergency response knowledge graph), the Malware Training Sets are a machine learning data set, and are intended to provide a useful classification data set for researchers who wish to use machine learning techniques to deeply study Malware analysis. Forming a contrast experiment by adopting 4 different model structures and the method used by the method, and calculating the accuracy of the semantic feature similarity of the data set; the experimental results are shown in the table below, where the F1 value = correct rate recall 2/(correct rate + recall) is used to characterize the actual average of both accuracy and recall.
Figure GDA0003912239630000165
Figure GDA0003912239630000171
The results of the BiGRU bidirectional gating cycle unit, the Siamese-BiGRU twin neural network-bidirectional gating cycle unit, the Linkage hierarchical clustering and the BERT + WMD distance model (self-coding language model) are compared, so that the method has high accuracy, the accuracy reaches 87.1 percent, and the recall rate reaches 85.1 percent, which shows that the method can deduce more effective samples; the F1 value is a harmonic mean of the accuracy and the recall rate, and the F1 value of the method reaches 86.1 percent. The experimental results prove that the method can effectively reason the network security events and generate the network security emergency plan.
In addition, the method takes the network security events of different terminals as a plurality of target domains, the server side uses the network security knowledge graph as a source domain, the model is trained well at the server side and the model parameters are transmitted to the client side, so that inference on the target domains is guided on different client sides, namely, only the model parameter information is transmitted between the server and the client side without transmitting privacy information such as network security log files and the like, the network security events can be analyzed and inferred through the client side, and the network security emergency response plan is automatically matched and handled. The conventional network security knowledge graph has a large number of redundant features, the redundant features interfere model training, and accordingly the generalization effect of the model is poor.
It should be noted that, although the above embodiments have been described herein, the invention is not limited thereto. Therefore, based on the innovative concepts of the present invention, the technical solutions of the present invention can be directly or indirectly applied to other related technical fields by changing and modifying the embodiments described herein or by using the equivalent structures or equivalent processes of the content of the present specification and the attached drawings, and are included in the scope of the present invention.

Claims (6)

1. A safe hosting service method based on knowledge graph and domain adaptation is characterized by comprising the following steps:
s1: preparing a network emergency response knowledge graph set, inputting the network emergency response knowledge graph set into a knowledge graph redundancy processing module, removing redundant features in different knowledge graphs through a graph capsule neural network with self-adaptive feature selection by the knowledge graph redundancy processing module, extracting features with high correlation, and fusing the network emergency response knowledge graph set into a new feature set serving as a source domain of a server side;
s2: training a network security event inference model by using a source domain at a server side, wherein the network security event inference model adopts sub-capsules to encode the characteristics in the source domain, and strengthens the semantic information after encoding through a local reconstruction module;
s3: assembling a plurality of sub-capsules into component capsules, inputting the component capsules into a reasoning module for decoding, generating a network security emergency response plan by the reasoning module through decoding semantic information, and broadcasting parameters of a trained network security event reasoning model to each client;
s4: each client takes the network security log file as a target domain, different source domain-target domain pairs are constructed for different clients, a network security event reasoning model of the client is trained, a reinforced reasoning module is added to the network security event reasoning model of each client during reasoning, and a proper network security emergency response plan is selected according to the result of the reinforced reasoning module;
s5: after the training of the network security event inference model of the client is finished, the client uploads the parameters of the reinforced inference module in the local network security event inference model to the server;
the S1 further comprises the following steps:
s101: giving a set of network emergency response knowledge maps, denoted S N For S N First knowledge-graph S in (1) 1 Performing redundant processing and correlation feature selection on features in the knowledge graph to obtain a knowledge graph S 1 Respectively constructing node capsules by nodes a and nodes b of the p-th layer
Figure FDA0003912239620000011
And
Figure FDA0003912239620000012
s102: calculating the feature mapping vectors of the node a and the node b through the feature mapping layer of the capsule diagram neural network
Figure FDA0003912239620000013
The expression is as follows:
Figure FDA0003912239620000021
wherein,
Figure FDA0003912239620000022
a node capsule representing the ith neighbor node of the p-th layer node a, m represents the number of neighbor nodes of the node a, and represents the node capsule of the node a when i =0,
Figure FDA0003912239620000023
representing node capsules of a jth neighbor node of a pth layer node b, wherein n is the number of neighbor nodes of the node b, when j =0, representing the node capsules of the node b, and MLP is a multilayer perceptron;
measuring the correlation of two node capsules by using a mutual information function Kinfo, wherein the expression is as follows:
Figure FDA0003912239620000024
wherein,
Figure FDA0003912239620000025
to represent
Figure FDA0003912239620000026
The transpose of (a) is performed,
Figure FDA0003912239620000027
to represent
Figure FDA0003912239620000028
Exp represents an exponential function with a natural constant e as a base number;
s103: for knowledge graph S 1 Executing the step S102 on any two nodes in the same layer, carrying out self-adaptive selection on the node feature mapping in each layer, removing the high-redundancy feature mapping between layers until all the layers are calculated, and obtaining the compressed S 1 Middle node feature mapping set
Figure FDA0003912239620000029
The expression is as follows:
Figure FDA00039122396200000210
wherein f is r Set of feature maps representing the r-th layer, f s Representing a feature mapping set of the s-th layer, and softmax representing a normalized exponential function;
s104: for S N The rest of the knowledge graphs are circulated from S102 to S103 to obtain the feature mapping set F of all the knowledge graphs S ,
Figure FDA00039122396200000211
F is to be S For use as source domainTraining a network security event reasoning model at a server side;
the S2 comprises the following steps:
s201: the network security event reasoning model at the server side adopts two capsule encoders based on a self-attention mechanism to carry out reasoning on the network security events in the source domain
Figure FDA0003912239620000031
Encoding to generate two sub-capsules
Figure FDA0003912239620000032
And
Figure FDA0003912239620000033
the expression is as follows:
Figure FDA0003912239620000034
Figure FDA0003912239620000035
wherein Encoder key Is a key capsule feature extractor composed of a residual error network-50, encoder value Is a value capsule feature extractor, consisting of a residual error network-50;
s202: and (3) performing characteristic reconstruction on the two sub-capsules by using a local reconstruction module, wherein the characteristic reconstruction is used for abundant semantic information, and the expression is as follows:
Figure FDA0003912239620000036
Figure FDA0003912239620000037
wherein
Figure FDA0003912239620000038
And respectively representing the key capsule and the value capsule after the characteristic reconstruction, and representing the characteristic reconstruction vector by tau and mu, wherein the characteristic reconstruction vector is obtained by automatically learning a network security event inference model of the client during training.
2. The knowledge-graph and domain-adaptation based secure hosting service method according to claim 1, wherein the S3 further comprises the steps of:
s301: assembling the output characteristics of the sub-capsules into component capsules, wherein the expression is as follows:
Figure FDA0003912239620000039
wherein,
Figure FDA00039122396200000310
representing in the source domain
Figure FDA00039122396200000311
The resulting part-capsule is then ready for use,
Figure FDA00039122396200000312
representing two weight parameters, which are automatically learned by a network security event inference model of a client during training and are used for controlling the weight of a key capsule and a value capsule in the characteristics;
s302: for source field F S The rest subsets execute the steps S201, S202 and S301, the component capsules are spliced together and then input into an inference module, and decoding is carried out in the inference module, namely, a semantic information result in a coding stage is converted into a three-dimensional embedded expression through upsampling, and the expression is as follows:
Figure FDA0003912239620000041
wherein Cat representsPart capsule splicing operation, decoder, consisting of 4 3 by 3 dimensional convolutions, F Decoder Is a three-dimensional embedded representation.
3. The knowledge-graph and domain-adaptation based secure hosting service method according to claim 2, wherein the S3 further comprises the steps of:
s303: generating a network security emergency plan, namely performing jump connection on the decoded features and the sub-capsule features in the encoding stage, constructing the network security emergency plan according to the time sequence information, recording the network security emergency plan as Play, setting that no more than m events to be handled exist in one network security emergency plan, wherein the expression is as follows:
Figure FDA0003912239620000042
wherein, in [ a ] m-1 ,t m-1 ]In (a) m-1 Representing an event to be treated, t m-1 Indicating the sequence of the event in the network security emergency protocol,
Figure FDA0003912239620000043
representing the jth source domain in the source domain
Figure FDA0003912239620000044
The resulting part-capsule is then ready for use,
Figure FDA0003912239620000045
representing jump connection, FAM representing a feature aggregation layer which is composed of convolution of 3 x 3 and double upsampling, and PAM representing a pyramid pooling layer for processing feature vectors of different shapes;
s304: the overall loss function expression of the network security event inference model of the server side is as follows:
Figure FDA0003912239620000046
wherein, DICE is a similarity measure function, and its expression is:
Figure FDA0003912239620000051
Figure FDA0003912239620000052
Figure FDA0003912239620000053
representing the kth source domain in the source domain
Figure FDA0003912239620000054
The key capsule in (1) and the key capsule after characteristic reconstruction;
Figure FDA0003912239620000055
representing the kth source domain in the source domain
Figure FDA0003912239620000056
The value of (1) and the value after reconstruction capsule;
the expression of the inference module penalty function is:
Figure FDA0003912239620000057
wherein,
Figure FDA0003912239620000058
for computing the kth source domain in the source domain
Figure FDA0003912239620000059
Resulting part capsule
Figure FDA00039122396200000510
And the p-th event to be disposed [ a ] in the network security emergency plan p ,t p ]Relative entropy between;
s305: sending parameters of a network security event inference model of a server side to each Client side i ,0<i<And M +1, representing M clients in total.
4. The knowledge-graph and domain-adaptation based secure hosting service method according to claim 3, wherein the S4 further comprises the steps of:
s401: for each Client i Fixing the sub-capsule coding of the network security event inference model of the server end and the parameters of the local reconstruction module, and training the inference module and the reinforced inference module;
client of ith station i The corresponding target domain is Clog i Using the domain alignment penalty χ based on the information entropy, the expression is:
Figure FDA0003912239620000061
wherein,
Figure FDA0003912239620000062
representing a source domain F S For the target domain Clog on the ith client i Mathematical expectation value of, here
Figure FDA0003912239620000063
Is each subset in the source domain
Figure FDA0003912239620000064
And target domain Clog on the ith client i Relative entropy between, the expression:
Figure FDA0003912239620000065
wherein, log represents a logarithmic operation;
s402: the method is characterized in that a reinforced reasoning module is introduced into a network security time reasoning model of the client, parameters of the reinforced reasoning module in each client are different, and the reinforced reasoning module reasons an original result of the network security event reasoning model according to local configuration and is used for improving the accuracy rate of network security event handling and the robustness of the model.
5. The knowledge-graph and domain-adaptation based secure hosting service method of claim 4, wherein the S4 further comprises the steps of:
s403: when the client is the 1 st client, the final network security emergency plan expression generated by the 1 st client is as follows:
Figure FDA0003912239620000066
wherein,
Figure FDA0003912239620000067
the representation server side guides the network security emergency plan generated by the network security event reasoning model in the client side,
Figure FDA0003912239620000068
represents the final network security emergency plan, clog, on the 1 st client 1 Representing a target domain on the 1 st client, wherein G and G are convolution layers of 1 × 1, and Refine is a reinforced inference layer which is formed by connecting two groups of activation functions and convolution of 3 × 3 through residual errors;
s404: the loss function expression of the reinforced reasoning module is as follows:
Figure FDA0003912239620000071
wherein,
Figure FDA0003912239620000072
s405: the total loss function for model training on the 1 st client is:
Figure FDA0003912239620000073
wherein
Figure FDA0003912239620000074
A loss function representing an inference module of a network security event inference model on the server,
Figure FDA0003912239620000075
a reinforced inference module loss function representing a network security event inference model on a client, a domain alignment loss based on information entropy, | | | | sweet wind 2 Representing a vector 2 norm operation.
6. The knowledge-graph and domain-adaptation based secure hosting service method according to claim 1, wherein the S5 further comprises the steps of: when the source domain of the server end is updated in the future, the server finely adjusts the network security event reasoning model through the parameters, and the server and the client only carry out model parameter interaction without involving private information.
CN202211083553.9A 2022-09-06 2022-09-06 Safety trusteeship service method based on knowledge graph and domain adaptation Active CN115146299B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211083553.9A CN115146299B (en) 2022-09-06 2022-09-06 Safety trusteeship service method based on knowledge graph and domain adaptation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211083553.9A CN115146299B (en) 2022-09-06 2022-09-06 Safety trusteeship service method based on knowledge graph and domain adaptation

Publications (2)

Publication Number Publication Date
CN115146299A CN115146299A (en) 2022-10-04
CN115146299B true CN115146299B (en) 2022-12-09

Family

ID=83416090

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211083553.9A Active CN115146299B (en) 2022-09-06 2022-09-06 Safety trusteeship service method based on knowledge graph and domain adaptation

Country Status (1)

Country Link
CN (1) CN115146299B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111522965A (en) * 2020-04-22 2020-08-11 重庆邮电大学 Question-answering method and system for entity relationship extraction based on transfer learning
CN112231489A (en) * 2020-10-19 2021-01-15 中国科学技术大学 Knowledge learning and transferring method and system for epidemic prevention robot
CN112883200A (en) * 2021-03-15 2021-06-01 重庆大学 Link prediction method for knowledge graph completion
CN114491541A (en) * 2022-03-31 2022-05-13 南京众智维信息科技有限公司 Safe operation script automatic arrangement method based on knowledge graph path analysis

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111522965A (en) * 2020-04-22 2020-08-11 重庆邮电大学 Question-answering method and system for entity relationship extraction based on transfer learning
CN112231489A (en) * 2020-10-19 2021-01-15 中国科学技术大学 Knowledge learning and transferring method and system for epidemic prevention robot
CN112883200A (en) * 2021-03-15 2021-06-01 重庆大学 Link prediction method for knowledge graph completion
CN114491541A (en) * 2022-03-31 2022-05-13 南京众智维信息科技有限公司 Safe operation script automatic arrangement method based on knowledge graph path analysis

Also Published As

Publication number Publication date
CN115146299A (en) 2022-10-04

Similar Documents

Publication Publication Date Title
CN112633010B (en) Aspect-level emotion analysis method and system based on multi-head attention and graph convolution network
CN110597991A (en) Text classification method and device, computer equipment and storage medium
CN110489567B (en) Node information acquisition method and device based on cross-network feature mapping
CN116050401B (en) Method for automatically generating diversity problems based on transform problem keyword prediction
CN114926770B (en) Video motion recognition method, apparatus, device and computer readable storage medium
CN111368545A (en) Named entity identification method and device based on multi-task learning
DE102021004562A1 (en) Modification of scene graphs based on natural language commands
WO2023155546A1 (en) Structure data generation method and apparatus, device, medium, and program product
CN116821291A (en) Question-answering method and system based on knowledge graph embedding and language model alternate learning
CN114239675A (en) Knowledge graph complementing method for fusing multi-mode content
CN115146299B (en) Safety trusteeship service method based on knowledge graph and domain adaptation
CN114861907A (en) Data calculation method, device, storage medium and equipment
CN117010494B (en) Medical data generation method and system based on causal expression learning
CN117787343A (en) Long-sequence prediction method and device for microblog topic trend and computer storage medium
CN113377656A (en) Crowd-sourcing recommendation method based on graph neural network
CN112288154A (en) Block chain service reliability prediction method based on improved neural collaborative filtering
CN115422376B (en) Network security event source tracing script generation method based on knowledge graph composite embedding
Wu et al. Spiking neural P systems with communication on request and mute rules
CN116543339A (en) Short video event detection method and device based on multi-scale attention fusion
CN113849641B (en) Knowledge distillation method and system for cross-domain hierarchical relationship
CN115167863A (en) Code completion method and device based on code sequence and code graph fusion
CN114333069A (en) Object posture processing method, device, equipment and storage medium
CN117808944B (en) Method and device for processing text action data of digital person, storage medium and electronic device
CN117808083B (en) Distributed training communication method, device, system, equipment and storage medium
CN113609280B (en) Multi-domain dialogue generation method, device, equipment and medium based on meta learning

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20230825

Address after: Room 3-3, No.1 Guanghua East Street, Qinhuai District, Nanjing City, Jiangsu Province, 210000

Patentee after: Big data Security Technology Co.,Ltd.

Address before: 211300 No. 3, Longjing Road, Gaochun District, Nanjing, Jiangsu

Patentee before: NANJING ZHONGZHIWEI INFORMATION TECHNOLOGY Co.,Ltd.