CN117332785B - Method for jointly extracting entities and relations from network security threat intelligence - Google Patents


Info

Publication number
CN117332785B
CN117332785B (application CN202311302393.7A)
Authority
CN
China
Prior art keywords
entity
vector
tag
label
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311302393.7A
Other languages
Chinese (zh)
Other versions
CN117332785A (en)
Inventor
Han Xiaohui (韩晓晖)
Lyu Haiqing (吕海青)
Zuo Wenbo (左文波)
Cui Hui (崔慧)
Liu Guangqi (刘广起)
Liu Yang (刘洋)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qilu University of Technology
Shandong Computer Science Center National Super Computing Center in Jinan
Original Assignee
Qilu University of Technology
Shandong Computer Science Center National Super Computing Center in Jinan
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qilu University of Technology, Shandong Computer Science Center National Super Computing Center in Jinan filed Critical Qilu University of Technology
Priority to CN202311302393.7A priority Critical patent/CN117332785B/en
Publication of CN117332785A publication Critical patent/CN117332785A/en
Application granted granted Critical
Publication of CN117332785B publication Critical patent/CN117332785B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F40/216 Parsing using statistical methods
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G06F16/3346 Query execution using probabilistic model
    • G06F16/35 Clustering; Classification
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/045 Combinations of networks
    • G06N3/08 Learning methods
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT]
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in wire-line communication networks, e.g. low power modes or reduced link rate

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Algebra (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

A method for jointly extracting entities and relations from network security threat intelligence, in the technical field of network security. The method performs joint entity-relation extraction with a multi-task joint-learning model architecture, which effectively reduces the error-propagation problem of non-joint extraction; task-specific features for the different tasks are extracted from a unified vector representation produced by a shared vector encoder, which reduces the noise introduced by useless features and improves the speed of entity-relation decoding.

Description

Method for jointly extracting entities and relations from network security threat intelligence
Technical Field
The invention relates to the technical field of network security, and in particular to a method for jointly extracting entities and relations from network security threat intelligence.
Background
With the rapid development of the internet, network security problems have become increasingly prominent and new security threats continuously emerge. Cyber threat intelligence has become an important basis for the research and analysis work of network security specialists. However, because intelligence comes from numerous sources, takes many forms, and is generally unstructured, traditional manual analysis is inefficient and error-prone. The invention therefore aims to provide an automatic and efficient method and system for analyzing network security threat intelligence.
With the wide adoption of the internet and the rapid development of information technology, network security problems are becoming more serious. Network security threats, including cyber attacks, malware, and data leakage, continue to emerge, bringing significant losses and risks to individuals, businesses, and government agencies. Network security professionals and security teams must collect, analyze, and understand threat intelligence promptly and effectively in order to protect the security of network systems and users.
At present, two families of methods are mainly used to extract threat-intelligence entities and relations: pipeline-based methods and joint-extraction methods. Pipeline-based methods treat entity recognition and relation extraction as two independent subtasks, completed sequentially by two independent models. They are simple to implement, but because the two steps run independently and share little information, an entity-recognition error propagates into the relation-extraction result; this is the error-propagation problem. Joint-extraction methods instead treat entity recognition and relation extraction as one joint task and model the association between entities and relations, optimizing both simultaneously to obtain more accurate results; by considering the interaction between entities and relations, they better capture the contextual information in the text and improve extraction accuracy. However, most joint models first identify the entities and then classify the relation of every extracted entity pair, so their time cost grows quadratically with the number of entities. Because of model complexity and training difficulty, joint-extraction methods also typically require more computational resources and data.
To overcome these problems, artificial-intelligence techniques such as natural language processing (NLP) and machine learning are widely used in the analysis of cyber security threat intelligence. Automated entity recognition and relation extraction make the analysis process more efficient, so that potential threats are discovered in time. Deep learning models enable systems to better understand and analyze unstructured text data, improving the precision of threat-intelligence analysis. Moreover, visualization techniques can intuitively display the associations between entities, helping security specialists better understand the nature and trends of network security threats.
However, an efficient, fast, comprehensive, and intelligent method and system for jointly extracting entities and relations from network security threat intelligence is still lacking.
Disclosure of Invention
To overcome these shortcomings of the prior art, the invention combines several artificial-intelligence techniques to provide an efficient and accurate method for extracting entities and relations from network security threat intelligence.
The technical scheme adopted for overcoming the technical problems is as follows:
a method for jointly extracting entities and relationships from cyber security threat intelligence, comprising the steps of:
(a) Collect n network security threat intelligence articles to obtain the intelligence set D = {d_1, d_2, ..., d_i, ..., d_n}, where d_i, i ∈ {1, ..., n}, is the i-th article.
(b) Apply data preprocessing to the i-th article d_i to obtain the sentence set S_i = {s_1, s_2, ..., s_j, ..., s_l}, where s_j, j ∈ {1, ..., l}, is the j-th sentence and l is the number of sentences in S_i.
(c) Tokenize the j-th sentence s_j into the character sequence X_j = {w_1, w_2, ..., w_t, ..., w_N}, where w_t, t ∈ {1, ..., N}, is the t-th character and N is the number of characters in s_j.
(d) Obtain the embedded representation e_t of the t-th character; the vector embedding sequence of the j-th sentence s_j is e = {e_1, e_2, ..., e_t, ..., e_N}.
(e) Input the vector embedding sequence e into an attention module to obtain the entity-recognition task vector matrix e^ner, the relation-classification task vector matrix e^re, and the shared vector matrix e^share of the entity-recognition and relation-classification tasks, where e_t^ner, e_t^re, and e_t^share are respectively the entity-recognition, relation-classification, and shared vector embeddings corresponding to the t-th character.
(f) From the entity-recognition task vector matrix e^ner and the shared vector matrix e^share, obtain the candidate head-entity tag sequence E = {entity_1, entity_2, ..., entity_i, ..., entity_m}, where entity_i, i ∈ {1, ..., m}, is the tag of the i-th head entity and m is the number of head entities.
(g) From the relation-classification task vector matrix e^re and the shared vector matrix e^share, obtain the set entity^CHE of all head-entity vectors, where entity_i^CHE, i ∈ {1, ..., m}, is the pooled vector embedding of the i-th head entity.
(h) From entity^CHE, compute the sequence O = {o_1, o_2, ..., o_j, ..., o_m} of head-entity/relation correlation representations, where o_j, j ∈ {1, ..., m}, is the correlation representation vector of the j-th head entity.
(i) From the sequence O, obtain the triple sequence of each head entity entity_j.
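Steps (a) through (i) can be read as a pipeline. The following minimal Python sketch shows only the data flow; every function body is an illustrative stand-in (toy sentence splitting, character tokenization, fake one-dimensional embeddings), and the model stages of steps (e)-(i) are marked by a comment rather than implemented, since they require the trained models described below:

```python
# Skeletal sketch of the claimed pipeline, steps (a)-(i).
# All bodies are toy stand-ins, not the patent's actual models.
def preprocess(article):                 # step (b): split article into sentences
    return [s.strip() for s in article.split(".") if s.strip()]

def tokenize(sentence):                  # step (c): character-level segmentation
    return list(sentence)

def embed(chars):                        # step (d): toy 1-dim embeddings
    return [[float(len(c))] for c in chars]

def joint_extract(article):
    results = []
    for sentence in preprocess(article):
        chars = tokenize(sentence)       # step (c)
        e = embed(chars)                 # step (d)
        # steps (e)-(i): attention encoding, head-entity tagging, pooling,
        # head-entity/relation correlation, tail-entity and relation decoding
        results.append((sentence, len(e)))
    return results

parsed = joint_extract("APT41 used ShadowPad. It targets banks.")
```

The entity names in the sample input are illustrative only; the real pipeline would emit `<head entity, relation, tail entity>` triples instead of the placeholder tuples returned here.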
Further, in step (a) the n cyber security threat intelligence articles are collected from public vulnerability reports and/or social media and/or security news.
Further, the data preprocessing operation of step (b) is as follows:
(b-1) Remove punctuation marks, mathematical operators, brackets, quotation marks, and special symbols such as @, #, $, %, ^, and & from the i-th cyber security threat intelligence article d_i to complete the noise-removal operation;
(b-2) convert the noise-free article d_i into the sentence set S_i using Python's split() function.
Further, in step (c) the j-th sentence s_j is tokenized using the tokenizer of the Transformers toolkit.
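Steps (b-1) and (b-2) can be sketched as follows. This is a minimal, hedged illustration: the regex symbol set and period-based splitting are assumptions standing in for the patent's exact noise list and split() usage:

```python
import re

# Hypothetical sketch of steps (b-1)/(b-2): strip noise symbols from an
# article, then split the cleaned text into a sentence list.
NOISE = re.compile(r"[()\[\]{}\"'@#$%^&*]")

def preprocess(article: str) -> list[str]:
    cleaned = NOISE.sub("", article)                 # (b-1) noise removal
    # (b-2) sentence conversion via Python's split()
    return [s.strip() for s in cleaned.split(".") if s.strip()]

sentences = preprocess('The "Lazarus" group (APT) used malware. It targets banks.')
```

A real implementation would then hand each sentence to the Transformers tokenizer for step (c).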
Further, step (d) comprises the steps of:
(d-1) Input the t-th character w_t into a BERT pretrained model and output the feature vector b_t ∈ R^{1×d_w}, where R is the real space and d_w is the dimension;
(d-2) input the t-th character w_t into a GloVe pretrained model and output the feature vector g_t ∈ R^{1×d_g}, where d_g is the dimension;
(d-3) embed the t-th character w_t with the spaCy part-of-speech tagging tool to obtain the part-of-speech vector p_t ∈ R^{1×d_p}, where d_p is the dimension;
(d-4) splice the feature vectors b_t and g_t and the part-of-speech vector p_t to obtain the embedded representation e_t ∈ R^{1×(d_w+d_g+d_p)} of the t-th character; the vector embedding sequence of the j-th sentence s_j is e = {e_1, e_2, ..., e_t, ..., e_N}.
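The splicing of step (d-4) is plain vector concatenation. A minimal sketch with toy dimensions (d_w = 4, d_g = 3, d_p = 2; real vectors would come from the BERT, GloVe, and spaCy models named above):

```python
# Sketch of step (d-4): concatenate the BERT, GloVe, and part-of-speech
# feature vectors of one character into its embedded representation e_t.
def concat_embedding(bert_vec, glove_vec, pos_vec):
    # list concatenation plays the role of the splicing operation
    return bert_vec + glove_vec + pos_vec

e_t = concat_embedding([0.1] * 4, [0.2] * 3, [0.3] * 2)  # d_w=4, d_g=3, d_p=2
```

The resulting vector has dimension d_w + d_g + d_p, matching the shape stated in (d-4).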
Further, step (e) comprises the steps of:
(e-1) The attention module consists of an entity-recognition attention module, a shared-task attention module, and a relation-classification attention module. Using the torch.randn() function of the torch toolkit, initialize the parameter matrices W_Q^ner, W_K^ner, W_V^ner ∈ R^{(d_w+d_g+d_p)×d_att} for the entity-recognition attention module, W_Q^share, W_K^share, W_V^share for the shared-task attention module, and W_Q^re, W_K^re, W_V^re for the relation-classification attention module, where d_att is the attention dimension.
(e-2) For the t-th character, compute the vectors of the entity-recognition attention module as Q_t^ner = e_t W_Q^ner, K_t^ner = e_t W_K^ner, and V_t^ner = e_t W_V^ner; stacking them over all characters gives the matrices Q^ner, K^ner, and V^ner. Compute the attention score of the t-th Q vector and the j-th K vector as α_tj^ner = softmax(Q_t^ner (K_j^ner)^T / sqrt(d_att)), where T denotes the transpose; all scores form the matrix α^ner. The entity-recognition vector embedding of the t-th character is then e_t^ner = Σ_j α_tj^ner V_j^ner, and the entity-recognition task vector matrix is e^ner.
(e-3) In the same way, compute Q_t^share = e_t W_Q^share, K_t^share = e_t W_K^share, and V_t^share = e_t W_V^share for the shared-task attention module, the attention scores α_tj^share = softmax(Q_t^share (K_j^share)^T / sqrt(d_att)) forming the matrix α^share, and the shared vector embedding e_t^share = Σ_j α_tj^share V_j^share of the entity-recognition and relation-classification tasks for the t-th character; the shared vector matrix is e^share.
(e-4) Likewise, compute Q_t^re = e_t W_Q^re, K_t^re = e_t W_K^re, and V_t^re = e_t W_V^re for the relation-classification attention module, the attention scores α_tj^re = softmax(Q_t^re (K_j^re)^T / sqrt(d_att)) forming the matrix α^re, and the relation-classification vector embedding e_t^re = Σ_j α_tj^re V_j^re of the t-th character; the relation-classification task vector matrix is e^re.
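All three attention modules of step (e) follow the same scaled dot-product pattern. A pure-Python sketch with fixed toy weights (the patent initializes its parameter matrices with torch.randn(); the values and dimensions here are illustrative only):

```python
import math

# One attention pass, as used by each module in step (e): Q, K, V are linear
# maps of the character embeddings, scores are scaled dot products, and each
# output row is an attention-weighted sum of the V rows.
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [v / s for v in exps]

def self_attention(e, w_q, w_k, w_v):
    q, k, v = matmul(e, w_q), matmul(e, w_k), matmul(e, w_v)
    d_att = len(w_q[0])
    scores = [[sum(qi * ki for qi, ki in zip(qr, kr)) / math.sqrt(d_att) for kr in k]
              for qr in q]
    alpha = [softmax(row) for row in scores]          # attention score matrix
    return [[sum(a * vr[j] for a, vr in zip(arow, v)) for j in range(len(v[0]))]
            for arow in alpha]

e = [[1.0, 0.0], [0.0, 1.0]]   # two characters, embedding dim 2 (toy values)
w = [[0.5, 0.0], [0.0, 0.5]]   # one toy weight matrix reused for Q, K, V
out = self_attention(e, w, w, w)
```

Running the same input through three differently parameterized copies of `self_attention` would yield the e^ner, e^share, and e^re matrices of steps (e-2) to (e-4).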
Further, step (f) includes the steps of:
(f-1) Splice the entity-recognition vector embedding e_t^ner of the t-th character with the shared embedding e_t^share of the entity-recognition and relation-classification tasks to obtain the vector c_t; the resulting vector matrix is e^CHE.
(f-2) Input the vector matrix e^CHE into a bidirectional long short-term memory network (BiLSTM) to obtain the vector matrix O^CHE, where o_t^CHE ∈ R^{1×d_h} is the BiLSTM output for the t-th character w_t and d_h is the dimension.
(f-3) Input O^CHE into a single-layer linear network and compute the vector matrix P^CHE = O^CHE W^CHE + b^CHE, where W^CHE ∈ R^{d_h×L_ner} is the parameter matrix of the linear network, b^CHE is its bias term, and L_ner is the number of entity tags. The entity tags are B-campaign, I-campaign, B-identity, I-identity, B-tool, I-tool, B-malware, I-malware, B-actor, I-actor, B-vulnerability, I-vulnerability, and O, following the BIO scheme: B-campaign marks the first character of a campaign (cyber-attack activity) entity in the text and I-campaign marks its other characters; B-identity and I-identity likewise mark organization-or-person entities; B-tool and I-tool mark tool entities; B-malware and I-malware mark malware entities; B-actor and I-actor mark threat-actor entities; B-vulnerability and I-vulnerability mark vulnerability entities; and O marks non-entity characters. B-campaign, B-identity, B-tool, B-malware, B-actor, and B-vulnerability constitute the B tag set; I-campaign, I-identity, I-tool, I-malware, I-actor, and I-vulnerability constitute the I tag set.
(f-4) Input the vector matrix P^CHE into a conditional random field (CRF) to obtain, for each character, the probabilities of the different entity tags, where p_t is the probability sequence over the entity-type tags for the t-th character w_t.
(f-5) Use the torch.argmax() function of the pytorch toolkit to select the entity tag l_t with the highest probability in p_t for each character, yielding the tag sequence of the sentence. Traverse this tag sequence and mark every span that begins with a tag from the B tag set and continues with tags from the I tag set as a head entity, obtaining the head-entity tag sequence E = {entity_1, entity_2, ..., entity_i, ..., entity_m}, where entity_i, i ∈ {1, ..., m}, is the tag of the i-th head entity and m is the number of head entities.
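The span-marking rule of step (f-5) can be sketched as a small BIO decoder: a B-* tag opens a span, matching I-* tags extend it, and anything else closes it. The tag names follow the patent's scheme; the decoder itself is an illustrative implementation, not the patent's code:

```python
# Sketch of step (f-5): collect head-entity spans from a predicted BIO
# tag sequence. Returns (start index, end index, entity type) tuples.
def decode_entities(tags):
    entities, start, etype = [], None, None
    for i, tag in enumerate(tags + ["O"]):          # sentinel flushes last span
        if tag.startswith("B-") or tag == "O" or (etype and tag != "I-" + etype):
            if start is not None:
                entities.append((start, i - 1, etype))
                start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return entities

spans = decode_entities(["B-actor", "I-actor", "O",
                         "B-malware", "I-malware", "I-malware"])
```

Stray I-* tags without a preceding B-* are silently ignored here; the patent does not specify how such CRF outputs are handled, so that choice is an assumption.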
Further, step (g) includes the steps of:
(g-1) Splice the relation-classification vector embedding e_t^re of the t-th character with the shared embedding e_t^share of the entity-recognition and relation-classification tasks to obtain a vector v_t^TRE; the resulting vector matrix is e^TRE.
(g-2) For the tag entity_i of the i-th head entity, fill the entity positions with 1 and the non-entity positions with 0 to obtain its mask vector L_i ∈ R^{1×N}; the set of mask vectors of the entities in the j-th sentence s_j is L = {L_1, L_2, ..., L_i, ..., L_m}, L ∈ R^{m×N}.
(g-3) Use the mask L_i to select the character embeddings of the i-th head entity and obtain its vector embedding entity_i^CHE; input entity_i^CHE into a max-pooling layer and output the pooled i-th head-entity vector embedding entity_i^CHE′.
Further, step (h) comprises the steps of:
(h-1) Use the torch.randn() function of the torch toolkit to initialize the parameter matrices W^K and W^Q for the correlation computation.
(h-2) Multiply the vector v_t^TRE (the t-th row of e^TRE) by the parameter matrix W^K to obtain the key vector K_t^TRE; the key vectors form the matrix K^TRE. Multiply the pooled j-th head-entity embedding entity_j^CHE′ by the parameter matrix W^Q to obtain the query vector Q_j^CHE; the query vectors form the matrix Q^CHE.
(h-3) Compute the correlation score S_jt between the query vector of each head entity and the key vector of the t-th character of each sentence as S_jt = V^T tanh(Q_j^CHE + K_t^TRE), where V is a parameter matrix.
(h-4) Normalize the correlation score S_jt with a softmax function to obtain the normalized correlation score α_jt, whose value range is [0, 1].
(h-5) Compute the context representation h_jt = Σ_t α_jt v_t^TRE of the query vector of the j-th head entity with respect to the vectors of the characters of the sentence; the set of context representations of all head entities is h = {h_1t, h_2t, ..., h_jt, ..., h_mt}.
(h-6) Compute the gate g_j ∈ [0, 1] of the context representation h_jt as g_j = σ(W_2 (W_1 [h_jt ; entity_j^CHE′] + b_1) + b_2), where σ(·) is the sigmoid function, W_1 and W_2 are parameter matrices, [;] denotes the splicing operation, and b_1 and b_2 are bias terms.
(h-7) Compute the filtered vector u_j = g_j · tanh(W_3 h_jt + b_3), where W_3 is a parameter matrix and b_3 is a bias term.
(h-8) Splice the pooled head-entity embedding entity_j^CHE′ with the filtered vector u_j to obtain the j-th head-entity/relation correlation representation vector o_j; all head-entity/relation correlation representations form the sequence O.
Further, step (i) comprises the steps of:
(i-1) Input the sequence O of all head-entity/relation correlation representations into a bidirectional long short-term memory network (BiLSTM) to obtain the vector sequence O^re, where o_j^re is the BiLSTM output for the j-th correlation representation vector o_j.
(i-2) Input o_j^re into a single-layer linear network and compute the vector matrix P_j^re = o_j^re W^re + b^re, where W^re is the parameter matrix of the linear network and b^re is its bias term.
(i-3) Input the vector matrix P^re into a conditional random field (CRF) to obtain, for the j-th head entity entity_j, the probabilities of the different entity tags of each character of the corresponding sentence, where p_jt is the probability sequence over the entity-type tags for the t-th character w_t; the set of probability values corresponding to all head entities is P.
(i-4) Use the torch.argmax() function of the pytorch toolkit to select, for the j-th head entity, the entity tag l_j ∈ R^N with the highest probability for each character; the candidate tags are again B-campaign, I-campaign, B-identity, I-identity, B-tool, I-tool, B-malware, I-malware, B-actor, I-actor, B-vulnerability, I-vulnerability, and O, and all tail-entity tag sequences form l.
(i-5) Treat the highest-probability tag sequence l_j as a tail-entity tag sequence: mark every span that begins with a tag from the B tag set and continues with tags from the I tag set as a tail entity, obtaining the tail-entity tag sequence E′ = {entity_1′, entity_2′, ..., entity_i′, ..., entity_n′}, where entity_i′, i ∈ {1, ..., n}, is the tag of the i-th tail entity and n is the number of tail entities.
(i-6) For the tag entity_i′ of the i-th tail entity, fill the entity positions with 1 and the non-entity positions with 0 to obtain its mask vector L_i′ ∈ R^{1×N}; the set of mask vectors of the tail entities of the j-th head entity entity_j is L_j = {L_1′, L_2′, ..., L_i′, ..., L_n′}, L_j ∈ R^{n×N}.
(i-7) Use the mask L_i′ to obtain the i-th tail-entity vector embedding entity_i^tail; all tail-entity vector embeddings form the sequence entity^tail. Input entity_i^tail into a max-pooling layer and output the pooled i-th tail-entity vector embedding entity_i^tail′; all pooled vectors form entity^tail′.
(i-8) Splice the pooled i-th tail-entity embedding entity_i^tail′ with the pooled head-entity embedding entity_j^CHE′ to obtain a vector, input it into a single-layer linear network, and compute the vector matrix P_ij = [entity_j^CHE′ ; entity_i^tail′] W^re′ + b^re′, where W^re′ is the parameter matrix of the linear network, b^re′ is its bias term, and L_re is the number of relation tags. The relation tags are use, targets, and other, assigned to tag pairs as follows (each rule applies to both the B- and the I- versions of the tags):
- use: (malware, campaign), (actor, campaign), (actor, tool), (actor, malware);
- targets: (actor, vulnerability), (malware, vulnerability), (actor, identity), (campaign, identity), (malware, identity);
- other: (campaign, vulnerability), (tool, vulnerability), (tool, campaign), (identity, campaign), (tool, identity), (identity, vulnerability).
(i-9) Normalize the vector matrix P_ij with a softmax function, then use the torch.argmax() function of the pytorch toolkit to obtain the relation tag rel_ij corresponding to the most probable entity tag l_j, giving the triple <entity_j, rel_ij, entity_i′>. Extraction thus yields, for the j-th head entity entity_j, the full triple sequence {<entity_j, rel_1j, entity_1′>, <entity_j, rel_2j, entity_2′>, ..., <entity_j, rel_ij, entity_i′>, ..., <entity_j, rel_nj, entity_n′>}.
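The final assembly of step (i-9) amounts to pairing each head entity with its decoded tail entities and looking up the relation for the entity-type pair. A hedged sketch follows; the lookup table covers only a few of the label-pair rules listed in (i-8), the default-to-other fallback is an assumption, and the entity names in the example are invented illustrations:

```python
# Sketch of step (i-9): build <head, relation, tail> triples from a head
# entity and its decoded tail entities via an entity-type-pair lookup.
RELATION_TABLE = {
    ("malware", "campaign"): "use",
    ("actor", "tool"): "use",
    ("actor", "vulnerability"): "targets",
    ("tool", "campaign"): "other",
}

def build_triples(head, head_type, tails):
    # tails: list of (tail_text, tail_type); unlisted pairs default to "other"
    triples = []
    for tail, tail_type in tails:
        rel = RELATION_TABLE.get((tail_type, head_type),
                                 RELATION_TABLE.get((head_type, tail_type), "other"))
        triples.append((head, rel, tail))
    return triples

triples = build_triples("ExampleCampaign", "campaign",
                        [("ExampleMalware", "malware"), ("ExampleTool", "tool")])
```

A full implementation would instead take the relation label from the softmax over P_ij; the table here only mirrors the tag-pair definitions of (i-8).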
The beneficial effects of the invention are as follows: the multi-task joint-learning model architecture performs joint extraction of entity relations, which effectively reduces the error-propagation problem of non-joint extraction; task-specific features for the different tasks are extracted from a unified vector representation encoded by the same vector encoder, which reduces the noise influence of useless features and improves the speed of entity-relation decoding.
Drawings
FIG. 1 is a diagram of a model structure of a network security threat intelligence joint extraction entity and relationship according to the present invention.
Detailed Description
The invention is further described with reference to fig. 1.
A method for jointly extracting entities and relationships from cyber security threat intelligence, comprising the steps of:
(a) Collecting n cyber security threat intelligence articles to obtain a cyber security threat intelligence set D, D = {d_1, d_2, ..., d_i, ..., d_n}, where d_i is the i-th cyber security threat intelligence article, i ∈ {1, ..., n}.
(b) Performing a data preprocessing operation on the i-th article d_i to obtain a sentence set S_i, S_i = {s_1, s_2, ..., s_j, ..., s_l}, where s_j is the j-th sentence in S_i, j ∈ {1, ..., l}, and l is the number of sentences in S_i.
(c) Segmenting the j-th sentence s_j to obtain a character sequence X_j, X_j = {w_1, w_2, ..., w_t, ..., w_N}, where w_t is the t-th character, t ∈ {1, ..., N}, and N is the number of characters in s_j.
(d) Obtaining the embedded representation e_t of the t-th character; the vector embedding sequence of s_j is e, e = {e_1, e_2, ..., e_t, ..., e_N}.
(e) Inputting the vector embedding sequence e into an attention module to extract entity-recognition task vectors, relationship-classification task vectors and vectors shared by the two tasks, obtaining the entity-recognition task vector matrix e_ner, the relationship-classification task vector matrix e_re and the shared vector matrix e_share of the entity-recognition and relationship-classification tasks, where e_t^ner, e_t^re and e_t^share are, respectively, the entity-recognition, relationship-classification and shared vector embeddings corresponding to the t-th character.
(f) Obtaining a candidate head-entity tag sequence E from e_ner and e_share, E = {entity_1, entity_2, ..., entity_i, ..., entity_m}, where entity_i is the label of the i-th entity, i ∈ {1, ..., m}, and m is the number of entities.
(g) Obtaining the set of all head entity vectors entity_CHE from e_re and e_share, where entity_i^CHE is the pooled i-th head entity vector embedding, i ∈ {1, ..., m}.
(h) Computing from entity_CHE the sequence O of all head-entity-and-relationship correlation representations, O = {o_1, o_2, ..., o_j, ..., o_m}, where o_j is the correlation representation vector of the j-th head entity and the relationships, j ∈ {1, ..., m}.
(i) Obtaining the triple sequence of the j-th head entity entity_j from the correlation representation sequence O.
Entity recognition and relation extraction are performed in parallel using an end-to-end neural network model architecture based on multi-task learning. First, the input text is embedded into a unified vector representation with an NLP pre-training model, and an attention module extracts the feature vectors specific to entity recognition and to relation extraction. The feature vector of the entity recognition task is input into a head entity module to identify head entities; the feature vector of the relation extraction task, together with the representation vectors of the identified head entities, is input into a tail-entity-and-relation decoding module to decode tail entities and relations. The method makes full use of a single vector encoder to extract task-specific features, which reduces the noise of useless features and improves feature utilization; the end-to-end multi-task architecture also speeds up entity-relation decoding, so that entity information can be extracted efficiently and accurately from diverse cyber security threat intelligence data and the association relationships between entities can be established.
Table 1: Entity recognition results of different models on the collected cyber security threat intelligence dataset D

Model          Precision  Accuracy  Recall  F1-score
SpERT          79.2%      70.3%     74.6%   76.8%
Multi-turn QA  85.0%      81.3%     82.6%   83.7%
MTL            89.3%      86.5%     89.6%   89.4%
DYGIE          78.2%      76.5%     79.6%   78.9%
PERA           88.1%      86.1%     89.2%   86.6%
ours           90.5%      91.2%     89.8%   90.1%
According to the experimental results in Table 1, the method for jointly extracting entities and relationships from cyber security threat intelligence provided by the invention reaches an entity recognition accuracy of 91.2%, a precision of 90.5%, an F1-score of 90.1% and a recall of 89.8%. Compared with the other baseline methods, the accuracy is greatly improved and the entity recognition effect is good.
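The precision, recall and F1-score reported in the tables are computed in the standard way from true-positive, false-positive and false-negative counts; a minimal sketch (the function name prf1 is illustrative):

```python
def prf1(tp: int, fp: int, fn: int):
    """Precision, recall and F1-score from prediction counts;
    F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```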
Table 2: Relationship classification results of different models on the collected cyber security threat intelligence dataset D

Model          Precision  Accuracy  Recall  F1-score
SpERT          74.7%      73.6%     71.5%   72.8%
Multi-turn QA  69.2%      67.4%     68.2%   68.9%
MTL            77.73%     67.1%     68.4%   72.63%
DYGIE          72.2%      69.5%     71.6%   71.2%
PERA           76.1%      74.1%     75.0%   75.5%
ours           77.6%      75.4%     78.3%   77.9%
According to the experimental results in Table 2, the method for jointly extracting entities and relationships from cyber security threat intelligence reaches a relationship classification accuracy of 75.4%, a precision of 77.6%, an F1-score of 77.9% and a recall of 78.3%. Compared with the other baseline methods, the accuracy is greatly improved and the relationship classification effect is good.
In one embodiment of the invention, n cyber security threat intelligence articles are collected from public vulnerability reports and/or social media and/or security news in step (a).
In one embodiment of the present invention, the method of the data preprocessing operation in step (b) is:
(b-1) removing punctuation marks, mathematical operation symbols, brackets, quotation marks and the symbols @, #, $, %, ^ and & from the i-th cyber security threat intelligence article d_i to complete the noise-removal operation;
(b-2) using the split() function of the python program with the period as delimiter, converting the noise-removed i-th cyber security threat intelligence article d_i into the sentence set S_i.
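Steps (b-1)/(b-2) can be sketched as below. The exact noise character set and the choice of replacing symbols with spaces (rather than deleting them outright) are assumptions; the function name preprocess is illustrative.

```python
import re

def preprocess(article: str) -> list[str]:
    """(b-1) strip noise symbols, then (b-2) split on periods into sentences."""
    cleaned = re.sub(r"[@#$%^&*+=<>()\[\]{}'\"]", " ", article)  # noise removal
    return [s.strip() for s in cleaned.split(".") if s.strip()]   # sentence split
```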
In one embodiment of the present invention, in step (c) the j-th sentence s_j is segmented using the tokenizer function in the Transformers toolkit.
In one embodiment of the invention, step (d) comprises the steps of:
(d-1) Inputting the t-th character w_t into a Bert pre-training model and outputting the feature vector x_t^w ∈ R^{1×d_w}, where R is the real space and d_w is a dimension.
(d-2) Inputting the t-th character w_t into a Glove pre-training model and outputting the feature vector x_t^g ∈ R^{1×d_g}, where d_g is a dimension.
(d-3) Embedding the t-th character w_t with the spaCy part-of-speech tagging tool to obtain the part-of-speech vector x_t^p ∈ R^{1×d_p}, where d_p is a dimension.
(d-4) Splicing the feature vector x_t^w, the feature vector x_t^g and the part-of-speech vector x_t^p to obtain the embedded representation e_t of the t-th character, e_t = x_t^w ⊕ x_t^g ⊕ x_t^p; the vector embedding sequence of the j-th sentence s_j is e.
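Step (d) reduces to a simple concatenation. In the sketch below, the dimensions d_w = 768, d_g = 300 and d_p = 50 are typical values for BERT, GloVe and a POS embedding, not values specified by the patent, and placeholder vectors stand in for the pre-trained model outputs.

```python
import numpy as np

def embed_character(bert_vec, glove_vec, pos_vec):
    """(d-4): splice the BERT, GloVe and part-of-speech vectors into the
    embedded representation e_t, with dimension d_w + d_g + d_p."""
    return np.concatenate([bert_vec, glove_vec, pos_vec])

# Placeholder vectors standing in for the Bert / Glove / spaCy outputs.
e_t = embed_character(np.zeros(768), np.zeros(300), np.zeros(50))
```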
In one embodiment of the invention, step (e) comprises the steps of:
(e-1) The attention module consists of an entity-recognition attention module, a shared-task attention module and a relationship-classification attention module. The torch.randn() function of the torch toolkit is used to initialize the parameter matrices W_q^ner, W_k^ner and W_v^ner of the entity-recognition attention module, the parameter matrices W_q^share, W_k^share and W_v^share of the shared-task attention module, and the parameter matrices W_q^re, W_k^re and W_v^re of the relationship-classification attention module, where d_att is a dimension.
(e-2) The Q vector of the t-th position of the entity-recognition attention module is computed as Q_t^ner = e_t (W_q^ner)^T, where T denotes the transpose; the K vector as K_t^ner = e_t (W_k^ner)^T; and the V vector as V_t^ner = e_t (W_v^ner)^T. The Q, K and V matrices of the entity-recognition attention module are Q^ner, K^ner and V^ner. The attention score of the t-th Q vector and the j-th K vector is α_tj^ner = softmax(Q_t^ner (K_j^ner)^T / √d_att); all attention scores form the matrix α^ner ∈ R^{N×N}. The entity-recognition vector embedding corresponding to the t-th character is e_t^ner = Σ_j α_tj^ner V_j^ner, and the entity-recognition task vector matrix is e_ner.
(e-3) In the same way, Q_t^share = e_t (W_q^share)^T, K_t^share = e_t (W_k^share)^T and V_t^share = e_t (W_v^share)^T are computed for the shared-task attention module; the attention score of the t-th Q vector and the j-th K vector is α_tj^share = softmax(Q_t^share (K_j^share)^T / √d_att), with α^share ∈ R^{N×N}; the shared vector embedding of the entity-recognition and relationship-classification tasks corresponding to the t-th character is e_t^share = Σ_j α_tj^share V_j^share, and the shared vector matrix is e_share.
(e-4) Likewise, Q_t^re = e_t (W_q^re)^T, K_t^re = e_t (W_k^re)^T and V_t^re = e_t (W_v^re)^T are computed for the relationship-classification attention module; the attention score of the t-th Q vector and the j-th K vector is α_tj^re = softmax(Q_t^re (K_j^re)^T / √d_att), with α^re ∈ R^{N×N}; the relationship-classification vector embedding corresponding to the t-th character is e_t^re = Σ_j α_tj^re V_j^re, and the relationship-classification task vector matrix is e_re.
Through the above calculation, the entity-recognition task vector matrix e_ner, the relationship-classification task vector matrix e_re and the shared vector matrix e_share of the entity-recognition and relationship-classification tasks are obtained, corresponding respectively to the recognition-task, relationship-classification-task and shared-task vectors.
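A minimal numpy sketch of one attention branch of step (e); the same function is applied three times with separate parameter matrices to produce e_ner, e_share and e_re. The shapes (N = 4 characters, d = 8) and random initialization are illustrative, not the patent's settings.

```python
import numpy as np

def attention_branch(e, Wq, Wk, Wv):
    """Scaled dot-product self-attention over one branch: Q = e Wq, K = e Wk,
    V = e Wv, alpha = softmax(Q K^T / sqrt(d_att)), output = alpha V."""
    Q, K, V = e @ Wq, e @ Wk, e @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    alpha = np.exp(scores - scores.max(axis=-1, keepdims=True))
    alpha /= alpha.sum(axis=-1, keepdims=True)      # row-wise softmax
    return alpha @ V

rng = np.random.default_rng(0)
e = rng.normal(size=(4, 8))                          # N = 4 characters, d = 8
e_ner, e_share, e_re = (attention_branch(e, *(rng.normal(size=(8, 8)) for _ in range(3)))
                        for _ in range(3))           # three branches, separate weights
```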
In one embodiment of the invention, step (f) comprises the steps of:
(f-1) Splicing the entity-recognition vector embedding e_t^ner corresponding to the t-th character with the shared vector embedding e_t^share of the entity-recognition and relationship-classification tasks corresponding to the t-th character to obtain the vector e_t^CHE; the vector matrix is e_CHE.
(f-2) Inputting the vector matrix e_CHE into a bidirectional long short-term memory neural network BiLSTM to obtain the vector matrix O_CHE, where o_t^CHE ∈ R^{1×d_h} is the output of the BiLSTM for the t-th character w_t and d_h is a dimension.
(f-3) Inputting the output sequence O_CHE into a single-layer linear network and computing the vector matrix P_CHE by P_CHE = O_CHE W_CHE + b_CHE, where W_CHE is the parameter matrix of the single-layer linear network, b_CHE is its bias term, and L_ner is the number of entity labels. The entity labels are B-Campaign, I-Campaign, B-identity, I-identity, B-tool, I-tool, B-malware, I-malware, B-actor, I-actor, B-vulnerability, I-vulnerability and O. B-Campaign denotes the label of the first character of a cyber-attack-activity (campaign) type entity in the text, and I-Campaign the label of the other characters of such an entity; B-identity denotes the first character of an organization or person type entity, and I-identity its other characters; B-tool denotes the first character of a tool type entity, and I-tool its other characters; B-malware denotes the first character of a malware type entity, and I-malware its other characters; B-actor denotes the first character of a threat-actor type entity, and I-actor its other characters; B-vulnerability denotes the first character of a vulnerability type entity, and I-vulnerability its other characters; O is the label of a non-entity character. B-Campaign, B-identity, B-tool, B-malware, B-actor and B-vulnerability constitute the B label set; I-Campaign, I-identity, I-tool, I-malware, I-actor and I-vulnerability constitute the I label set.
(f-4) Inputting the vector matrix P_CHE into a conditional random field CRF to perform head-entity decoding, obtaining for each character the probabilities of the different entity labels, where p_t is the probability sequence of the t-th character w_t over the labels of the different entity types.
(f-5) Using the torch.argmax() function in the pytorch toolkit, computing the entity label l_t with the highest probability in the probability sequence p_t of the t-th character w_t, obtaining the label sequence of the sentence. The label sequence of the sentence is traversed, and a label subsequence beginning with a label from the B label set and continuing with labels from the I label set is marked as a head entity, yielding the candidate head-entity tag sequence E, E = {entity_1, entity_2, ..., entity_i, ..., entity_m}, where entity_i is the label of the i-th head entity, i ∈ {1, ..., m}, and m is the number of head entities.
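The span-marking rule of step (f-5) — a B-* label opens an entity and consecutive I-* labels of the same type extend it — can be sketched as follows; the (start, end, type) return format is illustrative.

```python
def extract_entities(tags):
    """Scan a BIO label sequence and return (start, end, type) spans:
    a B-<type> tag opens an entity span, and consecutive I-<type> tags
    of the same type extend it (step f-5)."""
    entities, start, etype = [], None, None
    for i, tag in enumerate(tags + ["O"]):          # sentinel closes a trailing span
        inside = start is not None
        if inside and not (tag.startswith("I-") and tag[2:] == etype):
            entities.append((start, i, etype))      # close the open span
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]               # open a new span
    return entities
```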
In one embodiment of the invention, step (g) comprises the steps of:
(g-1) Splicing the relationship-classification vector embedding e_t^re corresponding to the t-th character with the shared vector embedding e_t^share of the entity-recognition and relationship-classification tasks corresponding to the t-th character to obtain the vector e_t^TRE; the vector matrix is e_TRE.
(g-2) In the label entity_i of the i-th head entity, filling entity positions with 1 and non-entity positions with 0 to obtain the mask vector L_i ∈ R^{1×N} of the label entity_i of the i-th head entity; the set of mask vectors of the entities in the j-th sentence s_j is L, L = {L_1, L_2, ..., L_i, ..., L_m}, L ∈ R^{m×N}.
(g-3) Computing the i-th head entity vector embedding by multiplying the mask vector L_i with the vector matrix e_TRE, inputting the i-th head entity vector embedding into a max pooling layer to unify the first dimension of the entity, and outputting the pooled i-th head entity vector embedding entity_i^CHE.
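Steps (g-2)/(g-3) — mask multiplication followed by pooling — can be sketched as below; this is a simplification in which the pooling layer is reduced to a max over the character dimension.

```python
import numpy as np

def pool_entity(e_tre, mask):
    """(g-2)/(g-3): zero out non-entity positions with the 0/1 mask vector,
    then max-pool over the character dimension to get one fixed-size
    entity vector regardless of entity length."""
    masked = e_tre * np.asarray(mask, dtype=float)[:, None]
    return masked.max(axis=0)

# Three characters with 2-dim embeddings; the entity covers positions 1 and 2.
vec = pool_entity(np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]), [0, 1, 1])
```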
In one embodiment of the invention, step (h) comprises the steps of:
(h-1) Using the torch.randn() function of the torch toolkit, initializing the parameter matrices W_k and W_q for the correlation calculation.
(h-2) Multiplying the vector e_t^TRE with the parameter matrix W_k to obtain the key vector k_t; the key vector matrix is K_TRE. Multiplying the pooled j-th head entity vector embedding entity_j^CHE with the parameter matrix W_q to obtain the query vector q_j; the query vector matrix is Q_CHE.
(h-3) Computing the correlation score S_jt of the query vector corresponding to each head entity and the key vector corresponding to the t-th character of each sentence as S_jt = V tanh(q_j + k_t), where V is a parameter matrix.
(h-4) Normalizing the correlation score S_jt with a softmax function to obtain the normalized correlation score α_jt, with α_jt in the range [0, 1].
(h-5) Computing from the normalized score the context representation h_jt of the query vector corresponding to the j-th head entity and the vector corresponding to the t-th character of each sentence, h_jt = α_jt k_t; the set of context representations of all head entities and the vectors corresponding to the t-th character of each sentence is h, h = {h_1t, h_2t, ..., h_jt, ..., h_mt}.
(h-6) Computing the gate g_j of the context representation h_jt as g_j = σ(W_1 h_jt ⊕ W_2 entity_j^CHE + b_1 + b_2), with g_j ∈ [0, 1], where σ(·) is the sigmoid function, W_1 and W_2 are parameter matrices, ⊕ is the splice operation, and b_1 and b_2 are bias terms.
(h-7) Computing the filtered vector u_j = g_j · tanh(W_3 h_jt + b_3), where W_3 is a parameter matrix and b_3 is a bias term.
(h-8) Splicing the pooled j-th head entity vector embedding with the filtered vector u_j to obtain the j-th head-entity-and-relationship correlation representation vector o_j; all head-entity-and-relationship correlation representations form the sequence O. The correlation calculation of entities and relationships uses the vector matrix e_TRE and the head entity vector set entity_CHE; this step corresponds to the entity-relationship correlation calculation module.
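The gating of steps (h-6)/(h-7) can be sketched as below. The exact combination of W_1, W_2, the splice and the bias terms is not fully recoverable from the formula images, so this is one plausible reading, with the function name gated_filter illustrative.

```python
import numpy as np

def gated_filter(h_jt, e_head, W1, W2, W3, b1, b2, b3):
    """g_j = sigmoid(W1 h_jt + b1 + W2 e_head + b2) gates a tanh transform
    of the context vector: u_j = g_j * tanh(W3 h_jt + b3)  (steps h-6 / h-7).
    The gate suppresses context features that are irrelevant to the head entity."""
    g = 1.0 / (1.0 + np.exp(-(W1 @ h_jt + b1 + W2 @ e_head + b2)))  # sigmoid gate
    return g * np.tanh(W3 @ h_jt + b3)                              # filtered vector
```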
In one embodiment of the invention, step (i) comprises the steps of:
(i-1) Inputting all head-entity-and-relationship correlation representation sequences O into a bidirectional long short-term memory neural network BiLSTM for context learning of the feature sequence; processing by the BiLSTM gives the vector sequence O_re, where o_j^re is the output of the BiLSTM for the j-th head-entity-and-relationship correlation representation vector o_j.
(i-2) Inputting the j-th head-entity-and-relationship correlation representation into a single-layer linear network and computing the vector matrix P_j^re = o_j^re W_re + b_re, where W_re is the parameter matrix of the single-layer linear network and b_re is its bias term.
(i-3) Inputting the vector matrix P_j^re into a conditional random field CRF to obtain, for the j-th head entity entity_j, the probabilities of the different entity labels in the corresponding sentence, where p_t^j is the probability sequence of the t-th character w_t of the sentence corresponding to the j-th head entity entity_j over the labels of the different entity types; the set of probability values corresponding to all head entities is P.
(i-4) Using the torch.argmax() function in the pytorch toolkit, computing the entity label l_j ∈ R^N with the highest probability among the probabilities of the different entity labels in the sentence corresponding to the j-th head entity entity_j. The entity labels of the maximum-probability label l_j are B-Campaign, I-Campaign, B-identity, I-identity, B-tool, I-tool, B-malware, I-malware, B-actor, I-actor, B-vulnerability, I-vulnerability and O; all tail-entity label sequences form l.
(i-5) Defining the highest-probability entity label l_j as a tail-entity label sequence; a label subsequence beginning with a label from the B label set and continuing with labels from the I label set is marked as a tail entity, giving the tail-entity label sequence E′, E′ = {entity_1′, entity_2′, ..., entity_i′, ..., entity_n′}, where entity_i′ is the label of the i-th tail entity, i ∈ {1, ..., n}, and n is the number of tail entities.
(i-6) In the label entity_i′ of the i-th tail entity, filling entity positions with 1 and non-entity positions with 0 to obtain the mask vector L_i′ ∈ R^{1×N} of the label entity_i′ of the i-th tail entity; the mask vector set of the tail entities corresponding to the j-th head entity entity_j is L_j, L_j = {L_1′, L_2′, ..., L_i′, ..., L_n′}, L_j ∈ R^{n×N}.
(i-7) Computing the i-th tail entity vector embedding by multiplying the mask vector L_i′ with the vector matrix e_TRE; all tail entity vector embedding sequences form entity_tail. Inputting the i-th tail entity vector embedding into the max pooling layer and outputting the pooled i-th tail entity vector embedding; all pooled vector sequences form entity_tail′.
(i-8) The pooled i-th tail entity vector embedding is spliced with the j-th head-entity-and-relationship correlation vector o_j to obtain the vector x_ij; x_ij is input into a single-layer linear network, and the vector matrix y_ij^re is computed as y_ij^re = x_ij W_re′ + b_re′, where W_re′ is the parameter matrix of the single-layer linear network, b_re′ is its bias term, and L_re is the number of relationship labels. The relationship labels are use, targets and other; the labels use and targets represent relationships between entities of the predefined types, and the label other represents that no relationship exists between the entities. Specifically, the relationship use is defined for the label pairs (B-malware, B-Campaign), (I-malware, I-Campaign), (B-actor, B-Campaign), (I-actor, I-Campaign), (B-actor, B-tool), (I-actor, I-tool), (B-actor, B-malware) and (I-actor, I-malware); the relationship targets is defined for the label pairs (B-actor, B-vulnerability), (I-actor, I-vulnerability), (B-malware, B-vulnerability), (I-malware, I-vulnerability), (B-actor, B-identity), (I-actor, I-identity), (B-malware, B-identity) and (I-malware, I-identity); the relationship other is defined for the label pairs (B-Campaign, B-vulnerability), (I-Campaign, I-vulnerability), (B-tool, B-vulnerability), (I-tool, I-vulnerability), (B-tool, B-Campaign), (I-tool, I-Campaign), (B-identity, B-Campaign), (I-identity, I-Campaign), (B-tool, B-identity), (I-tool, I-identity), (B-identity, B-vulnerability) and (I-identity, I-vulnerability).
(i-9) The vector matrix y_ij^re is normalized with a softmax function to obtain the normalized vector, and the relationship label rel_ij corresponding to the most probable entity label l_j is obtained using the torch.argmax() function in the pytorch toolkit, giving the triple <entity_j, rel_ij, entity_i′>. Extraction yields all triple sequences corresponding to the j-th head entity entity_j:
{<entity_j, rel_1j, entity_1′>, <entity_j, rel_2j, entity_2′>, ..., <entity_j, rel_ij, entity_i′>, ..., <entity_j, rel_nj, entity_n′>}.
Finally, it should be noted that the foregoing description is only a preferred embodiment of the present invention and that the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing embodiments or substitute equivalents for some of the technical features. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.

Claims (10)

1. A method for jointly extracting entities and relationships from cyber security threat intelligence, comprising the steps of:
(a) collecting n cyber security threat intelligence articles to obtain a cyber security threat intelligence set D, D = {d_1, d_2, ..., d_i, ..., d_n}, where d_i is the i-th cyber security threat intelligence article, i ∈ {1, ..., n};
(b) performing a data preprocessing operation on the i-th article d_i to obtain a sentence set S_i, S_i = {s_1, s_2, ..., s_j, ..., s_l}, where s_j is the j-th sentence in S_i, j ∈ {1, ..., l}, and l is the number of sentences in S_i;
(c) segmenting the j-th sentence s_j to obtain a character sequence X_j, X_j = {w_1, w_2, ..., w_t, ..., w_N}, where w_t is the t-th character, t ∈ {1, ..., N}, and N is the number of characters in s_j;
(d) obtaining the embedded representation e_t of the t-th character; the vector embedding sequence of s_j is e, e = {e_1, e_2, ..., e_t, ..., e_N};
(e) inputting the vector embedding sequence e into an attention module to obtain the entity-recognition task vector matrix e_ner, the relationship-classification task vector matrix e_re and the shared vector matrix e_share of the entity-recognition and relationship-classification tasks, where e_t^ner is the entity-recognition vector embedding corresponding to the t-th character, e_t^re is the relationship-classification vector embedding corresponding to the t-th character, and e_t^share is the shared vector embedding of the entity-recognition and relationship-classification tasks corresponding to the t-th character;
(f) obtaining a candidate head-entity tag sequence E from e_ner and e_share, E = {entity_1, entity_2, ..., entity_i, ..., entity_m}, where entity_i is the label of the i-th entity, i ∈ {1, ..., m}, and m is the number of entities;
(g) obtaining the set of all head entity vectors entity_CHE from e_re and e_share, where entity_i^CHE is the pooled i-th head entity vector embedding, i ∈ {1, ..., m};
(h) computing from entity_CHE the sequence O of all head-entity-and-relationship correlation representations, O = {o_1, o_2, ..., o_j, ..., o_m}, where o_j is the correlation representation vector of the j-th head entity and the relationships, j ∈ {1, ..., m};
(i) obtaining the triple sequence of the j-th head entity entity_j from the correlation representation sequence O.
2. The method for jointly extracting entities and relationships from cyber-security threat intelligence of claim 1, wherein: in step (a), n cyber security threat intelligence articles are collected from public vulnerability reports and/or social media and/or security news.
3. The method of claim 1, wherein the data preprocessing operation in step (b) comprises:
(b-1) removing punctuation marks, mathematical operation symbols, brackets, quotation marks and the symbols @, #, $, %, ^ and & from the i-th cyber security threat intelligence article d_i to perform the noise-removal operation;
(b-2) converting the noise-removed i-th cyber security threat intelligence article d_i into the sentence set S_i using the split() function of the python program with the period as delimiter.
4. The method for jointly extracting entities and relationships from cyber-security threat intelligence of claim 1, wherein: in step (c) the j-th sentence s_j is segmented using the tokenizer function in the Transformers toolkit.
5. The method of claim 1, wherein step (d) comprises the steps of:
(d-1) inputting the t-th character w_t into a Bert pre-training model and outputting the feature vector x_t^w ∈ R^{1×d_w}, where R is the real space and d_w is a dimension;
(d-2) inputting the t-th character w_t into a Glove pre-training model and outputting the feature vector x_t^g ∈ R^{1×d_g}, where d_g is a dimension;
(d-3) embedding the t-th character w_t with the spaCy part-of-speech tagging tool to obtain the part-of-speech vector x_t^p ∈ R^{1×d_p}, where d_p is a dimension;
(d-4) splicing the feature vector x_t^w, the feature vector x_t^g and the part-of-speech vector x_t^p to obtain the embedded representation e_t of the t-th character, e_t = x_t^w ⊕ x_t^g ⊕ x_t^p; the vector embedding sequence of the j-th sentence s_j is e.
6. The method for jointly extracting entities and relationships from cyber-security threat intelligence of claim 5, wherein step (e) comprises the steps of:
(e-1) the attention module is composed of an entity-recognition attention module, a shared-task attention module and a relationship-classification attention module; the torch.randn() function of the torch toolkit is used to initialize the parameter matrices W_q^ner, W_k^ner and W_v^ner of the entity-recognition attention module, the parameter matrices W_q^share, W_k^share and W_v^share of the shared-task attention module, and the parameter matrices W_q^re, W_k^re and W_v^re of the relationship-classification attention module, where d_att is a dimension;
(e-2) the Q vector of the t-th position of the entity-recognition attention module is computed as Q_t^ner = e_t (W_q^ner)^T, where T denotes the transpose; the K vector as K_t^ner = e_t (W_k^ner)^T; and the V vector as V_t^ner = e_t (W_v^ner)^T; the Q, K and V matrices of the entity-recognition attention module are Q^ner, K^ner and V^ner; the attention score of the t-th Q vector and the j-th K vector is α_tj^ner = softmax(Q_t^ner (K_j^ner)^T / √d_att), with α^ner ∈ R^{N×N}; the entity-recognition vector embedding corresponding to the t-th character is e_t^ner = Σ_j α_tj^ner V_j^ner, and the entity-recognition task vector matrix is e_ner;
(e-3) in the same way, Q_t^share = e_t (W_q^share)^T, K_t^share = e_t (W_k^share)^T and V_t^share = e_t (W_v^share)^T are computed for the shared-task attention module; the attention score of the t-th Q vector and the j-th K vector is α_tj^share = softmax(Q_t^share (K_j^share)^T / √d_att), with α^share ∈ R^{N×N}; the shared vector embedding of the entity-recognition and relationship-classification tasks corresponding to the t-th character is e_t^share = Σ_j α_tj^share V_j^share, and the shared vector matrix is e_share;
(e-4) likewise, Q_t^re = e_t (W_q^re)^T, K_t^re = e_t (W_k^re)^T and V_t^re = e_t (W_v^re)^T are computed for the relationship-classification attention module; the attention score of the t-th Q vector and the j-th K vector is α_tj^re = softmax(Q_t^re (K_j^re)^T / √d_att), with α^re ∈ R^{N×N}; the relationship-classification vector embedding corresponding to the t-th character is e_t^re = Σ_j α_tj^re V_j^re, and the relationship-classification task vector matrix is e_re.
7. The method for jointly extracting entities and relationships from cyber-security threat intelligence of claim 6, wherein step (f) comprises the steps of:
(f-1) The entity recognition vector embedding e_t^ner corresponding to the t-th character and the shared vector embedding e_t^share of the entity recognition and relation classification tasks corresponding to the t-th character are spliced to obtain the vector e_t^CHE = [e_t^ner; e_t^share], whose vector matrix is e^CHE;
(f-2) The vector matrix e^CHE is input into a bidirectional long short-term memory network (BiLSTM) to obtain the vector matrix O^CHE = {O_1^CHE, ..., O_N^CHE}, where O_t^CHE is the BiLSTM output for the t-th character w_t, O_t^CHE ∈ R^{d_h}, and d_h is the hidden dimension;
(f-3) The output vector matrix O^CHE is input into a single-layer linear network, and the vector matrix P^CHE is calculated through the formula P^CHE = O^CHE · W^CHE + b^CHE, where L_ner is the number of entity tags, W^CHE is the parameter matrix of the single-layer linear network, W^CHE ∈ R^{d_h×L_ner}, and b^CHE is the bias term of the single-layer linear network. The entity tags are B-campaign, I-campaign, B-identity, I-identity, B-tool, I-tool, B-malware, I-malware, B-actor, I-actor, B-vulnerability, I-vulnerability, and O. B-campaign denotes the tag of the first character of a cyber-attack-campaign entity in the text, and I-campaign the tag of the characters of such an entity other than the first; B-identity denotes the tag of the first character of an organization-or-person entity, and I-identity the tag of its other characters; B-tool denotes the tag of the first character of a tool entity, and I-tool the tag of its other characters; B-malware denotes the tag of the first character of a malware entity, and I-malware the tag of its other characters; B-actor denotes the tag of the first character of a threat-actor entity, and I-actor the tag of its other characters; B-vulnerability denotes the tag of the first character of a vulnerability entity, and I-vulnerability the tag of its other characters; O is the tag of non-entity characters. B-campaign, B-identity, B-tool, B-malware, B-actor, and B-vulnerability constitute the B tag set; I-campaign, I-identity, I-tool, I-malware, I-actor, and I-vulnerability constitute the I tag set;
(f-4) The vector matrix P^CHE is input into a conditional random field (CRF) to obtain the probability of each character corresponding to the different entity tags, where P_t^CHE is the probability sequence of the t-th character w_t over the different entity-type tags;
(f-5) The entity tag l_t with the highest probability in the probability sequence P_t^CHE of the t-th character w_t is calculated using the torch.argmax() function in the pytorch toolkit, giving the tag sequence l = {l_1, l_2, ..., l_N} of the sentence. The tag sequence of the sentence is traversed, using tags in the B tag set as the start of a head entity and tags in the I tag set as its continuation, to obtain the head-entity tag sequence E, E = {entity_1, entity_2, ..., entity_i, ..., entity_m}, where entity_i is the tag of the i-th head entity, i ∈ {1, ..., m}, and m is the number of head entities.
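The traversal in step (f-5), which collects B-started, I-continued spans from the decoded tag sequence, can be sketched as follows. This is an illustrative reading of the step, not the patent's exact procedure; the function name and the (type, positions) output format are assumptions:

```python
def extract_entities(tags):
    """Scan a BIO tag sequence: a B-* tag opens a span, matching I-* tags
    of the same type extend it, and anything else closes it.
    Returns a list of (entity_type, character_positions) pairs."""
    entities = []
    etype, span = None, []
    for i, tag in enumerate(tags):
        if tag.startswith("B-"):
            if span:
                entities.append((etype, span))
            etype, span = tag[2:], [i]
        elif tag.startswith("I-") and span and tag[2:] == etype:
            span.append(i)
        else:
            if span:
                entities.append((etype, span))
            etype, span = None, []
    if span:  # close a span that runs to the end of the sentence
        entities.append((etype, span))
    return entities
```

For example, the tag sequence B-malware, I-malware, O, B-actor, O yields one malware span covering positions 0-1 and one actor span at position 3.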
8. The method for jointly extracting entities and relationships from cyber-security threat intelligence of claim 7, wherein step (g) comprises the steps of:
(g-1) The relation classification vector embedding e_t^re corresponding to the t-th character and the shared vector embedding e_t^share of the entity recognition and relation classification tasks corresponding to the t-th character are spliced to obtain the vector e_t^TRE = [e_t^re; e_t^share], whose vector matrix is e^TRE;
(g-2) The entity positions in the tag entity_i of the i-th head entity are filled with 1 and the non-entity positions with 0 to obtain the mask vector L_i of the tag entity_i, L_i ∈ R^{1×N}; the set of mask vectors of the entities in the j-th sentence s_j is L, L = {L_1, L_2, ..., L_i, ..., L_m}, L ∈ R^{m×N};
(g-3) The i-th head-entity vector embedding entity_i^head is calculated by using the mask L_i to select the character vectors of e^TRE at the entity positions; entity_i^head is then input into the max-pooling layer, which outputs the pooled i-th head-entity vector embedding entity_i^head′.
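Steps (g-2) and (g-3) amount to building a 0/1 mask over the sentence's characters and max-pooling the vectors at the masked positions into a single entity embedding. A minimal pure-Python sketch, with the function name and list-based representation as illustrative assumptions:

```python
def entity_embedding(mask, vectors):
    """Masked max-pooling of per-character vectors into one entity vector.

    mask: length-N list of 0/1 flags (1 at the entity's character positions).
    vectors: N x d list of per-character embedding vectors.
    Returns the dimension-wise maximum over the selected positions."""
    selected = [v for m, v in zip(mask, vectors) if m == 1]
    d = len(vectors[0])
    return [max(v[i] for v in selected) for i in range(d)]
```

For a two-character entity at positions 1 and 2, the result keeps the largest value of each dimension across those two character vectors.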
9. The method for jointly extracting entities and relationships from cyber-security threat intelligence of claim 8, wherein step (h) comprises the steps of:
(h-1) The parameter matrices W_K^TRE and W_Q^TRE used for the correlation calculation are initialized using the torch.randn() function of the pytorch toolkit;
(h-2) The vector e_t^TRE is multiplied by the parameter matrix W_K^TRE to obtain the key vector K_t^TRE, and the key vector matrix is K^TRE; the pooled j-th head-entity vector embedding entity_j^head′ is multiplied by the parameter matrix W_Q^TRE to obtain the query vector Q_j, and the query vector matrix is Q^CHE;
(h-3) The correlation score S_jt between the query vector corresponding to each head entity and the key vector corresponding to the t-th character of each sentence is calculated through the formula S_jt = V · tanh(Q_j + K_t^TRE), where V is a parameter matrix;
(h-4) The correlation score S_jt is normalized using a softmax function to obtain the normalized correlation score α_jt, whose value range is [0, 1];
(h-5) The contextual representation h_jt of the query vector corresponding to the j-th head entity and the vector corresponding to the t-th character of each sentence is calculated as h_jt = α_jt · K_t^TRE; the set of contextual representations of all head entities and the vectors corresponding to the t-th character of each sentence is h = {h_1t, h_2t, ..., h_jt, ..., h_mt};
(h-6) The gate g_j of the contextual representation h_jt is calculated through the formula g_j = σ(W_2 · (W_1 · [h_jt; Q_j] + b_1) + b_2), g_j ∈ [0, 1], where σ(·) is the sigmoid function, W_1 and W_2 are parameter matrices, [·;·] denotes the splice operation, and b_1 and b_2 are bias terms;
(h-7) The filtered vector u_j is calculated through the formula u_j = g_j · tanh(W_3 · h_jt + b_3), where W_3 is a parameter matrix and b_3 is a bias term;
(h-8) The query vector Q_j corresponding to the j-th head entity and the filtered vector u_j are spliced to obtain the correlation representation vector o_j of the j-th head entity and the relation, o_j = [Q_j; u_j]; the correlation representation sequence of all head entities and relations is O = {o_1, o_2, ..., o_j, ..., o_m}.
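Steps (h-6) and (h-7) apply a sigmoid gate to a tanh transform of the context representation, so that g_j ∈ [0, 1] scales how much of the context survives filtering. The sketch below collapses the patent's matrices and splice to scalar weights to show only the gating arithmetic; all names and the scalar form are illustrative assumptions:

```python
import math

def gated_filter(h_jt, w1, w2, w3, b1, b2, b3):
    """Scalar stand-in for steps (h-6)-(h-7): a sigmoid gate g_j in [0, 1]
    scales a tanh transform of the context representation h_jt, producing
    the filtered value u_j."""
    g = 1.0 / (1.0 + math.exp(-(w2 * (w1 * h_jt + b1) + b2)))  # gate g_j
    return g * math.tanh(w3 * h_jt + b3)                        # filtered u_j
```

Because tanh is bounded in (-1, 1) and the gate in (0, 1), the filtered output is always bounded in (-1, 1) regardless of the input scale.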
10. The method for jointly extracting entities and relationships from cyber-security threat intelligence of claim 9, wherein step (i) comprises the steps of:
(i-1) The correlation representation sequence O of all head entities and relations is input into a bidirectional long short-term memory network (BiLSTM) to obtain the vector sequence O^re, where o_j^re is the BiLSTM output of the correlation representation vector o_j of the j-th head entity and the relation;
(i-2) The BiLSTM output o_j^re of the j-th head-entity correlation representation is input into a single-layer linear network, and the vector matrix P_j^re is calculated through the formula P_j^re = o_j^re · W^re + b^re, where W^re is the parameter matrix of the single-layer linear network and b^re is its bias term;
(i-3) The vector matrix P_j^re is input into the conditional random field CRF to obtain the probabilities of the different entity tags in the sentence corresponding to the j-th head entity entity_j, where P_jt^re is the probability sequence of the t-th character w_t of the sentence corresponding to the j-th head entity over the different entity-type tags, and the set of probability values corresponding to all head entities is P^re;
(i-4) The entity tag sequence l_j with the highest probability among the probabilities of the different entity tags in the sentence corresponding to the j-th head entity entity_j is calculated using the torch.argmax() function in the pytorch toolkit, l_j ∈ R^N; the entity tags of the maximum-probability sequence l_j are B-campaign, I-campaign, B-identity, I-identity, B-tool, I-tool, B-malware, I-malware, B-actor, I-actor, B-vulnerability, I-vulnerability, and O, and the tag sequences of all tail entities are l = {l_1, ..., l_m};
(i-5) The highest-probability entity tag sequence l_j is defined as the tail-entity tag sequence; using tags in the B tag set as the start and tags in the I tag set as the continuation, the spans are marked as tail-entity tags, giving the tail-entity tag sequence E′, E′ = {entity_1′, entity_2′, ..., entity_i′, ..., entity_n′}, where entity_i′ is the tag of the i-th tail entity, i ∈ {1, ..., n}, and n is the number of tail entities;
(i-6) The entity positions in the tag entity_i′ of the i-th tail entity are filled with 1 and the non-entity positions with 0 to obtain the mask vector L_i′ of the tag entity_i′, L_i′ ∈ R^{1×N}; the set of mask vectors of the tail entities corresponding to the j-th head entity entity_j is L_j, L_j = {L_1′, L_2′, ..., L_i′, ..., L_n′}, L_j ∈ R^{n×N};
(i-7) The i-th tail-entity vector embedding entity_i^tail is calculated by using the mask L_i′ to select the character vectors at the tail-entity positions, and the sequence of all tail-entity vector embeddings is entity^tail; the i-th tail-entity vector embedding entity_i^tail is input into the max-pooling layer, which outputs the pooled i-th tail-entity vector embedding entity_i^tail′, and the sequence of all pooled vectors is entity^tail′;
(i-8) The pooled i-th tail-entity vector embedding entity_i^tail′ and the pooled j-th head-entity vector embedding entity_j^head′ are spliced to obtain the vector [entity_i^tail′; entity_j^head′], which is input into a single-layer linear network; the vector matrix P_ij^re is calculated through the formula P_ij^re = [entity_i^tail′; entity_j^head′] · W^re′ + b^re′, where W^re′ is the parameter matrix of the single-layer linear network, b^re′ is its bias term, and L_re is the number of relation tags, L_re = 3. The relation tags are use, targets, and other. The relation between tag B-malware and tag B-campaign is defined as use, as is the relation between tag I-malware and tag I-campaign; the relation between tag B-actor and tag B-campaign is defined as use, as is the relation between tag I-actor and tag I-campaign; the relation between tag B-actor and tag B-tool is defined as use, as is the relation between tag I-actor and tag I-tool; the relation between tag B-actor and tag B-malware is defined as use, as is the relation between tag I-actor and tag I-malware. The relation between tag B-actor and tag B-vulnerability is defined as targets, as is the relation between tag I-actor and tag I-vulnerability; the relation between tag B-malware and tag B-vulnerability is defined as targets, as is the relation between tag I-malware and tag I-vulnerability; the relation between tag B-actor and tag B-identity is defined as targets, as is the relation between tag I-actor and tag I-identity; the relation between tag B-malware and tag B-identity is defined as targets, as is the relation between tag I-malware and tag I-identity. The relation between tag B-campaign and tag B-vulnerability is defined as other, as is the relation between tag I-campaign and tag I-vulnerability; the relation between tag B-tool and tag B-vulnerability is defined as other, as is the relation between tag I-tool and tag I-vulnerability; the relation between tag B-tool and tag B-campaign is defined as other, as is the relation between tag I-tool and tag I-campaign; the relation between tag B-identity and tag B-campaign is defined as other, as is the relation between tag I-identity and tag I-campaign; the relation between tag B-tool and tag B-identity is defined as other, as is the relation between tag I-tool and tag I-identity;
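The relation definitions enumerated in step (i-8) reduce to a lookup table over pairs of entity types. The sketch below is a hypothetical rendering of that table, with the garbled tag spellings normalized to campaign/identity/tool/malware/actor/vulnerability; the set contents mirror only the pairs the claim lists, and the function name is an assumption:

```python
# Pairs of (head_type, tail_type) the claim assigns the "use" relation:
USE = {("malware", "campaign"), ("actor", "campaign"),
       ("actor", "tool"), ("actor", "malware")}

# Pairs the claim assigns the "targets" relation:
TARGETS = {("actor", "vulnerability"), ("malware", "vulnerability"),
           ("actor", "identity"), ("malware", "identity")}

def relation_for(head_type, tail_type):
    """Look up the relation tag for a head/tail entity-type pair;
    any pair not listed as use or targets falls back to other."""
    pair = (head_type, tail_type)
    if pair in USE:
        return "use"
    if pair in TARGETS:
        return "targets"
    return "other"
```

Encoding the table once over base entity types avoids repeating every rule for both the B- and I- variants of each tag.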
(i-9) The vector matrix P_ij^re is normalized using a softmax function to obtain the normalized vector; the relation tag rel_ij corresponding to the highest-probability entity tag l_j is obtained using the torch.argmax() function in the pytorch toolkit, giving the triple <entity_j, rel_ij, entity_i′>; all triple sequences corresponding to the j-th head entity entity_j are extracted as {<entity_j, rel_1j, entity_1′>, <entity_j, rel_2j, entity_2′>, ..., <entity_j, rel_ij, entity_i′>, ..., <entity_j, rel_nj, entity_n′>}.
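Step (i-9)'s softmax-then-argmax decoding can be sketched as follows. Since softmax is monotone, taking the argmax over the raw linear-layer scores selects the same relation label as taking it over the normalized probabilities; the function name, score layout, and entity strings are illustrative assumptions:

```python
def decode_relations(scores, head, tails, labels=("use", "targets", "other")):
    """Turn per-tail relation scores into triples for one head entity.

    scores: one row of unnormalized scores over the relation labels per
    tail entity. softmax is monotone, so argmax over the raw scores picks
    the same label as argmax over the softmax-normalized probabilities."""
    triples = []
    for tail, row in zip(tails, scores):
        rel = labels[max(range(len(row)), key=row.__getitem__)]
        triples.append((head, rel, tail))
    return triples
```

With two tail entities whose score rows peak on the first and second labels respectively, the head entity yields one use triple and one targets triple.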
CN202311302393.7A 2023-10-10 2023-10-10 Method for extracting entity and relation from network security threat information combination Active CN117332785B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311302393.7A CN117332785B (en) 2023-10-10 2023-10-10 Method for extracting entity and relation from network security threat information combination

Publications (2)

Publication Number Publication Date
CN117332785A (en) 2024-01-02
CN117332785B (en) 2024-03-01

Family

ID=89274987

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311302393.7A Active CN117332785B (en) 2023-10-10 2023-10-10 Method for extracting entity and relation from network security threat information combination

Country Status (1)

Country Link
CN (1) CN117332785B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111368528A (en) * 2020-03-09 2020-07-03 西南交通大学 Entity relation joint extraction method for medical texts
CN111950297A (en) * 2020-08-26 2020-11-17 桂林电子科技大学 Abnormal event oriented relation extraction method
CN113806559A (en) * 2021-09-24 2021-12-17 东南大学 Knowledge graph embedding method based on relationship path and double-layer attention
CN114330322A (en) * 2022-01-05 2022-04-12 北京邮电大学 Threat information extraction method based on deep learning
CN115759092A (en) * 2022-10-13 2023-03-07 中国民航大学 Network threat information named entity identification method based on ALBERT
CN115860152A (en) * 2023-02-20 2023-03-28 南京星耀智能科技有限公司 Cross-modal joint learning method oriented to character military knowledge discovery

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhang Chaoran; Qiu Hangping; Sun Yi; Wang Zhongwei. A survey of machine reading comprehension based on pre-trained models. Computer Engineering and Applications, 2020, (11): 22-30. *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant