CN108959084B - Markov vulnerability prediction quantity method based on smoothing method and similarity - Google Patents

Publication number
CN108959084B
Authority
CN
China
Prior art keywords
equal
less
vulnerability
time node
prediction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810701155.6A
Other languages
Chinese (zh)
Other versions
CN108959084A (en
Inventor
高岭
张晓�
冯通
杨旭东
孙骞
王海
郑杰
赵子鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwest University
Original Assignee
Northwest University
Application filed by Northwest University
Priority to CN201810701155.6A
Publication of CN108959084A
Application granted
Publication of CN108959084B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/36 Preventing errors by testing or debugging software
    • G06F11/3604 Software analysis for verifying properties of programs
    • G06F11/3608 Software analysis for verifying properties of programs using formal methods, e.g. model checking, abstract interpretation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A Markov method for predicting the number of vulnerabilities, based on a smoothing method and similarity. Taking security vulnerabilities as the object of study, the method examines their historical data to form a vulnerability complete set and divides that set into a direct prediction set and an indirect prediction set. The direct prediction set is predicted with a Markov method improved by exponential smoothing. Cosine similarity is then used to relate the indirect prediction set to the direct prediction set, and the indirect prediction set is predicted from that relation. Finally, the prediction results for the two sets are combined to give practitioners in related fields a high-accuracy predicted value.

Description

Markov vulnerability prediction quantity method based on smoothing method and similarity
Technical Field
The invention belongs to the technical field of computer information security. It relates to exponential smoothing, cosine similarity and the Markov algorithm, and in particular to a Markov method for predicting the number of vulnerabilities based on a smoothing method and similarity.
Background
Traditional software engineering holds that, because of programmers' limited ability or insufficient experience, unreasonable software development processes and similar causes, software inevitably contains defects and hidden dangers. Those that bear on the security of a computer system are called security vulnerabilities. With the rapid development of computer science, every industry is paying more attention to security vulnerabilities. Providing practitioners in related fields with a highly accurate method for predicting the number of security vulnerabilities is therefore significant work.
For quantity prediction, the conventional approach rests on statistical principles and takes a statistical index (such as the arithmetic mean) as the predicted value. Although simple, this approach ignores the correlation between different types of data, so it rarely yields an accurate predicted value. The Markov algorithm is a modern prediction method that fully accounts for the transitions between different types of data and greatly improves prediction accuracy over the conventional approach. How to determine the state transition matrix more scientifically, however, remains a major problem for researchers.
Disclosure of Invention
To overcome the above deficiencies of the prior art, the invention provides a Markov method for predicting the number of vulnerabilities based on a smoothing method and similarity, which improves the Markov algorithm with exponential smoothing and cosine similarity. First, a divide-and-conquer strategy is adopted: taking the average level of the vulnerability complete set as a reference, vulnerabilities closer to that level are placed in a direct prediction set and those farther from it in an indirect prediction set. Second, the data distribution of each type of vulnerability in the direct prediction set is examined, and the state transition matrix and probability matrix of the Markov method are iteratively improved by exponential smoothing to obtain the predicted values for the direct prediction set. Finally, cosine similarity is used to find, for each type of vulnerability in the indirect prediction set, the most similar vulnerability type in the direct prediction set; the predicted value of that type is then scaled appropriately to obtain the predicted number of vulnerabilities in the indirect prediction set.
To achieve this purpose, the invention adopts the following technical scheme:
A Markov method for predicting the number of vulnerabilities based on a smoothing method and similarity, characterized by comprising the following steps:
(1) Examine the historical quantity information of security vulnerabilities to form a vulnerability complete set, and divide the vulnerability complete set into a direct prediction set and an indirect prediction set
Examine the historical quantity information of the security vulnerabilities to form the vulnerability complete set, denoted U. U contains all quantity information of n types of vulnerabilities at m time nodes:
$$U = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1m} \\ u_{21} & u_{22} & \cdots & u_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ u_{n1} & u_{n2} & \cdots & u_{nm} \end{bmatrix}$$
where $u_{ij}$ is the number of the i-th vulnerability at the j-th time node, $1 \le i \le n$, $1 \le j \le m$. Examine the average number level of the vulnerability complete set, that is, calculate the arithmetic mean of all data in U,
$$\frac{1}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m} u_{ij}$$
and round the calculated arithmetic mean down:
$$\left\lfloor \frac{1}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m} u_{ij} \right\rfloor$$
Denote the i-th vulnerability by $U_i$. Examine the average number level of each type of vulnerability, that is, calculate the arithmetic mean of each row of U and round it down:
$$\left\lfloor \frac{1}{m}\sum_{j=1}^{m} u_{ij} \right\rfloor$$
where $1 \le i \le n$.
Take the rounded-down mean of the complete set as the reference, 1 as the initial step length, and p as the parameter that determines the range of the direct prediction set. In general $0 < p < 1$; to ensure the accuracy of the prediction result, the recommended range is $0.5 \le p < 1$.
The algorithm is as follows.
Algorithm: dividing the vulnerability complete set into a direct prediction set and an indirect prediction set
Input: vulnerability complete set U, parameter p
Output: direct prediction set S and indirect prediction set
Figure GDA0003289691800000034
Let
Figure GDA0003289691800000035
sum=0,
Figure GDA0003289691800000038
Figure GDA0003289691800000037
Figure GDA0003289691800000041
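The division step can be sketched as follows. Because the patent's own pseudocode survives here only as figures, the stopping rule is an assumption: the interval around the rounded-down overall mean is widened one step at a time until at least a fraction p of the vulnerability types have their rounded-down row mean inside it. The function name `split_complete_set` and the returned `step` are hypothetical.

```python
import math


def split_complete_set(U, p):
    """Split the rows of U (n types x m time nodes) into direct and indirect sets.

    Returns (direct_rows, indirect_rows, step); the row lists hold zero-based
    row indices of U.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0 < p < 1")
    n, m = len(U), len(U[0])
    # Rounded-down arithmetic mean of all data in U (the reference value).
    center = math.floor(sum(sum(row) for row in U) / (n * m))
    # Rounded-down arithmetic mean of each row (one per vulnerability type).
    row_means = [math.floor(sum(row) / m) for row in U]

    step = 1  # initial step length
    while True:
        direct = [i for i, mean in enumerate(row_means)
                  if abs(mean - center) <= step]
        # ASSUMPTION: stop once a fraction p of all types lies in the interval.
        if len(direct) >= p * n:
            break
        step += 1
    indirect = [i for i in range(n) if i not in direct]
    return direct, indirect, step
```

Under this assumed rule the loop always terminates, since a wide enough interval contains every row and p is below 1.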
(2) Predict the number of each type of vulnerability in the direct prediction set:
Predicting the number of each type of vulnerability in the direct prediction set mainly comprises the following steps:
1) Obtain the actual state transition matrix
Let the direct prediction set S contain w elements, that is, S contains all quantity information of w types of vulnerabilities at m time nodes. Write
$$S = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1m} \\ s_{21} & s_{22} & \cdots & s_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ s_{w1} & s_{w2} & \cdots & s_{wm} \end{bmatrix}$$
where $s_{ij}$ is the number of the i-th vulnerability at the j-th time node, $1 \le i \le w$, $1 \le j \le m$.
Let $Q_t$ denote the actual state transition matrix from the (t-1)-th time node to the t-th time node:
$$Q_t = \begin{bmatrix} q_{11t} & q_{12t} & \cdots & q_{1wt} \\ q_{21t} & q_{22t} & \cdots & q_{2wt} \\ \vdots & \vdots & \ddots & \vdots \\ q_{w1t} & q_{w2t} & \cdots & q_{wwt} \end{bmatrix}$$
where $q_{ijt}$ is the actual probability that the i-th vulnerability transfers into the j-th vulnerability from the (t-1)-th time node to the t-th time node, $1 < t \le m$, $1 \le i \le w$, $1 \le j \le w$.
Let $d_{ij}$ denote the number of times that the number of the i-th vulnerability decreases while the number of the j-th vulnerability increases, $1 \le i \le w$, $1 \le j \le w$, $j \ne i$; $f_{ij}$ denotes the ratio of the number of times the i-th vulnerability decreases while the j-th vulnerability increases to the number of times the i-th vulnerability decreases, that is,
Figure GDA0003289691800000051
where $1 \le i \le w$, $1 \le j \le w$, $j \ne i$; $q_{it}$ denotes the i-th row of $Q_t$, where $1 \le i \le w$.
The algorithm is as follows. Algorithm: determining $q_{it}$, the i-th row of the actual state transition matrix $Q_t$ from the (t-1)-th time node to the t-th time node
Input: direct prediction set S, parameter $f_{ij}$
Output: $q_{it}$
Figure GDA0003289691800000052
Taking i = 1, 2, ..., w in turn yields a complete actual state transition matrix $Q_t$; taking t = 2, 3, ..., m in turn yields all actual state transition matrices.
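A minimal sketch of computing one row $q_{it}$ is given below. The two cases follow the row-sum proof later in this description (no decrease gives $q_{iit} = 1$; otherwise $q_{ijt} = f_{ij}(1 - q_{iit})$). The value used for $q_{iit}$ in the decreasing case, the retained fraction $s_{it}/s_{i(t-1)}$, is an assumption, since the original pseudocode is only available as a figure. The name `actual_transition_row` and the zero-based indexing are hypothetical.

```python
def actual_transition_row(S, F, i, t):
    """Row i of the actual transition matrix between columns t-1 and t of S.

    S is the w x m direct prediction set, F[i][j] holds f_ij (F[i][i] unused),
    and i, t are zero-based with 1 <= t < m.
    """
    w = len(S)
    prev, cur = S[i][t - 1], S[i][t]
    row = [0.0] * w
    if cur >= prev:
        # Type i did not decrease: all of it stays in state i.
        row[i] = 1.0
        return row
    # ASSUMPTION: the share that stays equals the retained fraction.
    # prev > cur >= 0 here, so the division is safe.
    row[i] = cur / prev
    for j in range(w):
        if j != i:
            # The lost share is spread over the other types in proportion f_ij.
            row[j] = F[i][j] * (1.0 - row[i])
    return row
```

If the $f_{ij}$ of a row sum to 1 over $j \ne i$, every row produced this way sums to 1.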
2) Obtain the predicted state transition matrix
Let $Q'_t$ denote the predicted state transition matrix from the (t-1)-th time node to the t-th time node:
$$Q'_t = \begin{bmatrix} q'_{11t} & q'_{12t} & \cdots & q'_{1wt} \\ q'_{21t} & q'_{22t} & \cdots & q'_{2wt} \\ \vdots & \vdots & \ddots & \vdots \\ q'_{w1t} & q'_{w2t} & \cdots & q'_{wwt} \end{bmatrix}$$
where $q'_{ijt}$ is the predicted probability that the i-th vulnerability transfers into the j-th vulnerability from the (t-1)-th time node to the t-th time node, $1 < t \le m$, $1 \le i \le w$, $1 \le j \le w$.
The predicted state transition matrix $Q'_t$ is determined as follows:
A. When $t = 2$, $Q'_t = Q_t$.
B. When $2 < t \le m$, each element $q'_{ijt}$ of $Q'_t$ ($1 \le i \le w$, $1 \le j \le w$) is obtained by exponential smoothing, namely:
$$q'_{ijt} = \alpha q_{ijt} + (1-\alpha)\, q'_{ij(t-1)}$$
where $0 < \alpha < 1$.
Algorithm: determining the predicted state transition matrix $Q'_t$ from the (t-1)-th time node to the t-th time node. Input: actual state transition matrix $Q_t$, parameter $\alpha$. Output: predicted state transition matrix $Q'_t$.
Figure GDA0003289691800000062
Figure GDA0003289691800000071
Taking t = 2, 3, ..., m in turn yields all predicted state transition matrices.
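The recursion in A and B can be sketched as below. The function name `smooth_transition_matrices` and the list-of-matrices layout (index 0 holding $Q_2$) are assumptions made for illustration.

```python
def smooth_transition_matrices(actual, alpha):
    """Exponentially smooth a sequence of actual transition matrices.

    actual[0] is Q_2, actual[1] is Q_3, and so on; each matrix is a list of
    rows. Returns the predicted matrices Q'_2, Q'_3, ... in the same layout.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    # Case A: the first predicted matrix equals the first actual matrix.
    predicted = [[row[:] for row in actual[0]]]
    for Q in actual[1:]:
        prev = predicted[-1]
        # Case B: q'_ijt = alpha * q_ijt + (1 - alpha) * q'_ij(t-1).
        predicted.append([
            [alpha * q + (1.0 - alpha) * qp for q, qp in zip(q_row, p_row)]
            for q_row, p_row in zip(Q, prev)
        ])
    return predicted
```

Because each smoothed row is a convex combination of two rows that each sum to 1, the smoothed rows also sum to 1.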
3) Obtain the actual probability matrix
Let $P_t$ denote the actual probability matrix at the t-th time node. Write
$P_t = [p_{1t}\ p_{2t}\ \cdots\ p_{wt}]$, where $p_{it}$ is the ratio of the number of the i-th vulnerability in the direct prediction set to the number of all vulnerabilities in the direct prediction set at the t-th time node, that is,
$$p_{it} = \frac{s_{it}}{\sum_{k=1}^{w} s_{kt}}$$
where $1 \le i \le w$, $1 \le t \le m$. Executing the above with t = 1, 2, ..., m in turn yields all actual probability matrices.
4) Obtain the prediction probability matrix
Let $P'_t$ denote the prediction probability matrix at the t-th time node. Write
$$P'_t = [p'_{1t}\ p'_{2t}\ \cdots\ p'_{wt}]$$
where $p'_{it}$ is the predicted ratio of the i-th vulnerability in the direct prediction set to all vulnerabilities in the direct prediction set at the t-th time node, $1 \le i \le w$, $1 \le t \le m$.
The prediction probability matrix is determined as follows:
A. When $t = 1$, $P'_t = P_t$.
B. When $1 < t \le m$, each element $p'_{it}$ of $P'_t$ ($1 \le i \le w$) is obtained by exponential smoothing, namely $p'_{it} = \alpha p_{it} + (1-\alpha)\, p'_{i(t-1)}$, where $0 < \alpha < 1$.
Algorithm: determining the prediction probability matrix $P'_t$ at the t-th time node. Input: actual probability matrix $P_t$, parameter $\alpha$. Output: prediction probability matrix $P'_t$.
Figure GDA0003289691800000081
Taking t = 1, 2, ..., m in turn yields all prediction probability matrices.
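Steps 3) and 4) can be illustrated together. In this sketch the actual proportions are the column shares of S, and the predicted proportions follow the recursion in A and B. The function names and the assumption that every column total of S is positive are illustrative choices, not part of the original text.

```python
def actual_probabilities(S):
    """Return [P_1, ..., P_m], where P_t[i] = s_it / (sum over k of s_kt)."""
    w, m = len(S), len(S[0])
    result = []
    for t in range(m):
        total = sum(S[i][t] for i in range(w))  # assumed to be positive
        result.append([S[i][t] / total for i in range(w)])
    return result


def smooth_probabilities(P, alpha):
    """Return [P'_1, ..., P'_m] obtained by exponential smoothing of P."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    predicted = [P[0][:]]  # case A: P'_1 = P_1
    for row in P[1:]:
        prev = predicted[-1]
        # Case B: p'_it = alpha * p_it + (1 - alpha) * p'_i(t-1).
        predicted.append([alpha * p + (1.0 - alpha) * pp
                          for p, pp in zip(row, prev)])
    return predicted
```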
5) Obtain the predicted number of each type of vulnerability in the direct prediction set
Let C denote the actual total number of vulnerabilities in the direct prediction set at each time node, $C = [c_1\ c_2\ \cdots\ c_m\ c_{m+1}]$, where $c_i$ is the total number of vulnerabilities at the i-th time node, that is,
$$c_i = \sum_{k=1}^{w} s_{ki}$$
where $1 \le i \le m$.
Let C' denote the predicted total number of vulnerabilities in the direct prediction set at each time node, $C' = [c'_1\ c'_2\ \cdots\ c'_m\ c'_{m+1}]$. Each $c'_i$, $1 \le i \le m+1$, is determined as follows:
A. When $i = 1$, $c'_i = c_i$.
B. When $1 < i \le m+1$, $c'_i$ is obtained by exponential smoothing, namely:
Figure GDA0003289691800000083
where $0 < \alpha < 1$.
Then $c'_{m+1}$ is the predicted total number of vulnerabilities in the direct prediction set at the (m+1)-th time node.
The predicted state transition matrix $Q'_m$ is obtained from 2), and the prediction probability matrix $P'_m$ from 4).
According to the Markov algorithm, the matrix of the proportion of each type of vulnerability among all vulnerabilities in the direct prediction set at the (m+1)-th time node is:
$$P_{m+1} = P'_m \cdot Q'_m$$
By the properties of matrix multiplication, $P_{m+1}$ is a row vector of w elements; its i-th element is $p_{i(m+1)}$, where $1 \le i \le w$.
Let R denote the quantity prediction matrix of each type of vulnerability in the direct prediction set at the (m+1)-th time node, $R = [r_1\ r_2\ \cdots\ r_w]$, and let
Figure GDA0003289691800000091
Then $r_i$ is the predicted number of the i-th vulnerability at the (m+1)-th time node, where $1 \le i \le w$.
The matrix R is the quantity prediction result for each type of vulnerability in the direct prediction set S at the (m+1)-th time node.
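Step 5) can be sketched end to end as follows. Two formulas appear only as figures in this text, so both are assumptions here: the total is smoothed in forecast form, $c'_i = \alpha c_{i-1} + (1-\alpha) c'_{i-1}$, and the per-type prediction is taken as $r_i = c'_{m+1} \cdot p_{i(m+1)}$. The function name `predict_direct` is hypothetical.

```python
def predict_direct(S, Q_pred_m, P_pred_m, alpha):
    """Predict the counts of the direct prediction set at time node m+1.

    S is the w x m matrix of counts, Q_pred_m the predicted transition matrix
    Q'_m (w x w), and P_pred_m the predicted probability row vector P'_m.
    Returns (c_next, P_next, R).
    """
    w, m = len(S), len(S[0])
    # Actual totals c_1, ..., c_m (column sums of S).
    totals = [sum(S[i][t] for i in range(w)) for t in range(m)]

    # ASSUMPTION: forecast-form smoothing, starting from c'_1 = c_1.
    c_pred = totals[0]
    for c in totals:
        c_pred = alpha * c + (1.0 - alpha) * c_pred
    # After the loop c_pred holds c'_{m+1}.

    # Markov step: P_{m+1} = P'_m . Q'_m (row vector times matrix).
    P_next = [sum(P_pred_m[k] * Q_pred_m[k][i] for k in range(w))
              for i in range(w)]

    # ASSUMPTION: predicted count = predicted total x predicted proportion.
    R = [c_pred * p for p in P_next]
    return c_pred, P_next, R
```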
(3) Predict the number of each type of vulnerability in the indirect prediction set:
1) Obtain the cosine similarity matrix
Let the indirect prediction set
Figure GDA0003289691800000092
contain v elements, that is,
Figure GDA0003289691800000093
contains all quantity information of v types of vulnerabilities at m time nodes. Write
Figure GDA0003289691800000094
where
Figure GDA0003289691800000095
is the number of the i-th vulnerability at the j-th time node, with $w + v = n$, $1 \le i \le v$, $1 \le j \le m$.
Define the variation vector, from the t-th time node to the (t+1)-th time node, of the i-th vulnerability in
Figure GDA0003289691800000096
as
Figure GDA0003289691800000097
where $1 \le i \le v$, $1 \le t < m$.
Define the variation vector of the j-th vulnerability in S from the t-th time node to the (t+1)-th time node as
Figure GDA0003289691800000098
where $1 \le j \le w$, $1 \le t < m$.
Here
Figure GDA0003289691800000099
and
Figure GDA00032896918000000910
describe the change of the i-th vulnerability and the j-th vulnerability, respectively, between two time nodes. Because a state transition has a direction,
Figure GDA00032896918000000911
and
Figure GDA00032896918000000912
are variation vectors.
Therefore, from the t-th time node to the (t+1)-th time node, the cosine similarity between the i-th vulnerability in
Figure GDA0003289691800000101
and the j-th vulnerability in S is
Figure GDA0003289691800000102
where $1 \le i \le v$, $1 \le j \le w$, $1 \le t < m$.
Let the cosine similarity between the i-th vulnerability in
Figure GDA0003289691800000105
and the j-th vulnerability in S be $\cos\theta_{ij}$, whose value is obtained from the $\cos\theta_{ijt}$, where $1 \le i \le v$, $1 \le j \le w$, $1 \le t < m$, namely:
Figure GDA0003289691800000103
Let the cosine similarity matrix between the indirect prediction set
Figure GDA0003289691800000106
and the direct prediction set S be $\cos\theta$; then
Figure GDA0003289691800000104
where $1 \le i \le v$, $1 \le j \le w$.
2) Obtain the most similar vulnerabilities
Find the subscript j of the maximum value in the i-th row of $\cos\theta$. The j-th vulnerability in the direct prediction set S is then the most similar vulnerability of the i-th vulnerability in the indirect prediction set
Figure GDA0003289691800000107
where $1 \le i \le v$, $1 \le j \le w$.
Executing the above operation with i = 1, 2, ..., v in turn yields, for each type of vulnerability in the indirect prediction set
Figure GDA0003289691800000109
its most similar vulnerability in the direct prediction set S.
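The matching step can be illustrated with a simplified sketch. The patent's own variation vectors and its rule for combining the per-interval similarities $\cos\theta_{ijt}$ into $\cos\theta_{ij}$ are only available as figures, so this example substitutes a plainly different, simpler construction: each vulnerability type is represented by the vector of its successive differences over all time nodes, and a single cosine similarity is computed per pair. The function names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(indirect, direct):
    """For each row of `indirect`, return the index of the most similar row
    of `direct`, comparing vectors of successive differences.

    ASSUMPTION: difference vectors stand in for the patent's variation vectors.
    """
    def diffs(row):
        return [row[t + 1] - row[t] for t in range(len(row) - 1)]

    direct_vecs = [diffs(row) for row in direct]
    matches = []
    for row in indirect:
        sims = [cosine(diffs(row), vec) for vec in direct_vecs]
        # Index of the maximum value in this row of the similarity matrix.
        matches.append(max(range(len(sims)), key=sims.__getitem__))
    return matches
```

Using differences rather than raw counts keeps the comparison about the direction of change, which is the property the text attributes to the variation vectors.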
3) Obtain the predicted number of each type of vulnerability in the indirect prediction set
Consider the i-th vulnerability in the indirect prediction set
Figure GDA0003289691800000108
and its most similar vulnerability in the direct prediction set S, namely the j-th vulnerability in S, where $1 \le i \le v$, $1 \le j \le w$. From the m-th time node to the (m+1)-th time node, the relative increment of the j-th vulnerability is
Figure GDA0003289691800000111
Then the predicted number of the i-th vulnerability at the (m+1)-th time node is
Figure GDA0003289691800000112
where $1 \le i \le v$, $1 \le j \le w$.
Executing the above operation with i = 1, 2, ..., v in turn yields the predicted number, at the (m+1)-th time node, of each type of vulnerability in the indirect prediction set
Figure GDA00032896918000001110
Let the quantity prediction matrix of each type of vulnerability in the indirect prediction set at the (m+1)-th time node be
Figure GDA0003289691800000113
Write
Figure GDA0003289691800000114
and let
Figure GDA0003289691800000115
Then
Figure GDA0003289691800000116
is the predicted number of the i-th vulnerability at the (m+1)-th time node, where $1 \le i \le v$.
The matrix
Figure GDA00032896918000001111
is the quantity prediction result, at the (m+1)-th time node, for each type of vulnerability in the indirect prediction set
Figure GDA00032896918000001112
(4) Obtain the prediction result for the vulnerability complete set
Let
Figure GDA0003289691800000117
Then the set $R_U = [R_1\ R_2\ \cdots\ R_n]$ is the prediction result for the vulnerability complete set U at the (m+1)-th time node, where $R_i$ is the predicted number of the i-th type of vulnerability in U at the (m+1)-th time node.
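The scaling and integration steps can be sketched as follows. The exact formulas are only available as figures, so two assumptions are made: the relative increment of the matched direct type j is $(r_j - s_{jm}) / s_{jm}$, and the indirect type's last observed count is scaled by one plus that increment. The function names and the index lists used to merge the two result sets are hypothetical.

```python
def predict_indirect(indirect, direct, R, matches):
    """Predict counts at time node m+1 for each indirect vulnerability type.

    indirect and direct are row-per-type count matrices, R holds the direct
    predictions, and matches[i] is the direct row most similar to indirect row i.
    """
    predictions = []
    for i, row in enumerate(indirect):
        j = matches[i]
        last_j = direct[j][-1]  # assumed to be positive
        # ASSUMPTION: relative increment of the matched direct type.
        growth = (R[j] - last_j) / last_j
        # ASSUMPTION: scale the last observed count by the same increment.
        predictions.append(row[-1] * (1.0 + growth))
    return predictions


def merge_predictions(n, direct_rows, indirect_rows, R, R_indirect):
    """Place both prediction vectors back into the order of the complete set U."""
    merged = [0.0] * n
    for k, i in enumerate(direct_rows):
        merged[i] = R[k]
    for k, i in enumerate(indirect_rows):
        merged[i] = R_indirect[k]
    return merged
```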
Further, each row of the state transition matrix in step 1) must sum to 1. The actual state transition matrix obtained by the above algorithm satisfies this requirement, as proved below.
Given:
Figure GDA0003289691800000118
To prove:
Figure GDA0003289691800000119
Proof:
(1) When $s_{it} \ge s_{i(t-1)}$, $q_{iit} = 1$ and $q_{ijt} = 0$ for $j \ne i$.
Therefore
Figure GDA0003289691800000121
(2) When $s_{it} < s_{i(t-1)}$, $q_{ijt} = f_{ij} \cdot (1 - q_{iit})$ for $j \ne i$.
Figure GDA0003289691800000122
From (1) and (2):
Figure GDA0003289691800000123
Further, each row of the state transition matrix in step 2) must sum to 1. The predicted state transition matrix obtained by the above algorithm satisfies this requirement, as proved below.
Given:
Figure GDA0003289691800000124
To prove:
Figure GDA0003289691800000125
Proof:
(1) When $t = 2$, $Q'_t = Q_t$,
so $q'_{ijt} = q_{ijt}$, $1 \le i \le w$, $1 \le j \le w$.
Therefore
Figure GDA0003289691800000126
(2) When $2 < t \le m$, $q'_{ijt} = \alpha q_{ijt} + (1-\alpha)\, q'_{ij(t-1)}$.
By mathematical induction:
1) Assume that for $t = k$,
Figure GDA0003289691800000127
holds.
2) When $t = k + 1$,
Figure GDA0003289691800000131
From 1) and 2):
Figure GDA0003289691800000132
holds when $2 < t \le m$.
In summary:
Figure GDA0003289691800000133
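For reference, the induction step of this proof can be written out explicitly. This is a sketch that assumes the given condition is that every row of each actual matrix $Q_t$ sums to 1 and that the induction hypothesis is $\sum_{j} q'_{ijk} = 1$; the patent's own statements appear only as figures.

```latex
\begin{align*}
\sum_{j=1}^{w} q'_{ij(k+1)}
  &= \sum_{j=1}^{w}\left[\alpha\, q_{ij(k+1)} + (1-\alpha)\, q'_{ijk}\right] \\
  &= \alpha \sum_{j=1}^{w} q_{ij(k+1)} + (1-\alpha)\sum_{j=1}^{w} q'_{ijk} \\
  &= \alpha \cdot 1 + (1-\alpha)\cdot 1 = 1 .
\end{align*}
```

The same argument, with $p_{it}$ in place of $q_{ijt}$ and the sum taken over i, applies to the prediction probability matrix of step 4).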
Further, the probability matrix in step 3) must be a row vector whose elements sum to 1. The actual probability matrix obtained by the above method satisfies this requirement, as proved below.
Given:
Figure GDA0003289691800000134
To prove:
Figure GDA0003289691800000135
Proof:
Figure GDA0003289691800000136
Further, the probability matrix in step 4) must be a row vector whose elements sum to 1. The prediction probability matrix obtained by the above algorithm satisfies this requirement, as proved below.
Given:
Figure GDA0003289691800000137
To prove:
Figure GDA0003289691800000138
Proof:
(1) When $t = 1$, $P'_t = P_t$,
so $p'_{it} = p_{it}$, $1 \le i \le w$.
Therefore
Figure GDA0003289691800000141
(2) When $1 < t \le m$, $p'_{it} = \alpha p_{it} + (1-\alpha)\, p'_{i(t-1)}$.
By mathematical induction:
1) Assume that for $t = k$,
Figure GDA0003289691800000142
holds.
2) When $t = k + 1$,
Figure GDA0003289691800000143
From 1) and 2):
Figure GDA0003289691800000144
holds when $1 < t \le m$.
In summary:
Figure GDA0003289691800000145
Further, the numbers of each type of security vulnerability reported in an authoritative information security vulnerability database at a plurality of time nodes are examined to form the vulnerability complete set, which is expressed as a two-dimensional matrix.
Further, the average number level of the vulnerability complete set is examined. With this value as the center, the step length is increased continuously until a suitable neighborhood interval is determined. This interval gives the direct prediction set, and the complement of the direct prediction set in the complete set is the indirect prediction set.
The beneficial effects of the invention are as follows:
Taking security vulnerabilities as the object of study, the method examines their historical data to form a vulnerability complete set and divides that set into a direct prediction set and an indirect prediction set. The direct prediction set is predicted with a Markov method improved by exponential smoothing. Cosine similarity is then used to relate the indirect prediction set to the direct prediction set, and the indirect prediction set is predicted from that relation. Finally, the prediction results for the two sets are combined to give practitioners in related fields a high-accuracy predicted value.
Drawings
FIG. 1 is a diagram of the algorithm for dividing the vulnerability complete set into a direct prediction set and an indirect prediction set.
FIG. 2 is a diagram of the Markov method for predicting the number of security vulnerabilities, improved with smoothing and similarity.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
A Markov method for predicting the number of vulnerabilities based on a smoothing method and similarity, as shown in FIGS. 1 and 2, characterized by comprising the following steps:
(1) Examine the historical quantity information of security vulnerabilities to form a vulnerability complete set, and divide the vulnerability complete set into a direct prediction set and an indirect prediction set
Examine the historical quantity information of the security vulnerabilities to form the vulnerability complete set, denoted U. U contains all quantity information of n types of vulnerabilities at m time nodes:
$$U = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1m} \\ u_{21} & u_{22} & \cdots & u_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ u_{n1} & u_{n2} & \cdots & u_{nm} \end{bmatrix}$$
where $u_{ij}$ is the number of the i-th vulnerability at the j-th time node, $1 \le i \le n$, $1 \le j \le m$. Examine the average number level of the vulnerability complete set, that is, calculate the arithmetic mean of all data in U,
$$\frac{1}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m} u_{ij}$$
and round the calculated arithmetic mean down:
$$\left\lfloor \frac{1}{nm}\sum_{i=1}^{n}\sum_{j=1}^{m} u_{ij} \right\rfloor$$
Denote the i-th vulnerability by $U_i$. Examine the average number level of each type of vulnerability, that is, calculate the arithmetic mean of each row of U and round it down:
$$\left\lfloor \frac{1}{m}\sum_{j=1}^{m} u_{ij} \right\rfloor$$
where $1 \le i \le n$.
Take the rounded-down mean of the complete set as the reference, 1 as the initial step length, and p as the parameter that determines the range of the direct prediction set. In general $0 < p < 1$; to ensure the accuracy of the prediction result, the recommended range is $0.5 \le p < 1$.
The algorithm is as follows.
Algorithm: dividing the vulnerability complete set into a direct prediction set and an indirect prediction set
Input: vulnerability complete set U, parameter p
Output: direct prediction set S and indirect prediction set
Figure GDA0003289691800000164
Let
Figure GDA0003289691800000165
sum=0,
Figure GDA0003289691800000168
Figure GDA0003289691800000167
Figure GDA0003289691800000171
(2) Predict the number of each type of vulnerability in the direct prediction set:
Predicting the number of each type of vulnerability in the direct prediction set mainly comprises the following steps:
1) Obtain the actual state transition matrix
Let the direct prediction set S contain w elements, that is, S contains all quantity information of w types of vulnerabilities at m time nodes. Write
$$S = \begin{bmatrix} s_{11} & s_{12} & \cdots & s_{1m} \\ s_{21} & s_{22} & \cdots & s_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ s_{w1} & s_{w2} & \cdots & s_{wm} \end{bmatrix}$$
where $s_{ij}$ is the number of the i-th vulnerability at the j-th time node, $1 \le i \le w$, $1 \le j \le m$.
Let $Q_t$ denote the actual state transition matrix from the (t-1)-th time node to the t-th time node:
$$Q_t = \begin{bmatrix} q_{11t} & q_{12t} & \cdots & q_{1wt} \\ q_{21t} & q_{22t} & \cdots & q_{2wt} \\ \vdots & \vdots & \ddots & \vdots \\ q_{w1t} & q_{w2t} & \cdots & q_{wwt} \end{bmatrix}$$
where $q_{ijt}$ is the actual probability that the i-th vulnerability transfers into the j-th vulnerability from the (t-1)-th time node to the t-th time node, $1 < t \le m$, $1 \le i \le w$, $1 \le j \le w$.
Let $d_{ij}$ denote the number of times that the number of the i-th vulnerability decreases while the number of the j-th vulnerability increases, $1 \le i \le w$, $1 \le j \le w$, $j \ne i$; $f_{ij}$ denotes the ratio of the number of times the i-th vulnerability decreases while the j-th vulnerability increases to the number of times the i-th vulnerability decreases, that is,
Figure GDA0003289691800000181
where $1 \le i \le w$, $1 \le j \le w$, $j \ne i$; $q_{it}$ denotes the i-th row of $Q_t$, where $1 \le i \le w$.
The algorithm is as follows. Algorithm: determining $q_{it}$, the i-th row of the actual state transition matrix $Q_t$ from the (t-1)-th time node to the t-th time node
Input: direct prediction set S, parameter $f_{ij}$
Output: $q_{it}$
Figure GDA0003289691800000182
Taking i = 1, 2, ..., w in turn yields a complete actual state transition matrix $Q_t$; taking t = 2, 3, ..., m in turn yields all actual state transition matrices.
2) Obtain the predicted state transition matrix
Let $Q'_t$ denote the predicted state transition matrix from the (t-1)-th time node to the t-th time node:
$$Q'_t = \begin{bmatrix} q'_{11t} & q'_{12t} & \cdots & q'_{1wt} \\ q'_{21t} & q'_{22t} & \cdots & q'_{2wt} \\ \vdots & \vdots & \ddots & \vdots \\ q'_{w1t} & q'_{w2t} & \cdots & q'_{wwt} \end{bmatrix}$$
where $q'_{ijt}$ is the predicted probability that the i-th vulnerability transfers into the j-th vulnerability from the (t-1)-th time node to the t-th time node, $1 < t \le m$, $1 \le i \le w$, $1 \le j \le w$.
The predicted state transition matrix $Q'_t$ is determined as follows:
A. When $t = 2$, $Q'_t = Q_t$.
B. When $2 < t \le m$, each element $q'_{ijt}$ of $Q'_t$ ($1 \le i \le w$, $1 \le j \le w$) is obtained by exponential smoothing, namely:
$$q'_{ijt} = \alpha q_{ijt} + (1-\alpha)\, q'_{ij(t-1)}$$
where $0 < \alpha < 1$.
Algorithm: determining the predicted state transition matrix $Q'_t$ from the (t-1)-th time node to the t-th time node. Input: actual state transition matrix $Q_t$, parameter $\alpha$. Output: predicted state transition matrix $Q'_t$.
Figure GDA0003289691800000192
Figure GDA0003289691800000201
Taking t = 2, 3, ..., m in turn yields all predicted state transition matrices.
3) Obtain the actual probability matrix
Let $P_t$ denote the actual probability matrix at the t-th time node. Write
$P_t = [p_{1t}\ p_{2t}\ \cdots\ p_{wt}]$, where $p_{it}$ is the ratio of the number of the i-th vulnerability in the direct prediction set to the number of all vulnerabilities in the direct prediction set at the t-th time node, that is,
$$p_{it} = \frac{s_{it}}{\sum_{k=1}^{w} s_{kt}}$$
where $1 \le i \le w$, $1 \le t \le m$. Executing the above with t = 1, 2, ..., m in turn yields all actual probability matrices.
4) obtaining a prediction probability matrix
Is P'tRepresenting a prediction probability matrix at the t-th time node; note the book
Figure GDA0003289691800000203
p’itThe predicted value of the ratio of the ith vulnerability in the direct prediction set to all vulnerabilities in the direct prediction set at the tth time node is represented, wherein i is more than or equal to 1 and less than or equal to w, t is more than or equal to 1 and less than or equal to m,
determining a prediction probability matrix by:
a is P 'when t is 1't=Pt
B, when t is more than 1 and less than or equal to m, P'tOf (1) element p'itWherein i is more than or equal to 1 and less than or equal to w, and the value is obtained by an exponential smoothing method, namely: p'it=αpit+(1-α)p’i(t-1)Wherein alpha is more than 0 and less than 1,
determining a predicted probability matrix P 'of a t-th time node'tThe algorithm of (1) inputs: actual probability matrix PtParameter α, output: prediction state transition matrix P't
Figure GDA0003289691800000211
where $t$ takes the values $1, 2, \dots, m$ in turn to obtain all predicted probability matrices;
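Steps 3) and 4) can be illustrated together. The sketch below assumes the count matrix of the direct prediction set is a nested list `s` with `s[i][t]` the count of class `i` at time node `t` (0-based here), and that every time node has a nonzero total; the function names and sample counts are hypothetical.

```python
def actual_probability_vectors(s):
    """Return [P_1, ..., P_m], where P_t[i] is the share of class i at node t."""
    w, m = len(s), len(s[0])
    vectors = []
    for t in range(m):
        total = sum(s[i][t] for i in range(w))  # assumed nonzero
        vectors.append([s[i][t] / total for i in range(w)])
    return vectors


def smooth_probability_vectors(actual, alpha):
    """P'_1 = P_1; p'_it = alpha * p_it + (1 - alpha) * p'_i(t-1) for t > 1."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must satisfy 0 < alpha < 1")
    predicted = [actual[0][:]]
    for p_t in actual[1:]:
        prev = predicted[-1]
        predicted.append([alpha * p + (1 - alpha) * q for p, q in zip(p_t, prev)])
    return predicted


# Hypothetical counts: two classes observed at two time nodes.
s = [[10, 30],
     [30, 10]]
p_actual = actual_probability_vectors(s)
p_pred = smooth_probability_vectors(p_actual, alpha=0.5)
print(p_pred[-1])
```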
5) Obtaining the predicted number of each class of vulnerability in the direct prediction set
Let $C$ denote the actual total number of vulnerabilities in the direct prediction set at each time node, $C = [c_1\ c_2\ \dots\ c_m\ c_{m+1}]$, where $c_i$ is the total number of vulnerabilities at the $i$-th time node, namely:
$$c_i = \sum_{k=1}^{w} s_{ki}$$
where $1 \le i \le m$;
Let $C'$ denote the predicted total number of vulnerabilities in the direct prediction set at each time node, $C' = [c'_1\ c'_2\ \dots\ c'_m\ c'_{m+1}]$; $c'_i$ is determined as follows, where $1 \le i \le m+1$:
when $i = 1$, $c'_i = c_i$;
when $1 < i \le m+1$, $c'_i$ is obtained by exponential smoothing, namely:
Figure GDA0003289691800000213
where $0 < \alpha < 1$,
then $c'_{m+1}$ is the predicted total number of vulnerabilities in the direct prediction set at the $(m+1)$-th time node;
The predicted state transition matrix $Q'_m$ is obtained from 2), and the predicted probability matrix $P'_m$ is obtained from 4);
According to the Markov algorithm, the matrix of the proportions of each class of vulnerability among all vulnerabilities in the direct prediction set at the $(m+1)$-th time node is:
$$P_{m+1} = P'_m \cdot Q'_m$$
By the properties of matrix multiplication, $P_{m+1}$ is a row vector containing $w$ elements; denote the $i$-th element of $P_{m+1}$ by $p_{i(m+1)}$, where $1 \le i \le w$;
Let $R$ denote the matrix of predicted numbers of each class of vulnerability in the direct prediction set at the $(m+1)$-th time node, $R = [r_1\ r_2\ \dots\ r_w]$, and let
Figure GDA0003289691800000221
Then $r_i$ is the predicted number of vulnerabilities of the $i$-th class at the $(m+1)$-th time node, where $1 \le i \le w$;
The matrix $R$ is the prediction of the number of each class of vulnerability in the direct prediction set $S$ at the $(m+1)$-th time node;
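The Markov step of 5) is a row vector times a matrix. The sketch below shows that product and then scales the resulting proportions by a predicted total. The scaling rule $r_i = c'_{m+1} \cdot p_{i(m+1)}$ is an assumption made for illustration, since the source gives the formula for $r_i$ only as a figure; the predicted total is passed in by the caller rather than computed, and all names and numbers are hypothetical.

```python
def markov_step(p_pred, q_pred):
    """Row vector times matrix: P_{m+1}[j] = sum_i P'_m[i] * Q'_m[i][j]."""
    w = len(p_pred)
    return [sum(p_pred[i] * q_pred[i][j] for i in range(w)) for j in range(w)]


def predict_direct_counts(p_pred, q_pred, total_pred):
    """Assumed rule: predicted count = predicted total * predicted share."""
    return [total_pred * share for share in markov_step(p_pred, q_pred)]


# Hypothetical inputs: P'_m, Q'_m and a predicted total c'_{m+1} of 200.
p_m = [0.5, 0.5]
q_m = [[0.8, 0.2],
       [0.25, 0.75]]
p_next = markov_step(p_m, q_m)
r = predict_direct_counts(p_m, q_m, total_pred=200)
print(p_next, r)
```

Since $P'_m$ sums to 1 and every row of $Q'_m$ sums to 1, the product also sums to 1, so the predicted counts add up to the predicted total.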
(3) Predicting the number of each class of vulnerability in the indirect prediction set;
1) Obtaining the cosine similarity matrix
Suppose the indirect prediction set
Figure GDA0003289691800000222
contains $v$ elements, i.e.
Figure GDA0003289691800000223
contains all quantity information of $v$ classes of vulnerabilities at $m$ time nodes; write
Figure GDA0003289691800000224
Figure GDA0003289691800000225
denotes the number of vulnerabilities of the $i$-th class at the $j$-th time node, where $w + v = n$, $1 \le i \le v$, $1 \le j \le m$;
Define, in
Figure GDA00032896918000002212
the change vector of the $i$-th class of vulnerability from the $t$-th time node to the $(t+1)$-th time node as
Figure GDA0003289691800000226
where $1 \le i \le v$, $1 \le t < m$;
Define the change vector of the $j$-th class of vulnerability in $S$ from the $t$-th time node to the $(t+1)$-th time node as
Figure GDA0003289691800000227
where $1 \le j \le w$, $1 \le t < m$;
Here,
Figure GDA0003289691800000228
and
Figure GDA0003289691800000229
respectively describe the change of the $i$-th class and of the $j$-th class of vulnerability between two time nodes; because state transitions are directional,
Figure GDA00032896918000002210
and
Figure GDA00032896918000002211
are change vectors;
Therefore, from the $t$-th time node to the $(t+1)$-th time node, the cosine similarity between the $i$-th class of vulnerability in
Figure GDA0003289691800000231
and the $j$-th class of vulnerability in $S$ is
Figure GDA0003289691800000232
where $1 \le i \le v$, $1 \le j \le w$, $1 \le t < m$;
Let the cosine similarity between the $i$-th class of vulnerability in
Figure GDA0003289691800000235
and the $j$-th class of vulnerability in $S$ be $\cos\theta_{ij}$, whose value is obtained from $\cos\theta_{ijt}$, where $1 \le i \le v$, $1 \le j \le w$, $1 \le t < m$, namely:
Figure GDA0003289691800000233
Let the cosine similarity matrix between the indirect prediction set
Figure GDA0003289691800000236
and the direct prediction set $S$ be $\cos\theta$; then
Figure GDA0003289691800000234
where $1 \le i \le v$, $1 \le j \le w$;
2) Obtaining the most similar vulnerabilities
Find the column index $j$ of the maximum value in the $i$-th row of $\cos\theta$; the $j$-th class of vulnerability in the direct prediction set $S$ is then the most similar vulnerability to the $i$-th class of vulnerability in the indirect prediction set
Figure GDA0003289691800000238
where $1 \le i \le v$, $1 \le j \le w$;
Performing the above operation with $i = 1, 2, \dots, v$ in turn yields, for each class of vulnerability in the indirect prediction set
Figure GDA0003289691800000239
its most similar vulnerability in the direct prediction set $S$;
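The similarity search of 1) and 2) can be sketched as follows. Two choices here are assumptions, because the source defines them only in figures: the change vector between consecutive time nodes is taken to be (1, change in count), and the per-interval cosines are averaged over $t$ to give $\cos\theta_{ij}$. All function names and the sample series are hypothetical.

```python
import math


def change_vectors(series):
    """Assumed form of the change vector: (unit time step, change in count)."""
    return [(1.0, float(b - a)) for a, b in zip(series, series[1:])]


def cosine(u, v):
    """Cosine of the angle between two 2-D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def similarity_matrix(indirect, direct):
    """Entry [i][j]: assumed mean over t of cos(theta_ijt)."""
    matrix = []
    for s_bar in indirect:
        u = change_vectors(s_bar)
        row = []
        for s in direct:
            v = change_vectors(s)
            cos_t = [cosine(a, b) for a, b in zip(u, v)]
            row.append(sum(cos_t) / len(cos_t))
        matrix.append(row)
    return matrix


def most_similar(matrix):
    """For each row i, the column index j of the row maximum."""
    return [max(range(len(row)), key=row.__getitem__) for row in matrix]


# Hypothetical count series over three time nodes.
direct = [[10, 20, 30], [30, 20, 10]]   # rows of S
indirect = [[1, 3, 5], [9, 6, 2]]       # rows of the indirect prediction set
cos_theta = similarity_matrix(indirect, direct)
nearest = most_similar(cos_theta)
print(nearest)
```

With these series the rising indirect class is matched to the rising direct class and the falling one to the falling one, which is the behaviour the directional change vectors are meant to capture.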
3) Obtaining the predicted number of each class of vulnerability in the indirect prediction set
Consider the $i$-th class of vulnerability in the indirect prediction set
Figure GDA0003289691800000237
and its most similar vulnerability in the direct prediction set $S$, namely the $j$-th class of vulnerability in $S$, where $1 \le i \le v$, $1 \le j \le w$; from the $m$-th time node to the $(m+1)$-th time node, the relative increment of the $j$-th class of vulnerability is
Figure GDA0003289691800000241
Then the predicted number of vulnerabilities of the $i$-th class at the $(m+1)$-th time node is
Figure GDA0003289691800000242
where $1 \le i \le v$, $1 \le j \le w$;
Performing the above operation with $i = 1, 2, \dots, v$ in turn yields the predicted number of each class of vulnerability in the indirect prediction set
Figure GDA00032896918000002412
at the $(m+1)$-th time node;
Let the matrix of predicted numbers of each class of vulnerability in the indirect prediction set at the $(m+1)$-th time node be
Figure GDA0003289691800000243
written as
Figure GDA0003289691800000244
and let
Figure GDA0003289691800000245
Then
Figure GDA0003289691800000246
denotes the predicted number of vulnerabilities of the $i$-th class at the $(m+1)$-th time node, where $1 \le i \le v$;
The matrix
Figure GDA0003289691800000247
is the prediction of the number of each class of vulnerability in the indirect prediction set
Figure GDA0003289691800000248
at the $(m+1)$-th time node;
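A short sketch of how a relative increment can be carried over from a direct-set class to its matched indirect-set class. The two formulas used, relative increment = (predicted count - last count) / last count and predicted indirect count = last indirect count × (1 + relative increment), are assumptions for illustration, since the source shows both only as figures. The function name and numbers are hypothetical.

```python
def predict_indirect_counts(indirect_last, direct_last, direct_pred, nearest):
    """Apply each matched direct class's assumed relative increment.

    indirect_last[i]: count of indirect class i at time node m
    direct_last[j]:   count of direct class j at time node m (assumed nonzero)
    direct_pred[j]:   predicted count of direct class j at time node m + 1
    nearest[i]:       index j of the direct class most similar to indirect class i
    """
    predictions = []
    for i, j in enumerate(nearest):
        growth = (direct_pred[j] - direct_last[j]) / direct_last[j]
        predictions.append(indirect_last[i] * (1 + growth))
    return predictions


# Hypothetical values: direct classes grow by +10% and -10% respectively.
r_bar = predict_indirect_counts(
    indirect_last=[5, 2],
    direct_last=[30, 10],
    direct_pred=[33.0, 9.0],
    nearest=[0, 1],
)
print(r_bar)
```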
(4) Obtaining the prediction result for the full vulnerability set
Let
Figure GDA0003289691800000249
Then $R_U = [R_1\ R_2\ \dots\ R_n]$ is the prediction result for the full vulnerability set $U$ at the $(m+1)$-th time node, where $R_i$ denotes the predicted number of vulnerabilities of the $i$-th class in $U$ at the $(m+1)$-th time node.
Further, each row of the state transition matrix in step 1) must sum to 1; the actual state transition matrix obtained by the above algorithm meets this requirement, as proved below:
Given:
Figure GDA00032896918000002410
To prove:
Figure GDA00032896918000002411
Proof:
(1) when $s_{it} \ge s_{i(t-1)}$, $q_{iit} = 1$ and $q_{ijt} = 0$ for $j \ne i$;
Therefore
Figure GDA0003289691800000251
(2) when $s_{it} < s_{i(t-1)}$, $q_{ijt} = f_{ij} \cdot (1 - q_{iit})$ for $j \ne i$;
Figure GDA0003289691800000252
From (1) and (2):
Figure GDA0003289691800000253
Further, each row of the state transition matrix in step 2) must sum to 1; the predicted state transition matrix obtained by the above algorithm meets this requirement, as proved below:
Given:
Figure GDA0003289691800000254
To prove:
Figure GDA0003289691800000255
Proof:
(1) when $t = 2$, $Q'_t = Q_t$,
so $q'_{ijt} = q_{ijt}$, $1 \le i \le w$, $1 \le j \le w$;
Therefore
Figure GDA0003289691800000256
(2) when $2 < t \le m$, $q'_{ijt} = \alpha q_{ijt} + (1-\alpha)\, q'_{ij(t-1)}$.
By mathematical induction:
1) when $t = k$, suppose
Figure GDA0003289691800000257
holds;
2) when $t = k + 1$,
Figure GDA0003289691800000261
From 1) and 2):
Figure GDA0003289691800000262
holds for $2 < t \le m$;
In summary:
Figure GDA0003289691800000263
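The induction step above rests on a single line of algebra. The following is a sketch reconstructed from the smoothing formula, not a transcription of the figures: assuming every row of $Q_{k+1}$ sums to 1 (shown for the actual matrices) and every row of $Q'_k$ sums to 1 (the induction hypothesis), the rows of $Q'_{k+1}$ also sum to 1.

```latex
\sum_{j=1}^{w} q'_{ij(k+1)}
  = \sum_{j=1}^{w} \bigl[ \alpha\, q_{ij(k+1)} + (1-\alpha)\, q'_{ijk} \bigr]
  = \alpha \sum_{j=1}^{w} q_{ij(k+1)} + (1-\alpha) \sum_{j=1}^{w} q'_{ijk}
  = \alpha + (1-\alpha) = 1 .
```

The same argument, with $p$ in place of $q$ and a single index $i$ summed from 1 to $w$, applies to the predicted probability matrix.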
Further, the probability matrix in step 3) must be a row vector whose elements sum to 1; the actual probability matrix obtained by the above method meets this requirement, as proved below:
Given:
Figure GDA0003289691800000264
To prove:
Figure GDA0003289691800000265
Proof:
Figure GDA0003289691800000266
Further, the probability matrix in step 4) must be a row vector whose elements sum to 1; the predicted probability matrix obtained by the above algorithm meets this requirement, as proved below:
Given:
Figure GDA0003289691800000267
To prove:
Figure GDA0003289691800000268
Proof:
(1) when $t = 1$, $P'_t = P_t$,
so $p'_{it} = p_{it}$, $1 \le i \le w$;
Therefore
Figure GDA0003289691800000271
(2) when $1 < t \le m$, $p'_{it} = \alpha p_{it} + (1-\alpha)\, p'_{i(t-1)}$.
By mathematical induction:
1) when $t = k$, suppose
Figure GDA0003289691800000272
holds;
2) when $t = k + 1$,
Figure GDA0003289691800000273
From 1) and 2):
Figure GDA0003289691800000274
holds for $1 < t \le m$;
In summary:
Figure GDA0003289691800000275
Further, the numbers of each class of security vulnerability reported in an authoritative information security vulnerability database at a plurality of time nodes are examined to form the full vulnerability set, which is expressed as a two-dimensional matrix.
Further, the average quantity level of the full vulnerability set is examined; taking this value as the center and continuously increasing the step length, a suitable neighborhood interval is determined. This interval gives the direct prediction set, and the complement of the direct prediction set in the full set is the indirect prediction set.

Claims (7)

1. A Markov vulnerability quantity prediction method based on a smoothing method and similarity, characterized by comprising the following steps:
(1) First, examining the historical quantity information of security vulnerabilities to form a full vulnerability set, and dividing the full vulnerability set into a direct prediction set and an indirect prediction set
The historical quantity information of security vulnerabilities is examined to form a full vulnerability set, denoted $U$; $U$ contains all quantity information of $n$ classes of vulnerabilities at $m$ time nodes, written as
Figure FDA0003454843350000011
$u_{ij}$ denotes the number of vulnerabilities of the $i$-th class at the $j$-th time node, where $1 \le i \le n$, $1 \le j \le m$; the average quantity level of the full vulnerability set is examined, i.e. the arithmetic mean of all data in $U$ is calculated
Figure FDA0003454843350000012
and the calculated arithmetic mean is rounded down:
Figure FDA0003454843350000013
Denote the $i$-th class of vulnerability by $U_i$, and examine the average quantity level of each class of vulnerability, i.e. calculate the arithmetic mean of the data in each row of $U$ and round down:
Figure FDA0003454843350000014
where $1 \le i \le n$,
Taking
Figure FDA0003454843350000015
as the reference and 1 as the initial step size, $p$ is the parameter that determines the range of the direct prediction set; to ensure the accuracy of the prediction result, generally $0 < p < 1$,
The algorithm is as follows: algorithm for dividing the full vulnerability set into a direct prediction set and an indirect prediction set
Input: full vulnerability set $U$, parameter $p$
Output: direct prediction set $S$ and indirect prediction set
Figure FDA0003454843350000021
Let
Figure FDA0003454843350000022
sum = 0,
Figure FDA0003454843350000023
Figure FDA0003454843350000024
(2) Predicting the number of each class of vulnerability in the direct prediction set:
Predicting the number of each class of vulnerability in the direct prediction set mainly comprises the following steps:
1) Obtaining the actual state transition matrix
Suppose the direct prediction set $S$ contains $w$ elements, i.e. $S$ contains all quantity information of $w$ classes of vulnerabilities at $m$ time nodes; write
Figure FDA0003454843350000031
$s_{ij}$ denotes the number of vulnerabilities of the $i$-th class at the $j$-th time node, where $1 \le i \le w$, $1 \le j \le m$,
Let $Q_t$ denote the actual state transition matrix from the $(t-1)$-th time node to the $t$-th time node
Figure FDA0003454843350000032
$q_{ijt}$ denotes the actual probability that the $i$-th class of vulnerability transfers to the $j$-th class of vulnerability from the $(t-1)$-th time node to the $t$-th time node, where $1 < t \le m$, $1 \le i \le w$, $1 \le j \le w$;
where $d_{ij}$ denotes the number of times that the number of the $i$-th class of vulnerability decreases while the number of the $j$-th class increases, $1 \le i \le w$, $1 \le j \le w$, $j \ne i$; $f_{ij}$ denotes the ratio of the number of times that the $i$-th class decreases while the $j$-th class increases to the number of times that the $i$-th class decreases, i.e.
Figure FDA0003454843350000033
$q_{it}$ denotes the $i$-th row of $Q_t$, where $1 \le i \le w$;
The algorithm is as follows: algorithm for determining $q_{it}$, the $i$-th row of the actual state transition matrix $Q_t$ from the $(t-1)$-th time node to the $t$-th time node
Input: direct prediction set $S$, parameter $f_{ij}$
Output: $q_{it}$
Figure FDA0003454843350000034
Figure FDA0003454843350000041
where $i$ takes the values $1, 2, \dots, w$ in turn to obtain the complete actual state transition matrix $Q_t$, and $t$ takes the values $2, 3, \dots, m$ in turn to obtain all actual state transition matrices;
2) Obtaining the predicted state transition matrix
Let $Q'_t$ denote the predicted state transition matrix from the $(t-1)$-th time node to the $t$-th time node:
$$Q'_t = \begin{bmatrix} q'_{11t} & \cdots & q'_{1wt} \\ \vdots & \ddots & \vdots \\ q'_{w1t} & \cdots & q'_{wwt} \end{bmatrix}$$
$q'_{ijt}$ denotes the predicted probability that the $i$-th class of vulnerability transfers to the $j$-th class of vulnerability from the $(t-1)$-th time node to the $t$-th time node, where $1 < t \le m$, $1 \le i \le w$, $1 \le j \le w$;
The predicted state transition matrix $Q'_t$ is determined as follows:
A. when $t = 2$, $Q'_t = Q_t$;
B. when $2 < t \le m$, each element $q'_{ijt}$ of $Q'_t$ (where $1 \le i \le w$, $1 \le j \le w$) is obtained by exponential smoothing, namely:
$$q'_{ijt} = \alpha q_{ijt} + (1-\alpha)\, q'_{ij(t-1)}$$
where $0 < \alpha < 1$;
Algorithm for determining the predicted state transition matrix $Q'_t$ from the $(t-1)$-th time node to the $t$-th time node. Input: actual state transition matrix $Q_t$, parameter $\alpha$. Output: predicted state transition matrix $Q'_t$.
Figure FDA0003454843350000051
where $t$ takes the values $2, 3, \dots, m$ in turn to obtain all predicted state transition matrices;
3) Obtaining the actual probability matrix
Let $P_t$ denote the actual probability matrix at the $t$-th time node, $P_t = [p_{1t}\ p_{2t}\ \dots\ p_{wt}]$, where $p_{it}$ is the ratio of the number of vulnerabilities of the $i$-th class in the direct prediction set to the number of all vulnerabilities in the direct prediction set at the $t$-th time node:
$$p_{it} = \frac{s_{it}}{\sum_{k=1}^{w} s_{kt}}$$
where $1 \le i \le w$, $1 \le t \le m$. Performing this calculation with $t = 1, 2, \dots, m$ in turn yields all actual probability matrices;
4) Obtaining the predicted probability matrix
Let $P'_t$ denote the predicted probability matrix at the $t$-th time node:
$$P'_t = [p'_{1t}\ p'_{2t}\ \dots\ p'_{wt}]$$
$p'_{it}$ denotes the predicted value of the ratio of the number of vulnerabilities of the $i$-th class in the direct prediction set to the number of all vulnerabilities in the direct prediction set at the $t$-th time node, where $1 \le i \le w$, $1 \le t \le m$.
The predicted probability matrix is determined as follows:
A. when $t = 1$, $P'_t = P_t$;
B. when $1 < t \le m$, each element $p'_{it}$ of $P'_t$ (where $1 \le i \le w$) is obtained by exponential smoothing, namely: $p'_{it} = \alpha p_{it} + (1-\alpha)\, p'_{i(t-1)}$, where $0 < \alpha < 1$.
Algorithm for determining the predicted probability matrix $P'_t$ of the $t$-th time node. Input: actual probability matrix $P_t$, parameter $\alpha$. Output: predicted probability matrix $P'_t$.
Figure FDA0003454843350000061
where $t$ takes the values $1, 2, \dots, m$ in turn to obtain all predicted probability matrices;
5) Obtaining the predicted number of each class of vulnerability in the direct prediction set
Let $C$ denote the actual total number of vulnerabilities in the direct prediction set at each time node, $C = [c_1\ c_2\ \dots\ c_m\ c_{m+1}]$, where $c_i$ is the total number of vulnerabilities at the $i$-th time node, namely:
$$c_i = \sum_{k=1}^{w} s_{ki}$$
where $1 \le i \le m$;
Let $C'$ denote the predicted total number of vulnerabilities in the direct prediction set at each time node, $C' = [c'_1\ c'_2\ \dots\ c'_m\ c'_{m+1}]$; $c'_i$ is determined as follows, where $1 \le i \le m+1$:
when $i = 1$, $c'_i = c_i$;
when $1 < i \le m+1$, $c'_i$ is obtained by exponential smoothing, namely:
Figure FDA0003454843350000071
where $0 < \alpha < 1$,
then $c'_{m+1}$ is the predicted total number of vulnerabilities in the direct prediction set at the $(m+1)$-th time node;
The predicted state transition matrix $Q'_m$ is obtained from 2), and the predicted probability matrix $P'_m$ is obtained from 4);
According to the Markov algorithm, the matrix of the proportions of each class of vulnerability among all vulnerabilities in the direct prediction set at the $(m+1)$-th time node is:
$$P_{m+1} = P'_m \cdot Q'_m$$
By the properties of matrix multiplication, $P_{m+1}$ is a row vector containing $w$ elements; denote the $i$-th element of $P_{m+1}$ by $p_{i(m+1)}$, where $1 \le i \le w$;
Let $R$ denote the matrix of predicted numbers of each class of vulnerability in the direct prediction set at the $(m+1)$-th time node, $R = [r_1\ r_2\ \dots\ r_w]$, and let
Figure FDA0003454843350000072
Then $r_i$ is the predicted number of vulnerabilities of the $i$-th class at the $(m+1)$-th time node, where $1 \le i \le w$;
The matrix $R$ is the prediction of the number of each class of vulnerability in the direct prediction set $S$ at the $(m+1)$-th time node;
(3) Predicting the number of each class of vulnerability in the indirect prediction set;
1) Obtaining the cosine similarity matrix
Suppose the indirect prediction set
Figure FDA0003454843350000073
contains $v$ elements, i.e.
Figure FDA0003454843350000074
contains all quantity information of $v$ classes of vulnerabilities at $m$ time nodes; write
Figure FDA0003454843350000081
Figure FDA0003454843350000082
denotes the number of vulnerabilities of the $i$-th class at the $j$-th time node, where $w + v = n$, $1 \le i \le v$, $1 \le j \le m$;
Define, in
Figure FDA0003454843350000083
the change vector of the $i$-th class of vulnerability from the $t$-th time node to the $(t+1)$-th time node as
Figure FDA0003454843350000084
where $1 \le i \le v$, $1 \le t < m$;
Define the change vector of the $j$-th class of vulnerability in $S$ from the $t$-th time node to the $(t+1)$-th time node as
Figure FDA0003454843350000085
where $1 \le j \le w$, $1 \le t < m$;
Here,
Figure FDA0003454843350000086
and
Figure FDA0003454843350000087
respectively describe the change of the $i$-th class and of the $j$-th class of vulnerability between two time nodes; because state transitions are directional,
Figure FDA0003454843350000088
and
Figure FDA0003454843350000089
are change vectors;
Therefore, from the $t$-th time node to the $(t+1)$-th time node, the cosine similarity between the $i$-th class of vulnerability in
Figure FDA00034548433500000810
and the $j$-th class of vulnerability in $S$ is
Figure FDA00034548433500000811
where $1 \le i \le v$, $1 \le j \le w$, $1 \le t < m$;
Let the cosine similarity between the $i$-th class of vulnerability in
Figure FDA00034548433500000812
and the $j$-th class of vulnerability in $S$ be $\cos\theta_{ij}$, whose value is obtained from $\cos\theta_{ijt}$, where $1 \le i \le v$, $1 \le j \le w$, $1 \le t < m$, namely:
Figure FDA00034548433500000813
Let the cosine similarity matrix between the indirect prediction set
Figure FDA00034548433500000814
and the direct prediction set $S$ be $\cos\theta$; then
Figure FDA0003454843350000091
where $1 \le i \le v$, $1 \le j \le w$;
2) Obtaining the most similar vulnerabilities
Find the column index $j$ of the maximum value in the $i$-th row of $\cos\theta$; the $j$-th class of vulnerability in the direct prediction set $S$ is then the most similar vulnerability to the $i$-th class of vulnerability in the indirect prediction set
Figure FDA00034548433500000910
where $1 \le i \le v$, $1 \le j \le w$;
Performing the above operation with $i = 1, 2, \dots, v$ in turn yields, for each class of vulnerability in the indirect prediction set
Figure FDA00034548433500000911
its most similar vulnerability in the direct prediction set $S$;
3) Obtaining the predicted number of each class of vulnerability in the indirect prediction set
Consider the $i$-th class of vulnerability in the indirect prediction set
Figure FDA0003454843350000092
and its most similar vulnerability in the direct prediction set $S$, namely the $j$-th class of vulnerability in $S$, where $1 \le i \le v$, $1 \le j \le w$; from the $m$-th time node to the $(m+1)$-th time node, the relative increment of the $j$-th class of vulnerability is
Figure FDA0003454843350000093
Then the predicted number of vulnerabilities of the $i$-th class at the $(m+1)$-th time node is
Figure FDA0003454843350000094
where $1 \le i \le v$, $1 \le j \le w$;
Performing the above operation with $i = 1, 2, \dots, v$ in turn yields the predicted number of each class of vulnerability in the indirect prediction set
Figure FDA0003454843350000095
at the $(m+1)$-th time node;
Let the matrix of predicted numbers of each class of vulnerability in the indirect prediction set at the $(m+1)$-th time node be
Figure FDA0003454843350000096
written as
Figure FDA0003454843350000097
and let
Figure FDA0003454843350000098
Then
Figure FDA0003454843350000099
denotes the predicted number of vulnerabilities of the $i$-th class at the $(m+1)$-th time node, where $1 \le i \le v$;
The matrix
Figure FDA0003454843350000101
is the prediction of the number of each class of vulnerability in the indirect prediction set
Figure FDA0003454843350000102
at the $(m+1)$-th time node;
(4) Obtaining the prediction result for the full vulnerability set
Let
Figure FDA0003454843350000103
Then $R_U = [R_1\ R_2\ \dots\ R_n]$ is the prediction result for the full vulnerability set $U$ at the $(m+1)$-th time node, where $R_i$ denotes the predicted number of vulnerabilities of the $i$-th class in $U$ at the $(m+1)$-th time node.
2. The Markov vulnerability quantity prediction method based on a smoothing method and similarity according to claim 1, wherein each row of the state transition matrix in step 1) must sum to 1, and the actual state transition matrix obtained by the method meets this requirement, as proved below:
Given:
Figure FDA0003454843350000104
To prove:
Figure FDA0003454843350000105
Proof:
(1) when $s_{it} \ge s_{i(t-1)}$, $q_{iit} = 1$ and $q_{ijt} = 0$ for $j \ne i$;
Therefore
Figure FDA0003454843350000106
(2) when $s_{it} < s_{i(t-1)}$, $q_{ijt} = f_{ij} \cdot (1 - q_{iit})$ for $j \ne i$;
Therefore
Figure FDA0003454843350000107
Figure FDA0003454843350000111
From (1) and (2):
Figure FDA0003454843350000112
3. The Markov vulnerability quantity prediction method based on a smoothing method and similarity according to claim 1, wherein each row of the state transition matrix in step 2) must sum to 1, and the predicted state transition matrix obtained by the method meets this requirement, as proved below:
Given:
Figure FDA0003454843350000113
To prove:
Figure FDA0003454843350000114
Proof:
(1) when $t = 2$, $Q'_t = Q_t$,
so $q'_{ijt} = q_{ijt}$, $1 \le i \le w$, $1 \le j \le w$;
Therefore
Figure FDA0003454843350000115
(2) when $2 < t \le m$, $q'_{ijt} = \alpha q_{ijt} + (1-\alpha)\, q'_{ij(t-1)}$.
By mathematical induction:
1) when $t = k$, suppose
Figure FDA0003454843350000116
holds;
2) when $t = k + 1$,
Figure FDA0003454843350000117
Figure FDA0003454843350000121
From 1) and 2):
Figure FDA0003454843350000122
holds for $2 < t \le m$;
In summary:
Figure FDA0003454843350000123
4. The Markov vulnerability quantity prediction method based on a smoothing method and similarity according to claim 1, wherein the probability matrix in step 3) must be a row vector whose elements sum to 1, and the actual probability matrix obtained by the method meets this requirement, as proved below:
Given:
Figure FDA0003454843350000124
To prove:
Figure FDA0003454843350000125
Proof:
Figure FDA0003454843350000126
5. The Markov vulnerability quantity prediction method based on a smoothing method and similarity according to claim 1, wherein the probability matrix in step 4) must be a row vector whose elements sum to 1, and the predicted probability matrix obtained by the method meets this requirement, as proved below:
Given:
Figure FDA0003454843350000127
To prove:
Figure FDA0003454843350000128
Proof:
(1) when $t = 1$, $P'_t = P_t$,
so $p'_{it} = p_{it}$, $1 \le i \le w$;
Therefore
Figure FDA0003454843350000131
(2) when $1 < t \le m$, $p'_{it} = \alpha p_{it} + (1-\alpha)\, p'_{i(t-1)}$.
By mathematical induction:
1) when $t = k$, suppose
Figure FDA0003454843350000132
holds;
2) when $t = k + 1$,
Figure FDA0003454843350000133
From 1) and 2):
Figure FDA0003454843350000134
holds for $1 < t \le m$;
In summary:
Figure FDA0003454843350000135
6. The Markov vulnerability quantity prediction method based on a smoothing method and similarity according to claim 1, wherein the numbers of each class of security vulnerability reported in an authoritative information security vulnerability database at a plurality of time nodes are examined to form the full vulnerability set, which is expressed as a two-dimensional matrix.
7. The Markov vulnerability quantity prediction method based on a smoothing method and similarity according to claim 1, wherein the average quantity level of the full vulnerability set is examined; taking this value as the center and continuously increasing the step length, a suitable neighborhood interval is determined; this interval gives the direct prediction set, and the complement of the direct prediction set in the full set is the indirect prediction set.
CN201810701155.6A 2018-06-29 2018-06-29 Markov vulnerability prediction quantity method based on smoothing method and similarity Active CN108959084B (en)


Publications (2)

Publication Number Publication Date
CN108959084A CN108959084A (en) 2018-12-07
CN108959084B true CN108959084B (en) 2022-03-25

Family

ID=64484081


Country Status (1)

Country Link
CN (1) CN108959084B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112115476B (en) * 2020-08-06 2023-10-24 扬州大学 Automatic vulnerability classification method, system and computer equipment based on LSTM

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101923618A (en) * 2010-08-19 2010-12-22 中国航天科技集团公司第七一○研究所 Hidden Markov model based method for detecting assembler instruction level vulnerability
CN103532761A (en) * 2013-10-18 2014-01-22 嘉兴学院 Survivability evaluating method applicable to attacked wireless sensing network
CN104469798A (en) * 2014-12-12 2015-03-25 重庆邮电大学 Communication network load condition information forecasting method based on Markov chain
EP3139318A1 (en) * 2015-09-04 2017-03-08 Siemens Aktiengesellschaft Patch management for industrial control systems

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
CN101923618A (en) * 2010-08-19 2010-12-22 中国航天科技集团公司第七一○研究所 Hidden Markov model based method for detecting assembler instruction level vulnerability
CN103532761A (en) * 2013-10-18 2014-01-22 嘉兴学院 Survivability evaluating method applicable to attacked wireless sensing network
CN104469798A (en) * 2014-12-12 2015-03-25 重庆邮电大学 Communication network load condition information forecasting method based on Markov chain
EP3139318A1 (en) * 2015-09-04 2017-03-08 Siemens Aktiengesellschaft Patch management for industrial control systems
US10331429B2 (en) * 2015-09-04 2019-06-25 Siemens Aktiengesellschaft Patch management for industrial control systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
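Vulnerability prediction model based on vulnerability severity classification; Gao Zhiwei et al.; Acta Electronica Sinica (《电子学报》); 2013-09-30; Vol. 41, No. 9; pp. 1784-1787 *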

Also Published As

Publication number Publication date
CN108959084A (en) 2018-12-07

Similar Documents

Publication Publication Date Title
Saylor et al. Quantifying comparison of large detrital geochronology data sets
CN108647272B (en) Method for predicting concentration of butane at bottom of debutanizer by expanding small samples based on data distribution
CN113038302B (en) Flow prediction method and device and computer storage medium
Yang et al. Expected efficiency based on directional distance function in data envelopment analysis
Manninen et al. Leukemia prediction using sparse logistic regression
US7587280B2 (en) Genomic data mining using clustering logic and filtering criteria
Villoria et al. Gaussian quadratures vs. Monte Carlo experiments for systematic sensitivity analysis of computable general equilibrium model results
CN115510981A (en) Decision tree model feature importance calculation method and device and storage medium
WO2023201772A1 (en) Cross-domain remote sensing image semantic segmentation method based on adaptation and self-training in iteration domain
CN114611582B (en) Method and system for analyzing substance concentration based on near infrared spectrum technology
CN108959084B (en) Markov vulnerability prediction quantity method based on smoothing method and similarity
CN110324178B (en) Network intrusion detection method based on multi-experience nuclear learning
Weine et al. Application of equal local levels to improve QQ plot testing bands with R package qqconf
Saidov Data visualization and its proof by compactness criterion of objects of classes
Salem et al. Sequential dimension reduction for learning features of expensive black-box functions
Zhang et al. Dbiecm-an evolving clustering method for streaming data clustering
CN114298659A (en) Data processing method and device for evaluation object index and computer equipment
Izsák Some practical aspects of fitting and testing the Zipf-Mandelbrot model: A short essay
Liu et al. A robust regression based on weighted LSSVM and penalized trimmed squares
Zhao et al. Distribution-free and model-free multivariate feature screening via multivariate rank distance correlation
Mazzocco et al. Estimates and impact of lymphocyte division parameters from CFSE data using mathematical modelling
Bongiorno et al. Statistically validated hierarchical clustering: Nested partitions in hierarchical trees
RU2586025C2 (en) Method for automatic clustering of objects
Zhang et al. Predicting Future Event via Small Data (eg, 4 Data) by ASF and Curve Fitting Methods
Ren et al. Development of feature extraction method based on interval-valued Pythagorean fuzzy decision theory

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant