CN117375845A - Network asset certificate identification method and device - Google Patents



Publication number
CN117375845A
Authority
CN
China
Prior art keywords
certificate
network asset
information
asset
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311345134.2A
Other languages
Chinese (zh)
Inventor
任传伦
张先国
杨天长
刘策越
李宝静
尹誉衡
唐然
郭强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 15 Research Institute
Original Assignee
CETC 15 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 15 Research Institute
Priority to CN202311345134.2A
Publication of CN117375845A


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3263 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials involving certificates, e.g. public key certificate [PKC] or attribute certificate [AC]; Public key infrastructure [PKI] arrangements

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a network asset certificate identification method and device. The method comprises the following steps: acquiring certificate information of a network asset, wherein the certificate information of the network asset forms an asset certificate library; processing the certificate information of the network asset to obtain feature information of the network asset, wherein the feature information of the network asset forms a feature library; training a preset network asset certificate identification model with the feature information of the network asset to obtain a trained network asset certificate identification model; and acquiring certificate information of a network asset to be identified and processing it with the trained network asset certificate identification model to obtain a network asset certificate identification result. The invention combines two similarity measurement methods and compares certificates in terms of both content and structure; combined with an autonomous learning feedback mechanism, it explores a certificate similarity measurement and identification query method with a high degree of automation and higher accuracy.

Description

Network asset certificate identification method and device
Technical Field
The invention relates to the technical field of network asset identity recognition, in particular to a network asset certificate recognition method and device.
Background
A digital certificate is a digital credential that marks the identity of each communicating party in Internet communication and can be used to identify and verify that identity. Certificate information typically includes the following items: version number, serial number, signature algorithm, issuer, validity period, subject, subject public key, signature value, etc. The version number is the version information of the certificate; each certificate has a unique serial number; the signature algorithm is the algorithm used in the authentication process; the issuer is the name of the party that issued the certificate; the validity period marks the time during which the certificate is valid; the subject is the name of the certificate owner; the subject public key is the public key of the certificate owner; and the signature value is the issuer's signature over the certificate.
Digital certificates serve as identity credentials for network asset subjects in Internet information activities, and are also electronic data that can be used to verify the confidentiality and integrity of information transmitted by network assets. Measuring the similarity of network asset certificates reasonably and accurately makes it possible to classify, identify, query, and retrieve network assets effectively.
At present there is no research that directly addresses the similarity measurement and identification of certificates. Because a certificate is structured as a set of information items (version number, serial number, signature algorithm, issuer, and so on) together with the value of each item, methods for measuring text similarity in text classification can serve as a reference. Most text classification algorithms, such as the KNN method, the support vector machine method, and the K-means method, classify by computing similarity. Common conventional text similarity measures include cosine similarity, the Jaccard similarity coefficient, Euclidean distance, Manhattan distance, Chebyshev distance, and Mahalanobis distance.
Existing text similarity measurement methods each serve a single purpose and cannot be applied directly to the similarity measurement of certificates. Because a certificate comprises multiple information items and the value of each item, two similarity measurement methods can be combined so that certificates are compared in terms of both content and structure; combined with an autonomous learning feedback mechanism, this makes it possible to explore a certificate similarity measurement and identification query method with a high degree of automation and higher accuracy.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a network asset certificate identification method and device that identify network assets by measuring the similarity between the certificates of different network assets, thereby addressing the low degree of automation and low accuracy of network asset identification in the prior art. To solve this technical problem, a first aspect of the embodiments of the present invention discloses a network asset certificate identification method, which includes:
S1, acquiring certificate information of a network asset, wherein the certificate information of the network asset forms an asset certificate library;
S2, processing the certificate information of the network asset to obtain feature information of the network asset, wherein the feature information of the network asset forms a feature library;
S3, training a preset network asset certificate identification model with the feature information of the network asset to obtain a trained network asset certificate identification model;
S4, acquiring certificate information of a network asset to be identified, and processing it with the trained network asset certificate identification model to obtain a network asset certificate identification result.
In the first aspect of the embodiments of the present invention, processing the certificate information of the network asset to obtain the feature information of the network asset includes:
S21, performing structural similarity calculation on the certificate information of the network asset to obtain structural similarity information;
S22, performing content similarity calculation on the certificate information of the network asset to obtain content similarity information;
S23, fusing the structural similarity information and the content similarity information to obtain the feature information of the network asset.
In the first aspect of the embodiments of the present invention, performing structural similarity calculation on the certificate information of the network asset to obtain structural similarity information includes:
S211, acquiring the certificate information of a network asset A, wherein the number of information items of network asset A is m;
S212, acquiring the certificate information of a network asset B, wherein the number of information items of network asset B is n;
S213, processing the certificate information of network asset A and the certificate information of network asset B to obtain the number x of information items contained in both network asset A and network asset B;
S214, processing the number m of information items of network asset A, the number n of information items of network asset B, and the number x of shared information items with a structural similarity calculation model to obtain structural similarity information;
the structural similarity calculation model is as follows:
where str(A, B) is the structural similarity information.
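The formula for str(A, B) is not reproduced in this text, so the following is only a minimal sketch of how m, n, and x from steps S211 to S214 could be combined. It assumes a Jaccard-style ratio x / (m + n - x), which is an illustrative choice and not necessarily the model used by the invention; the function name and the item names are hypothetical.

```python
from __future__ import annotations


def structural_similarity(items_a: set[str], items_b: set[str]) -> float:
    """Assumed Jaccard-style ratio x / (m + n - x); not the patent's own formula."""
    m = len(items_a)            # number of information items of asset A
    n = len(items_b)            # number of information items of asset B
    x = len(items_a & items_b)  # number of items contained in both A and B
    union = m + n - x
    return x / union if union else 0.0


# Hypothetical information-item names for two certificates.
cert_a = {"version", "serial_number", "signature_algorithm",
          "issuer", "validity_period", "subject"}
cert_b = {"version", "serial_number", "signature_algorithm",
          "issuer", "subject_public_key"}

str_ab = structural_similarity(cert_a, cert_b)  # m=6, n=5, x=4
```

Any ratio that grows with x and shrinks as m and n diverge would fit the description equally well; the Jaccard form is used here only because it is bounded between 0 and 1.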
In the first aspect of the embodiments of the present invention, performing content similarity calculation on the certificate information of the network asset to obtain content similarity information includes:
S221, processing the certificate information of network asset A to obtain the formalized representation of its certificate information items, $\mathrm{content}_A = (a_1, a_2, \ldots, a_x)$;
S222, processing the certificate information of network asset B to obtain the formalized representation of its certificate information items, $\mathrm{content}_B = (b_1, b_2, \ldots, b_x)$;
S223, processing the formalized representation $\mathrm{content}_A = (a_1, a_2, \ldots, a_x)$ of network asset A and the formalized representation $\mathrm{content}_B = (b_1, b_2, \ldots, b_x)$ of network asset B with a content similarity calculation model to obtain content similarity information;
the content similarity calculation model is as follows:
wherein con(A, B) is the content similarity information.
In an optional implementation of the first aspect of the embodiments of the present invention, fusing the structural similarity information and the content similarity information to obtain the feature information of the network asset includes:
fusing the structural similarity information and the content similarity information with a similarity information fusion model to obtain the feature information of the network asset;
the similarity information fusion model is as follows:
wherein total(A, B) is the feature information of the network asset, str(A, B) is the structural similarity information, and con(A, B) is the content similarity information.
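The content similarity and fusion formulas are likewise not reproduced in this text. As a hedged illustration only, the sketch below assumes that the formalized representations are numeric vectors of equal length x, that con(A, B) is their cosine similarity (one of the conventional measures named in the Background), and that total(A, B) is a weighted sum with a hypothetical weight `alpha`. None of these choices is confirmed by the source.

```python
import math
from typing import Sequence


def content_similarity(content_a: Sequence[float], content_b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (assumed measure for con(A, B))."""
    if len(content_a) != len(content_b):
        raise ValueError("both representations must cover the same x shared items")
    dot = sum(a * b for a, b in zip(content_a, content_b))
    norm_a = math.sqrt(sum(a * a for a in content_a))
    norm_b = math.sqrt(sum(b * b for b in content_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as having no measurable similarity
    return dot / (norm_a * norm_b)


def total_similarity(str_ab: float, con_ab: float, alpha: float = 0.5) -> float:
    """Assumed fusion: weighted sum of structural and content similarity."""
    return alpha * str_ab + (1.0 - alpha) * con_ab


# Hypothetical numeric encodings of the values of three shared information items.
content_a = [1.0, 0.0, 1.0]
content_b = [1.0, 1.0, 1.0]

con_ab = content_similarity(content_a, content_b)
total_ab = total_similarity(str_ab=0.5, con_ab=con_ab, alpha=0.5)
```

How certificate values such as issuer names are encoded as numbers is left open here; that encoding would have to be chosen before a vector-based measure can be applied.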
In the first aspect of the embodiments of the present invention, training a preset network asset certificate identification model with the feature information of the network asset to obtain a trained network asset certificate identification model includes:
S31, dividing the feature information of the network asset to obtain labeled certificate samples and unlabeled certificate samples;
S32, training the preset network asset certificate identification model with the unlabeled certificate samples as training samples to obtain the trained network asset certificate identification model.
In the first aspect of the embodiments of the present invention, acquiring the certificate information of the network asset to be identified and processing it with the trained network asset certificate identification model to obtain a network asset certificate identification result includes:
S41, acquiring the certificate information of the network asset to be identified;
S42, processing the certificate information of the network asset to be identified to obtain the feature information of the network asset to be identified;
S43, obtaining a first identification result according to the feature information of the network asset to be identified;
S44, according to the first identification result, processing the feature information of the network asset to be identified with the trained network asset certificate identification model to obtain the network asset certificate identification result.
As an optional implementation of the first aspect of the embodiments of the present invention, the method further includes:
acquiring the certificate information of the network asset to be identified;
identifying the certificate information of the network asset to be identified to obtain a first identification result;
labeling, according to the first identification result, the network asset certificates that differ from the first identification result to obtain positive example certificates and negative example certificates;
adding the negative example certificates to the unlabeled certificate samples to form an optimal training sample set;
and training the preset network asset certificate identification model with the optimal training sample set to obtain an optimized network asset certificate identification model.
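The feedback steps above can be read in more than one way. The following minimal sketch takes one reading: a certificate whose label from the user agrees with the first identification result is a positive example, one whose label disagrees is a negative example, and the negative examples are appended to the unlabeled sample pool. The function names, certificate identifiers, and asset labels are all hypothetical.

```python
from typing import Dict, List, Tuple


def split_feedback(first_results: Dict[str, str],
                   user_labels: Dict[str, str]) -> Tuple[List[str], List[str]]:
    """Split reviewed certificates into positive and negative examples."""
    positives: List[str] = []
    negatives: List[str] = []
    for cert_id, predicted in first_results.items():
        if cert_id not in user_labels:
            continue  # not reviewed by the user, so it stays unlabeled
        if user_labels[cert_id] == predicted:
            positives.append(cert_id)
        else:
            negatives.append(cert_id)
    return positives, negatives


def build_training_set(unlabeled_ids: List[str], negatives: List[str]) -> List[str]:
    """Append negative examples to the unlabeled pool, keeping order and avoiding duplicates."""
    merged = list(unlabeled_ids)
    seen = set(merged)
    for cert_id in negatives:
        if cert_id not in seen:
            merged.append(cert_id)
            seen.add(cert_id)
    return merged


first_results = {"c1": "asset_x", "c2": "asset_y", "c3": "asset_x"}
user_labels = {"c1": "asset_x", "c2": "asset_z"}  # c3 was not reviewed

positives, negatives = split_feedback(first_results, user_labels)
training_ids = build_training_set(["c4", "c5"], negatives)
```

The resulting identifier list would then be used to retrain the identification model; the retraining step itself is omitted because the model interface is not specified in the text.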
A second aspect of the embodiments of the present invention discloses a network asset certificate identification device, which comprises:
a data acquisition module, configured to acquire certificate information of a network asset, wherein the certificate information of the network asset forms an asset certificate library;
a feature extraction module, configured to process the certificate information of the network asset to obtain feature information of the network asset, wherein the feature information of the network asset forms a feature library;
a model training module, configured to train a preset network asset certificate identification model with the feature information of the network asset to obtain a trained network asset certificate identification model;
and a certificate identification module, configured to acquire certificate information of a network asset to be identified and process it with the trained network asset certificate identification model to obtain a network asset certificate identification result.
In the second aspect of the embodiments of the present invention, processing the certificate information of the network asset to obtain the feature information of the network asset includes:
S21, performing structural similarity calculation on the certificate information of the network asset to obtain structural similarity information;
S22, performing content similarity calculation on the certificate information of the network asset to obtain content similarity information;
S23, fusing the structural similarity information and the content similarity information to obtain the feature information of the network asset.
In the second aspect of the embodiments of the present invention, performing structural similarity calculation on the certificate information of the network asset to obtain structural similarity information includes:
S211, acquiring the certificate information of a network asset A, wherein the number of information items of network asset A is m;
S212, acquiring the certificate information of a network asset B, wherein the number of information items of network asset B is n;
S213, processing the certificate information of network asset A and the certificate information of network asset B to obtain the number x of information items contained in both network asset A and network asset B;
S214, processing the number m of information items of network asset A, the number n of information items of network asset B, and the number x of shared information items with a structural similarity calculation model to obtain structural similarity information;
the structural similarity calculation model is as follows:
where str(A, B) is the structural similarity information.
In the second aspect of the embodiments of the present invention, performing content similarity calculation on the certificate information of the network asset to obtain content similarity information includes:
S221, processing the certificate information of network asset A to obtain the formalized representation of its certificate information items, $\mathrm{content}_A = (a_1, a_2, \ldots, a_x)$;
S222, processing the certificate information of network asset B to obtain the formalized representation of its certificate information items, $\mathrm{content}_B = (b_1, b_2, \ldots, b_x)$;
S223, processing the formalized representation $\mathrm{content}_A = (a_1, a_2, \ldots, a_x)$ of network asset A and the formalized representation $\mathrm{content}_B = (b_1, b_2, \ldots, b_x)$ of network asset B with a content similarity calculation model to obtain content similarity information;
the content similarity calculation model is as follows:
wherein con(A, B) is the content similarity information.
In the second aspect of the embodiments of the present invention, fusing the structural similarity information and the content similarity information to obtain the feature information of the network asset includes:
fusing the structural similarity information and the content similarity information with a similarity information fusion model to obtain the feature information of the network asset;
the similarity information fusion model is as follows:
wherein total(A, B) is the feature information of the network asset, str(A, B) is the structural similarity information, and con(A, B) is the content similarity information.
In the second aspect of the embodiments of the present invention, training a preset network asset certificate identification model with the feature information of the network asset to obtain a trained network asset certificate identification model includes:
S31, dividing the feature information of the network asset to obtain labeled certificate samples and unlabeled certificate samples;
S32, training the preset network asset certificate identification model with the unlabeled certificate samples as training samples to obtain the trained network asset certificate identification model.
In the second aspect of the embodiments of the present invention, acquiring the certificate information of the network asset to be identified and processing it with the trained network asset certificate identification model to obtain a network asset certificate identification result includes:
S41, acquiring the certificate information of the network asset to be identified;
S42, processing the certificate information of the network asset to be identified to obtain the feature information of the network asset to be identified;
S43, obtaining a first identification result according to the feature information of the network asset to be identified;
S44, according to the first identification result, processing the feature information of the network asset to be identified with the trained network asset certificate identification model to obtain the network asset certificate identification result.
As an optional implementation of the second aspect of the embodiments of the present invention, the method further includes:
acquiring the certificate information of the network asset to be identified;
identifying the certificate information of the network asset to be identified to obtain a first identification result;
labeling, according to the first identification result, the network asset certificates that differ from the first identification result to obtain positive example certificates and negative example certificates;
adding the negative example certificates to the unlabeled certificate samples to form an optimal training sample set;
and training the preset network asset certificate identification model with the optimal training sample set to obtain an optimized network asset certificate identification model.
A third aspect of the present invention discloses another network asset certificate identification apparatus, the apparatus comprising:
a memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform some or all of the steps in the network asset certificate identification method disclosed in the first aspect of the embodiment of the present invention.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
The invention provides a network asset certificate identification method and device that measure the similarity of certificate samples in terms of both structural similarity and content similarity, so as to obtain the structural features, content features, and comprehensive features of a certificate. A support vector machine model is adopted. Starting from the question of how to select the most suitable certificates as training samples for the support vector machine, the idea of autonomous learning is introduced and a support vector machine based on active learning feedback is constructed, in which unlabeled certificate samples serve as training samples and are input into the support vector machine classifier for learning. As the user continues to identify and query unlabeled samples in a large certificate library, this avoids two problems: the classification performance of the support vector machine classifier being degraded by the limited number of samples the user has labeled, and the processing time consumed by a large amount of labeling work increasing the computational load of the classifier. Meanwhile, the most ambiguous certificates can be labeled and combined with the samples labeled by the user to form an optimal training sample set, which shortens computation time, improves the efficiency of classification queries, and improves the quality of identification query results.
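The text does not say how the "most ambiguous" certificates are chosen. A common approach in active learning with a support vector machine is margin-based uncertainty sampling: the samples whose decision score is closest to zero lie nearest the separating hyperplane and are queried first. The sketch below assumes that approach; the scores are hypothetical values standing in for the output of a trained classifier.

```python
from typing import Dict, List


def most_ambiguous(decision_scores: Dict[str, float], k: int = 1) -> List[str]:
    """Return the k certificate ids whose decision score is closest to zero."""
    ranked = sorted(decision_scores, key=lambda cert_id: abs(decision_scores[cert_id]))
    return ranked[:k]


# Hypothetical signed distances of unlabeled certificates to the SVM hyperplane.
scores = {"c1": 1.8, "c2": -0.1, "c3": 0.6, "c4": -2.3}

to_label = most_ambiguous(scores, k=2)
```

The selected certificates would be presented to the user for labeling and then merged with the existing labeled samples before the classifier is retrained.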
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a network asset certificate identification method disclosed in an embodiment of the present invention;
FIG. 2 is a flow chart of another network asset certificate identification method disclosed in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a network asset certificate identification query system based on autonomous learning feedback according to an embodiment of the present invention;
FIG. 4 is a schematic illustration of sample labeling as disclosed in an embodiment of the present invention;
FIG. 5 is a schematic diagram of an active learning model disclosed in an embodiment of the present invention;
FIG. 6 is a schematic diagram of a network asset certificate identification device according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of another network asset certificate identification device according to an embodiment of the present invention.
Detailed Description
In order to make the present invention better understood by those skilled in the art, the following description will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms first, second and the like in the description and in the claims and in the above-described figures are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, apparatus, article, or device that comprises a list of steps or elements is not limited to the list of steps or elements but may, in the alternative, include other steps or elements not expressly listed or inherent to such process, method, article, or device.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The invention discloses a network asset certificate identification method and device, which acquire certificate information of a network asset, wherein the certificate information of the network asset forms an asset certificate library; process the certificate information of the network asset to obtain feature information of the network asset, wherein the feature information of the network asset forms a feature library; train a preset network asset certificate identification model with the feature information of the network asset to obtain a trained network asset certificate identification model; and acquire certificate information of a network asset to be identified and process it with the trained network asset certificate identification model to obtain a network asset certificate identification result. The invention combines two similarity measurement methods and compares certificates in terms of both content and structure; combined with an autonomous learning feedback mechanism, it explores a certificate similarity measurement and identification query method with a high degree of automation and higher accuracy. Details are described below.
Example 1
Referring to fig. 1, fig. 1 is a flowchart of a network asset certificate identification method according to an embodiment of the present invention. The network asset certificate identification method described in fig. 1 is applied to a network asset certificate identification system, and embodiments of the present invention are not limited thereto. As shown in fig. 1, the network asset certificate identification method may include the following operations:
S1, acquiring certificate information of a network asset, wherein the certificate information of the network asset forms an asset certificate library;
S2, processing the certificate information of the network asset to obtain feature information of the network asset, wherein the feature information of the network asset forms a feature library;
S3, training a preset network asset certificate identification model with the feature information of the network asset to obtain a trained network asset certificate identification model;
S4, acquiring certificate information of a network asset to be identified, and processing it with the trained network asset certificate identification model to obtain a network asset certificate identification result.
The asset certificate library is composed of the certificate information of network assets and comprises the following information items: version number, serial number, signature algorithm, issuer, validity period, subject, subject public key, signature value, etc.
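As a minimal sketch of how such a library might be held in memory, the example below models one certificate as a record of optional information items and the library as a mapping from an asset identifier to its record. The class name, field names, identifier format, and sample values are all hypothetical; the text does not prescribe a storage format.

```python
from dataclasses import dataclass, fields
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class CertificateInfo:
    """Information items of one certificate; None marks an item that is absent."""
    version: Optional[str] = None
    serial_number: Optional[str] = None
    signature_algorithm: Optional[str] = None
    issuer: Optional[str] = None
    validity_period: Optional[str] = None
    subject: Optional[str] = None
    subject_public_key: Optional[str] = None
    signature_value: Optional[str] = None


def present_items(cert: CertificateInfo) -> Set[str]:
    """Names of the information items that this certificate actually carries."""
    return {f.name for f in fields(cert) if getattr(cert, f.name) is not None}


# The library maps a hypothetical asset identifier to its certificate record.
asset_certificate_library: Dict[str, CertificateInfo] = {
    "asset-001": CertificateInfo(version="3", serial_number="01", issuer="Example CA"),
}

items = present_items(asset_certificate_library["asset-001"])
```

Keeping absent items as `None` makes it straightforward to count the items each certificate has and the items two certificates share, which is what the structural comparison needs.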
Optionally, processing the certificate information of the network asset to obtain the feature information of the network asset includes:
S21, performing structural similarity calculation on the certificate information of the network asset to obtain structural similarity information;
S22, performing content similarity calculation on the certificate information of the network asset to obtain content similarity information;
S23, fusing the structural similarity information and the content similarity information to obtain the feature information of the network asset.
Optionally, performing structural similarity calculation on the certificate information of the network asset to obtain structural similarity information includes:
S211, acquiring the certificate information of a network asset A, wherein the number of information items of network asset A is m;
S212, acquiring the certificate information of a network asset B, wherein the number of information items of network asset B is n;
S213, processing the certificate information of network asset A and the certificate information of network asset B to obtain the number x of information items contained in both network asset A and network asset B;
S214, processing the number m of information items of network asset A, the number n of information items of network asset B, and the number x of shared information items with a structural similarity calculation model to obtain structural similarity information;
the structural similarity calculation model is as follows:
where str(A, B) is the structural similarity information.
Optionally, the performing content similarity calculation on the certificate information of the network asset to obtain content similarity information includes:
S221, processing the certificate information of the network asset A to obtain the formalized representation of the certificate information item content of the network asset A, $\mathrm{content}_A = (a_1, a_2, \ldots, a_x)$;
S222, processing the certificate information of the network asset B to obtain the formalized representation of the certificate information item content of the network asset B, $\mathrm{content}_B = (b_1, b_2, \ldots, b_x)$;
S223, processing the formalized representation $\mathrm{content}_A$ of the network asset A and the formalized representation $\mathrm{content}_B$ of the network asset B by using a content similarity calculation model to obtain content similarity information;
the content similarity calculation model is as follows:
wherein con (A, B) is content similarity information,
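A minimal sketch of the content step follows. The description elsewhere states that the Euclidean distance between the two value vectors is used and that a smaller distance means a higher similarity; the exact mapping from distance to similarity is not reproduced in this text, so the `1 / (1 + d)` form below is an assumption, as is the premise that item values have already been encoded as numbers.

```python
import math


def euclidean_distance(content_a, content_b):
    """Euclidean distance between the value vectors of the shared information items."""
    if len(content_a) != len(content_b):
        raise ValueError("both vectors must cover the same x shared items")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(content_a, content_b)))


def content_similarity(content_a, content_b):
    """ASSUMED mapping: similarity = 1 / (1 + distance), so identical values give 1."""
    return 1.0 / (1.0 + euclidean_distance(content_a, content_b))


print(euclidean_distance([0, 0], [3, 4]))
print(content_similarity([1, 2], [1, 2]))
```

Any strictly decreasing function of the distance would respect the stated ordering; this one is chosen only because it keeps the result in (0, 1].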
optionally, the fusing the structural similarity information and the content similarity information to obtain feature information of the network asset includes:
fusing the structural similarity information and the content similarity information by using a similarity information fusion model to obtain characteristic information of the network asset;
the similarity information fusion model is as follows:
where total(A, B) is the characteristic information of the network asset, str(A, B) is the structural similarity information, and con(A, B) is the content similarity information.
Optionally, training a preset network asset certificate identification model by using the characteristic information of the network asset to obtain a trained network asset certificate identification model includes:
S31, dividing the characteristic information of the network asset to obtain labeled certificate samples and unlabeled certificate samples;
The division may, for example, be performed at a ratio of 3:7; the invention is not limited in this respect.
S32, training the preset network asset certificate identification model by taking the unlabeled certificate samples as training samples to obtain the trained network asset certificate identification model.
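The division in S31 can be sketched as a seeded random split. The text gives the 3:7 ratio only as an example and does not say which side is the labeled one; the sketch assumes 3 parts labeled to 7 parts unlabeled, and the function and parameter names are hypothetical.

```python
import random


def split_samples(features, labeled_fraction=0.3, seed=0):
    """Split feature records into (labeled, unlabeled) subsets.

    ASSUMPTION: the 3:7 ratio is read as 30% labeled, 70% unlabeled.
    """
    rng = random.Random(seed)          # seeded so the split is reproducible
    shuffled = list(features)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * labeled_fraction)
    return shuffled[:cut], shuffled[cut:]


labeled, unlabeled = split_samples(range(10))
print(len(labeled), len(unlabeled))
```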
Optionally, the acquiring the certificate information of the network asset to be identified, and processing the certificate information of the network asset to be identified by using the trained network asset certificate identification model to obtain a network asset certificate identification result, includes:
S41, acquiring certificate information of the network asset to be identified;
S42, processing the certificate information of the network asset to be identified to obtain the characteristic information of the network asset to be identified;
S43, obtaining a first recognition result according to the characteristic information of the network asset to be identified;
S44, according to the first recognition result, processing the characteristic information of the network asset to be identified by using the trained network asset certificate identification model to obtain a network asset certificate identification result.
Optionally, the method further comprises:
acquiring certificate information of a network asset to be identified;
identifying the certificate information of the network asset to be identified to obtain a first recognition result;
labeling, according to the first recognition result, the network asset certificates that differ from the first recognition result, to obtain positive example certificates and negative example certificates;
the user sets a threshold according to the first recognition result and labels positive example certificates and negative example certificates in a user-defined quantity, and a second query recognition is performed to obtain a result;
adding the negative example certificates to the unlabeled certificate samples to form an optimal training sample set;
and training the preset network asset certificate identification model by using the optimal training sample set to obtain an optimized network asset certificate identification model.
Optionally, after the structural similarity information and the content similarity information are obtained, feature fusion may be performed as follows:
The structural similarity information X and the content similarity information Y are each projected onto one dimension by a linear map, with projection vectors a and b respectively. The projected features are:
$$X' = a^{T}X, \quad Y' = b^{T}Y$$
The correlation coefficient between X' and Y' is maximized, which yields the projection vectors a and b at the maximum. Namely:
$$\arg\max_{a,b} \frac{cov(X',Y')}{\sqrt{D(X')}\sqrt{D(Y')}}$$
The data are normalized before projection so that they have mean 0 and variance 1. Under this condition:
$$cov(X',Y') = cov(a^{T}X, b^{T}Y) = E(\langle a^{T}X, b^{T}Y \rangle) = E((a^{T}X)(b^{T}Y)^{T}) = a^{T}E(XY^{T})b$$
$$D(X') = D(a^{T}X) = a^{T}E(XX^{T})a$$
$$D(Y') = D(b^{T}Y) = b^{T}E(YY^{T})b$$
Because X and Y have zero mean after normalization:
$$D(X) = cov(X,X) = E(XX^{T})$$
$$D(Y) = cov(Y,Y) = E(YY^{T})$$
$$cov(X,Y) = E(XY^{T}), \quad cov(Y,X) = E(YX^{T})$$
Let $S_{XX} = cov(X,X)$, and likewise $S_{YY} = cov(Y,Y)$ and $S_{XY} = cov(X,Y)$; the optimization target then becomes:
$$\arg\max_{a,b} \frac{a^{T}S_{XY}b}{\sqrt{a^{T}S_{XX}a}\sqrt{b^{T}S_{YY}b}}$$
The procedure for solving this objective includes the following steps:
Step3: performing singular value decomposition on M to obtain the maximum singular value and its left and right singular vectors u and v.
Step4: the projection vectors a and b of X and Y are respectively:
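Steps 1 and 2 and the closing formula of Step4 are not reproduced in this text. The derivation above matches the standard SVD-based solution of canonical correlation analysis; under the assumption that the patent follows that standard solution, the missing pieces would read as below. This is a reconstruction from the textbook method, not the patent's own wording.

$$
\begin{aligned}
&\text{Step1 (assumed): } S_{XX} = cov(X,X),\quad S_{YY} = cov(Y,Y),\quad S_{XY} = cov(X,Y) \\
&\text{Step2 (assumed): } M = S_{XX}^{-1/2}\, S_{XY}\, S_{YY}^{-1/2} \\
&\text{Step4 (assumed): } a = S_{XX}^{-1/2}\,u,\quad b = S_{YY}^{-1/2}\,v
\end{aligned}
$$

The reasoning is that substituting $a = S_{XX}^{-1/2}u$ and $b = S_{YY}^{-1/2}v$ turns the constraint terms $a^{T}S_{XX}a$ and $b^{T}S_{YY}b$ into $u^{T}u$ and $v^{T}v$, so the objective becomes $\max u^{T}Mv$ subject to $\|u\| = \|v\| = 1$, which is attained by the singular vectors belonging to the largest singular value of M.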
Optionally, the preset network asset certificate identification model processes the characteristic information of the network asset by discrete wavelet decomposition. After the characteristic information is decomposed into different frequency bands, the energy ratio of each band is calculated and normalized, the features are reduced in dimension by principal component analysis, and the resulting set of feature vectors is used as the input of a least squares support vector machine. The initial parameters of the least squares support vector machine are optimized by a hybrid particle swarm algorithm, thereby building the preset network asset certificate identification model. The hybrid particle swarm algorithm combines FSA (fast simulated annealing) and PSO (particle swarm optimization). PSO explores the global search region; once PSO has found the global best solution of the current iteration, FSA perturbs that best position to obtain a new solution, which is compared with the best solution obtained by PSO. If the new solution is better, it becomes the current best solution; otherwise it is accepted with a certain probability. This combination avoids premature convergence to a local optimum, balances global search against local search, and improves the efficiency and accuracy of the algorithm.
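The first stage of this pipeline, decomposing into frequency bands and computing each band's energy ratio, can be sketched as follows. The text does not name a wavelet family, so the Haar wavelet is an assumption here, as are the function names; the principal component analysis and least squares support vector machine stages are not shown.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def band_energy_ratios(signal, levels):
    """Share of total energy per band, ordered [detail_1, ..., detail_L, approx_L]."""
    if levels < 1 or len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    bands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    energies = [sum(c * c for c in band) for band in bands]
    total = sum(energies)
    if total == 0:
        return [0.0] * len(bands)
    return [e / total for e in energies]   # normalized so the ratios sum to 1


ratios = band_energy_ratios([1.0, 1.0, 1.0, 1.0], levels=2)
print(ratios)
```

A constant signal carries all of its energy in the final approximation band, which makes it a convenient sanity check.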
FSA algorithm flow:
1) Starting from an initial solution, define an initial temperature T, a termination temperature $T_{min}$, and a cooling rate r;
2) At each temperature, perturb the current solution using the Cauchy distribution to obtain a new solution;
3) For each new solution, calculate its objective function value or cost function value. If the new solution is better, accept it as the current solution; otherwise accept it with the probability given by the Metropolis criterion;
4) Lower the temperature according to the cooling rate r and repeat steps 2) and 3) until the temperature drops to $T_{min}$. As the temperature falls, the probability of accepting inferior solutions gradually decreases, until finally only better solutions are accepted;
5) When the temperature has dropped to $T_{min}$, the algorithm ends and returns the best solution found.
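The five steps above can be sketched as a small minimizer. Several details are assumptions, since the text does not fix them: the cooling schedule is taken as geometric (T is multiplied by r), the Cauchy step is scaled by the current temperature, and the number of trials per temperature is a hypothetical parameter.

```python
import math
import random


def fast_simulated_annealing(cost, x0, t_init=1.0, t_min=1e-3, r=0.9,
                             steps_per_temp=20, seed=0):
    """Minimize `cost` starting from x0; returns (best_solution, best_cost)."""
    rng = random.Random(seed)
    current, current_cost = list(x0), cost(x0)
    best, best_cost = list(current), current_cost
    t = t_init
    while t > t_min:
        for _ in range(steps_per_temp):
            # Step 2: Cauchy perturbation; tan(pi * (u - 0.5)) is a standard Cauchy draw.
            candidate = [xi + t * math.tan(math.pi * (rng.random() - 0.5))
                         for xi in current]
            candidate_cost = cost(candidate)
            delta = candidate_cost - current_cost
            # Step 3: accept improvements; otherwise apply the Metropolis criterion.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        t *= r  # Step 4: ASSUMED geometric cooling with rate r
    return best, best_cost  # Step 5


def sphere(x):
    """Toy cost function used only for the demonstration."""
    return sum(v * v for v in x)


solution, value = fast_simulated_annealing(sphere, [3.0, -2.0])
print(solution, value)
```

In the hybrid scheme described earlier, `x0` would be the best position found by PSO in the current iteration and `cost` the validation error of the least squares support vector machine for a given parameter setting.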
Embodiment two
Referring to fig. 2, fig. 2 is a flowchart illustrating another network asset certificate identification method according to an embodiment of the present invention. The network asset certificate identification method described in fig. 2 is applied to a network asset certificate identification system, and embodiments of the present invention are not limited thereto. As shown in fig. 2, the network asset certificate identification method may include the following operations:
(1) Acquiring certificate information of the network asset to be identified;
(2) Quantifying certificate similarity in terms of both structure and content: calculating the structural similarity and the content similarity of the certificate, and from them obtaining the structural features, content features, and comprehensive features of the certificate;
(3) Constructing a support vector machine based on active learning feedback, and optimizing the training sample set;
(4) Measuring the similarity of the asset certificate to be identified through the classifier model, identifying the network asset, and thereby establishing a library of similar network assets.
The network asset certificate identification and query system based on active learning feedback is designed as follows:
1) Overall structure
Retrieval feedback interface: includes certificate query and result feedback. Through active learning training and identification feedback, the features of the network asset certificate to be identified are compared against the network asset certificate library by calculating asset certificate similarity, and certificates of different categories are identified; the interface also lets the user query again with selected sample certificates.
Asset certificate library: known raw asset certificates;
feature library: a comprehensive feature library formed by structural features and content features;
2) Principle of operation
The user selects the certificate to be queried and identified through the certificate query module, and the system displays the first identification query result;
Based on this result, the user labels positive example certificates and negative example certificates, in a quantity the user chooses, and a second query identification is performed to obtain a result. Certificates with high ambiguity are singled out by the active learning feedback mechanism; the user may label them as negative example certificates and feed the labeling result back again;
The number of system iterations is decided by whether the feedback from the result feedback module meets the expectation of the certificate identification query; operation ends once the identification result reaches a certain accuracy.
The principle schematic diagram of the network asset certificate identification query system based on autonomous learning feedback is shown in fig. 3.
3) Implementation design
Certificate feature extraction
(1) Extracting the information items of each network asset certificate sample and the value of each information item;
The certificate information includes the following information items: version number, serial number, signature algorithm, issuer, validity period, subject, subject public key, subject public key algorithm, signature value, etc. The version number is the version information of the certificate; each certificate has a unique certificate serial number; the signature algorithm is the algorithm used in the authentication process; the issuer is the name of the certificate issuer; the validity period marks the time during which the certificate is valid; the subject is the name of the certificate owner; the subject public key is the public key of the certificate owner; and the signature value is the certificate issuer's signature on the certificate.
The information items of the certificate information can be expressed as a vector; the certificate information of a subject A can be expressed as
$\mathrm{cert}_A = (A_1, A_2, \ldots, A_n)$,
where, for i = 1 to n, $A_i$ represents in turn the version number, serial number, signature algorithm, issuer, validity period, subject, subject public key, subject public key algorithm, signature value, etc. of the certificate. Each certificate information item has a corresponding value, called the content of the information item; for example, the version number may be V3, formalized as $A_1 = V3$.
(2) Calculating the structural similarity between each network asset certificate sample and the known network asset certificates, and taking it as the structural feature of the certificate;
Structural similarity measures how similar the structures of two certificates are: all certificate information items of the two certificates are listed, and the Jaccard similarity coefficient is used to calculate the structural similarity. Assuming that certificate A has m certificate information items and certificate B has n certificate information items, of which x information items are contained in both A and B, the structural similarity of certificate A and certificate B is expressed as
$$str(A,B) = \frac{x}{m + n - x}$$
If the two certificates contain information items with the same names and in the same number, i.e., x = m = n, the structural similarity of the two certificates is 1.
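As a worked instance of the Jaccard similarity coefficient named above, which for two item sets is the size of their intersection divided by the size of their union, take hypothetical counts m = 9, n = 8, and x = 7:

$$
str(A,B) = \frac{|A \cap B|}{|A \cup B|} = \frac{x}{m + n - x} = \frac{7}{9 + 8 - 7} = 0.7
$$

When x = m = n = 9, the same expression gives $9 / (9 + 9 - 9) = 1$, which agrees with the limiting case stated in the text.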
(3) Calculating the content similarity between each network asset certificate sample and the known network asset certificates, and taking it as the content feature of the certificate;
Content similarity measures how similar the values of the information items shared by two certificates are: the certificate information items present in both certificates are listed, and the Euclidean distance is used to measure the similarity of their values. Assuming that certificate A and certificate B both contain x information items, such as version number, serial number, signature algorithm, issuer, validity period, subject, subject public key, subject public key algorithm, and signature value, the values of the shared information items can be formally expressed as $\mathrm{content}_A = (a_1, a_2, \ldots, a_x)$ and $\mathrm{content}_B = (b_1, b_2, \ldots, b_x)$, and the Euclidean distance between them is
$$d(A,B) = \sqrt{\sum_{i=1}^{x} (a_i - b_i)^2}$$
The smaller the Euclidean distance, the more similar the two certificates; the larger the Euclidean distance, the less similar they are. The content similarity con(A, B) of certificate A and certificate B is then derived from this distance.
(4) Calculating comprehensive characteristics of the certificate sample based on the structural characteristics and the content characteristics of the certificate;
according to the structural similarity and the content similarity of the certificate A and the certificate B, calculating the overall similarity between the certificate A and the certificate B as follows
Construction of the active learning feedback certificate identification model
(1) Certificate inquiry
Network asset certificate identification and result display are performed mainly on the basis of the similarity measurement method and the comprehensive certificate features. For certificates in the certificate library that are highly similar to the asset certificate under test, the system interface can lay out the display area by country or region, application, and the like.
(2) Sample labeling
A labeled sample set is formed from the positive examples and negative examples labeled by the user and is used as the training input of the support vector machine, which helps ensure the accuracy of the identification query result.
Sample labeling is the operation that implements result feedback for the certificate identification query: the user decides how many samples to label and labels the relevant positive and negative certificate samples as required. In combination with active learning, the user can also label as positive or negative those certificates whose identification query result at this stage is highly ambiguous. Fig. 3 is a schematic structural diagram of a network asset certificate identification query system based on active learning feedback according to an embodiment of the present invention.
(3) Result feedback based on active learning
With active learning feedback, a large number of the user's unlabeled certificates are used as training samples and input into the support vector machine classifier for learning, which yields the certificate results with the greatest ambiguity; the user decides whether to label each of them as a positive example or a negative example certificate and updates the labeled sample set, so that an optimal training sample set is gradually formed and the accuracy of identification and query improves. FIG. 4 is a schematic illustration of sample labeling as disclosed in an embodiment of the present invention. FIG. 5 is a schematic diagram of an active learning model according to an embodiment of the present invention.
The support vector machine model based on active learning is executed cyclically and is described as S = (A, H, U, M, D), where A is the support vector machine classifier, H is the query function, U is the user terminal, M is the labeled sample set, and D is the unlabeled sample set.
The model execution includes:
(1) Obtaining the structural features, content features, and comprehensive features of the certificate with the similarity measurement method, and outputting the first certificate identification query result;
(2) Based on the first output result, the user selects and labels positive example samples and negative example samples to obtain training samples;
(3) Constructing a training sample set, learning from the training samples, and performing the identification query again with the classifier;
(4) From the learning feedback result, automatically calculating and outputting the set of certificates that are most ambiguous relative to the identification query result for the samples selected by the user; the user selects certificates from this ambiguous set, labels them as positive example or negative example certificates, and adds them to the labeled sample set;
(5) Repeating this process, so that an optimal classification and identification result can be obtained from a relatively small sample set;
(6) Calculating the similarity distance between the certificate to be identified and each certificate in the certificate library, and sorting by similarity to obtain the certificate identification result.
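The cyclic execution of S = (A, H, U, M, D) can be sketched as a pool-based loop with uncertainty sampling. To stay self-contained, the sketch swaps the support vector machine for a nearest-centroid classifier, which is a stand-in and not the patent's classifier; the `oracle` callable plays the role of the user U, and all names, feature values, and parameters are hypothetical.

```python
import math


def centroid(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]


def margin(x, pos_c, neg_c):
    """Signed score: > 0 leans positive, < 0 leans negative, near 0 is ambiguous."""
    return math.dist(x, neg_c) - math.dist(x, pos_c)


def fit(labeled):
    """Stand-in for classifier A: one centroid per class (needs both classes present)."""
    pos = [f for f, y in labeled if y > 0]
    neg = [f for f, y in labeled if y < 0]
    return centroid(pos), centroid(neg)


def active_learning(labeled, unlabeled, oracle, rounds=3, batch=2):
    """labeled is the set M of (features, label); unlabeled is the set D."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        pos_c, neg_c = fit(labeled)
        # Query function H: the smallest |margin| marks the most ambiguous certificates.
        pool.sort(key=lambda f: abs(margin(f, pos_c, neg_c)))
        queried, pool = pool[:batch], pool[batch:]
        # User U labels the ambiguous certificates; M grows and D shrinks.
        labeled.extend((f, oracle(f)) for f in queried)
    return labeled, fit(labeled)


seed_set = [((0.9, 0.9), 1), ((0.1, 0.1), -1)]
pool = [(0.8, 0.7), (0.2, 0.3), (0.5, 0.55), (0.45, 0.5)]
final_set, (pos_c, neg_c) = active_learning(
    seed_set, pool, oracle=lambda f: 1 if sum(f) > 1.0 else -1, rounds=2)
print(len(final_set), margin((0.95, 0.9), pos_c, neg_c) > 0)
```

Each feature vector here stands for a (structural similarity, content similarity) pair; the two certificates nearest the decision boundary are queried first, which is the behaviour step (4) describes.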
Embodiment three
Referring to fig. 6, fig. 6 is a schematic structural diagram of a network asset certificate identification apparatus according to an embodiment of the present invention. The network asset certificate identification device described in fig. 6 is applied to a network asset certificate identification system, and embodiments of the present invention are not limited thereto. As shown in fig. 6, the network asset certificate identification apparatus may include the following modules:
S301, a data acquisition module, configured to acquire certificate information of a network asset, wherein the certificate information of the network asset forms an asset certificate library;
S302, a feature extraction module, configured to process the certificate information of the network asset to obtain feature information of the network asset, wherein the feature information of the network asset forms a feature library;
S303, a model training module, configured to train a preset network asset certificate identification model by using the feature information of the network asset to obtain a trained network asset certificate identification model;
S304, a certificate identification module, configured to acquire the certificate information of the network asset to be identified, and to process the certificate information of the network asset to be identified by using the trained network asset certificate identification model to obtain a network asset certificate identification result.
Embodiment four
Referring to fig. 7, fig. 7 is a schematic structural diagram of another network asset certificate identification apparatus according to an embodiment of the present invention. The network asset certificate identification device described in fig. 7 is applied to a network asset certificate identification system, and embodiments of the present invention are not limited thereto. As shown in fig. 7, the network asset certificate identification apparatus may include:
a memory 401 storing executable program codes;
a processor 402 coupled with the memory 401;
the processor 402 invokes executable program code stored in the memory 401 for performing the steps in the network asset certificate identification method described in embodiment one or embodiment two.
The apparatus embodiments described above are merely illustrative: the modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical modules, i.e., they may be located in one place or distributed over multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above detailed description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course by means of hardware. Based on such understanding, the foregoing technical solutions may be embodied essentially or in part in the form of a software product that may be stored in a computer-readable storage medium including Read-Only Memory (ROM), random-access Memory (Random Access Memory, RAM), programmable Read-Only Memory (Programmable Read-Only Memory, PROM), erasable programmable Read-Only Memory (Erasable Programmable Read Only Memory, EPROM), one-time programmable Read-Only Memory (OTPROM), electrically erasable programmable Read-Only Memory (EEPROM), compact disc Read-Only Memory (Compact Disc Read-Only Memory, CD-ROM) or other optical disc Memory, magnetic disc Memory, tape Memory, or any other medium that can be used for computer-readable carrying or storing data.
Finally, it should be noted that the network asset certificate identification method and device disclosed in the embodiments of the invention are only preferred embodiments, used to illustrate the technical scheme of the invention and not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical schemes recorded in the various embodiments can still be modified, or some of their technical features replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims (10)

1. A method of network asset certificate identification, the method comprising:
S1, acquiring certificate information of a network asset, wherein the certificate information of the network asset forms an asset certificate library;
S2, processing the certificate information of the network asset in the asset certificate library to obtain characteristic information of the network asset, wherein the characteristic information of the network asset forms a characteristic library;
S3, training a preset network asset certificate identification model by using the characteristic information of the network asset in the characteristic library to obtain a trained network asset certificate identification model;
S4, acquiring the certificate information of the network asset to be identified, and processing the certificate information of the network asset to be identified by using the trained network asset certificate identification model to obtain a network asset certificate identification result.
2. The method for identifying network asset certificate according to claim 1, wherein the processing the certificate information of the network asset to obtain the characteristic information of the network asset comprises:
S21, performing structural similarity calculation on the certificate information of the network asset to obtain structural similarity information;
S22, performing content similarity calculation on the certificate information of the network asset to obtain content similarity information;
S23, fusing the structural similarity information and the content similarity information to obtain the characteristic information of the network asset.
3. The method for identifying network asset certificates according to claim 2, wherein the step of performing structural similarity calculation on the certificate information of the network asset to obtain structural similarity information includes:
S211, acquiring certificate information of a network asset A, wherein the number of information items of the network asset A is m;
S212, acquiring certificate information of a network asset B, wherein the number of information items of the network asset B is n;
S213, processing the certificate information of the network asset A and the certificate information of the network asset B to obtain the number x of information items contained in both the network asset A and the network asset B;
S214, processing the number m of information items of the network asset A, the number n of information items of the network asset B, and the number x of shared information items by using a structural similarity calculation model to obtain structural similarity information;
the structural similarity calculation model, a Jaccard similarity coefficient, is as follows:
$$str(A,B) = \frac{x}{m + n - x}$$
where str(A, B) is the structural similarity information.
4. The network asset certificate identification method according to claim 2, wherein the performing content similarity calculation on the certificate information of the network asset to obtain content similarity information includes:
S221, processing the certificate information of the network asset A to obtain the formalized representation of the certificate information item content of the network asset A, $\mathrm{content}_A = (a_1, a_2, \ldots, a_x)$;
S222, processing the certificate information of the network asset B to obtain the formalized representation of the certificate information item content of the network asset B, $\mathrm{content}_B = (b_1, b_2, \ldots, b_x)$;
S223, processing the formalized representation $\mathrm{content}_A$ of the network asset A and the formalized representation $\mathrm{content}_B$ of the network asset B by using a content similarity calculation model to obtain content similarity information;
The content similarity calculation model is as follows:
wherein con (A, B) is content similarity information,
5. the method for identifying network asset certificate according to claim 2, wherein the fusing the structural similarity information and the content similarity information to obtain the characteristic information of the network asset comprises:
fusing the structural similarity information and the content similarity information by using a similarity information fusion model to obtain characteristic information of the network asset;
the similarity information fusion model is as follows:
wherein total(A, B) is the characteristic information of the network asset, str(A, B) is the structural similarity information, and con(A, B) is the content similarity information.
6. The network asset certificate identification method according to claim 1, wherein training a preset network asset certificate identification model by using the characteristic information of the network asset to obtain a trained network asset certificate identification model comprises:
S31, dividing the characteristic information of the network asset to obtain labeled certificate samples and unlabeled certificate samples;
S32, training the preset network asset certificate identification model by taking the unlabeled certificate samples as training samples to obtain the trained network asset certificate identification model.
7. The method for identifying network asset certificate according to claim 1, wherein the acquiring the certificate information of the network asset to be identified, and processing the certificate information of the network asset to be identified by using the trained network asset certificate identification model to obtain the network asset certificate identification result, comprises:
S41, acquiring certificate information of the network asset to be identified;
S42, processing the certificate information of the network asset to be identified to obtain the characteristic information of the network asset to be identified;
S43, obtaining a first recognition result according to the characteristic information of the network asset to be identified;
S44, according to the first recognition result, processing the characteristic information of the network asset to be identified by using the trained network asset certificate identification model to obtain a network asset certificate identification result.
8. The network asset certificate identification method of claim 1, further comprising:
acquiring certificate information of a network asset to be identified;
identifying the certificate information of the network asset to be identified to obtain a first identification result;
labeling, according to the first recognition result, the network asset certificates that differ from the first recognition result, to obtain positive example certificates and negative example certificates;
adding the negative example certificates to the unlabeled certificate samples to form an optimal training sample set;
and training a preset network asset certificate identification model by using the optimal training sample set to obtain an optimized network asset certificate identification model.
9. A network asset certificate identification apparatus, the apparatus comprising:
the data acquisition module is used for acquiring the certificate information of the network asset, and the certificate information of the network asset forms an asset certificate library;
the feature extraction module is used for processing the certificate information of the network asset to obtain feature information of the network asset, and the feature information of the network asset forms a feature library;
the model training module is used for training a preset network asset certificate identification model by utilizing the characteristic information of the network asset to obtain a trained network asset certificate identification model;
and the certificate identification module is used for acquiring the certificate information of the network asset to be identified, and processing the certificate information of the network asset to be identified by utilizing the trained network asset certificate identification model to obtain a network asset certificate identification result.
10. A network asset certificate identification apparatus, the apparatus comprising:
A memory storing executable program code;
a processor coupled to the memory;
the processor invokes the executable program code stored in the memory to perform the network asset certificate identification method as claimed in any one of claims 1 to 8.
CN202311345134.2A 2023-10-17 2023-10-17 Network asset certificate identification method and device Pending CN117375845A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311345134.2A CN117375845A (en) 2023-10-17 2023-10-17 Network asset certificate identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311345134.2A CN117375845A (en) 2023-10-17 2023-10-17 Network asset certificate identification method and device

Publications (1)

Publication Number Publication Date
CN117375845A (en) 2024-01-09

Family

ID=89388668

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202311345134.2A Pending CN117375845A (en) 2023-10-17 2023-10-17 Network asset certificate identification method and device

Country Status (1)

Country Link
CN (1) CN117375845A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106031086A (en) * 2014-02-20 2016-10-12 菲尼克斯电气公司 Method and system for creating and checking the validity of device certificates
WO2018107994A1 (en) * 2016-12-13 2018-06-21 阿里巴巴集团控股有限公司 Method and device for allocating augmented reality-based virtual objects
CN111444908A (en) * 2020-03-25 2020-07-24 腾讯科技(深圳)有限公司 Image recognition method, device, terminal and storage medium
CN114579832A (en) * 2020-11-30 2022-06-03 厦门美亚商鼎信息科技有限公司 Website digital certificate identification method and system based on decision tree
US20220272115A1 (en) * 2021-02-22 2022-08-25 Tenable, Inc. Predicting cyber risk for assets with limited scan information using machine learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
邱云飞: "基于网络结构和文本内容的群体画像构建方法研究", 图书情报工作, vol. 63, no. 22, 30 November 2019 (2019-11-30), pages 3 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination