CN116401719A - Method for positioning hardware Trojan horse in gate-level netlist based on machine learning - Google Patents

Info

Publication number
CN116401719A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310395996.XA
Other languages
Chinese (zh)
Inventor
王泉
黄钊
周丽榕
谢昌健
李泽宇
王骏君
刘锦辉
樊璐
刘潇
万波
李少峰
吴自力
田玉敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/30 Circuit design
    • G06F30/32 Circuit design at the digital level
    • G06F30/327 Logic synthesis; Behaviour synthesis, e.g. mapping logic, HDL to netlist, high-level language to RTL or netlist

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Mathematical Physics (AREA)
  • Geometry (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a machine-learning-based method for detecting and locating a hardware Trojan in a gate-level netlist, which mainly addresses the low accuracy and efficiency of hardware Trojan localization in the prior art and its reliance on an ideal model as a reference. The scheme is as follows: divide each integrated circuit in the sample into a plurality of maximum single-output sub-modules, extract their feature vectors, and construct a data set; train existing machine learning models with cross-validation to obtain classifiers; use the classifiers to perform Trojan detection on the integrated circuit under test; and locate the Trojan in each maximum single-output sub-module detected as containing a hardware Trojan, using a Trojan search method based on layer-by-layer difference analysis. Because machine learning is performed with the maximum single-output sub-module as the unit, classifier performance and Trojan detection accuracy are markedly improved; because maximum single-output sub-modules are compared and analyzed, the accuracy and efficiency of locating the Trojan circuit in the gate-level netlist are improved. The method can be used for hardware Trojan protection in the gate-level netlist design of integrated circuits.

Description

Method for positioning hardware Trojan horse in gate-level netlist based on machine learning
Technical Field
The invention belongs to the technical field of integrated circuits, and particularly relates to a method for detecting and locating a hardware Trojan in a gate-level netlist, which can be used to protect against hardware Trojans at the gate-level netlist design stage of an integrated circuit.
Background
A hardware Trojan is a malicious circuit that can be implanted at any stage of the design and manufacturing process of an integrated circuit. Its practical use has already affected key fields such as mobile communication, medical care, aerospace, and civil infrastructure, and even national security. Currently, protection measures against hardware Trojans focus mainly on nondestructive detection, whose essential idea is to use the change in certain characteristics after implantation to determine whether a hardware Trojan has been implanted in a given integrated circuit. Depending on the stage from which the integrated-circuit characteristics are taken, nondestructive hardware Trojan detection can be classified into dynamic detection and static detection.
Dynamic detection determines whether a hardware Trojan has been implanted by observing characteristics of the integrated circuit during operation, for example side-channel parameters such as power consumption and path delay. Under the influence of a hardware Trojan, certain run-time characteristics of the integrated circuit change noticeably and are easy to observe, so dynamic detection can generally achieve higher detection accuracy. However, dynamic detection generally requires a carefully selected set of test vectors, which is difficult to construct and time-consuming when the integrated circuit has many input pins.
Static detection determines whether a hardware Trojan has been implanted by extracting features of the integrated circuit at the design stage, such as fan-in and the number of ring structures. Static detection does not require a complete integrated-circuit design, which facilitates stage-by-stage and module-by-module hardware Trojan detection. Furthermore, static detection does not require actually running the integrated circuit, and therefore needs no test vectors. However, the time needed to extract most design-stage features is generally directly related to circuit scale, and it is difficult to ensure that these features correlate with hardware Trojans, which results in lower detection accuracy.
Patent document CN 110287735A discloses a method for identifying Trojan-infected circuits based on chip netlist features, which extracts node SCOAP metric values, detects a suspicious node set with k-means++ clustering, corrects the suspicious node set using the topology of the chip netlist, and recovers Trojan trigger nodes through node reachability analysis. The method does not consider Trojans whose trigger nodes mix rare and common nodes, so Trojans with many common nodes are easily missed; it is also time-consuming and has low accuracy.
Patent document CN 114065308A discloses a deep-learning-based gate-level hardware Trojan localization method and system, which extracts gate-level netlist information, constructs a feature path set, and performs detection and localization with TextCNN. Its drawbacks are that constructing the path features takes a long time and that Trojan path features are not clearly distinguishable from ordinary path features, so Trojan detection accuracy is low.
Patent document CN 109740348A discloses a machine-learning-based hardware Trojan localization method, which extracts gate-level netlist features, classifies hardware Trojan types, and performs detection and localization with a one-class SVM and a BPNN respectively. With this method it is difficult to classify hardware Trojan types in the preprocessing stage, and Trojan nets are easily mislocated.
In summary, existing hardware Trojan detection and localization methods still suffer from low localization accuracy and efficiency and from the need for an ideal model as a reference.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a machine-learning-based method for locating hardware Trojans in a gate-level netlist, which improves the accuracy and efficiency of hardware Trojan localization without requiring an ideal model as a reference.
To achieve the above purpose, the technical scheme of the invention comprises the following steps:
(1) Divide each integrated circuit in the sample into a plurality of maximum single-output sub-modules;
(2) Extract features with each maximum single-output sub-module of the integrated circuit as a unit to form a data set, and divide the data set into a training set and a test set at a ratio of 7:3;
(3) Train machine learning models using cross-validation to obtain trained classifiers;
(4) Select a gate-level netlist to be tested, perform Trojan detection on it, and output the detection result;
(5) Judge whether the output of step (4) contains a hardware Trojan:
if no hardware Trojan is contained, the localization is complete;
otherwise, execute step (6);
(6) Locate the detected hardware Trojan:
(6a) Denote the current gate-level netlist as C and the corresponding Trojan-free golden design as C', divide C and C' into a plurality of maximum single-output sub-modules, and extract the feature vector of each maximum single-output sub-module;
(6b) For a maximum single-output sub-module a of C that was detected in step (4) as containing a Trojan, find the maximum single-output sub-module of C' closest to a according to the Euclidean distance between feature vectors, and denote it a';
(6c) Perform a Trojan search based on layer-by-layer difference analysis on a and a' to obtain several Trojan regions;
(6d) Perform steps (6b) to (6c) on all maximum single-output sub-modules detected in step (4) to obtain the Trojan regions.
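Step (6b) amounts to a nearest-neighbour lookup over feature vectors. A minimal sketch, assuming the feature vectors of the sub-modules of C' are stored in a dictionary keyed by name; the function name `closest_submodule`, the sub-module names, and the ten-element vectors are hypothetical values chosen for illustration, not data from the invention:

```python
import math

def closest_submodule(vec_a, golden_vectors):
    """Return the name of the sub-module of C' whose feature vector is nearest to vec_a."""
    return min(golden_vectors, key=lambda name: math.dist(vec_a, golden_vectors[name]))

# Hypothetical feature vectors (ten features per sub-module).
vec_a = [3, 4, 1, 1, 5, 0, 9, 8, 8, 0]
golden_vectors = {
    "x": [3, 4, 1, 1, 3, 0, 6, 5, 5, 0],
    "y": [1, 1, 1, 1, 1, 0, 2, 1, 1, 0],
}
a_prime = closest_submodule(vec_a, golden_vectors)
print(a_prime)
```

Here the sub-module named "x" is selected as a', because its vector differs from that of a in only four of the ten features.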
Compared with the prior art, the invention has the following advantages:
First, the invention divides the integrated circuit under test into a plurality of maximum single-output sub-modules, which further subdivides the regions where different logic cones overlap in the traditional logic-cone partitioning of integrated circuits. This simplifies the operations involved in detecting and locating hardware Trojans and improves the efficiency of both.
Second, in hardware Trojan detection, because the gate-level netlist of the integrated circuit is divided into a plurality of maximum single-output sub-modules that are mutually independent during detection, the sub-modules can be checked in parallel, which effectively shortens the time required to check a large-scale integrated circuit. Moreover, because the data set is built with the maximum single-output sub-module rather than the whole gate-level netlist as the unit, the data set becomes tens of times larger, the performance of the trained classifiers improves markedly, and Trojan detection accuracy improves accordingly.
Third, in hardware Trojan localization, because the gate-level netlist of the integrated circuit is divided into a plurality of maximum single-output sub-modules, most of the logic gates and signal lines in each Trojan region finally located belong to the hardware Trojan, which improves the accuracy of Trojan localization.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is a schematic diagram of the division into maximum single-output sub-modules in the present invention;
FIG. 3 is a schematic diagram of the Trojan search based on layer-by-layer difference analysis in the present invention.
Detailed Description
Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Referring to fig. 1, the implementation steps of this example are as follows:
Step 1. Acquire integrated circuit samples and divide each integrated circuit in the sample into a plurality of maximum single-output sub-modules.
(1.1) Select the gate-level netlists of a plurality of integrated circuits as the sample, comprising several Trojan-free "golden designs" and, for each golden design, the versions obtained by implanting different Trojans;
(1.2) Divide each gate-level netlist in the sample into a plurality of maximum single-output sub-modules:
(1.2.1) Abstract the gate-level netlist into a directed graph, taking each logic gate as a vertex and each signal-line branch as a directed edge, that is, an edge whose start point is an input pin of a logic gate and whose end point is an output pin of a logic gate; the gate to which each primary output pin of the gate-level netlist belongs corresponds to a "converging node" of the directed graph, and these nodes form the converging node set T;
(1.2.2) Perform a breadth-first traversal of the maximum single-output sub-module that starts from a converging node t in the converging node set, and for each node i reached during the traversal judge whether all output nodes connected to node i belong to this maximum single-output sub-module:
if yes, add node i to the maximum single-output sub-module and execute step (1.2.3);
otherwise, treat node i as a converging node, add it to the converging node set T, and execute step (1.2.3);
(1.2.3) Repeat step (1.2.2) for all nodes in the maximum single-output sub-module that starts from converging node t until all nodes have been traversed, which forms the maximum single-output sub-module whose vertex is the logic gate corresponding to converging node t;
(1.2.4) Repeat steps (1.2.2) to (1.2.3) for all converging nodes in the converging node set T, so that the gate-level netlist is divided into a plurality of maximum single-output sub-modules.
(1.3) Perform step (1.2) on all gate-level netlists in the sample to obtain the maximum single-output sub-modules of the sample.
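The division in step (1.2) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation of the invention: the netlist is assumed to be given as a hypothetical `fanin` dictionary (node to the nodes that drive it), primary inputs are treated as ordinary nodes, and the breadth-first growth is written as a repeat-until-stable loop so that the result does not depend on visiting order. The function name `partition` and the small example circuit are invented for illustration.

```python
from collections import deque

def partition(fanin, output_gates):
    """Split a netlist graph into maximum single-output sub-modules.

    fanin maps a node to the nodes driving it; output_gates are the gates that
    own the primary output pins (the initial converging nodes).
    """
    # Derive fan-out lists (node -> nodes it drives) from the fan-in lists.
    fanout = {}
    for node, drivers in fanin.items():
        fanout.setdefault(node, [])
        for d in drivers:
            fanout.setdefault(d, []).append(node)

    roots = set(output_gates)            # converging node set T
    pending = deque(output_gates)
    modules = {}
    while pending:
        t = pending.popleft()
        members = {t}
        grew = True
        while grew:                      # repeat until no further node can be added
            grew = False
            for m in list(members):
                for d in fanin.get(m, []):
                    if d in members or d in roots:
                        continue
                    # A driver joins only if every node it drives is already inside.
                    if all(s in members for s in fanout[d]):
                        members.add(d)
                        grew = True
        # Drivers left outside also feed other sub-modules: new converging nodes.
        for m in sorted(members):
            for d in fanin.get(m, []):
                if d not in members and d not in roots:
                    roots.add(d)
                    pending.append(d)
        modules[t] = members
    return modules

# Hypothetical circuit: G2 drives gates in two cones, PI4 drives G2 and G5.
fanin = {
    "G1": ["PI1", "PI2"], "G2": ["PI3", "PI4"], "G3": ["G1"], "G4": ["G3", "G2"],
    "G5": ["PI4", "PI5"], "G6": ["G2", "G5"], "G7": ["PI6"], "G8": ["G6", "G7"],
}
modules = partition(fanin, ["G4", "G8"])
for root, members in modules.items():
    print(root, sorted(members))
```

In this example G2 and PI4 each drive nodes in more than one sub-module, so they become converging nodes of their own, giving four sub-modules rooted at G4, G8, G2, and PI4.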
Step 2. Extract features from the maximum single-output sub-modules of the integrated circuits to form a data set, and divide the data set into a training set and a test set.
(2.1) Extract the static structural features of each maximum single-output sub-module and construct a feature vector; append a label to the tail of each feature vector according to whether the sub-module contains a hardware Trojan, where 1 indicates that it does and 0 that it does not; and merge the feature vectors of all maximum single-output sub-modules into a matrix;
(2.2) Perform step (2.1) on the maximum single-output sub-modules of all gate-level netlists in the sample to obtain several matrices, merge the matrices by rows, and remove duplicate rows; the resulting matrix is the data set;
(2.3) Divide the data set into a training part and a test part at a ratio of 7:3.
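Steps (2.2) and (2.3) can be sketched as below. This is an illustrative sketch only: the function names, the fixed random seed, and the shortened rows (two features plus a label instead of ten features plus a label) are assumptions made to keep the example small.

```python
import random

def build_dataset(labelled_matrices):
    """Merge per-netlist matrices by rows and drop duplicate rows, keeping first occurrences."""
    seen, dataset = set(), []
    for matrix in labelled_matrices:
        for row in matrix:
            key = tuple(row)
            if key not in seen:
                seen.add(key)
                dataset.append(list(row))
    return dataset

def split_7_3(dataset, seed=0):
    """Shuffle the rows and cut them into a 7:3 training/test split."""
    rows = dataset[:]
    random.Random(seed).shuffle(rows)
    cut = (len(rows) * 7 + 5) // 10      # 70 % of the rows, rounded to the nearest integer
    return rows[:cut], rows[cut:]

# Hypothetical rows: [feature_1, feature_2, label]; label 1 marks a Trojan sub-module.
matrix_1 = [[i, 2 * i, 0] for i in range(6)]
matrix_2 = [[i, 2 * i, 1] for i in range(6, 10)] + [[5, 10, 0]]   # last row duplicates matrix_1
dataset = build_dataset([matrix_1, matrix_2])
train, test = split_7_3(dataset)
print(len(dataset), len(train), len(test))
```

The duplicate row is dropped, leaving ten rows that are split into seven training rows and three test rows.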
Step 3. Train the machine learning models to obtain trained classifiers.
The K-nearest-neighbor, decision tree, and naive Bayes classifiers are trained using cross-validation. The parameters of each classifier are adjusted according to its cross-validation score to achieve the best effect, which yields the trained classifiers.
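The idea of tuning a parameter by cross-validation score can be sketched for one of the three classifiers. The sketch below uses a hand-written K-nearest-neighbor classifier and a simple interleaved 5-fold split; the function names, the candidate values of k, the fold scheme, and the toy two-dimensional data are all assumptions for illustration, and the source does not state which parameters were tuned or how the folds were formed.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, x, k):
    """Majority label among the k training points nearest to x."""
    nearest = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def cross_val_score(xs, ys, k, folds=5):
    """Accuracy of k-NN averaged over interleaved folds (every folds-th sample is held out)."""
    correct = 0
    for f in range(folds):
        held_out = set(range(f, len(xs), folds))
        tr_x = [x for i, x in enumerate(xs) if i not in held_out]
        tr_y = [y for i, y in enumerate(ys) if i not in held_out]
        correct += sum(knn_predict(tr_x, tr_y, xs[i], k) == ys[i] for i in held_out)
    return correct / len(xs)

def tune_k(xs, ys, candidates=(1, 3, 5)):
    """Pick the candidate k with the highest cross-validation score."""
    return max(candidates, key=lambda k: cross_val_score(xs, ys, k))

# Toy data: two well-separated clusters standing in for Trojan-free (0) and Trojan (1) rows.
xs = [[i, i] for i in range(10)] + [[20 + i, 20 + i] for i in range(10)]
ys = [0] * 10 + [1] * 10
best_k = tune_k(xs, ys)
print(best_k, cross_val_score(xs, ys, best_k))
```

The same loop applies to the decision tree and naive Bayes classifiers by swapping the prediction function and the parameter being searched.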
Step 4. Divide the gate-level netlist of the integrated circuit under test into a plurality of maximum single-output sub-modules.
Referring to fig. 2, this step is implemented as follows:
(4.1) Taking each logic gate as a vertex and each signal-line branch as a directed edge, abstract the gate-level netlist into a directed graph; the logic gate G4 to which primary output pin PO1 belongs and the logic gate G8 to which primary output pin PO2 belongs correspond to two converging nodes t1 and t2 of the directed graph, which form the converging node set T;
(4.2) Perform a breadth-first traversal of the maximum single-output sub-module that starts from converging node t1. When the traversal reaches logic gate G2, one output node of G2, namely G6, does not belong to the current maximum single-output sub-module, so G2 is added to the converging node set T as converging node t3 and the breadth-first traversal continues. When the traversal ends, the primary input pins PI1 to PI3, the primary output pin PO1, the signal lines W1 to W2, and the logic gates G1, G3, and G4 reached from converging node t1 are grouped into maximum single-output sub-module a;
(4.3) Perform a breadth-first traversal of the maximum single-output sub-module that starts from converging node t2. When the traversal reaches primary input pin PI4, one output node of PI4, namely G2, does not belong to the current maximum single-output sub-module, so PI4 is added to the converging node set T as converging node t4 and the traversal continues. When the traversal ends, the primary input pins PI5 to PI6, the primary output pin PO2, the signal lines W4 to W6, and the logic gates G5 to G8 reached from converging node t2 are grouped into maximum single-output sub-module b;
(4.4) Perform a breadth-first traversal of the maximum single-output sub-module that starts from converging node t3. When the traversal ends, the primary input pin PI3 and the logic gate G2 reached from this converging node are grouped into maximum single-output sub-module c;
(4.5) Perform a breadth-first traversal of the maximum single-output sub-module that starts from converging node t4. Because the traversal from PI4 is empty, only the primary input pin PI4 is grouped into maximum single-output sub-module d.
Step 5. Extract the feature vectors of the maximum single-output sub-modules a, b, c, and d obtained by traversing from the four converging nodes, and form a feature matrix from them.
(5.1) Traverse the four maximum single-output sub-modules a, b, c, and d, and for each compute 10 hardware-Trojan-related features: the number of primary inputs, the number of primary input branches, the number of primary outputs, the number of primary output branches, the number of logic gates, the number of flip-flops, the number of signal lines, the total fan-in, the total fan-out, and the number of loops. The definition of each feature is given in Table 1:
Table 1 Definitions of the hardware-Trojan-related features

| Feature name | Description |
| --- | --- |
| Number of primary inputs | Number of primary input pins contained in the maximum single-output sub-module |
| Number of primary input branches | Number of logic gate input pins connected to the primary input pins in the maximum single-output sub-module |
| Number of primary outputs | Number of primary output pins contained in the maximum single-output sub-module |
| Number of primary output branches | Number of logic gate input pins connected to the primary output pins in the maximum single-output sub-module |
| Number of logic gates | Number of basic logic gates contained in the maximum single-output sub-module |
| Number of flip-flops | Number of flip-flop-type logic gates contained in the maximum single-output sub-module |
| Number of signal lines | Number of interconnects contained in the maximum single-output sub-module |
| Total fan-in | Sum of the numbers of input logic gates of all logic gates in the maximum single-output sub-module |
| Total fan-out | Sum of the numbers of output logic gates of all logic gates in the maximum single-output sub-module |
| Number of loops | Number of loops (i.e., simple cycles in the directed graph) contained in the maximum single-output sub-module |
(5.2) Arrange the 10 hardware-Trojan-related features of each of the four maximum single-output sub-modules a, b, c, and d in order to form their feature vectors, as shown in Table 2:
Table 2 Feature vector values of the different maximum single-output sub-modules
[Table 2 appears as an image in the original publication: Figure BDA0004177673000000061]
(5.3) Combine the feature vectors of the maximum single-output sub-modules a, b, c, and d to form the feature matrix Matrix, expressed as follows:
[The feature matrix appears as an image in the original publication: Figure BDA0004177673000000062]
Step 6. Feed the feature matrix Matrix to the classifiers to predict the label vectors of the four maximum single-output sub-modules a, b, c, and d obtained from the four converging nodes.
(6.1) Input the feature matrix Matrix into the three classifiers, namely K-nearest-neighbor, decision tree, and naive Bayes; each classifier predicts a four-dimensional label column vector, giving three different four-dimensional label column vectors;
(6.2) Form the label vector v1 of maximum single-output sub-module a from the first components of the three four-dimensional label column vectors, the label vector v2 of sub-module b from the second components, the label vector v3 of sub-module c from the third components, and the label vector v4 of sub-module d from the fourth components, expressed as follows:
v1 = (1 1 0),
v2 = (0 0 0),
v3 = (0 0 0),
v4 = (0 0 0).
Step 7. Detect hardware Trojans in the gate-level netlist under test according to the label vectors of the four maximum single-output sub-modules, and output the hardware Trojan set M.
A maximum single-output sub-module is judged to contain a hardware Trojan if at least one label in its label vector is 1. The hardware Trojan set M is initialized to empty:
(7.1) Judge the label vectors of the four maximum single-output sub-modules obtained from the four converging nodes of the gate-level netlist under test:
the label vector v1 of maximum single-output sub-module a contains two labels equal to 1, so a is considered to contain a hardware Trojan and is added to the hardware Trojan set M; the label vector of b is judged next;
all labels in the label vector v2 of maximum single-output sub-module b are 0, so b is considered free of hardware Trojans; the label vector of c is judged next;
all labels in the label vector v3 of maximum single-output sub-module c are 0, so c is considered free of hardware Trojans; the label vector of d is judged next;
all labels in the label vector v4 of maximum single-output sub-module d are 0, so d is considered free of hardware Trojans; step (7.2) is executed next.
(7.2) Output the hardware Trojan set M.
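Steps 6 and 7 reduce to transposing the classifier outputs and applying an "at least one vote" rule. A minimal sketch, assuming each classifier returns one 0/1 label per sub-module; the function names are hypothetical, and the assignment of the individual votes to particular classifiers is an assumption, since the text gives only the resulting vectors v1 to v4:

```python
def label_vectors(predictions):
    """Turn per-classifier label columns into one label vector per sub-module."""
    return [tuple(votes) for votes in zip(*predictions)]

def detect(names, predictions):
    """Return the hardware Trojan set M: sub-modules flagged by at least one classifier."""
    return [name for name, v in zip(names, label_vectors(predictions)) if any(v)]

names = ["a", "b", "c", "d"]
# One label column per classifier (which classifier cast which vote is assumed here).
knn_labels = [1, 0, 0, 0]
tree_labels = [1, 0, 0, 0]
bayes_labels = [0, 0, 0, 0]

predictions = [knn_labels, tree_labels, bayes_labels]
vectors = label_vectors(predictions)
M = detect(names, predictions)
print(vectors)
print(M)
```

With these columns the label vector of a is (1, 1, 0) and the other three are all zeros, so M contains only a, matching the example above.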
Step 8. Perform Trojan localization on the maximum single-output sub-module a, which is the sub-module in the hardware Trojan set M that contains a hardware Trojan.
Referring to fig. 3, this step is implemented as follows:
(8.1) Denote the current gate-level netlist as C and the corresponding Trojan-free golden design as C', divide C' into a plurality of maximum single-output sub-modules, and extract the feature vector of each maximum single-output sub-module;
(8.2) Find the maximum single-output sub-module of C' closest to a according to the Euclidean distance between feature vectors, and denote it a';
(8.3) Perform a Trojan search based on layer-by-layer difference analysis on a and a' to obtain several Trojan regions:
(8.3.1) Compare layer 1: in layer 1, a and a' each contain only logic gate G5, and both gates are of the same type, a two-input OR gate, so the traversal continues to the logic gates at the next position to obtain layer 2;
(8.3.2) Compare layer 2: in layer 2 the logic gates of a are G4 and T3 and the logic gates of a' are G4 and G3. The logic gates G4 of a and a' are both two-input OR gates, but logic gate T3 of a is a two-input AND gate while logic gate G3 of a' is an inverter. Logic gate T3 of a is therefore recorded as a Trojan output gate, T3 of a and G3 of a' are eliminated, and the comparison continues with the logic gates at the next position to obtain layer 3;
(8.3.3) Compare layer 3: in layer 3, a and a' both contain logic gates G1 and G2; the two G1 gates are both two-input AND gates and the two G2 gates are both two-input NAND gates, so the traversal continues to the logic gates at the next position to obtain layer 4;
(8.3.4) Compare layer 4: layer 4 of both a and a' is empty, so the traversal ends;
(8.3.5) Starting from the only Trojan output gate T3, perform a breadth-first traversal of depth 8 to obtain the only Trojan region D in a, which contains logic gates T3, T2, T1, and G3 and primary input pins PI5 and PI6.
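The walk-through in step (8.3) can be sketched as follows. The source does not specify how gates within a layer are paired, so this sketch assumes they are matched by gate type; the function names, the wiring of the two sub-modules, and the types of T1 and T2 are hypothetical and chosen only so that the layers match the example above.

```python
from collections import Counter, deque

def next_layer(layer, fanin, gate_type, seen):
    """Logic gates (not primary inputs) that drive the gates of the current layer."""
    result = []
    for g in layer:
        for d in fanin.get(g, []):
            if d in gate_type and d not in seen:
                seen.add(d)
                result.append(d)
    return result

def match_by_type(layer, gate_type, available):
    """Split a layer into gates whose type is still available and the rest."""
    matched, unmatched = [], []
    for g in layer:
        if available[gate_type[g]] > 0:
            available[gate_type[g]] -= 1
            matched.append(g)
        else:
            unmatched.append(g)
    return matched, unmatched

def trojan_output_gates(root_a, fanin_a, type_a, root_g, fanin_g, type_g):
    """Compare a and a' layer by layer; gates of a with no type match are suspects."""
    suspects = []
    layer_a, layer_g = [root_a], [root_g]
    seen_a, seen_g = {root_a}, {root_g}
    while layer_a or layer_g:
        kept_a, extra_a = match_by_type(layer_a, type_a, Counter(type_g[g] for g in layer_g))
        kept_g, _ = match_by_type(layer_g, type_g, Counter(type_a[g] for g in kept_a))
        suspects.extend(extra_a)         # unmatched gates on both sides are eliminated
        layer_a = next_layer(kept_a, fanin_a, type_a, seen_a)
        layer_g = next_layer(kept_g, fanin_g, type_g, seen_g)
    return suspects

def trojan_region(start, fanin, depth):
    """Depth-limited breadth-first traversal from a Trojan output gate toward the inputs."""
    region, queue = {start}, deque([(start, 0)])
    while queue:
        node, d = queue.popleft()
        if d == depth:
            continue
        for drv in fanin.get(node, []):
            if drv not in region:
                region.add(drv)
                queue.append((drv, d + 1))
    return region

# Hypothetical wiring of a (with Trojan gates T1 to T3) and of its golden counterpart a'.
type_a = {"G5": "OR2", "G4": "OR2", "T3": "AND2", "G3": "INV", "T2": "INV",
          "T1": "AND2", "G1": "AND2", "G2": "NAND2"}
fanin_a = {"G5": ["G4", "T3"], "G4": ["G1", "G2"], "T3": ["G3", "T2"], "G3": ["PI5"],
           "T2": ["T1"], "T1": ["PI5", "PI6"], "G1": ["PI1", "PI2"], "G2": ["PI3", "PI4"]}
type_g = {"G5": "OR2", "G4": "OR2", "G3": "INV", "G1": "AND2", "G2": "NAND2"}
fanin_g = {"G5": ["G4", "G3"], "G4": ["G1", "G2"], "G3": ["PI5"],
           "G1": ["PI1", "PI2"], "G2": ["PI3", "PI4"]}

suspects = trojan_output_gates("G5", fanin_a, type_a, "G5", fanin_g, type_g)
region = trojan_region(suspects[0], fanin_a, depth=8)
print(suspects, sorted(region))
```

With this wiring the only suspect is T3, and the depth-8 traversal from it collects T3, T2, T1, G3, PI5, and PI6, the same region D as in step (8.3.5).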
The above is only one specific example of the invention and does not limit it in any way. It will be apparent to those skilled in the art that various modifications and changes in form and detail may be made without departing from the principle and structure of the invention, but such modifications and changes based on the idea of the invention remain within the scope of the claims.

Claims (6)

1. A machine-learning-based method for locating a hardware Trojan in a gate-level netlist, characterized by comprising the following steps:
(1) Divide each integrated circuit in the sample into a plurality of maximum single-output sub-modules;
(2) Extract features with each maximum single-output sub-module of the integrated circuit as a unit to form a data set, and divide the data set into a training set and a test set at a ratio of 7:3;
(3) Train machine learning models using cross-validation to obtain trained classifiers;
(4) Select a gate-level netlist to be tested, perform Trojan detection on it, and output the detection result;
(5) Judge whether the output of step (4) contains a hardware Trojan:
if no hardware Trojan is contained, the detection is complete;
otherwise, execute step (6);
(6) Locate the detected hardware Trojan:
(6a) Denote the current gate-level netlist as C and the corresponding Trojan-free golden design as C', divide C and C' into a plurality of maximum single-output sub-modules, and extract the feature vector of each maximum single-output sub-module;
(6b) For a maximum single-output sub-module a of C that was detected in step (4) as containing a Trojan, find the maximum single-output sub-module of C' closest to a according to the Euclidean distance between feature vectors, and denote it a';
(6c) Perform a Trojan search based on layer-by-layer difference analysis on a and a' to obtain several Trojan regions;
(6d) Perform steps (6b) to (6c) on all maximum single-output sub-modules detected in step (4) to obtain the Trojan regions.
2. The method of claim 1, wherein dividing the integrated circuit into a plurality of maximum single-output sub-modules in step (1) comprises the following steps:
1a) Abstract the gate-level netlist into a directed graph, taking each logic gate as a vertex and each signal-line branch as a directed edge, that is, an edge whose start point is an input pin of a logic gate and whose end point is an output pin of a logic gate, wherein the gate to which each primary output pin of the gate-level netlist belongs corresponds to a "converging node" of the directed graph, and these nodes form the converging node set T;
1b) Perform a breadth-first traversal of the maximum single-output sub-module that starts from a converging node t in the converging node set, and for each node i reached during the traversal judge whether all output nodes connected to node i belong to this maximum single-output sub-module:
if yes, add node i to the maximum single-output sub-module and execute step 1c);
otherwise, treat node i as a converging node, add it to the converging node set T, and execute step 1d);
1c) Repeat step 1b) for all nodes in the maximum single-output sub-module that starts from converging node t until all nodes have been traversed, which forms the maximum single-output sub-module whose vertex is the logic gate corresponding to converging node t;
1d) Repeat steps 1b) to 1c) for all converging nodes in the converging node set T, so that the gate-level netlist is divided into a plurality of maximum single-output sub-modules.
3. The method of claim 1, wherein the feature extraction in step (2), performed per maximum single-output sub-module of the integrated circuit to form a data set, is implemented as follows:
2a) Selecting the gate-level netlists of a plurality of integrated circuits as samples, the samples comprising a plurality of golden designs with no Trojan horse implanted and, for each golden design, multiple versions in which different Trojan horses have been implanted;
2b) Dividing each gate-level netlist in the samples into a plurality of maximum single-output sub-modules, extracting the static structural features of each maximum single-output sub-module to construct a feature vector, appending a label to the end of the feature vector according to whether the sub-module contains a hardware Trojan horse, where 1 indicates presence and 0 indicates absence, and merging the feature vectors of all the maximum single-output sub-modules into a matrix;
2c) Executing step 2b) on all gate-level netlists in the samples to obtain a plurality of matrices, merging the matrices by rows, and removing duplicate rows, the resulting matrix being the data set.
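Steps 2b) and 2c) amount to labelling, stacking, and de-duplicating rows. A minimal sketch, assuming each netlist has already been partitioned and reduced to `(feature_vector, has_trojan)` pairs; the function name, the input format, and the numbers are illustrative only:

```python
def build_dataset(netlists):
    """Merge labelled feature vectors from several netlists into one data set.

    netlists: list of netlists, each a list of (features, has_trojan) pairs,
    one pair per maximum single-output sub-module (hypothetical format).
    """
    seen = set()
    dataset = []
    for submodules in netlists:
        for features, has_trojan in submodules:
            # Step 2b: append the label (1 = Trojan present, 0 = absent).
            row = tuple(features) + (1 if has_trojan else 0,)
            # Step 2c: merge by rows and drop duplicate rows.
            if row not in seen:
                seen.add(row)
                dataset.append(row)
    return dataset

# Made-up feature values: a golden design and one Trojan-implanted version.
golden = [((2, 3, 1), False), ((4, 5, 2), False)]
infected = [((2, 3, 1), False), ((6, 9, 4), True)]
example_dataset = build_dataset([golden, infected])
```

The sub-module shared unchanged by both versions contributes a single row, so only the Trojan-bearing sub-module adds a new sample.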
4. The method of claim 1, wherein performing hardware Trojan horse detection and outputting a detection result in step (4) is implemented as follows:
4a) Selecting a gate-level netlist and denoting its hardware Trojan horse set as M, M being initially empty;
4b) Dividing the gate-level netlist into a plurality of maximum single-output sub-modules, extracting the feature vector of each maximum single-output sub-module, and forming a feature matrix from the feature vectors of all the maximum single-output sub-modules;
4c) Inputting the feature matrix into a K-nearest-neighbor classifier, a decision tree classifier, and a naive Bayes classifier respectively, each of which predicts a corresponding label vector;
4d) For each maximum single-output sub-module, judging whether at least one of the labels predicted for it by the classifiers is 1:
if yes, considering the sub-module to be implanted with a hardware Trojan horse, adding it to the hardware Trojan horse set M, and proceeding to judge the next maximum single-output sub-module;
otherwise, proceeding to judge the next maximum single-output sub-module;
4e) Repeating step 4d) for all the maximum single-output sub-modules in the gate-level netlist until every sub-module has been judged, and then judging whether the hardware Trojan horse set M of the gate-level netlist is empty:
if yes, reporting that the gate-level netlist contains no hardware Trojan horse;
otherwise, outputting the hardware Trojan horse set M.
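The decision rule in steps 4d) and 4e) is a logical OR across the three classifiers. The sketch below assumes the three label vectors from step 4c) are already available (for instance from scikit-learn's `KNeighborsClassifier`, `DecisionTreeClassifier` and `GaussianNB`, which is an assumption, since the claim names no library). The function names and example data are hypothetical.

```python
def collect_trojan_submodules(submodule_ids, label_vectors):
    """Return the hardware Trojan set M for one gate-level netlist.

    submodule_ids: names of the maximum single-output sub-modules.
    label_vectors: one predicted 0/1 label vector per classifier, each
    aligned with submodule_ids (hypothetical format).
    """
    trojan_set = []                               # step 4a: M starts empty
    for idx, sub_id in enumerate(submodule_ids):
        # Step 4d: flag the sub-module if at least one classifier predicts 1.
        if any(labels[idx] == 1 for labels in label_vectors):
            trojan_set.append(sub_id)
    return trojan_set

def detection_report(trojan_set):
    # Step 4e: an empty M means no hardware Trojan was found.
    if not trojan_set:
        return "no hardware Trojan detected"
    return "suspect sub-modules: " + ", ".join(trojan_set)

# Made-up predictions for four sub-modules from the three classifiers.
ids = ["m0", "m1", "m2", "m3"]
knn_labels = [0, 1, 0, 0]
tree_labels = [0, 1, 0, 1]
bayes_labels = [0, 0, 0, 0]
example_M = collect_trojan_submodules(ids, [knn_labels, tree_labels, bayes_labels])
example_report = detection_report(example_M)
```

Taking the union of the three predictions favors recall over precision: one positive vote is enough to flag a sub-module.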
5. The method according to claim 1, wherein the Trojan horse search based on layer-by-layer difference analysis performed on a and a' in step (6c) is implemented as follows:
6c1) Taking the corresponding vertices as starting points, traversing the directed graphs of a and a' breadth-first respectively, comparing in order, during the breadth-first search (BFS), the logic gate sequences visited in a and a' at the same traversal depth, and denoting the logic gates at the same position in the two sequences as g and g' respectively;
6c2) Comparing whether the types of g and g' are the same:
if the types of g and g' are different, first recording g as a Trojan output gate, excluding g and g' from the traversal of a and a' at the next depth, and then continuing to compare the types of the logic gates g and g' at the next position;
if the types of g and g' are the same, directly continuing to compare the types of the logic gates g and g' at the next position;
6c3) Repeating step 6c2) until all logic gates of a or a' have been traversed, thereby obtaining a plurality of Trojan output gates;
6c4) Performing a breadth-first traversal from each Trojan output gate, and assigning all non-repeated logic gates, signal lines, main input pins and main output pins visited during the traversal to one Trojan region;
6c5) Executing step 6c4) for all the Trojan output gates obtained, thereby obtaining a plurality of Trojan regions.
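The layer-by-layer comparison in steps 6c1) to 6c5) can be sketched as below. Several points are assumptions rather than statements of the claim: sub-modules are given as `fanin` dictionaries plus a gate-type dictionary, "excluding g and g' at the next depth" is read as not expanding their drivers, the region traversal in step 6c4) follows fan-in edges, and only gates (not signal lines or pins) are collected. All names and the example circuits are hypothetical.

```python
from collections import deque

def next_level(level, skipped, fanin, seen):
    """Gates reached at the next BFS depth, ignoring skipped gates."""
    nxt = []
    for g in level:
        if g in skipped:
            continue
        for d in fanin.get(g, ()):
            if d not in seen:
                seen.add(d)
                nxt.append(d)
    return nxt

def find_trojan_output_gates(root_a, fanin_a, type_a, root_b, fanin_b, type_b):
    trojan_gates = []
    level_a, level_b = [root_a], [root_b]
    seen_a, seen_b = {root_a}, {root_b}
    while level_a and level_b:                  # step 6c3: stop when a or a' is exhausted
        skip_a, skip_b = set(), set()
        for g, g2 in zip(level_a, level_b):     # step 6c1: same depth, same position
            if type_a[g] != type_b[g2]:         # step 6c2: types differ
                trojan_gates.append(g)
                skip_a.add(g)
                skip_b.add(g2)
        level_a = next_level(level_a, skip_a, fanin_a, seen_a)
        level_b = next_level(level_b, skip_b, fanin_b, seen_b)
    return trojan_gates

def trojan_region(start, fanin):
    """Step 6c4: every gate reachable from a Trojan output gate by BFS."""
    region, queue = {start}, deque([start])
    while queue:
        g = queue.popleft()
        for d in fanin.get(g, ()):
            if d not in region:
                region.add(d)
                queue.append(d)
    return region

# Hypothetical pair: a contains an XOR where a' has a NAND.
fanin_a = {"o": ["x1", "x2"], "x1": ["p", "q"], "x2": ["t1", "t2"]}
type_a = {"o": "OR", "x1": "AND", "x2": "XOR", "p": "NOT", "q": "NOT",
          "t1": "AND", "t2": "NOR"}
fanin_b = {"o": ["y1", "y2"], "y1": ["r", "s"], "y2": ["u"]}
type_b = {"o": "OR", "y1": "AND", "y2": "NAND", "r": "NOT", "s": "NOT", "u": "BUF"}

example_gates = find_trojan_output_gates("o", fanin_a, type_a, "o", fanin_b, type_b)
# Step 6c5: one region per Trojan output gate.
example_regions = {g: trojan_region(g, fanin_a) for g in example_gates}
```

Position-wise comparison depends on both traversals visiting gates in a comparable order, so a practical implementation would likely need a canonical ordering of each gate's inputs. This sketch simply uses the order given in `fanin`.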
6. The method according to claim 3, wherein the static structural features extracted for each maximum single-output sub-module in step (2b) comprise: the number of main inputs, the number of main-input branches, the number of main-output branches, the number of logic gates, the number of flip-flops, the number of signal lines, the total fan-in, the total fan-out, and the number of loops.
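A few of these features can be counted directly from a sub-module's cell list, as in the sketch below. The claim does not define how each feature is measured, so the definitions used here are assumptions, as are the cell format, the `FLIP_FLOP_TYPES` set, and the example circuit. Branch counts, signal-line counts, and loop counts are left out because their exact definitions are not given in the claim.

```python
FLIP_FLOP_TYPES = {"DFF"}   # hypothetical cell-type names treated as flip-flops

def extract_features(cells, main_inputs):
    """Count a subset of the static structural features of one sub-module.

    cells: {cell_name: (cell_type, [input nets], output net)} (assumed format)
    main_inputs: set of nets that are main inputs of the netlist.
    """
    read_nets = [net for _, inputs, _ in cells.values() for net in inputs]
    used_main_inputs = {net for net in read_nets if net in main_inputs}
    flip_flops = sum(1 for ctype, _, _ in cells.values() if ctype in FLIP_FLOP_TYPES)
    # Assumed definition: fan-in of a cell = number of its input pins.
    total_fan_in = len(read_nets)
    # Assumed definition: fan-out of a cell = number of input pins inside
    # the sub-module that are driven by its output net.
    driven_nets = {out for _, _, out in cells.values()}
    total_fan_out = sum(1 for net in read_nets if net in driven_nets)
    return {
        "main_inputs": len(used_main_inputs),
        "logic_gates": len(cells) - flip_flops,
        "flip_flops": flip_flops,
        "total_fan_in": total_fan_in,
        "total_fan_out": total_fan_out,
    }

# Hypothetical sub-module with three gates and one flip-flop.
example_cells = {
    "g1": ("AND", ["a", "b"], "n1"),
    "g2": ("NOT", ["n1"], "n2"),
    "ff": ("DFF", ["n2"], "q"),
    "g3": ("OR", ["n1", "q"], "y"),
}
example_features = extract_features(example_cells, {"a", "b"})
```

The resulting dictionary values, taken in a fixed order, would form part of the feature vector used in step (2b).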
CN202310395996.XA 2023-04-13 2023-04-13 Method for positioning hardware Trojan horse in gate-level netlist based on machine learning Pending CN116401719A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310395996.XA CN116401719A (en) 2023-04-13 2023-04-13 Method for positioning hardware Trojan horse in gate-level netlist based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310395996.XA CN116401719A (en) 2023-04-13 2023-04-13 Method for positioning hardware Trojan horse in gate-level netlist based on machine learning

Publications (1)

Publication Number Publication Date
CN116401719A true CN116401719A (en) 2023-07-07

Family

ID=87015725

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310395996.XA Pending CN116401719A (en) 2023-04-13 2023-04-13 Method for positioning hardware Trojan horse in gate-level netlist based on machine learning

Country Status (1)

Country Link
CN (1) CN116401719A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116911227A (en) * 2023-09-05 2023-10-20 苏州异格技术有限公司 Logic mapping method, device, equipment and storage medium based on hardware
CN116911227B (en) * 2023-09-05 2023-12-05 苏州异格技术有限公司 Logic mapping method, device, equipment and storage medium based on hardware

Similar Documents

Publication Publication Date Title
Chen et al. Progressive darts: Bridging the optimization gap for nas in the wild
Liu et al. S3DET: Detecting system symmetry constraints for analog circuits with graph similarity
US20190340507A1 (en) Classifying data
Bouyer et al. LSMD: A fast and robust local community detection starting from low degree nodes in social networks
CN110414277B (en) Gate-level hardware Trojan horse detection method based on multi-feature parameters
CN116401719A (en) Method for positioning hardware Trojan horse in gate-level netlist based on machine learning
Pang et al. Towards balanced learning for instance recognition
Azriel et al. SoK: An overview of algorithmic methods in IC reverse engineering
CN110188763A (en) A kind of image significance detection method based on improvement graph model
CN110825642B (en) Software code line-level defect detection method based on deep learning
CN116522334A (en) RTL-level hardware Trojan detection method based on graph neural network and storage medium
CN115146062A (en) Intelligent event analysis method and system fusing expert recommendation and text clustering
CN108197660A (en) Multi-model Feature fusion/system, computer readable storage medium and equipment
Li et al. A XGBoost based hybrid detection scheme for gate-level hardware Trojan
CN110399432A (en) A kind of classification method of table, device, computer equipment and storage medium
CN114239083A (en) Efficient state register identification method based on graph neural network
CN110955892B (en) Hardware Trojan horse detection method based on machine learning and circuit behavior level characteristics
Demetrovics et al. An optimization of closed frequent subgraph mining algorithm
CN117009518A (en) Similar event judging method integrating basic attribute and text content and application thereof
Zhang et al. Hybrid multi‐level hardware Trojan detection platform for gate‐level netlists based on XGBoost
US20100049713A1 (en) Pattern matching device and method
Roy et al. Nuclei-Net: A multi-stage fusion model for nuclei segmentation in microscopy images
Sagar et al. Error evaluation on k-means and hierarchical clustering with effect of distance functions for iris dataset
CN114202494A (en) Method, device and equipment for classifying cells based on cell classification model
JP3766119B2 (en) Circuit simulation method and apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination