CN111008646A - Man-machine relationship verification method and device based on equipment use condition - Google Patents

Info

Publication number
CN111008646A
Authority
CN
China
Prior art keywords
feature
equipment
training set
information gain
entropy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911075033.1A
Other languages
Chinese (zh)
Inventor
王申华
蒋红亮
方小方
何湘威
吕齐
陈澄
柯公武
严冬
寿博仁
刘吉权
吴辉
曹保良
王挺
张晨阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Wuyi Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Original Assignee
Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Wuyi Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinhua Power Supply Co of State Grid Zhejiang Electric Power Co Ltd, Wuyi Power Supply Co of State Grid Zhejiang Electric Power Co Ltd
Priority to CN201911075033.1A
Publication of CN111008646A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/243Classification techniques relating to the number of classes
    • G06F18/24323Tree-organised classifiers
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/20Administration of product repair or maintenance
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/06Energy or water supply

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • General Business, Economics & Management (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Evolutionary Computation (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Water Supply & Treatment (AREA)
  • Public Health (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a man-machine relationship verification method and device based on equipment use conditions. The method comprises the following steps: acquiring a device usage database and extracting a training set D from it; extracting a feature set A of the training set D, wherein the feature set A contains features used for judging the usage of the equipment; calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D, so as to select suitable root nodes and intermediate nodes; constructing a decision tree according to the selected root node and intermediate nodes; and analyzing, on the basis of the decision tree, whether the equipment is frequently used and whether the equipment's user has been replaced without permission, so that equipment requirements can be adjusted and the equipment maintained. A corresponding apparatus is also disclosed. Through equipment feature identification and calculation, the invention can promptly detect whether equipment is frequently used and whether its user has been replaced without permission, thereby improving information security and accuracy and solving problems such as idle equipment and unauthorized changes of equipment users.

Description

Man-machine relationship verification method and device based on equipment use condition
Technical Field
The invention relates to the technical field of power grid operation and maintenance, in particular to a man-machine relationship verification method and device based on equipment use conditions.
Background
The communication department is the communication operation and inspection team, which belongs to the maintenance and construction work area under the operation and maintenance department. In actual work, the various communication devices (particularly terminal devices) are widely distributed, and personnel and responsibilities change frequently. Because of a long-term staff shortage, the team emphasizes operation and maintenance over management; updates to the equipment ledger are delayed and often incomplete, and there is no corresponding means of control. In particular, when employees' posts are transferred frequently, the personnel transfer is often completed while the equipment ledger information is not updated, so that situations persist in which employees do not hand over their equipment, equipment is used interchangeably, and equipment sits idle.
Disclosure of Invention
The invention provides a man-machine relationship verification method and device based on equipment use conditions to solve the technical problems.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
according to a first aspect of the embodiments of the present invention, there is provided a human-computer relationship verification method based on device usage, including the following steps:
step 101, acquiring a device use condition database, and extracting a training set D from the device use condition database;
step 102, extracting a feature set A of a training set D, wherein the feature set A contains features used for judging the use condition of equipment;
step 103, calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D, so as to select suitable root nodes and intermediate nodes;
step 104, constructing a decision tree according to the selected root node and intermediate nodes;
and step 105, analyzing, on the basis of the decision tree, whether the equipment is frequently used and whether the equipment's user has been replaced without permission, so that equipment requirements can be adjusted and the equipment maintained.
Preferably, the step 103 includes:
step 1031, classifying the training set D according to whether the equipment's user population has changed, and calculating the empirical entropy of the training set D;
step 1032, sequentially calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D;
step 1033, selecting the feature with the largest information gain as the root node feature of the training set D, and dividing the feature into a plurality of subsets;
step 1034, respectively calculating the empirical condition entropy and the information gain of the residual features in the feature set a to each subset, and selecting leaf nodes.
Preferably, the process of classifying the training set D according to whether the equipment's user population has changed and calculating the empirical entropy of the training set D is as follows:
"whether the equipment's user population has changed" is labeled as feature C, which has K possible values C = {C_1, C_2, ..., C_K}; the training set D is divided into K classes according to feature C, and C_k occurs with frequency p_k, where 1 ≤ k ≤ K and k and K are integers; the empirical entropy of the training set D is then:
$$H(D) = -\sum_{k=1}^{K} p_k \log_2 p_k$$
Preferably, the process of sequentially calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D is as follows:
step 10321, labeling the features in the feature set A as A_1, A_2, ..., A_m, where m is an integer greater than or equal to 1;
step 10322, feature A_1 divides the training set D into n subsets D_1, D_2, D_3, ..., D_n, where n is an integer greater than or equal to 1, and each subset is divided into K classes according to feature C; the empirical conditional entropy of feature A_1 for the training set D is then:
$$H(D|A_1) = \sum_{i=1}^{n} \frac{|D_i|}{|D|} H(D_i) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \sum_{k=1}^{K} \frac{|D_{ik}|}{|D_i|} \log_2 \frac{|D_{ik}|}{|D_i|}$$
where 1 ≤ i ≤ n, 1 ≤ k ≤ K, and i, n, k, and K are integers; |D_i| is the number of samples in subset D_i, |D| is the number of samples in the training set D, and |D_{ik}| is the number of samples in subset D_i that belong to the k-th class;
step 10323, calculating the information gain of feature A_1:
$$g(D, A_1) = H(D) - H(D|A_1)$$
step 10324, repeating step 10322 and step 10323 to obtain the empirical conditional entropy and information gain of the other features in the feature set A with respect to the training set D.
Preferably, each feature in the feature set a corresponds to and represents the frequency of use of each service platform on the device, and the value of each feature includes four kinds: frequent, occasional, with access, without access.
Preferably, in step 101, only the devices with the average weekly visit number exceeding 1 are selected when the device usage database is acquired.
According to a second aspect of the embodiments of the present invention, there is provided a human-computer relationship verification apparatus based on device usage, including:
the data extraction module is used for acquiring an equipment use condition database and extracting a training set D from the equipment use condition database;
the characteristic extraction module is used for extracting a characteristic set A of a training set D, and the characteristic set A contains characteristics used for judging the service condition of equipment;
the information gain calculation module is used for calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D, so as to select suitable root nodes and intermediate nodes;
the decision tree construction module is used for constructing a decision tree according to the selected root node and intermediate nodes;
and the analysis module is used for analyzing, on the basis of the decision tree, whether the equipment is frequently used and whether the equipment's user has been replaced without permission, so that equipment requirements can be adjusted and the equipment maintained.
Preferably, the information gain calculation module includes:
the empirical entropy calculation submodule is used for classifying the training set D according to whether the equipment's user population has changed, and calculating the empirical entropy of the training set D;
the information gain calculation submodule is used for sequentially calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D;
the root node selection submodule selects the characteristic with the maximum information gain as the root node characteristic of the training set D and divides the root node characteristic into a plurality of subsets;
and the leaf node selection submodule is used for respectively calculating the empirical condition entropy and the information gain of the residual features in the feature set A to each subset and selecting the leaf nodes.
Preferably, each feature in the feature set a corresponds to and represents the frequency of use of each service platform on the device, and the value of each feature includes four kinds: frequent, occasional, with access, without access.
Preferably, the data extraction module only selects the device with the average weekly visit number exceeding 1 when acquiring the device use condition database.
Compared with the prior art, the invention can find whether the equipment is frequently used or not and whether the behavior of replacing the equipment without permission occurs or not in time through the equipment characteristic identification calculation, thereby improving the information safety and accuracy and solving the problems of equipment idling, equipment replacing users without permission and the like.
Drawings
FIG. 1 is a flow chart of a human-computer relationship verification method based on device usage of the present invention;
FIG. 2 is a flowchart of step 103 of the method for verifying human-computer relationship based on device usage according to the present invention;
FIG. 3 is a block diagram of a human-computer relationship verification apparatus according to the present invention;
fig. 4 is a block diagram of an information gain calculation module in the human-computer relationship verification apparatus based on the device usage of the present invention.
In the figure, 201-a data extraction module, 202-a feature extraction module, 203-an information gain calculation module, 204-a decision tree construction module, 205-an analysis module, 231-an empirical entropy calculation sub-module, 232-an information gain calculation sub-module, 233-a root node selection sub-module and 234-a leaf node selection sub-module.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments shown in the drawings. These embodiments are not intended to limit the present invention, and structural, methodological, or functional changes made by those skilled in the art according to these embodiments are included in the scope of the present invention.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
As shown in fig. 1, a man-machine relationship verification method based on device usage includes the following steps:
step 101, acquiring a device use condition database, and extracting a training set D from the device use condition database;
step 102, extracting a feature set A of a training set D, wherein the feature set A contains features used for judging the use condition of equipment;
step 103, calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D, so as to select suitable root nodes and intermediate nodes;
step 104, constructing a decision tree according to the selected root node and intermediate nodes;
and step 105, analyzing, on the basis of the decision tree, whether the equipment is frequently used and whether the equipment's user has been replaced without permission, so that equipment requirements can be adjusted and the equipment maintained.
In step 101, only devices whose average weekly visit count exceeds 1 are selected when the device usage database is acquired; other data are removed so that the data remain representative and accurate. The device usage database contains data such as: computer switch (firewall) log data, including fields, source address, and destination address; statistics of destination addresses and service systems, including network segment addresses and the corresponding service platform names; and IP-to-MAC address bindings, including IP address, device user, and the device user's department. Typically, about 20 sets of data are extracted to form the training set D.
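The selection rule in step 101 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function name `select_active_devices`, the flat list of device identifiers, and the `weeks` parameter are all assumptions introduced here, since the source does not specify how the visit log is stored.

```python
from collections import Counter


def select_active_devices(visit_log, weeks, min_weekly_avg=1.0):
    """Return the ids of devices whose average weekly visit count exceeds the threshold.

    visit_log: iterable of device ids, one entry per recorded visit
               (hypothetical layout, not taken from the source).
    weeks:     length of the observation window in weeks.
    """
    if weeks <= 0:
        raise ValueError("weeks must be positive")
    counts = Counter(visit_log)
    # Keep only devices with strictly more than min_weekly_avg visits per week.
    return sorted(dev for dev, n in counts.items() if n / weeks > min_weekly_avg)
```

For example, over a four-week window a device with 9 visits is kept, while a device with exactly 4 visits (an average of 1 per week) is dropped because the threshold is strict.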
In step 102, the training set D may be classified according to whether the equipment's user population has changed, which gives the two main branches of the decision tree: the user population has changed, and the user population has not changed. Each feature in the feature set A identifies one of the service platforms used on the device and represents how frequently the device uses that platform. Each feature takes one of four values: frequent, occasional, with access, and without access. The values can be assigned according to preset thresholds, so that the number of times each service platform is used on the device is graded, from high to low, as "frequent", "occasional", "with access", or "without access".
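The threshold-based grading of feature values can be sketched as below. The source only says that the thresholds are preset; the particular cut-offs used here (`frequent_min=20`, `occasional_min=5`) and the function name `usage_level` are hypothetical placeholders.

```python
def usage_level(count, frequent_min=20, occasional_min=5):
    """Map a platform's visit count to one of the four feature values.

    The default thresholds are illustrative assumptions, not values from the source.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    if count >= frequent_min:
        return "frequent"
    if count >= occasional_min:
        return "occasional"
    if count >= 1:
        return "with access"
    return "without access"
```

Applying this function to each platform's visit count for a device yields that device's row of feature values in the training set.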
Step 103 is described in further detail below. As shown in fig. 2, step 103 includes the following steps.
Step 1031, classifying the training set D according to whether the equipment's user population has changed, and calculating the empirical entropy of the training set D.
The process of classifying the training set D according to whether the equipment's user population has changed and calculating the empirical entropy of the training set D is as follows:
"whether the equipment's user population has changed" is labeled as feature C, which has K possible values C = {C_1, C_2, ..., C_K}; the training set D is divided into K classes according to feature C, and C_k occurs with frequency p_k, where 1 ≤ k ≤ K and k and K are integers; the empirical entropy of the training set D is then:
$$H(D) = -\sum_{k=1}^{K} p_k \log_2 p_k$$
where, by convention, 0 log_2 0 = 0.
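The empirical entropy of step 1031 can be sketched in Python as follows. The function name `empirical_entropy` and the representation of the class labels as a plain list are assumptions for illustration. Because only classes that actually occur are counted, the convention that a zero-frequency class contributes nothing is satisfied automatically.

```python
from collections import Counter
from math import log2


def empirical_entropy(labels):
    """Empirical entropy H(D), in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    # sum of p_k * log2(1 / p_k), which equals -sum of p_k * log2(p_k)
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())
```

A set split evenly between two classes has entropy 1 bit, and a set containing a single class has entropy 0.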
Step 1032, sequentially calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D. An ID3 classification decision tree is highly readable and classifies quickly. The ID3 classification decision tree is used to screen groups rapidly from a large amount of data and to cluster the screened groups; whether the population attributes of a device's users have changed is then judged from how frequently groups with different attributes use the different service platforms.
The process of sequentially calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D is as follows:
Step 10321, labeling the features in the feature set A as A_1, A_2, ..., A_m, where m is an integer greater than or equal to 1; each feature identifies one of the service platforms used on the device.
Step 10322, feature A_1 divides the training set D into n subsets D_1, D_2, D_3, ..., D_n, where n is an integer greater than or equal to 1, and each subset is divided into K classes according to feature C; the empirical conditional entropy of feature A_1 for the training set D is then:
$$H(D|A_1) = \sum_{i=1}^{n} \frac{|D_i|}{|D|} H(D_i) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \sum_{k=1}^{K} \frac{|D_{ik}|}{|D_i|} \log_2 \frac{|D_{ik}|}{|D_i|}$$
where 1 ≤ i ≤ n, 1 ≤ k ≤ K, and i, n, k, and K are integers; |D_i| is the number of samples in subset D_i, |D| is the number of samples in the training set D, and |D_{ik}| is the number of samples in subset D_i that belong to the k-th class.
Step 10323, calculating the information gain of feature A_1:
$$g(D, A_1) = H(D) - H(D|A_1)$$
Step 10324, repeating step 10322 and step 10323 to obtain the empirical conditional entropy and information gain of the other features in the feature set A with respect to the training set D.
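Steps 10322 to 10324 can be sketched as follows. This is an illustrative sketch under stated assumptions: samples are represented as dictionaries, and the names `conditional_entropy`, `information_gains`, and the keys used in the usage note are hypothetical rather than taken from the source.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Empirical entropy of a list of class labels, in bits."""
    total = len(labels)
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())


def conditional_entropy(rows, feature, target):
    """H(D|A): entropy of each subset D_i, weighted by |D_i| / |D|."""
    subsets = defaultdict(list)
    for row in rows:
        subsets[row[feature]].append(row[target])
    total = len(rows)
    return sum(len(s) / total * entropy(s) for s in subsets.values())


def information_gains(rows, features, target):
    """g(D, A) = H(D) - H(D|A) for every feature A in `features`."""
    base = entropy([row[target] for row in rows])
    return {f: base - conditional_entropy(rows, f, target) for f in features}
```

With a toy set of four samples in which a hypothetical feature `p1` separates the classes perfectly and `p2` carries no information, `information_gains` returns 1.0 for `p1` and 0.0 for `p2`.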
Step 1033, because a feature with a larger information gain has stronger classification ability, the feature with the largest information gain is selected as the root node feature of the training set D, and the training set D is divided into a plurality of subsets according to that feature.
Step 1034, respectively calculating the empirical conditional entropy and the information gain of the remaining features in the feature set A with respect to each subset, and selecting leaf nodes.
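Steps 1033 and 1034 amount to a recursive construction, which can be sketched as below. This is a generic ID3 sketch under assumptions, not the patent's own code: the nested-dictionary tree layout, the majority-class fallback when no features remain, and all function names are choices made here for illustration.

```python
from collections import Counter
from math import log2


def _entropy(labels):
    total = len(labels)
    return sum((n / total) * log2(total / n) for n in Counter(labels).values())


def _gain(rows, feature, target):
    # Information gain of `feature` on `rows`: H(D) - H(D|A).
    subsets = {}
    for row in rows:
        subsets.setdefault(row[feature], []).append(row[target])
    cond = sum(len(s) / len(rows) * _entropy(s) for s in subsets.values())
    return _entropy([row[target] for row in rows]) - cond


def build_id3(rows, features, target):
    """Return a leaf label or a {"feature": ..., "branches": {...}} node."""
    if not rows:
        raise ValueError("rows must not be empty")
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1 or not features:
        # Pure subset, or no feature left to split on: emit a leaf.
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: _gain(rows, f, target))
    remaining = [f for f in features if f != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_id3(subset, remaining, target)
    return {"feature": best, "branches": branches}
```

Each recursive call repeats the gain calculation on one subset with the already-used feature removed, so a subset whose samples all share one class becomes a leaf immediately.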
The following example illustrates the training set D shown in table 1.
Figure BDA0002262151640000072
TABLE 1
The training set D has 15 samples. Classifying the training set D according to "whether the equipment's user population has changed" gives 9 samples with the value "yes" and 6 samples with the value "no", so the empirical entropy of the training set D is:
$$H(D) = -\frac{9}{15}\log_2\frac{9}{15} - \frac{6}{15}\log_2\frac{6}{15} = 0.971$$
The information gain of each feature on the training set D is then calculated.
Let A_1, A_2, A_3, and A_4 in the feature set A denote platform 1, platform 2, platform 3, and platform 4, respectively.
The values of "platform 1" (A_1) are occasional, frequent, with access, and without access. Dividing the training set D by this feature gives 4 sample subsets, denoted D_1 (platform 1 = occasional), D_2 (platform 1 = frequent), D_3 (platform 1 = with access), and D_4 (platform 1 = without access).
As shown in Table 1, D_1 contains 5 samples, of which the proportion with "whether the equipment's user population has changed" equal to "yes" is
Figure BDA0002262151640000082
and the proportion with the value "no" is
Figure BDA0002262151640000083
D_2 contains 5 samples, of which the proportion with the value "yes" is
Figure BDA0002262151640000084
and the proportion with the value "no" is
Figure BDA0002262151640000085
D_3 contains 0 samples; D_4 contains 5 samples, of which the proportion with the value "yes" is
Figure BDA0002262151640000086
and the proportion with the value "no" is
Figure BDA0002262151640000087
The empirical entropy of its three branch points is then:
Figure BDA0002262151640000088
Figure BDA0002262151640000089
H(D3)=0
Figure BDA00022621516400000810
The empirical conditional entropy of feature A_1 for the training set D is:
$$H(D|A_1) = \frac{5}{15}H(D_1) + \frac{5}{15}H(D_2) + \frac{0}{15}H(D_3) + \frac{5}{15}H(D_4) = 0.888$$
The information gain of feature A_1 is:
$$g(D, A_1) = H(D) - H(D|A_1) = 0.971 - 0.888 = 0.083$$
Similarly, it can be calculated that:
the information gain of feature A_2 is g(D, A_2) = 0.324;
the information gain of feature A_3 is g(D, A_3) = 0.420;
the information gain of feature A_4 is g(D, A_4) = 0.363.
Comparing the information gains of the features shows that feature A_3 has the largest information gain, so feature A_3 is selected as the optimal feature and the root node feature, and it divides the training set into two subsets D_1 and D_2. D_1 contains sample points of only one class, so it is a leaf node; for D_2, a new feature must be selected from A_1, A_2, and A_4. The information gain of each feature is calculated as follows:
g(D_2, A_1) = 0.251
g(D_2, A_2) = 0.918
g(D_2, A_4) = 0.474
Feature A_2 has the largest information gain, so feature A_2 is selected as the feature of the next-layer intermediate node. Two child nodes are led out from it, one corresponding to "yes" and the other to "no"; the samples in each of the two child nodes all belong to the same class, so both are leaf nodes.
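The two selections in this example reduce to taking the maximum over the reported information gains, which can be checked with a short Python sketch. The gain values are those stated in the example; the helper name `pick_best_feature` and the dictionary layout are assumptions for illustration.

```python
def pick_best_feature(gains):
    """Return the feature name with the largest information gain."""
    return max(gains, key=gains.get)


# Information gains on the full training set D, as reported in the example.
root_gains = {"A1": 0.083, "A2": 0.324, "A3": 0.420, "A4": 0.363}
# Information gains of the remaining features on subset D2.
d2_gains = {"A1": 0.251, "A2": 0.918, "A4": 0.474}

root_feature = pick_best_feature(root_gains)
next_feature = pick_best_feature(d2_gains)
print(root_feature, next_feature)
```

The sketch selects A3 for the root node and A2 for the next layer, in agreement with the example.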
By analogy, a complete decision tree can be constructed, from which the usage of the equipment is clear at a glance, and it is easy to determine whether the equipment is frequently used and whether its user has been replaced without permission. On this basis, the relevant maintenance and adjustment can be carried out: for example, the functions of the relevant equipment can be adjusted so that part of the equipment does not sit idle, and the relevant equipment can be matched with its operators by function and the permissions of the equipment and personnel modified.
By constructing a decision tree, the invention can promptly detect, through equipment feature identification and calculation, whether equipment is frequently used, whether equipment is used abnormally, and whether the equipment's user has been replaced without permission, thereby improving information security and accuracy and solving problems such as idle equipment and unauthorized changes of equipment users.
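Once a tree exists, the analysis in step 105 is a walk from the root to a leaf. The sketch below assumes a nested-dictionary tree; the branch values and leaf labels in `example_tree` are hypothetical, because Table 1 is not reproduced in the text, and only the ordering of features (A3 at the root, A2 beneath it) follows the example.

```python
def classify(tree, sample, default=None):
    """Follow the sample's feature values from the root until a leaf label is reached."""
    node = tree
    while isinstance(node, dict):
        value = sample.get(node["feature"])
        if value not in node["branches"]:
            # Feature value not seen during training: no decision can be made.
            return default
        node = node["branches"][value]
    return node


# Hypothetical tree: the structure mirrors the example, the values do not come from the source.
example_tree = {
    "feature": "A3",
    "branches": {
        "frequent": "changed",
        "without access": {
            "feature": "A2",
            "branches": {"frequent": "changed", "without access": "unchanged"},
        },
    },
}
```

Devices classified as "changed" would then be candidates for a ledger check or permission review, as the description suggests.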
Based on the above method, as shown in fig. 3, the present invention further provides a human-computer relationship verification apparatus based on device usage, which includes:
the data extraction module 201 is configured to obtain an equipment use condition database, and extract a training set D from the equipment use condition database;
the feature extraction module 202 is configured to extract a feature set a of a training set D, where the feature set a includes features used for determining a usage situation of the device;
the information gain calculation module 203 is configured to calculate, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D, so as to select suitable root nodes and intermediate nodes;
a decision tree construction module 204, configured to construct a decision tree according to the selected root node and the intermediate node;
and the analysis module 205 is configured to analyze, on the basis of the decision tree, whether the equipment is frequently used and whether the equipment's user has been replaced without permission, so that equipment requirements can be adjusted and the equipment maintained.
As shown in fig. 4, the information gain calculation module 203 includes:
the empirical entropy calculation submodule 231 is configured to classify the training set D according to whether the equipment's user population has changed, and to calculate the empirical entropy of the training set D;
the information gain calculation submodule 232 is configured to sequentially calculate, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D;
the root node selection submodule 233 is configured to select the feature with the largest information gain as the root node feature of the training set D, and to divide the training set D into a plurality of subsets according to that feature;
the leaf node selection submodule 234 is configured to calculate the empirical conditional entropy and the information gain of the remaining features in the feature set A with respect to each subset, and to select leaf nodes.
Each feature in the feature set a corresponds to and represents the frequency of use of each service platform on the device, and the value of each feature includes four kinds: frequent, occasional, with access, without access.
When acquiring the device usage database, the data extraction module 201 selects only devices whose average weekly visit count exceeds 1.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (10)

1. The man-machine relationship verification method based on the equipment use condition is characterized by comprising the following steps of:
step 101, acquiring a device use condition database, and extracting a training set D from the device use condition database;
step 102, extracting a feature set A of a training set D, wherein the feature set A contains features used for judging the use condition of equipment;
step 103, calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D, so as to select suitable root nodes and intermediate nodes;
step 104, constructing a decision tree according to the selected root node and intermediate nodes;
and step 105, analyzing, on the basis of the decision tree, whether the equipment is frequently used and whether the equipment's user has been replaced without permission, so that equipment requirements can be adjusted and the equipment maintained.
2. The human-computer relationship verification method based on the device use condition as claimed in claim 1, wherein the step 103 comprises:
step 1031, classifying the training set D according to whether the equipment's user population has changed, and calculating the empirical entropy of the training set D;
step 1032, sequentially calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D;
step 1033, selecting the feature with the largest information gain as the root node feature of the training set D, and dividing the feature into a plurality of subsets;
step 1034, respectively calculating the empirical condition entropy and the information gain of the residual features in the feature set a to each subset, and selecting leaf nodes.
3. The man-machine relationship verification method based on the equipment use condition according to claim 2, wherein classifying the training set D according to whether the population using the equipment has changed, and calculating the empirical entropy of the training set D, proceeds as follows:
the attribute "whether the population using the equipment has changed" is denoted as feature C, which has K possible values $C = \{C_1, C_2, \ldots, C_K\}$; the training set D is divided into K classes according to feature C, where $C_k$ occurs with frequency $p_k$, $1 \le k \le K$, and k and K are integers; the empirical entropy of the training set D is then:
Figure FDA0002262151630000021
4. The man-machine relationship verification method based on the equipment use condition as claimed in claim 3, wherein sequentially calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D proceeds as follows:
step 10321, labeling the features in the feature set A as $A_1, A_2, \ldots, A_m$, where m is an integer greater than or equal to 1;
step 10322, dividing the training set D by feature $A_1$ into n subsets $D_i = [D_1, D_2, D_3, \ldots, D_n]$, where n is an integer greater than or equal to 1, each subset being further divided into K classes according to feature C, and then obtaining the empirical conditional entropy of feature $A_1$ with respect to the training set D:
Figure FDA0002262151630000022
where $1 \le i \le n$, $1 \le k \le K$, and i, n, k and K are integers; $|D_i|$ is the number of samples contained in the subset $D_i$, $|D|$ is the number of samples contained in the training set D, and $|D_{ik}|$ is the number of samples in the subset $D_i$ that belong to the k-th class;
step 10323, calculating the information gain of feature $A_1$:
$g(D, A_1) = H(D) - H(D \mid A_1)$;
step 10324, repeating step 10322 and step 10323 to obtain the empirical conditional entropy and the information gain of the other features in the feature set A with respect to the training set D.
5. The man-machine relationship verification method based on the equipment use condition according to claim 4, wherein each feature in the feature set A corresponds to how frequently one service platform is used on the equipment, and each feature takes one of four values: frequent, occasional, with access, without access.
6. The man-machine relationship verification method based on the equipment use condition according to any one of claims 1 to 5, wherein in step 101, only equipment whose average weekly number of accesses exceeds 1 is selected when the equipment use condition database is acquired.
7. A man-machine relationship verification device based on the equipment use condition, characterized by comprising:
a data extraction module, used for acquiring an equipment use condition database and extracting a training set D from the equipment use condition database;
a feature extraction module, used for extracting a feature set A of the training set D, wherein the feature set A contains features used for judging the use condition of the equipment;
an information gain calculation module, used for calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D, so as to select suitable root nodes and intermediate nodes;
a decision tree construction module, used for constructing a decision tree according to the selected root nodes and intermediate nodes;
and an analysis module, used for analyzing, on the basis of the decision tree, whether the equipment is frequently used and whether the equipment in use has been replaced without permission, so as to adjust and maintain the equipment.
8. The man-machine relationship verification device based on the equipment use condition according to claim 7, wherein the information gain calculation module comprises:
an empirical entropy calculation submodule, used for classifying the training set D according to whether the population using the equipment has changed, and calculating the empirical entropy of the training set D;
an information gain calculation submodule, used for sequentially calculating, based on the ID3 algorithm, the empirical conditional entropy and the information gain of each feature in the feature set A with respect to the training set D;
a root node selection submodule, used for selecting the feature with the largest information gain as the root node feature of the training set D and dividing the training set D into a plurality of subsets according to that feature;
and a leaf node selection submodule, used for calculating, for each subset, the empirical conditional entropy and the information gain of the remaining features in the feature set A, and selecting leaf nodes.
9. The man-machine relationship verification device based on the equipment use condition according to claim 7, wherein each feature in the feature set A corresponds to how frequently one service platform is used on the equipment, and each feature takes one of four values: frequent, occasional, with access, without access.
10. The man-machine relationship verification device based on the equipment use condition according to any one of claims 7 to 9, wherein the data extraction module selects only equipment whose average weekly number of accesses exceeds 1 when acquiring the equipment use condition database.
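
For illustration only, the calculations recited in claims 2 to 4 can be sketched in Python as below. This is a minimal sketch under stated assumptions, not the applicant's implementation: the formulas are the standard ID3 definitions of empirical entropy and empirical conditional entropy, the base-2 logarithm is assumed (the claim formulas are given only as figures), and the feature names `platform_a` and `platform_b`, the class labels and the four sample rows are hypothetical.

```python
from collections import Counter
from math import log2


def empirical_entropy(labels):
    """H(D) = -sum_k p_k * log2(p_k), with p_k the frequency of class C_k (base 2 assumed)."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def conditional_entropy(rows, labels, feature):
    """H(D|A) = sum_i |D_i|/|D| * H(D_i), where D_i groups samples by the value of feature A."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[feature], []).append(label)
    total = len(labels)
    return sum(len(s) / total * empirical_entropy(s) for s in subsets.values())


def information_gain(rows, labels, feature):
    """g(D, A) = H(D) - H(D|A)."""
    return empirical_entropy(labels) - conditional_entropy(rows, labels, feature)


def build_id3(rows, labels, features):
    """Build a decision tree as nested dicts; leaves are class labels."""
    if len(set(labels)) == 1:
        return labels[0]  # pure subset: stop splitting
    if not features:
        return Counter(labels).most_common(1)[0][0]  # no features left: majority class
    # Select the feature with the largest information gain as the split node.
    best = max(features, key=lambda f: information_gain(rows, labels, f))
    remaining = [f for f in features if f != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        idx = [i for i, row in enumerate(rows) if row[best] == value]
        tree[best][value] = build_id3(
            [rows[i] for i in idx], [labels[i] for i in idx], remaining
        )
    return tree


# Hypothetical training set D: one row per device, one feature per service platform.
rows = [
    {"platform_a": "frequent", "platform_b": "without access"},
    {"platform_a": "frequent", "platform_b": "occasional"},
    {"platform_a": "without access", "platform_b": "occasional"},
    {"platform_a": "without access", "platform_b": "without access"},
]
# Feature C: whether the population using the device has changed.
labels = ["unchanged", "unchanged", "changed", "changed"]

tree = build_id3(rows, labels, ["platform_a", "platform_b"])
print(tree)
```

In this toy data `platform_a` separates the two classes completely, so it has the larger information gain and becomes the root node, while `platform_b` contributes no gain and does not appear in the tree.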
CN201911075033.1A 2019-11-06 2019-11-06 Man-machine relationship verification method and device based on equipment use condition Pending CN111008646A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911075033.1A CN111008646A (en) 2019-11-06 2019-11-06 Man-machine relationship verification method and device based on equipment use condition

Publications (1)

Publication Number Publication Date
CN111008646A true CN111008646A (en) 2020-04-14

Family

ID=70111747

Country Status (1)

Country Link
CN (1) CN111008646A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104732336A (en) * 2015-02-27 2015-06-24 吉林大学第一医院 Medical equipment performance management and control information integrated platform
CN108062560A (en) * 2017-12-04 2018-05-22 贵州电网有限责任公司电力科学研究院 A kind of power consumer feature recognition sorting technique based on random forest
CN108876197A (en) * 2018-07-19 2018-11-23 杨启蓓 A kind of power equipment cluster and cohort analysis system and method
CN110009061A (en) * 2019-04-18 2019-07-12 南京邮电大学 A kind of AP adaptive optimization selection method based on machine learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Pu Tiantian (蒲天添): "Research on optimization of engineering project management based on decision trees" (基于决策树的工程项目管理优化研究) *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination