CN114692098A - Intelligent software behavior control method based on blockchain and federated learning - Google Patents

Publication number
CN114692098A
Authority
CN
China
Prior art keywords
model
software
node
software behavior
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210610745.4A
Other languages
Chinese (zh)
Other versions
CN114692098B (en)
Inventor
王晓东
谭雨欣
魏志强
李晓璇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Zhidou Digital Technology Co ltd
Original Assignee
Ocean University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ocean University of China
Priority to CN202210610745.4A
Publication of CN114692098A
Application granted
Publication of CN114692098B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10 Protecting distributed programs or content, e.g. vending or licensing of copyrighted material; Digital rights management [DRM]
    • G06F21/12 Protecting executable software
    • G06F21/121 Restricting unauthorised execution of programs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/64 Protecting data integrity, e.g. using checksums, certificates or signatures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Multimedia (AREA)
  • Technology Law (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of the internet and discloses an intelligent software behavior control method based on blockchain and federated learning, comprising the following steps: (1) constructing a software behavior classification model for a single node; (2) constructing an alliance chain for the multi-node joint training task of the software behavior classification model, taking the model constructed in the previous step as the local training model of each node, and combining blockchain and federated learning so that the nodes in the alliance chain cooperatively train the software behavior classification model without being driven by a central node; (3) designing an intelligent software behavior control client tool that senses, in real time, the software behavior data generated while a user uses a computer, inputs the data into the software behavior classification model, and intelligently controls the software behavior according to the classification result. The invention improves the classification accuracy of software popup behaviors while protecting the privacy of user data, and enables intelligent control of popup behaviors according to the user's software preferences.

Description

Intelligent software behavior control method based on blockchain and federated learning
Technical Field
The invention belongs to the technical field of the internet, and particularly relates to an intelligent software behavior control method based on blockchain and federated learning.
Background
At present, some software providers bundle large numbers of advertisement and information popups with the application software they develop, which degrades the system experience of software users. Software popups obstruct the user's view, interfere with the user's operations, reduce the efficiency of computer use, and slow down the network, all of which greatly affect the user experience. Some security software on the market can intercept popups. For example, the popup interception method disclosed in patent No. CN202010503928.7 monitors popup events through a target hook function, obtains at least one attribute feature of the popup corresponding to a popup event according to the process corresponding to that event, determines from the attribute feature whether the popup is an advertisement popup, and, if so, refuses to respond to the popup event. This method can intercept an advertisement popup before it is displayed and thus improves user experience. However, it identifies advertisement popup behavior by relying on fixed rules, while the characteristics of software popups change continuously, so it risks missed detections and misjudgments.
In existing schemes, popup control depends mainly on rules configured by the user: the user must manually configure popup control options, and no mechanism exists for sharing software popup behavior information among multiple users, so popup behavior cannot be controlled intelligently on the basis of the software preferences of a user group.
For the detection and classification of software behaviors, traditional machine learning algorithms and deep learning algorithms can learn the relevant features from large amounts of data and then classify software behaviors accurately. However, large volumes of high-quality software behavior training data are lacking, and research on software behavior detection and classification faces data islands: the software behavior data held by each participant cannot be used effectively when training machine learning models, which limits model performance. A software behavior sensing method therefore needs to be designed to collect data such as the user interface content and processes of various software popups, providing data support for the classification and control of software behaviors.
In addition, the collected software behavior data raises privacy concerns: software window screenshots may contain information about users' personal preferences, and window processes may expose how users use software. This data is scattered across nodes. A single node training a model on its limited data achieves limited results, while centralizing the data of all nodes for training may leak private information. The data islands therefore need to be broken, and model accuracy improved, while data privacy is protected.
Disclosure of Invention
To address the defects of the prior art, the invention provides an intelligent software behavior control method based on blockchain and federated learning. The method combines machine learning, federated learning, and blockchain technology, uses the data of multiple parties to cooperatively train a software behavior classification model while protecting the privacy of users' software behavior data, and controls software popup behavior in a timely manner according to the classification result. Compared with the original method of training a model on centralized data, the method improves the classification accuracy of software popup behaviors while protecting the privacy of user data, and also supports the sharing of popup software control strategies among users.
To solve the above technical problems, the invention adopts the following technical scheme:
The intelligent software behavior control method based on blockchain and federated learning comprises the following steps:
(1) constructing a software behavior classification model for a single node, including sensing and storing software behaviors, constructing a software behavior data set, and designing the software popup user interface content recognition and the network structure of the software behavior classification model;
(2) constructing an alliance chain for the multi-node joint training task of the software behavior classification model, taking the software behavior classification model constructed in the previous step as the model trained locally by each node, and combining blockchain and federated learning so that the nodes on the alliance chain cooperatively train the software behavior classification model without being driven by a central node, wherein the alliance chain architecture includes designing the joint training task of the software behavior classification model, defining the on-chain block structure and the joint training flow, designing the model aggregation algorithm, and writing the smart contracts used in the joint training process;
(3) designing an intelligent software behavior control client tool that senses, in real time, the software behavior data generated while a user uses a computer, inputs the data into the constructed software behavior classification model, classifies the software behaviors to obtain a classification result, and intelligently controls the software behaviors according to the classification result.
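Step (3) turns a classification result into a control action. A minimal Python sketch of that last mapping is given below. The class label "disliked", the action names, and the threshold value are hypothetical and are not specified by the source; the sketch only illustrates how a probability output could drive a block-or-allow decision.

```python
from typing import Dict


def decide_action(class_probs: Dict[str, float], block_threshold: float = 0.8) -> str:
    """Return "block" or "allow" for one popup.

    class_probs maps hypothetical class labels to probabilities produced by
    the classification model; "disliked" marks popups the user does not want.
    The threshold is an illustrative assumption, not a value from the source.
    """
    if not 0.0 <= block_threshold <= 1.0:
        raise ValueError("block_threshold must be in [0, 1]")
    p_disliked = class_probs.get("disliked", 0.0)
    return "block" if p_disliked >= block_threshold else "allow"


# Example: a popup classified as 92% "disliked" is blocked.
print(decide_action({"disliked": 0.92, "liked": 0.08}))
```

A real client tool would also need the hook that suppresses or closes the window; that part depends on the operating system API and is outside this sketch.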
Further, the specific steps in the step (1) are as follows:
(1.1) software behavior sensing and storing: analyzing the automatic popup mechanisms of different popup software, sensing multi-dimensional behavior data of various types of software in real time, and storing the behavior data;
(1.2) constructing a software behavior data set: cleaning the stored multi-dimensional software behavior data, labeling the behavior data of the popup software according to whether the user likes the content, and taking the labeled software behavior data as the final training and test data set;
(1.3) recognizing the content of the software popup user interface: recognizing the text content in the software popup user interface and taking it as part of the input of the subsequent software behavior classification model;
(1.4) constructing a software behavior classification model: mapping the recognition result of the software popup user interface content, the process behavior data, and the network behavior data into feature vectors, fusing the feature vectors with different weights, performing feature learning through a deep neural network, and finally feeding the learned feature vector into a probability output layer that outputs the probability of each category, thereby classifying the software popup behaviors.
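The fusion and probability output described in step (1.4) can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the patented network: fusion is assumed to be a weighted concatenation of the three feature vectors, a single linear layer stands in for the deep neural network, and the probability output layer is assumed to be a softmax. All names and numbers are hypothetical.

```python
import math
from typing import Dict, List


def fuse(features: Dict[str, List[float]], weights: Dict[str, float]) -> List[float]:
    """Scale each modality's vector by its weight and concatenate them.

    Modalities are visited in sorted name order so the layout is stable.
    """
    fused: List[float] = []
    for name in sorted(features):
        fused.extend(weights[name] * value for value in features[name])
    return fused


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(fused: List[float], layer: List[List[float]], biases: List[float]) -> List[float]:
    """One linear layer (a stand-in for the deep network) followed by softmax."""
    logits = [sum(w * x for w, x in zip(row, fused)) + b for row, b in zip(layer, biases)]
    return softmax(logits)


# Hypothetical feature vectors for one popup event.
features = {"ui_text": [0.2, 0.7], "process": [0.5], "network": [0.1]}
weights = {"ui_text": 0.5, "process": 0.3, "network": 0.2}
fused = fuse(features, weights)
probs = classify(fused, layer=[[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]], biases=[0.0, 0.0])
print(probs)
```

Weighted concatenation keeps each modality's dimensions separate; a weighted sum would instead require the three vectors to share one dimension.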
Further, step (2) designs an alliance chain architecture for the joint training task of the software behavior classification model based on Hyperledger Fabric, and establishes a decentralized joint training framework for the software behavior classification model. To ensure the privacy of software behavior data and the security of the software behavior classification model, the framework uses an alliance chain as its blockchain type. Compared with a public chain architecture, an alliance chain imposes stricter identity authentication conditions on each node, which improves the security of the system. To avoid burdening each node's storage with on-chain data and to protect the privacy of users' local data, the alliance chain records only the model training task information, the summary information of each node's local model, and the shared model parameter information; nodes do not need to upload their local software behavior data. Nodes in the alliance chain take one of two roles. A task coordination node issues the joint training task for the software behavior classification model, initializes the global model, and finally completes the task by extracting the parameters of the global model. A data providing node responds to the joint training task issued by the task coordination node by training the model on its local software behavior data set and uploading the summary information and model parameters of its local model.
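To make concrete what a node might submit to the chain, the following Python sketch defines one possible record for a local model update. Every field name is hypothetical, since the source does not list the on-chain block structure here, and the SHA-256 record hash is an added assumption for integrity checking. The point of the sketch is that the record carries only task information, model summary data, and encrypted parameters, and has no field for raw local behavior data.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class ModelUpdateRecord:
    """Hypothetical payload a data providing node submits after one round."""
    task_id: str
    node_id: str
    round_index: int
    accuracy: float            # summary information of the local model
    loss: float                # summary information of the local model
    sample_count: int
    encrypted_params_hex: str  # encrypted model parameters, hex-encoded


def to_ledger_payload(record: ModelUpdateRecord) -> Dict[str, Any]:
    """Serialize the record and attach a SHA-256 hash of its canonical JSON."""
    body = asdict(record)
    canonical = json.dumps(body, sort_keys=True)
    body["record_hash"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return body


record = ModelUpdateRecord("task-1", "node-A", 3, 0.91, 0.27, 1200, "a1b2c3")
print(to_ledger_payload(record)["record_hash"])
```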
On the decentralized network of the initialized alliance chain, the joint training process of the software behavior classification model proceeds as follows:
(2.1) The task coordination node issues a description of the learning and training task to the alliance chain system, initializes the global model parameters, and then adds the software behavior classification joint training task, where the initial global model is the software behavior classification model constructed in step (1).
(2.2) The data providing nodes participating in the task form a set, obtain the joint training task information and the global model parameters from the alliance chain, decrypt them, and execute federated learning on their local software behavior data sets, cooperatively training the same machine learning model.
(2.3) After each round of iterative training, each data providing node initiates a transaction whose content is the summary data of its local model (such as model accuracy and loss value) and the encrypted model parameters. The smart contract then evaluates each node's contribution to the model, and only the model transaction information that meets the requirements is retained on the alliance chain.
(2.4) After each round of iterative training, the smart contract calculates the contribution weight of each node to the global model, aggregates the model parameters based on these weights, updates the global model, and judges whether the global model has converged or the maximum number of iterations has been reached. If so, the joint training task is judged complete and the task coordination node is notified. If not, the aggregated global model is packed into a block, the next round of the iterative training task is issued, each node starts the next round of training, and execution continues from step (2.2).
(2.5) The task coordination node extracts the global model parameters and verifies the validity of the model, completing the joint training task.
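The control flow of steps (2.1) to (2.5) can be illustrated with a toy simulation in Python. The sketch makes several simplifying assumptions that are not in the source: the model is a single scalar fitted by one gradient step on a squared error, a plain average stands in for the contribution-weighted aggregation, a list stands in for the alliance chain, and encryption and contribution filtering are omitted. Only the loop structure (issue, local training, aggregation, convergence or round-limit check) mirrors the text.

```python
from typing import Dict, List, Tuple


def local_train(global_w: float, local_data: List[float], lr: float = 0.5) -> Tuple[float, float]:
    """One gradient step on mean squared error; returns (new weight, loss)."""
    grad = sum(2.0 * (global_w - x) for x in local_data) / len(local_data)
    new_w = global_w - lr * grad
    loss = sum((new_w - x) ** 2 for x in local_data) / len(local_data)
    return new_w, loss


def run_joint_training(
    node_datasets: List[List[float]],
    init_w: float = 0.0,
    max_rounds: int = 50,
    tol: float = 1e-6,
) -> Tuple[float, List[Dict[str, float]]]:
    """Simulate rounds until the global weight stops moving or rounds run out."""
    global_w = init_w                      # (2.1) coordinator initializes the model
    ledger: List[Dict[str, float]] = []    # stand-in for blocks on the chain
    for round_index in range(1, max_rounds + 1):
        # (2.2)-(2.3) each node trains locally and reports weight and loss
        updates = [local_train(global_w, data) for data in node_datasets]
        # (2.4) aggregate; a plain mean replaces contribution weighting here
        new_w = sum(w for w, _ in updates) / len(updates)
        ledger.append({"round": float(round_index), "global_w": new_w})
        converged = abs(new_w - global_w) < tol
        global_w = new_w
        if converged:
            break
    return global_w, ledger                # (2.5) coordinator extracts the model


final_w, ledger = run_joint_training([[1.0, 3.0], [5.0]])
print(final_w, len(ledger))
```

With the learning rate of 0.5 each node lands on the mean of its own data in one step, so the toy run stabilizes after the second round.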
Further, the model aggregation algorithm in step (2) adjusts the aggregation weight of each local model according to the summary information of each data providing node's local model, adjusting the proportion of each local model's parameters in the updated global model parameters so that high-quality model parameters carry more weight in the aggregated model. The contribution of each data providing node to the model is determined by the node's model loss in the k-th local training iteration. The cross-entropy loss [formula image 001, not reproduced] is used to evaluate the local model loss of each node, and the model contribution weight of the model aggregation algorithm is defined as:
[formula image 002, not reproduced]
where [image 003] and [image 004] are the cross-entropy loss functions of node [image 005] and node [image 006] respectively, [image 007] is the input parameter, and [image 008] is the desired output. The update of the global model parameters in the model aggregation algorithm based on the model contribution weights can be expressed as:
[formula image 009, not reproduced]
where [image 010] denotes the global model parameters after round k-1 of aggregation; n is the number of nodes participating in the current round of model aggregation; [image 011] is the model contribution weight of node [image 012] in round k of model aggregation; [image 013] is the model parameters updated locally by node [image 014] in round k; [image 015] is the average gradient of node [image 016] on its local data at [image 017]; and [image 018] denotes the global model parameters after round k of aggregation.
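The exact weight and update formulas appear only as images in the source, so they are not reproduced here. The Python sketch below shows one plausible form under explicit assumptions: each node's contribution weight is a softmax over its negated cross-entropy loss, so a lower loss gives a larger weight, and the new global parameters are the weighted sum of the nodes' locally updated parameters. Both choices are assumptions made for illustration and may differ from the patented formulas.

```python
import math
from typing import List


def contribution_weights(losses: List[float]) -> List[float]:
    """Assumed weighting: softmax over negated losses (lower loss, higher weight)."""
    if not losses:
        raise ValueError("at least one node loss is required")
    lowest = min(losses)
    # Shift by the smallest loss so the largest exponent is exp(0) = 1.
    exps = [math.exp(-(loss - lowest)) for loss in losses]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(local_params: List[List[float]], losses: List[float]) -> List[float]:
    """Weighted sum of the nodes' parameter vectors using the weights above."""
    if len(local_params) != len(losses):
        raise ValueError("one loss per node is required")
    weights = contribution_weights(losses)
    dim = len(local_params[0])
    return [
        sum(p * params[d] for p, params in zip(weights, local_params))
        for d in range(dim)
    ]


# Two hypothetical nodes: the node with the lower loss dominates the result.
print(contribution_weights([0.2, 1.0]))
print(aggregate([[1.0, 2.0], [3.0, 4.0]], [0.2, 1.0]))
```

When all losses are equal the weights are uniform and the result reduces to a plain average of the local parameters.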
Furthermore, smart contracts are written for the joint training process, and each iteration comprises three processes: 1) A task coordination node or data providing node calls a smart contract to issue the joint training task of the software behavior classification model. After the task information is recorded on the blockchain, each data providing node requests and receives the data related to the model training task; in this process, the data providing node calls the smart contract, which obtains the relevant data by reading the ledger. 2) After obtaining the global model parameters, each data providing node trains locally on its private software behavior data set and then sends the resulting model summary information and encrypted model parameters back to the blockchain network. At this point a smart contract function is called that takes the model summary information and model parameters submitted by each node in the round as input, verifies the model summary information in the transaction, and retains only the model data that improves model accuracy or whose sample count reaches the specified minimum. 3) The last process is the aggregation of the software behavior classification models. The smart contract takes as input the model summary information and model parameters kept on the chain by the model contribution evaluation of step (2.3), calculates the model contribution weight of each data providing node, updates the global model using the model aggregation algorithm based on these weights, and then judges whether the global model has converged or the maximum number of iterations has been reached, which triggers either the completion of the software behavior classification joint training task or the release of the next round of the iterative training task.
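The retention rule of the second process can be sketched as a small filter. Hyperledger Fabric chaincode is normally written in Go, JavaScript/TypeScript, or Java, so the Python below is only a language-neutral illustration of the logic, not deployable contract code. The dictionary keys, the comparison against the current global accuracy, and the parameter names are assumptions.

```python
from typing import Dict, List, Union

Update = Dict[str, Union[str, float, int]]


def evaluate_contributions(
    updates: List[Update],
    global_accuracy: float,
    min_samples: int,
) -> List[Update]:
    """Keep an update if it improves accuracy or has enough training samples.

    Mirrors the rule in the text: retain model data that improves model
    accuracy or whose sample count reaches the specified minimum.
    """
    kept: List[Update] = []
    for update in updates:
        improves = float(update["accuracy"]) > global_accuracy
        enough_samples = int(update["sample_count"]) >= min_samples
        if improves or enough_samples:
            kept.append(update)
    return kept


round_updates: List[Update] = [
    {"node_id": "A", "accuracy": 0.93, "sample_count": 50},    # improves accuracy
    {"node_id": "B", "accuracy": 0.80, "sample_count": 5000},  # enough samples
    {"node_id": "C", "accuracy": 0.70, "sample_count": 10},    # neither, dropped
]
kept = evaluate_contributions(round_updates, global_accuracy=0.90, min_samples=1000)
print([u["node_id"] for u in kept])
```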
Furthermore, the nodes in the alliance chain iteratively optimize the model through the joint training process of the software behavior classification model, finally output the best-performing optimized software behavior classification model, and extract it for actual software behavior classification.
Compared with the prior art, the invention has the following advantages:
(1) Deep learning is applied to software behavior classification and control. Multi-dimensional software behavior data is sensed automatically; software process, network traffic, and user interface content data, which express software behavior characteristics more accurately, are fused; and the features of the software behavior data are then learned to classify software behaviors. The classification results are reliable, intelligent control can be carried out according to them, and the system experience of users is improved.
(2) The joint training scheme for the software behavior classification model keeps each node's data local while making full use of every participant's data to cooperatively train a more accurate software behavior classification model, which both protects the privacy of user data and improves model performance.
(3) The traditional federated learning process relies on a central server to drive training, so the openness of the learning process cannot be guaranteed and the whole federated learning process cannot be tracked and traced. To solve these problems, the invention combines blockchain and federated learning to design an alliance chain framework for the joint training task of the software behavior classification model. Without a central node driving the process, the framework stores summary data of the training process, such as model accuracy and loss value, on the blockchain, and the written smart contracts make the federated learning process automated and transparent.
(4) The invention provides a model parameter aggregation algorithm based on model contribution weights, which adjusts the aggregation weight of each model according to the summary information of each data providing node's local model and increases the weight of high-quality model parameters in the aggregated model, enhancing mutual trust between federated learning nodes and the accuracy of the aggregated model.
Drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a flow chart of the present invention for constructing a single-node software behavior classification model;
FIG. 3 is an alliance chain framework of a multi-node joint training task oriented to a software behavior classification model according to the present invention;
FIG. 4 is a flowchart of the model issuing and acquiring training tasks of the present invention;
FIG. 5 is a flow chart of model contribution evaluation in accordance with the present invention;
FIG. 6 is a flow diagram of model aggregation in accordance with the present invention;
FIG. 7 is a functional structure diagram of the intelligent software behavior control client tool of the present invention.
Detailed Description
The invention is further described with reference to the following figures and specific embodiments.
Referring to FIG. 1, the intelligent software behavior control method based on blockchain and federated learning includes the following steps:
(1) Constructing a software behavior classification model for a single node, including sensing and storing software behaviors, constructing a software behavior data set, and designing the software popup user interface content recognition and the network structure of the software behavior classification model.
(2) Constructing an alliance chain for the multi-node joint training task of the software behavior classification model, taking the software behavior classification model constructed in the previous step as the model trained locally by each node, and combining blockchain and federated learning so that the nodes on the alliance chain cooperatively train the software behavior classification model without being driven by a central node.
(3) Inputting the real-time software behavior data generated while a user uses a computer into the constructed software behavior classification model, classifying the software behaviors to obtain a classification result, and intelligently controlling the software behaviors according to the classification result.
The steps are described below with reference to the accompanying drawings. FIG. 2 shows the process of constructing a software behavior classification model for a single node; the specific steps in step (1) are as follows:
(1.1) Software behavior sensing and storing: For the popup internet application software that is ubiquitous on the Windows operating system, and taking into account the technical differences between software built on different development frameworks, the automatic popup mechanism of such software at run time is studied. A software behavior information collection client tool is designed and developed based on the Windows API and the WinPcap library. It senses, in real time, multi-dimensional software behavior data of various types of software, such as window creation, window images, and the resources occupied by running processes (mainly CPU, memory, and network), and stores the data in an SQLite database and PCAP files, providing data support for software behavior modeling and analysis.
(1.2) Constructing a software behavior data set: The stored multi-dimensional software behavior data is cleaned, the behavior data of the popup software is labeled according to whether the user likes the content, and the labeled software behavior data is used as the final training and test data set.
(1.3) Recognizing the content of the software popup user interface: Text makes up the main part of a software popup user interface image, and a user generally learns what a popup contains by reading its text. The method therefore recognizes the text content in the software popup user interface and uses it as part of the input of the subsequent software behavior classification model.
In the advertisement popup user interface images collected here, the text area does not fill the whole image. If a text recognition model is applied directly, it is disturbed by background information and its performance drops considerably. Therefore, before the text recognition model is applied to an advertisement popup user interface image, a text region detection algorithm is generally used to locate the text in the image, and the detected text regions are used as the input of the text recognition model, which improves recognition.
The text region detection network normalizes the text regions in the software popup user interface content. A deep convolutional neural network and a sequence labeling module then integrate character feature extraction and sequence prediction, realizing end-to-end user interface content recognition, and the recognition result of the software popup user interface content is output.
(1.4) Constructing a software behavior classification model: The recognition result of the software popup user interface content, the process behavior data, and the network behavior data are each mapped into feature vectors, the feature vectors are fused with different weights, feature learning is performed through a deep neural network, and the learned feature vector is finally fed into a probability output layer that outputs the probability of each category, thereby classifying the software popup behaviors.
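Steps (1.1) and (1.2) store sensed behavior data in SQLite and later attach user preference labels. The Python sketch below shows how such a store could look using the standard sqlite3 module. The table name, column names, and label values are hypothetical, and the actual capture of window events through the Windows API and of packets through WinPcap is not shown.

```python
import sqlite3
import time
from typing import Optional

# Hypothetical schema: one row per sensed popup event.
SCHEMA = """
CREATE TABLE IF NOT EXISTS popup_events (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    captured_at REAL NOT NULL,
    process_name TEXT NOT NULL,
    window_title TEXT,
    cpu_percent REAL,
    memory_mb REAL,
    screenshot_path TEXT,
    label TEXT
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the behavior store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_event(
    conn: sqlite3.Connection,
    process_name: str,
    window_title: str,
    cpu_percent: float,
    memory_mb: float,
    screenshot_path: Optional[str] = None,
) -> int:
    """Insert one sensed popup event and return its row id."""
    cur = conn.execute(
        "INSERT INTO popup_events "
        "(captured_at, process_name, window_title, cpu_percent, memory_mb, screenshot_path) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (time.time(), process_name, window_title, cpu_percent, memory_mb, screenshot_path),
    )
    conn.commit()
    return int(cur.lastrowid)


def label_event(conn: sqlite3.Connection, event_id: int, label: str) -> None:
    """Attach a user preference label (for example "liked" or "disliked")."""
    conn.execute("UPDATE popup_events SET label = ? WHERE id = ?", (label, event_id))
    conn.commit()


conn = open_store()
event_id = record_event(conn, "example.exe", "Special offer", 3.5, 120.0)
label_event(conn, event_id, "disliked")
print(conn.execute("SELECT process_name, label FROM popup_events").fetchall())
```

Rows whose label is still NULL after cleaning can simply be excluded when the training and test data set is built.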
Combining the alliance chain framework of the software behavior classification model-oriented multi-node joint training task shown in fig. 3, step (2) designs the alliance chain framework of the software behavior classification model-oriented joint training task based on Hyperhedger Fabric, and establishes the decentralized software behavior classification model joint training framework. In order to ensure the privacy of software behavior data and the security of a software behavior classification model, the block chain type used by the framework provided by the method is a federation chain. Compared with a public chain architecture, the alliance chain sets stricter identity authentication conditions for each node, and safety of the system can be improved. Meanwhile, in order to avoid storage space burden caused by data storage on the link to each node and guarantee privacy of local data of a user, the alliance link only records model training task information, abstract information of a node local model and shared model parameter information, and each node does not need to upload local software behavior data. The roles of the nodes in the alliance chain are divided into a task coordination node and a data providing node, the task coordination node is responsible for issuing a software behavior classification model joint training task and initializing a global model, and finally the task is completed by extracting parameters of the global model; the data providing node is responsible for jointly training the task aiming at the software behavior classification model issued by the task coordinating node, training the model by using the local software behavior data set, and uploading abstract information and model parameters of the local model to complete the task.
On the decentralized network on which the alliance chain has been initialized and constructed, the detailed steps of the software behavior classification model joint training process are as follows:
(2.1) The task coordination node issues a learning training task description to the alliance chain system, initializes the global model parameters, and then adds a software behavior classification joint training task; the initial global model is the software behavior classification model constructed in step (1).
(2.2) The data providing nodes participating in the task form a set; each node then obtains the joint training task information and the global model parameters from the alliance chain, decrypts them, and executes federal learning on its local software behavior data set, so that the nodes cooperatively train the same machine learning model.
(2.3) After each round of iterative training, each data providing node initiates a transaction whose content is the summary data of its local model (such as model accuracy and loss value) together with the encrypted model parameters. Endorsement nodes then simulate execution of the transaction and endorse it, the selected ordering service nodes pack the transactions into a block, the block is broadcast to the whole network for verification, and after verification passes the block is committed to the alliance chain. When a data providing node uploads the summary data and encrypted model parameters of its local model, the intelligent contract evaluates the node's contribution to the model and retains on the alliance chain only the model transaction information that meets the requirements.
(2.4) After each round of iterative training, the intelligent contract calculates the contribution weight of each node to the global model, aggregates the model parameters based on these weights, updates the global model, and judges whether the global model has converged or the maximum number of iterations has been reached. If the condition is met, the joint training task is judged to be complete and the task coordination node is notified; if not, the aggregated global model is packed into a block, the next round of the iterative training task is issued, each node starts the next round of iterative training, and execution continues from step (2.2).
(2.5) The task coordination node extracts the global model parameters and verifies the validity of the model, completing the joint training task.
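The control flow of steps (2.1) to (2.5) can be sketched as a loop with pluggable local training, aggregation and stopping rule. Everything in this example is a toy stand-in: the model is a single number, each node takes one gradient step toward a hypothetical local target, aggregation is a plain average, and the stopping rule compares the mean local loss with a threshold. Encryption, endorsement and block packaging are omitted.

```python
def run_joint_training(global_param, nodes, aggregate, has_converged, max_rounds):
    """Repeat local training and aggregation until convergence or max_rounds."""
    rounds_run = 0
    for rounds_run in range(1, max_rounds + 1):
        # (2.2)-(2.3): every node trains locally and submits (params, loss)
        updates = [train_locally(global_param) for train_locally in nodes]
        # (2.4): aggregate the submissions into a new global model
        global_param = aggregate(updates)
        if has_converged(updates):
            break
    # (2.5): the coordinator extracts the final global parameters
    return global_param, rounds_run

def make_node(local_target, learning_rate=0.25):
    """Toy data providing node minimizing (param - local_target) ** 2."""
    def train_locally(param):
        new_param = param - learning_rate * 2.0 * (param - local_target)
        loss = (new_param - local_target) ** 2
        return new_param, loss
    return train_locally

def plain_average(updates):
    """Placeholder aggregation; a contribution-weighted rule would go here."""
    return sum(param for param, _ in updates) / len(updates)

def mean_loss_below(threshold):
    """Build a stopping rule on the mean local loss (an assumed criterion)."""
    return lambda updates: sum(loss for _, loss in updates) / len(updates) < threshold

final_param, rounds_run = run_joint_training(
    0.0, [make_node(1.9), make_node(2.1)],
    plain_average, mean_loss_below(0.05), max_rounds=20)
print(final_param, rounds_run)
```

Passing the aggregation and stopping rules in as functions keeps the loop independent of the particular weighting scheme used on the chain.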
The main existing model aggregation algorithm is federated averaging (FedAvg), but FedAvg does not account for the degradation of global model quality caused by low-quality models participating in aggregation. The invention therefore provides a model aggregation algorithm based on model contribution weight. In step (2), the model aggregation algorithm adjusts the aggregation weight of each local model according to the summary information of each data providing node's local model, thereby adjusting the proportion of each local model's parameters in the updated global model parameters and increasing the proportion of high-quality model parameters in the aggregated model. The contribution of each data providing node to the model is determined by the model loss of that node in the k-th local training iteration. The cross entropy
[formula image: cross-entropy loss]
is used to evaluate the local model loss of each node, and the model contribution weight of the model aggregation algorithm is defined as:
[formula image: model contribution weight]
where [formula image] and [formula image] are the cross-entropy loss functions of node [formula image] and node [formula image] respectively, [formula image] is the input parameter, and [formula image] is the desired output. The process of updating the global model parameters in the model aggregation algorithm based on model contribution weight can be expressed as:
[formula image: global model parameter update]
where [formula image] denotes the global model parameters after round k+1 of aggregation; n is the number of nodes participating in the current round of model aggregation; [formula image] is the model contribution weight of node [formula image] in the k-th round of model aggregation; [formula image] is the model parameters of node [formula image] after its local update in round k; [formula image] is the average gradient of node [formula image] on its local data at [formula image]; and [formula image] is the global model parameters after round k of aggregation.
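A minimal sketch of loss-based weighted aggregation follows. It assumes a contribution weight of the form exp(-loss), normalized over the participating nodes, so that nodes with lower cross-entropy loss receive a larger share. This particular normalization is an assumption made for illustration and is not taken from the patent's formula; the function names are likewise hypothetical, and encryption is left out.

```python
import math

def contribution_weights(losses):
    """Map per-node losses to weights that sum to 1 (lower loss, higher weight)."""
    lowest = min(losses)  # shift by the minimum for numerical stability
    scores = [math.exp(-(loss - lowest)) for loss in losses]
    total = sum(scores)
    return [score / total for score in scores]

def aggregate(local_params, losses):
    """Weighted average of the nodes' parameter vectors."""
    weights = contribution_weights(losses)
    dim = len(local_params[0])
    return [sum(w * params[d] for w, params in zip(weights, local_params))
            for d in range(dim)]

# Two nodes with equal loss reduce to a plain average.
print(aggregate([[1.0, 0.0], [3.0, 2.0]], [0.5, 0.5]))
# A node with much lower loss dominates the aggregate.
print(aggregate([[0.0], [10.0]], [0.1, 5.0]))
```

With equal losses the rule coincides with unweighted averaging, which makes the effect of the weighting easy to compare against FedAvg on equally sized data sets.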
The global model G_k is considered to have converged when the value evaluated for G_k is less than a preset value H, or when the number of iterations reaches the maximum number of iterations MaxIterationNum, namely:
[formula image: convergence condition]
The model aggregation algorithm based on model contribution weight is shown as Algorithm 1.
Algorithm 1: Model aggregation based on model contribution weight
Input: participating nodes [formula image]; number of iterations [formula image]; encrypted global model parameters [formula image]
Output: encrypted global model parameters [formula image]
while [formula image] do
    for [formula image] do
        obtain the latest model parameters and decrypt [formula image] to obtain [formula image]; set [formula image]
        for each local iteration j from 1 to the number of iterations S do
            obtain the local model parameters from the last iteration, i.e. set [formula image]
            for each batch from 1 to the number of batches B of the batch data set do
                calculate the gradient on the batch: [formula image]
                locally update the model parameters: [formula image]
                calculate the loss of the locally updated model: [formula image]
            end for
        end for
        obtain the local model parameter update [formula image]; perform additive homomorphic encryption [formula image] on [formula image]; upload the result, together with the local model loss [formula image], to the blockchain
    the smart contract aggregates the model parameters by taking the weighted average of the received model parameters, i.e. [formula image]
end while
return [formula image]
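Algorithm 1 relies on additive homomorphic encryption so that parameters can be combined without being decrypted. The patent does not name a scheme; the sketch below assumes Paillier, a well-known additively homomorphic cryptosystem, with toy key sizes and a fixed-point encoding chosen purely for illustration. It is not secure (tiny primes, non-cryptographic random source) and all names are hypothetical.

```python
import math
import random

P, Q = 2**31 - 1, 2**61 - 1  # two Mersenne primes; toy-sized, NOT secure
N = P * Q
N2 = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow((pow(N + 1, LAMBDA, N2) - 1) // N, -1, N)    # modular inverse
SCALE = 10**6  # fixed-point scale for encoding real-valued parameters

def encrypt(value, rng):
    """Encrypt a real number using generator g = N + 1."""
    message = round(value * SCALE) % N  # negatives wrap around modulo N
    r = rng.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = rng.randrange(1, N)
    return (pow(N + 1, message, N2) * pow(r, N, N2)) % N2

def decrypt(cipher):
    """Recover the real number, mapping the upper half of Z_N to negatives."""
    message = ((pow(cipher, LAMBDA, N2) - 1) // N) * MU % N
    if message > N // 2:
        message -= N
    return message / SCALE

def add_encrypted(c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts."""
    return (c1 * c2) % N2

def scale_encrypted(cipher, k):
    """Raising a ciphertext to an integer k multiplies the plaintext by k."""
    return pow(cipher, k, N2)

rng = random.Random(42)  # use a cryptographic source in any real system
node_a = encrypt(0.25, rng)
node_b = encrypt(-1.5, rng)
print(decrypt(add_encrypted(node_a, node_b)))          # sum of both plaintexts
print(decrypt(scale_encrypted(encrypt(0.5, rng), 3)))  # plaintext times 3
```

Because only addition and multiplication by plaintext integers are available, a weighted average over ciphertexts would need the weights expressed as scaled integers, with the common scale divided out after decryption.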
In this way, the model parameter aggregation algorithm based on model contribution weight evaluates the local model loss value of each node and adjusts the proportion of each local model's parameters in the updated global model parameters, which enhances the fairness of the federal learning process and the accuracy of the aggregated model.
As shown in figs. 4-6, the invention automates the software behavior classification model joint training process and makes it transparent by writing intelligent contracts. During joint training, each iteration includes three main processes: 1) The task coordination node, or the intelligent contract, issues the software behavior classification model joint training task through a contract call; after the task information is recorded on the blockchain, each data providing node requests and receives the data related to the model training task, which it obtains by calling the intelligent contract to query the ledger. 2) After obtaining the global model parameters, each data providing node trains locally on its private software behavior data set and sends the resulting model summary information and encrypted model parameters back to the blockchain network. At this point a function of the intelligent contract is called; taking as input the model summary information and model parameters submitted by each node in the current round, it verifies the model summary information in each transaction and retains only the model data that improves the model accuracy or whose sample count reaches the specified minimum sample count. 3) The final process is the aggregation of the software behavior classification models. Taking as input the model summary information and model parameters retained on the chain by the model contribution evaluation in the previous process, the intelligent contract calculates the model contribution weight of each data providing node, updates the global model with the model aggregation algorithm based on model contribution weight, and then judges whether the global model has converged or the maximum number of iterations has been reached, which triggers either the completion of the software behavior classification joint training task or the issuing of the next round of the iterative training task.
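The contribution evaluation in process 2) can be sketched as a filter over the submissions of one round. The rule below keeps a submission when its reported accuracy exceeds the current global accuracy or when its sample count reaches a configured minimum; the dictionary keys, the comparison against the global accuracy, and the function names are assumptions for illustration, and in practice this logic would live in chaincode rather than in a standalone script.

```python
def keep_update(summary, global_accuracy, min_samples):
    """Return True when a submission passes the (assumed) contribution rule."""
    improves_accuracy = summary["accuracy"] > global_accuracy
    enough_samples = summary["sample_count"] >= min_samples
    return improves_accuracy or enough_samples

def filter_updates(submissions, global_accuracy, min_samples):
    """Keep only the submissions that are retained on the chain."""
    return [s for s in submissions
            if keep_update(s, global_accuracy, min_samples)]

submissions = [
    {"node_id": "node-A", "accuracy": 0.92, "sample_count": 300},   # better accuracy
    {"node_id": "node-B", "accuracy": 0.80, "sample_count": 1500},  # enough samples
    {"node_id": "node-C", "accuracy": 0.75, "sample_count": 200},   # neither
]
retained = filter_updates(submissions, global_accuracy=0.88, min_samples=1000)
print([s["node_id"] for s in retained])
```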
According to the invention, on the premise of protecting user data privacy, the alliance chain framework based on Hyperledger Fabric and federal learning can use the software behavior data of multiple users to train the model cooperatively, which improves the accuracy of the software behavior classification model and allows popup behaviors to be classified and controlled according to the software preferences of user groups; a joint training process driven by the nodes without a central node can be realized, ensuring mutual trust among the nodes and the credibility of the aggregated model; the whole joint training process is automated and made transparent by writing and deploying intelligent contracts; and the summary data produced during training is stored on the chain, which facilitates subsequent tracking and auditing of the joint training task.
When the software behavior classification model is used, each node in the alliance chain continuously and iteratively optimizes the model through the software behavior classification model joint training process; a software behavior classification model with better performance is finally output, and this model is extracted for actual software behavior classification.
Referring to fig. 7, the invention designs a software behavior intelligent control client tool. While the user is using the computer, the tool senses the behavior of the various kinds of software in the system in real time, collects the related software behavior data, and calls the software behavior classification model to classify the software behavior in real time. The user can define a software behavior control strategy in the client, and intelligent control is carried out according to the classification result: a software behavior classified as disliked by the user is stopped, for example by ending its process, while a software behavior classified as liked by the user is retained. The client tool also records the software behaviors classified as disliked by the user, including data such as the software name, the software service provider and the number of occurrences of the same software behavior, and the user can periodically query these statistics. For software whose behavior is frequently classified as disliked, the client asks the user whether to uninstall the software and, once the user confirms, uninstalls it automatically.
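The client-side control strategy can be sketched as a small policy object that maps each classification result to actions and keeps the per-software statistics described above. The label strings, action names and the uninstall threshold are hypothetical; actually ending a process or uninstalling software is platform-specific and is deliberately left out.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class BehaviorController:
    """Hypothetical policy: decide actions from classification results."""
    uninstall_threshold: int = 3  # assumed meaning of "frequently disliked"
    disliked_counts: Counter = field(default_factory=Counter)

    def handle(self, software_name, predicted_label):
        """Return the list of actions for one classified software behavior."""
        if predicted_label == "liked":
            return ["keep"]
        # Disliked behavior: record it and stop it.
        self.disliked_counts[software_name] += 1
        actions = ["terminate_process"]
        if self.disliked_counts[software_name] >= self.uninstall_threshold:
            actions.append("prompt_uninstall")
        return actions

    def statistics(self):
        """Per-software counts of disliked behaviors, most frequent first."""
        return self.disliked_counts.most_common()

controller = BehaviorController(uninstall_threshold=2)
print(controller.handle("NewsPopup", "liked"))
print(controller.handle("AdTool", "disliked"))
print(controller.handle("AdTool", "disliked"))
print(controller.statistics())
```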
In summary, the invention learns the behavior characteristics of the software from multiple dimensions, such as the process data, the network flow and the user interface content of the software, so as to realize the classification of the software behavior and implement the management and control of the software behavior in time according to the classification result. In addition, the invention can build an information sharing mechanism by means of multi-user software behavior data, and realize intelligent control of software behaviors according to the software preferences of user groups by combining block chains and a federal learning technology.
It is understood that the above description is not intended to limit the present invention, and the present invention is not limited to the above examples, and those skilled in the art should understand that they can make various changes, modifications, additions and substitutions within the spirit and scope of the present invention.

Claims (6)

1. The intelligent software behavior control method based on the block chain and the federal learning is characterized by comprising the following steps of:
(1) constructing a software behavior classification model of a single node, comprising the steps of sensing and storing software behaviors, constructing a software behavior data set, designing software popup user interface content identification and a software behavior classification model network structure;
(2) constructing an alliance chain of a multi-node joint training task oriented to a software behavior classification model, taking the software behavior classification model constructed in the last step as a model for local training of each node, and combining a block chain and a federal learning technology to realize that the software behavior classification model is trained in a multi-party cooperation mode under the condition that each node on the alliance chain is not driven by a central node, wherein the alliance chain architecture comprises a design software behavior classification model joint training task, a definition on-chain block structure and a joint training flow, a design model aggregation algorithm and a part for compiling an intelligent contract in the software behavior classification model joint training process;
(3) designing a software behavior intelligent control client tool, sensing software behavior data generated when a user uses a computer in real time, inputting the constructed software behavior classification model, classifying the software behavior, obtaining a classification result, and intelligently controlling the software behavior according to the classification result.
2. The intelligent software behavior control method based on the block chain and the federal learning according to claim 1, wherein the specific steps in the step (1) are as follows:
(1.1) software behavior sensing and storing: analyzing the automatic pop-up window principle of different pop-up window software, sensing the behavior data of various multidimensional software in real time, and storing the behavior data;
(1.2) constructing a software behavior data set: performing data cleaning on the stored multi-dimensional software behavior data, labeling the behavior data of the pop-up window software, judging whether the user likes the content and labeling, and taking the labeled software behavior data as a final training and testing data set;
(1.3) identifying the content of the software popup user interface: recognizing text contents in the software popup user interface, and taking the text contents as a part of input of a subsequent software behavior classification model;
(1.4) constructing a software behavior classification model: respectively mapping the recognition result, the process behavior data and the network behavior data of the software popup user interface content into a feature vector form, fusing the feature vectors by different weights, then performing feature learning through a deep neural network, finally inputting the learned feature vectors into a probability output layer, outputting the probability of each category, and realizing the classification of the software popup behaviors.
3. The intelligent software behavior control method based on the blockchain and the federal learning of claim 1, wherein step (2) designs, based on Hyperledger Fabric, an alliance chain architecture oriented to the software behavior classification model joint training task, and establishes a decentralized software behavior classification model joint training framework; the type of the blockchain used by the framework is an alliance chain, the alliance chain records only model training task information, summary information of each node's local model and shared model parameter information, and no node needs to upload local software behavior data; the roles of the nodes in the alliance chain are divided into task coordination nodes and data providing nodes, the task coordination nodes being responsible for issuing software behavior classification model joint training tasks, initializing global models and finally completing the tasks by extracting global model parameters; the data providing nodes being responsible for carrying out the software behavior classification model joint training task issued by the task coordination node, training the model by using a local software behavior data set, and uploading summary information and model parameters of the local model to complete the task;
on the decentralized network on which the alliance chain has been initialized and constructed, the detailed steps of the software behavior classification model joint training process are as follows:
(2.1) the task coordination node issues a learning training task description to the alliance chain system, initializes global model parameters, and then adds a software behavior classification joint training task, wherein the initial global model is the software behavior classification model constructed in the step (1);
(2.2) the data providing nodes participating in the task form a set, then acquire the joint training task information and the global model parameters from the alliance chain and decrypt them, execute federal learning based on a local software behavior data set, and cooperatively train the same machine learning model;
(2.3) after each round of iterative training is finished, each data providing node takes the summary data of the local model and the encrypted model parameters as transaction contents to initiate a transaction, then the intelligent contract evaluates the contribution of each node to the model, and only the model transaction information meeting the requirements is retained on the alliance chain;
(2.4) after each round of iterative training is finished, the intelligent contract calculates the contribution weight of each node to the global model, aggregates the model parameters based on the weight, updates the global model, and judges whether the global model has converged or the maximum number of iterations has been reached; if the condition is met, the joint training task is judged to be finished and the task coordination node is notified; if the condition is not met, the aggregated global model is packed into a block and the next round of the iterative training task is issued, each node starts the next round of iterative training, and execution continues from step (2.2);
(2.5) the task coordination node extracts the global model parameters and verifies the validity of the model to finish the joint training task.
4. The intelligent software behavior control method based on the blockchain and federal learning of claim 3, wherein in the step (2), the model aggregation algorithm adjusts the aggregation weight of each local model according to the summary information of each data providing node local model, adjusts the proportion of local model parameters in the updated global model parameters, and increases the proportion of high-quality model parameters in the aggregation model; the contribution degree of each data providing node to the model is determined by the model loss of the node in the k-th local training iteration, and the cross entropy
[formula image: cross-entropy loss]
is used to evaluate the local model loss of each node, and the model contribution weight of the model aggregation algorithm is defined as:
[formula image: model contribution weight]
wherein [formula image] and [formula image] are respectively the cross-entropy loss functions of node [formula image] and node [formula image], [formula image] is the input parameter, and [formula image] is the desired output; the process of updating the global model parameters in the model aggregation algorithm based on the model contribution weight can be represented as:
[formula image: global model parameter update]
wherein [formula image] denotes the global model parameters after round k+1 of aggregation; n is the number of nodes participating in the model aggregation in the current round; [formula image] is the model contribution weight of node [formula image] in the k-th round of model aggregation; [formula image] is the model parameters of node [formula image] after its local update in round k; [formula image] is the average gradient of node [formula image] on its local data at [formula image]; and [formula image] is the global model parameters after round k of aggregation.
5. The intelligent software behavior management and control method based on the blockchain and the federal learning of claim 3, wherein an intelligent contract is written for the joint training process, and each iteration comprises three processes: 1) a task coordination node or a data providing node calls the intelligent contract to issue the software behavior classification model joint training task; after the task information is recorded on the blockchain, each data providing node requests and receives the data related to the model training task, which it obtains by calling the intelligent contract to query the ledger; 2) after acquiring the global model parameters, each data providing node carries out local training based on a private software behavior data set, then sends the obtained model summary information and the encrypted model parameters back to the blockchain network; at this moment, a function of the intelligent contract is called, the function takes the model summary information and the model parameters transmitted by each node in the round as input, verifies the model summary information in the transaction, and retains only the model data that improves the model accuracy or whose sample count reaches the specified minimum sample count; 3) the last process is the aggregation of software behavior classification models: the intelligent contract takes as input the model summary information and the model parameters retained on the chain through the model contribution evaluation process in the step (2.3), calculates the model contribution weight of each data providing node, updates the global model based on the model aggregation algorithm of the model contribution weight, then judges whether the global model has converged or the maximum number of iterations has been reached, and accordingly triggers either the completion of the software behavior classification joint training task or the issuing of the next round of the iterative training task.
6. The intelligent software behavior control method based on the blockchain and the federal learning of claim 5, wherein each node in the alliance chain continuously iterates the optimization model through a software behavior classification model joint training process, and finally outputs an optimized software behavior classification model with the best effect, and the model is extracted for actual software behavior classification.
CN202210610745.4A 2022-06-01 2022-06-01 Intelligent software behavior control method based on block chain and federal learning Active CN114692098B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210610745.4A CN114692098B (en) 2022-06-01 2022-06-01 Intelligent software behavior control method based on block chain and federal learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210610745.4A CN114692098B (en) 2022-06-01 2022-06-01 Intelligent software behavior control method based on block chain and federal learning

Publications (2)

Publication Number Publication Date
CN114692098A (en) 2022-07-01
CN114692098B (en) 2022-08-26

Family

ID=82131116

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210610745.4A Active CN114692098B (en) 2022-06-01 2022-06-01 Intelligent software behavior control method based on block chain and federal learning

Country Status (1)

Country Link
CN (1) CN114692098B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190222586A1 (en) * 2018-01-17 2019-07-18 Trust Ltd. Method and system of decentralized malware identification
CN111291394A (en) * 2020-01-31 2020-06-16 腾讯科技(深圳)有限公司 False information management method, false information management device and storage medium
US20200193292A1 (en) * 2018-12-04 2020-06-18 Jinan University Auditable privacy protection deep learning platform construction method based on block chain incentive mechanism
CN112950222A (en) * 2021-04-08 2021-06-11 腾讯科技(深圳)有限公司 Resource processing abnormity detection method and device, electronic equipment and storage medium
CN113657608A (en) * 2021-08-05 2021-11-16 浙江大学 Excitation-driven block chain federal learning method
CN113992360A (en) * 2021-10-01 2022-01-28 浙商银行股份有限公司 Block chain cross-chain-based federated learning method and equipment
CN114491616A (en) * 2021-12-08 2022-05-13 杭州趣链科技有限公司 Block chain and homomorphic encryption-based federated learning method and application

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHILI HU et al.: "2021 IEEE International Conference on Blockchain (Blockchain)", 24 January 2022 *
Zhang Junru et al.: "Federated secure tree algorithm for user privacy protection", Journal of Computer Applications (《计算机应用》) *

Also Published As

Publication number Publication date
CN114692098B (en) 2022-08-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20231102

Address after: 27a, Haiya building, Binhai garden, 1 Shandong Road, Shinan District, Qingdao City, Shandong Province 266071

Patentee after: Shandong Zhidou Digital Technology Co.,Ltd.

Address before: 266100 Shandong Province, Qingdao city Laoshan District Songling Road No. 238

Patentee before: OCEAN University OF CHINA