CN115329833A - Logistics system abnormal data identification method based on block chain - Google Patents

Logistics system abnormal data identification method based on block chain

Info

Publication number
CN115329833A
Authority
CN
China
Prior art keywords
data
training set
sample
weight
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210760438.4A
Other languages
Chinese (zh)
Inventor
周媛媛
李晓辉
沈八中
苏家楠
吕思婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Institute of Technology of Xidian University
Original Assignee
Guangzhou Institute of Technology of Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Institute of Technology of Xidian University
Priority to CN202210760438.4A
Publication of CN115329833A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2379 Updates performed during online database operations; commit processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/083 Shipping
    • G06Q10/0838 Historical data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Economics (AREA)
  • Computing Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Development Economics (AREA)
  • Mathematical Physics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Medical Informatics (AREA)
  • Evolutionary Computation (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a blockchain-based method for identifying abnormal data in a logistics system, and relates to the technical field of blockchains. The method comprises: initializing sample data to obtain a plurality of training sets, together with an initialized sample weight and a misclassification cost corresponding to each training set; performing a preset number of iterations with the initialized sample weight and the misclassification cost as parameters to obtain a preset number of weak classifiers, and combining them into a strong classifier for the training set; performing weighted aggregation on the strong classifiers of all the training sets, and determining by voting whether the transaction data is abnormal; and packaging the normal data into a second block and appending the second block to the blockchain. By combining ensemble learning with a cost-sensitive imbalanced-data classification algorithm, the method can effectively improve the accuracy of imbalanced data classification: even when abnormal data makes up only a small proportion of the whole, the trained strong classifiers can judge it accurately, which avoids the losses caused by abnormal data being packaged and written to the chain.

Description

Logistics system abnormal data identification method based on block chain
Technical Field
The invention relates to the technical field of blockchains, and in particular to a blockchain-based method for identifying abnormal data in a logistics system.
Background
In recent years, electronic commerce has developed rapidly, and logistics has become increasingly intertwined with various internet technologies. Although logistics technology has improved quickly, many problems remain to be solved, one of which is the tracing of logistics information. Tracing requires not only that transportation information such as the origin, route, and destination of goods can be queried, but also that the queried information is authentic. With traceable logistics information, goods can be tracked effectively, and when goods pass through risk areas or come into contact with at-risk personnel, this can be learned and handled in time, which reduces risk to a certain extent.
A traditional logistics system is deployed in a centralized manner, so manufacturers and distributors could tamper with logistics information, and the accuracy of the information cannot be guaranteed. Meanwhile, blockchain networks have attracted widespread attention because they are highly transparent, decentralized, tamper-resistant, and anonymous. Compared with a traditional logistics system, a blockchain makes data tamper-resistant and supports tracing of logistics goods, effectively breaks down information silos, and avoids the problems caused by malicious modification of information.
In the prior art, blockchain technology ensures that data already on the chain cannot be tampered with, but it cannot ensure the authenticity and accuracy of the data that participants submit. If a malicious party submits malicious or abnormal data and the data is written to the chain directly without any check, losses are easily caused.
Disclosure of Invention
The present invention aims to solve the problems described in the background, and provides a blockchain-based method for identifying abnormal data in a logistics system.
The purpose of the invention can be achieved by the following technical solution:
The blockchain-based method for identifying abnormal data in a logistics system provided by an embodiment of the invention comprises the following steps:
acquiring transaction data submitted to a to-be-audited block by a user blockchain node;
initializing sample data to obtain a plurality of training sets and an initialized sample weight and a misclassification cost corresponding to each training set, the sample data comprising the transaction data and historical data of the user blockchain node;
for each training set, performing a preset number of iterations with the initialized sample weight and the misclassification cost as parameters to obtain a preset number of weak classifiers, and combining the preset number of weak classifiers into a strong classifier of the training set;
performing weighted aggregation on the strong classifiers of all the training sets, and determining by voting whether the transaction data is abnormal;
and broadcasting the judgment result for the transaction data on the blockchain, packaging the normal data into a second block, and appending the second block to the blockchain.
Optionally, before acquiring the transaction data submitted to the to-be-audited block by the user blockchain node, the method further comprises:
acquiring an account and a password entered by a target user through the user blockchain node, and comparing them with the information filled in at registration and recorded in the User table of the database, so as to verify the identity of the target user;
and if the identity of the target user is verified and the user blockchain node uploads transaction data, storing the transaction data in the to-be-audited block.
Optionally, initializing the sample data to obtain a plurality of training sets and an initialized sample weight and a misclassification cost corresponding to each training set comprises:
dividing the sample data into N training sets;
representing each training set as:
$S=\{(x_1,y_1),(x_2,y_2),\ldots,(x_K,y_K)\mid y\in\{1,-1\}\}$
where K is the total number of samples in the training set S, $x_k$ is the data of the k-th sample, and $y_k$ indicates whether the k-th sample is normal: $y_k=-1$ denotes a normal-class sample and $y_k=1$ denotes an abnormal-class sample;
calculating the initialized sample weight $D_1$ for the first iteration on the training set:
Figure BDA0003720878220000031
where $w_{1k}$ is the first-iteration weight of each sample in the training set S;
calculating the misclassification cost $C_k$ of the training set:
Figure BDA0003720878220000032
where n is the number of majority-class samples in the training set, m is the number of minority-class samples in the training set, and K is the total number of samples in the training set S.
Optionally, for each training set, performing a preset number of iterations with the initialized sample weight and the misclassification cost as parameters to obtain a preset number of weak classifiers, and combining the preset number of weak classifiers into a strong classifier of the training set, comprises the following steps:
step one, for each training set, extracting part of the data as a learning training set;
step two, training on the learning training set with its current learning sample weight to obtain a weak classifier, the learning sample weight being the initialized sample weight in the first iteration;
step three, updating the learning sample weight for the next iteration according to the current learning sample weight and the misclassification cost;
step four, using the updated learning sample weight, repeating steps one to three until the preset number of iterations is completed, to obtain the preset number of weak classifiers;
step five, after T iterations on the training set S, obtaining a set of weak classifiers $f=(f_1,f_2,\ldots,f_T)$, and combining the weak classifier set $f$ into a strong classifier $F_i$:
Figure BDA0003720878220000033
where i denotes the i-th of the N training sets, T denotes the number of iterations on the training set S, $\alpha_t$ denotes the weight of the weak classifier $f_t$ at the t-th iteration, and the sign function outputs 1 or -1.
Optionally, each training set includes normal-class samples and abnormal-class samples; because the normal-class samples outnumber the abnormal-class samples, the normal-class samples are called majority-class samples and the abnormal-class samples are called minority-class samples. It is assumed that each training set comprises m majority-class samples and n minority-class samples;
for each training set, extracting part of the data as a learning training set specifically comprises:
for each training set, sorting the m majority-class samples in descending order of initialized sample weight, and combining the first n majority-class samples with the n minority-class samples into a new set that serves as the learning training set.
Optionally, updating the learning sample weight for the next iteration according to the learning sample weight and the misclassification cost comprises:
taking the current iteration to be the t-th iteration, calculating the error rate $e_t$ of the weak classifier $f_t$ in this iteration:
Figure BDA0003720878220000041
where $D_t(x_k)$ denotes the learning sample weight of the data $x_k$ of the k-th sample at the t-th iteration;
calculating the weight $\alpha_t$ of the weak classifier $f_t$ trained in the t-th iteration:
Figure BDA0003720878220000042
calculating the sample weight adjustment factor $\beta_k$:
$\beta_k=-0.5\,(y_k f_t(x_k))\,C_k+0.5$
where $y_k$ is a variable with value 1 or -1, $f_t(x_k)$ is the output of the weak classifier $f_t$ trained in the t-th iteration for the data of the k-th sample, and $C_k$ is the misclassification cost;
updating the learning sample weight for iteration t+1 to obtain $D_{t+1}$:
Figure BDA0003720878220000043
where $D_t(x_k)$ is the weight of the k-th sample at the t-th iteration (the initialized weight when $t=1$), $\alpha_t$ is the weight of the weak classifier $f_t$ of the t-th iteration, $\beta_k$ is the weight adjustment factor of the k-th sample, and $z_t$ is a normalization factor.
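As a worked illustration of the adjustment factor: the product $y_k f_t(x_k)$ is 1 when the weak classifier labels the k-th sample correctly and -1 otherwise, so $\beta_k$ takes only two values for a given cost. The value $C_k=0.8$ used below is an assumed number chosen for illustration, not one given in the source.

```latex
\beta_k = -0.5\,\bigl(y_k f_t(x_k)\bigr)\,C_k + 0.5 =
\begin{cases}
0.5\,(1 - C_k), & f_t(x_k) = y_k \\
0.5\,(1 + C_k), & f_t(x_k) \neq y_k
\end{cases}
\qquad \text{e.g. } C_k = 0.8:\quad \beta_k = 0.1 \ \text{(correct)},\quad \beta_k = 0.9 \ \text{(misclassified)}
```

A misclassified sample with a high cost therefore receives the largest factor, and a correctly classified sample with a high cost the smallest.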
Optionally, it is assumed that there are N training sets and that the training set S is any one of them; the strong classifier $F_i$ of the training set S gives a predicted value of 1 or -1 for the input data;
performing weighted aggregation on the strong classifiers of all the training sets and determining by voting whether the transaction data is abnormal comprises:
performing weighted aggregation on the strong classifiers of the N training sets to obtain a set of strong classifiers $F=\{F_1,F_2,\ldots,F_N\}$;
taking the predicted values of the strong classifiers in F as the input of the voting decision function, and calculating the output predicted value P:
Figure BDA0003720878220000051
where N is the number of training sets, $F_i(x)$ is the predicted value that the strong classifier $F_i$, trained on the i-th training set, gives for the input data, and the sign function outputs 1 or -1;
determining from the predicted value P whether the transaction data is abnormal:
if the predicted value P is greater than or equal to 0, the transaction data is abnormal data;
and if the predicted value P is less than 0, the transaction data is normal data.
In the blockchain-based method for identifying abnormal data in a logistics system, transaction data submitted to a to-be-audited block by a user blockchain node is acquired; sample data, which comprises the transaction data and historical data of the user blockchain node, is initialized to obtain a plurality of training sets and an initialized sample weight and a misclassification cost corresponding to each training set; for each training set, a preset number of iterations is performed with the initialized sample weight and the misclassification cost as parameters to obtain a preset number of weak classifiers, which are combined into a strong classifier of the training set; the strong classifiers of all the training sets are aggregated by weighting, and whether the transaction data is abnormal is determined by voting; and the judgment result for the transaction data is broadcast on the blockchain, the normal data is packaged into a second block, and the second block is appended to the blockchain. The method combines ensemble learning with a cost-sensitive imbalanced-data classification algorithm: ensemble learning builds a strong classifier out of trained weak classifiers, and the cost-sensitive algorithm gives abnormal samples more importance by assigning different weights to the samples. Combining the two can effectively improve the accuracy of imbalanced data classification, so that even when abnormal data makes up only a small proportion of the whole, the trained strong classifiers can judge it accurately, which avoids the losses caused by abnormal data being packaged and written to the chain.
Drawings
The invention will be further described with reference to the accompanying drawings.
Fig. 1 is a flowchart of a method for identifying abnormal data of a logistics system based on a block chain according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a logistics system provided by an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the invention provides a blockchain-based method for identifying abnormal data in a logistics system. Referring to Fig. 1, Fig. 1 is a flowchart of the method according to an embodiment of the present invention. The method may comprise the following steps:
S101, acquiring transaction data submitted to a to-be-audited block by a user blockchain node.
S102, initializing the sample data to obtain a plurality of training sets and the initialized sample weight and misclassification cost corresponding to each training set.
S103, for each training set, performing a preset number of iterations with the initialized sample weight and the misclassification cost as parameters to obtain a preset number of weak classifiers, and combining the preset number of weak classifiers into a strong classifier of the training set.
S104, performing weighted aggregation on the strong classifiers of all the training sets, and determining by voting whether the transaction data is abnormal.
S105, broadcasting the judgment result for the transaction data on the blockchain, packaging the normal data into a second block, and appending the second block to the blockchain.
The sample data includes the transaction data and historical data of the user blockchain node.
The method combines ensemble learning with a cost-sensitive imbalanced-data classification algorithm: ensemble learning builds a strong classifier out of trained weak classifiers, and the cost-sensitive algorithm gives abnormal samples more importance by assigning different weights to the samples. Combining the two can effectively improve the accuracy of imbalanced data classification, so that even when abnormal data makes up only a small proportion of the whole, the trained strong classifiers can judge it accurately, which avoids the losses caused by abnormal data being packaged and written to the chain.
In one implementation, the user blockchain nodes are electronic terminals, such as computers and mobile phones, connected to the blockchain network. A user blockchain node can package transaction data and submit it to the to-be-audited block.
In one implementation, normal data is recorded on the blockchain in the form of transactions through a smart contract. When new data needs to be written, the system looks up the hash of the most recently stored transaction and writes the data into the block. The transaction data in the smart contract may include a transaction address, transaction content, a transaction date, the signature of the person making the transaction, and the like.
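The linkage described above, in which each new write refers to the hash of the last stored transaction, can be sketched as follows. This is a minimal illustration and not the smart contract itself: the names `record_hash`, `append_record`, and `verify_chain`, the record fields, the all-zero genesis value, and the use of SHA-256 over a JSON serialization are all assumptions made for the example.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # assumed placeholder meaning "no previous transaction"

def record_hash(prev_hash, content):
    """Hash the previous transaction hash together with the new content."""
    payload = json.dumps({"prev_hash": prev_hash, "content": content}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def append_record(chain, content):
    """Append a record that points at the hash of the last stored record."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    chain.append({"prev_hash": prev_hash, "content": content,
                  "hash": record_hash(prev_hash, content)})

def verify_chain(chain):
    """Check that every record hashes correctly and links to its predecessor."""
    prev_hash = GENESIS_HASH
    for rec in chain:
        if rec["prev_hash"] != prev_hash:
            return False
        if rec["hash"] != record_hash(rec["prev_hash"], rec["content"]):
            return False
        prev_hash = rec["hash"]
    return True

chain = []
append_record(chain, {"date": "2022-06-30", "content": "goods left warehouse"})
append_record(chain, {"date": "2022-07-01", "content": "goods arrived at hub"})
```

Because each record's hash covers the previous hash, changing any stored content makes `verify_chain` fail from that record onward.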
In one embodiment, before S101, the method may further include:
acquiring an account and a password entered by the target user through the user blockchain node, and comparing them with the information filled in at registration and recorded in the User table of the database, so as to verify the identity of the target user;
and if the identity of the target user is verified and the user blockchain node uploads transaction data, storing the transaction data in the to-be-audited block.
In one implementation, the target user first verifies their identity. Account types may include producer, dealer, transporter, and the like. The back end automatically reads the Account and Password entered by the user and compares them with the information filled in at registration and recorded in the User table of the database; User is the user information type and includes attributes such as the user type (whether the user is an administrator), the Account, and the Password. After identity verification, the target user can fill in data about the logistics goods, including the production environment, transfer time, transportation environment, distribution personnel, and the like.
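A minimal sketch of this gate is given below, assuming an in-memory dictionary in place of the database User table. The names `USER_TABLE`, `verify_identity`, `submit_transaction`, and `pending_block` are hypothetical, and storing a SHA-256 digest of the password is an assumption made for the example; the source only says that the entered account and password are compared with the registered information.

```python
import hashlib
import hmac

# Hypothetical stand-in for the database User table (account -> attributes).
USER_TABLE = {
    "producer01": {
        "user_type": "producer",
        "password_hash": hashlib.sha256(b"s3cret").hexdigest(),
    },
}

pending_block = []  # hypothetical stand-in for the to-be-audited block

def verify_identity(account, password):
    """Compare the entered credentials with the registered record."""
    row = USER_TABLE.get(account)
    if row is None:
        return False
    candidate = hashlib.sha256(password.encode("utf-8")).hexdigest()
    # Constant-time comparison of the two hex digests.
    return hmac.compare_digest(candidate, row["password_hash"])

def submit_transaction(account, password, transaction):
    """Store the transaction in the to-be-audited block only after verification."""
    if not verify_identity(account, password):
        return False
    pending_block.append(transaction)
    return True

accepted = submit_transaction("producer01", "s3cret", {"transfer_time": "2022-06-30"})
rejected = submit_transaction("producer01", "wrong", {"transfer_time": "2022-06-30"})
```

A real deployment would normally use a salted, deliberately slow password hash rather than a bare SHA-256 digest; the plain digest here only keeps the sketch short.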
In one embodiment, step S102 includes:
Step 1, dividing the sample data into N training sets;
each training set is represented as:
$S=\{(x_1,y_1),(x_2,y_2),\ldots,(x_K,y_K)\mid y\in\{1,-1\}\}$
where K is the total number of samples in the training set S, $x_k$ is the data of the k-th sample, and $y_k$ indicates whether the k-th sample is normal: $y_k=-1$ denotes a normal-class sample and $y_k=1$ denotes an abnormal-class sample.
Step 2, calculating the initialized sample weight $D_1$ for the first iteration on the training set:
Figure BDA0003720878220000081
where $w_{1k}$ is the first-iteration weight of each sample in the training set S.
Step 3, calculating the misclassification cost $C_k$ of the training set:
Figure BDA0003720878220000082
where n is the number of majority-class samples in the training set, m is the number of minority-class samples in the training set, and K is the total number of samples in the training set S.
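Steps 1 and 2 can be sketched as follows. The source gives the initial-weight and cost formulas only as images, so this sketch assumes uniform initial weights $w_{1k}=1/K$ (the usual starting point in boosting) and a round-robin split into N sets; it deliberately does not attempt the misclassification cost $C_k$. The function names and the toy data are hypothetical.

```python
def split_into_training_sets(samples, n_sets):
    """Round-robin split of (x, y) samples into n_sets training sets (assumed scheme)."""
    return [samples[i::n_sets] for i in range(n_sets)]

def initial_sample_weights(training_set):
    """Assumed initialization: every one of the K samples starts with weight 1/K."""
    k = len(training_set)
    return [1.0 / k] * k

# Toy data: 20 samples, labels in {1, -1}; every fifth sample is labeled 1.
samples = [([0.1 * i], 1 if i % 5 == 0 else -1) for i in range(20)]

training_sets = split_into_training_sets(samples, n_sets=4)
weights = [initial_sample_weights(s) for s in training_sets]
```

Each training set here receives five samples, and each weight vector sums to 1, which is what a later normalization step would preserve.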
In one embodiment, step S103 may include the following steps:
Step one, for each training set, extracting part of the data as a learning training set.
Step two, training on the learning training set with its current learning sample weight to obtain a weak classifier.
Step three, updating the learning sample weight for the next iteration according to the current learning sample weight and the misclassification cost.
Step four, using the updated learning sample weight, repeating steps one to three until the preset number of iterations is completed, to obtain the preset number of weak classifiers.
Step five, after T iterations on the training set S, obtaining a set of weak classifiers $f=(f_1,f_2,\ldots,f_T)$, and combining the weak classifier set $f$ into a strong classifier $F_i$:
Figure BDA0003720878220000091
where i denotes the i-th of the N training sets, T denotes the number of iterations on the training set S, $\alpha_t$ denotes the weight of the weak classifier $f_t$ at the t-th iteration, and the sign function outputs 1 or -1.
In the first iteration, the learning sample weight is the initialized sample weight.
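The combination in step five can be sketched as below. The formula itself appears in the source only as an image, so the weighted-vote form $F_i(x)=\mathrm{sign}\bigl(\sum_{t=1}^{T}\alpha_t f_t(x)\bigr)$ is an assumption based on the symbol definitions above and on standard boosting. Mapping a sum of exactly 0 to +1 is also an assumption, and the threshold "stumps" and weights are made-up values.

```python
def sign(value):
    """Sign with outputs in {1, -1}; treating 0 as +1 is an assumption."""
    return 1 if value >= 0 else -1

def make_strong_classifier(weak_classifiers, alphas):
    """Combine weak classifiers f_t, each with weight alpha_t, into one strong classifier."""
    def strong(x):
        return sign(sum(a * f(x) for f, a in zip(weak_classifiers, alphas)))
    return strong

# Three hypothetical weak classifiers on a scalar feature.
weak = [
    lambda x: 1 if x > 2 else -1,
    lambda x: 1 if x > 5 else -1,
    lambda x: 1 if x > 8 else -1,
]
alphas = [0.4, 0.7, 0.2]  # hypothetical classifier weights

strong = make_strong_classifier(weak, alphas)
low, high = strong(3), strong(6)
```

For `x = 3` the weighted sum is 0.4 - 0.7 - 0.2 = -0.5, so the output is -1; for `x = 6` it is 0.4 + 0.7 - 0.2 = 0.9, so the output is 1.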
In one embodiment, each training set comprises normal-class samples and abnormal-class samples; because the normal-class samples outnumber the abnormal-class samples, the normal-class samples are called majority-class samples and the abnormal-class samples are called minority-class samples. It is assumed that each training set comprises m majority-class samples and n minority-class samples.
For each training set, extracting part of the data as a learning training set specifically comprises:
for each training set, sorting the m majority-class samples in descending order of initialized sample weight, and combining the first n majority-class samples with the n minority-class samples into a new set that serves as the learning training set.
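This extraction step can be sketched as follows: keep the n highest-weight majority-class samples and all n minority-class samples, giving a balanced learning set. The function name `build_learning_set` is hypothetical, and the sketch assumes that the minority (abnormal) class carries label 1, in line with the voting rule under which a non-negative vote marks data as abnormal.

```python
def build_learning_set(samples, weights, minority_label=1):
    """Return the top-n majority samples by weight plus all n minority samples."""
    majority = [(w, s) for s, w in zip(samples, weights) if s[1] != minority_label]
    minority = [s for s in samples if s[1] == minority_label]
    # Sort majority-class samples from largest to smallest weight.
    majority.sort(key=lambda pair: pair[0], reverse=True)
    n = len(minority)
    return [s for _, s in majority[:n]] + minority

# Toy training set: four majority-class samples (-1) and two minority-class samples (1).
samples = [("a", -1), ("b", -1), ("c", -1), ("d", -1), ("e", 1), ("f", 1)]
weights = [0.10, 0.30, 0.05, 0.25, 0.15, 0.15]

learning_set = build_learning_set(samples, weights)
```

With these weights the two heaviest majority-class samples are `b` and `d`, so the learning set holds `b`, `d`, `e`, and `f`.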
In one embodiment, updating the learning sample weight for the next iteration according to the learning sample weight and the misclassification cost comprises:
Step 1, taking the current iteration to be the t-th iteration, calculating the error rate $e_t$ of the weak classifier $f_t$ in this iteration:
Figure BDA0003720878220000092
where $D_t(x_k)$ denotes the learning sample weight of the data $x_k$ of the k-th sample at the t-th iteration.
Step 2, calculating the weight $\alpha_t$ of the weak classifier $f_t$ trained in the t-th iteration:
Figure BDA0003720878220000093
Step 3, calculating the sample weight adjustment factor $\beta_k$:
$\beta_k=-0.5\,(y_k f_t(x_k))\,C_k+0.5$
where $y_k$ is a variable with value 1 or -1, $f_t(x_k)$ is the output of the weak classifier $f_t$ trained in the t-th iteration for the data of the k-th sample, and $C_k$ is the misclassification cost.
Step 4, updating the learning sample weight for iteration t+1 to obtain $D_{t+1}$:
Figure BDA0003720878220000101
where $D_t(x_k)$ is the weight of the k-th sample at the t-th iteration (the initialized weight when $t=1$), $\alpha_t$ is the weight of the weak classifier $f_t$ of the t-th iteration, $\beta_k$ is the weight adjustment factor of the k-th sample, and $z_t$ is a normalization factor.
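One round of this update can be sketched as below. Only the adjustment factor $\beta_k$ is given in text form in the source; the other three formulas appear as images. The sketch therefore assumes the common AdaBoost/AdaCost-style forms: $e_t$ as the total weight of misclassified samples, $\alpha_t=\tfrac12\ln\frac{1-e_t}{e_t}$, and $D_{t+1}(x_k)=D_t(x_k)\exp(-\alpha_t y_k f_t(x_k)\beta_k)/z_t$. These forms, the function names, and the cost values are assumptions, not the patent's confirmed formulas.

```python
import math

def adjustment_factor(y, f_x, cost):
    """beta_k = -0.5 * (y_k * f_t(x_k)) * C_k + 0.5, as given in the text."""
    return -0.5 * (y * f_x) * cost + 0.5

def update_weights(weights, labels, preds, costs):
    """One boosting round under the assumed AdaCost-style formulas."""
    # Assumed error rate: total weight of the misclassified samples.
    err = sum(w for w, y, p in zip(weights, labels, preds) if y != p)
    if not 0.0 < err < 1.0:
        raise ValueError("error rate must lie strictly between 0 and 1")
    # Assumed classifier weight.
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Assumed exponential update, followed by normalization with z_t.
    raw = [w * math.exp(-alpha * y * p * adjustment_factor(y, p, c))
           for w, y, p, c in zip(weights, labels, preds, costs)]
    z = sum(raw)
    return err, alpha, [r / z for r in raw]

weights = [0.25, 0.25, 0.25, 0.25]
labels = [-1, -1, 1, 1]
preds = [-1, -1, -1, 1]        # the third sample is misclassified
costs = [0.2, 0.2, 0.8, 0.8]   # hypothetical misclassification costs

err, alpha, new_weights = update_weights(weights, labels, preds, costs)
```

Under these assumptions the misclassified high-cost sample ends the round with the largest weight, so the next weak classifier concentrates on it.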
In one implementation, it is assumed that there are N training sets and that the training set S is any one of them; the strong classifier $F_i$ of the training set S gives a predicted value of 1 or -1 for the input data.
Step S104 may include the following steps:
Step 1, performing weighted aggregation on the strong classifiers of the N training sets to obtain a set of strong classifiers $F=\{F_1,F_2,\ldots,F_N\}$.
Step 2, taking the predicted values of the strong classifiers in F as the input of the voting decision function, and calculating the output predicted value P:
Figure BDA0003720878220000102
where N is the number of training sets, $F_i(x)$ is the predicted value that the strong classifier $F_i$, trained on the i-th training set, gives for the input data, and the sign function outputs 1 or -1.
Step 3, determining from the predicted value P whether the transaction data is abnormal:
if the predicted value P is greater than or equal to 0, the transaction data is abnormal data;
and if the predicted value P is less than 0, the transaction data is normal data.
In one implementation, if the predicted value P is greater than or equal to 0, at least half of the strong classifiers have judged the data to belong to the minority class, i.e., to be abnormal, and the system screens the data out. If the predicted value P is less than 0, fewer than half of the strong classifiers have judged the data to belong to the minority class, so the data is normal; the system retains it and performs the on-chain operation. Data is written to the chain only when more than half of the strong classifiers consider the submitted data normal, which further improves the accuracy of the judgment result and makes the data stored on the blockchain more authentic and reliable.
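The voting rule can be sketched as follows. The voting formula appears in the source only as an image, so the form $P=\mathrm{sign}\bigl(\sum_{i=1}^{N}F_i(x)\bigr)$ is an assumption drawn from the symbol definitions. Mapping a tied vote (sum of 0) to +1 is also an assumption, chosen because the text counts "half or more" as abnormal. The four threshold classifiers are made-up stand-ins for trained strong classifiers.

```python
def vote(strong_classifiers, x):
    """Assumed voting function: sign of the summed predictions, with a tie mapped to +1."""
    total = sum(F(x) for F in strong_classifiers)
    return 1 if total >= 0 else -1

def is_abnormal(strong_classifiers, x):
    """Decision rule from the text: P >= 0 means abnormal, P < 0 means normal."""
    return vote(strong_classifiers, x) >= 0

# Four hypothetical strong classifiers on a scalar feature (+1 = minority/abnormal vote).
classifiers = [
    lambda x: 1 if x > 10 else -1,
    lambda x: 1 if x > 20 else -1,
    lambda x: 1 if x > 30 else -1,
    lambda x: 1 if x > 40 else -1,
]

tie_case = is_abnormal(classifiers, 25)      # two votes each way
normal_case = is_abnormal(classifiers, 15)   # one abnormal vote out of four
```

With an even number of classifiers, the tie at `x = 25` is treated as abnormal, so data reaches the chain only when a strict majority votes normal.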
In one implementation, referring to Fig. 2, Fig. 2 is a schematic diagram of a logistics system provided by an embodiment of the present invention. The logistics system is built on the Hyperledger Fabric architecture and applies the abnormal data identification method provided by the embodiment: data to be written to the chain is put to a voting decision by strong classifiers trained with the ensemble-learning and cost-sensitive imbalanced-data classification algorithm, which ensures that the data in the logistics system is authentic and reliable.
The overall architecture of the logistics system comprises a two-layer support system and a three-layer core architecture. The two-layer support system comprises an operation and maintenance monitoring service system and a security management service system. The operation and maintenance monitoring service system includes modules for contract management, notification management, log management, anomaly monitoring, data analysis, and the like, and is responsible for collecting and visually presenting the running-state data of the system, which includes the system's access volume, time consumption, node health, and underlying machine resource usage. Visual monitoring gives a real-time view of the state of the whole blockchain system, and the relevant personnel are notified promptly when fraudulent nodes, ledger tampering, machine faults, data anomalies, or similar conditions occur. The security management service system provides security technologies such as cross-chain security, smart contract security, and privacy protection, together with security measures such as identity authentication management for users and services, API (application programming interface) security, and business security.
The three-layer core architecture in the blockchain platform comprises:
1) The infrastructure layer, which provides the underlying resources for the blockchain network and the computing and storage basis for processing information about the production, circulation, and distribution of logistics goods.
Specifically, the infrastructure layer includes computing node resources, storage resources, network communication bandwidth resources, and the like, used for data computation and storage in the network.
2) The blockchain core layer, which comprises an encryption algorithm module, a consensus algorithm module, and a user and authority management module. The encryption algorithm module and the consensus algorithm module encrypt and reach consensus on information about the production, circulation, and distribution of logistics goods, and the user and authority management module manages the access rights of the mobile terminal.
3) The scenario application layer, which constructs blockchain application scenarios for trusted evidence storage according to tracing requests.
In a specific implementation, the mobile terminal sends an information tracing request to the blockchain platform, using the logistics commodity identifier as the index value, and, after the blockchain platform verifies that the mobile terminal has the required permission, receives the query result matching the request.
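The tracing request can be pictured with the short sketch below, in which the commodity identifier is the lookup key and the permission check runs before any data is returned. Everything here is hypothetical: `AUTHORIZED_TERMINALS`, `LEDGER`, `trace`, and the record fields are illustrative stand-ins, not part of the described platform or of any Hyperledger Fabric API.

```python
# Hypothetical set of mobile terminals that hold query permission.
AUTHORIZED_TERMINALS = {"terminal-001"}

# Hypothetical on-chain records, indexed by logistics commodity identifier.
LEDGER = {
    "commodity-42": [
        {"stage": "production", "date": "2022-06-28"},
        {"stage": "distribution", "date": "2022-07-01"},
    ],
}

def trace(terminal_id, commodity_id):
    """Return the records for a commodity after checking the terminal's permission."""
    if terminal_id not in AUTHORIZED_TERMINALS:
        raise PermissionError("terminal is not authorized to query")
    return LEDGER.get(commodity_id, [])

history = trace("terminal-001", "commodity-42")

try:
    trace("terminal-999", "commodity-42")
    denied = False
except PermissionError:
    denied = True
```

Checking permission before the lookup means an unauthorized terminal learns nothing about which commodity identifiers exist.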
While one embodiment of the present invention has been described in detail, the description is only a preferred embodiment of the present invention and should not be taken as limiting the scope of the invention. All equivalent changes and modifications made within the scope of the present invention shall fall within the scope of the present invention.

Claims (7)

1. A logistics system abnormal data identification method based on a block chain is characterized by comprising the following steps:
acquiring transaction data submitted to a to-be-audited block by a user block chain node;
initializing sample data to obtain a plurality of training sets and an initialized sample weight and a misclassification cost corresponding to each training set; the sample data comprises the transaction data and historical data of the user block chain node;
for each training set, performing a preset number of iterations with the initialized sample weight and the misclassification cost as parameters to obtain a preset number of weak classifiers, and combining the preset number of weak classifiers into a strong classifier of the training set;
carrying out weighted aggregation on the strong classifiers of all the training sets, and obtaining a judgment result whether the transaction data is abnormal or not through voting type judgment;
and broadcasting the judgment result of the transaction data on a block chain, packaging normal data to obtain a second block, and accessing the second block to the block chain.
2. The method for identifying abnormal data of a logistics system based on a block chain as claimed in claim 1, wherein before acquiring the transaction data submitted to the to-be-audited block by the user block chain node, the method further comprises:
acquiring an account and a password input by a target user through the user block chain node, comparing the account and the password with the registration information recorded in the User table of a database, and performing identity verification on the target user;
and if the target user identity verification is successful and the user block chain node uploads the transaction data, storing the transaction data to the block to be audited.
3. The method for identifying abnormal data of a logistics system based on a block chain according to claim 1, wherein initializing the sample data to obtain a plurality of training sets and an initialized sample weight and a misclassification cost corresponding to each training set comprises the following steps:
dividing the sample data into N training sets;
for each training set, the training set is represented as:
$S=\{(x_1,y_1),(x_2,y_2),\ldots,(x_K,y_K)\mid y\in\{1,-1\}\}$
wherein K is the total number of samples of the training set S, $x_k$ represents the data of the k-th sample, and $y_k$ indicates whether the k-th sample is normal: $y_k=1$ represents a normal class sample and $y_k=-1$ represents an abnormal class sample;
calculating the initial sample weight $D_1$ of the first iteration of the training set:
Figure FDA0003720878210000021
wherein $w_{1k}$ is the weight of each sample in the training set S in the first iteration;
calculating the misclassification cost $C_k$ of the training set:
Figure FDA0003720878210000022
wherein n is the number of majority class samples in the training set, m is the number of minority class samples in the training set, and K is the total number of samples in the training set S.
4. The method for identifying abnormal data of a logistics system based on a block chain as claimed in claim 1, wherein for each training set, a preset number of iterations are performed with the initialized sample weight and the misclassification cost as parameters to obtain a preset number of weak classifiers, and the preset number of weak classifiers are combined into the strong classifier of the training set, comprising the following steps:
step one, for each training set, extracting part of the data as a learning training set;
step two, training on the learning training set using the learning sample weight that currently corresponds to it, to obtain a weak classifier; at the first iteration, the learning sample weight is the initialized sample weight;
step three, updating the learning sample weight of the next iteration according to the learning sample weight and the misclassification cost;
step four, using the updated learning sample weight, repeatedly executing steps one to three until the preset number of iterations is completed, to obtain the preset number of weak classifiers;
step five, obtaining a group of weak classifiers $f=(f_1,f_2,\ldots,f_T)$ after T iterations on the training set S, and combining the group of weak classifiers f into a strong classifier $F_i$:
Figure FDA0003720878210000031
wherein i denotes the i-th of the N training sets S, T represents the number of iterations on the training set S, $\alpha_t$ represents the weight of the weak classifier $f_t$ at the t-th iteration, and the output value of the sign function is 1 or -1.
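As an illustration of step five, the following Python sketch combines weak classifiers into a strong classifier. Because the formula image is not reproduced here, the sketch assumes the conventional weighted-vote form, the sign of the sum of $\alpha_t f_t(x)$ over the T iterations. The helper names `sign` and `make_strong_classifier`, the example stumps and the example weights are hypothetical, and assigning a zero sum to +1 is likewise an assumption.

```python
from typing import Callable, Sequence

# A weak classifier maps a feature vector to +1 or -1.
WeakClassifier = Callable[[Sequence[float]], int]


def sign(value: float) -> int:
    # Assumption: a weighted sum of exactly zero is assigned to +1.
    return 1 if value >= 0 else -1


def make_strong_classifier(weak: Sequence[WeakClassifier],
                           alphas: Sequence[float]) -> WeakClassifier:
    """Return F_i(x) = sign(sum over t of alpha_t * f_t(x)) (assumed form)."""
    if len(weak) != len(alphas):
        raise ValueError("one weight alpha_t is required per weak classifier")

    def strong(x: Sequence[float]) -> int:
        return sign(sum(a * f(x) for f, a in zip(weak, alphas)))

    return strong


# Three illustrative decision stumps with illustrative weights.
stumps = [
    lambda x: 1 if x[0] > 0 else -1,
    lambda x: 1 if x[1] > 0 else -1,
    lambda x: -1,
]
F_i = make_strong_classifier(stumps, [0.8, 0.5, 0.2])
```

For the input `[1.0, -1.0]` the weighted sum is 0.8 - 0.5 - 0.2, which is positive, so the first stump outvotes the other two.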
5. The method for identifying the abnormal data of the logistics system based on the block chain as claimed in claim 4, wherein each training set comprises normal samples and abnormal samples, the number of the normal samples is more than that of the abnormal samples, the normal samples are called as majority samples, and the abnormal samples are called as minority samples; assuming that each training set comprises m majority class samples and n minority class samples;
extracting part of the data from each training set to serve as a learning training set specifically comprises the following step:
for each training set, sorting the m majority class samples contained in the training set in descending order of initialized sample weight, and extracting the first n majority class samples together with the n minority class samples to form a new set as the learning training set.
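The extraction step of claim 5 can be sketched in Python as follows. The function name `build_learning_set` and the example data are hypothetical; the sketch assumes labels of +1 for normal (majority class) samples and -1 for abnormal (minority class) samples, and sorts by whatever weights are passed in.

```python
def build_learning_set(samples, labels, weights):
    """Balance a training set by keeping the highest-weight majority samples.

    labels: +1 for normal (majority class), -1 for abnormal (minority class).
    Returns the selected samples and their labels.
    """
    majority = [i for i, y in enumerate(labels) if y == 1]
    minority = [i for i, y in enumerate(labels) if y == -1]
    n = len(minority)
    # Sort majority class samples by weight, largest first, and keep the first n.
    top_majority = sorted(majority, key=lambda i: weights[i], reverse=True)[:n]
    chosen = top_majority + minority
    return [samples[i] for i in chosen], [labels[i] for i in chosen]


learn_x, learn_y = build_learning_set(
    samples=["a", "b", "c", "d", "e"],
    labels=[1, 1, 1, -1, -1],
    weights=[0.1, 0.4, 0.3, 0.1, 0.1],
)
```

The resulting learning training set contains equal numbers of both classes, so a weak classifier trained on it is not dominated by normal samples.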
6. The method for identifying abnormal data of a logistics system based on a block chain as claimed in claim 4, wherein updating the weight of the learning sample in the next iteration according to the weight of the learning sample and the misclassification cost comprises:
assuming that the current iteration is the t-th iteration, calculating the error rate $e_t$ of the weak classifier $f_t$ in this iteration:
Figure FDA0003720878210000032
wherein $D_t(x_k)$ represents the learning sample weight of the data $x_k$ of the k-th sample at the t-th iteration;
calculating the weight $\alpha_t$ of the weak classifier $f_t$ trained in the t-th iteration:
Figure FDA0003720878210000033
calculating the sample weight adjustment factor $\beta_k$:
$\beta_k=-0.5\,(y_k f_t(x_k))\,C_k+0.5$
wherein $y_k$ is a variable with a value of 1 or -1, $f_t(x_k)$ is the output value, for the data of the k-th sample, of the weak classifier $f_t$ trained in the t-th iteration, and $C_k$ is the misclassification cost;
updating the learning sample weight for the (t+1)-th iteration to obtain $D_{t+1}$:
Figure FDA0003720878210000041
wherein $D_t(x_k)$ is the learning sample weight of the k-th sample at the t-th iteration, $\alpha_t$ is the weight of the weak classifier $f_t$ of the t-th iteration, $\beta_k$ is the weight adjustment factor of the k-th sample, and $z_t$ is a normalization factor.
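One round of the weight update in claim 6 can be sketched in Python as follows. Only the adjustment factor $\beta_k$ is given explicitly in the text. The other three formulas appear as images that are not reproduced here, so the sketch substitutes conventional cost-sensitive boosting forms as assumptions: the error rate is the total weight of misclassified samples, $\alpha_t = 0.5\ln((1-e_t)/e_t)$, and each weight is multiplied by $\exp(-\alpha_t y_k f_t(x_k)\beta_k)$ and divided by $z_t$. The function names and example values are hypothetical.

```python
import math


def adjustment_factor(y: int, pred: int, cost: float) -> float:
    # beta_k = -0.5 * (y_k * f_t(x_k)) * C_k + 0.5, as stated in claim 6.
    return -0.5 * (y * pred) * cost + 0.5


def update_weights(weights, labels, preds, costs):
    """One boosting round; returns (alpha_t, normalized weights for round t+1).

    Assumed forms (not taken from the formula images):
      e_t     = total weight of misclassified samples
      alpha_t = 0.5 * ln((1 - e_t) / e_t)
      D_{t+1} = D_t * exp(-alpha_t * y_k * f_t(x_k) * beta_k) / z_t
    """
    e_t = sum(w for w, y, p in zip(weights, labels, preds) if p != y)
    if not 0.0 < e_t < 1.0:
        raise ValueError("error rate must lie strictly between 0 and 1")
    alpha_t = 0.5 * math.log((1.0 - e_t) / e_t)
    raw = [
        w * math.exp(-alpha_t * y * p * adjustment_factor(y, p, c))
        for w, y, p, c in zip(weights, labels, preds, costs)
    ]
    z_t = sum(raw)  # normalization factor, so the new weights sum to 1
    return alpha_t, [r / z_t for r in raw]


alpha, new_weights = update_weights(
    weights=[0.25, 0.25, 0.25, 0.25],
    labels=[1, 1, -1, -1],
    preds=[1, 1, -1, 1],        # the last (abnormal) sample is misclassified
    costs=[0.2, 0.2, 0.8, 0.8],
)
```

Under these assumptions a misclassified sample has $y_k f_t(x_k)=-1$, so its factor is $0.5+0.5C_k$ and its weight grows faster the higher its misclassification cost.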
7. The method for identifying the abnormal data of the logistics system based on the block chain as claimed in claim 1, wherein N training sets are assumed, and the training set S is any one of the N training sets; the strong classifier $F_i$ of the training set S gives a predicted value of 1 or -1 for the input data;
performing weighted aggregation on the strong classifiers of all the training sets, and obtaining a judgment result of whether the transaction data is abnormal through voting-type judgment, comprises the following steps:
performing weighted aggregation on the strong classifiers of the N training sets to obtain a group of strong classifiers $F=\{F_1,F_2,\ldots,F_N\}$;
taking the predicted values of the strong classifiers F as the input of a voting judgment function, and calculating an output predicted value P:
Figure FDA0003720878210000042
wherein N is the number of training sets, $F_i(x)$ represents the predicted value given for the input data by the strong classifier $F_i$ trained on the i-th training set, and the output value of the sign function is 1 or -1;
obtaining a judgment result whether the transaction data is abnormal according to the predicted value P:
if the predicted value P is larger than or equal to 0, the transaction data are abnormal data;
and if the predicted value P is less than 0, the transaction data are normal data.
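The voting-type judgment of claim 7 can be sketched in Python as follows. The formula image for P is not reproduced here, so the sketch assumes that P is the sign of the (optionally weighted) sum of the strong classifiers' predictions; equal weights are used by default because the aggregation weights are not specified. The function names `vote` and `is_abnormal` and the constant classifiers are hypothetical. The mapping from P to a judgment follows the rule exactly as worded in the claim.

```python
def vote(strong_classifiers, x, weights=None):
    """Return P, the sign of the weighted sum of the predictions F_i(x)."""
    if weights is None:
        # Assumption: equal weights when no aggregation weights are supplied.
        weights = [1.0] * len(strong_classifiers)
    total = sum(w * F(x) for F, w in zip(strong_classifiers, weights))
    if total > 0:
        return 1
    if total < 0:
        return -1
    return 0  # exact tie between the two sides


def is_abnormal(p: int) -> bool:
    # Decision rule as worded in claim 7: P >= 0 is abnormal, P < 0 is normal.
    return p >= 0


# Three constant classifiers stand in for trained strong classifiers.
always_pos = lambda x: 1
always_neg = lambda x: -1
p_majority_pos = vote([always_pos, always_pos, always_neg], x=None)
p_majority_neg = vote([always_neg, always_neg, always_pos], x=None)
```

Returning 0 on an exact tie keeps the tie visible to the caller; under the rule as worded, a tie falls on the abnormal side.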
CN202210760438.4A 2022-06-29 2022-06-29 Logistics system abnormal data identification method based on block chain Pending CN115329833A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210760438.4A CN115329833A (en) 2022-06-29 2022-06-29 Logistics system abnormal data identification method based on block chain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210760438.4A CN115329833A (en) 2022-06-29 2022-06-29 Logistics system abnormal data identification method based on block chain

Publications (1)

Publication Number Publication Date
CN115329833A true CN115329833A (en) 2022-11-11

Family

ID=83917169

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210760438.4A Pending CN115329833A (en) 2022-06-29 2022-06-29 Logistics system abnormal data identification method based on block chain

Country Status (1)

Country Link
CN (1) CN115329833A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116843341A (en) * 2023-06-27 2023-10-03 湖南工程学院 Credit card abnormal data detection method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination