CN117609921A - Method and device for constructing anomaly detection model, electronic equipment and storage medium

Info

Publication number: CN117609921A
Application number: CN202311660453.2A
Authority: CN
Inventors: 窦宇宸; 曾麟淳; 周丹丹
Original assignee: Agricultural Bank of China
Current assignee: Agricultural Bank of China
Priority date: 2023-12-05
Filing date: 2023-12-05
Publication date: 2024-02-27

Abstract

The invention discloses a method and a device for constructing an anomaly detection model, an electronic device, and a storage medium. The method comprises the following steps: acquiring a supply chain historical transaction data set, wherein the supply chain historical transaction data set comprises at least two pieces of supply chain historical transaction data and involves at least two participants; extracting feature information of each piece of supply chain historical transaction data in the supply chain historical transaction data set, and performing an alignment operation on the feature information to generate a target feature information set; and training the longitudinal federal learning model based on the target feature information set to generate a supply chain transaction data anomaly detection model. Because the longitudinal federal learning model is trained on a target feature information set that has undergone noise-adding processing, the resulting detection model protects data privacy while still making use of the data. In addition, detecting whether supply chain transaction data is abnormal through the detection model can improve detection accuracy.

Description

Method and device for constructing anomaly detection model, electronic equipment and storage medium

Technical Field

The present invention relates to the field of finance, and in particular, to a method and apparatus for constructing an anomaly detection model, an electronic device, and a storage medium.

Background

In recent years, great importance has been attached to the development of supply chains, and optimizing and stabilizing supply chains and raising their level of development is an important part of economic development. In the current market environment, the fundamental problem of supply chain finance is that risk is difficult to manage, and one key to solving this problem is managing and controlling credit-granting risk. The credit-granting models commonly used at present include the credit metering model, the KMV model, the credit-granting model and the like. Among these, credit-granting models based on customers' historical transaction data are widely used: the customer's credit line is calculated from factors such as the total historical transaction amount, the year-on-year growth rate of historical transactions, and the turnover days over the last one or two years. In this calculation, distinguishing abnormal historical transaction data is a pain point.

As the regulatory environment has progressively strengthened data protection and relevant policies have been issued one after another, data cannot be shared among enterprises, departments and universities. At the same time, considering the potential value of their data, users and enterprises are unwilling to hand it over for simple aggregation with other enterprises. This gives rise to the data-island and privacy-protection problems that existing abnormal transaction data detection algorithms face.

Disclosure of Invention

The invention provides a method and a device for constructing an anomaly detection model, an electronic device, and a storage medium, which protect data privacy while still making use of the data. In addition, detecting whether supply chain transaction data is abnormal through the detection model can improve detection accuracy.

According to an aspect of the present invention, there is provided a method for constructing an anomaly detection model, the method comprising:

acquiring a supply chain historical transaction dataset; wherein the supply chain historical transaction data set comprises at least two pieces of supply chain historical transaction data, the supply chain historical transaction data set involving at least two participants;

extracting characteristic information of each supply chain historical transaction data in the supply chain historical transaction data set, and performing alignment operation on the characteristic information to generate a target characteristic information set;

training the longitudinal federal learning model based on the target feature information set to generate a supply chain transaction data anomaly detection model.

According to another aspect of the present invention, there is provided a construction apparatus of an abnormality detection model, the apparatus including:

the historical data set acquisition module is used for acquiring a supply chain historical transaction data set; wherein the supply chain historical transaction data set comprises at least two pieces of supply chain historical transaction data, the supply chain historical transaction data set involving at least two participants;

the characteristic information set generation module is used for extracting characteristic information of each piece of supply chain historical transaction data in the supply chain historical transaction data set, and performing alignment operation on the characteristic information to generate a target characteristic information set;

the anomaly detection model generation module is used for training the longitudinal federal learning model based on the target characteristic information set to generate a supply chain transaction data anomaly detection model.

According to another aspect of the present invention, there is provided an electronic apparatus including:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the method of constructing an anomaly detection model according to any one of the embodiments of the present invention.

According to another aspect of the present invention, there is provided a computer readable storage medium storing computer instructions for causing a processor to implement the method for constructing an anomaly detection model according to any one of the embodiments of the present invention when executed.

According to the technical scheme, a supply chain historical transaction data set is acquired, wherein the supply chain historical transaction data set comprises at least two pieces of supply chain historical transaction data and involves at least two participants; feature information of each piece of supply chain historical transaction data in the supply chain historical transaction data set is extracted, and an alignment operation is performed on the feature information to generate a target feature information set; and the longitudinal federal learning model is trained based on the target feature information set to generate a supply chain transaction data anomaly detection model. Because the longitudinal federal learning model is trained on a target feature information set that has undergone noise-adding processing, the resulting detection model protects data privacy while still making use of the data. In addition, detecting whether supply chain transaction data is abnormal through the detection model can improve detection accuracy.

It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the invention or to delineate the scope of the invention. Other features of the present invention will become apparent from the description that follows.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of a method for constructing an anomaly detection model according to a first embodiment of the present invention;

FIG. 2 is a flowchart for constructing a Bayesian network algorithm according to an embodiment of the present invention;

FIG. 3 is a flow chart of generating a noise-added feature information set according to a first embodiment of the present invention;

FIG. 4 is a flowchart of a method for constructing an anomaly detection model according to a second embodiment of the present invention;

FIG. 5 is a schematic diagram of a longitudinal federal learning model according to a second embodiment of the present invention;

FIG. 6 is a block diagram of a method for detecting anomalies in supply chain transaction data according to a second embodiment of the present invention;

FIG. 7 is a schematic structural diagram of a device for constructing an abnormality detection model according to a third embodiment of the present invention;

FIG. 8 is a schematic structural diagram of an electronic device implementing a method for constructing an anomaly detection model according to an embodiment of the present invention.

Detailed Description

In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.

It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Example 1

Fig. 1 is a flowchart of a method for constructing an anomaly detection model according to an embodiment of the present invention, where the method may be performed by an anomaly detection model constructing apparatus, and the anomaly detection model constructing apparatus may be implemented in hardware and/or software, and the anomaly detection model constructing apparatus may be configured in an electronic device. As shown in fig. 1, the method includes:

S110, acquiring a supply chain historical transaction data set; wherein the supply chain historical transaction data set comprises at least two pieces of supply chain historical transaction data, the supply chain historical transaction data set involving at least two participants.

The supply chain historical transaction data refers to quantitative information in past transaction records between at least two participants in a supply chain, and comprises information such as time, amount, participants, products or services, transaction modes and the like of the transaction, mainly digital and statistical data. The supply chain historical transaction data may be used to analyze the acquired transaction behavior to provide base data and trend analysis for future business decisions.

In the embodiment of the invention, the supply chain historical transaction data set can be formed by acquiring the supply chain historical transaction data. It should be noted that the supply chain historical transaction data set should include at least two pieces of supply chain historical transaction data.

S120, extracting characteristic information of each supply chain historical transaction data in the supply chain historical transaction data set, and performing alignment operation on the characteristic information to generate a target characteristic information set.

In the implementation of the invention, because the user groups of at least two participants related to the supply chain historical transaction data are different, and the training samples participating in model training are required to be kept consistent, after the characteristic information of each supply chain historical transaction data in the supply chain historical transaction data set is extracted, the characteristic information is required to be aligned, and a target characteristic information set for model training is generated.
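The alignment step described above can be illustrated with a minimal sketch. The text only refers to a preset alignment algorithm, so everything concrete here is an assumption: records are keyed by a hypothetical transaction ID, IDs are compared through salted SHA-256 hashes, and the names `hashed_id` and `align_features` are illustrative only. A production system would more likely use a private set intersection protocol; this sketch only shows the effect of alignment, namely that both parties end up with the same samples in the same order.

```python
import hashlib


def hashed_id(raw_id: str, salt: str) -> str:
    """Hash an identifier so that raw IDs are not compared in the clear."""
    return hashlib.sha256((salt + raw_id).encode("utf-8")).hexdigest()


def align_features(party_a: dict, party_b: dict, salt: str = "shared-salt"):
    """Keep only the records present at both parties, in one shared order."""
    a_by_hash = {hashed_id(k, salt): k for k in party_a}
    b_by_hash = {hashed_id(k, salt): k for k in party_b}
    common = sorted(set(a_by_hash) & set(b_by_hash))
    ids = [a_by_hash[h] for h in common]
    return ids, [party_a[i] for i in ids], [party_b[i] for i in ids]


# Hypothetical feature rows keyed by a hypothetical transaction ID.
supplier = {"T001": [120.0, 3], "T002": [80.5, 1], "T004": [42.0, 7]}
buyer = {"T002": [0.9], "T003": [0.4], "T004": [0.7]}
ids, rows_a, rows_b = align_features(supplier, buyer)
```

After alignment, row `j` of `rows_a` and row `j` of `rows_b` describe the same transaction, which is the consistency that model training requires.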

Optionally, extracting feature information of each piece of supply chain historical transaction data in the supply chain historical transaction data set, and performing an alignment operation on the feature information to generate a target feature information set, includes: extracting feature information of each piece of supply chain historical transaction data in the supply chain historical transaction data set to generate an initial feature information set; performing noise-adding processing on the initial feature information set based on a Bayesian-network-based differential privacy protection algorithm to generate a noise-added feature information set; and performing an alignment operation on the feature information in the noise-added feature information set based on a preset feature alignment algorithm to generate a target feature information set.

A Bayesian network is a probabilistic graphical model that describes the relationships between attribute nodes using a directed acyclic graph, from which the conditional probability distributions among the attributes can be obtained.
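To make the definition concrete, the following minimal sketch shows how a Bayesian network factorizes a joint distribution into per-node conditional distributions. The three attributes, their values, and all probabilities are invented for illustration and do not come from the source.

```python
# Hypothetical chain-shaped network: amount_band -> payment_mode -> late_delivery.
parents = {
    "amount_band": (),
    "payment_mode": ("amount_band",),
    "late_delivery": ("payment_mode",),
}

# Conditional probability tables, indexed by the tuple of parent values.
cpt = {
    "amount_band": {(): {"low": 0.7, "high": 0.3}},
    "payment_mode": {
        ("low",): {"cash": 0.6, "credit": 0.4},
        ("high",): {"cash": 0.2, "credit": 0.8},
    },
    "late_delivery": {
        ("cash",): {"no": 0.9, "yes": 0.1},
        ("credit",): {"no": 0.7, "yes": 0.3},
    },
}


def joint_probability(assignment: dict) -> float:
    """Multiply each node's conditional probability given its parents."""
    p = 1.0
    for node, node_parents in parents.items():
        key = tuple(assignment[x] for x in node_parents)
        p *= cpt[node][key][assignment[node]]
    return p


# Product of 0.3 (high), 0.8 (credit given high) and 0.3 (late given credit).
p = joint_probability(
    {"amount_band": "high", "payment_mode": "credit", "late_delivery": "yes"}
)
```

Because each node depends only on its parents, a high-dimensional joint distribution is replaced by several low-dimensional tables.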

The differential privacy protection algorithm is a data privacy protection technology, and is used for protecting the privacy of individuals in data by adding a certain degree of noise in the data release or data query process, and the main purpose of the differential privacy protection algorithm is to balance the relationship between data utilization and privacy protection.
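As a hedged illustration of adding noise during a data query, the sketch below applies the standard Laplace mechanism to a counting query. The function names, the sample amounts, and the choice of a counting query are assumptions made for the example; the source does not prescribe them.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(values, predicate, epsilon: float, rng: random.Random) -> float:
    """Release a count under epsilon-differential privacy.

    A counting query has sensitivity 1, so the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random(0)
amounts = [120.0, 80.5, 42.0, 950.0, 15.0]  # hypothetical transaction amounts
noisy = private_count(amounts, lambda a: a > 100, epsilon=1.0, rng=rng)
```

A smaller privacy budget `epsilon` means a larger noise scale, which is the trade-off between data utility and privacy that the paragraph above describes.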

In the embodiment of the invention, the dimensions of the historical transaction data in the supply chain historical transaction data set may be correlated. Traditional differential privacy protection algorithms, however, model the data on the assumption that the dimensions of the data set are uncorrelated, so applying them to a high-dimensional supply chain historical transaction data set yields a poorly performing trained model. To solve this problem, the embodiment of the invention provides a differential privacy protection algorithm based on a Bayesian network, which performs noise-adding processing on the initial feature information extracted from the supply chain historical transaction data set to generate a noise-added feature information set with differential privacy protection. An alignment operation is then performed on the feature information in the noise-added feature information set based on a preset feature alignment algorithm, generating a target feature information set for model training.

Optionally, the differential privacy protection algorithm based on the bayesian network performs noise adding processing on the initial feature information set, and generates a noise added feature information set, including: performing dimension reduction processing on the initial feature information set based on the Bayesian network to generate a low-dimension feature information set; and carrying out noise adding processing on the low-dimensional characteristic information set based on the differential privacy protection algorithm to generate a noise adding characteristic information set.

In the embodiment of the present invention, a bayesian network algorithm may be first constructed, and an exemplary flow chart of constructing a bayesian network algorithm is shown in fig. 2, where the flow of constructing a bayesian network algorithm is approximately as follows:

Define: data set D; degree k of the Bayesian network to be constructed; attribute dimension node set A; AP pair set Ω; parent node conditional distribution Π; exponential mechanism scoring function Q; privacy budget ε

Input: D, k

Output: k-degree Bayesian network N

1. initialize N = φ, V = φ // initialize the Bayesian network N and the parent node set V

2. add (X_1, φ) to N // randomly select an attribute node X_1 from the attribute nodes and add (X_1, φ) to N

3. add X_1 to V // add X_1 to V

4. for i = 2; i ≤ d; i++ do // traverse the attribute dimension node set

5. initialize Ω = {φ} // initialize the AP pair set Ω

6. for (X in A) && (Π in V) do // for the attribute dimensions, traverse the parent node set V already in N

7. add (X, Π) to Ω // add the AP pair to the set Ω

8. end for;

9. for (X_i, Π_i) in Ω do // traverse the AP pair set Ω

10. max_i = Q_max((X_i, Π_i), ε) // select the highest-scoring AP pair in Ω using the exponential mechanism

11. end for;

12. add max_i to N // add the AP pair max_i to N

13. add X_i to V // add the attribute node X_i to V

14. end for;

15. return N // return the Bayesian network N
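The construction flow above can be sketched in Python as follows. This is a sketch under stated assumptions, not the patented implementation: an AP pair is read as an (attribute, parent set) pair, empirical mutual information stands in for the scoring function Q, the privacy budget is split evenly across selection rounds, and `sensitivity` is a placeholder parameter rather than a derived bound. All function names are illustrative.

```python
import itertools
import math
import random
from collections import Counter


def mutual_information(rows, x, parents):
    """Empirical mutual information between column x and the parent columns."""
    n = len(rows)
    joint = Counter((r[x], tuple(r[p] for p in parents)) for r in rows)
    px = Counter(r[x] for r in rows)
    ppa = Counter(tuple(r[p] for p in parents) for r in rows)
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (ppa[b] / n)))
        for (a, b), c in joint.items()
    )


def exponential_mechanism(candidates, scores, epsilon, sensitivity, rng):
    """Pick a candidate with probability proportional to exp(eps*score/(2*sens))."""
    top = max(scores)  # subtracting the maximum keeps exp() numerically stable
    weights = [math.exp(epsilon * (s - top) / (2.0 * sensitivity)) for s in scores]
    return rng.choices(candidates, weights=weights, k=1)[0]


def build_network(rows, k, epsilon, sensitivity=1.0, seed=0):
    """Greedily build a list of (attribute, parent tuple) pairs."""
    rng = random.Random(seed)
    d = len(rows[0])
    first = rng.randrange(d)           # steps 2-3: random root with no parents
    network = [(first, ())]
    chosen = [first]
    eps_step = epsilon / max(d - 1, 1)  # assumed even split of the budget
    while len(chosen) < d:              # step 4: one new attribute per round
        size = min(k, len(chosen))
        candidates = [                  # steps 5-8: enumerate candidate AP pairs
            (x, pa)
            for x in range(d) if x not in chosen
            for pa in itertools.combinations(chosen, size)
        ]
        scores = [mutual_information(rows, x, pa) for x, pa in candidates]
        pair = exponential_mechanism(candidates, scores, eps_step, sensitivity, rng)
        network.append(pair)            # steps 12-13
        chosen.append(pair[0])
    return network


# Hypothetical discretized feature rows (three attributes per transaction).
rows = [(0, 0, 1), (0, 0, 0), (1, 1, 1), (1, 1, 0),
        (0, 0, 1), (1, 1, 1), (0, 1, 0), (1, 0, 1)]
network = build_network(rows, k=2, epsilon=1.0)
```

Selecting pairs at random with score-dependent weights, rather than always taking the best pair, is what keeps the network structure itself from leaking information about individual records.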

Then, the initial feature information set can be subjected to dimension reduction processing based on the Bayesian network to generate a low-dimensional feature information set, and the low-dimensional feature information set is subjected to noise-adding processing based on the differential privacy protection algorithm to generate a noise-added feature information set. Illustratively, the algorithm flow for adding noise to a data set based on the differential privacy protection algorithm is approximately as follows:

Define: data set D; in-degree k of the Bayesian network to be constructed; privacy budget ε; Laplace mechanism, written Lap; noisy conditional distribution P*; attribute node joint distribution Pr; differentially private conditional distribution function Pr*; noisy data set D_W

Input: D, N, k

Output: D_W

1. initialize P* = φ // initialize the noisy conditional distribution P*

2. for i = k+1; i ≤ d; i++ do

3. build Pr[X_i, Π_i] // construct the joint distribution Pr[X_i, Π_i] of attribute node X_i

4. add Lap to Pr*[X_i, Π_i] // add Laplace noise to Pr[X_i, Π_i] to obtain the differentially private Pr*[X_i, Π_i]

5. add Pr*[X_i | Π_i] to P* // extract Pr*[X_i | Π_i] from Pr*[X_i, Π_i] and add it to P*

6. end for;

7. for i = 1; i ≤ k; i++ do

8. add Pr*[X_i | Π_i] to P* // extract Pr*[X_i | Π_i] from Pr*[X_{k+1}, Π_{k+1}] and add it to P*

9. end for;

10. sample P* and synthesize D_W // synthesize the new noisy data set D_W by sampling

11. return D_W // return the data set D_W
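A simplified Python sketch of this noise-adding flow is given below. It deliberately departs from the listing in one respect: it adds noise to the joint counts of every AP pair directly, rather than deriving the first k conditional distributions from the (k+1)-th noisy joint distribution. The noise scale `2 * len(network) / epsilon`, the clipping of negative counts to zero, and all names are assumptions made for the illustration.

```python
import itertools
import random
from collections import Counter


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_conditionals(rows, network, domains, epsilon, rng):
    """Return {x: {parent_values: {x_value: probability}}} from noisy counts."""
    scale = 2.0 * len(network) / epsilon  # assumed budget split, see lead-in
    conditionals = {}
    for x, parents in network:
        counts = Counter((tuple(r[p] for p in parents), r[x]) for r in rows)
        table = {}
        for pa_vals in itertools.product(*(domains[p] for p in parents)):
            # Add Laplace noise to each joint count and clip negatives to zero.
            noisy = {v: max(counts[(pa_vals, v)] + laplace(scale, rng), 0.0)
                     for v in domains[x]}
            total = sum(noisy.values())
            if total == 0.0:
                table[pa_vals] = {v: 1.0 / len(domains[x]) for v in domains[x]}
            else:
                table[pa_vals] = {v: c / total for v, c in noisy.items()}
        conditionals[x] = table
    return conditionals


def sample_rows(network, conditionals, n, rng):
    """Synthesize n rows by sampling each attribute given its parents."""
    out = []
    for _ in range(n):
        row = [None] * len(network)
        for x, parents in network:  # network order sets parents before children
            dist = conditionals[x][tuple(row[p] for p in parents)]
            values = list(dist)
            row[x] = rng.choices(values, weights=[dist[v] for v in values], k=1)[0]
        out.append(tuple(row))
    return out


# Hypothetical discretized rows and a hypothetical 2-degree network over them.
rows = [(0, 0, 1), (0, 0, 0), (1, 1, 1), (1, 1, 0),
        (0, 0, 1), (1, 1, 1), (0, 1, 0), (1, 0, 1)]
network = [(0, ()), (1, (0,)), (2, (0, 1))]
domains = {0: [0, 1], 1: [0, 1], 2: [0, 1]}
rng = random.Random(1)
conditionals = noisy_conditionals(rows, network, domains, epsilon=5.0, rng=rng)
synthetic = sample_rows(network, conditionals, 10, rng)
```

Only the noisy conditional tables are used to synthesize `synthetic`, so the released rows are samples from the perturbed model rather than copies of the original records.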

For example, FIG. 3 shows a flowchart for generating a noise-added feature information set. As shown in FIG. 3, for an initial feature information set, the selection of AP pairs during construction of the Bayesian network is first protected with differential privacy through the exponential mechanism; noise is then added to the conditional distribution of each AP pair based on the Laplace mechanism; next, an approximate joint distribution is generated from the Bayesian network and the noisy conditional distributions; finally, a noise-added feature information set with differential privacy protection is generated by sampling.

S130, training the longitudinal federal learning model based on the target characteristic information set to generate a supply chain transaction data anomaly detection model.

The longitudinal federal learning model, more commonly known in English as a vertical federated learning model, is a distributed machine learning technique. Its aim is to obtain a global training model by training on the data of each mobile device locally, thereby enabling collaborative machine learning modeling on the premise that users' data is not transmitted over the network.

In the embodiment of the invention, after the target feature information set is generated, the longitudinal federal learning model can be trained based on the target feature information set to generate the supply chain transaction data anomaly detection model. This protects data privacy while still making use of the data. In addition, detecting supply chain transaction data through the supply chain transaction data anomaly detection model can improve the accuracy of detecting abnormal data in supply chain transaction data.

Example 2

Fig. 4 is a flowchart of a method for constructing an anomaly detection model according to a second embodiment of the present invention, where the method is optimized based on the foregoing embodiments, and a solution not described in detail in the embodiment of the present invention is shown in the foregoing embodiments. As shown in fig. 4, the method includes:

S210, acquiring a supply chain historical transaction data set; wherein the supply chain historical transaction data set comprises at least two pieces of supply chain historical transaction data, the supply chain historical transaction data set involving at least two participants.

S220, extracting characteristic information of each supply chain historical transaction data in the supply chain historical transaction data set, and performing alignment operation on the characteristic information to generate a target characteristic information set.

And S230, training the longitudinal federal learning model based on the target characteristic information set to generate a supply chain transaction data anomaly detection model.

Optionally, the longitudinal federal learning model includes at least two identical local models and one central service model, wherein the number of local models is the same as the number of participants involved in the supply chain historical transaction data set. Illustratively, FIG. 5 shows a schematic diagram of a longitudinal federal learning model. As shown in FIG. 5, the model includes one central server and a number of local servers equal to the number of participants involved in the supply chain historical transaction data set.

Optionally, training the longitudinal federal learning model based on the target feature information set to generate a supply chain transaction data anomaly detection model, including: performing iterative training on at least two local models and a central service model in the longitudinal federal learning model based on the target feature information set, wherein in the iterative training process, the at least two local models exchange own output intermediate feature vectors, and the at least two local models upload own gradient vectors to the central service model respectively; the central service model aggregates gradient vectors corresponding to the at least two local models to generate aggregate gradient vectors, updates the central service model based on the aggregate gradient vectors, and respectively transmits the aggregate gradient vectors to the at least two local models so that the at least two local models update own models based on the aggregate gradient vectors respectively until the central service model and the at least two local models converge; and taking the converged central service model as a supply chain transaction data anomaly detection model.

In the embodiment of the invention, the longitudinal federal learning model can be trained based on the target characteristic information set to generate the supply chain transaction data anomaly detection model. Illustratively, the general flow of training the vertical federal learning model based on the set of target feature information is as follows:

step one, at least two local models locally calculate intermediate feature vectors of target feature information, and exchange the intermediate feature vectors with each other to help calculate gradient vectors of the local models;

step two, at least two local models calculate own gradient vectors and upload the gradient vectors to a central service model;

step three, the central service model aggregates the gradient vectors corresponding to the at least two local models to generate an aggregate gradient vector, and updates the central service model based on the aggregate gradient vector;

step four, the central service model returns the aggregate gradient vector to at least two local models, and updates the at least two local models;

step five, repeating steps one to four until the central service model and the at least two local models converge;

step six, taking the converged central service model as the supply chain transaction data anomaly detection model.
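The six steps above can be traced in a small single-process simulation. Several things here are assumptions not found in the source: the local models are linear with a logistic output, every participant holds the same number of features (so that gradient vectors can be summed), labels are available for computing residuals, a fixed number of rounds replaces a convergence test, and the learning rate `lr` and all names are illustrative.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def local_gradient(x_rows, residuals):
    """Average gradient of the logistic loss with respect to one party's weights."""
    n = len(x_rows)
    return [sum(r * x[j] for r, x in zip(residuals, x_rows)) / n
            for j in range(len(x_rows[0]))]


def train(features_by_party, labels, rounds=50, lr=0.5):
    n_parties = len(features_by_party)
    dim = len(features_by_party[0][0])
    local_w = [[0.0] * dim for _ in range(n_parties)]
    central_w = [0.0] * dim
    losses = []
    for _ in range(rounds):
        # Step one: each party computes its intermediate vector and shares it.
        inter = [[dot(w, x) for x in rows]
                 for w, rows in zip(local_w, features_by_party)]
        scores = [sigmoid(sum(z)) for z in zip(*inter)]
        residuals = [p - y for p, y in zip(scores, labels)]
        losses.append(-sum(y * math.log(p) + (1 - y) * math.log(1 - p)
                           for p, y in zip(scores, labels)) / len(labels))
        # Step two: each party computes its own gradient vector and uploads it.
        grads = [local_gradient(rows, residuals) for rows in features_by_party]
        # Step three: the central model sums the gradients and updates itself.
        aggregate = [sum(g) for g in zip(*grads)]
        central_w = [w - lr * g for w, g in zip(central_w, aggregate)]
        # Step four: each party divides the aggregate by the model count and updates.
        target = [g / n_parties for g in aggregate]
        local_w = [[w - lr * t for w, t in zip(ws, target)] for ws in local_w]
    return central_w, local_w, losses


# Hypothetical aligned, centered features for two participants and shared labels.
party_a = [[-0.4, -0.3], [0.4, 0.3], [-0.3, -0.4], [0.3, 0.4]]
party_b = [[-0.5, -0.2], [0.5, 0.2], [-0.2, -0.3], [0.2, 0.5]]
labels = [0, 1, 0, 1]
central_w, local_w, losses = train([party_a, party_b], labels)
```

Raw feature rows never leave their party in this flow; only intermediate vectors and gradient vectors are exchanged.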

Optionally, the central service model aggregates gradient vectors corresponding to at least two local models to generate an aggregated gradient vector, including: the central service model sums the gradient vectors corresponding to the at least two local models and takes the gradient vector sum as an aggregate gradient vector.

Illustratively, assuming that the supply chain historical transaction data set involves two participants whose corresponding local models are A and B, with the gradient vector of local model A being $g_1$ and the gradient vector of local model B being $g_2$, the value of the aggregate gradient vector is $g_1 + g_2$.

Optionally, at least two local models update their own models based on the aggregated gradient vector, respectively, including: and each local model in the at least two local models respectively calculates the ratio of the aggregate gradient vector to the number of the local models, generates a target gradient vector, and updates the model based on the target gradient vector.

Illustratively, assuming that the supply chain historical transaction data set involves two participants and the aggregate gradient vector has a value of $g_1 + g_2$, the value of the target gradient vector is $(g_1 + g_2)/2$.
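For a general number of local models, the aggregation and target-gradient rules described above can be written as follows. The symbols $n$ (number of local models), $g_i$ (gradient vector of the $i$-th local model), $g$ (aggregate gradient vector) and $\bar{g}$ (target gradient vector) are notation introduced here for illustration.

```latex
g = \sum_{i=1}^{n} g_i ,
\qquad
\bar{g} = \frac{g}{n} = \frac{1}{n} \sum_{i=1}^{n} g_i
```

With $n = 2$ this reduces to the two-participant case, where the aggregate gradient vector is $g_1 + g_2$ and each local model updates with half of it.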

Optionally, in generating the supply chain transaction data anomaly detection model, further comprising: and an online detection interface is arranged for detecting whether the supply chain transaction data is abnormal or not by the participants related to the supply chain historical transaction data set.

Illustratively, FIG. 6 shows an architecture design diagram of a supply chain transaction data anomaly detection method. As shown in FIG. 6, the method architecture includes a preprocessing layer, a data access layer, a sample alignment layer, a federal learning layer, and online detection. The preprocessing layer includes discretization, one-hot encoding, normalization, missing-value completion and the like, and is used for extracting feature information to generate an initial feature information set. The data access layer is used for performing noise-adding processing on each participant's initial feature information set to generate a noise-added feature information set, and for accessing the noise-added feature information set. The sample alignment layer is used for aligning the noise-added feature information of each participant in the noise-added feature information set to generate a target feature information set. The federal learning layer is used for training the longitudinal federal learning model according to the target feature information set. The online detection is used for providing a detection interface through which the participants involved in the supply chain historical transaction data set can detect whether supply chain transaction data is abnormal.
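The four preprocessing operations named for the preprocessing layer can be sketched in plain Python as below. The concrete choices are assumptions made for the example: mean imputation for missing values, min-max scaling for normalization, fixed bin edges for discretization, and the sample field values.

```python
def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Scale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(values, edges):
    """Return, for each value, the number of bin edges it exceeds (its bin index)."""
    return [sum(1 for e in edges if v > e) for v in values]


def one_hot(categories):
    """Encode each category as a 0/1 vector over the sorted vocabulary."""
    vocab = sorted(set(categories))
    return [[1 if c == v else 0 for v in vocab] for c in categories]


# Hypothetical raw fields of four transactions.
amounts = [100.0, None, 300.0, 500.0]
modes = ["credit", "cash", "credit", "transfer"]

filled = fill_missing(amounts)
normalized = min_max_normalize(filled)
bins = discretize(normalized, edges=[0.33, 0.66])
encoded = one_hot(modes)
```

Discretizing continuous fields in this way also yields the finite attribute domains that a Bayesian network over the features needs.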

S240, acquiring target supply chain transaction data to be detected, and extracting target characteristic information of the target supply chain transaction data.

In the embodiment of the invention, after the target supply chain transaction data to be detected is determined, the target characteristic information of the target supply chain transaction data needs to be extracted so as to enable the supply chain transaction data abnormality detection model to detect whether the target supply chain transaction data has abnormality or not.

S250, inputting the target characteristic information into a supply chain transaction data anomaly detection model.

In the embodiment of the invention, after extracting the target characteristic information of the target supply chain transaction data, the target characteristic information is required to be input into the supply chain transaction data anomaly detection model so that the supply chain transaction data anomaly detection model outputs a detection result according to the target characteristic information.

S260, judging whether the target supply chain transaction data is abnormal or not according to the output result of the supply chain transaction data abnormality detection model.

In the embodiment of the invention, after the output result of the supply chain transaction data abnormality detection model is obtained, whether the target supply chain transaction data is abnormal or not can be judged according to the output result of the supply chain transaction data abnormality detection model. The supply chain transaction data is detected through the supply chain transaction data abnormality detection model, so that the detection accuracy of the abnormal data in the supply chain transaction data can be improved.
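Steps S240 to S260 can be summarized in a short sketch. The source does not say what form the model output takes or how it is judged, so this example assumes a linear model with a logistic score and a decision threshold of 0.5; the weights, bias, feature vectors and the function name `detect` are all hypothetical.

```python
import math


def detect(weights, bias, features, threshold=0.5):
    """Score one transaction's feature vector and flag it if the score is high."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    score = 1.0 / (1.0 + math.exp(-z))
    return score, score >= threshold


# Hypothetical converged model parameters and two extracted feature vectors.
weights, bias = [2.0, -1.0], -0.5
score_a, abnormal_a = detect(weights, bias, [1.5, 0.2])
score_b, abnormal_b = detect(weights, bias, [0.0, 1.0])
```

In practice the threshold would be tuned on validation data to balance missed anomalies against false alarms.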

According to the technical scheme, a supply chain historical transaction data set is acquired, wherein the supply chain historical transaction data set comprises at least two pieces of supply chain historical transaction data and involves at least two participants; feature information of each piece of supply chain historical transaction data in the supply chain historical transaction data set is extracted, and an alignment operation is performed on the feature information to generate a target feature information set; a longitudinal federal learning model is trained based on the target feature information set to generate a supply chain transaction data anomaly detection model; target supply chain transaction data to be detected is acquired, and target feature information of the target supply chain transaction data is extracted; the target feature information is input into the supply chain transaction data anomaly detection model; and whether the target supply chain transaction data is abnormal is judged according to the output result of the supply chain transaction data anomaly detection model. Because the longitudinal federal learning model is trained on a target feature information set that has undergone noise-adding processing, the resulting detection model protects data privacy while still making use of the data. In addition, detecting whether supply chain transaction data is abnormal through the detection model can improve detection accuracy.

Example III

Fig. 7 is a schematic structural diagram of a device for constructing an anomaly detection model according to a third embodiment of the present invention. As shown in fig. 7, the apparatus includes:

a historical data set acquisition module 310 for acquiring a supply chain historical transaction data set; wherein the supply chain historical transaction data set comprises at least two pieces of supply chain historical transaction data, the supply chain historical transaction data set involving at least two participants;

the feature information set generating module 320 is configured to extract feature information of each piece of supply chain historical transaction data in the supply chain historical transaction data set, and perform an alignment operation on the feature information to generate a target feature information set;

the anomaly detection model generation module 330 is configured to train the longitudinal federal learning model based on the target feature information set, and generate a supply chain transaction data anomaly detection model.

Optionally, the feature information set generating module 320 includes:

an initial feature information set generating unit, configured to extract feature information of each piece of supply chain historical transaction data in the supply chain historical transaction data set, and generate an initial feature information set;

the noise-adding feature information set generating unit is used for carrying out noise adding processing on the initial feature information set based on a differential privacy protection algorithm of the Bayesian network to generate a noise-adding feature information set;

and the target feature information set generating unit is used for carrying out alignment operation on the feature information in the noise-added feature information set based on a preset feature alignment algorithm to generate a target feature information set.
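By way of non-limiting illustration, the alignment operation may be sketched as follows. The embodiment refers only to a "preset feature alignment algorithm" and does not specify it; the sketch assumes that each participant keys its feature vectors by a shared transaction identifier and that alignment means keeping the identifiers common to all participants in a consistent order. The function name `align_by_transaction_id` and the data layout are hypothetical.

```python
def align_by_transaction_id(party_features):
    """Keep only transactions known to every participant, in one shared order.

    party_features maps a participant name to a dict of
    {transaction_id: feature_vector}. This layout is an assumption made
    for illustration; the source does not define it.
    """
    if len(party_features) < 2:
        raise ValueError("at least two participants are required")

    # Identifiers present in every participant's table.
    id_sets = [set(table) for table in party_features.values()]
    common_ids = sorted(set.intersection(*id_sets))

    # Re-order each participant's features to follow the shared ID order.
    aligned = {
        party: [table[tx_id] for tx_id in common_ids]
        for party, table in party_features.items()
    }
    return common_ids, aligned


parties = {
    "supplier": {"t1": [1.0], "t2": [2.0], "t3": [3.0]},
    "bank": {"t2": [20.0, 0.5], "t3": [30.0, 0.7], "t4": [40.0, 0.9]},
}
ids, aligned = align_by_transaction_id(parties)
```

A plain set intersection exposes each participant's identifiers to whoever computes it; a deployment concerned with privacy would more likely use a private set intersection protocol, which this sketch does not attempt.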

Optionally, the noise-added feature information set generating unit includes:

the low-dimensional characteristic information set generation subunit is used for performing dimension reduction processing on the initial characteristic information set based on a Bayesian network to generate a low-dimensional characteristic information set;

and the noise-adding characteristic information set generation subunit is used for carrying out noise adding processing on the low-dimensional characteristic information set based on a differential privacy protection algorithm to generate a noise-adding characteristic information set.
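By way of non-limiting illustration, the noise-adding step may be sketched as follows. The embodiment does not name the noise distribution; the sketch assumes the Laplace mechanism, a common choice for differential privacy, and omits the Bayesian-network dimension reduction entirely. The names `add_laplace_noise`, `epsilon`, and `sensitivity` are hypothetical, and the privacy level actually achieved depends on a correct sensitivity bound, which the sketch does not derive.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def add_laplace_noise(rows, epsilon, sensitivity=1.0, seed=None):
    """Add independent Laplace noise to every value of every feature vector.

    rows is assumed to be the low-dimensional feature information set,
    given as a list of lists of floats. epsilon is the privacy budget;
    a smaller epsilon yields a larger noise scale.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [[value + laplace_noise(scale, rng) for value in row] for row in rows]


low_dim_features = [[1.0, 2.0], [3.0, 4.0]]
noisy_features = add_laplace_noise(low_dim_features, epsilon=1.0, seed=0)
```

Performing the dimension reduction before the noise is added, as the subunits above do, means that fewer values are perturbed per record.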

Optionally, the longitudinal federal learning model includes at least two identical local models and one central service model; wherein the number of local models is the same as the number of participants involved in the supply chain historical transaction dataset;

the anomaly detection model generation module 330 includes:

a gradient vector uploading unit, configured to iteratively train the at least two local models and the central service model in the longitudinal federal learning model based on the target feature information set, wherein, during the iterative training, the at least two local models exchange their own output intermediate feature vectors with each other, and each of the at least two local models uploads its own gradient vector to the central service model;

a model updating unit, configured such that the central service model aggregates the gradient vectors corresponding to the at least two local models to generate an aggregate gradient vector, updates the central service model based on the aggregate gradient vector, and issues the aggregate gradient vector to each of the at least two local models, so that the at least two local models update their own models based on the aggregate gradient vector, until the central service model and the at least two local models converge;

and the anomaly detection model generation unit is used for taking the converged central service model as a supply chain transaction data anomaly detection model.

Optionally, the model updating unit includes:

an aggregate gradient vector generation subunit, configured such that the central service model sums the gradient vectors corresponding to the at least two local models and takes the sum of the gradient vectors as the aggregate gradient vector.

Optionally, the model updating unit includes:

a local model updating subunit, configured such that each of the at least two local models calculates the ratio of the aggregate gradient vector to the number of local models to generate a target gradient vector, and updates its own model based on the target gradient vector.
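By way of non-limiting illustration, the aggregation and local update described above may be sketched as follows: the central service model sums the uploaded gradient vectors, and each local model divides the aggregate by the number of local models before updating. The function names, the plain gradient-descent step, and the `learning_rate` parameter are assumptions made for illustration; the embodiment does not specify the update rule.

```python
def aggregate_gradients(local_gradients):
    """Central side: element-wise sum of the uploaded gradient vectors."""
    if len(local_gradients) < 2:
        raise ValueError("at least two local models are required")
    length = len(local_gradients[0])
    if any(len(gradient) != length for gradient in local_gradients):
        raise ValueError("gradient vectors must have the same length")
    return [sum(components) for components in zip(*local_gradients)]


def target_gradient(aggregate, num_local_models):
    """Local side: ratio of the aggregate gradient to the number of local models."""
    return [component / num_local_models for component in aggregate]


def apply_update(weights, gradient, learning_rate=0.1):
    # Assumed update rule: one plain gradient-descent step.
    return [w - learning_rate * g for w, g in zip(weights, gradient)]


uploaded = [[1.0, 2.0], [3.0, 4.0]]           # one gradient vector per local model
aggregate = aggregate_gradients(uploaded)      # element-wise sum
target = target_gradient(aggregate, len(uploaded))
new_weights = apply_update([1.0, 1.0], target, learning_rate=0.5)
```

Dividing the summed gradient by the number of local models yields the mean gradient, so the step size of each local update does not grow with the number of participants.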

Optionally, the apparatus further comprises:

the target feature information extraction module is used for acquiring target supply chain transaction data to be detected and extracting target feature information of the target supply chain transaction data;

the target characteristic information input module is used for inputting the target characteristic information into the supply chain transaction data anomaly detection model;

and the supply chain transaction data detection module is used for judging whether the target supply chain transaction data is abnormal or not according to the output result of the supply chain transaction data abnormality detection model.
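By way of non-limiting illustration, the judgment performed by the detection module may be sketched as follows. The embodiment does not state the form of the model output; the sketch assumes that the model returns a single anomaly score and that a score at or above a threshold is judged abnormal. The names `detect_anomaly`, `threshold`, and the stand-in model `toy_model` are hypothetical.

```python
def detect_anomaly(model, target_features, threshold=0.5):
    """Judge target transaction data from the model's output score.

    model is any callable that maps a feature vector to a numeric score;
    the threshold value is an assumption and would need tuning in practice.
    """
    score = float(model(target_features))
    return score >= threshold, score


def toy_model(features):
    # Stand-in for the trained detection model: scores by the largest feature.
    return max(features)


is_abnormal, score = detect_anomaly(toy_model, [0.2, 0.9, 0.1])
```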

The device for constructing the abnormality detection model provided by the embodiment of the invention can execute the method for constructing the abnormality detection model provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.

Example IV

Fig. 8 shows a schematic diagram of the structure of an electronic device 10 that may be used to implement an embodiment of the invention. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. Electronic devices may also represent various forms of mobile devices, such as personal digital processing devices, cellular telephones, smartphones, wearable devices (e.g., helmets, glasses, watches, etc.), and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the invention described and/or claimed herein.

As shown in fig. 8, the electronic device 10 includes at least one processor 11 and a memory, such as a Read Only Memory (ROM) 12 and a Random Access Memory (RAM) 13, communicatively connected to the at least one processor 11. The memory stores a computer program executable by the at least one processor, and the processor 11 may perform various appropriate actions and processes according to the computer program stored in the ROM 12 or the computer program loaded from the storage unit 18 into the RAM 13. In the RAM 13, various programs and data required for the operation of the electronic device 10 may also be stored. The processor 11, the ROM 12 and the RAM 13 are connected to each other via a bus 14. An input/output (I/O) interface 15 is also connected to the bus 14.

Various components in the electronic device 10 are connected to the I/O interface 15, including: an input unit 16 such as a keyboard, a mouse, etc.; an output unit 17 such as various types of displays, speakers, and the like; a storage unit 18 such as a magnetic disk, an optical disk, or the like; and a communication unit 19 such as a network card, modem, wireless communication transceiver, etc. The communication unit 19 allows the electronic device 10 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.

The processor 11 may be any of a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the processor 11 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various processors running machine learning model algorithms, Digital Signal Processors (DSPs), and any suitable processor, controller, microcontroller, etc. The processor 11 performs the respective methods and processes described above, for example, the method for constructing the anomaly detection model.

In some embodiments, the method of constructing the anomaly detection model may be implemented as a computer program tangibly embodied on a computer-readable storage medium, such as storage unit 18. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 10 via the ROM 12 and/or the communication unit 19. When the computer program is loaded into the RAM 13 and executed by the processor 11, one or more steps of the above-described construction method of the abnormality detection model may be performed. Alternatively, in other embodiments, processor 11 may be configured to perform the method of constructing the anomaly detection model in any other suitable manner (e.g., by means of firmware).

Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.

A computer program for carrying out methods of the present invention may be written in any combination of one or more programming languages. These computer programs may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the computer programs, when executed by the processor, cause the functions/acts specified in the flowchart and/or block diagram block or blocks to be implemented. The computer program may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.

In the context of the present invention, a computer-readable storage medium may be a tangible medium that can contain, or store a computer program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. Alternatively, the computer readable storage medium may be a machine readable signal medium. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

To provide for interaction with a user, the systems and techniques described here can be implemented on an electronic device having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) through which a user can provide input to the electronic device. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), blockchain networks, and the internet.

The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, and is a host product in a cloud computing service system, so that the defects of high management difficulty and weak service expansibility in the traditional physical hosts and VPS service are overcome.

It should be appreciated that steps may be reordered, added, or deleted using the various forms of the flows shown above. For example, the steps described in the present invention may be performed in parallel, sequentially, or in a different order, so long as the desired results of the technical solution of the present invention are achieved, and no limitation is imposed herein.

The above embodiments do not limit the scope of the present invention. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present invention should be included in the scope of the present invention.

Claims

1. The method for constructing the abnormality detection model is characterized by comprising the following steps of:

2. The method of claim 1, wherein extracting the characteristic information of each piece of supply chain historical transaction data in the supply chain historical transaction data set and performing an alignment operation on the characteristic information to generate a target characteristic information set, comprises:

extracting characteristic information of each supply chain historical transaction data in the supply chain historical transaction data set to generate an initial characteristic information set;

the initial characteristic information set is subjected to noise adding processing based on a differential privacy protection algorithm of a Bayesian network, and a noise adding characteristic information set is generated;

and carrying out alignment operation on the characteristic information in the noise-added characteristic information set based on a preset characteristic alignment algorithm to generate a target characteristic information set.

3. The method of claim 2, wherein performing noise-adding processing on the initial feature information set based on a differential privacy protection algorithm of a Bayesian network to generate the noise-added feature information set comprises:

performing dimension reduction processing on the initial feature information set based on a Bayesian network to generate a low-dimensional feature information set;

and carrying out noise adding processing on the low-dimensional characteristic information set based on a differential privacy protection algorithm to generate a noise adding characteristic information set.

4. The method of claim 1, wherein the longitudinal federal learning model comprises at least two identical local models and one central service model; wherein the number of local models is the same as the number of participants involved in the supply chain historical transaction dataset;

training a longitudinal federal learning model based on the target feature information set to generate a supply chain transaction data anomaly detection model, comprising:

performing iterative training on at least two local models and a central service model in a longitudinal federal learning model based on the target feature information set, wherein in the iterative training process, the at least two local models exchange own output intermediate feature vectors, and the at least two local models respectively upload own gradient vectors to the central service model;

the central service model aggregates gradient vectors corresponding to the at least two local models to generate aggregate gradient vectors, updates the central service model based on the aggregate gradient vectors, and respectively transmits the aggregate gradient vectors to the at least two local models so that the at least two local models update own models based on the aggregate gradient vectors until the central service model and the at least two local models converge;

and taking the converged central service model as a supply chain transaction data anomaly detection model.

5. The method of claim 4, wherein the central service model aggregates gradient vectors corresponding to the at least two local models to generate an aggregate gradient vector, comprising:

the central service model sums the gradient vectors corresponding to the at least two local models and takes the sum of the gradient vectors as an aggregate gradient vector.

6. The method of claim 4, wherein the at least two local models update their own models based on the aggregate gradient vector, respectively, comprising:

and each local model in the at least two local models respectively calculates the ratio of the aggregate gradient vector to the number of the local models, generates a target gradient vector, and updates the model based on the target gradient vector.

7. The method of claim 1, further comprising, after generating the supply chain transaction data anomaly detection model:

acquiring target supply chain transaction data to be detected, and extracting target characteristic information of the target supply chain transaction data;

inputting the target characteristic information into the supply chain transaction data anomaly detection model;

and judging whether the target supply chain transaction data is abnormal or not according to the output result of the supply chain transaction data abnormality detection model.

8. An abnormality detection model constructing apparatus comprising:

9. An electronic device, the electronic device comprising:

at least one processor; and

a memory communicatively coupled to the at least one processor; wherein,

the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the method of constructing an anomaly detection model of any one of claims 1-7.

10. A computer-readable storage medium storing computer instructions for causing a processor to implement the method of constructing the anomaly detection model of any one of claims 1-7 when executed.