CN112966279A - Distributed data processing method and system - Google Patents

Publication number
CN112966279A
Authority
CN
China
Prior art keywords
node
information
algorithm
data
control information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110183631.1A
Other languages
Chinese (zh)
Other versions
CN112966279B (en)
Inventor
王森
聂二保
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN202110183631.1A
Publication of CN112966279A
Application granted
Publication of CN112966279B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Mathematical Physics (AREA)
  • Storage Device Security (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application relates to a distributed data processing method and system. The method comprises the following steps: acquiring to-be-processed information for calculation, wherein the to-be-processed information comprises at least two pieces of variable information; acquiring, from an algorithm processing server, operation data corresponding to each operation step in an operation flow; generating operation control information corresponding to each first operation node according to the operation data and the variable information; sending the operation control information to the first operation node according to a first node identifier, so that the first operation node calculates an operation result according to the operation control information and sends the operation result to a second operation node according to a second node identifier; and obtaining the final result calculated by the final operation node. The operation steps are performed by a plurality of operation nodes, and each operation node can obtain only part of the algorithm and a single piece of variable information, not the complete picture of the algorithm or of the to-be-processed information, so the algorithm is effectively protected against cracking and the to-be-processed information against leakage.

Description

Distributed data processing method and system
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a distributed data processing method and system.
Background
To price risk well in the field of digital assets, emerging and traditional financial institutions have begun to focus on developing and deploying credit risk pricing models based on big data. Financial technology companies that specialize in designing such models and providing related services have also appeared, and credit risk pricing models play an increasingly important role in the financial field. The main problems in the prior art are as follows:
1. Unauthorized use of the model
If the model runs in the data provider's environment, the data provider can use the model at any time without authorization from the model provider.
2. The model is hard to crack logically but easy to crack by "reverse inference"
The data provider holds the complete model operation process. Although the process may be a black box, the input and output parameters are known to the data provider, so even if the model provider adds noise as an anti-cracking measure, cracking cannot be completely prevented.
In view of the technical problems in the related art, no effective solution is provided at present.
Disclosure of Invention
In order to solve the technical problem or at least partially solve the technical problem, the present application provides a distributed data processing method and system.
In a first aspect, an embodiment of the present application provides a distributed data processing method, including:
acquiring to-be-processed information for calculation, wherein the to-be-processed information comprises at least two pieces of variable information;
acquiring, from an algorithm processing server, operation data corresponding to each operation step in an operation flow, wherein the operation data comprises: an operation rule corresponding to the operation step, a first node identifier of a first operation node for executing the operation step, and a second node identifier of a second operation node for receiving an operation result of the operation step;
generating operation control information corresponding to each first operation node according to the operation data and the variable information;
sending the operation control information to the first operation node according to the first node identification; the first operation node calculates the operation result according to the operation control information and sends the operation result to the second operation node according to the second node identifier;
obtaining a final result calculated by a final operation node; and the final operation node is the last operation node in the operation flow.
Optionally, as the foregoing method: the generating operation control information corresponding to each first operation node according to the operation data and the variable information includes:
inquiring to obtain an algorithm sequence corresponding to the variable information, and generating an operation unit according to the variable information and the corresponding algorithm sequence;
determining the operation data corresponding to each operation unit;
and generating operation control information corresponding to each first operation node according to the operation data and the operation unit.
Optionally, as in the foregoing method, the sending the operation control information to the first operation node according to the first node identifier includes:
encrypting the operation control information according to a preset encryption strategy to obtain encrypted information;
and sending the encrypted information to the first operation node, so that the first operation node decrypts the encrypted information according to a decryption strategy corresponding to the encryption strategy to obtain the operation control information.
In a second aspect, an embodiment of the present application provides a data processing method for model segmentation, including:
dividing the target model, and determining an algorithm corresponding to the sequential division and a logic unit obtained by the division;
obtaining an algorithm sequence corresponding to the minimum logic unit according to all the algorithms corresponding to the minimum logic unit obtained by segmentation;
determining logic relationship information between each minimum logic unit according to the target model;
obtaining a sub-model after segmentation according to the algorithm sequence, the logic relation information and the minimum logic unit;
determining an operation flow and operation data corresponding to each operation step in the operation flow according to the sub-model after division, and sending the operation data to a distribution platform, wherein the operation data comprises: an operation rule corresponding to the operation step, a first node identifier of a first operation node for executing the operation step, and a second node identifier of a second operation node for receiving an operation result of the operation step.
Optionally, as in the foregoing method, the obtaining an algorithm sequence corresponding to a minimum logic unit according to all the algorithms corresponding to the minimum logic unit obtained by partitioning includes:
determining an algorithm according to which the minimum logic units are obtained by the target model through segmentation;
and arranging the algorithms corresponding to each minimum logic unit according to a segmentation sequence to obtain the algorithm sequence corresponding to each minimum logic unit.
Optionally, as in the foregoing method, the obtaining a sub-model after segmentation according to the algorithm sequence, the logical relationship information, and the minimum logical unit includes:
determining the level information of each minimum logic unit according to the algorithm sequence of each minimum logic unit;
determining the calculation sequence number of each minimum logic unit layer by layer according to the logic relationship information and the hierarchy information;
obtaining a base code corresponding to the algorithm sequence according to the base corresponding to each algorithm; wherein the base comprises at least one character;
and obtaining the sub-model after segmentation according to the calculation sequence number of each minimum logic unit, the base code and the minimum logic unit.
Optionally, as in the foregoing method, the obtaining a base code corresponding to the algorithm sequence according to the base corresponding to each algorithm includes:
obtaining a first base code corresponding to each algorithm sequence according to the base corresponding to each algorithm;
determining the longest base code with the largest number of bases in all the first base codes;
determining the base compensation quantity of a second base code according to the maximum base number of the longest base code; wherein the second base code is a first base code whose number of bases is less than that of the longest base code;
supplementing empty logic bases at the rear end of the final algorithm of the second base code according to the base compensation quantity so as to compensate the base quantity of the second base code to the maximum base quantity and obtain a compensated second base code; the final algorithm is the last algorithm in the algorithm sequence; the empty logical base is a base that does not comprise an algorithm;
and obtaining the base code corresponding to each algorithm sequence according to the longest base code and the compensated second base code.
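The base compensation described above amounts to padding every shorter base code up to the length of the longest one. A minimal Python sketch follows; the `BASES` alphabet, the `EMPTY_BASE` character and the unit names are hypothetical, and each base is assumed to be a single character so that string length equals the number of bases.

```python
from typing import Dict, List

# Hypothetical base alphabet: one character per algorithm (an assumption,
# the source only requires that a base comprises at least one character).
BASES = {"add": "A", "sub": "S", "mul": "M", "div": "D", "pow": "P"}
EMPTY_BASE = "_"  # hypothetical empty logical base: a base that carries no algorithm


def first_base_code(sequence: List[str]) -> str:
    """Translate an algorithm sequence into its first (uncompensated) base code."""
    return "".join(BASES[algorithm] for algorithm in sequence)


def compensated_base_codes(sequences: Dict[str, List[str]]) -> Dict[str, str]:
    """Pad every shorter code with empty bases after its final algorithm."""
    first_codes = {unit: first_base_code(seq) for unit, seq in sequences.items()}
    max_bases = max(len(code) for code in first_codes.values())
    return {
        unit: code + EMPTY_BASE * (max_bases - len(code))  # compensation quantity
        for unit, code in first_codes.items()
    }


codes = compensated_base_codes({
    "unit-1": ["add", "mul", "div"],
    "unit-2": ["add"],
})
print(codes)
```

After compensation all base codes have the same number of bases, so the length of a code no longer reveals how deeply its minimum logic unit is nested.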
Optionally, as in the foregoing method, the determining an operation flow according to the divided sub-model and operation data corresponding to each operation step in the operation flow includes:
analyzing the logic relation information to determine each logic unit which is associated with each other;
obtaining the operation flow according to the logic units and the logic relationship information which are mutually associated;
determining corresponding algorithms when the logic units are mutually associated according to the algorithm sequence;
obtaining an algorithm corresponding to each operation step in the operation flow according to the corresponding algorithm when each logic unit is associated with each other;
randomly selecting operation nodes corresponding to the operation steps, and determining node identifiers of the operation nodes;
and obtaining the operation data corresponding to the operation steps according to the node identification of the operation node corresponding to each operation step and an operation rule.
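The random node selection in the last two steps can be sketched as follows. This is an illustration under simplifying assumptions: the operation flow is treated as a linear chain, so the second operation node of a step is taken to be the node executing the next step, and the dictionary keys and node names are hypothetical.

```python
import random
from typing import Dict, List, Optional


def build_operation_data(
    step_rules: List[str],
    node_ids: List[str],
    rng: Optional[random.Random] = None,
) -> List[Dict[str, Optional[str]]]:
    """Randomly pick an executing node per step and chain the receivers."""
    rng = rng or random.Random()
    executors = [rng.choice(node_ids) for _ in step_rules]
    operation_data = []
    for index, rule in enumerate(step_rules):
        is_last = index == len(step_rules) - 1
        operation_data.append({
            "rule": rule,
            "first_node_id": executors[index],
            # Assumption: in a linear flow the receiver is the next step's executor.
            "second_node_id": None if is_last else executors[index + 1],
        })
    return operation_data


data = build_operation_data(
    ["add", "div", "mul"],
    ["node-1", "node-2", "node-3", "node-4"],
    rng=random.Random(0),  # seeded only to make the sketch repeatable
)
for entry in data:
    print(entry)
```

Because the nodes are drawn at random for every run, an observer of one node cannot predict which part of the algorithm it will receive next time.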
In a third aspect, an embodiment of the present application provides a distributed data processing system, including: the system comprises a distribution platform, a data providing end and an algorithm processing server;
the data providing end sends to-be-processed information that needs to be calculated to the distribution platform, wherein the to-be-processed information comprises at least two pieces of variable information;
the algorithm processing server determines operation data corresponding to each operation step in an operation flow and sends the operation data to the distribution platform; the operation data comprises: an operation rule corresponding to the operation step, a first node identifier of a first operation node for executing the operation step, and a second node identifier of a second operation node for receiving an operation result of the operation step;
the distribution platform generates operation control information corresponding to each first operation node according to the operation data and the variable information;
the distribution platform sends the operation control information to the first operation node according to the first node identification;
the first operation node calculates to obtain the operation result according to the operation control information and sends the operation result to the second operation node according to the second node identification;
recursion is carried out according to the above steps until the final operation node is calculated to obtain a final result; and the final operation node is the last operation node in the operation flow.
Optionally, as in the foregoing system, the sending, by the distribution platform, the operation control information to the first operation node according to the first node identifier includes:
the distribution platform encrypts the operation control information according to a preset encryption strategy to obtain first encryption information;
the distribution platform sends the first encryption information to the first operation node;
and the first operation node decrypts the first encryption information according to a decryption strategy corresponding to the encryption strategy to obtain the operation control information.
Optionally, in the foregoing system, the calculating, by the first operation node, the operation result according to the operation control information, and sending the operation result to the second operation node according to the second node identifier includes:
the first operation node calculates to obtain the operation result according to the operation control information;
the first operation node encrypts the operation result according to the encryption strategy to obtain second encryption information, and the second encryption information is sent to the second operation node according to the second node identifier;
the distribution platform generates operation control information corresponding to the second operation node according to the operation data;
the distribution platform encrypts the operation control information according to the encryption strategy to obtain third encryption information, and the third encryption information is sent to the second operation node according to the second node identifier;
and the second operation node decrypts the second encryption information and the third encryption information according to the decryption strategy to obtain corresponding operation data.
In a fourth aspect, an embodiment of the present application provides a distributed data processing apparatus, including:
a first acquisition module, configured to acquire to-be-processed information for calculation, wherein the to-be-processed information comprises at least two pieces of variable information;
a second acquisition module, configured to acquire, from an algorithm processing server, operation data corresponding to each operation step in an operation flow, wherein the operation data comprises: an operation rule corresponding to the operation step, a first node identifier of a first operation node for executing the operation step, and a second node identifier of a second operation node for receiving an operation result of the operation step;
the generating module is used for generating operation control information corresponding to each first operation node according to the operation data and the variable information;
a sending module, configured to send the operation control information to the first operation node according to the first node identifier; the first operation node calculates the operation result according to the operation control information and sends the operation result to the second operation node according to the second node identifier;
the third acquisition module is used for acquiring a final result calculated by the final operation node; and the final operation node is the last operation node in the operation flow.
In a fifth aspect, an embodiment of the present application provides a data processing apparatus for model segmentation, including:
the segmentation module is used for segmenting the target model, and determining an algorithm corresponding to sequential segmentation and a logic unit obtained by segmentation;
the algorithm sequence module is used for obtaining an algorithm sequence corresponding to the minimum logic unit according to all the algorithms corresponding to the minimum logic unit obtained by segmentation;
the determining module is used for determining the logic relationship information between the minimum logic units according to the target model;
the submodel obtaining module is used for obtaining a sub-model after segmentation according to the algorithm sequence, the logic relationship information and the minimum logic unit;
and the sending module is used for determining an operation flow and operation data corresponding to each operation step in the operation flow according to the sub-model after division, and sending the operation data to the distribution platform.
In a sixth aspect, an embodiment of the present application provides an electronic device, including: the system comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement the processing method according to any one of the preceding claims when executing the computer program.
In a seventh aspect, an embodiment of the present application provides a storage medium, where the storage medium includes a stored program, where the program is executed to perform the method steps of any one of the preceding items.
Compared with the prior art, the technical scheme provided by the embodiment of the application has the following advantages:
according to the method provided by the embodiment of the application, the operation step is divided into the plurality of sub-steps, and the plurality of operation nodes are respectively used for operation, so that each operation node can only obtain part of algorithm and single variable information, but cannot obtain the complete picture of the algorithm and the information to be processed, the algorithm can be effectively prevented from being cracked, and the information to be processed is prevented from leaking.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
Fig. 1 is a schematic flowchart of a data processing method for model segmentation according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a data processing method for model segmentation according to another embodiment of the present application;
Fig. 3 is a schematic flowchart of a data processing method for model segmentation according to another embodiment of the present application;
Fig. 4 is a schematic diagram of a model segmentation according to an embodiment of the present application;
Fig. 5 is a schematic diagram of a model segmentation according to another embodiment of the present application;
Fig. 6 is a schematic flowchart of a distributed data processing method according to an embodiment of the present application;
Fig. 7 is a schematic flowchart of a distributed data processing method according to another embodiment of the present application;
Fig. 8 is an architecture diagram of a distributed data processing system according to an embodiment of the present application;
Fig. 9 is a block diagram of a data processing apparatus for model segmentation according to an embodiment of the present application;
Fig. 10 is a block diagram of a distributed data processing apparatus according to an embodiment of the present application;
Fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
A risk evaluation model is a model for evaluating the credit risk of digital assets in circulation after the assets have been digitized; it generally involves collecting, processing and computing on data of every dimension related to the assets. In the traditional risk pricing field, acquiring and processing this data generally relies on a centralized institution and operation flow, for example on-site investigation by a credit analyst and construction of a credit model. In this process, an enterprise that needs to convert its digital assets is willing to provide enterprise data to facilitate model operation, both to obtain credit as soon as possible during financing and to meet the information disclosure requirements placed on financing enterprises. Apart from the financing enterprise, the work is usually completed by a third-party institution that undertakes credit risk pricing, and little data collection or model transfer between parties is involved. For a professional third-party institution such as a credit rating company, the model is core technology and is usually kept confidential: details of the credit risk pricing model are not disclosed to the financing enterprise being evaluated, and information about the model is not disclosed publicly.
As traditional sources of financial assets shrink, inclusive finance is on the rise; consumer (C-end) finance and the financial assets of small and medium-sized enterprises (SMEs) in particular have become a risk preference of financial institutions, especially emerging ones. The corresponding digital assets, as one important form of financial asset, are gradually becoming the mainstream asset form. However, an important difference between digital assets and assets in traditional forms is that digital assets originate mainly from poorly liquid assets held by consumers and by small, medium-sized and micro businesses, which requires a risk pricing model that differs greatly from those for traditional assets in both technical route and underlying data.
To price risk well in the field of digital assets, emerging and traditional financial institutions have begun to focus on developing and deploying credit risk pricing models based on big data. Financial technology companies that specialize in designing such models and providing related services have also appeared, and credit risk pricing models play an increasingly important role in the financial field. The prior-art method and flow are roughly as follows:
1. Production and deployment of models
As the provider of the model, a professional financial technology company collects and cleans credit information data, combines the raw data with its own technical foundation, builds the model by methods such as expert judgment, statistical methods and machine learning, deploys the model to a production environment, and continuously optimizes and iterates on it.
2. Risk pricing model information is generally not disclosed externally
As the core technology and trade secret of a financial technology company, the risk pricing model is its most competitive product. To protect its technical achievements and intellectual property, the company generally does not deploy the model in the data provider's production environment and is unwilling to disclose specific details of the model publicly.
3. Once the model leaves the financial technology company's systems, the security problem is not well solved
Because of information security and privacy protection concerns, a data provider can supply data to a model only under compliant and secure conditions. A relatively innovative business model addresses this: the model provider encrypts the model, deploys it in an independent environment of the data provider, and outputs the result after the calculation.
The prior art scheme has the following defects:
As mentioned above, owing to the particular nature of credit risk pricing models for digital assets, a model often has to be deployed outside the model provider's environment, for example in a data provider's environment or in a platform system that carries out a particular financial transaction.
In the prior art, although the security problem of deploying a model in other environments can be solved to a certain extent, the data provider holds the complete model information. Even if the model runs as a black box, the data provider can feed data directly into the model and obtain results without taking the process apart, and can thereby provide services to customers other than the model provider.
In order to solve at least one of the above technical problems, as shown in fig. 1, an embodiment of the present application provides a data processing method for model segmentation, including the following steps P1 to P5:
Step P1, segmenting the target model, and determining the algorithm corresponding to each successive segmentation and the logic units obtained by the segmentation, wherein the target model comprises a plurality of logic units.
Specifically, the logical structure of the target model may be determined first. The target model may be a formula implementing a specific algorithm, in which case the logical structure of the formula is determined by finding the logic units it contains according to the operation order or operation rules. For example, consider formula (1):
$$M = 0.8 \times \left[\frac{x_1}{3} + \frac{x_2 + x_3}{x_4 + x_5}\right] + 0.2 \times \left(x_6^{2} + x_7 x_8 - x_9 + x_{10}^{3}\right) \tag{1}$$
In the logical structure corresponding to formula (1), the outermost operation is an addition, so the first split yields logic unit I, corresponding to $0.8 \times [x_1/3 + (x_2+x_3)/(x_4+x_5)]$, and logic unit II, corresponding to $0.2 \times (x_6^{2} + x_7 x_8 - x_9 + x_{10}^{3})$; the operation rule between the two logic units is addition.
Specifically, for each division, the algorithm on which the division is based and the logic units it produces are determined; this yields the algorithm by which the resulting logic units are associated with each other.
Furthermore, each logic unit obtained by a division is necessarily a sub-logic unit of a logic unit obtained by the previous division. For example, when formula (1) is first divided, logic unit I, corresponding to $0.8 \times [x_1/3 + (x_2+x_3)/(x_4+x_5)]$, and logic unit II, corresponding to $0.2 \times (x_6^{2} + x_7 x_8 - x_9 + x_{10}^{3})$, are both sub-logic units of the original logic unit (i.e., formula (1)), and the algorithm between logic unit I and logic unit II is "addition"; when logic unit I is further divided to obtain logic unit III, corresponding to $x_1/3$, logic unit III is a sub-logic unit of logic unit I.
Step P2, obtaining an algorithm sequence corresponding to each minimum logic unit according to all the algorithms involved in obtaining that minimum logic unit by segmentation.
That is, the target model is sequentially divided according to the logical structure until the minimum logical unit is obtained. And recording all algorithms related to the obtained minimum logic unit, and obtaining a corresponding algorithm sequence. Typically, each algorithm sequence includes at least one algorithm.
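As an illustration of recording an algorithm sequence per minimum logic unit, the following Python sketch splits formula (1) with the standard `ast` module. Two assumptions are made that the source does not state: each variable is treated as a minimum logic unit, and Python's binary expression tree stands in for the patent's segmentation order (so a chain such as a + b - c + d is split pairwise rather than in one step). The operator names in `OPS` are hypothetical labels.

```python
import ast
from typing import Dict, List

# Hypothetical labels for the algorithm applied at each split.
OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul", ast.Div: "div", ast.Pow: "pow"}


def algorithm_sequences(expression: str) -> Dict[str, List[str]]:
    """Map every variable to the operators met while splitting down to it."""
    sequences: Dict[str, List[str]] = {}

    def split(node: ast.AST, path: List[str]) -> None:
        if isinstance(node, ast.BinOp):
            algorithm = OPS[type(node.op)]
            split(node.left, path + [algorithm])
            split(node.right, path + [algorithm])
        elif isinstance(node, ast.Name):
            sequences[node.id] = path  # reached a minimum logic unit
        # Numeric constants such as 0.8 or 3 carry no variable and are skipped.

    split(ast.parse(expression, mode="eval").body, [])
    return sequences


FORMULA_1 = "0.8*(x1/3 + (x2+x3)/(x4+x5)) + 0.2*(x6**2 + x7*x8 - x9 + x10**3)"
sequences = algorithm_sequences(FORMULA_1)
for unit, sequence in sequences.items():
    print(unit, sequence)
```

Under these assumptions the first entry of every sequence is the outermost addition, matching the first split of formula (1) into logic units I and II.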
Step P3, determining the logical relationship information between the minimum logic units according to the target model.
Specifically, taking a formula as the target model, the logical relationship information between the minimum logic units may include: information on the operational relationships that exist between different minimum logic units, and information on the relationships between intermediate variables obtained from different minimum logic units. Each minimum logic unit also inherits the logical relationship between the intermediate variable it forms part of and the other intermediate variables.
And P4, obtaining the sub-model after segmentation according to the algorithm sequence, the logic relation information and the minimum logic unit.
Specifically, each divided sub-model includes an algorithm sequence corresponding to the minimum logical unit and logical relationship information.
And then, determining how to assemble each minimum logic unit according to the algorithm sequence and the logic relation information so as to restore and obtain the target model.
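Steps P1 and P2 can be sketched as a recursive walk over an expression. The nested-tuple representation, the function name `split_model`, and the small sample model are illustrative assumptions, not part of the original method; the sketch only records, for each minimum logic unit, the algorithms on which the successive divisions were based.

```python
# Assumed representation: an expression is either a variable name (str)
# or a tuple (operator, operand, operand, ...).

def split_model(expr, seq=()):
    """Recursively divide `expr`; return a list of (minimum unit, algorithm sequence)."""
    if isinstance(expr, str):
        # A single variable cannot be divided further: it is a minimum logic unit.
        return [(expr, list(seq))]
    op, *operands = expr
    units = []
    for operand in operands:
        # Every unit below this division inherits the algorithm it was based on.
        units.extend(split_model(operand, seq + (op,)))
    return units

# Hypothetical target model: x1/x2 + x3*(x4+x5)
model = ("+", ("/", "x1", "x2"), ("*", "x3", ("+", "x4", "x5")))
units = split_model(model)
for name, algorithms in units:
    print(name, " -> ".join(algorithms))
```

Each printed line pairs a minimum logic unit with its algorithm sequence in division order, which is the input that steps P3 and P4 build on.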
Step P5. Determine the operation flow, and the operation data corresponding to each operation step in the operation flow, according to the divided sub-model, and send the operation data to a distribution platform, wherein the operation data comprises: the operation rule corresponding to the operation step, a first node identifier of the first operation node that executes the operation step, and a second node identifier of the second operation node that receives the operation result of the operation step.
Specifically, the divided sub-model is obtained from the algorithm sequence, the logical relationship information, and the minimum logic units, all of which result from splitting the target model according to its logical structure. The target model can therefore be restored from the logical relationship information and the minimum logic units in the divided sub-model, together with the operation flow followed when calculating with the target model. Taking formula (1) as an example, the operation flow calculates $(x_2+x_3)$ and $(x_4+x_5)$ in the first step; $(x_2+x_3)/(x_4+x_5)$ and $x_1/3$ in the second step; $x_1/3+(x_2+x_3)/(x_4+x_5)$ in the third step; and $0.8 \times [x_1/3+(x_2+x_3)/(x_4+x_5)]$ in the fourth step, after which the final result is calculated. In addition, once the operation flow is obtained, the operation data corresponding to each step in the operation flow can be derived by combining it with the algorithm sequence corresponding to each minimum logic unit.
Optionally, the operation data may be operation data corresponding to each operation step generated after determining one operation step, or may be operation data corresponding to all operation steps obtained after planning operation nodes for executing each operation step in the operation flow in advance.
The first operation node may be an operation node corresponding to each operation step; the second operation node may be an operation node for receiving operation results obtained in the operation steps; therefore, when the second operation node receives the operation result, it can receive the operation data again as the first operation node to determine the operation rule to be executed on the received operation result and the destination of the new operation result (i.e. another second operation node).
With the method of this embodiment, a target model can be divided into a plurality of minimum logic units to obtain the corresponding divided sub-model, the operation flow can be restored from the divided sub-model, and the operation flow can be sent to the distribution platform. Each operation step in the operation flow can then be calculated by a different operation node, so that no single operating party can obtain the complete algorithm of the target model, which prevents the algorithm from leaking.
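The four-step operation flow described for formula (1) can be reproduced by grouping operations by their depth in the expression and running the deepest group first, i.e., in the reverse of the division order. The following is a minimal sketch under that assumption; the tuple representation and the name `operation_flow` are hypothetical.

```python
def operation_flow(expr):
    """Group the binary operations of a nested expression into steps, deepest first."""
    by_depth = {}

    def visit(node, depth):
        if not isinstance(node, tuple):
            return str(node)            # a variable or a constant
        op, left, right = node
        text = f"({visit(left, depth + 1)} {op} {visit(right, depth + 1)})"
        by_depth.setdefault(depth, []).append(text)
        return text

    visit(expr, 0)
    # The deepest operations were divided last, so they are computed first.
    return [by_depth[d] for d in sorted(by_depth, reverse=True)]

# First weighted term of formula (1): 0.8 * [x1/3 + (x2+x3)/(x4+x5)]
expr = ("*", 0.8, ("+", ("/", "x1", 3), ("/", ("+", "x2", "x3"), ("+", "x4", "x5"))))
flow = operation_flow(expr)
for number, step in enumerate(flow, start=1):
    print(f"step {number}: {', '.join(step)}")
```

Each step contains operations that do not depend on one another, so they could be handed to different operation nodes.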
In some embodiments, as shown in fig. 2, in the aforementioned method, step P2 of obtaining the algorithm sequence corresponding to each minimum logic unit according to all the algorithms on which the divisions yielding that minimum logic unit are based includes the following steps P21 and P22:
Step P21. Determine, in order, the algorithms on which the successive divisions of the target model that yield each minimum logic unit are based.
Step P22. Arrange the algorithms corresponding to each minimum logic unit in division order to obtain the algorithm sequence corresponding to that minimum logic unit.
That is, when the target model is divided, the correspondence between the division order and the algorithm used at each division on the way to the minimum logic unit is recorded.
The algorithm sequence is obtained by arranging the algorithms in division order.
For example, when the algorithms on which the successive divisions yielding minimum logic unit z are based are, in order, "+", "×", and "×", the corresponding algorithm sequence is: + → × → ×.
As shown in fig. 3, in some embodiments of the aforementioned method, step P4 of obtaining the divided sub-model according to the algorithm sequence, the logical relationship information, and the minimum logic units includes the following steps P41 to P44:
Step P41. Determine the hierarchy information of each minimum logic unit according to the number of algorithms in its algorithm sequence.
That is, the hierarchy information of each minimum logic unit corresponds to the number of algorithms in its algorithm sequence.
For example, when the longest algorithm sequence among all minimum logic units contains 5 algorithms, a minimum logic unit whose algorithm sequence contains 5 algorithms may be assigned to the first level, one whose algorithm sequence contains 4 algorithms to the second level, and so on.
Specifically, since the number of algorithms equals the number of divisions, the minimum logic units may equivalently be assigned to levels by their number of divisions.
Step P42. Determine the calculation sequence number of each minimum logic unit layer by layer according to the logical relationship information and the hierarchy information.
Specifically, the calculation sequence numbers are assigned layer by layer according to the hierarchy information, so that the order of subsequent calculation follows from the number of algorithms of each minimum logic unit. The logical relationship information is also used when assigning the sequence numbers, so that logically related minimum logic units are ordered adjacently and can be distributed by calculation sequence number when the model is later restored.
Step P43. Obtain the base code corresponding to the algorithm sequence according to the base corresponding to each algorithm, wherein each base comprises at least one character.
That is, the base code corresponding to the algorithm sequence is obtained by binding a base to each algorithm.
Optionally, when the four bases "A, T, C, G" are paired with the four algorithms "addition, subtraction, multiplication, division", there are 4 × 4 = 16 ways of pairing a base with an algorithm. For example, the correspondence "A-addition, T-subtraction, C-multiplication, G-division" may be chosen. In actual processing, the decryption difficulty can be increased by multi-layer nesting: for example, each additional layer of nesting such as "AT-addition, TA-subtraction, CG-multiplication, GC-division" increases the decryption difficulty geometrically, further preventing the algorithm from being deciphered.
Step P44. Obtain the divided sub-model according to the calculation sequence number and the base code of each minimum logic unit and the minimum logic unit itself.
That is, each divided sub-model includes the calculation sequence number and the base code corresponding to each minimum logic unit.
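Steps P41 to P44 can be sketched as follows, using the pairing "A-addition, T-subtraction, C-multiplication, G-division" from the text. The sample units, the field names, and the simple "deepest level first, otherwise keep input order" numbering are assumptions made for illustration; the original method also uses the logical relationship information when numbering, which this sketch does not model.

```python
BASE_OF = {"+": "A", "-": "T", "*": "C", "/": "G"}  # pairing taken from the text

def base_code(algorithm_sequence):
    """Step P43: translate an algorithm sequence into a base code."""
    return "".join(BASE_OF[op] for op in algorithm_sequence)

def build_submodel(units):
    """Steps P41, P42, P44 for a list of (unit name, algorithm sequence)."""
    longest = max(len(seq) for _, seq in units)
    # Level 1 holds the units with the longest sequence, level 2 the next, and so on.
    ranked = sorted(units, key=lambda unit: longest - len(unit[1]))  # stable sort
    return [
        {
            "calc_no": number,
            "unit": name,
            "level": longest - len(seq) + 1,
            "base_code": base_code(seq),
        }
        for number, (name, seq) in enumerate(ranked, start=1)
    ]

# Hypothetical minimum logic units of x1/x2 + x3*(x4+x5)
units = [
    ("x1", ["+", "/"]),
    ("x2", ["+", "/"]),
    ("x3", ["+", "*"]),
    ("x4", ["+", "*", "+"]),
    ("x5", ["+", "*", "+"]),
]
submodel = build_submodel(units)
for entry in submodel:
    print(entry)
```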
In some embodiments of the aforementioned method, step P43 of obtaining the base code corresponding to the algorithm sequence according to the base corresponding to each algorithm includes the following steps P431 to P435:
Step P431. Obtain a first base code corresponding to each algorithm sequence according to the base corresponding to each algorithm.
Specifically, the first base code is the base code initially obtained for each algorithm sequence.
Step P432. Determine the longest base code, i.e., the first base code with the largest number of bases.
Specifically, the longest base code is the one with the largest number of bases among all first base codes, i.e., the one corresponding to the algorithm sequence with the most algorithms.
Step P433. Determine the base compensation number of each second base code according to the maximum base number of the longest base code, wherein a second base code is a first base code whose number of bases is smaller than that of the longest base code.
Specifically, the maximum base number is the number of bases in the longest base code; the base compensation number of each second base code is therefore the difference between the maximum base number and the number of bases in that second base code.
Step P434. Append empty logic bases after the final algorithm of the second base code according to the base compensation number, so as to bring the number of bases of the second base code up to the maximum base number and obtain a compensated second base code; the final algorithm is the last algorithm in the algorithm sequence, and an empty logic base is a base that carries no algorithm.
Specifically, the final algorithm is the algorithm on which the last division was based when the minimum logic unit corresponding to the second base code was obtained. For example, based on step P53 in the previous embodiment, when the maximum base number is 4 and there is a second base code C-A-C whose leftmost base corresponds to the final algorithm, an empty logic base (e.g., U) is added on the leftmost side, giving U-C-A-C as the compensated second base code.
Step P435. Obtain the base code corresponding to each algorithm sequence according to the longest base code and the compensated second base codes.
Specifically, since the longest base code already has the maximum number of bases, it needs no compensation; once the compensated second base codes are obtained, the base code corresponding to each algorithm sequence is given by the longest base code and the compensated second base codes.
With the method of this embodiment, the base codes are brought to a uniform length, so that when the model is later restored, the system can decide directly from each base in a base code whether the variable information of the corresponding minimum logic unit needs to be operated on and whether it needs to be distributed, which simplifies identification and judgment.
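The compensation in steps P432 to P435 amounts to padding every shorter code with the empty logic base up to the length of the longest one. The sketch below assumes, as in the C-A-C example, that codes are written with the base of the final algorithm on the left, so the padding is added on the left; codes are written without hyphens, and the name `compensate` is hypothetical.

```python
EMPTY_BASE = "U"  # empty logic base: carries no algorithm

def compensate(first_codes):
    """Pad each shorter base code on the left with the empty base (steps P432 to P434)."""
    max_bases = max(len(code) for code in first_codes)      # step P432
    # str.rjust adds exactly (max_bases - len(code)) fill characters (step P433).
    return [code.rjust(max_bases, EMPTY_BASE) for code in first_codes]

codes = compensate(["GCAC", "CAC", "AC"])
print(codes)
```

After compensation every code has the same length, so a reader can inspect position by position and skip any position holding `U`.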
In some embodiments of the aforementioned method, step P5 of determining the operation flow, and the operation data corresponding to each operation step in the operation flow, according to the divided sub-model includes the following steps P51 to P56:
Step P51. Parse the logical relationship information and determine the logic units that are associated with each other.
Step P52. Obtain the operation flow according to the mutually associated logic units and the logical relationship information.
Step P53. Determine, according to the algorithm sequence, the algorithm by which the logic units are associated with each other.
Step P54. Obtain the algorithm corresponding to each operation step in the operation flow according to the algorithm by which the logic units are associated with each other.
Specifically, steps P51 to P54 are the inverse of steps P1 to P4: the target model is split in steps P1 to P4 in the reverse order of the operation flow, so the operation flow and the algorithm corresponding to each operation step can be restored by steps P51 to P54.
Step P55. Randomly select the operation node that executes each operation step, and determine the node identifier of each operation node.
Step P56. Obtain the operation data corresponding to each operation step according to the node identifier of the operation node corresponding to that step and the operation rule.
That is, the operation data stores the node identifiers of the operation nodes.
Because each operation node is obtained by random selection, the operation nodes are prevented from cooperating with each other to restore the target model.
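Steps P55 and P56 can be sketched as below. The step list format, the field names, and the candidate node identifiers are hypothetical; the sketch draws a distinct node for every operation step and records, for each step, which node receives its result.

```python
import random

def plan_operation_data(steps, candidate_nodes, rng=random):
    """steps: list of (rule, index of the step that consumes the result, or None)."""
    # Step P55: one randomly chosen, distinct node per operation step.
    nodes = rng.sample(candidate_nodes, len(steps))
    operation_data = []
    for index, (rule, next_step) in enumerate(steps):
        # Step P56: bundle the rule with the executing and the receiving node.
        operation_data.append({
            "rule": rule,
            "first_node_id": nodes[index],
            "second_node_id": None if next_step is None else nodes[next_step],
        })
    return operation_data

# (x2 + x3) / (x4 + x5): two additions feed one division.
steps = [("add", 2), ("add", 2), ("div", None)]
candidates = [f"node-{i:02d}" for i in range(10)]
data = plan_operation_data(steps, candidates, random.Random(0))
for entry in data:
    print(entry)
```

Passing a seeded `random.Random` instance only makes the sketch reproducible; a deployment would presumably want an unpredictable source of randomness.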
Application example:
(I) partitioning according to logical structure
(1) First tier logical partitioning
Since the risk pricing model of digital assets is generally a statistical model, the final result is the sum over different data dimensions, each multiplied by its weight. The first layer of logic division therefore yields a weight sequence and a corresponding sequence of data dimensions. For ease of description, the model is illustrated by formula (1):
$M = 0.8 \times [x_1/3 + (x_2+x_3)/(x_4+x_5)] + 0.2 \times (x_6^2 + x_7x_8 - x_9 + x_{10}^3)$ (1)
Optionally, when the two logic units corresponding to different weights are each regarded as a separate whole, the first-layer logic division splits the model into $0.8 \times [x_1/3 + (x_2+x_3)/(x_4+x_5)]$ and $0.2 \times (x_6^2 + x_7x_8 - x_9 + x_{10}^3)$, and the weight sequence corresponding to the two logic units is 0.8 and 0.2. The logic units obtained after this division are "$x_1/3 + (x_2+x_3)/(x_4+x_5)$" and "$x_6^2 + x_7x_8 - x_9 + x_{10}^3$".
The logical units obtained by the layer division can be expressed as formula (2) and formula (3).
$m_1 = x_1/3 + (x_2+x_3)/(x_4+x_5)$ (2)
$m_2 = x_6^2 + x_7x_8 - x_9 + x_{10}^3$ (3)
In formula (2) and formula (3), the subscripts 1 and 2 of $m_1$ and $m_2$ denote the first logic unit and the second logic unit formed by this layer of logic division; the same convention applies below.
The number of digits in the subscript of m represents the hierarchy information of the logic division. As a general example, the four-digit subscript of $m_{1212}$ indicates that this logic unit is obtained after four divisions, so its hierarchy information is 4. The logic unit $m_{1212}$ is obtained as follows: the target model m is divided a first time and its first logic unit $m_1$ is taken; $m_1$ is divided a second time and the second logic unit $m_{12}$ under that division is taken; $m_{12}$ is divided a third time and the first logic unit $m_{121}$ under that division is taken; finally, $m_{121}$ is divided a fourth time and the second logic unit $m_{1212}$ under that division is taken. The division is shown schematically in fig. 4:
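The subscript convention can be read mechanically: the number of digits gives the hierarchy information and each digit selects one logic unit at the corresponding division. The helper below is an illustrative sketch; its name is hypothetical, and it assumes every division produces at most nine logic units so that each index is a single digit.

```python
def describe(label):
    """Decode a label such as 'm1212' into (level, path, ancestor labels)."""
    digits = label[1:]                        # drop the leading 'm'
    path = [int(d) for d in digits]           # unit chosen at each division
    ancestors = ["m" + digits[:i] for i in range(1, len(digits))]
    return len(digits), path, ancestors

level, path, ancestors = describe("m1212")
print(level, path, ancestors)
```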
(2) second layer segmentation
Following the same procedure as the first layer division, the second layer division results are obtained as formula (4) to formula (9).
$m_{11} = x_1/3$ (4)
$m_{12} = (x_2+x_3)/(x_4+x_5)$ (5)
$m_{21} = x_6^2 = x_6 \times x_6$ (6)
$m_{22} = x_7 \times x_8$ (7)
$m_{23} = -x_9$ (8)
$m_{24} = x_{10}^3 = x_{10} \times x_{10} \times x_{10}$ (9)
It should be noted that formula (4) contains only one independent variable, which is involved neither in an operation with itself (exponentiation) nor in an operation with other independent variables, so the third division no longer includes $x_1$; the same applies to $x_9$.
(3) Third layer segmentation
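The results of the third layer of logic division are formula (10) to formula (18):
$m_{121} = x_2 + x_3$ (10)
$m_{122} = x_4 + x_5$ (11)
$m_{211} = x_6$ (12)
$m_{212} = x_6$ (13)
$m_{221} = x_7$ (14)
$m_{222} = x_8$ (15)
$m_{241} = x_{10}$ (16)
$m_{242} = x_{10}$ (17)
$m_{243} = x_{10}$ (18)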
(4) fourth layer segmentation
The results of the fourth layer of logic division are formula (19) to formula (22):
$m_{1211} = x_2$ (19)
$m_{1212} = x_3$ (20)
$m_{1221} = x_4$ (21)
$m_{1222} = x_5$ (22)
Logic division ends when each minimum logic unit contains exactly one independent variable. If the same independent variable appears several times in the result, that variable takes part in the operation several times.
The resulting operational logic framework is shown in the model logic partitioning framework diagram in fig. 5:
As can be seen from fig. 5, the model in this example is divided into four layers, finally yielding 13 basic logic units, which are, from bottom to top: $m_{1211}$, $m_{1212}$, $m_{1221}$, $m_{1222}$, $m_{211}$, $m_{212}$, $m_{221}$, $m_{222}$, $m_{241}$, $m_{242}$, $m_{243}$, $m_{11}$, $m_{23}$. The criterion for a final logic unit is that it is not divided any further and that only one independent variable is involved in its operation.
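The count of 13 basic logic units can be checked by writing the division of formula (1) as a nested list and generating the labels from the position of each unit. The nested-list encoding and the function name `minimum_units` are assumptions made for this sketch; the structure itself follows formulas (2) to (22).

```python
# Each inner list is one division; each string is a minimum logic unit.
MODEL = [
    [                                   # m1 = x1/3 + (x2+x3)/(x4+x5)
        "x1",                           # m11 = x1/3, not divided further
        [["x2", "x3"], ["x4", "x5"]],   # m12 -> m121, m122 -> four units
    ],
    [                                   # m2 = x6^2 + x7*x8 - x9 + x10^3
        ["x6", "x6"],                   # m21 = x6 * x6
        ["x7", "x8"],                   # m22 = x7 * x8
        "x9",                           # m23 = -x9, not divided further
        ["x10", "x10", "x10"],          # m24 = x10 * x10 * x10
    ],
]

def minimum_units(node, label="m"):
    """Return {label: variable} for every minimum logic unit below `node`."""
    if isinstance(node, str):
        return {label: node}
    found = {}
    for index, child in enumerate(node, start=1):
        found.update(minimum_units(child, f"{label}{index}"))
    return found

units = minimum_units(MODEL)
print(len(units), sorted(units))
```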
(II) base cipher processing of model logic
(1) Random determination of base pairing rules
Since there are only four bases "A, T, C, G" and four algorithms "addition, subtraction, multiplication, division", there are 4 × 4 = 16 ways of pairing a base with an algorithm. For simplicity, this example uses the pairing "A-addition, T-subtraction, C-multiplication, G-division". In actual processing, the decryption difficulty can be increased by multi-layer nesting; for example, each layer of nesting such as "AT-addition, TA-subtraction, CG-multiplication, GC-division" increases the decryption difficulty geometrically. Some logic units do not span all division levels; for example, $m_{212}$ is divided only down to the third layer, so the algorithm base for its fourth layer is replaced by "U", marking the logic at that layer as absent.
(2) Logical unit base pairing
According to the level of logical partitioning, the calculated sequences are matched according to the base pairing principle (the matching process includes all four layers of operation logic, not only one layer), as shown in table 1:
Table 1: Base pairing table corresponding to logical division
[Table 1: image in the original publication]
In table 1, the first column is the logic unit and the second column is the calculation sequence number; all unit logics of the whole formula can later be "assembled" according to the calculation sequence numbers. All elements corresponding to the first-level division must be multiplied by a weight value, so their bases are all "C".
(III) obtaining a logical Unit base code
The calculation sequence numbers, base codes, and weights of the logic units are assembled into a logical unit base code list, as shown in table 2:
Table 2: Logical unit base code list
[Table 2: image in the original publication]
As shown in fig. 6, according to another embodiment of the present application, a distributed data processing method is further provided, including the following steps S1 to S5:
Step S1. Receive information to be processed for calculation, wherein the information to be processed comprises at least two pieces of variable information.
Specifically, the method of this embodiment may be applied to a distribution platform that distributes each piece of variable information. The information to be processed is data to be calculated according to a preset algorithm, i.e., it must be run through the algorithm to obtain the corresponding calculation result. Each piece of variable information may include a variable type and a variable value; the variable type may be, for example, a variable such as $x_1$ or $x_2$, or information characterizing the specific meaning of the variable, such as principal or interest rate.
Step S2. Acquire, from the algorithm processing server, the operation data corresponding to each operation step in the operation flow, wherein the operation data comprises: the operation rule corresponding to the operation step, a first node identifier of the first operation node that executes the operation step, and a second node identifier of the second operation node that receives the operation result of the operation step.
The operation nodes provide the computing capacity for the operation steps, and each operation step corresponds to a different operation node.
Specifically, each operation step has its own operation node. Optionally, the operation flow may be determined in real time: whenever variable information needs to be operated on, an operation node is selected and an operation flow comprising one operation step is determined. The operation flow may also comprise a plurality of operation steps covering the variable information or intermediate variable information that includes it (for example, when the variable corresponding to the variable information is $x_1$, the intermediate variable information may be an intermediate variable obtained from $x_1$, such as $(x_1+x_2)$ or $x_3(x_1+x_2)$).
Generally, in order to break up centralized processing and increase the difficulty of decryption, each operation node processes only a limited number of operations (e.g., at most one), so that no node handles enough operation steps to make derivation of the algorithm likely.
The operation rule may be, for example, addition, subtraction, multiplication, division, or cubing, together with the operational relationship between the variables involved (e.g., when $x_1$ and $x_2$ are to be divided, the rule specifies both that a division is performed and the positions of $x_1$ and $x_2$ relative to the division sign, i.e., which is the dividend and which is the divisor).
The first operation node is the operation node that corresponds to the operation step and processes the variable information; the second operation node receives the operation result and calculates with it. Each operation node has a node identifier, which may be address information or unique identification information of the operation node, so that the target operation node to which operation information must be sent can be located by its node identifier.
In some cases an operation involves several pieces of variable information, so the same operation node may receive several pieces of variable information; when the algorithm corresponding to the variable information is negation or taking the reciprocal, the operation node receives only one piece of variable information.
Step S3. Generate the operation control information corresponding to each first operation node according to the operation data and the variable information.
Specifically, when the first operation node receives variable information that has not yet undergone any calculation and has an associated second operation node to which the operation result is to be sent, the operation control information includes: the operation rule corresponding to the operation step, the first node identifier of the first operation node that executes the operation step, the second node identifier of the second operation node that receives the operation result, and the variable information.
When the variable information on which the first operation node calculates is an intermediate variable, the node only needs to receive the operation rule, the first node identifier, and the second node identifier.
When the first operation node is the operation node corresponding to the last operation step in the operation flow, it only needs to receive the operation rule and the first node identifier.
Step S4. Send the operation control information to the first operation node according to the first node identifier, so that the first operation node calculates according to the operation control information to obtain an operation result and sends the operation result to the second operation node according to the second node identifier.
Step S5. Obtain the final result calculated by the final operation node, wherein the final operation node is the last operation node in the operation flow.
Specifically, the first operation node obtains the operation result by operating on the variable information according to the operation rule in the operation control information.
In addition, because the operation control information also includes the node identifier of the second operation node, the first operation node can send the operation result to the corresponding second operation node according to that node identifier, so that the second operation node can continue with the next operation step.
The same procedure is applied recursively: the operation result obtained by the second operation node is sent on to the next operation node, until the final operation node calculates the final result from the operation results received from the preceding operation nodes and the operation rule it has received. After the second operation node obtains its operation result, it can obtain the node identifier of the next operation node to which the result must be forwarded from the information issued by the control end. Furthermore, the distribution platform implementing the method of this embodiment only performs the distribution action and cannot obtain the full picture of either the information to be processed or the algorithm model.
Further, in this embodiment, the algorithm, the second node identifier and the variable information may be simultaneously sent to the first operation node, or the algorithm and the variable information may be first sent to the first operation node, and after the current operation node feeds back the operation result, the second node identifier is sent to the first operation node; both methods can achieve the goal that the first operation node sends the operation result to the second operation node executing the next operation step.
According to the method in the embodiment, the operation step is divided into the sub-steps, and the operation is performed by the operation nodes respectively, so that each operation node can only acquire part of the algorithm and single variable information, but cannot acquire the complete picture of the algorithm and the information to be processed, the algorithm can be effectively prevented from being cracked, and the information to be processed is prevented from leaking.
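The forwarding of results in steps S3 to S5 can be simulated in a single process. Everything below is a hedged sketch: the node identifiers, the dictionary layout of the operation control information, and the function `run_flow` are hypothetical, and real operation nodes would be separate machines exchanging messages rather than entries in one dictionary.

```python
import operator

RULES = {"add": operator.add, "div": operator.truediv}

def run_flow(control_info):
    """control_info maps node id -> operation control information, in flow order."""
    # Each node starts with only the variable values addressed to it.
    inbox = {node_id: list(info.get("variables", []))
             for node_id, info in control_info.items()}
    result = None
    for node_id, info in control_info.items():
        left, right = inbox[node_id]
        result = RULES[info["rule"]](left, right)   # the node sees one rule only
        target = info.get("second_node_id")
        if target is not None:                      # the final node has no target
            inbox[target].append(result)
    return result

# (x2 + x3) / (x4 + x5) with x2=2, x3=4, x4=1, x5=2
control_info = {
    "node-17": {"rule": "add", "variables": [2.0, 4.0], "second_node_id": "node-42"},
    "node-08": {"rule": "add", "variables": [1.0, 2.0], "second_node_id": "node-42"},
    "node-42": {"rule": "div"},   # intermediate inputs only, last step of the flow
}
final_result = run_flow(control_info)
print(final_result)
```

In this sketch `node-42` never sees the original variable values and the two adding nodes never learn that a division follows, which is the separation the embodiment aims for.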
As shown in fig. 7, in some embodiments of the aforementioned method, step S3 of generating the operation control information corresponding to each first operation node according to the operation data and the variable information includes the following steps S31 to S34:
Step S31. Query the algorithm sequence corresponding to the variable information, and generate an operation unit according to the variable information and the corresponding algorithm sequence.
Specifically, the variable information may include type information, and the algorithm sequence may also correspond to one type information; furthermore, the query may be performed by matching characters of the type information, and after the algorithm sequence is obtained by the query, a field may be added to the algorithm sequence for writing a variable value of the variable information, thereby generating the operation unit.
The content in the foregoing embodiments can be referred to for a method for acquiring an algorithm sequence, which is not described herein again.
Step S32. Determine the operation data corresponding to each operation unit.
Specifically, the operation data is received from the algorithm processing server and is determined from the divided sub-model, which includes the algorithm sequence; the operation unit corresponding to an algorithm sequence can therefore be identified, and the operation data corresponding to that operation unit determined.
Step S33. Generate the operation control information corresponding to each first operation node according to the operation data and the operation unit.
Because the operation unit includes the variable information, the operation control information corresponding to each first operation node can be obtained from the operation data and the operation unit.
Further, the second operation node may receive several operation results, and for an operation such as division, exchanging the divisor and the dividend gives a completely different result. The operation data may therefore also carry information specifying the order of the operation results coming from the first operation nodes. One optional representation of this information is "$x_1$, $x_2$, division", from which the second operation node knows that it should compute $x_1 \div x_2$; when the second operation node instead receives "$x_2$, $x_1$, division", the operation it should perform is $x_2 \div x_1$.
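The effect of the ordering information can be shown with a small sketch. The function name `apply_ordered` and the way results are keyed by operand name are assumptions for illustration only.

```python
def apply_ordered(rule, order, results):
    """Apply `rule` to the received results in the order named by `order`."""
    left, right = (results[name] for name in order)
    if rule == "division":
        return left / right
    raise ValueError(f"unsupported rule: {rule}")

# Results may arrive in any order; the ordering information fixes their roles.
received = {"x2": 3.0, "x1": 6.0}
print(apply_ordered("division", ("x1", "x2"), received))  # x1 / x2
print(apply_ordered("division", ("x2", "x1"), received))  # x2 / x1
```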
In some embodiments of the aforementioned method, step S4 of sending the operation control information to the first operation node according to the first node identifier includes the following steps S41 and S42:
Step S41. Encrypt the operation control information according to a preset encryption strategy to obtain encrypted information.
Step S42. Send the encrypted information to the first operation node, so that the first operation node decrypts it according to the decryption strategy corresponding to the encryption strategy and obtains the operation control information.
Specifically, the encryption policy and the decryption policy in this embodiment are set corresponding to each other, and optionally, corresponding public keys and private keys may be used.
Therefore, the operation control information including the algorithm, the node identification and the variable information can be encrypted through the private key; and the public key is distributed in advance to each candidate operation node that can be used for calculation. After the candidate operation node is selected, the encrypted information can be decrypted according to the public key to obtain the operation rule, the node identification and the variable information.
With the method of this embodiment, distributed information that is intercepted by an unauthorized party cannot be read, which prevents the data and the algorithm from leaking and compromising security.
Application example:
the operation assembly unit and the distributed operation are realized by the method in the embodiment:
the base code and the data unit of the logic unit are combined into an operation subunit, and the operation subunit and the logic unit can form an operation unit.
Based on the previous application example, it can be known from the process of model segmentation that each logic unit corresponds to a different independent variable x, and the segmented data units and the logic units are combined to form an operation unit as shown in table 3:
Table 3: Logical unit, base code and independent variable correspondence table
[Table 3: image in the original publication]
On the left of each operation unit is its calculation sequence number, which serves as the serial number for ordering the operation units.
When an operation node is instructed to perform a calculation, the calculation sequence number can be used to fix the order in which two or more pieces of variable information enter the operation. For example, when calculating the quotient of $m_{121}$ and $m_{122}$, the sequence numbers 1, 2, 3, and 4 corresponding to $m_{1211}$, $m_{1212}$, $m_{1221}$, and $m_{1222}$ determine that the division to perform is $m_{121}/m_{122}$ rather than $m_{122}/m_{121}$.
Distributed computing process: random selection of the distributed computing environment and operation in the reverse order of logic cutting
An operation node corresponding to each operation unit is selected in the blockchain at random, and the operation node provides the computing power to run the operation unit. The specific process is as follows:
(1) First distribution and operation of the operation units
The operation units are distributed following the order of logic segmentation and according to the model logic-cutting process. If the operation units are sent according to the number of segmentations, the units distributed the first time can be No. 1, No. 2, No. 3 and No. 4; if units are only being stored, other operation units may be distributed as well. In particular, operation units No. 1, 2, 3 and 4 may be distributed the first time without an algorithm, for storage only. Random distribution scrambles the storage state that resulted from centralized cutting; the decentralized process makes decryption harder, and because each operation node is selected at random, the possibility of "collusion" is reduced.
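A minimal sketch of the random selection of operation nodes follows. The node names, the function name and the optional seed are hypothetical; the patent does not specify which random method is used, and a production system would more likely draw from a cryptographically secure source.

```python
import random
from typing import Dict, List, Optional


def assign_units_to_nodes(
    unit_numbers: List[int],
    candidate_nodes: List[str],
    seed: Optional[int] = None,
) -> Dict[int, str]:
    """Pick one candidate operation node at random for every operation unit."""
    if not candidate_nodes:
        raise ValueError("at least one candidate operation node is required")
    rng = random.Random(seed)  # the seed only makes this sketch reproducible
    return {number: rng.choice(candidate_nodes) for number in unit_numbers}


# First distribution: operation units No. 1 to No. 4 go to randomly chosen nodes.
plan = assign_units_to_nodes([1, 2, 3, 4], ["node-a", "node-b", "node-c"])
```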
The result of this distribution and operation is $x_2$, $x_3$, $x_4$, $x_5$, $x_1$ and $x_9$, where $x_2$, $x_3$, $x_4$ and $x_5$ exist alone. When $x_1$ and $x_9$ are distributed, for example, they can carry their associated algorithms, so that the operation nodes calculate the corresponding operation results: $x_1/3$ and $-x_9$.
(2) Multiple distributions of the operation units and distributed operation
According to the generation process of the operation units, the following results are obtained in sequence:
a. Second distribution and operation
Obtaining: $x_2+x_3$ and $x_4+x_5$.
b. Third distribution and operation
Obtaining: $(x_2+x_3)/(x_4+x_5)$, $x_6 \times x_6$ (equivalent to $x_6^2$), $x_7 \times x_8$, and $x_{10} \times x_{10} \times x_{10}$ (equivalent to $x_{10}^3$).
c. Fourth distribution and operation
Obtaining two results: $[x_1/3+(x_2+x_3)/(x_4+x_5)]$ and $[x_6^2+x_7 \times x_8+x_{10} \times x_{10} \times x_{10}]$.
d. Operation and assembly of the final result
The distributed operation results are combined at a randomly selected distributed computing node. Substituting $[x_1/3+(x_2+x_3)/(x_4+x_5)]$ and $[x_6^2+x_7 \times x_8+x_{10} \times x_{10} \times x_{10}]$ into algorithm A with its weight coefficients (already identified in the operation unit) restores formula (1):
$$0.8 \times [x_1/3+(x_2+x_3)/(x_4+x_5)] + 0.2 \times [x_6^2+x_7 \times x_8+x_{10}^3] \quad (1)$$
This completes the restoration of formula (1), which is the target model.
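The round-by-round evaluation above can be checked against a direct evaluation of formula (1) with a short sketch. All function and variable names are illustrative assumptions, and each round is computed in a single process here rather than on separate nodes. The term $-x_9$ from the first distribution is left out because formula (1) as printed does not contain it.

```python
from typing import Dict


def target_model(x: Dict[int, float]) -> float:
    """Formula (1) evaluated in one place, used here only as a reference."""
    return 0.8 * (x[1] / 3 + (x[2] + x[3]) / (x[4] + x[5])) + 0.2 * (
        x[6] ** 2 + x[7] * x[8] + x[10] ** 3
    )


def staged_evaluation(x: Dict[int, float]) -> float:
    """Evaluate formula (1) round by round, as separate operation nodes would."""
    # First distribution: x2..x5 are stored as-is; x1 carries its algorithm.
    x1_third = x[1] / 3
    # Second distribution: the two sums.
    numerator = x[2] + x[3]
    denominator = x[4] + x[5]
    # Third distribution: the quotient and the product terms.
    quotient = numerator / denominator
    x6_squared = x[6] * x[6]
    x7_x8 = x[7] * x[8]
    x10_cubed = x[10] * x[10] * x[10]
    # Fourth distribution: the two bracketed results.
    left = x1_third + quotient
    right = x6_squared + x7_x8 + x10_cubed
    # Final assembly: apply the weight coefficients of algorithm A.
    return 0.8 * left + 0.2 * right


values = {i: float(i) for i in range(1, 11)}
difference = abs(staged_evaluation(values) - target_model(values))
```

For any inputs with a non-zero denominator $x_4+x_5$, the two functions should agree up to floating-point rounding.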
As shown in Fig. 8, according to another aspect of the present application, there is also provided a distributed data processing system including: a distribution platform 3, a data providing terminal 2 and an algorithm processing server 1;
the data providing terminal 2 sends to-be-processed information that needs to be calculated to the distribution platform 3, where the to-be-processed information includes at least two pieces of variable information;
the algorithm processing server 1 determines operation data corresponding to each operation step in the operation flow and sends the operation data to the distribution platform 3; the operation data includes: an algorithm corresponding to the operation step, a first node identifier of a first operation node 401 that executes the operation step, and a second node identifier of a second operation node 402 that receives the operation result of the operation step;
the distribution platform 3 generates operation control information corresponding to each first operation node 401 according to the operation data and the variable information;
the distribution platform 3 sends the operation control information to the first operation node 401 according to the first node identifier;
the first operation node 401 calculates an operation result according to the operation control information, and sends the operation result to the second operation node 402 according to the second node identifier;
the above steps are repeated recursively until the final operation node 403 calculates the final result, where the final operation node is the last operation node in the operation flow.
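The flow of operation data through the system can be sketched as a small in-process simulation. The class name `OperationData`, the `inbox` model of operation control information, and the use of `None` to mark the final operation node are assumptions made for illustration; the patent does not prescribe a data layout.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class OperationData:
    rule: Callable[[List[float]], float]  # algorithm for this operation step
    first_node_id: str                    # node that executes the step
    second_node_id: Optional[str]         # node that receives the result; None marks the final node


def run_flow(steps: List[OperationData], variables: Dict[str, List[float]]) -> float:
    """Simulate the distribution platform driving the nodes in flow order."""
    # Operation control information is modeled as each node's inbox,
    # pre-filled with the variable information assigned to that node.
    inbox = {s.first_node_id: list(variables.get(s.first_node_id, [])) for s in steps}
    for step in steps:  # assumed to be listed in operation-flow order
        result = step.rule(inbox[step.first_node_id])
        if step.second_node_id is None:
            return result  # the final operation node produced the final result
        inbox[step.second_node_id].append(result)
    raise ValueError("the operation flow has no final operation node")


steps = [
    OperationData(sum, "node-a", "node-c"),                # x2 + x3
    OperationData(sum, "node-b", "node-c"),                # x4 + x5
    OperationData(lambda v: v[0] / v[1], "node-c", None),  # quotient on the final node
]
final_result = run_flow(steps, {"node-a": [2.0, 3.0], "node-b": [4.0, 6.0]})
```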
For the specific process by which each component of the system in this embodiment implements its functions, refer to the related description in the method embodiments; details are not repeated here.
In some embodiments, in the foregoing system, the distribution platform 3 sending the operation control information to the first operation node 401 according to the first node identifier includes:
the distribution platform 3 encrypts the operation control information according to a preset encryption strategy to obtain first encrypted information;
the distribution platform 3 sends the first encrypted information to the first operation node 401;
the first operation node 401 decrypts the first encrypted information according to the decryption strategy corresponding to the encryption strategy to obtain the operation control information.
For the specific process by which each component of the system in this embodiment implements its functions, refer to the related description in the method embodiments; details are not repeated here.
In some embodiments, in the foregoing system, the first operation node 401 calculating an operation result according to the operation control information and sending the operation result to the second operation node 402 according to the second node identifier includes:
the first operation node 401 calculates the operation result according to the operation control information;
the first operation node 401 encrypts the operation result according to the encryption strategy to obtain second encrypted information, and sends the second encrypted information to the second operation node 402 according to the second node identifier;
the distribution platform 3 generates operation control information corresponding to the second operation node 402 according to the operation data;
the distribution platform 3 encrypts the operation control information according to the encryption strategy to obtain third encrypted information, and sends the third encrypted information to the second operation node 402 according to the second node identifier;
the second operation node 402 decrypts the second encrypted information and the third encrypted information according to the decryption strategy to obtain the corresponding operation data.
For the specific process by which each component of the system in this embodiment implements its functions, refer to the related description in the method embodiments; details are not repeated here.
As shown in Fig. 9, according to an embodiment of another aspect of the present application, there is also provided a data processing apparatus for model segmentation, including:
a segmentation module 11, configured to segment the target model and determine the algorithm corresponding to each successive segmentation and the logic units obtained by segmentation;
an algorithm sequence module 12, configured to obtain the algorithm sequence corresponding to each minimum logic unit according to all the algorithms corresponding to the minimum logic units obtained by segmentation;
a determining module 13, configured to determine, according to the target model, logical relationship information between the minimum logic units;
a sub-model obtaining module 14, configured to obtain the segmented sub-model according to the algorithm sequences, the logical relationship information and the minimum logic units;
and a sending module 15, configured to determine an operation flow and the operation data corresponding to each operation step in the operation flow according to the segmented sub-model, and send the operation data to the distribution platform.
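The work of the segmentation module and the algorithm sequence module can be sketched with a recursive walk over a nested expression. Representing the target model as nested tuples of the form `(algorithm, operand, ...)` and the function name `segment` are assumptions made for this illustration.

```python
from typing import List, Tuple, Union

# A model is either a variable name (a minimum logic unit)
# or a tuple of the form (algorithm, operand, operand, ...).
Expr = Union[str, Tuple]


def segment(model: Expr, path: Tuple[str, ...] = ()) -> List[Tuple[str, Tuple[str, ...]]]:
    """Return (minimum logic unit, algorithm sequence) pairs in segmentation order."""
    if isinstance(model, str):
        return [(model, path)]
    algorithm, *operands = model
    units: List[Tuple[str, Tuple[str, ...]]] = []
    for operand in operands:
        # Each successive segmentation appends its algorithm to the sequence.
        units.extend(segment(operand, path + (algorithm,)))
    return units


# (x2 + x3) / (x4 + x5) + x1, written as nested tuples.
model = ("add", ("div", ("add", "x2", "x3"), ("add", "x4", "x5")), "x1")
units = segment(model)
```

The length of each algorithm sequence also indicates how deep the unit sits in the model, which is the kind of hierarchy information used when ordering the operation steps.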
For the specific process by which each module of the apparatus in this embodiment implements its functions, refer to the related description in the method embodiments; details are not repeated here.
As shown in Fig. 10, according to an embodiment of another aspect of the present application, there is also provided a distributed data processing apparatus including:
a first obtaining module 21, configured to obtain to-be-processed information used for calculation, where the to-be-processed information includes at least two pieces of variable information;
a second obtaining module 22, configured to obtain, from the algorithm processing server, operation data corresponding to each operation step in the operation flow, where the operation data includes: an operation rule corresponding to the operation step, a first node identifier of a first operation node that executes the operation step, and a second node identifier of a second operation node that receives the operation result of the operation step;
a generating module 23, configured to generate operation control information corresponding to each first operation node according to the operation data and the variable information;
a sending module 24, configured to send the operation control information to the first operation node according to the first node identifier, so that the first operation node calculates the operation result according to the operation control information and sends the operation result to the second operation node according to the second node identifier;
a third obtaining module 25, configured to obtain the final result calculated by the final operation node, where the final operation node is the last operation node in the operation flow.
For the specific process by which each module of the apparatus in this embodiment implements its functions, refer to the related description in the method embodiments; details are not repeated here.
According to another embodiment of the present application, there is also provided an electronic device. As shown in Fig. 11, the electronic device may include a processor 1501, a communication interface 1502, a memory 1503 and a communication bus 1504, where the processor 1501, the communication interface 1502 and the memory 1503 communicate with one another through the communication bus 1504.
The memory 1503 is configured to store a computer program;
the processor 1501 is configured to implement the steps of the above method embodiments when executing the program stored in the memory 1503.
The bus mentioned for the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The embodiment of the present application further provides a storage medium, where the storage medium includes a stored program, and the program executes the method steps of the foregoing method embodiment when running.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present invention, which enable those skilled in the art to understand or practice the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (15)

1. A distributed data processing method, comprising:
acquiring to-be-processed information for calculation, wherein the to-be-processed information comprises at least two pieces of variable information;
acquiring operation data corresponding to each operation step in an operation flow from an algorithm processing server, wherein the operation data comprises: an operation rule corresponding to the operation step, a first node identifier of a first operation node for executing the operation step, and a second node identifier of a second operation node for receiving an operation result of the operation step;
generating operation control information corresponding to each first operation node according to the operation data and the variable information;
sending the operation control information to the first operation node according to the first node identification; the first operation node calculates the operation result according to the operation control information and sends the operation result to the second operation node according to the second node identifier;
obtaining a final result calculated by a final operation node; and the final operation node is the last operation node in the operation flow.
2. The method of claim 1, wherein: the generating operation control information corresponding to each first operation node according to the operation data and the variable information includes:
inquiring to obtain an algorithm sequence corresponding to the variable information, and generating an operation unit according to the variable information and the corresponding algorithm sequence;
determining the operation data corresponding to each operation unit;
and generating operation control information corresponding to each first operation node according to the operation data and the operation unit.
3. The method of claim 1, wherein sending the operation control information to the first operation node according to the first node identifier comprises:
encrypting the operation control information according to a preset encryption strategy to obtain encrypted information;
and sending the encrypted information to the first operation node, so that the first operation node decrypts the encrypted information according to a decryption strategy corresponding to the encryption strategy to obtain the operation control information.
4. A data processing method for model segmentation, comprising:
dividing the target model, and determining an algorithm corresponding to the sequential division and a logic unit obtained by the division;
obtaining an algorithm sequence corresponding to the minimum logic unit according to all the algorithms corresponding to the minimum logic unit obtained by segmentation;
determining logic relationship information between each minimum logic unit according to the target model;
obtaining a sub-model after segmentation according to the algorithm sequence, the logic relation information and the minimum logic unit;
determining an operation flow and operation data corresponding to each operation step in the operation flow according to the sub-model after division, and sending the operation data to a distribution platform, wherein the operation data comprises: an operation rule corresponding to the operation step, a first node identifier of a first operation node for executing the operation step, and a second node identifier of a second operation node for receiving an operation result of the operation step.
5. The method according to claim 4, wherein said obtaining a sequence of algorithms corresponding to the minimum logical unit according to all the algorithms corresponding to the minimum logical unit obtained by the segmentation comprises:
determining an algorithm according to which the minimum logic units are obtained by the target model through segmentation;
and arranging the algorithms corresponding to each minimum logic unit according to a segmentation sequence to obtain the algorithm sequence corresponding to each minimum logic unit.
6. The method of claim 4, wherein obtaining the split sub-models according to the algorithm sequence, the logical relationship information, and the minimum logical unit comprises:
determining the level information of each minimum logic unit according to the algorithm sequence of each minimum logic unit;
determining the calculation sequence number of each minimum logic unit layer by layer according to the logic relationship information and the hierarchy information;
obtaining a base code corresponding to the algorithm sequence according to the base corresponding to each algorithm; wherein the base comprises at least one character;
and obtaining the sub-model after segmentation according to the calculation sequence number of each minimum logic unit, the base code and the minimum logic unit.
7. The method of claim 6, wherein obtaining the base code corresponding to the algorithm sequence based on the base corresponding to each algorithm comprises:
obtaining a first base code corresponding to each algorithm sequence according to the base corresponding to each algorithm;
determining the longest base code with the largest number of bases in all the first base codes;
determining the base compensation quantity of a second base code according to the maximum base quantity of the longest base code; wherein the second base code is a first base code whose number of bases is less than that of the longest base code;
supplementing empty logic bases at the rear end of the final algorithm of the second base code according to the base compensation quantity so as to compensate the base quantity of the second base code to the maximum base quantity and obtain a compensated second base code; the final algorithm is the last algorithm in the algorithm sequence; the empty logical base is a base that does not comprise an algorithm;
and obtaining the base code corresponding to each algorithm sequence according to the longest base code and the compensated second base code.
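The compensation step recited in claim 7 can be illustrated with a short sketch. It assumes that each base is a single character and uses a hypothetical symbol `N` for the empty logical base; the claims only require that a base comprise at least one character and do not name any symbols.

```python
from typing import Dict

EMPTY_BASE = "N"  # hypothetical symbol for an empty logical base (carries no algorithm)


def pad_base_codes(first_codes: Dict[str, str]) -> Dict[str, str]:
    """Pad every shorter base code at its rear end up to the longest base code."""
    if not first_codes:
        return {}
    max_bases = max(len(code) for code in first_codes.values())
    return {
        unit: code + EMPTY_BASE * (max_bases - len(code))  # base compensation quantity
        for unit, code in first_codes.items()
    }


# Hypothetical first base codes for three minimum logic units.
padded = pad_base_codes({"x1": "AG", "x2": "AGCT", "x6": "A"})
```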
8. The method of claim 4, wherein the determining an operation flow according to the sub-model after division and operation data corresponding to each operation step in the operation flow comprises:
analyzing the logic relation information to determine each logic unit which is associated with each other;
obtaining the operation flow according to the logic units and the logic relationship information which are mutually associated;
determining corresponding algorithms when the logic units are mutually associated according to the algorithm sequence;
obtaining an algorithm corresponding to each operation step in the operation flow according to the corresponding algorithm when each logic unit is associated with each other;
randomly selecting operation nodes corresponding to the operation steps, and determining node identifiers of the operation nodes;
and obtaining the operation data corresponding to the operation steps according to the node identification of the operation node corresponding to each operation step and an operation rule.
9. A distributed data processing system, comprising: the system comprises a distribution platform, a data providing end and an algorithm processing server;
the data providing end sends to-be-processed information that needs to be calculated to the distribution platform, wherein the to-be-processed information comprises at least two pieces of variable information;
the algorithm processing server determines operation data corresponding to each operation step in an operation flow and sends the operation data to the distribution platform; the operation data comprises: an operation rule corresponding to the operation step, a first node identifier of a first operation node for executing the operation step, and a second node identifier of a second operation node for receiving an operation result of the operation step;
the distribution platform generates operation control information corresponding to each first operation node according to the operation data and the variable information;
the distribution platform sends the operation control information to the first operation node according to the first node identification;
the first operation node calculates to obtain the operation result according to the operation control information and sends the operation result to the second operation node according to the second node identification;
the above steps are repeated recursively until the final operation node calculates a final result; and the final operation node is the last operation node in the operation flow.
10. The system of claim 9, wherein the distribution platform sending the operation control information to the first operation node according to the first node identifier comprises:
the distribution platform encrypts the operation control information according to a preset encryption strategy to obtain first encrypted information;
the distribution platform sends the first encrypted information to the first operation node;
and the first operation node decrypts the first encrypted information according to a decryption strategy corresponding to the encryption strategy to obtain the operation control information.
11. The system according to claim 10, wherein the first operation node calculates the operation result according to the operation control information, and sends the operation result to the second operation node according to the second node identifier, including:
the first operation node calculates to obtain the operation result according to the operation control information;
the first operation node encrypts the operation result according to the encryption strategy to obtain second encrypted information, and sends the second encrypted information to the second operation node according to the second node identifier;
the distribution platform generates operation control information corresponding to the second operation node according to the operation data;
the distribution platform encrypts the operation control information according to the encryption strategy to obtain third encrypted information, and sends the third encrypted information to the second operation node according to the second node identifier;
and the second operation node decrypts the second encrypted information and the third encrypted information according to the decryption strategy to obtain corresponding operation data.
12. A distributed data processing apparatus, comprising:
a first obtaining module, configured to obtain to-be-processed information for calculation, wherein the to-be-processed information comprises at least two pieces of variable information;
a second obtaining module, configured to obtain operation data corresponding to each operation step in an operation flow from an algorithm processing server, wherein the operation data comprises: an operation rule corresponding to the operation step, a first node identifier of a first operation node for executing the operation step, and a second node identifier of a second operation node for receiving an operation result of the operation step;
the generating module is used for generating operation control information corresponding to each first operation node according to the operation data and the variable information;
a sending module, configured to send the operation control information to the first operation node according to the first node identifier; the first operation node calculates the operation result according to the operation control information and sends the operation result to the second operation node according to the second node identifier;
the third acquisition module is used for acquiring a final result calculated by the final operation node; and the final operation node is the last operation node in the operation flow.
13. A data processing apparatus for model segmentation, comprising:
the segmentation module is used for segmenting the target model, and determining an algorithm corresponding to sequential segmentation and a logic unit obtained by segmentation;
the algorithm sequence module is used for obtaining an algorithm sequence corresponding to the minimum logic unit according to all the algorithms corresponding to the minimum logic unit obtained by segmentation;
the determining module is used for determining the logic relationship information between the minimum logic units according to the target model;
the submodel obtaining module is used for obtaining a sub-model after segmentation according to the algorithm sequence, the logic relationship information and the minimum logic unit;
and the sending module is used for determining an operation flow and operation data corresponding to each operation step in the operation flow according to the sub-model after division, and sending the operation data to the distribution platform.
14. An electronic device, comprising: the system comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
the memory is used for storing a computer program;
the processor, when executing the computer program, implementing the method steps of any of claims 1-8.
15. A storage medium, characterized in that the storage medium comprises a stored program, wherein the program is operative to perform the method steps of any of the preceding claims 1-8.
CN202110183631.1A 2021-02-08 2021-02-08 Distributed data processing method and system Active CN112966279B (en)


Publications (2)

Publication Number Publication Date
CN112966279A true CN112966279A (en) 2021-06-15
CN112966279B CN112966279B (en) 2023-11-03




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant