CN112966279B - Distributed data processing method and system - Google Patents

Distributed data processing method and system

Info

Publication number
CN112966279B
Authority
CN
China
Prior art keywords
node
information
algorithm
base
data
Prior art date
Legal status
Active
Application number
CN202110183631.1A
Other languages
Chinese (zh)
Other versions
CN112966279A (en)
Inventor
王森
聂二保
Current Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN202110183631.1A
Publication of CN112966279A
Application granted
Publication of CN112966279B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G06F21/70 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer
    • G06F21/71 Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

The application relates to a distributed data processing method and system. The method comprises the following steps: acquiring information to be processed that requires calculation, the information to be processed comprising at least two pieces of variable information; acquiring, from an algorithm processing server, operation data corresponding to each operation step in an operation flow; generating operation control information corresponding to each first operation node according to the operation data and the variable information; sending the operation control information to the first operation node according to the first node identifier, so that the first operation node calculates an operation result according to the operation control information and sends the operation result to the second operation node according to the second node identifier; and obtaining the final result calculated by the final operation node. Because the operation steps are carried out by a plurality of operation nodes, each operation node can only obtain a partial algorithm and a single piece of variable information, and cannot obtain the full view of the algorithm or of the information to be processed, so the algorithm is effectively protected from being cracked and the information to be processed is protected from being leaked.

Description

Distributed data processing method and system
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a distributed data processing method and system.
Background
In order to complete risk pricing well in the digital asset field, emerging and traditional financial institutions alike are focusing on the development and implementation of credit risk pricing models based on big data; on this basis, many fintech companies that specialize in such models and provide related services have emerged and occupy an increasingly important position in the financial field. The problems with the prior art solutions are mainly as follows:
1. Unauthorized use of models
If the model runs in the environment of the data provider, the data provider can use the model at any time without the authorization of the model provider.
2. The model logic is difficult to keep confidential and can easily be cracked by reverse derivation
The data provider holds the complete model operation process. Although the process can be a black box, the input and output parameters are clear to the data provider, and even if the model provider adds noise to resist cracking to a certain degree, it cannot completely prevent the model from being cracked.
No effective solution has yet been proposed for these technical problems in the related art.
Disclosure of Invention
In order to solve the above technical problems or at least partially solve the above technical problems, the present application provides a distributed data processing method and system.
In a first aspect, an embodiment of the present application provides a distributed data processing method, including:
acquiring information to be processed that requires calculation, wherein the information to be processed comprises at least two pieces of variable information;
acquiring operation data corresponding to each operation step in an operation flow from an algorithm processing server, wherein the operation data comprises: an algorithm corresponding to the operation step, a first node identifier of a first operation node executing the operation step, and a second node identifier of a second operation node used for receiving the operation result of the operation step;
generating operation control information corresponding to each first operation node according to the operation data and the variable information;
transmitting the operation control information to the first operation node according to the first node identifier; the first operation node calculates the operation result according to the operation control information, and sends the operation result to the second operation node according to the second node identifier;
obtaining a final operation node and calculating to obtain a final result; the final operation node is the last operation node in the operation flow.
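Purely as an illustrative sketch (the field and class names below are assumptions for illustration, not terms defined by this application), the operation data and operation control information described above could be represented as simple structures such as the following Python dataclasses:

    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class OperationData:
        """Operation data for one operation step in the operation flow."""
        algorithm: str                 # algorithm corresponding to the operation step, e.g. "+"
        first_node_id: str             # identifier of the first operation node executing the step
        second_node_id: Optional[str]  # identifier of the second operation node receiving the result
                                       # (None for the final operation node)

    @dataclass
    class Variable:
        """A single piece of variable information in the information to be processed."""
        var_type: str                  # variable type, e.g. "x1"
        value: float                   # variable value

    @dataclass
    class OperationControlInfo:
        """Operation control information sent to a first operation node."""
        algorithm: str
        first_node_id: str
        second_node_id: Optional[str]
        variables: List[Variable]      # may be empty when the node only receives intermediate results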
Optionally, in the foregoing method, the generating the operation control information corresponding to each first operation node according to the operation data and the variable information includes:
inquiring to obtain an algorithm sequence corresponding to the variable information, and generating an operation unit according to the variable information and the corresponding algorithm sequence;
determining the operation data corresponding to each operation unit;
and generating operation control information corresponding to each first operation node according to the operation data and the operation unit.
Optionally, in the foregoing method, the sending the operation control information to the first operation node according to the first node identifier includes:
encrypting the operation control information according to a preset encryption strategy to obtain encryption information;
and sending the encrypted information to the first operation node so that the first operation node decrypts the encrypted information according to a decryption strategy corresponding to the encryption strategy to obtain the operation control information.
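The encryption strategy itself is not fixed by the application; as one hedged example only, a symmetric scheme such as Fernet from the Python cryptography package could be used, assuming the distribution platform and the first operation node share the key in advance:

    import json
    from cryptography.fernet import Fernet

    # Assumption: the key has been distributed to the first operation node beforehand.
    key = Fernet.generate_key()
    cipher = Fernet(key)

    operation_control_info = {
        "algorithm": "/",
        "first_node_id": "node-17",
        "second_node_id": "node-42",
        "variables": [{"var_type": "x1", "value": 12.0}],
    }

    # Distribution platform side: encrypt according to the preset encryption strategy.
    encrypted = cipher.encrypt(json.dumps(operation_control_info).encode("utf-8"))

    # First operation node side: decrypt with the corresponding decryption strategy.
    decrypted = json.loads(cipher.decrypt(encrypted).decode("utf-8"))
    assert decrypted == operation_control_info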
In a second aspect, an embodiment of the present application provides a data processing method for model segmentation, including:
dividing the target model, and determining an algorithm corresponding to the sequential division and a logic unit obtained by the division;
Obtaining an algorithm sequence corresponding to the minimum logic unit according to all the algorithms corresponding to the minimum logic unit obtained by segmentation;
determining logic relation information among each minimum logic unit according to the target model;
obtaining a sub-model after segmentation according to the algorithm sequence, the logic relation information and the minimum logic unit;
determining an operation flow according to the segmented sub-models, and operation data corresponding to each operation step in the operation flow, and sending the operation data to a distribution platform, wherein the operation data comprises: an algorithm corresponding to the operation step, a first node identifier of a first operation node executing the operation step, and a second node identifier of a second operation node used for receiving the operation result of the operation step.
Optionally, in the foregoing method, the obtaining an algorithm sequence corresponding to the minimum logic unit according to all the algorithms corresponding to the minimum logic unit obtained by segmentation includes:
determining an algorithm on which the minimum logic unit is sequentially obtained by dividing the target model;
and arranging the algorithm corresponding to each minimum logic unit according to the segmentation order to obtain the algorithm sequence corresponding to each minimum logic unit.
Optionally, in the foregoing method, the obtaining the segmented sub-model according to the algorithm sequence, the logic relationship information and the minimum logic unit includes:
determining the hierarchical information of each minimum logic unit according to the algorithm sequence of each minimum logic unit;
determining the calculation sequence number of each minimum logic unit layer by layer according to the logic relation information and the hierarchy information;
obtaining a base code corresponding to the algorithm sequence according to the base corresponding to each algorithm; wherein the base comprises at least one character;
and obtaining the sub-model after segmentation according to the calculation sequence number of each minimum logic unit, the base code and the minimum logic unit.
Optionally, according to the method, the obtaining a base code corresponding to the algorithm sequence according to the base corresponding to each algorithm includes:
obtaining a first base code corresponding to each algorithm sequence according to the base corresponding to each algorithm;
determining the longest base code with the largest number of bases in all the first base codes;
determining the base compensation number of the second base code according to the maximum base number of the longest base code; wherein the second base code is a first base code having fewer bases than the longest base code;
supplementing empty logical bases at the rear end of the final algorithm of the second base code according to the base compensation number, so as to compensate the base number of the second base code to the maximum base number and obtain a compensated second base code; the final algorithm is the last algorithm in the algorithm sequence; the empty logical base is a base that does not contain an algorithm;
and obtaining base codes corresponding to the algorithm sequences according to the longest base code and the compensated second base code.
Optionally, in the foregoing method, the determining an operation flow according to the segmented sub-model and operation data corresponding to each operation step in the operation flow includes:
analyzing the logic relation information and determining each logic unit which is associated with each other;
obtaining the operation flow according to the logic units and the logic relation information which are mutually related;
determining corresponding algorithms when the logic units are associated with each other according to the algorithm sequence;
obtaining an algorithm corresponding to each operation step in the operation flow according to the algorithm corresponding to each logic unit when the logic units are related to each other;
Randomly selecting operation nodes corresponding to each operation step, and determining node identification of each operation node;
and obtaining the operation data corresponding to the operation steps according to the node identifiers of the operation nodes corresponding to the operation steps and the operation rules.
In a third aspect, an embodiment of the present application provides a distributed data processing system, including: the system comprises a distribution platform, a data providing end and an algorithm processing server;
the data providing end sends information to be processed that requires calculation to the distribution platform, wherein the information to be processed comprises at least two pieces of variable information;
the algorithm processing server determines operation data corresponding to each operation step in the operation flow and sends the operation data to the distribution platform; the operation data includes: an algorithm corresponding to the operation step, a first node identifier of a first operation node executing the operation step, and a second node identifier of a second operation node used for receiving the operation result of the operation step;
the distribution platform generates operation control information corresponding to each first operation node according to the operation data and the variable information;
The distribution platform sends the operation control information to the first operation node according to the first node identification;
the first operation node calculates the operation result according to the operation control information and sends the operation result to the second operation node according to the second node identifier;
recursion is performed until a final operation node calculates to obtain a final result; the final operation node is the last operation node in the operation flow.
Optionally, in the foregoing system, the sending, by the distribution platform, the operation control information to the first operation node according to the first node identifier includes:
the distribution platform encrypts the operation control information according to a preset encryption strategy to obtain first encrypted information;
the distribution platform sends the first encrypted information to the first operation node;
and the first operation node decrypts the first encryption information according to a decryption strategy corresponding to the encryption strategy to obtain the operation control information.
Optionally, in the foregoing system, the first operation node calculates the operation result according to the operation control information, and sends the operation result to the second operation node according to the second node identifier, including:
The first operation node calculates the operation result according to the operation control information;
the first operation node encrypts the operation result according to the encryption strategy to obtain second encryption information, and the second encryption information is sent to the second operation node according to the second node identification;
the distribution platform generates operation control information corresponding to the second operation node according to the operation data;
the distribution platform encrypts the operation control information according to the encryption strategy to obtain third encryption information, and the third encryption information is sent to the second operation node according to the second node identifier;
and the second operation node decrypts the second encryption information and the third encryption information according to the decryption strategy to obtain corresponding operation data.
In a fourth aspect, an embodiment of the present application provides a distributed data processing apparatus, including:
the first acquisition module is used for acquiring information to be processed that requires calculation, wherein the information to be processed comprises at least two pieces of variable information;
the second acquisition module is configured to acquire operation data corresponding to each operation step in an operation flow from the algorithm processing server, where the operation data includes: an algorithm corresponding to the operation step, a first node identifier of a first operation node executing the operation step, and a second node identifier of a second operation node used for receiving the operation result of the operation step;
The generation module is used for generating operation control information corresponding to each first operation node according to the operation data and the variable information;
the sending module is used for sending the operation control information to the first operation node according to the first node identification; the first operation node calculates the operation result according to the operation control information, and sends the operation result to the second operation node according to the second node identifier;
the third acquisition module is used for acquiring a final result obtained by calculation of the final operation node; the final operation node is the last operation node in the operation flow.
In a fifth aspect, an embodiment of the present application provides a data processing apparatus for model segmentation, including:
the segmentation module is used for segmenting the target model and determining an algorithm corresponding to the sequential segmentation and a logic unit obtained by the segmentation;
the algorithm sequence module is used for obtaining an algorithm sequence corresponding to the minimum logic unit according to all the algorithms corresponding to the minimum logic unit obtained through segmentation;
the determining module is used for determining logic relation information among the minimum logic units according to the target model;
The sub-model acquisition module is used for acquiring a sub-model after segmentation according to the algorithm sequence, the logic relation information and the minimum logic unit;
and the sending module is used for determining an operation flow according to the segmented sub-model and operation data corresponding to each operation step in the operation flow, and sending the operation data to a distribution platform.
In a sixth aspect, an embodiment of the present application provides an electronic device, including: the device comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
the memory is used for storing a computer program;
the processor is configured to implement any one of the foregoing processing methods when executing the computer program.
In a seventh aspect, an embodiment of the present application provides a storage medium comprising a stored program, wherein the program, when run, performs the method steps of any one of the foregoing methods.
Compared with the prior art, the technical scheme provided by the embodiment of the application has the following advantages:
According to the method provided by the embodiments of the application, the operation is split into multiple sub-steps that are carried out by multiple operation nodes respectively, so that each operation node can only obtain a partial algorithm and a single piece of variable information, and cannot obtain the full view of the algorithm or of the information to be processed; the algorithm is thereby effectively protected from being cracked, and the information to be processed is protected from being leaked.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
In order to more clearly illustrate the embodiments of the application or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flow chart of a data processing method for model segmentation according to an embodiment of the present application;
FIG. 2 is a flow chart of a data processing method for model segmentation according to another embodiment of the present application;
FIG. 3 is a flow chart of a data processing method for model segmentation according to another embodiment of the present application;
FIG. 4 is a schematic diagram of a model segmentation according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a model segmentation according to another embodiment of the present application;
FIG. 6 is a flowchart of a distributed data processing method according to an embodiment of the present application;
FIG. 7 is a flowchart of a distributed data processing method according to another embodiment of the present application;
FIG. 8 is a block diagram of a distributed data processing system according to an embodiment of the present application;
FIG. 9 is a block diagram of a data processing apparatus for model segmentation according to an embodiment of the present application;
FIG. 10 is a block diagram of a distributed data processing apparatus according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The risk evaluation model is a model for evaluating the credit risk of digital assets in the circulation process after the assets have been digitized, and it generally involves the collection, processing and computation of data of various dimensions related to the assets. In the traditional risk pricing field, data acquisition and processing are generally completed by a centralized institution and process, such as on-site investigation by a credit analyst and the establishment of a credit model. In this process, enterprises that need to digitize their assets are willing to provide enterprise data, both to obtain credit as early as possible during financing and to meet the information disclosure requirements placed on financing enterprises, so as to facilitate model operation and related processes; the handling of the related enterprise information is usually completed by a third-party institution that bears the credit risk pricing rather than by the financing enterprise, and no excessive data acquisition or model transmission is involved. As a professional third-party institution, a credit rating company treats its model as a core technology and keeps it confidential: it neither discloses the details of the credit risk pricing model to the financing enterprise being evaluated nor publishes information about the model publicly.
With the reduction of traditional financial asset sources and the rise of inclusive finance, consumer finance for the C-end and SME (small, medium and micro enterprise) financial assets in particular are becoming the risk preference of various financial institutions, especially emerging ones. The corresponding digital assets, as one of the important financial asset forms, are becoming the mainstream. However, an important difference between digital assets and assets in traditional forms is that their sources are dominated by assets of poor liquidity from the C-end and from small and micro individuals, which requires risk pricing models that differ significantly from traditional ones in technical route and data basis.
In order to complete risk pricing well in the digital asset field, emerging and traditional financial institutions alike are focusing on the development and implementation of credit risk pricing models based on big data; on this basis, many fintech companies that specialize in such models and provide related services have emerged and occupy an increasingly important position in the financial field. The prior art method and process are generally as follows:
1. Production and deployment of models
As a model provider, a professional fintech company collects and cleans credit information data, combines the original data with its own technical foundation, completes model construction through proprietary methods, statistical methods, machine learning and the like, deploys the model into a production environment, and continuously optimizes and iterates it.
2. Risk pricing model information is generally not shared externally
As the core technology and confidential asset of a fintech company, the risk pricing model is its most competitive product. In order to protect its technological achievements and intellectual property, the fintech company generally does not deploy the model in the production environment of a data provider and is not willing to publish the specific details of the model to society.
3. Once the model leaves the fintech company's own system, its security cannot be well guaranteed
For reasons of information security, privacy protection and the like, a data provider can only provide data for the model to operate on under compliant and secure conditions; on this basis, the model provider encrypts the model, deploys it into an independent environment of the data provider, and outputs the result after calculation.
Drawbacks of the prior art solutions:
as described above, due to the specificity of the credit risk pricing model of digital assets, deployment from a model provider environment is often required, such as in a data provider environment, or in a platform system that performs a particular financial business.
In the prior art, although the security problem of the model deployment in other environments can be solved to a certain extent, as the data provider has relatively complete model information, even if the model runs in a black box, the data can be directly input into the model to obtain a result without disassembling the process, and services are provided for other clients except the model provider, so that technically, an operation mechanism satisfactory to both parties cannot be established.
In order to solve at least one of the above technical problems, as shown in fig. 1, an embodiment of the present application provides a data processing method for model segmentation, including the following steps P1 to P5:
step P1, dividing a target model, and determining an algorithm corresponding to the sequential division and a logic unit obtained by the division; wherein the object model includes a plurality of logic units.
Specifically, the logical structure of the target model may be determined first, where the target model may be a formula that implements a specific algorithm, and the logical structure corresponding to the formula may be: the logic units included in the formula, determined according to the operation order or the algorithms. For example, consider the following formula (1):
M = 0.8 × [x₁/3 + (x₂ + x₃)/(x₄ + x₅)] + 0.2 × (x₆² + x₇x₈ - x₉ + x₁₀³)   (1)
In the logic structure corresponding to formula (1), the highest-level operation is the addition, so the split yields logic unit I corresponding to 0.8 × [x₁/3 + (x₂ + x₃)/(x₄ + x₅)] and logic unit II corresponding to 0.2 × (x₆² + x₇x₈ - x₉ + x₁₀³), and the algorithm between the two logic units is addition.
Specifically, the algorithm on which each division is based and the logic units obtained by the division are determined, so the algorithms underlying the relations between the logic units obtained by the division can also be obtained.
Further, each logic unit obtained by a division is necessarily a sub-logic unit of a logic unit obtained by the previous division. For example, when formula (1) is divided for the first time, logic unit I corresponding to 0.8 × [x₁/3 + (x₂ + x₃)/(x₄ + x₅)] and logic unit II corresponding to 0.2 × (x₆² + x₇x₈ - x₉ + x₁₀³) are both sub-logic units of the original logic unit (i.e. formula (1)), and the algorithm underlying the relation between logic unit I and logic unit II is "addition"; when logic unit I is further divided, the resulting logic unit III corresponding to x₁/3 is a sub-logic unit of logic unit I.
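A minimal sketch of this kind of splitting is shown below, assuming the target model is given as a Python expression; it splits a binary expression tree node by node and records the algorithm on which each split is based. This is only an approximation for illustration: the layer-by-layer partitioning in the application example later in this description differs in detail.

    import ast

    OP_SYMBOL = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/", ast.Pow: "**"}

    def split(node, path="m", algorithms=()):
        """Recursively split an expression tree; each split records the operator
        (algorithm) it is based on, and leaves become minimum logic units."""
        if isinstance(node, ast.BinOp):
            op = OP_SYMBOL[type(node.op)]
            left = split(node.left, path + "1", algorithms + (op,))
            right = split(node.right, path + "2", algorithms + (op,))
            return left + right
        # Leaf: a single variable or constant, i.e. a minimum logic unit.
        return [(path, ast.unparse(node), list(algorithms))]

    # Formula (1), written as a plain expression over the weights and variables.
    formula = "0.8*(x1/3 + (x2 + x3)/(x4 + x5)) + 0.2*(x6**2 + x7*x8 - x9 + x10**3)"
    tree = ast.parse(formula, mode="eval").body

    for unit_name, content, algorithm_sequence in split(tree):
        print(unit_name, content, algorithm_sequence)

Each printed tuple gives a minimum logic unit, its content, and the sequence of algorithms under which it was obtained, which is the information steps P2 to P4 below work with.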
And step P2, obtaining an algorithm sequence corresponding to the minimum logic unit according to all algorithms corresponding to the minimum logic unit obtained by segmentation.
That is, the target model is sequentially divided according to the logical structure until the minimum logic units are obtained, all algorithms related to each minimum logic unit are recorded, and the corresponding algorithm sequence is obtained. Typically, each algorithm sequence includes at least one algorithm.
And step P3, determining logic relation information among each minimum logic unit according to the target model.
Specifically, taking the case where the target model is a formula as an example, the logic relationship information between the minimum logic units may include: information on the operation relations that exist between different minimum logic units, and information on the intermediate variables obtained from the different minimum logic units; and each minimum logic unit can inherit the logical relationship between the intermediate variable it forms and the other intermediate variables.
And step P4, obtaining a sub-model after segmentation according to the algorithm sequence, the logic relation information and the minimum logic unit.
Specifically, each sub-model after segmentation includes an algorithm sequence corresponding to the minimum logic unit and logic relationship information.
It can then be determined, according to the algorithm sequences and the logic relationship information, how to "assemble" the minimum logic units so as to restore the target model.
Step P5, determining an operation flow according to the segmented sub-models, and operation data corresponding to each operation step in the operation flow, and sending the operation data to a distribution platform, wherein the operation data comprises: an algorithm corresponding to the operation step, a first node identifier of a first operation node for executing the operation step, and a second node identifier of a second operation node for receiving the operation result of the operation step.
Specifically, since the segmented sub-models are obtained from the algorithm sequences, the logic relationship information and the minimum logic units produced by splitting the target model according to its logic structure, the target model can be restored from the logic relationship information and the minimum logic units in the segmented sub-models, and the operation flow can thereby be obtained (for example, taking formula (1) as an example: the first step calculates (x₂ + x₃) and (x₄ + x₅); the second step calculates (x₂ + x₃)/(x₄ + x₅) and x₁/3; the third step calculates x₁/3 + (x₂ + x₃)/(x₄ + x₅); the fourth step calculates 0.8 × [x₁/3 + (x₂ + x₃)/(x₄ + x₅)]; and finally the final result is calculated). In addition, after the operation flow is obtained, the operation data corresponding to each step in the operation flow can be obtained by analysis in combination with the algorithm sequence corresponding to each minimum logic unit.
Alternatively, the operation data may be operation data corresponding to each operation step after each operation step is determined, or may be operation data corresponding to all operation steps after operation nodes for executing each operation step in the operation flow are planned in advance.
The first operation node may be the operation node corresponding to each operation step; the second operation node may be the operation node for receiving the operation result obtained in each operation step. Therefore, after the second operation node receives the operation result, it can in turn act as a first operation node and receive operation data that determines the algorithm to be executed on the received operation result and the transmission target of the new operation result (i.e. another second operation node).
According to the above method, a target model can be divided into a plurality of minimum logic units to obtain the corresponding segmented sub-models, the corresponding operation flow can be restored from the segmented sub-models, and the operation flow is sent to the distribution platform, so that each operation step in the operation flow can be calculated by a different operation node. In the end, no single operating party can obtain the complete algorithm of the target model, which guarantees that the algorithm cannot leak.
As shown in fig. 2, in some embodiments, as in the foregoing method, the step P2 obtains the algorithm sequence corresponding to the minimum logical unit according to all algorithms corresponding to the minimum logical unit obtained by segmentation, including the following steps P21 and P22:
and step P21, determining an algorithm which is sequentially based on the minimum logic units obtained by dividing the target model.
And step P22, arranging the algorithm corresponding to each minimum logic unit according to the segmentation order to obtain an algorithm sequence corresponding to each minimum logic unit.
That is, when the target model is divided, the correspondence between the order of division and the algorithm when the minimum logical unit is obtained is recorded.
The algorithm sequence is obtained by arranging the respective algorithms in the order of division.
For example, when the algorithms on which a minimum logic unit z is successively obtained by division are "+", "+" and "×", the corresponding algorithm sequence may be: + → + → ×.
As shown in fig. 3, in some embodiments, the step P4 obtains the segmented submodel according to the algorithm sequence, the logic relationship information and the minimum logic unit according to the method described above, and includes the following steps P41 to P44:
step P41, determining the hierarchical information of each minimum logic unit according to the number of algorithms in the algorithm sequence of each minimum logic unit.
That is, the hierarchical information of each minimum logical unit is consistent with the number of algorithms in the algorithm sequence corresponding to the hierarchical information.
For example, when among all the minimum logic units the largest number of algorithms in an algorithm sequence is 5, the level information of a minimum logic unit whose algorithm sequence includes 5 algorithms may be level one, the level information of a minimum logic unit whose algorithm sequence includes 4 algorithms may be level two, and so on.
Specifically, since the number of algorithms matches the number of divisions, the minimum logic units can also be divided into levels according to their number of divisions.
Step P42, determining the calculation sequence number of each minimum logic unit layer by layer according to the logic relation information and the hierarchy information.
Specifically, the calculation sequence number of each minimum logic unit is determined layer by layer according to the level information, so the calculation order in subsequent computation can be determined from the number of algorithms of each minimum logic unit; meanwhile, when the calculation sequence numbers are determined using the logic relationship information, several minimum logic units that have a logical relationship can be ordered adjacently, so that the minimum logic units can be distributed according to their calculation sequence numbers when the model is restored later.
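The exact numbering rule is not spelled out here; one plausible sketch, assuming the level of a unit equals the length of its algorithm sequence and that deeper units are numbered before the units that contain them, is:

    def assign_calc_numbers(units):
        """units: dict mapping a unit name (e.g. 'm1212') to its algorithm sequence.
        Deeper units (longer algorithm sequences) are numbered first, and units
        sharing a name prefix stay adjacent, so related units are ordered together."""
        ordered = sorted(units, key=lambda name: (-len(units[name]), name))
        return {name: i + 1 for i, name in enumerate(ordered)}

    # Illustrative placeholder sequences, not the actual values of Table 1 below.
    example = {
        "m11": ["+", "+"],
        "m1211": ["+", "+", "/", "+"],
        "m1212": ["+", "+", "/", "+"],
        "m23": ["+", "+"],
    }
    print(assign_calc_numbers(example))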
Step P43, obtaining a base code corresponding to the algorithm sequence according to the base corresponding to each algorithm; wherein the base comprises at least one character.
That is, each algorithm is represented by a base, and the base code corresponding to the algorithm sequence is thereby obtained.
Optionally, when four bases "A, T, C, G" are provided, they may respectively correspond to the four algorithms "add, subtract, multiply, divide", and thus there are 4 × 4 = 16 pairing methods of bases and algorithms. Furthermore, the correspondence "A: add, T: subtract, C: multiply, G: divide" can be determined. In actual processing, the decryption difficulty can be increased by a multi-layer nesting method, for example "AT: add, TA: subtract, CG: multiply, GC: divide"; with each additional layer of nesting the decryption difficulty grows in geometric progression, so the algorithm is further protected from being decrypted.
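As a small illustrative sketch of the pairing just described (the single-layer mapping and one optional nesting layer below are taken directly from the correspondence above; the function name is an assumption):

    BASE_OF = {"+": "A", "-": "T", "*": "C", "/": "G"}             # A: add, T: subtract, C: multiply, G: divide
    NESTED_BASE_OF = {"+": "AT", "-": "TA", "*": "CG", "/": "GC"}  # one layer of nesting

    def encode(algorithm_sequence, nested=False):
        """Encode an algorithm sequence (a list of operators) as a base code string."""
        table = NESTED_BASE_OF if nested else BASE_OF
        return "".join(table[op] for op in algorithm_sequence)

    print(encode(["+", "+", "*"]))                # -> "AAC"
    print(encode(["+", "+", "*"], nested=True))   # -> "ATATCG"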
And step P44, obtaining a sub-model after segmentation according to the calculation serial number, the base code and the minimum logic unit of each minimum logic unit.
That is, each of the sub-models after segmentation includes a calculation number and a base code corresponding to each of the minimum logical units.
In some embodiments, as in the previous method, step P43 obtains a base code corresponding to the algorithm sequence from the base corresponding to each algorithm, including steps P431 to P435 as follows:
step P431, obtaining a first base code corresponding to each algorithm sequence according to the base corresponding to each algorithm.
Specifically, the first base code is a base code corresponding to each algorithm sequence.
Step P432. Determining the longest base code with the largest number of bases in all the first base codes.
Specifically, the longest base code is the base code with the largest number of bases among all the first base codes, that is: corresponding to the most algorithmic sequence of algorithms.
Step P433, determining the base compensation number of the second base code according to the maximum base number of the longest base code; wherein the second base code is a first base code having fewer bases than the longest base code.
Specifically, the maximum number of bases is the number of bases of the longest base code; thus, the base compensation number can be obtained by determining the difference between the number of bases in each second base code and the maximum number of bases.
Step P434, supplementing empty logical bases at the rear end of the final algorithm of the second base code according to the base compensation quantity, so as to compensate the base quantity of the second base code to the maximum base quantity, and obtaining a compensated second base code; the final algorithm is the last algorithm in the algorithm sequence; an empty logical base is a base that does not contain an algorithm.
Specifically, the final algorithm is the algorithm on which the last division was based when the minimum logic unit corresponding to the second base code was obtained by division. Optionally, when the maximum number of bases is 4, there is a second base code C-A-C, and the leftmost base corresponds to the final algorithm, it is necessary to add an empty logical base (e.g. U) at the leftmost position and obtain U-C-A-C as the compensated second base code, on the basis of step P53 in the foregoing embodiment.
And step P435, obtaining the base codes corresponding to the algorithm sequences according to the longest base code and the compensated second base code.
Specifically, since the number of bases of the longest base code is the largest, compensation is not necessary; after the compensated second base code is obtained, the base code corresponding to each algorithm sequence can be obtained according to the longest base code and the compensated second base code.
By the method in this embodiment, the base codes can be unified in length, so that when the model is restored later, whether the variable information corresponding to a minimum logic unit needs to be operated on, and whether it needs to be distributed, can be judged directly by identifying each base in the base code, which makes identification and judgment by the system more convenient.
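A minimal sketch of steps P431 to P435, assuming base codes are plain strings and that, as in the C-A-C example above, the final algorithm sits at the leftmost position so the empty logical bases "U" are prepended there:

    def unify_base_codes(first_base_codes):
        """Pad every first base code to the length of the longest one with empty
        logical bases 'U', so all codes reach the maximum number of bases."""
        max_len = max(len(code) for code in first_base_codes)        # longest base code
        padded = []
        for code in first_base_codes:
            compensation = max_len - len(code)                       # base compensation number
            padded.append("U" * compensation + code)                 # pad next to the final algorithm
        return padded

    print(unify_base_codes(["CAGA", "CAC"]))  # -> ['CAGA', 'UCAC'], i.e. C-A-C becomes U-C-A-C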
In some embodiments, as in the foregoing method, the step P5 determines an operation flow according to the segmented submodel, and the operation data corresponding to each operation step in the operation flow includes the following steps P51 to P56:
and step P51, analyzing the logic relation information to determine each logic unit associated with each other.
Step P52, obtaining an operation flow according to each logic unit and logic relation information which are mutually related;
step P53, determining the corresponding algorithm when each logic unit is associated with each other according to the algorithm sequence;
Step P54, according to the corresponding algorithm when each logic unit is associated with each other, the corresponding algorithm of each operation step in the operation flow is obtained.
Specifically, steps P51 to P54 are essentially steps P1 to P4 performed in reverse: the splitting of the target model in steps P1 to P4 is reversed to reproduce the operation flow, so that the operation flow and the algorithm corresponding to each operation step in the operation flow can be restored by the method in steps P51 to P54.
Step P55, randomly selecting to obtain operation nodes corresponding to each operation step, and determining node identifiers of each operation node;
and step P56, obtaining the operation data corresponding to the operation steps according to the node identifiers of the operation nodes corresponding to the operation steps and the algorithm.
That is, stored in the operation data is the node identification of the operation node.
Obtaining each operation node through random selection avoids the situation in which operation nodes cooperate with each other to restore the target model.
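A trivial sketch of the random selection, assuming a pool of available node identifiers (the names below are illustrative only):

    import random

    def assign_nodes(operation_steps, available_node_ids):
        """Randomly pick a distinct operation node for each operation step."""
        chosen = random.sample(available_node_ids, k=len(operation_steps))
        return dict(zip(operation_steps, chosen))

    steps = ["step1", "step2", "step3", "step4", "step5"]
    pool = [f"node-{i}" for i in range(20)]
    print(assign_nodes(steps, pool))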
Application example:
(I) Partitioning according to a logical structure
(1) First layer logic partitioning
Since the risk pricing model of digital assets generally belongs to the statistical models, the end result is obtained by multiplying the different data dimensions by their weights and summing them. Thus, the first layer of logical partitioning results in a sequence of weights and a corresponding sequence of data dimensions. For convenience of description, formula (1) is taken as the example model:
M = 0.8 × [x₁/3 + (x₂ + x₃)/(x₄ + x₅)] + 0.2 × (x₆² + x₇x₈ - x₉ + x₁₀³)   (1)
Optionally, when the two logic units corresponding to the different weights are each taken as separate individuals, the first-layer logic division divides 0.8 × [x₁/3 + (x₂ + x₃)/(x₄ + x₅)] and 0.2 × (x₆² + x₇x₈ - x₉ + x₁₀³) respectively, and the weight sequences corresponding to the two logic units are 0.8 and 0.2. The logic units obtained after this division are [x₁/3 + (x₂ + x₃)/(x₄ + x₅)] and (x₆² + x₇x₈ - x₉ + x₁₀³).
The logic cells obtained by dividing the layer can be expressed as a formula (2) and a formula (3).
m₁ = x₁/3 + (x₂ + x₃)/(x₄ + x₅)   (2)
m₂ = x₆² + x₇x₈ - x₉ + x₁₀³   (3)
In formula (2) and formula (3), the subscripts 1 and 2 of m₁ and m₂ represent the first logic unit and the second logic unit formed by this layer of logical division; the same applies hereinafter.
The digits in the subscript of m represent the hierarchical information of the logical partitioning. For example, the subscript of m₁₂₁₂ has four digits, which indicates that the logic unit m₁₂₁₂ is obtained after four divisions, so its level information is 4. The segmentation path of logic unit m₁₂₁₂ is as follows: the target model m is divided for the first time to obtain its first logic unit m₁; the first logic unit m₁ is divided a second time to obtain its second logic unit m₁₂; m₁₂ is divided a third time to obtain its first logic unit m₁₂₁; finally, m₁₂₁ is divided a fourth time, taking its second sub logic unit m₁₂₁₂. A schematic diagram of the specific segmentation is shown in Fig. 4.
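The subscript convention can be decoded mechanically; as a small illustrative sketch (the function and the plain-string unit names such as "m1212" are assumptions for illustration, standing in for m₁₂₁₂):

    def describe_unit(name):
        """Return the level of a logic unit and the division path encoded in its
        subscript, e.g. 'm1212' -> level 4, path m -> m1 -> m12 -> m121 -> m1212."""
        digits = name[1:]                      # the subscript digits after 'm'
        level = len(digits)                    # level information = number of divisions
        path = ["m" + digits[:i] for i in range(level + 1)]
        return level, path

    level, path = describe_unit("m1212")
    print(level)               # 4
    print(" -> ".join(path))   # m -> m1 -> m12 -> m121 -> m1212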
(2) Second layer segmentation
The second layer division result is obtained as formula (4) to formula (9) according to the same procedure as the first layer division.
m₁₁ = x₁/3   (4)
m₁₂ = (x₂ + x₃)/(x₄ + x₅)   (5)
m₂₁ = x₆² = x₆ × x₆   (6)
m₂₂ = x₇ × x₈   (7)
m₂₃ = -x₉   (8)
m₂₄ = x₁₀³ = x₁₀ × x₁₀ × x₁₀   (9)
It should be noted that formula (4) contains only one independent variable, and that variable is not involved in computation with itself (a power operation) or with other independent variables, so x₁ is not included in the third division; the same applies to x₉.
(3) Third layer segmentation
The third layer of logic segmentation results in the formula:
m₁₂₁ = x₂ + x₃   (10)
m₁₂₂ = x₄ + x₅   (11)
m₂₁₁ = x₆   (12)
m₂₁₂ = x₆   (13)
m₂₂₁ = x₇   (14)
m₂₂₂ = x₈   (15)
m₂₄₁ = x₁₀   (16)
m₂₄₂ = x₁₀   (17)
m₂₄₃ = x₁₀   (18)
(4) Fourth layer segmentation
The fourth layer of logic division results in the formula:
m₁₂₁₁ = x₂   (19)
m₁₂₁₂ = x₃   (20)
m₁₂₂₁ = x₄   (21)
m₁₂₂₂ = x₅   (22)
the basis for the end of the logic partitioning is that each minimum logical unit finally only comprises one independent variable, and if a plurality of identical independent variables appear in the result, the operation of the identical variable is performed for a plurality of times.
The final formed arithmetic logic framework is shown in the model logic partition framework diagram in fig. 5:
as can be seen from fig. 5, the model in the example of the present invention is divided into four layers to obtain 13 basic logic units, from bottom to top: m is m 1211 、m 1212 、m 1221 、m 1222 、m 211 、m 212 、m 221 、m 222 、m 241 、m 242 、m 243 、m 11 、m 23 . It is determined whether it is the final logical unit, the criterion being that no segmentation is performed below it, and only one argument is involved in the operation in this unit.
(II) base codon usage for model logic
(1) Randomly determining base pairing rules
Since there are only four bases "A, T, C, G", they can correspond to the four algorithms "add, subtract, multiply, divide", and thus there are 4 × 4 = 16 pairing methods of bases and algorithms. For simplicity, the correspondence "A: add, T: subtract, C: multiply, G: divide" is determined here; in actual processing, the decryption difficulty can be increased by a multi-layer nesting method, such as "AT: add, TA: subtract, CG: multiply, GC: divide", and with each nesting the decryption difficulty grows in geometric progression. Some logic units do not involve all the division levels; if, for example, m₂₁₂ is divided only down to the third layer, the algorithm base corresponding to the fourth layer is replaced by "U", which identifies a logical default.
(2) Base pairing of logical units
According to the hierarchy of the logical partitioning, the operation sequences are paired according to the base pairing rules (the pairing process covers all four layers of operation logic, not just one layer), as shown in Table 1:
Table 1: Base pairing table corresponding to the logical partitioning
In Table 1, the first column is the logic unit and the second column is the calculation sequence number; all the unit logics of the whole formula can be assembled later according to the calculation sequence numbers. All elements corresponding to the first-layer segmentation need to be multiplied by the weight value, so their base is "C".
(III) obtaining the logical Unit base code
The calculation numbers, base codes, and weights of the logical units are assembled together to form a logical unit base code list, as shown in table 2:
table 2: base sequence table of logic unit
As shown in fig. 6, according to another embodiment of the present application, there is further provided a distributed data processing method, including steps S1 to S5 as follows:
s1, receiving information to be processed for calculation, wherein the information to be processed comprises at least two variable information.
Specifically, the method of the embodiment can be applied to a distribution platform for distributing various variable information; the information to be processed is data for calculation according to a preset algorithm, that is, the information to be processed is calculated through the algorithm to obtain a corresponding calculation result; wherein, each variable information may include: variable type and variable value; wherein the variable type may be, for example, x 1 、x 2 Isovaries; but may also be such as: the present gold, interest, etc. are information for characterizing the specific meaning of the variables.
Step S2, acquiring operation data corresponding to each operation step in an operation flow from an algorithm processing server, wherein the operation data comprises: the method comprises an algorithm corresponding to the operation step, a first node identifier of a first operation node for executing the operation step, and a second node identifier of a second operation node for receiving the operation result of the operation step.
The operation nodes are used for providing operation capability corresponding to operation steps, and each operation step corresponds to different operation nodes.
Specifically, each operation step has an operation node corresponding to it. Optionally, the operation flow may be determined as follows: when the variable information needs to be operated on, operation nodes are selected in real time and an operation flow comprising the operation steps is determined; the operation flow may involve the variable information itself or intermediate variable information containing the variable information (for example, when the variable corresponding to the variable information is x₁, the intermediate variable information may be information containing an intermediate variable obtained from x₁, such as (x₁ + x₂) or x₃(x₁ + x₂)).
In general, in order to break up centralized processing and increase the difficulty of decryption, each operation node generally processes only a limited number of operations (e.g. at most one), so as to avoid one node processing too many operation steps and thereby reduce the probability that the algorithm can be derived.
The algorithm may be, for example, a rule for operations such as addition, subtraction, multiplication, division or cubing, together with the operational relationship between the variables at the time of operation (for example, when variables x₁ and x₂ need to be divided, the algorithm determines that a division operation is to be performed between x₁ and x₂, and also determines the positions of x₁ and x₂ with respect to the divisor, i.e. determines which of x₁ and x₂ is the divisor and which is the dividend).
The first operation node is an operation node corresponding to the operation step and used for processing variable information; the second operation node is used for receiving the operation result and calculating according to the operation result; each operation node corresponds to a node identifier, and the node identifier can be address information or unique identification information of the operation node, so that the operation node can be positioned to a target operation node to which the operation information needs to be sent through the node identifier.
In some cases, multiple pieces of variable information may be involved in performing an operation, so the same operation node may receive multiple pieces of variable information in this case; in addition, when the algorithm corresponding to the variable information is a unary operation such as negation or taking a reciprocal, the same operation node may receive only one piece of variable information.
And S3, generating operation control information corresponding to each first operation node according to the operation data and the variable information.
Specifically, when the first operation node receives variable information on which a calculation is to be performed and is associated with a second operation node to which the operation result needs to be sent, the operation control information includes: the algorithm corresponding to the operation step, the first node identifier of the first operation node executing the operation step, the second node identifier of the second operation node used for receiving the operation result of the operation step, and the variable information.
When the variable information for calculation received in the first operation node is an intermediate variable, only the operation algorithm, the first node identification and the second node identification need to be received.
When the first operation node is the node corresponding to the last operation step in the operation flow, only the algorithm and the first node identifier are received.
S4, transmitting the operation control information to a first operation node according to the first node identification; so that the first operation node calculates to obtain an operation result according to the operation control information, and sends the operation result to the second operation node according to the second node identification.
S5, obtaining a final operation node and calculating to obtain a final result; the final operation node is the last operation node in the operation flow.
Specifically, the first operation node performs an operation on the variable information according to an algorithm in the operation information to obtain an operation result.
In addition, the operation information further includes a node identifier of the second operation node, so that after the first operation node obtains the operation result, the operation result can be sent to the corresponding second operation node according to the node identifier, so that the target operation node can continue to perform the next operation step according to the operation result.
By recursion, the operation result obtained by the second operation node is again sent to the next operation node, until the final operation node calculates the final result according to the operation result received from the preceding operation nodes and the received algorithm. After the second operation node obtains its operation result, it can obtain, from the information issued by the control end, the node identifier of the further operation node to which it needs to forward the operation result. Furthermore, the distribution platform implementing the method of this embodiment only performs the distribution action and cannot obtain the information to be processed or the full view of the algorithm model.
Further, in this embodiment, the algorithm, the second node identifier and the variable information may be sent to the first operation node at the same time; alternatively, the algorithm and the variable information may be sent first, and the second node identifier sent after the current operation node feeds back the operation result. Both methods enable the first operation node to send the operation result to the second operation node that executes the next operation step.
By splitting the operation step into a plurality of sub-steps and operating by a plurality of operation nodes respectively, each operation node can only acquire partial algorithm and single variable information, but cannot acquire the whole view of the algorithm and the information to be processed, so that the algorithm can be effectively prevented from being cracked, and the information to be processed is prevented from being leaked.
As shown in fig. 7, in some embodiments, based on the foregoing method, the step S3 of generating the operation control information corresponding to each first operation node according to the operation data and the variable information includes the following steps S31 to S33:
S31, inquiring to obtain an algorithm sequence corresponding to the variable information, and generating an operation unit according to the variable information and the corresponding algorithm sequence.
Specifically, the variable information may include type information, and each algorithm sequence may also correspond to one piece of type information; further, the above query may be performed by character matching on the type information. After the algorithm sequence is obtained by the query, the variable value of the variable information may be written into the algorithm sequence by adding a field, thereby generating the operation unit.
The method for obtaining the algorithm sequence may refer to the content in the foregoing embodiment, and will not be described herein.
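A minimal sketch of step S31, assuming a hypothetical lookup table ALGORITHM_SEQUENCES keyed by the variable's type information; the type names and sequences are made up for the example:

    ALGORITHM_SEQUENCES = {            # assumed lookup table keyed by the variable's type information
        "temperature": ["neg", "div"],
        "pressure": ["add", "mul"],
    }

    def build_operation_unit(variable):
        """S31: match the type information to an algorithm sequence and attach the variable value."""
        sequence = ALGORITHM_SEQUENCES[variable["type"]]   # query by character matching on the type
        return {"algorithm_sequence": sequence, "value": variable["value"]}

    unit = build_operation_unit({"type": "temperature", "value": 36.2})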
S32, determining the operation data corresponding to each operation unit.
Specifically, since the operation data is information sent from the algorithm processing server and is determined according to the segmented sub-model, and the segmented sub-model includes an algorithm sequence, the corresponding operation unit can be obtained by matching the algorithm sequences, and the operation data corresponding to each operation unit can thus be determined.
S33, generating operation control information corresponding to each first operation node according to the operation data and the operation unit.
The operation unit includes the variable information, so the operation control information corresponding to each first operation node can be obtained according to the operation data and the operation unit.
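Steps S32 and S33 can be sketched as follows; the dictionary keys and helper names are assumptions made for the example rather than the embodiment's data layout:

    def match_operation_data(unit, operation_data_list):
        """S32: find the operation data whose algorithm sequence matches the operation unit's."""
        return next(d for d in operation_data_list
                    if d["algorithm_sequence"] == unit["algorithm_sequence"])

    def build_control_info(unit, operation_data):
        """S33: merge the matched operation data with the variable value carried by the unit."""
        return {
            "algorithm": operation_data["algorithm"],
            "first_node_id": operation_data["first_node_id"],
            "second_node_id": operation_data.get("second_node_id"),
            "variable_value": unit["value"],
        }

    unit = {"algorithm_sequence": ["neg", "div"], "value": 36.2}
    data = [{"algorithm_sequence": ["neg", "div"], "algorithm": "neg",
             "first_node_id": "node-01", "second_node_id": "node-02"}]
    info = build_control_info(unit, match_operation_data(unit, data))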
Further, the second operation node may receive a plurality of operation results, and for some operations (e.g. division) exchanging the two operands (the divisor and the dividend) yields a completely different result. The operation data may therefore also carry information specifying the order of the operation results corresponding to the first operation units. For example, one optional expression of this information is: x1, x2, ÷, so that the second operation node knows it should calculate x1 ÷ x2; when the second operation node instead receives x2, x1, ÷, the operation it should perform is x2 ÷ x1.
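A small illustration of how such an order tag removes the ambiguity; the field name "order" and the values are assumptions for the example:

    def apply_ordered_division(results):
        """Sort the received results by the order tag, then divide the first by the second."""
        ordered = sorted(results, key=lambda r: r["order"])
        return ordered[0]["value"] / ordered[1]["value"]

    # x1 / x2 regardless of the order in which the two results arrive at the node:
    quotient = apply_ordered_division([{"order": 2, "value": 4.0}, {"order": 1, "value": 8.0}])
    assert quotient == 2.0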
In some embodiments, based on the foregoing method, the step S4 of transmitting the operation control information to the first operation node according to the first node identifier includes the following steps S41 and S42:
S41, encrypting the operation control information according to a preset encryption strategy to obtain encrypted information;
S42, transmitting the encrypted information to the first operation node so that the first operation node decrypts the encrypted information according to a decryption strategy corresponding to the encryption strategy to obtain the operation control information.
Specifically, the encryption policy and the decryption policy in this embodiment are set correspondingly to each other, and optionally, a corresponding public key and private key may be used.
Therefore, the algorithm, the node identifier and the variable information can be encrypted with the private key, and the public key is distributed in advance to each candidate operation node that can be used for calculation. After a candidate operation node is selected, it can decrypt the encrypted information with the public key to obtain the algorithm, the node identifier and the variable information.
With the method of this embodiment, encryption prevents unauthorized parties from intercepting the distributed information and thereby leaking the data and the algorithm, which would affect security.
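For illustration, the sketch below uses a symmetric Fernet key from the Python cryptography package as a stand-in for the encryption/decryption policy pair; the embodiment above contemplates a private/public key pair instead, and the field values are made up:

    import json
    from cryptography.fernet import Fernet

    key = Fernet.generate_key()          # shared policy; the embodiment uses a key pair instead
    policy = Fernet(key)

    control_info = {"algorithm": "div", "first_node_id": "node-01",
                    "second_node_id": "node-02", "variable_value": 8.0}

    encrypted = policy.encrypt(json.dumps(control_info).encode())   # S41, on the distribution platform
    decrypted = json.loads(policy.decrypt(encrypted).decode())      # S42, on the first operation node
    assert decrypted == control_info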
Application example:
Assembly of operation units and distributed operation are realized by the method in the foregoing embodiments:
the base code and the data unit of the logic unit are combined into an operator unit, and the operator unit and the logic unit can form an operation unit.
Based on the previous application example, as known from the model segmentation process, each logic unit corresponds to a different argument x, and the segmented data units are combined with the logic units to form an operation unit as shown in table 3:
Table 3: Logical unit, base code, and argument correspondence table
The leftmost column of each operation unit is a calculation sequence number, which can be used as the ordering number of the operation unit.
The calculation sequence number can be used, when instructing the operation node to calculate, to fix the order of two or more pieces of variable information. For example, when calculating the quotient of m121 and m122, the sequence numbers 1, 2, 3 and 4 corresponding to m1211, m1212, m1221 and m1222 determine that the division should be m121/m122 rather than m122/m121.
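A sketch of how the calculation sequence numbers fix the operand order; the values, and the way m121 and m122 are combined from their parts, are assumptions made for the example:

    units = [
        {"seq": 3, "name": "m1221", "value": 5.0},
        {"seq": 1, "name": "m1211", "value": 6.0},
        {"seq": 4, "name": "m1222", "value": 7.0},
        {"seq": 2, "name": "m1212", "value": 4.0},
    ]
    units.sort(key=lambda u: u["seq"])               # 1, 2, 3, 4 -> m1211, m1212, m1221, m1222
    m121 = units[0]["value"] + units[1]["value"]     # assumed combination of the first pair
    m122 = units[2]["value"] + units[3]["value"]     # assumed combination of the second pair
    quotient = m121 / m122                           # always m121 / m122, never m122 / m121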
Distributed computing process: random selection of the distributed computing environment and computation in the reverse order of logic cutting
An operation node corresponding to each operation unit is randomly selected in the blockchain, and the operation node provides computing power to run the operation unit. The specific process is as follows:
(1) First distribution and calculation of the operation units
The operation units are distributed according to the order of logic division and the logic-cutting process of the model. If the operation units are transmitted according to the number of divisions, the operation units distributed first may be units No. 1, 2, 3 and 4; if they are only to be stored, they may also be other operation units. In particular, operation units 1, 2, 3 and 4 can be used for storage, without algorithms, in the first distribution. Random distribution scrambles the storage state after cutting, the decentralized process increases the difficulty of cracking, and each operation node is selected at random, which reduces the possibility of collusion.
This round of distribution and operation involves x2, x3, x4, x5, x1 and x9. Among them, x2, x3, x4 and x5 are distributed alone, while, for example, when x1 and x9 are distributed, the related algorithms can be attached, so that the operation nodes calculate the corresponding operation results: x1/3 and -x9.
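The random selection of operation nodes for the distributed units can be sketched as follows (illustrative only; the candidate node identifiers are made up, and a real deployment would draw from the candidate nodes registered in the blockchain):

    import random

    def select_nodes(candidate_node_ids, operation_unit_numbers):
        """Randomly pick one distinct candidate node per operation unit."""
        chosen = random.sample(candidate_node_ids, k=len(operation_unit_numbers))
        return {unit_no: node for unit_no, node in zip(operation_unit_numbers, chosen)}

    assignment = select_nodes(["node-%02d" % i for i in range(10)], [1, 2, 3, 4])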
(2) Multiple distribution and distribution operation of operation unit
According to the generation process of the operation unit, the following results are obtained in sequence:
a. second time distribution and operation
The following are obtained: x2+x3 and x4+x5.
b. Third time distribution and operation
The following are obtained: (x2+x3)/(x4+x5), x6×x6 (equivalent to x6²), x7×x8, and x10×x10×x10 (equivalent to x10³).
c. Fourth time distribution and operation
Two results are obtained: [x1/3+(x2+x3)/(x4+x5)] and [x6²+x7×x8+x10×x10×x10].
d. Calculation and assembly of final results
The distributed operation results are combined at a randomly selected distributed computing node: [x1/3+(x2+x3)/(x4+x5)] and [x6²+x7×x8+x10×x10×x10] are combined, the weight coefficients of algorithm A (which are already identified in the operation units) are substituted back, and formula (1) is restored:
0.8×[x1/3+(x2+x3)/(x4+x5)]+0.2×[x6²+x7×x8+x10³]    (1)
Formula (1), i.e. the target model, is thus completely restored.
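The four rounds above can be traced with the following sketch, using dummy values for x1 to x10; it only illustrates the data flow of the staged evaluation, not the distribution itself:

    x = {i: float(i) for i in range(1, 11)}        # dummy values for x1..x10

    r1a = x[1] / 3                                 # first round
    r1b = -x[9]                                    # first round (not used in formula (1))
    r2a, r2b = x[2] + x[3], x[4] + x[5]            # second round
    r3a = r2a / r2b                                # third round
    r3b, r3c, r3d = x[6] * x[6], x[7] * x[8], x[10] ** 3
    r4a = r1a + r3a                                # fourth round: the two bracketed sums
    r4b = r3b + r3c + r3d
    final = 0.8 * r4a + 0.2 * r4b                  # weight coefficients restored, formula (1)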
As shown in fig. 8, according to another aspect of the present application, there is also provided a distributed data processing system, including: a distribution platform 3, a data providing end 2 and an algorithm processing server 1;
the data providing end 2 sends the information to be processed for calculation, which includes at least two variable information, to the distribution platform;
the algorithm processing server 1 determines operation data corresponding to each operation step in the operation flow and sends the operation data to the distribution platform; the operation data includes: an algorithm corresponding to the operation step, a first node identifier of a first operation node 401 for executing the operation step, and a second node identifier of a second operation node 402 for receiving an operation result of the operation step;
The distribution platform 3 generates operation control information corresponding to each first operation node 401 according to the operation data and the variable information;
the distribution platform 3 sends the operation control information to the first operation node 401 according to the first node identifier;
the first operation node 401 calculates an operation result according to the operation control information, and sends the operation result to the second operation node 402 according to the second node identifier;
recursion is performed until a final result is obtained by calculation of the final operation node 403; the final operation node is the last operation node in the operation flow.
In particular, for the specific process of implementing the functions of each module in this embodiment, reference may be made to the related description in the method embodiments, which is not repeated here.
In some embodiments, as in the foregoing system, the distributing platform 3 sends the operation control information to the first operation node 401 according to the first node identifier, including:
the distribution platform 3 encrypts the operation control information according to a preset encryption strategy to obtain first encryption information;
the distribution platform 3 sends the first encrypted information to the first operation node 401;
the first operation node 401 decrypts the first encrypted information according to a decryption policy corresponding to the encryption policy, to obtain operation control information.
In particular, for the specific process of implementing the functions of each module in this embodiment, reference may be made to the related description in the method embodiments, which is not repeated here.
In some embodiments, in the foregoing system, the first operation node 401 calculates an operation result according to the operation control information, and sends the operation result to the second operation node 402 according to the second node identifier, including:
the first operation node 401 calculates an operation result according to the operation control information;
the first operation node 401 encrypts the operation result according to the encryption strategy to obtain second encryption information, and sends the second encryption information to the second operation node 402 according to the second node identifier;
the distribution platform 3 generates operation control information corresponding to the second operation node according to the operation data;
the distribution platform 3 encrypts the operation control information according to an encryption strategy to obtain third encryption information, and sends the third encryption information to the second operation node 402 according to the second node identifier;
the second operation node 402 decrypts the second encrypted information and the third encrypted information according to the decryption policy, and obtains corresponding operation data.
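The second operation node's side of this exchange can be sketched as follows, again using a symmetric Fernet key as a stand-in for the shared encryption and decryption policies; the message contents are made up for the example:

    import json
    from cryptography.fernet import Fernet

    policy = Fernet(Fernet.generate_key())   # stand-in for the shared encryption/decryption policy

    second_encrypted = policy.encrypt(json.dumps({"result": -4.0}).encode())       # from node 401
    third_encrypted = policy.encrypt(json.dumps({"algorithm": "add",
                                                 "second_node_id": "node-03"}).encode())  # from platform

    intermediate = json.loads(policy.decrypt(second_encrypted).decode())
    control = json.loads(policy.decrypt(third_encrypted).decode())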
In particular, for the specific process of implementing the functions of each module in this embodiment, reference may be made to the related description in the method embodiments, which is not repeated here.
As shown in fig. 9, according to an embodiment of another aspect of the present application, there is also provided a data processing apparatus for model segmentation, including:
the segmentation module 11 is used for segmenting the target model and determining an algorithm corresponding to the sequential segmentation and a logic unit obtained by the segmentation;
an algorithm sequence module 12, configured to obtain an algorithm sequence corresponding to a minimum logic unit according to all the algorithms corresponding to the minimum logic unit obtained by segmentation;
a determining module 13, configured to determine logic relationship information between each minimum logic unit according to the target model;
a sub-model obtaining module 14, configured to obtain a segmented sub-model according to the algorithm sequence, the logic relationship information, and the minimum logic unit;
and the sending module 15 is configured to determine an operation flow and the operation data corresponding to each operation step in the operation flow according to the segmented sub-model, and send the operation data to a distribution platform.
In particular, for the specific process of implementing the functions of each module in this embodiment, reference may be made to the related description in the method embodiments, which is not repeated here.
As shown in fig. 10, according to an embodiment of another aspect of the present application, there is also provided a distributed data processing apparatus including:
A first obtaining module 21, configured to obtain information to be processed for performing calculation, where the information to be processed includes at least two variable information;
the second obtaining module 22 is configured to obtain operation data corresponding to each operation step in the operation flow from the algorithm processing server, where the operation data includes: the method comprises an algorithm corresponding to the operation step, a first node identifier of a first operation node executing the operation step, and a second node identifier of a second operation node used for receiving the operation result of the operation step;
a generating module 23, configured to generate operation control information corresponding to each first operation node according to the operation data and the variable information;
a sending module 24, configured to send the operation control information to the first operation node according to the first node identifier; the first operation node calculates the operation result according to the operation control information, and sends the operation result to the second operation node according to the second node identifier;
a third obtaining module 25, configured to obtain a final result obtained by calculating the final operation node; the final operation node is the last operation node in the operation flow.
In particular, for the specific process of implementing the functions of each module in this embodiment, reference may be made to the related description in the method embodiments, which is not repeated here.
According to another embodiment of the present application, there is also provided an electronic device. As shown in fig. 11, the electronic device may include: a processor 1501, a communication interface 1502, a memory 1503 and a communication bus 1504, wherein the processor 1501, the communication interface 1502 and the memory 1503 communicate with each other through the communication bus 1504.
A memory 1503 for storing a computer program;
the processor 1501 is configured to execute the program stored in the memory 1503, thereby implementing the steps of the method embodiment described above.
The bus mentioned for the above electronic device may be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, etc. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one bold line is shown in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include a random access memory (Random Access Memory, RAM) or a non-volatile memory (Non-Volatile Memory, NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; it may also be a digital signal processor (Digital Signal Processing, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The embodiment of the application also provides a storage medium, which comprises a stored program, wherein the program executes the method steps of the method embodiment.
It should be noted that in this document, relational terms such as "first" and "second" are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (15)

1. A method of distributed data processing, comprising:
acquiring information to be processed for calculation, wherein the information to be processed comprises at least two variable information;
acquiring operation data corresponding to each operation step in an operation flow from an algorithm processing server, wherein the operation data comprises: the method comprises an algorithm corresponding to the operation step, a first node identifier of a first operation node executing the operation step, and a second node identifier of a second operation node used for receiving the operation result of the operation step;
generating operation control information corresponding to each first operation node according to the operation data and the variable information;
Transmitting the operation control information to the first operation node according to the first node identifier; the first operation node calculates the operation result according to the operation control information, and sends the operation result to the second operation node according to the second node identifier;
obtaining a final operation node and calculating to obtain a final result; the final operation node is the last operation node in the operation flow.
2. The method according to claim 1, characterized in that: the generating the operation control information corresponding to each first operation node according to the operation data and the variable information includes:
inquiring to obtain an algorithm sequence corresponding to the variable information, and generating an operation unit according to the variable information and the corresponding algorithm sequence;
determining the operation data corresponding to each operation unit;
and generating operation control information corresponding to each first operation node according to the operation data and the operation unit.
3. The method of claim 1, wherein the transmitting the operation control information to the first operation node according to the first node identification comprises:
Encrypting the operation control information according to a preset encryption strategy to obtain encryption information;
and sending the encrypted information to the first operation node so that the first operation node decrypts the encrypted information according to a decryption strategy corresponding to the encryption strategy to obtain the operation control information.
4. A data processing method of model segmentation, comprising:
dividing the target model, and determining an algorithm corresponding to the sequential division and a logic unit obtained by the division;
obtaining an algorithm sequence corresponding to the minimum logic unit according to all the algorithms corresponding to the minimum logic unit obtained by segmentation;
determining logic relation information among each minimum logic unit according to the target model;
obtaining a sub-model after segmentation according to the algorithm sequence, the logic relation information and the minimum logic unit;
determining an operation flow according to the segmented sub-model and operation data corresponding to each operation step in the operation flow, and sending the operation data to a distribution platform, wherein the operation data comprises: the method comprises an algorithm corresponding to the operation step, a first node identifier of a first operation node executing the operation step and a second node identifier of a second operation node used for receiving the operation result of the operation step.
5. The method of claim 4, wherein the obtaining the algorithm sequence corresponding to the minimum logical unit from all the algorithms corresponding to the minimum logical unit obtained by segmentation includes:
determining an algorithm on which the minimum logic unit is sequentially obtained by dividing the target model;
and arranging the algorithm corresponding to each minimum logic unit according to the segmentation order to obtain the algorithm sequence corresponding to each minimum logic unit.
6. The method of claim 4, wherein the obtaining the segmented sub-model from the algorithm sequence, the logical relationship information, and the minimum logical unit comprises:
determining the hierarchical information of each minimum logic unit according to the algorithm sequence of each minimum logic unit;
determining the calculation sequence number of each minimum logic unit layer by layer according to the logic relation information and the hierarchy information;
obtaining a base code corresponding to the algorithm sequence according to the base corresponding to each algorithm; wherein the base comprises at least one character;
and obtaining the sub-model after segmentation according to the calculation sequence number of each minimum logic unit, the base code and the minimum logic unit.
7. The method according to claim 6, wherein the obtaining a base code corresponding to the algorithm sequence from the base corresponding to each algorithm comprises:
obtaining a first base code corresponding to each algorithm sequence according to the base corresponding to each algorithm;
determining the longest base code with the largest number of bases in all the first base codes;
determining the base compensation number of the second base code according to the maximum base number of the longest base code; wherein the second base code is a first base code having a smaller number of bases than the longest base code;
supplementing empty logical bases at the rear end of the final algorithm of the second base code according to the base compensation number, so as to compensate the base number of the second base code to the maximum base number and obtain a compensated second base code; the final algorithm is the last algorithm in the algorithm sequence; the empty logical base is a base that does not contain an algorithm;
and obtaining base codes corresponding to the algorithm sequences according to the longest base code and the compensated second base code.
8. The method of claim 4, wherein determining an operation flow according to the segmented sub-model and operation data corresponding to each operation step in the operation flow comprises:
analyzing the logic relation information and determining each logic unit which is associated with each other;
obtaining the operation flow according to the logic units and the logic relation information which are mutually related;
determining corresponding algorithms when the logic units are associated with each other according to the algorithm sequence;
obtaining an algorithm corresponding to each operation step in the operation flow according to the algorithm corresponding to each logic unit when the logic units are related to each other;
randomly selecting operation nodes corresponding to each operation step, and determining node identification of each operation node;
and obtaining the operation data corresponding to the operation steps according to the node identifiers of the operation nodes corresponding to the operation steps and the operation rules.
9. A distributed data processing system, comprising: the system comprises a distribution platform, a data providing end and an algorithm processing server;
the data providing end sends information to be processed, which needs to be calculated, to a distribution platform, wherein the information to be processed comprises at least two variable information;
The algorithm processing server determines operation data corresponding to each operation step in the operation flow and sends the operation data to the distribution platform; the operation data includes: the method comprises an algorithm corresponding to the operation step, a first node identifier of a first operation node executing the operation step, and a second node identifier of a second operation node used for receiving the operation result of the operation step;
the distribution platform generates operation control information corresponding to each first operation node according to the operation data and the variable information;
the distribution platform sends the operation control information to the first operation node according to the first node identification;
the first operation node calculates the operation result according to the operation control information and sends the operation result to the second operation node according to the second node identifier;
recursion is performed until a final operation node calculates to obtain a final result; the final operation node is the last operation node in the operation flow.
10. The system of claim 9, wherein the distribution platform transmitting the operational control information to the first operational node based on the first node identification comprises:
The distribution platform encrypts the operation control information according to a preset encryption strategy to obtain first encrypted information;
the distribution platform sends the first encrypted information to the first operation node;
and the first operation node decrypts the first encryption information according to a decryption strategy corresponding to the encryption strategy to obtain the operation control information.
11. The system of claim 10, wherein the first operation node calculates the operation result according to the operation control information, and sends the operation result to the second operation node according to the second node identifier, including:
the first operation node calculates the operation result according to the operation control information;
the first operation node encrypts the operation result according to the encryption strategy to obtain second encryption information, and the second encryption information is sent to the second operation node according to the second node identification;
the distribution platform generates operation control information corresponding to the second operation node according to the operation data;
the distribution platform encrypts the operation control information according to the encryption strategy to obtain third encryption information, and the third encryption information is sent to the second operation node according to the second node identifier;
And the second operation node decrypts the second encryption information and the third encryption information according to the decryption strategy to obtain corresponding operation data.
12. A distributed data processing apparatus, comprising:
the first acquisition module is used for acquiring information to be processed for calculation, wherein the information to be processed comprises at least two variable information;
the second obtaining module is configured to obtain operation data corresponding to each operation step in an operation flow from the algorithm processing server, where the operation data includes: the method comprises an algorithm corresponding to the operation step, a first node identifier of a first operation node executing the operation step, and a second node identifier of a second operation node used for receiving the operation result of the operation step;
the generation module is used for generating operation control information corresponding to each first operation node according to the operation data and the variable information;
the sending module is used for sending the operation control information to the first operation node according to the first node identification; the first operation node calculates the operation result according to the operation control information, and sends the operation result to the second operation node according to the second node identifier;
The third acquisition module is used for acquiring a final result obtained by calculation of the final operation node; the final operation node is the last operation node in the operation flow.
13. A data processing apparatus for model segmentation, comprising:
the segmentation module is used for segmenting the target model and determining an algorithm corresponding to the sequential segmentation and a logic unit obtained by the segmentation;
the algorithm sequence module is used for obtaining an algorithm sequence corresponding to the minimum logic unit according to all the algorithms corresponding to the minimum logic unit obtained through segmentation;
the determining module is used for determining logic relation information among the minimum logic units according to the target model;
the sub-model acquisition module is used for acquiring a sub-model after segmentation according to the algorithm sequence, the logic relation information and the minimum logic unit;
and the sending module is used for determining an operation flow according to the segmented sub-model and operation data corresponding to each operation step in the operation flow, and sending the operation data to a distribution platform.
14. An electronic device, comprising: the device comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are communicated with each other through the communication bus;
The memory is used for storing a computer program;
the processor being adapted to carry out the method steps of any one of claims 1-8 when the computer program is executed.
15. A storage medium comprising a stored program, wherein the program when run performs the method steps of any of the preceding claims 1-8.
CN202110183631.1A 2021-02-08 2021-02-08 Distributed data processing method and system Active CN112966279B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110183631.1A CN112966279B (en) 2021-02-08 2021-02-08 Distributed data processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110183631.1A CN112966279B (en) 2021-02-08 2021-02-08 Distributed data processing method and system

Publications (2)

Publication Number Publication Date
CN112966279A CN112966279A (en) 2021-06-15
CN112966279B true CN112966279B (en) 2023-11-03

Family

ID=76284809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110183631.1A Active CN112966279B (en) 2021-02-08 2021-02-08 Distributed data processing method and system

Country Status (1)

Country Link
CN (1) CN112966279B (en)


Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102012903A (en) * 2009-09-04 2011-04-13 斯必克有限公司 Method and equipment for organizing hierarchical data in relational database
CN109426574A (en) * 2017-08-31 2019-03-05 华为技术有限公司 Distributed computing system, data transmission method and device in distributed computing system
WO2019042312A1 (en) * 2017-08-31 2019-03-07 华为技术有限公司 Distributed computing system, data transmission method and device in distributed computing system
CN109450617A (en) * 2018-12-06 2019-03-08 成都卫士通信息产业股份有限公司 Encryption and decryption method and device, electronic equipment, computer readable storage medium
WO2020233350A1 (en) * 2019-05-20 2020-11-26 创新先进技术有限公司 Receipt storage method, node and system based on plaintext logs
CN111666087A (en) * 2020-05-28 2020-09-15 平安医疗健康管理股份有限公司 Operation rule updating method and device, computer system and readable storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A microservice-based test bench data acquisition, analysis and display architecture; Guan Yue; Wang Jian; Automation Panorama (09); full text *
A WebGIS-based meteorological service product production system and key technologies; Lyu Zhongliang; Bai Xinping; Xue Feng; Journal of Applied Meteorological Science (01); full text *

Also Published As

Publication number Publication date
CN112966279A (en) 2021-06-15

Similar Documents

Publication Publication Date Title
US20240113858A1 (en) Systems and Methods for Performing Secure Machine Learning Analytics Using Homomorphic Encryption
Archer et al. From keys to databases—real-world applications of secure multi-party computation
CN110443067B (en) Federal modeling device and method based on privacy protection and readable storage medium
US10333696B2 (en) Systems and methods for implementing an efficient, scalable homomorphic transformation of encrypted data with minimal data expansion and improved processing efficiency
US11100427B2 (en) Multi-party computation system for learning a classifier
JP2022095891A (en) Implementation of logic gate function using block chain
EP3651405B1 (en) Cryptographic datashare control for blockchain
CN111898137A (en) Private data processing method, equipment and system for federated learning
CN112347495A (en) Trusted privacy intelligent service computing system and method based on block chain
CN113239391B (en) Third-party-free logistic regression federal learning model training system and method
CN112272825A (en) System and method for structuring objects to operate in a software environment
CN112199697A (en) Information processing method, device, equipment and medium based on shared root key
Luo et al. Parallel secure outsourcing of large-scale nonlinearly constrained nonlinear programming problems
JP2019153216A (en) Learning device, information processing system, method for learning, and program
Serrano et al. A peer-to-peer ownership-preserving data marketplace
CN117521102A (en) Model training method and device based on federal learning
Harris Consensus-based secret sharing in blockchain smart contracts
CN112966279B (en) Distributed data processing method and system
Anwarbasha et al. An efficient and secure protocol for checking remote data integrity in multi-cloud environment
Covaci et al. NECTAR: non-interactive smart contract protocol using blockchain technology
Arulananth et al. Multi party secure data access management in cloud using user centric block chain data encryption
CS Machado et al. Software control and intellectual property protection in cyber-physical systems
Geetha et al. Blockchain based Mechanism for Cloud Security
Ahmed et al. Integrity verification for an optimized cloud architecture
Tahir et al. Critical review of blockchain consensus algorithms: challenges and opportunities

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant