WO2023124677A1 - Data processing method and computing platform - Google Patents

Data processing method and computing platform

Info

Publication number
WO2023124677A1
Authority
WO
WIPO (PCT)
Prior art keywords
operator
basic
operators
output
placeholder
Prior art date
Application number
PCT/CN2022/134250
Other languages
English (en)
French (fr)
Inventor
吴双
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2023124677A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/18 Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data

Definitions

  • the present application relates to the field of data processing, and more specifically, to a data processing method and computing platform.
  • technologies related to encrypted-state computing provide corresponding solutions. One of them is based on cryptography: data can be computed on while it remains encrypted, so the party performing the computation does not know what data it is computing on, which logically isolates the plaintext.
  • fully homomorphic encryption relies on mathematically difficult problems and has strict security proofs, so the security is relatively high.
  • the calculation speed of fully homomorphic encryption is relatively slow, so it may need to be accelerated through parallel acceleration or heterogeneous acceleration in actual use.
  • the interface of the hardware platform directly calls the code of the basic operators in the algebra library to perform the computation.
  • the basic operators in the algebra library are generally written for a specific hardware platform, so other hardware platforms cannot be used to optimize or accelerate them;
  • the function call logic between the basic operators in the library is written by the cryptographers themselves, which requires them to fully understand both the code implementation of the basic operators and the characteristics of the hardware platform;
  • the calculation engine cannot know the overall computing task and can only execute it following the function call relationships and the order in which operators appear, so operators of the same type within one computing task cannot be merged and executed together, which may waste computing power.
  • This application provides a data processing method and computing platform based on an algorithm architecture that provides a computing graph describing the overall computing task. The calculation engine can therefore learn the overall computing task, which decouples the construction of computing tasks based on high-order operators from the execution of those tasks. Furthermore, operators of the same type can be merged and executed together, and heterogeneous acceleration across multiple hardware platforms can be realized, further improving the speed of data processing.
  • a data processing method may include: a computing platform acquires data and a computing task description, the computing task description including the computation to be performed on the data; the computing platform selects a first operator set according to the computing task description, the first operator set including multiple basic operators; and the computing platform executes the first operator set on the data according to the dependencies of the basic operators in the first operator set, to obtain a computation result for the data, where, for two basic operators that have a dependency, the placeholder of an input operand of one intersects the placeholder of an output operand of the other.
  • a computing platform may include one or more computing devices.
  • the steps in the above method are all executed by a computer device.
  • the computing platform includes multiple computer devices, and the steps in the above method are executed by the multiple computer devices.
  • the first computer device obtains the data and the computing task description, inputs the computing task description into the second computer device, and inputs the data into the third computer device; the second computer device selects the first operator set; and the third computer device executes the first operator set to perform the computation on the data.
  • the third computer device may have one or more hardware platforms, and the multiple basic operators in the first operator set may be executed by one hardware platform, or may be respectively executed by multiple hardware platforms.
  • the data acquired by the first computer device and the first operator set selected by the second computer device may also be input into multiple computer devices, each of which may have one or more hardware platforms; in that case, the first operator set is executed on the data by the multiple computer devices. Each of the multiple computer devices may use one hardware platform or multiple hardware platforms to execute its corresponding basic operators, which is not specifically limited in this application.
  • one or more computer devices on the above computing platform may be cloud computer devices.
  • all the computer devices in the computing platform are computer devices in the cloud, and all the computer devices are used in the data processing module in the cloud to process ciphertext and/or plaintext data.
  • the above-mentioned third computer device, or the multiple computer devices in the computing platform that execute the first operator set on the data, are computer devices in the cloud and are used in the data processing module of the cloud to process ciphertext and/or plaintext data.
  • after the computing platform obtains the data to be processed and the computing task description, it executes the above steps to obtain the computation result.
  • for example, the computing platform can obtain encrypted electronic medical record data and a description of the computing task to be performed on that data, and perform the above steps to obtain the computation result; in another example, in federated learning, multiple participants can carry out joint machine learning modeling while keeping their respective data private, that is, the computing platform obtains the participants' ciphertext data and uses it to improve its own model. It should be understood that the above examples are only illustrative, and the present application does not specifically limit the application scenarios of the data processing method provided.
  • executing the first set of operators to perform calculations on data may be performed by a calculation engine in the computer device.
  • the calculation engine performs the computation on the data according to the multiple basic operators in the first operator set and the dependencies between them. It should be understood that the dependencies between the multiple basic operators in the first operator set are equivalent to the computing task description.
  • the above data may be ciphertext data, or plaintext data, or mixed plaintext and ciphertext data, which is not specifically limited in this application.
  • the placeholder of the operand in this application may be one placeholder or a set of multiple placeholders, which is not specifically limited in this application.
  • operator a can also be called a dependent operator of operator b, or operator b can be called a notification operator of operator a.
  • for two basic operators that have a dependency, the placeholder of an input operand of one intersects the placeholder of an output operand of the other. Specifically, this may mean one of the following:
  • the input operand of one operator includes the output operand of another basic operator;
  • a part of the output operand of one operator is an input operand of another operator, that is, the output operand of the operator can be split into multiple operands, and each of these operands can be used as an input operand of other operators;
  • the output operand of one operator is a part of an input operand of another operator, that is, the output operand of the operator can be combined with the output operands of other operators to form a new operand, and the new operand can be used as an input operand of another operator;
  • a part of the output operand of one operator is a part of an input operand of another operator, that is, the output operand of the operator can be split into multiple operands, one of which can be combined with other operands to form a new operand, and the new operand can be used as an input operand of another operator.
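  • as a minimal sketch of the intersection cases described here, assume (hypothetically; the source does not specify a representation) that a placeholder is modeled as a set of storage-slot indices. The function name `depends_on` and the slot numbers are illustrative only:

```python
# Hypothetical model: a placeholder is a set of storage-slot indices.

def depends_on(consumer_input, producer_output):
    """Return True if the consumer's input placeholder intersects
    the producer's output placeholder."""
    return bool(set(consumer_input) & set(producer_output))

# Output operand of some operator a occupies slots 0..3.
producer_out = {0, 1, 2, 3}

whole = {0, 1, 2, 3}           # input includes the whole output operand
split = {0, 1}                 # a part of the output is the input operand
merged = {0, 1, 2, 3, 8, 9}    # the output is a part of a larger input operand
partial = {2, 3, 8}            # a part of the output is a part of the input
unrelated = {8, 9}             # no shared slot, hence no dependency

cases = [whole, split, merged, partial]
results = [depends_on(c, producer_out) for c in cases]
```

  • under this model all four cases reduce to the same non-empty set intersection, which is why a single test can cover splitting and merging of operands.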
  • the overall computing task can be known from the computing graph formed by the first operator set. Based on the computing graph, the computing platform can start execution from the basic operators that do not depend on any other operator and continue until all basic operators in the graph have been executed. The computation completes without relying on the upper-level high-order operators, so the construction of computing tasks based on high-order operators is decoupled from the execution of computing tasks. Furthermore, software and hardware engineers do not need to consider the call relationships between high-order operators and basic operators when optimizing basic operators.
  • because the computing graph represents the dependencies between the basic operators of the computing task, it is independent of any specific hardware platform. The algorithm architecture can therefore be migrated across hardware platforms and, in turn, can support heterogeneous acceleration on multiple hardware platforms, further improving the speed of processing ciphertext data.
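  • the execution order described here can be sketched as a round-by-round topological traversal. This is an illustration under assumptions, not the application's engine: the function `schedule`, the operator names, and the choice to group ready operators by type (so that same-type operators could be merged) are all hypothetical:

```python
from collections import defaultdict

def schedule(ops, deps):
    """ops: operator name -> operator type.
    deps: operator name -> set of operator names it depends on.
    Returns a list of rounds; each round maps a type to the sorted
    names of the operators of that type that are ready to run."""
    remaining = {name: set(deps.get(name, ())) for name in ops}
    rounds = []
    while remaining:
        # Operators whose dependencies have all been executed.
        ready = sorted(n for n, d in remaining.items() if not d)
        if not ready:
            raise ValueError("cyclic dependency in the computing graph")
        batch = defaultdict(list)
        for name in ready:
            batch[ops[name]].append(name)
        rounds.append(dict(batch))
        for name in ready:
            del remaining[name]
        for d in remaining.values():
            d.difference_update(ready)
    return rounds

ops = {"m1": "mul", "m2": "mul", "a1": "add", "a2": "add"}
deps = {"a1": {"m1", "m2"}, "a2": {"a1"}}
rounds = schedule(ops, deps)
```

  • in this example the two independent `mul` operators land in the same round, which is the situation in which an engine that sees the whole graph could execute them together.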
  • the method further includes: the computing platform determines a first basic operator in the first operator set according to the dependencies of the basic operators in the first operator set, the first basic operator being a basic operator whose input placeholder has an empty intersection with the output placeholder of every basic operator in the first operator set; the computing platform stores the data in the storage space corresponding to the input placeholder set of the first basic operator; and the computing platform executes the first basic operator on the data and, according to the dependencies of the basic operators in the first operator set, executes the other basic operators in the first operator set.
  • computing platform may include one computer device, or may include multiple computer devices.
  • the input placeholder of an operator represents the union of the placeholders of all input operands of the operator, and the input placeholder set represents the set of input placeholders of multiple operators; the output placeholder of an operator represents the union of the placeholders of the output operands of the operator, and the output placeholder set represents the set of output placeholders of multiple operators.
  • the calculation engine determines, according to the input operands of the multiple first basic operators, in which input placeholders of the input placeholder set the data is stored in the corresponding storage space.
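  • a minimal sketch of identifying the first basic operators and placing data at their input placeholders follows. It assumes, hypothetically, that placeholders are sets of slot indices and that storage is a dictionary from slot to value; the names `input_placeholder`, `first_basic_operators`, and `load_inputs` are illustrative:

```python
def input_placeholder(op):
    """Union of the placeholders of all input operands of one operator."""
    return set().union(*op["inputs"])

def first_basic_operators(ops):
    """Operators whose input placeholder intersects no output placeholder."""
    all_outputs = set().union(*(op["output"] for op in ops.values()))
    return sorted(name for name, op in ops.items()
                  if not (input_placeholder(op) & all_outputs))

def load_inputs(ops, data):
    """Store data in the slots of the input placeholder set of the
    first basic operators; data maps slot index -> value."""
    first = first_basic_operators(ops)
    slots = set().union(*(input_placeholder(ops[n]) for n in first))
    missing = slots - set(data)
    if missing:
        raise KeyError(f"no data for slots {sorted(missing)}")
    return {slot: data[slot] for slot in sorted(slots)}

ops = {
    "mul1": {"inputs": [{0}, {1}], "output": {4}},
    "mul2": {"inputs": [{2}, {3}], "output": {5}},
    "add":  {"inputs": [{4}, {5}], "output": {6}},
}
first = first_basic_operators(ops)
storage = load_inputs(ops, {0: 3, 1: 4, 2: 5, 3: 6})
```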
  • the first operator set and the dependencies of the basic operators in the first operator set are determined according to N basic operator sets; the N basic operator sets are in one-to-one correspondence with N high-order operators, and the computation on the data can be implemented by the N high-order operators. A first basic operator set is obtained by expanding a first high-order operator, where the first high-order operator is any one of the N high-order operators and the first basic operator set is the one of the N basic operator sets that corresponds to the first high-order operator; the output placeholder of the first basic operator set is the same as the output placeholder of the first high-order operator.
  • a computing task needs to be described by multiple high-order operators, and there are dependencies among the multiple high-order operators. Then, in this application, each high-order operator is expanded into a corresponding set of basic operators in advance according to a user-defined operator expansion function.
  • the above expansion process is a "level-by-level expansion": the high-order operator is first expanded into multiple sub-order operators, and the sub-order operators are then expanded further until every sub-order operator has been expanded into user-defined basic operators.
  • each of the N basic operator sets contains one or more basic operators, so the total number of basic operators contained in the N basic operator sets is greater than or equal to N, the total number of high-order operators before expansion.
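  • the level-by-level expansion can be sketched as a recursion over user-defined expansion functions. The operator names (`square_sum`, `square`), the choice of `add` and `mul` as basic operators, and the table `EXPANSIONS` are hypothetical, and operands and placeholders are left out to keep the sketch short:

```python
# Hypothetical basic operators; the source leaves these user-defined.
BASIC = {"add", "mul"}

# Hypothetical user-defined expansion functions. Each expands an
# operator by exactly one level into sub-order operators.
EXPANSIONS = {
    "square_sum": lambda: ["square", "square", "add"],
    "square": lambda: ["mul"],
}

def expand(op):
    """Expand an operator level by level until only basic operators remain."""
    if op in BASIC:
        return [op]
    result = []
    for sub in EXPANSIONS[op]():
        result.extend(expand(sub))
    return result

high_order = ["square_sum", "square"]
basic_sets = [expand(h) for h in high_order]
total_basic = sum(len(s) for s in basic_sets)
```

  • each high-order operator yields one basic operator set, so `basic_sets` has N entries and the total number of basic operators is at least N.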
  • the N basic operator sets are in one-to-one correspondence with N pieces of dependency information, each piece of dependency information indicating the dependencies of the basic operators in the corresponding basic operator set. The first operator set and the dependencies of the basic operators in it are obtained as follows: target dependency information is determined according to the N pieces of dependency information, the target dependency information indicating the dependencies of each of the multiple basic operators; and the first operator set is determined according to the target dependency information and the basic operators included in the N basic operator sets, where the target dependency information is the dependencies of the basic operators in the first operator set.
  • the output placeholder of the first basic operator set is made the same as the output placeholder of the first high-order operator by an absorbing operation. The absorbing operation includes: determining the output placeholder of the first basic operator set according to first dependency information, where the first dependency information indicates the dependencies of the basic operators in the first basic operator set; determining whether the output placeholder of the first basic operator set is the same as the output placeholder of the first high-order operator; and, if they are different, pointing the pointer of the output operand of the first basic operator set to the output placeholder of the first high-order operator and updating second identification information to first identification information, where the first identification information is the identification information of the output operand of the first basic operator set and the second identification information is the identification information of the output operand of the first high-order operator.
  • identification information is the "unique" identification of the output operand, which can be uniquely determined according to the definition of the output operand.
  • identification information may be a Hash value, or other unique identification information, which is not specifically limited in this application.
  • the absorbing operation further includes: after the second identification information is updated to the first identification information, when the pointer of the output operand of the first high-order operator is accessed by a pointer synchronizer, determining whether the identification information of the pointer synchronizer is equal to the second identification information; and, when it is not, updating the pointer of the pointer synchronizer and updating the identification information of the pointer synchronizer to the second identification information.
  • the pointer of the new output operand is locally pointed to the placeholder of the output operand of the original operator, and the identification information of the output operand of the original operator is updated to the identification information of the new output operand; the next time the pointer of the output operand of the original operator is accessed, that pointer is refreshed to the pointer of the new output operand, which keeps global references consistent.
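  • one possible reading of the absorbing operation and the pointer synchronizer is sketched below. Everything here is an assumption made for illustration: the classes `Operand` and `PointerSynchronizer`, the `registry` that maps identification information to operands, and the use of short strings in place of hash values are not taken from the source:

```python
from dataclasses import dataclass

@dataclass
class Operand:
    ident: str              # identification information (e.g. a hash value)
    placeholder: frozenset  # storage slots the operand occupies

def absorb(set_out, high_out):
    """Make the expanded set's output use the high-order operator's
    output placeholder, and give the high-order operand the set's id."""
    if set_out.placeholder != high_out.placeholder:
        set_out.placeholder = high_out.placeholder
        high_out.ident = set_out.ident

class PointerSynchronizer:
    """Holds a cached pointer to an operand and the id seen when caching."""
    def __init__(self, watched, registry):
        self.watched = watched
        self.registry = registry   # id -> operand (hypothetical lookup)
        self.ident = watched.ident
        self.pointer = watched

    def access(self):
        # Refresh the cached pointer if the watched operand's id changed.
        if self.ident != self.watched.ident:
            self.pointer = self.registry[self.watched.ident]
            self.ident = self.watched.ident
        return self.pointer

high_out = Operand("H", frozenset({3}))
set_out = Operand("B", frozenset({7}))
registry = {"H": high_out, "B": set_out}
sync = PointerSynchronizer(high_out, registry)
before = sync.access()
absorb(set_out, high_out)
after = sync.access()
```

  • the refresh is lazy in this sketch: nothing is rewritten at absorb time except the two operands themselves, and each holder of an old reference catches up on its next access.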
  • the dependencies of the basic operators in the first basic operator set are determined as follows: determining whether the input placeholder of a second basic operator intersects at least one output placeholder in a first output placeholder set, where the first output placeholder set includes the output placeholders of all basic operators in the first basic operator set and the second basic operator is any basic operator in the first basic operator set; and, if so, determining that the basic operator corresponding to the at least one output placeholder has a dependency with the second basic operator.
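  • the determination above can be sketched as a pairwise intersection test over one basic operator set. The sketch assumes placeholders are sets of slot indices and, as a further assumption not stated in the source, skips comparing an operator with itself; the name `build_dependencies` is illustrative:

```python
def build_dependencies(ops):
    """ops: name -> (input_placeholder, output_placeholder), both sets.
    Returns name -> set of operator names whose output placeholder
    intersects this operator's input placeholder."""
    deps = {}
    for name, (inputs, _) in ops.items():
        deps[name] = {
            other
            for other, (_, outputs) in ops.items()
            if other != name and inputs & outputs  # assumption: no self-edges
        }
    return deps

ops = {
    "a": ({0, 1}, {2}),
    "b": ({2, 3}, {4}),
    "c": ({2, 4}, {5}),
}
deps = build_dependencies(ops)
```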
  • the input or output placeholder of an operator may be one placeholder or a set of multiple placeholders, which is not specifically limited in this application.
  • the output placeholder of a first operator is determined according to the input operands of the first operator and the type of the first operator, where the first operator is any operator obtained during the expansion of the first high-order operator, and the intersection of the output placeholder of the first operator with the already allocated output placeholders is empty.
  • the first operator is generated by a user-defined operator assembly function, and the output placeholder of the first operator is assigned by calling the placeholder/operand assignment module.
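  • a minimal sketch of such an assignment module follows. The sizing rule in `output_size` (element-wise types keep the largest input size, a hypothetical `concat` type sums the sizes) and the class `PlaceholderAllocator` are assumptions for illustration; the source only states that the output placeholder depends on the input operands and the operator type and must not overlap placeholders already allocated:

```python
def output_size(op_type, inputs):
    """Hypothetical rule mapping operator type and inputs to an output size."""
    sizes = [len(p) for p in inputs]
    if op_type == "concat":
        return sum(sizes)
    return max(sizes)  # element-wise types such as "add" or "mul"

class PlaceholderAllocator:
    """Hands out fresh slot ranges, so placeholders never overlap."""
    def __init__(self):
        self._next = 0
        self.allocated = set()

    def reserve(self, size):
        slots = frozenset(range(self._next, self._next + size))
        self._next += size
        self.allocated |= slots
        return slots

    def allocate_output(self, op_type, inputs):
        return self.reserve(output_size(op_type, inputs))

alloc = PlaceholderAllocator()
x = alloc.reserve(2)                       # external input operand
y = alloc.reserve(2)                       # external input operand
s = alloc.allocate_output("add", [x, y])
c = alloc.allocate_output("concat", [x, s])
```

  • because slots are only ever taken from a monotonically increasing counter, the empty-intersection requirement holds by construction rather than by a runtime check.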
  • a computing platform may include one or more computer devices, configured to execute the method in the foregoing first aspect or any possible implementation manner of the first aspect.
  • the computing platform may be realized by hardware, or may be realized by executing corresponding software by hardware.
  • the device includes: a transceiver unit configured to acquire data and a computing task description, the computing task description including the computation to be performed on the data; a first processing unit configured to select a first operator set according to the computing task description, the first operator set containing multiple basic operators; and a second processing unit configured to execute the first operator set on the data according to the dependencies of the basic operators in the first operator set, to obtain a computation result for the data, where, for two basic operators that have a dependency, the placeholder of an input operand of one intersects the placeholder of an output operand of the other.
  • the transceiver unit, the first processing unit, and the second processing unit may be one computer device, or the transceiver unit, the first processing unit, and the second processing unit may also be different computer devices.
  • the second processing unit may be one or more computer devices.
  • the computer device may have one or more hardware platforms. The multiple basic operators in the first operator set may be executed by one hardware platform or may be executed by multiple hardware platforms respectively.
  • when the second processing unit is a plurality of computer devices, the data obtained by the transceiver unit and the first operator set selected by the first processing unit may also be input into the plurality of computer devices, each of which may have one or more hardware platforms; in that case, the first operator set is executed on the data by the plurality of computer devices. Specifically, each of the plurality of computer devices may use one hardware platform or multiple hardware platforms to execute its corresponding basic operators, which is not specifically limited in this application.
  • the computing platform may be a cloud computing platform, or the second processing unit in the computing platform is a cloud processing unit, which is not specifically limited in the present application.
  • the second processing unit is further configured to: determine a first basic operator in the first operator set according to the dependencies of the basic operators in the first operator set, the first basic operator being a basic operator whose input placeholder has an empty intersection with the output placeholder of every basic operator in the first operator set; store the data in the storage space corresponding to the input placeholder set of the first basic operator; and execute the first basic operator on the data and, according to the dependencies of the basic operators in the first operator set, execute the other basic operators in the first operator set.
  • the first operator set and the dependencies of the basic operators in the first operator set are determined according to N basic operator sets; the N basic operator sets are in one-to-one correspondence with N high-order operators, and the computation on the data can be implemented by the N high-order operators. A first basic operator set is obtained by expanding a first high-order operator, where the first high-order operator is any one of the N high-order operators and the first basic operator set is the one of the N basic operator sets that corresponds to the first high-order operator; the output placeholder of the first basic operator set is the same as the output placeholder of the first high-order operator.
  • the N basic operator sets are in one-to-one correspondence with N pieces of dependency information, each piece of dependency information indicating the dependencies of the basic operators in the corresponding basic operator set. The first operator set and the dependencies of the basic operators in it are obtained as follows: target dependency information is determined according to the N pieces of dependency information, the target dependency information indicating the dependencies of each of the multiple basic operators; and the first operator set is determined according to the target dependency information and the basic operators included in the N basic operator sets, where the target dependency information is the dependencies of the basic operators in the first operator set.
  • the output placeholder of the first basic operator set is made the same as the output placeholder of the first high-order operator by an absorbing operation. The absorbing operation includes: determining the output placeholder of the first basic operator set according to first dependency information, where the first dependency information indicates the dependencies of the basic operators in the first basic operator set; determining whether the output placeholder of the first basic operator set is the same as the output placeholder of the first high-order operator; and, if they are different, pointing the pointer of the output operand of the first basic operator set to the output placeholder of the first high-order operator and updating second identification information to first identification information, where the first identification information is the identification information of the output operand of the first basic operator set and the second identification information is the identification information of the output operand of the first high-order operator.
  • the absorbing operation further includes: after the second identification information is updated to the first identification information, when the pointer of the output operand of the first high-order operator is accessed by a pointer synchronizer, determining whether the identification information of the pointer synchronizer is equal to the second identification information; and, when it is not, updating the pointer of the pointer synchronizer and updating the identification information of the pointer synchronizer to the second identification information.
  • the dependencies of the basic operators in the first basic operator set are determined as follows: determining whether the input placeholder of a second basic operator intersects at least one output placeholder in a first output placeholder set, where the first output placeholder set includes the output placeholders of all basic operators in the first basic operator set and the second basic operator is any basic operator in the first basic operator set; and, if so, determining that the basic operator corresponding to the at least one output placeholder has a dependency with the second basic operator.
  • the output placeholder of a first operator is determined according to the input operands of the first operator and the type of the first operator, where the first operator is any operator obtained during the expansion of the first high-order operator, and the intersection of the output placeholder of the first operator with the already allocated output placeholders is empty.
  • in a third aspect, a data processing device is provided; the device may be a chip or a circuit, configured to execute the method in the above-mentioned first aspect or any possible implementation manner of the first aspect.
  • the apparatus may be implemented by hardware, or may be implemented by executing corresponding software by hardware.
  • the apparatus includes a module configured to execute the above first aspect or the method in any possible implementation manner of the first aspect.
  • the device includes: a processor and a memory; the memory is used to store instructions, and when the communication device is running, the processor executes the instructions stored in the memory, so that the communication device performs the data transmission method in the first aspect or any implementation of the first aspect.
  • the memory may be integrated in the processor, or independent of the processor.
  • the device includes a processor configured to be coupled with a memory, read instructions in the memory, and execute, according to the instructions, the data transmission method in the first aspect or any implementation of the first aspect.
  • a computer-readable storage medium stores a program, and the program enables the communication device to execute the data transmission method in any one of the above-mentioned aspects and their various implementations.
  • the present application also provides a computer program product containing instructions, which, when run on a computer, causes the computer to execute any data transmission method in the above aspects.
  • a chip system includes a processor connected to a memory; the processor is used to call and run a computer program from the memory, so that a communication device in which the chip system is installed executes the method in any of the above-mentioned aspects and their possible implementations.
  • the memory can be located inside the system-on-a-chip, or outside the system-on-a-chip.
  • FIG. 1 is a schematic diagram of an application scenario of a data processing method provided by the present application.
  • FIG. 2 is a schematic diagram of an operator involved in a data processing method provided by the present application.
  • FIG. 3 is a schematic diagram of a calculation graph representation provided by the present application.
  • FIG. 4 is a schematic diagram of an application architecture of a data processing method provided by the present application.
  • FIG. 5 is a schematic diagram of an application framework of a data processing method provided by the present application.
  • FIG. 6 is a schematic flowchart of a data processing method provided by the present application.
  • FIG. 7 is a schematic flowchart of a data processing method provided by the present application.
  • FIG. 8 is a schematic flowchart of a data processing method provided by the present application.
  • FIG. 9 is a schematic diagram of different stages of a data processing method provided by the present application.
  • FIG. 10 is a schematic diagram of operator dependencies of a data processing method provided by the present application.
  • FIG. 11 is a schematic flowchart of another data processing method provided by the present application.
  • FIG. 12 is a schematic flowchart of another data processing method provided by the present application.
  • FIG. 13 is a schematic flowchart of another data processing method provided by the present application.
  • FIG. 14 is a schematic flowchart of another data processing method provided by the present application.
  • FIG. 15 is a schematic diagram of a computing platform provided by the present application.
  • FIG. 16 is a schematic diagram of a data processing device provided by the present application.
  • FIG. 1 is a schematic diagram of an application scenario of a data processing method provided in this application.
  • the cloud 100 is composed of an application management module 101, a data processing module 102, a data storage module 103, and a key management and authentication module 104.
  • the application management module 101 is responsible for coordinating and processing user requests;
  • the data processing module 102 is used to perform computing tasks on data;
  • the data storage module 103 is used to store data, to store and retrieve user data according to the recorded user data location, and is responsible for information exchange among users;
  • the key management and authentication module 104 is used to apply for a key or perform user authentication.
  • the user interacts with the cloud through the cloud computing system login program on the client 200. The client 200 is responsible for submitting user requests to the cloud 100, encrypting and decrypting the user's private data, and uploading and downloading data.
  • a fully homomorphic encryption algorithm can be used to process data in cloud computing.
  • the fully homomorphic encryption mechanism enables users or trusted third parties to process data directly without exposing the original data, and users obtain the processed data by decrypting the operation results.
  • for example, electronic medical records are stored as ciphertext on the cloud server.
  • the electronic medical record data in ciphertext can be handed over to a professional data processing service provider for processing, and the correct data is obtained by decrypting the processing result.
  • the application scenario shown in FIG. 1 is only an example. The data processing method and device provided in the embodiments of the present application can also be applied to other data processing scenarios based on homomorphic encryption algorithms, or to data processing that is not based on encrypted-state algorithms. The data processing method provided by the embodiments of this application can be used to process ciphertext data, plaintext data, or mixed plaintext and ciphertext data, which is not limited in this application.
  • Fully homomorphic encryption: a type of encryption algorithm that supports calculations on ciphertexts that are equivalent to calculations on the corresponding plaintexts. The calculation on the ciphertext can be completed in an untrusted environment without leaking data privacy.
  • Heterogeneous computing: a hybrid computing solution that uses more than one hardware platform, for example, a solution that simultaneously uses a central processing unit (CPU) and a graphics processing unit (GPU), or a CPU and a field-programmable gate array (FPGA).
  • Lattice algebra: an algebraic structure defined on a high-dimensional linear space.
  • the hard mathematical problems on which all current homomorphic encryption algorithms ultimately rely are based on lattice algebra.
  • Commonly used hard problems include the shortest vector problem (SVP) on lattices and learning with errors (LWE).
  • Upper layer: refers to the homomorphic encryption algorithm library, which includes high-order operators that can be used to represent computing tasks.
  • the computing tasks represented by high-order operators are realized by calling the basic operators of the algebra library.
  • Bottom layer: refers to the algebra library, which provides basic operators for the calculation engine to perform calculation tasks.
  • the underlying interface refers to the interface between the algebra library and the hardware platform.
  • Placeholder: a collection of data blocks of a specific unit size, representing raw data without a type. Describing a calculation graph does not involve specific values, but an abstract representation is needed to distinguish different data and data sizes. In some possible implementations, assuming that all data can be represented as data blocks of a specific unit size (such as 64 bits), the size of a placeholder is the number of unit-size data blocks it contains, and the intersection and union of placeholders are the intersection and union of their sets of data blocks.
  • Operand: a set of one or more placeholders.
  • the placeholders in the set do not overlap, that is, the intersection of any two placeholders in an operand is empty.
  • An operand can be regarded as a single piece of data, and it can carry additional information, such as a data type.
  • the size of an operand is the sum of the sizes of the placeholders it contains.
  • operands can be viewed as "clothed" placeholders.
  • the same placeholder can "wear different clothes", which means that the same placeholder (naked data) can be regarded as operands of different types on different occasions; different operands may also contain the same placeholder.
  • an operand can itself be regarded as a placeholder, in which case the data represented by that placeholder is the union of all placeholders in the set.
  • the intersection of operands is the intersection of the placeholders they represent, and the union of operands is the union of the placeholders they represent.
  • the input/output operands of an operator can be treated as the input/output placeholder of the operator, in which case the input/output placeholder of the operator is the union of all placeholders of all input/output operands of the operator.
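  • The placeholder and operand definitions above can be illustrated with a minimal Python sketch. It assumes the half-open interval form of a placeholder; the class names `Placeholder` and `Operand` and the type labels are hypothetical and are not taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placeholder:
    """Untyped data: the unit-size blocks start, start+1, ..., stop-1 (assumed model)."""
    start: int
    stop: int

    def size(self) -> int:
        return self.stop - self.start

    def blocks(self) -> frozenset:
        return frozenset(range(self.start, self.stop))


@dataclass(frozen=True)
class Operand:
    """A "clothed" placeholder set: a data type plus non-overlapping placeholders."""
    dtype: str
    placeholders: tuple

    def __post_init__(self):
        seen = set()
        for ph in self.placeholders:
            if seen & ph.blocks():
                raise ValueError("placeholders in an operand must not overlap")
            seen |= ph.blocks()

    def blocks(self) -> frozenset:
        # Viewed as one placeholder, the operand is the union of its placeholders.
        return frozenset().union(*(ph.blocks() for ph in self.placeholders))

    def size(self) -> int:
        return sum(ph.size() for ph in self.placeholders)


# Hypothetical data: two operands of different types that share blocks 2 and 3.
a = Operand("ciphertext", (Placeholder(0, 4), Placeholder(8, 12)))
b = Operand("polynomial", (Placeholder(2, 6),))
overlap = a.blocks() & b.blocks()
```

  • In this sketch the intersection of two operands is computed on their block sets, so it does not depend on the operand type, which matches the idea that the type is only the "clothing" of the underlying data.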
  • Computing graph: a collection of operators with sequential dependencies that forms a directed acyclic graph (DAG). It represents a complete computing task, that is, what calculation needs to be performed on the data. As shown in FIG. 3, A, B, C, and D represent operators, and a, b, c, d, x, y, and z represent operands. It should be understood that there is no circular dependency between operators.
  • the calculation task described by the calculation graph is composed of algebraic data structures and operators and is independent of the specific hardware platform and calculation engine, so the front end and back end can be decoupled.
  • in existing solutions, a high-order operator directly calls the code of the basic operators according to the function call logic.
  • the implementation code of these basic operators is generally written for a specific hardware platform, so it is difficult to optimize or accelerate them on other hardware platforms, which means that current open-source homomorphic encryption libraries cannot effectively support heterogeneous computing.
  • because the computing engine cannot know the overall computing task, it can only execute the task according to the calling relationship and the order in which the operators appear, which is likely to waste computing power.
  • the calculation engine must rely on the operator call relationship provided by the upper layer to perform calculation, so the front-end representation of computing tasks based on high-order operators is strongly coupled to the back-end execution of computing tasks.
  • as a result, when software and hardware engineers optimize an operator, they must consider the logical relationship between the upper-layer high-order operators and the basic operators, and the difficulty and cost of optimization are relatively high.
  • existing open-source homomorphic encryption libraries include the homomorphic encryption library (HELib), the lattice cryptography open-source library Palisade, and the simple encrypted arithmetic library (SEAL).
  • Both HELib and Palisade are based on the open-source fast number theory algorithm library (NTL), whose bottom layer is the open-source high-precision integer library, the GNU multiple precision arithmetic library (GMP).
  • Neither NTL nor GMP currently supports GPU acceleration. Even if these two libraries add support for GPUs and other hardware platforms in the future, the limitations of the existing architecture mean that they can only call the GPU inside an operator that supports it, that is, optimize a single operator designed for the GPU. They cannot support overall optimization across operators on different hardware platforms, such as merging multiple operators for parallel acceleration. Therefore, to support heterogeneous computing, NTL and GMP may need to change the implementation code of the basic operators according to the characteristics of the hardware platform every time they migrate to a new platform. In addition, because the calling relationship between high-order operators and basic operators is written by cryptographers themselves, that calling relationship may also need to change after the implementation code of the basic operators changes. As a result, the underlying implementation requires a lot of manual adjustment, the upper-layer application interface is difficult to keep unchanged, and migration is expensive.
  • the structure of the SEAL library is similar to that of HELib and Palisade. The difference is that its bottom layer does not call other algebra libraries; instead, it implements the lattice algebra operators required by the fully homomorphic encryption layer on the CPU.
  • although the SEAL library has developed its own basic operators, the computing engine's execution of computing tasks still depends on the calling relationship between high-order operators and basic operators, and the engine cannot know the overall computing task. Its architecture therefore has disadvantages similar to those of HELib and Palisade: high migration costs across hardware platforms, and strong coupling between the construction of upper-layer computing tasks and their execution by the computing engine.
  • the embodiments of the present application provide a data processing method and a computing platform based on a fully homomorphic encryption algorithm architecture that provides the computing engine with a computing graph describing the overall computing task used to process the ciphertext.
  • when executing, the computing engine can learn the overall computing task from the computing graph, so operators of the same type can be combined and processed according to the graph, which improves computing speed while ensuring that the computing task is executed correctly.
  • because the calculation graph is composed of basic operators, the dependency relationships between basic operators are also determined.
  • based on the calculation graph, the computing engine can therefore start from the basic operators that do not depend on any other operator and continue until all basic operators in the graph have been executed. The calculation is completed without relying on the upper-layer high-order operators, so the construction of computing tasks based on high-order operators is decoupled from their execution. Furthermore, software and hardware engineers do not need to consider the calling relationship between high-order operators and basic operators when optimizing basic operators.
  • in addition, because the calculation graph represents the dependencies between basic operators and is independent of the specific hardware platform, the algorithm architecture can be migrated across hardware platforms and can support heterogeneous acceleration on multiple hardware platforms, further improving the calculation speed of data processing.
  • FIG. 4 is a schematic diagram of an application architecture required for realizing a data processing method provided by an embodiment of the present application, which can be applied to the application scenario shown in FIG. 1 .
  • specifically, it is a heterogeneous lattice graph (HLG) architecture, that is, a lattice algebra computing graph architecture that supports heterogeneous computing.
  • the HLG architecture can support homomorphic encryption algorithms such as the BGV algorithm (Zvika Brakerski, Craig Gentry, and Vinod Vaikuntanathan), the BFV algorithm (Zvika Brakerski, Junfeng Fan, and Frederik Vercauteren), and the CKKS algorithm (Jung Hee Cheon, Andrey Kim, Miran Kim, and Yongsoo Song).
  • computing tasks based on fully homomorphic encryption algorithms do not have logical branches, nor do they need to support conditional judgments and loops, that is, there is no control flow, so computing tasks can be expressed as DAG.
  • an abstract calculation graph expression layer is added between the upper-level high-level operators and the underlying basic operators.
  • the calculation graph expression layer uses DAG to describe the overall calculation task.
  • the calculation engine knows the dependencies between the basic operators, and can complete the calculation based on the basic operators and the relationship between them. That is to say, it is possible to achieve decoupling between the front-end calculation graph representation based on high-order operators and the back-end calculation task execution based on basic operators.
  • FIG. 5 is a schematic diagram of an HLG basic framework required for realizing a data processing method provided by an embodiment of the present application.
  • the HLG framework includes an operand/placeholder allocation module, an operator assembly module, an operator expansion module, and an operator dependency judgment module.
  • the HLG framework can also include a graph optimization module and/or a computing engine. It should be noted that acceleration using the residue number system (RNS) is a conventional method; for the HLG framework to be accelerated through RNS algorithms, it needs to support arbitrary splitting and combination of data. Therefore, in the HLG basic framework, an operator assembly function and an operator expansion function can be defined for each operator.
  • the operator assembly function is used to generate an operator according to the input operands and the operator type, and the operator expansion function is used to expand a high-order operator into a set of basic operators. In some possible implementations, operator implementations dedicated to the calculation engine may also be defined. It should be understood that the HLG framework shown in FIG. 5 is only an example, and its functional modules should not be simply understood as functional entities.
  • Operand/placeholder allocation module: used to allocate virtual data placeholders that represent data of a specific length in the calculation graph. Given the operand type and the desired placeholder size (number of unit data blocks), this module returns a new operand that contains a new placeholder which meets the size requirement and has no intersection with any existing placeholder.
  • the placeholder may be described as a left-closed, right-open interval [x, y), where x and y are integers.
  • for example, [0, 4096) and [4096, 8192) represent two different placeholders, each with a size of 4096 unit-size data blocks.
  • the size of a unit data block can be defined as needed; common sizes are 1 bit, 2 bits, 4 bits, 8 bits, 16 bits, or 64 bits, which is not specifically limited in the embodiments of this application.
  • alternatively, the placeholder can be described as a set of integers in which each integer represents a different abstract unit data block, for example: {0, 1, 2, ..., 4095} and {4096, 4097, ..., 8191}.
  • Operator assembly module: in some possible implementations, the operator assembly module is implemented by an operator assembly function. Given several input operands X0, X1, ..., Xn and an operator type T, this module outputs a brand new operator OP. In some possible implementations, the number of input operands may also be zero, that is, a new operator is output for a given operator type alone. It should be noted that the operator assembly function can be defined by the user of the algebra library, that is, the cryptographer.
  • the operator assembly module defines a set of candidate operators that describe computing tasks.
  • the operator type of OP is T, its input operands are X0, X1, ..., Xn, and its output operand is a brand new operand Y.
  • the placeholder of operand Y is newly allocated by calling the operand/placeholder allocation module and has no intersection with any existing placeholder, that is, output placeholders never intersect one another.
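  • As a rough illustration of the allocation and assembly modules described above, the following Python sketch hands out consecutive half-open intervals and attaches a freshly allocated output operand to each assembled operator. The names `Allocator` and `assemble`, the dictionary layout, and the explicit output type and size parameters are assumptions made for the example; the application does not specify how the output size is derived from the operator type.

```python
class Allocator:
    """Hands out fresh, mutually disjoint placeholders [start, stop) (assumed scheme)."""

    def __init__(self):
        self._next = 0  # first unit data block that has not been handed out yet

    def allocate(self, dtype, size):
        if size <= 0:
            raise ValueError("size must be positive")
        ph = (self._next, self._next + size)
        self._next += size
        return {"type": dtype, "placeholders": [ph]}


def assemble(alloc, op_type, inputs, out_type, out_size):
    """Build an operator of type op_type whose output operand is newly allocated."""
    output = alloc.allocate(out_type, out_size)
    return {"type": op_type, "inputs": list(inputs), "output": output}


# Hypothetical usage: two input operands and one "add" operator.
alloc = Allocator()
x0 = alloc.allocate("ciphertext", 4096)
x1 = alloc.allocate("ciphertext", 4096)
op = assemble(alloc, "add", [x0, x1], "ciphertext", 4096)
```

  • Because the allocator only ever moves forward, every output placeholder is disjoint from all earlier ones, which is the property that later makes the producer of a given placeholder unambiguous.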
  • Operator expansion module: used to expand a target operator into a set of sub-order operators, where the computing task represented by the set of sub-order operators is the same as the computing task represented by the original target operator.
  • the operator expansion module is implemented by an operator expansion function. It should be noted that the operator expansion function can be defined by the user of the algebra library, that is, the cryptographer.
  • the expansion of a high-order operator proceeds level by level, that is, the high-order operator is first expanded into multiple sub-order operators through the operator expansion module.
  • for the set of high-order operators produced by calling the operator assembly module, this module takes a single high-order operator A as input and outputs an equivalent operator set {A_i}, i = 0, 1, 2, ..., that is, it expands operator A into several sub-order operators A_i.
  • random protection mechanisms can be introduced into the operator expansion function, such as useless codes, redundant calculation branches, operator splitting, equivalent transformation, etc., which are not limited in this embodiment of the present application.
  • the basic operator is an operator supporting lattice algebra.
  • the user can define which operators are the basic operators.
  • the granularity of high-order operator expansion can be regulated by customizing basic operators.
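  • The level-by-level expansion and the role of user-defined basic operators can be sketched as follows. This is only an illustration on operator names: the rule table, the operator names, and the function `expand` are hypothetical, and operands and placeholders are deliberately left out.

```python
def expand(op, rules, basic):
    """Expand op level by level until only operators listed in basic remain."""
    if op in basic:
        return [op]
    result = []
    for sub in rules[op]:  # one level of expansion, as defined by the library user
        result.extend(expand(sub, rules, basic))
    return result


# Hypothetical expansion rules: A -> {A0, A1}, A0 -> {add, mul}, A1 -> {mul}.
rules = {"A": ["A0", "A1"], "A0": ["add", "mul"], "A1": ["mul"]}

fine = expand("A", rules, basic={"add", "mul"})
coarse = expand("A", rules, basic={"add", "mul", "A0"})
```

  • Declaring `A0` to be a basic operator stops the expansion one level earlier, which is one way to read the statement that the expansion granularity is regulated by customizing the basic operators.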
  • a new output operand is generated during the execution of the operator expansion function, and the placeholder of the new output operand differs from the placeholder of the output operand of the original operator.
  • the embodiments of this application therefore provide an "absorb" operation, which guarantees that after an operator is expanded, the placeholder of the new output operand of the expanded operator set matches the placeholder of the output operand of the original operator. Furthermore, because the new output operand and the output operand of the original operator can also be referenced elsewhere, all references to the output operand must be refreshed to ensure global consistency.
  • FIG. 6 shows the changes in local and global information before and after the "absorb" operation.
  • operator D is expanded into operators D1, D2, and D3.
  • during expansion, a new operand a is generated.
  • the placeholder of operand a is placeholder a, which differs from placeholder b of operand b, the output operand of operator D before expansion. This would change the calculation graph dependencies before and after operator expansion.
  • in the "absorb" operation, operand a gives up its own placeholder a and instead points to placeholder b of operand b.
  • operand b is still the output operand of operator D, and operand a is still the output operand of operator D3.
  • all references that originally pointed to operand b now need to point to operand a.
  • to make all references to operand b point to operand a, every pointer or reference to an operand must go through a pointer synchronizer, PtrSyncer. In addition to storing a pointer to an operand, the pointer synchronizer also saves a Hash value. A Hash value field is also added to every operand object, and the Hash value uniquely identifies the operand object. Furthermore, a mapping table ObjMap from Hash values to operand objects is defined, which represents the mapping relationship between Hash values and operand objects.
  • method 600 includes:
  • the first operand may be the above operand a
  • the second operand may be the above operand b
  • the second placeholder may be the above placeholder b.
  • the second placeholder may be one placeholder, or may also be a set of multiple placeholders, which is not specifically limited in this embodiment of the present application.
  • the next time the pointer synchronizer of the second operand accesses the pointer, the original pointer to the second operand is refreshed to a pointer to the first operand.
  • at that time, the Hash value in the pointer synchronizer of the second operand is updated to the Hash value of the first operand.
  • the method for accessing a pointer through a pointer synchronizer may refer to the method 700 shown in FIG. 8, and the method 700 includes:
  • the pointer access interface PtrSyncer.Ptr() of the pointer synchronizer PtrSyncer is called.
  • the Hash value of the pointer synchronizer is the Hash value of the second operand
  • the target Hash value is the Hash value of the first operand.
  • the Hash value of the pointer synchronizer may be expressed as PtrSyncer.Hash
  • the target Hash value may be expressed as PtrSyncer.ptr.Obj.Hash.
  • the target pointer is the pointer of the above-mentioned first operand.
  • the returned pointer of the pointer synchronizer is the target pointer, exemplarily, the pointer of the above-mentioned first operand.
  • the return value also includes a Hash value corresponding to the target pointer.
  • the return value may be PtrSyncer.ptr.
  • the data processing method provided by the embodiments of the present application uses delayed pointer synchronization to ensure that the placeholder of a new output operand generated during operator expansion is consistent with the placeholder of the output operand of the original operator. More specifically, the new output operand is locally pointed at the placeholder of the output operand of the original operator, and the Hash value of the output operand of the original operator is updated to the Hash value of the new output operand. The next time a pointer to the output operand of the original operator is accessed, the pointer is refreshed to point to the new output operand, which ensures the consistency of global references.
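  • The delayed pointer synchronization summarized above can be sketched in Python. This is one possible reading of the description, not the implementation in the application: it assumes that "absorb" overwrites the Hash field of the original output operand, and that a synchronizer detects the change by comparing its saved Hash value with the Hash value of the object it points to. `PtrSyncer`, `Ptr`, and `ObjMap` (here `obj_map`) follow the names in the text; everything else is hypothetical.

```python
import itertools

_hashes = itertools.count(1)  # stand-in for a real hash generator (assumption)
obj_map = {}                  # ObjMap: Hash value -> operand object


class Operand:
    def __init__(self, name, placeholder):
        self.name = name
        self.placeholder = placeholder
        self.hash = next(_hashes)
        obj_map[self.hash] = self


class PtrSyncer:
    """Holds a pointer to an operand plus the Hash value seen when it was created."""

    def __init__(self, operand):
        self.ptr = operand
        self.hash = operand.hash

    def Ptr(self):
        target_hash = self.ptr.hash        # PtrSyncer.ptr.Obj.Hash in the text
        if self.hash != target_hash:       # the pointed-to operand was absorbed
            self.ptr = obj_map[target_hash]
            self.hash = target_hash
        return self.ptr


def absorb(first, second):
    """first (new output operand) takes over the placeholder of second (original)."""
    first.placeholder = second.placeholder  # local change
    second.hash = first.hash                # marks second as stale for lazy refresh


# Hypothetical scenario matching operands a and b above.
b = Operand("b", (0, 4))
ref_to_b = PtrSyncer(b)       # some other part of the graph references b
a = Operand("a", (4, 8))      # new output operand created during expansion
absorb(a, b)
resolved = ref_to_b.Ptr()     # refreshed on first access after the absorb
```

  • In this sketch nothing is rewritten eagerly: existing references are only redirected when they are next dereferenced, so the cost of the absorb operation itself does not grow with the number of references.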
  • Operator dependency judgment module: given a set S of basic operators, this module determines the sequential dependencies between the operators to form a calculation graph.
  • Dependency set of operator B: the set of all operators that B depends on. For every operator in the set, the intersection of its output placeholder and the input placeholder of operator B is not empty, and operator B depends on no operator outside the set. In other words, operator B can start executing only after all operators in its dependency set have been executed.
  • Notification set of operator B: the set of all operators that depend on B. For every operator in the set, the intersection of its input placeholder and the output placeholder of operator B is not empty, and no operator outside the set depends on operator B. That is, after operator B finishes executing, it can notify the operators in its notification set to check whether they meet the execution condition (that is, whether all operators in their dependency sets have been executed).
  • each operator has a unique identifier, represented by an integer. As shown in FIG. 10, the arrow from operator 7 to operator 5 means that operator 7 depends on operator 5.
  • the dependency set and notification set of the operator shown in Figure 10 are defined in Table 1:
  • because the output placeholders of all operators are disjoint, OP(a) is uniquely defined.
  • the input operand set and output operand set of the entire calculation graph can also be derived from the calculation graph, so the overall input and output operand sets may or may not be saved explicitly in the graph.
  • the input operand set is the set of operands that do not depend on any operator, and is the input data of the entire calculation graph;
  • the output operand set is the set of operands that are not referenced as input by any operator, and is the output data of the entire calculation graph.
  • method 1000 includes:
  • the first output placeholder set {ph_j, j ∈ [L, R)} includes all placeholders in the set {ph_i} whose intersection with ph is not empty, in other words, the first output placeholder
  • the operator set OP(S_A) is the dependency set of operator A. It should be understood that when L ≥ R, S_A is an empty set, that is, operator A has no dependency set.
  • S1020 and S1030 may be implemented through a "binary search” algorithm, and the complexity is O(log(n)).
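  • A minimal sketch of the binary search follows, assuming that the output placeholders are disjoint half-open intervals kept sorted by start, and that the input placeholder ph is a single interval. The function name `intersecting_range` and the sample data are hypothetical. It relies only on the standard `bisect` module.

```python
from bisect import bisect_left, bisect_right


def intersecting_range(outputs, ph):
    """Return (L, R) such that outputs[L:R] are exactly the intervals meeting ph.

    outputs: disjoint half-open intervals (start, stop), sorted by start.
    ph:      one half-open interval (x, y) used as an operator input.
    """
    x, y = ph
    starts = [s for s, _ in outputs]
    stops = [e for _, e in outputs]   # also sorted, because the intervals are disjoint
    left = bisect_right(stops, x)     # first interval that ends after x
    right = bisect_left(starts, y)    # first interval that starts at or after y
    return left, right


# Hypothetical output placeholders of three already assembled operators.
outputs = [(0, 4), (4, 8), (8, 12)]
producers = ["op0", "op1", "op2"]

L, R = intersecting_range(outputs, (3, 9))
dependency_set = producers[L:R]

L_empty, R_empty = intersecting_range(outputs, (12, 16))
```

  • The two lists are rebuilt on every call here only to keep the example short; the O(log(n)) cost applies when `starts` and `stops` are maintained once and reused across queries.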
  • the dependency set of operator A may also be determined with reference to the method 1100 shown in FIG. 12 .
  • method 1100 includes:
  • the set ret determined in S1120 is the first output placeholder set, that is, the output placeholder set S_A that intersects the input placeholder of operator A.
  • if S_A is an empty set, operator A has no dependency set, and the input placeholder of operator A is used to store the actual input data of the computing task.
  • the data processing method provided by the embodiment of this application constructs an overall computing task by determining the set of basic operators and their dependencies, so that the computing engine can execute the overall computing task that only includes basic operators, thereby realizing the construction of computing tasks Decoupling from computing task execution.
  • computing engines of different hardware platforms can be implemented based on the same computing task determined in the embodiment of the present application, so heterogeneous computing can be realized.
  • Graph optimization module It is used to optimize the calculation graph, specifically, it can perform equivalent transformation on the calculation graph, reduce repeated calculations, and improve execution performance.
  • graph optimization algorithms such as clipping useless calculation branches, operator deduplication, automatic precomputation, and automatic SIMD optimization can be used, which are not specifically limited in this embodiment of the present application.
  • Calculation engine: used to execute the calculation tasks described by the calculation graph and to process the input data of the overall calculation task into output data.
  • the actual execution process of the calculation engine is as follows:
  • calculation starts from the operators that have no dependencies, and the intermediate result of each operator is saved in the actual storage space of that operator's output placeholder. This continues until all operators have been calculated.
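  • The execution process described above can be sketched with dependency sets and notification sets. This is an illustrative single-threaded scheduler, not the engine of the application: the function `execute`, the dictionary-based store, and the four-operator graph are all hypothetical.

```python
from collections import deque


def execute(operators, dependencies, store):
    """Run every operator once, starting from those without dependencies.

    operators:    id -> function(store), which writes its result into store
    dependencies: id -> set of operator ids that must finish first
    """
    pending = {op: set(dependencies[op]) for op in operators}
    notify = {op: set() for op in operators}      # notification sets
    for op, deps in dependencies.items():
        for d in deps:
            notify[d].add(op)

    ready = deque(sorted(op for op in operators if not pending[op]))
    order = []
    while ready:
        op = ready.popleft()
        operators[op](store)                      # result lands in the output slot
        order.append(op)
        for nxt in sorted(notify[op]):            # notify dependants
            pending[nxt].discard(op)
            if not pending[nxt]:                  # all of its dependencies are done
                ready.append(nxt)
    if len(order) != len(operators):
        raise ValueError("dependency cycle: the graph is not a DAG")
    return order


# Hypothetical graph: operator 3 needs 1 and 2, operator 4 needs 3.
ops = {
    1: lambda s: s.__setitem__("a", s["x"] + 1),
    2: lambda s: s.__setitem__("b", s["y"] * 2),
    3: lambda s: s.__setitem__("c", s["a"] + s["b"]),
    4: lambda s: s.__setitem__("z", s["c"] * s["c"]),
}
deps = {1: set(), 2: set(), 3: {1, 2}, 4: {3}}
store = {"x": 1, "y": 3}
order = execute(ops, deps, store)
```

  • Operators 1 and 2 are both ready at the start, so an engine on parallel hardware could run them concurrently; the sketch simply processes the ready queue in order.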
  • FIG. 13 shows a schematic flowchart of a data processing method 1300 provided by an embodiment of the present application.
  • the method 1300 can be applied to the HLG architecture shown in FIG. 4 or to the HLG framework shown in FIG. 5, which is not limited in the embodiments of the present application.
  • the method 1300 may be executed by the HLG framework shown in FIG. 5. Specifically, the method 1300 includes:
  • defining the operands of the input data can be performed by the "operand/placeholder allocation module" in the above-mentioned embodiments. It should be understood that all or part of the input operands of all subsequent operators come from the operands defined in this step.
  • the high-order operators in the set of high-order operators for constructing computing tasks may be determined by the "operator assembling module" in the above-mentioned embodiments.
  • the operator assembly module uses the operator assembly function to use the operand selected in S1310 as the input operand of the operator, and generates a corresponding high-order operator in combination with the operator type.
  • for example, the high-order operator B depends on the high-order operator A. After the operator assembly module is called to generate the high-order operator A, the output operand of A is used as one of the input operands of B to assemble the high-order operator B.
  • the output operand of a high-order operator is an operand newly allocated by the "operand/placeholder allocation module".
  • the placeholder of this output operand has no intersection with the placeholders of any operand of the input data defined in S1310, nor with the output placeholders of other high-order operators that have already been allocated.
  • expanding a set of high-order operators into a set of basic operators may be performed by the "operator expansion module" in the foregoing embodiments.
  • the operator expansion module uses the operator expansion function to expand each high-order operator into a corresponding set of basic operators.
  • for example, the first high-order operator is first expanded into a set of sub-order operators, and each sub-order operator in that set is then expanded in turn until all sub-order operators have been expanded into basic operators.
  • the operator assembly function is called to assemble the sub-order operator and the basic operator, that is, to determine the input operands and output operands of the sub-order operator and the basic operator.
  • the placeholders of the output operands of an operator are disjoint with the placeholders of the existing operands, that is, the output operands of each operator are new output operands.
  • for example, operator A is expanded into operators a1, a2, a3, and a4.
  • during expansion, the operator assembly function is called to generate a1, a2, and a4 respectively, and the operator assembly module is then called again to generate a3 from the output operands of a1 and a2 and the operands of the input data defined in S1310.
  • the operator assembly module may also combine the output operands of operators a1 and a2 into a new operand that serves as an input operand of operator a3.
  • in the process of expanding operator A into operators a1 to a4, both a3 and a4 may depend on operator a1, that is, both a3 and a4 are in the notification set of a1. In this case, the output operand of operator a1 generated by the operator assembly module may be split into output operand 1 and output operand 2.
  • output operand 1 of a1 is used as an input operand of operator a3, and output operand 2 of a1 is used as an input operand of operator a4.
  • splitting and merging of operands may be understood as splitting and merging of the placeholder sets of the operands.
  • the dependency relationships between operators a1, a2, and a3 may be destroyed as these operators are expanded further, so it is necessary to determine the dependency information of the basic operator set finally obtained by expansion.
  • the dependency relationship information of the basic operator set refers to the dependency relationship of each basic operator in the basic operator set.
  • determining the dependency information of the basic operator set is performed by the "operator dependency judging module" in the above embodiment.
  • a computing task is represented by N high-order operators, and the set of basic operators is determined by expanding the N high-order operators. Specifically, the N high-order operators are expanded into N corresponding basic operator sets. To ensure that the computing task represented by the basic operator sets after expansion is the same as the computing task represented by the high-order operators, the output placeholder of each high-order operator must be the same as the output placeholder of the basic operator set it expands into. Taking the expansion of the first high-order operator into the first basic operator set as an example, the method 1000 or method 1100 in the above embodiment is used to determine the dependency information of the first basic operator set, and then the output placeholder of the first basic operator set is determined.
  • if the output placeholder of the first basic operator set is inconsistent with the output placeholder of the first high-order operator,
  • the output placeholder of the first basic operator set is updated to the output placeholder of the first high-order operator through the "absorb" operation in the above embodiment.
  • the dependency relationship information of the basic operator set is determined according to the N basic operator sets and the dependency relationship information of the N basic operator sets.
  • when the calculation engine performs calculations on data, it only needs to execute the calculation graph composed of the above-mentioned set of basic operators to obtain calculation results, without knowing the calling relationship between high-order operators and basic operators in the algebra library.
  • the embodiment of the present application provides a data processing method that uses basic operators and the dependencies between them to represent computing tasks, so that the calculation engine can complete calculations by directly executing basic operators, thereby decoupling the construction of computing tasks based on high-order operators from the execution of computing tasks. Furthermore, software and hardware engineers do not need to consider the calling relationship between high-order operators and basic operators when optimizing basic operators. In addition, since the calculation graph represents the dependencies between basic operators and has nothing to do with the specific hardware platform, the algorithm architecture can be migrated across hardware platforms, which in turn supports heterogeneous acceleration on multiple hardware platforms and further improves the calculation speed of data processing.
  • FIG. 14 shows a schematic flow chart of a data processing method 1400 provided by an embodiment of the present application.
  • the method 1400 can be applied to the HLG architecture shown in FIG. 4 or to the HLG framework shown in FIG. 5, which is not limited in the embodiments of the present application.
  • the method 1400 may be executed by a calculation engine in the HLG framework shown in FIG. 5, specifically, the method 1400 includes:
  • the computing platform acquires data and a computing task description, where the computing task description includes calculations that need to be performed on the data.
  • the above data may be ciphertext data, or may be plaintext data, or may be mixed plaintext and ciphertext data, which is not specifically limited in the present application.
  • the computing platform selects a first operator set according to the computing task description, where the first operator set includes multiple basic operators.
  • the first set of operators may be obtained through expansion of high-order operators.
  • for the specific process of obtaining the first set of operators through expansion of high-order operators, reference may be made to the descriptions in the foregoing embodiments, and details are not repeated here.
  • the computing platform executes the first set of operators to calculate the data according to the dependencies of each basic operator in the first set of operators, and obtains the calculation result of the data, wherein, of two basic operators having a dependency,
  • the input operands of one basic operator include the output operands of the other basic operator.
  • the calculation engine performs related calculations on the data according to the multiple basic operators in the first operator set and the dependencies between them. It should be understood that the dependencies between the multiple basic operators in the first operator set are equivalent to the computing task description.
  • the placeholder of the operand in this application may be one placeholder or a set of multiple placeholders, which is not specifically limited in this application.
  • a dependency between operator a and operator b may mean that operator b can only be executed after operator a has been executed, that is, the output placeholder of operator a intersects the input placeholder of operator b.
  • operator a can also be called a dependent operator of operator b, or operator b can be called a notification operator of operator a.
  • the calculation engine starts to perform calculations from the basic operators in the first operator set that have no dependent operators until all the basic operators in the first operator set are executed to obtain the calculation results of the above data.
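The execution order described above can be sketched with a ready queue: operators with no dependent operators run first, and each finished operator releases its notification operators. The tuple layout `(name, input_ids, output_id, fn)` and the use of plain integers in place of ciphertext are assumptions made for the sketch.

```python
from collections import deque


def run_graph(ops, store):
    """Execute basic operators in dependency order.

    ops   : list of (name, input_ids, output_id, fn) tuples (hypothetical layout)
    store : dict mapping placeholder id -> value; initially holds the input data
    """
    producer = {out: i for i, (_, _, out, _) in enumerate(ops)}
    # deps[i]: operators whose output placeholder is read by operator i
    deps = [{producer[p] for p in ins if p in producer} for _, ins, _, _ in ops]
    # notify[i]: operators that become closer to ready once operator i finishes
    notify = [[] for _ in ops]
    for i, ds in enumerate(deps):
        for d in ds:
            notify[d].append(i)

    pending = [len(ds) for ds in deps]
    ready = deque(i for i, n in enumerate(pending) if n == 0)
    order = []
    while ready:
        i = ready.popleft()
        name, ins, out, fn = ops[i]
        store[out] = fn(*(store[p] for p in ins))
        order.append(name)
        for j in notify[i]:
            pending[j] -= 1
            if pending[j] == 0:
                ready.append(j)
    if len(order) != len(ops):
        raise ValueError("dependency cycle in operator set")
    return order


# Operators are listed out of order on purpose; placeholders 0 and 1 hold data.
ops = [
    ("add", (3, 1), 2, lambda a, b: a + b),
    ("neg", (4,), 3, lambda a: -a),
    ("mul", (0, 1), 4, lambda a, b: a * b),
]
store = {0: 3, 1: 5}
order = run_graph(ops, store)
```

The engine never consults a high-order operator here; the placeholder ids alone determine what runs when.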
  • the overall computing task can be known through the calculation graph composed of the first set of operators. Further, the calculation engine can start executing from the basic operators that have no dependent operators, and the calculation is complete once all the basic operators in the calculation graph have been executed, without relying on the upper-level high-order operators, so the construction of computing tasks based on high-order operators can be decoupled from the execution of computing tasks. Furthermore, software and hardware engineers do not need to consider the call relationship between high-order operators and basic operators when optimizing basic operators.
  • since the calculation graph represents the dependencies between the basic operators of the computing task and has nothing to do with the specific hardware platform, the algorithm architecture can be migrated across hardware platforms, which in turn supports heterogeneous acceleration on multiple hardware platforms and further improves the calculation speed of processing ciphertext data.
  • the data processing method and computing platform based on the HLG framework provided by the embodiment of this application can be used for fully homomorphic encryption algorithms, other cryptographic algorithms, or other scenarios.
  • the embodiments of this application do not specifically limit this.
  • as long as the computing task supports arbitrary splitting and combination of data, supports expansion from high-level operators to low-level operators, supports user-defined granularity of operator expansion, and does not need to consider control logic, the data processing method and computing platform in the embodiments of this application can be used.
  • the HLG framework provided by the embodiment of the present application can also be used as a random white-box cipher, code obfuscation scheme generation framework, or used to describe the calculation graph of symmetric cryptographic algorithms such as AES. This is not limited.
  • Fig. 15 is a schematic block diagram of a computing platform provided by an embodiment of the present application.
  • the computing platform 2000 includes a transceiver unit 2010 , a first processing unit 2020 and a second processing unit 2030 .
  • the transceiver unit 2010 can implement a corresponding communication function, and the first processing unit 2020 and the second processing unit 2030 are used for data processing.
  • the computing platform 2000 may also include a storage unit, which may be used to store instructions and/or data, and the processing unit 2020 may read the instructions and/or data in the storage unit, so that the device implements the aforementioned method embodiments.
  • the computing platform 2000 may include units for performing the method in FIG. 14 . Moreover, each unit in the computing platform 2000 and the above-mentioned other operations and/or functions are respectively for realizing the corresponding process of the method embodiment in FIG. 14 .
  • the transceiver unit 2010 can be used to execute S1410 in the method 1400
  • the first processing unit 2020 can be used to execute S1420 in the method 1400
  • the second processing unit 2030 can be used to execute S1430 in the method 1400.
  • the device includes: a transceiver unit 2010, configured to acquire data and a calculation task description, where the calculation task description includes calculations that need to be performed on the data; a first processing unit 2020, configured to select a first set of operators according to the calculation task description, where the first set of operators includes a plurality of basic operators; and a second processing unit 2030, configured to execute the first set of operators to perform calculation on the data according to the dependency relationship of each basic operator in the first set of operators and obtain the calculation result of the data, in which the placeholder of the input operand of one of the two basic operators having a dependency intersects the placeholder of the output operand of the other basic operator.
  • the second processing unit 2030 is further configured to: determine the first basic operator in the first operator set according to the dependency relationship of each basic operator in the first operator set, where the first basic operator is a basic operator whose input placeholder has an empty intersection with the output placeholder set of any basic operator in the first operator set; store the data in the storage space corresponding to the input placeholder set of the first basic operator; and execute the first basic operator on the data and, according to the dependency relationship of each basic operator in the first operator set, execute the other basic operators in the first operator set except the first basic operator.
  • the first operator set and the dependency relationship of each basic operator in the first operator set are determined according to N basic operator sets, the N basic operator sets are in one-to-one correspondence with N high-order operators, and calculations on the data can be realized through the N high-order operators.
  • the first set of basic operators is obtained by expanding the first high-order operators.
  • the first high-order operator is any high-order operator among the N high-order operators, and the first basic operator set is the basic operator set corresponding to the first high-order operator among the N basic operator sets;
  • the output placeholders of the first basic operator set are the same as the output placeholders of the first high-order operator.
  • the N sets of basic operators are in one-to-one correspondence with N pieces of dependency information, and the N pieces of dependency information are used to indicate the dependencies of each basic operator in the corresponding set of basic operators,
  • the first operator set and the dependency relationship of each basic operator in the first operator set are obtained in the following manner: according to the N pieces of dependency information, target dependency information is determined, where the target dependency information is used to indicate the dependency relationship of each basic operator among the multiple basic operators; the first operator set is determined according to the target dependency information and the basic operators included in the N basic operator sets, where the target dependency information is the dependency relationship of each basic operator in the first operator set.
  • the output placeholder of the first basic operator set is made the same as the output placeholder of the first high-order operator through an absorb operation
  • the absorb operation includes: according to first dependency information, determining the output placeholder of the first basic operator set, where the first dependency information is used to indicate the dependency of each basic operator in the first basic operator set; determining whether the output placeholder of the first basic operator set is the same as the output placeholder of the first high-order operator; and if the output placeholder of the first basic operator set is different from the output placeholder of the first high-order operator, pointing the pointer of the output operand of the first basic operator set to the output placeholder of the first high-order operator and updating the second identification information to the first identification information
  • the first identification information is the identification information of the output operand of the first basic operator set, and the second identification information is the identification information of the output operand of the first high-order operator.
  • the absorb operation further includes: after updating the second identification information to the first identification information, when the pointer of the output operand of the first high-order operator is accessed by a pointer synchronizer, judging whether the identification information of the pointer synchronizer is equal to the second identification information; when the identification information of the pointer synchronizer is not equal to the second identification information, updating the pointer of the pointer synchronizer to the pointer of the output operand of the first basic operator set, and updating the identification information of the pointer synchronizer to the second identification information.
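One possible reading of the absorb operation and the lazy refresh through a pointer synchronizer is sketched below. The source does not give data structures, so the `Operand` and `PointerSynchronizer` classes, the `replacement` field (how the synchronizer finds the new operand) and the string-valued placeholders are all assumptions made for illustration.

```python
class Operand:
    """An operand: identification info plus a pointer to a placeholder."""

    def __init__(self, ident, placeholder):
        self.ident = ident
        self.pointer = placeholder
        self.replacement = None  # assumed field: operand that absorbed this one


class PointerSynchronizer:
    """Cached view of an operand that refreshes itself lazily on access."""

    def __init__(self, operand):
        self.operand = operand
        self.ident = operand.ident
        self.pointer = operand.pointer

    def access(self):
        # A mismatch means the operand was absorbed since the last access.
        if self.ident != self.operand.ident:
            self.pointer = self.operand.replacement.pointer
            self.ident = self.operand.ident
        return self.pointer


def absorb(basic_out, high_out):
    """Make the basic operator set's output use the high-order output placeholder."""
    if basic_out.pointer == high_out.pointer:
        return False  # placeholders already agree, nothing to do
    basic_out.pointer = high_out.pointer   # redirect the new output operand
    high_out.replacement = basic_out
    high_out.ident = basic_out.ident       # second id updated to first id
    return True


high_out = Operand("id-high", "p_high")    # output operand of the high-order operator
sync = PointerSynchronizer(high_out)       # an existing reference to that operand
basic_out = Operand("id-basic", "p_new")   # output operand of the basic operator set

changed = absorb(basic_out, high_out)
ptr = sync.access()                        # first access after absorb triggers a refresh
```

The identification comparison lets every existing reference detect the change on its next access, so no global pass over all references is needed at absorb time.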
  • the dependency relationship of each basic operator in the first basic operator set is determined in the following manner: determining whether the input placeholder of the second basic operator intersects at least one output placeholder in the first output placeholder set, where the first output placeholder set includes the output placeholders of all basic operators in the first basic operator set and the second basic operator is any basic operator in the first basic operator set; if the input placeholder of the second basic operator intersects at least one output placeholder in the first output placeholder set, determining that the basic operator corresponding to the at least one output placeholder has a dependency relationship with the second basic operator.
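The intersection test above can be sketched directly with sets. The dictionary layout, the function names and the placeholder ids are illustrative assumptions; the pairwise scan is quadratic and is meant only to show the rule, not an efficient implementation.

```python
def dependency_info(ops):
    """Map each operator name to the set of operators it depends on.

    ops: dict name -> (input_placeholders, output_placeholders), both sets.
    """
    deps = {}
    for name, (ins, _) in ops.items():
        deps[name] = {
            other
            for other, (_, outs) in ops.items()
            if other != name and ins & outs  # non-empty intersection => dependency
        }
    return deps


def notification_info(deps):
    """Invert the dependency map: who must be notified when an operator finishes."""
    notify = {name: set() for name in deps}
    for name, ds in deps.items():
        for d in ds:
            notify[d].add(name)
    return notify


ops = {
    "a1": ({0}, {10, 11}),
    "a2": ({1}, {20}),
    "a3": ({10, 20}, {30}),  # reads part of a1's output and all of a2's output
    "a4": ({11}, {40}),      # reads the other part of a1's output
}
deps = dependency_info(ops)
notify = notification_info(deps)
```

Operators whose dependency set is empty (`a1` and `a2` here) are the ones from which execution can start.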
  • the output placeholder of the first operator is determined according to the input operand of the first operator and the type of the first operator, the first operator is any operator obtained during expansion of the first high-order operator, and the intersection of the output placeholder of the first operator and the already assigned output placeholders is empty.
  • the transceiver unit 2010, the first processing unit 2020, and the second processing unit 2030 in FIG. 15 may be realized by one computer device, or may be realized by multiple computer devices. More specifically, the processing unit may be implemented by at least one processor or processor-related circuits, the transceiver unit may be implemented by a transceiver or transceiver-related circuits, and the storage unit may be implemented by at least one memory.
  • FIG. 16 is a schematic block diagram of a data processing device according to an embodiment of the present application.
  • the data processing apparatus 2100 shown in FIG. 16 may include: a processor 2110 , a transceiver 2120 and a memory 2130 .
  • the processor 2110, the transceiver 2120 and the memory 2130 are connected through an internal connection path, the memory 2130 is used to store instructions, the processor 2110 is used to execute the instructions stored in the memory 2130, and the transceiver 2120 receives/sends some parameters.
  • the memory 2130 may be coupled to the processor 2110 through an interface, or may be integrated with the processor 2110 .
  • the transceiver 2120 may include, but is not limited to, a transceiver device such as an input/output interface, so as to realize communication between the communication device 2100 and other devices or communication networks.
  • each step of the above method may be implemented by an integrated logic circuit of hardware in the processor 2110 or instructions in the form of software.
  • the methods disclosed in the embodiments of the present application may be directly implemented by a hardware processor, or implemented by a combination of hardware and software modules in the processor.
  • the software module can be located in a storage medium that is mature in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory 2130, and the processor 2110 reads the information in the memory 2130, and completes the steps of the above method in combination with its hardware. To avoid repetition, no detailed description is given here.
  • the memory may include a read-only memory and a random access memory, and provide instructions and data to the processor.
  • a portion of the memory may also include non-volatile random access memory.
  • the memory may also store device type information.
  • An embodiment of the present application also provides a computer-readable storage medium, where the computer-readable medium stores program codes, and when the computer program codes are run on a computer, the computer is made to execute the above-mentioned method in FIG. 14 .
  • An embodiment of the present application also provides a chip, including: at least one processor and a memory, the at least one processor is coupled to the memory and is used to read and execute instructions in the memory, so as to execute the above-mentioned method in FIG. 14.
  • the present application presents various aspects, embodiments or features in terms of a system comprising a number of devices, components, modules and the like. It is to be understood and appreciated that the various systems may include additional devices, components, modules, etc. and/or may not include all of the devices, components, modules etc. discussed in connection with the figures. Additionally, combinations of these schemes can also be used.
  • the network architecture and business scenarios described in the embodiments of the present application are for more clearly illustrating the technical solutions of the embodiments of the present application, and do not constitute limitations on the technical solutions provided by the embodiments of the present application.
  • the technical solutions provided by the embodiments of this application are also applicable to similar technical problems.
  • references to “one embodiment” or “some embodiments” or the like in this specification mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application.
  • appearances of the phrases “in one embodiment,” “in some embodiments,” “in other embodiments,” “in still other embodiments,” etc. in various places in this specification do not necessarily all refer to the same embodiment, but mean “one or more but not all embodiments” unless specifically stated otherwise.
  • the terms “including”, “comprising”, “having” and variations thereof mean “including but not limited to”, unless specifically stated otherwise.
  • “At least one” means one or more, and “multiple” means two or more.
  • “And/or” describes the association relationship of associated objects, indicating that there may be three types of relationships, for example, A and/or B, which may indicate: including the existence of A alone, the existence of A and B at the same time, and the existence of B alone, where A, B can be singular or plural.
  • the character “/” generally indicates that the contextual objects are an “or” relationship.
  • “At least one of the following” or similar expressions refer to any combination of these items, including any combination of single or plural items.
  • For example, at least one item (piece) of a, b, or c can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, c can be single or multiple.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • for example, multiple units or components can be combined or integrated into another system, or some features may be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions described above are realized in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions used to make a computer device (which may be a personal computer, a server, or a network device, etc.) execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include: a USB flash drive, a removable hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), a magnetic disk, an optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Operations Research (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioethics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Hardware Design (AREA)
  • Algebra (AREA)
  • Computer Security & Cryptography (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

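The present application provides a data processing method and a computing platform. The method includes: a computing platform acquires data and a computing task description, where the computing task description includes the computation to be performed on the data; the computing platform selects a first operator set according to the computing task description, where the first operator set contains multiple basic operators; and the computing platform, according to the dependencies of the basic operators in the first operator set, executes the first operator set to compute on the data and obtains a computation result of the data, where, of two basic operators having a dependency, the placeholder of an input operand of one basic operator intersects the placeholder of an output operand of the other basic operator. The data processing method provided in the present application decouples the construction of a computing task based on high-order operators from the execution of the computing task, and can also increase the computation speed of data processing.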

Description

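Data processing method and computing platform
This application claims priority to Chinese Patent Application No. 202111681763.3, filed with the China Patent Office on December 30, 2021 and entitled "Data processing method and computing platform", which is incorporated herein by reference in its entirety.
Technical Field
The present application relates to the field of data processing, and more specifically, to a data processing method and computing platform.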
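Background
Over the full life cycle of data, data at rest and data in transit can be well protected with traditional cryptography. Data in use, however, cannot be protected effectively: data encrypted in the traditional way cannot take part in computation and must be decrypted before it can be used. In scenarios such as cloud computing, customers do not trust cloud service providers enough to hand over sensitive data for storage and processing, because they worry that data on the cloud will be leaked through insider attacks or security vulnerabilities.
In this situation, technologies for computing on encrypted data offer corresponding solutions. One of them is cryptography-based: data can be computed on while it remains encrypted, and the party executing the computation does not know what data it is computing on, which logically isolates the plaintext. Among the cryptography-based schemes, fully homomorphic encryption relies on hard mathematical problems and has rigorous security proofs, so its security is relatively high. However, fully homomorphic encryption is slow to compute, so in practice it may need to be accelerated through parallel acceleration or heterogeneous acceleration.
However, in existing algorithm architectures based on homomorphic encryption, when a user of the cryptographic algorithm library calls high-order operators of the homomorphic encryption algorithm library to describe a computing task and perform the computation, the interface of the hardware platform directly calls the code of the basic operators in the algebra library to carry out the computation. On the one hand, the basic operators in the algebra library are generally written for a specific hardware platform and cannot support optimization or acceleration of the basic operators on other hardware platforms. On the other hand, the function-call logic between the high-order operators in the cryptographic algorithm library and the basic operators in the algebra library is written by cryptographers themselves, which requires them to fully understand the concrete code implementation of the basic operators and the characteristics of the hardware platform when writing that logic. Therefore, achieving heterogeneous acceleration across multiple hardware platforms with the current homomorphic encryption algorithm architecture requires not only that software and hardware engineers rewrite specific basic operators for the new hardware platform, but also that cryptography experts understand the concrete code implementation of the new basic operators and the characteristics of the new hardware platform.
In addition, because of the call logic between high-order operators and basic operators in the current homomorphic encryption algorithm architecture, the calculation engine cannot learn the overall computing task. It can only execute the computing task according to the function-call relationships and the order in which operators appear, and cannot merge operators of the same type within one computing task for execution, which may waste computing power.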
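Summary
The present application provides a data processing method and a computing platform. Based on an algorithm architecture that can provide a calculation graph describing the overall computing task, the calculation engine can learn the overall computing task, so that the construction of a computing task based on high-order operators can be decoupled from the execution of the computing task. Further, operators of the same type can be merged for execution and heterogeneous acceleration across multiple hardware platforms can be achieved, further increasing the computation speed of data processing.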
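Once the whole calculation graph is known, operators of the same type that do not depend on one another can be grouped for joint execution. The sketch below is one hypothetical way to form such groups, by dependency depth and operator kind; the function name, the input layout and the grouping policy are assumptions, not the method of the present application.

```python
from collections import defaultdict


def batches_by_type(kinds, deps):
    """Group operators that can run together and share an operator type.

    kinds: dict name -> operator kind
    deps : dict name -> set of names the operator depends on (acyclic)
    Returns a list of levels; each level maps kind -> sorted operator names.
    """
    level = {}

    def depth(name):
        # Depth 0 for operators without dependencies, else 1 + deepest dependency.
        if name not in level:
            level[name] = 1 + max((depth(d) for d in deps[name]), default=-1)
        return level[name]

    for name in kinds:
        depth(name)

    grouped = defaultdict(lambda: defaultdict(list))
    for name, kind in kinds.items():
        grouped[level[name]][kind].append(name)
    return [
        {kind: sorted(names) for kind, names in grouped[lv].items()}
        for lv in sorted(grouped)
    ]


kinds = {"m1": "mul", "m2": "mul", "a1": "add", "m3": "mul"}
deps = {"m1": set(), "m2": set(), "a1": {"m1", "m2"}, "m3": {"a1"}}
batches = batches_by_type(kinds, deps)
```

In this example `m1` and `m2` land in the same batch, while `m3` cannot join them because it depends on `a1`.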
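In a first aspect, a data processing method is provided. Specifically, the method may include: a computing platform acquires data and a computing task description, where the computing task description includes the computation to be performed on the data; the computing platform selects a first operator set according to the computing task description, where the first operator set contains multiple basic operators; and the computing platform, according to the dependencies of the basic operators in the first operator set, executes the first operator set to compute on the data and obtains a computation result of the data, where, of two basic operators having a dependency, the placeholder of an input operand of one basic operator intersects the placeholder of an output operand of the other basic operator.
Specifically, the computing platform may include one or more computer devices.
In some possible implementations, all steps of the above method are performed by one computer device.
In some possible implementations, the computing platform includes multiple computer devices, and the steps of the above method are performed by the multiple computer devices. For example, after acquiring the data and the computing task description, a first computer device inputs the computing task description to a second computer device and inputs the data to a third computer device; after selecting the first operator set, the second computer device inputs the first operator set to the third computer device, and the third computer device then executes the first operator set to compute on the data. For example, the third computer device may have one or more hardware platforms, and the multiple basic operators in the first operator set may be executed by one hardware platform or separately by multiple hardware platforms. For example, the data acquired by the first computer device and the first operator set selected by the second computer device may also be input to multiple computer devices, which may have one or more hardware platforms; executing the first operator set to compute on the data may then be performed by the multiple computer devices. Specifically, each of the multiple computer devices may use one hardware platform or multiple hardware platforms to execute the corresponding basic operators, which is not specifically limited in the present application.
It should be understood that when multiple hardware platforms or multiple computer devices execute the first operator set at the same time, data processing can be sped up.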
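In some possible implementations, one or more computer devices in the computing platform may be computer devices in the cloud. For example, all computer devices in the computing platform are cloud computer devices, and they are used in the data processing module of the cloud to process ciphertext and/or plaintext data. For example, if the third computer device or the multiple computer devices that execute the first operator set to compute on the data are cloud computer devices, those computer devices are used in the data processing module of the cloud to process ciphertext and/or plaintext data.
For example, in scenarios that require data processing, the computing platform acquires the data to be processed and the computing task description, and then performs the above steps to obtain the computation result. In one example, the computing platform may acquire ciphertext electronic medical record data and a computing task description of the processing required on that data, and perform the above steps to obtain the computation result. In another example, in federated learning *** the platform acquires ciphertext data from the participants and uses that ciphertext data to improve the performance of its own model. It should be understood that these are merely illustrative examples, and the present application does not specifically limit the application scenarios of the provided data processing method.
More specifically, executing the first operator set to compute on the data may be performed by a calculation engine in the computer device.
For example, the calculation engine performs the relevant computation on the data according to the multiple basic operators in the first operator set and the dependencies between them. It should be understood that the dependencies between the multiple basic operators in the first operator set are equivalent to the computing task description. The data may be ciphertext data, plaintext data, or a mixture of plaintext and ciphertext data, which is not specifically limited in the present application.
It should be noted that the placeholder of an operand in the present application may be a single placeholder or a set of multiple placeholders, which is not specifically limited in the present application.
In some possible implementations, a dependency between operator a and operator b may mean that operator b can be executed only after operator a has finished executing, that is, the output placeholder of operator a intersects the input placeholder of operator b. It should be noted that operator a may also be called a dependent operator of operator b, and operator b may be called a notification operator of operator a. For example, that the placeholder of an input operand of one of two basic operators having a dependency intersects the placeholder of an output operand of the other may specifically mean: the input operands of one operator include the output operand of the other basic operator; or, part of the output operand of one operator is an input operand of the other operator, that is, the output operand of an operator may be split into multiple operands, each of which may serve as an input operand of another operator; or, the output operand of one operator is part of an input operand of the other operator, that is, the output operand of one operator may be merged with the output operands of other operators into a new operand, which may serve as an input operand of the other operator; or, part of the output operand of one operator is part of an input operand of the other operator, that is, the output operand of an operator may be split into multiple operands, one of which may be merged with other operands into a new operand, which may serve as an input operand of the other operator.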
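In the above technical solution, when executing a computing task, the computing platform can learn the overall computing task from the calculation graph formed by the first operator set. Further, the computing platform can, based on the calculation graph, start executing from the basic operators that have no dependent operators, and the computation is complete once all basic operators in the calculation graph have been executed, without relying on the upper-level high-order operators. The construction of a computing task based on high-order operators can therefore be decoupled from the execution of the computing task. Moreover, software and hardware engineers need not consider the call relationships between high-order operators and basic operators when optimizing basic operators. In addition, because the calculation graph represents the dependencies between the basic operators of the computing task and is independent of the hardware platform on which it is implemented, the algorithm architecture can be migrated across hardware platforms, which in turn supports heterogeneous acceleration across multiple hardware platforms and further increases the computation speed of processing ciphertext data.
With reference to the first aspect, in some implementations of the first aspect, the method further includes: the computing platform determines, according to the dependencies of the basic operators in the first operator set, a first basic operator in the first operator set, where the first basic operator is a basic operator whose input placeholder has an empty intersection with the output placeholder of every basic operator in the first operator set; the computing platform stores the data in the storage space corresponding to the input placeholder set of the first basic operator; and the computing platform executes the first basic operator on the data and, according to the dependencies of the basic operators in the first operator set, executes the basic operators in the first operator set other than the first basic operator.
It should be understood that the computing platform may include one computer device or multiple computer devices.
It should be noted that the input placeholder of an operator denotes the union of the placeholders of all input operands of the operator, and an input placeholder set denotes the set formed by the input placeholders of multiple operators; the output placeholder of an operator denotes the union of the placeholders of the output operands of the operator, and an output placeholder set denotes the set formed by the output placeholders of multiple operators.
In some possible implementations, there may be multiple first basic operators in the first basic operator set; the calculation engine then determines, according to the input operands of the multiple first basic operators, which input placeholders in the input placeholder set correspond to the storage space in which the data is to be stored.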
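With reference to the first aspect, in some implementations of the first aspect, the first operator set and the dependencies of the basic operators in the first operator set are determined according to N basic operator sets, the N basic operator sets are in one-to-one correspondence with N high-order operators, and the computation on the data can be implemented through the N high-order operators. A first basic operator set is obtained by expanding a first high-order operator, the first high-order operator is any one of the N high-order operators, and the first basic operator set is the basic operator set among the N basic operator sets that corresponds to the first high-order operator; the output placeholder of the first basic operator set is the same as the output placeholder of the first high-order operator.
In some possible implementations, a computing task needs to be described by multiple high-order operators, and dependencies exist among those high-order operators. In the present application, each high-order operator is expanded in advance into a corresponding basic operator set according to a user-defined operator expansion function. In some possible implementations, the expansion is performed "level by level": a high-order operator is first expanded into multiple sub-order operators, and the sub-order operators are then expanded further until they have been expanded into user-defined basic operators.
It should be understood that, to ensure that the computing task represented after operator expansion is the same as the computing task represented by the multiple high-order operators, the output placeholder of the basic operator set obtained by expanding a high-order operator is required to be consistent with the output placeholder of that high-order operator.
It should be noted that each of the N basic operator sets contains one or more basic operators; that is, the total number of basic operators contained in the N basic operator sets is greater than or equal to N, the total number of high-order operators before expansion.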
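With reference to the first aspect, in some implementations of the first aspect, the N basic operator sets are in one-to-one correspondence with N pieces of dependency information, and each piece of dependency information indicates the dependencies of the basic operators in the corresponding basic operator set. The first operator set and the dependencies of the basic operators in the first operator set are obtained as follows: target dependency information is determined according to the N pieces of dependency information, where the target dependency information indicates the dependencies of the basic operators among multiple basic operators; and the first operator set is determined according to the target dependency information and the basic operators included in the N basic operator sets, where the target dependency information is the dependencies of the basic operators in the first operator set.
It should be understood that after each high-order operator is expanded into its corresponding basic operator set, dependencies exist among the basic operators in that set. In some possible implementations, once the N pieces of dependency information corresponding to the N basic operator sets have been determined, the dependencies among the N basic operator sets are determined according to the input placeholder set and output placeholder set of each of the N basic operator sets. In some possible implementations, the N basic operator sets and the dependencies among them may also be optimized, that is, duplicate basic operators or dependencies in the N basic operator sets are merged or removed.
With reference to the first aspect, in some implementations of the first aspect, the output placeholder of the first basic operator set is made the same as the output placeholder of the first high-order operator through an absorb operation. The absorb operation includes: determining the output placeholder of the first basic operator set according to first dependency information, where the first dependency information indicates the dependencies of the basic operators in the first basic operator set; determining whether the output placeholder of the first basic operator set is the same as the output placeholder of the first high-order operator; and, if they are not the same, pointing the pointer of the output operand of the first basic operator set to the output placeholder of the first high-order operator and updating second identification information to first identification information, where the first identification information is the identification information of the output operand of the first basic operator set and the second identification information is the identification information of the output operand of the first high-order operator.
It should be noted that the identification information is a "unique" identifier of the output operand and can be uniquely determined from the definition of the output operand. In some possible implementations, the identification information may be a hash value or other unique identification information, which is not specifically limited in the present application.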
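As an illustration of identification information derived from an operand's definition, the sketch below hashes the operator type together with the ordered identifiers of its input operands. Treating "the definition of the output operand" as exactly these fields, and choosing SHA-256, are assumptions made for the example; the function name `operand_id` is hypothetical.

```python
import hashlib


def operand_id(op_kind, input_ids):
    """Derive a deterministic identifier for an output operand.

    op_kind   : operator type that produces the operand
    input_ids : ordered identifiers of the operator's input operands
    """
    h = hashlib.sha256()
    h.update(op_kind.encode("utf-8"))
    for ident in input_ids:
        # A NUL separator keeps ("ab", ["c"]) distinct from ("a", ["bc"]),
        # assuming identifiers themselves never contain NUL.
        h.update(b"\x00")
        h.update(ident.encode("utf-8"))
    return h.hexdigest()


first = operand_id("mul", ["x", "y"])
again = operand_id("mul", ["x", "y"])
swapped = operand_id("mul", ["y", "x"])
```

The same definition always yields the same identifier, so two references can be compared by identifier alone without inspecting the operand.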
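In the above technical solution, the absorb operation ensures that the placeholder of a new output operand generated during operator expansion is consistent with the placeholder of the output operand of the original operator, which ensures that the computing task represented before and after expansion of a high-order operator is the same.
With reference to the first aspect, in some implementations of the first aspect, the absorb operation further includes: after the second identification information is updated to the first identification information, when the pointer of the output operand of the first high-order operator is accessed by a pointer synchronizer, determining whether the identification information of the pointer synchronizer is equal to the second identification information; and when the identification information of the pointer synchronizer is not equal to the second identification information, updating the pointer of the pointer synchronizer to the pointer of the output operand of the first basic operator set and updating the identification information of the pointer synchronizer to the second identification information.
In the above technical solution, the pointer of the new output operand is locally pointed to the placeholder of the output operand of the original operator, the identification information of the original operator's output operand is updated to that of the new output operand, and the next time the original operator's output operand accesses its pointer, the pointer is refreshed to the pointer of the new output operand, which ensures the consistency of global references.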
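With reference to the first aspect, in some implementations of the first aspect, the dependencies of the basic operators in the first basic operator set are determined as follows: determining whether the input placeholder of a second basic operator intersects at least one output placeholder in a first output placeholder set, where the first output placeholder set includes the output placeholders of all basic operators in the first basic operator set and the second basic operator is any basic operator in the first basic operator set; and, if the input placeholder of the second basic operator intersects at least one output placeholder in the first output placeholder set, determining that the basic operator corresponding to the at least one output placeholder has a dependency with the second basic operator.
It should be noted that the input or output placeholder of an operator may be a single placeholder or a set of multiple placeholders, which is not specifically limited in the present application.
With reference to the first aspect, in some implementations of the first aspect, the output placeholder of a first operator is determined according to the input operands of the first operator and the type of the first operator, the first operator is any operator obtained during expansion of the first high-order operator, and the intersection of the output placeholder of the first operator with the already allocated output placeholders is empty.
In some possible implementations, the first operator is generated by a user-defined operator assembly function, and the output placeholder of the first operator is determined from a placeholder allocated by calling a placeholder/operand allocation module.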
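In a second aspect, a computing platform is provided. The computing platform may include one or more computer devices and is configured to perform the method in the first aspect or any possible implementation of the first aspect. Specifically, the computing platform may be implemented by hardware, or by hardware executing corresponding software. Specifically, the apparatus includes: a transceiver unit, configured to acquire data and a computing task description, where the computing task description includes the computation to be performed on the data; a first processing unit, configured to select a first operator set according to the computing task description, where the first operator set contains multiple basic operators; and a second processing unit, configured to execute, according to the dependencies of the basic operators in the first operator set, the first operator set to compute on the data and obtain a computation result of the data, where, of two basic operators having a dependency, the placeholder of an input operand of one basic operator intersects the placeholder of an output operand of the other basic operator.
In some possible implementations, the transceiver unit, the first processing unit and the second processing unit may be one computer device, or they may be different computer devices. For example, the second processing unit may be one or more computer devices. When the second processing unit is one computer device, that device may have one or more hardware platforms, and the multiple basic operators in the first operator set may be executed by one hardware platform or separately by multiple hardware platforms. When the second processing unit is multiple computer devices, the data acquired by the transceiver unit and the first operator set selected by the first processing unit may also be input to the multiple computer devices, which may have one or more hardware platforms; executing the first operator set to compute on the data may then be performed by the multiple computer devices. Specifically, each of the multiple computer devices may use one hardware platform or multiple hardware platforms to execute the corresponding basic operators, which is not specifically limited in the present application.
In some possible implementations, the computing platform may be a cloud computing platform, or the second processing unit in the computing platform may be a cloud processing unit, which is not specifically limited in the present application.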
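With reference to the second aspect, in some implementations of the second aspect, the second processing unit is further configured to: determine, according to the dependencies of the basic operators in the first operator set, a first basic operator in the first operator set, where the first basic operator is a basic operator whose input placeholder has an empty intersection with the output placeholder set of every basic operator in the first operator set; store the data in the storage space corresponding to the input placeholder set of the first basic operator; and execute the first basic operator on the data and, according to the dependencies of the basic operators in the first operator set, execute the basic operators in the first operator set other than the first basic operator.
With reference to the second aspect, in some implementations of the second aspect, the first operator set and the dependencies of the basic operators in the first operator set are determined according to N basic operator sets, the N basic operator sets are in one-to-one correspondence with N high-order operators, and the computation on the data can be implemented through the N high-order operators. The first basic operator set is obtained by expanding a first high-order operator, the first high-order operator is any one of the N high-order operators, and the first basic operator set is the basic operator set among the N basic operator sets that corresponds to the first high-order operator; the output placeholder of the first basic operator set is the same as the output placeholder of the first high-order operator.
With reference to the second aspect, in some implementations of the second aspect, the N basic operator sets are in one-to-one correspondence with N pieces of dependency information, and each piece of dependency information indicates the dependencies of the basic operators in the corresponding basic operator set. The first operator set and the dependencies of the basic operators in the first operator set are obtained as follows: target dependency information is determined according to the N pieces of dependency information, where the target dependency information indicates the dependencies of the basic operators among multiple basic operators; and the first operator set is determined according to the target dependency information and the basic operators included in the N basic operator sets, where the target dependency information is the dependencies of the basic operators in the first operator set.
With reference to the second aspect, in some implementations of the second aspect, the output placeholder of the first basic operator set is made the same as the output placeholder of the first high-order operator through an absorb operation. The absorb operation includes: determining the output placeholder of the first basic operator set according to first dependency information, where the first dependency information indicates the dependencies of the basic operators in the first basic operator set; determining whether the output placeholder of the first basic operator set is the same as the output placeholder of the first high-order operator; and, if they are not the same, pointing the pointer of the output operand of the first basic operator set to the output placeholder of the first high-order operator and updating second identification information to first identification information, where the first identification information is the identification information of the output operand of the first basic operator set and the second identification information is the identification information of the output operand of the first high-order operator.
With reference to the second aspect, in some implementations of the second aspect, the absorb operation further includes: after the second identification information is updated to the first identification information, when the pointer of the output operand of the first high-order operator is accessed by a pointer synchronizer, determining whether the identification information of the pointer synchronizer is equal to the second identification information; and when the identification information of the pointer synchronizer is not equal to the second identification information, updating the pointer of the pointer synchronizer to the pointer of the output operand of the first basic operator set and updating the identification information of the pointer synchronizer to the second identification information.
With reference to the second aspect, in some implementations of the second aspect, the dependencies of the basic operators in the first basic operator set are determined as follows: determining whether the input placeholder of a second basic operator intersects at least one output placeholder in a first output placeholder set, where the first output placeholder set includes the output placeholders of all basic operators in the first basic operator set and the second basic operator is any basic operator in the first basic operator set; and, if the input placeholder of the second basic operator intersects at least one output placeholder in the first output placeholder set, determining that the basic operator corresponding to the at least one output placeholder has a dependency with the second basic operator.
With reference to the second aspect, in some implementations of the second aspect, the output placeholder of a first operator is determined according to the input operands of the first operator and the type of the first operator, the first operator is any operator obtained during expansion of the first high-order operator, and the intersection of the output placeholder of the first operator with the already allocated output placeholders is empty.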
第三方面,提供了一种数据处理的装置,该装置可以是芯片或电路,用于执行上述第一方面或第一方面的任意可能的实现方式中的方法。具体地,该装置可以通过硬件实现,也可以通过硬件执行相应的软件实现。
在一些可能的实现方式中,该装置包括用于执行上述第一方面或第一方面的任意可能的实现方式中的方法的模块。
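In some possible implementations, the apparatus includes a processor and a memory. The memory is configured to store instructions; when the apparatus runs, the processor executes the instructions stored in the memory, so that the apparatus performs the data processing method in the first aspect or any implementation of the first aspect. It should be noted that the memory may be integrated into the processor or may be independent of the processor.
In some possible implementations, the apparatus includes a processor configured to be coupled to a memory, read instructions from the memory, and perform, according to the instructions, the data processing method in the first aspect or any implementation of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided. The computer-readable storage medium stores a program that causes an apparatus to perform the data processing method in any of the foregoing aspects and their various implementations.
In a fifth aspect, this application further provides a computer program product containing instructions that, when run on a computer, cause the computer to perform the data processing method in any of the foregoing aspects.
In a sixth aspect, a chip system is provided, including a processor connected to a memory. The processor is configured to call and run a computer program from the memory, so that a device on which the chip system is installed performs the method in any of the foregoing aspects and their possible implementations. The memory may be located inside or outside the chip system.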
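Brief Description of the Drawings
FIG. 1 is a schematic diagram of an application scenario of a data processing method provided in this application;
FIG. 2 is a schematic diagram of an operator involved in a data processing method provided in this application;
FIG. 3 is a schematic diagram of a computation graph representation provided in this application;
FIG. 4 is a schematic diagram of an application architecture of a data processing method provided in this application;
FIG. 5 is a schematic diagram of an application framework of a data processing method provided in this application;
FIG. 6 is a schematic flowchart of a data processing method provided in this application;
FIG. 7 is a schematic flowchart of a data processing method provided in this application;
FIG. 8 is a schematic flowchart of a data processing method provided in this application;
FIG. 9 is a schematic diagram of the states at different stages of a data processing method provided in this application;
FIG. 10 is a schematic diagram of operator dependencies in a data processing method provided in this application;
FIG. 11 is a schematic flowchart of another data processing method provided in this application;
FIG. 12 is a schematic flowchart of another data processing method provided in this application;
FIG. 13 is a schematic flowchart of yet another data processing method provided in this application;
FIG. 14 is a schematic flowchart of yet another data processing method provided in this application;
FIG. 15 is a schematic diagram of a computing platform provided in this application;
FIG. 16 is a schematic diagram of a data processing apparatus provided in this application.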
具体实施方式
下面将结合附图,对本申请中的技术方案进行描述。为了便于理解,下文结合图1,介绍本申请实施例适用的场景。
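FIG. 1 is a schematic diagram of an application scenario of a data processing method provided in this application. Specifically, the cloud 100 consists of an application management module 101, a data processing module 102, a data storage module 103, and a key management and authentication module 104. The application management module 101 coordinates the handling of user requests; the data processing module 102 performs computation tasks on data; the data storage module 103 stores data, stores and retrieves user data according to the recorded user data locations, and handles information exchange between users; the key management and authentication module 104 is used to apply for keys or authenticate users. On the client 200, the user exchanges information with the cloud through a cloud computing system login program, which submits user requests to the cloud 100, encrypts and decrypts the user's private data, and uploads and downloads data. To ensure data security in cloud computing, a fully homomorphic encryption algorithm can be used to process the data. Specifically, a fully homomorphic encryption mechanism allows a user or a trusted third party to operate on the data directly without exposing the original data; the user obtains the processed data by decrypting the computation result. For example, in a medical information system, electronic medical records are stored as ciphertext on cloud servers. When the relevant department needs to know the geographic and age distribution of patients with a certain disease in a certain region in order to respond to a potential public health problem, the encrypted records can be handed to a professional data processing service provider, and the required data is obtained by decrypting the processing result.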
应理解,图1所示的应用场景仅为示例性说明,本申请实施例提供的数据处理的方法及装置还可以应用于其他基于同态加密算法的数据处理场景中,或者也可以应用于其他非基于密态算法的数据处理中,本申请实施例提供的数据处理的方法可以用于处理密文数据,也可以用于处理明文数据或明文密文混杂数据,本申请实施例对此不作具体限定。
为了便于理解本申请实施例,以下结合图2和图3对本申请实施例涉及的相关概念进行简要介绍:
1.全同态加密(fully homomorphic encryption,FHE):可以支持在密文上进行计算的一类加密算法,等价于在对应的明文上进行计算。密文上的计算过程可以在不可信的环境中完成而不用担心泄露数据隐私。
2.异构计算(heterogeneous computing,HC):使用超过一种硬件平台的混合计算方案。例如同时使用中央处理器(central processing unit,CPU)+图形处理器(graphics processing unit,GPU),或者同时使用CPU+现场可编程门阵列(field-programmable gate array,FPGA)的计算方案。
3.格代数(lattice algebra,LA):高维线性空间上定义的代数结构。当前所有同态加密算法底层依赖的数学困难问题都是基于格代数的。常用的数学困难问题包括格上的最短向量问题(shortest vector problem,SVP)或者是带错误的学习问题(learning with error,LWE)等。
4.上层:指同态加密算法库,包括可用于表示计算任务的高阶算子,高阶算子表示的计算任务通过调用代数库的基础算子实现。
5.底层:指代数库,为计算引擎执行计算任务提供基础算子。底层接口是指代数库与硬件平台之间的接口。
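6. Placeholder: a set of data blocks of a specific unit size, representing untyped raw data. Describing a computation graph does not involve concrete values, but it requires an abstract representation that distinguishes different data items and their sizes. In some possible implementations, assuming all data can be represented as data blocks of a specific unit size (for example, 64 bits), the size of a placeholder is the number of unit-size data blocks it contains, and the intersection and union of placeholders are the intersection and union of their sets of data blocks.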
7.操作数:一个或者多个占位符组成的集合,集合中的占位符无交叉,即操作数中的占位符交集为空。一个操作数可以看作一个整体数据,一个操作数可以带有一些额外信息,如数据类型。操作数的尺寸就是其包含的占位符的尺寸的总和。
应理解,操作数可以看作“穿了衣服”的占位符。同一个占位符可以“穿上不同的衣服”,这意味着同一个占位符(裸数据)在不同的场合下可以看作不同类型的操作数;不同的操作数之间也可能会包含同一个占位符。
在一些可能的实现方式中,可以把一个操作数看作一个占位符,此时,占位符代表的数据是集合中所有占位符的并集。
应理解,操作数的交集就是其代表的数据占位符的交集,操作数的并集就是其代表的数据占位符的并集。
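The relationship between placeholders and operands in items 6 and 7 can be sketched as follows. This is only an illustrative model, not the application's implementation: it assumes that a placeholder is a half-open interval of unit data blocks (one of the representations the description mentions), and the names `Operand`, `overlaps`, `size`, and `intersects` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumption: a placeholder is a half-open interval [start, stop) of unit data blocks.
Placeholder = Tuple[int, int]


def overlaps(p: Placeholder, q: Placeholder) -> bool:
    """Two non-empty half-open intervals intersect iff each starts before the other ends."""
    return p[0] < q[1] and q[0] < p[1]


@dataclass(frozen=True)
class Operand:
    dtype: str                             # extra information carried by the operand
    placeholders: Tuple[Placeholder, ...]  # must be pairwise disjoint

    def __post_init__(self):
        ps = self.placeholders
        for i in range(len(ps)):
            for j in range(i + 1, len(ps)):
                if overlaps(ps[i], ps[j]):
                    raise ValueError("placeholders inside one operand must not overlap")

    @property
    def size(self) -> int:
        # Operand size is the sum of the sizes of its placeholders.
        return sum(stop - start for start, stop in self.placeholders)

    def intersects(self, other: "Operand") -> bool:
        # Operands intersect iff any of their placeholders intersect.
        return any(overlaps(p, q) for p in self.placeholders for q in other.placeholders)


# The same raw placeholder [0, 4096) appears in two differently typed operands.
a = Operand("ciphertext", ((0, 4096), (8192, 12288)))
b = Operand("polynomial", ((0, 4096),))
c = Operand("plaintext", ((4096, 8192),))
```

Here `a` and `b` share a placeholder and therefore intersect, while `a` and `c` do not.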
8.算子:表示一个基本计算单元,一个算子包括算子类型,以及若干(>=0)个输入操作数和一个输出操作数,如图2所示。也就是说,算子的输入和输出是操作数,而不是占位符。
在一些可能的实现方式中,算子的输入/输出操作数可以等同于算子的输入/输出占位符,此时算子的输入/输出占位符代表算子的所有输入/输出操作数的所有占位符的并集。
9.算子之间的依赖关系:如果算子A的输入占位符和算子B的输出占位符有交叉,即交集非空,则算子A依赖于算子B。这意味着计算过程中,算子B必须在算子A之前执行。
10.计算图(computing graph):有先后依赖的算子的集合,形成一个有向无环图(directed acyclic graph,DAG),表示一个完整的计算任务,即需要对数据进行何种计算,具体如图3所示,其中A、B、C、D代表算子,a、b、c、d、x、y、z代表操作数。应理解,算子之间不会出现循环依赖。在本申请实施例中,计算图描述的计算任务由代数的数据结构和算子组成,和具体实现的硬件平台以及计算引擎无关,因此可以实现前后端解耦。
应理解,算子的先后依赖关系可以从算子集合中分析得到,因此计算图中可以显式保存这个依赖关系,也可以不保存。
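As described above, under the existing technical framework, after a user of a cryptographic library describes a computation task by calling higher-order operators of an upper-layer homomorphic encryption algorithm, the higher-order operators directly invoke the code implementations of basic operators according to the function call logic to complete the computation. These implementations are generally written for a specific hardware platform, so it is difficult to optimize or accelerate the basic operators on other hardware platforms; as a result, current open-source homomorphic encryption libraries cannot effectively support heterogeneous computing. Moreover, because the computing engine has no knowledge of the overall computation task, it can only execute the task according to the call relationships and the order in which operators appear, which easily wastes computing power. In addition, the computing engine must rely on the operator call relationships provided by the upper layer to carry out the computation, so the front-end representation of the task in terms of higher-order operators is tightly coupled with the back-end execution of the task. Consequently, when software and hardware engineers optimize basic operators, they must take into account the logical relationship between upper-layer higher-order operators and basic operators, which makes optimization difficult and costly. To make this more concrete, the following uses the homomorphic encryption library (HELib), the open-source lattice cryptography library Palisade, and the simple encrypted arithmetic library (SEAL) as examples to describe how open-source homomorphic encryption libraries process data.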
HELib和Palisade都是基于开源的快速数论算法库(number theory algorithm library,NTL),其底层是开源的高精度整数库GNU多精度算术库(the GNU multiple precision arithmetic library,GMP)。当用户调用上层的同态加密算法的高阶算子的时候,底层接口会根据高阶算子与基础算子之间的调用关系直接调用底层的基础算子的代码实现计算。也就是说,计算引擎只能根据算子的调用顺序执行计算任务,而无法掌握整体计算任务。进而无法将在调用函数中不同位置出现的相同类型的算子合并执行,有可能造成算力浪费。此外,NTL和GMP目前都不支持GPU加速。即便后续这两个库扩展了对GPU和其它硬件平台的支持,受到现有架构的限制,也只能在支持GPU的算子内部调用GPU来实现针对GPU设计的单个算子的优化,无法进行支持不同硬件平台的算子之间的整体优化,如多个算子合并并行加速。因此NTL和GMP要支持异构计算,每迁移到一个新的硬件平台,可能需要基于该硬件平台的特点更改基础算子的实现代码。此外,由于高阶算子和基础算子之间的调用关系需要由密码学家自己编写,所以基础算子具体实现代码的改变后,高阶算子和基础算子之间的调用关系可能也需要更改。因此,不光底层实现需要大量人力进行调整,上层的应用接口也很难保持不变。因此迁移代价很高。
SEAL库的架构与HELib和Palisade类似,区别在于底层并没有调用其它代数库,而是自己专门基于CPU开发了全同态加密底层需要的格代数的算子。SEAL库虽然自己开发了基础算子,计算引擎执行计算任务也是依赖高阶算子和基础算子之间的调用关系,其无法获知整体计算任务。这使其架构与HELib、Palisade存在类似的缺点:跨硬件平台迁移代价高、上层计算任务的构建和计算引擎的计算任务执行之间的耦合性较强。
鉴于此,本申请实施例提供了一种数据处理的方法和计算平台,基于能为计算引擎提供能够描述整体计算任务的计算图的全同态加密算法架构对密文进行处理,计算引擎在执行计算任务时,可以通过计算图获知整体计算任务,因而可以根据计算图实现同类型算子的合并处理,从而在保证计算任务正确执行的前提下提升计算速度。此外,在该算法架构中,由于对于确定的计算任务,其由基础算子构成的计算图表示即基础算子之间的依赖关系也是确定,因而计算引擎可以基于计算图从没有依赖算子的基础算子开始执行计算,直到计算图所有基础算子均执行完毕即完成计算,无需依赖上层的高阶算子,因而能够实现基于高阶算子的计算任务的构建和计算任务的执行的解耦。进而,软硬件工程师对基础算子进行优化时无需考虑高阶算子与基础算子之间的调用关系。此外,由于计算图表示的是基础算子之间的依赖关系,与具体实现的硬件平台无关,因此可以实现算法架构的跨硬件平台迁移,进而能够支持多硬件平台异构加速,进一步提高数据处理的计算速度。
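FIG. 4 is a schematic diagram of an application architecture required to implement a data processing method provided in an embodiment of this application, which can be applied to the scenario shown in FIG. 1. FIG. 4 shows the heterogeneous lattice graph (HLG) architecture, a lattice-algebra computation graph architecture that supports heterogeneous computing. For example, the HLG architecture can support homomorphic encryption algorithms such as BGV (Zvika Brakerski, Craig Gentry and Vinod Vaikuntanathan), BFV (Zvika Brakerski, Junfeng Fan and Frederik Vercauteren), and CKKS (Jung Hee Cheon, Andrey Kim, Miran Kim and Yongsoo Song). It should be understood that a computation task based on a fully homomorphic encryption algorithm has no logical branches and does not need conditionals or loops; that is, it has no control flow, so the task can be represented as a DAG. Therefore, the HLG architecture provided in this embodiment adds an abstract computation graph representation layer between the upper-layer higher-order operators and the lower-layer basic operators. This layer describes the overall computation task as a DAG; based on the graph, the computing engine learns the dependencies between basic operators and can complete the computation from the basic operators and their relationships alone. In other words, the front-end computation graph representation based on higher-order operators is decoupled from the back-end task execution based on basic operators. Hardware engineers can then optimize the basic operators specifically for the characteristics of the overall graph and of the hardware platform to improve overall performance, without needing the algebra or cryptography knowledge behind the higher-order operators in the homomorphic encryption library. In addition, the expansion granularity of higher-order operators can be tuned by customizing which operators are basic. Because the dependencies between operators are fixed for a given computation task, only the concrete implementation code of the corresponding basic operators needs to change on a different hardware platform, so the homomorphic encryption library can be migrated across hardware platforms. Algorithm code in the upper-layer library can thus be written once and run on multiple hardware platforms, so cryptographers who design a new algorithm can experiment on multiple platforms, which makes the architecture well suited to designing new algorithms.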
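FIG. 5 is a schematic diagram of the basic HLG framework required to implement a data processing method provided in an embodiment of this application. As shown in FIG. 5, the HLG framework includes an operand/placeholder allocation module, an operator assembly module, an operator expansion module, and an operator dependency determination module. In some possible implementations, the HLG framework may further include a graph optimization module and/or a computing engine. It should be noted that acceleration using a residue number system (RNS) is a conventional technique, so for the HLG framework to be accelerated with RNS algorithms it must support arbitrary splitting and combining of data. Therefore, in the basic HLG framework, an operator assembly function and an operator expansion function can be defined for each operator: the assembly function generates an operator from input operands and an operator type, and the expansion function expands a higher-order operator into a set of basic operators. In some possible implementations, a concrete operator implementation dedicated to a computing engine can also be defined. It should be understood that the HLG framework shown in FIG. 5 is only an example, and its functional modules should not be understood simply as functional entities.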
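To illustrate why RNS-style acceleration needs data that can be split and recombined, the following toy sketch splits an integer into independent residue channels, multiplies channel by channel, and recombines the result with the Chinese remainder theorem. The moduli and the function names `to_rns`, `rns_mul`, and `from_rns` are assumptions chosen for illustration; they are not taken from the application.

```python
from math import prod

# Pairwise coprime moduli, chosen only for illustration:
# 251 is prime, 253 = 11 * 23, 255 = 3 * 5 * 17, 256 = 2 ** 8.
MODULI = (251, 253, 255, 256)


def to_rns(x: int) -> tuple:
    """Split one integer into one residue per modulus."""
    return tuple(x % m for m in MODULI)


def rns_mul(a: tuple, b: tuple) -> tuple:
    # Each residue channel is independent, so the channels could run in parallel.
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues: tuple) -> int:
    """Recombine residues with the Chinese remainder theorem."""
    M = prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse (Python 3.8+)
    return total % M


x, y = 12345, 6789
product = from_rns(rns_mul(to_rns(x), to_rns(y)))
```

The result is exact as long as the true product is smaller than the product of the moduli, which holds for the values used here.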
下面结合图6至图14对各模块的功能进行详细阐述:
1)操作数/占位符分配模块:用于分配虚拟的数据占位符,以在计算图中代表特定长度的数据。对于该模块,给定操作数类型和想要的占位符尺寸(单元数据块数量),返回一个新的操作数,其内部包括一个符合尺寸要求的新占位符,而且和现有的占位符无交集。
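In some possible implementations, a placeholder can be described as a half-open interval [x,y), closed on the left and open on the right, where x and y are integers. For example, [0,4096) and [4096,8192) denote two different placeholders, each 4096 unit-size data blocks long. In some possible implementations, the size of one unit data block can be defined as needed; common sizes include 1, 2, 4, 8, 16, or 64 bits, and this embodiment does not specifically limit this. Further, the operand/placeholder allocation module keeps a counter, counter, initially 0. Given an operand type T and a placeholder size size, it returns an operand of type T that contains a single placeholder [counter,counter+size), and then increases the counter by size, that is, counter=counter+size.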
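It should be understood that two non-empty placeholders ph1=[x1,y1) and ph2=[x2,y2) intersect if and only if x1<y2 and x2<y1. Equivalently, at least one interval contains the left endpoint of the other: x1<=x2<y1 or x2<=x1<y2.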
在一些可能的方式中,可以将占位符描述为整数组成的集合,每一个整数代表一个不同的抽象单元数据块,例如:{0,1,2,….,4095},{4096,4097,…,8191}。进一步地,操作数/占位符分配模块保存一个计数器counter,初始为0。当输入操作数类型T和占位符尺寸size时,返回一个T类型的操作数,只包括一个占位符{counter,counter+1,…,counter+size-1},同时计数器counter增加size,即counter=counter+size。
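The counter-based allocation described above can be sketched as follows. This is a minimal illustration under the interval representation; the class names `Operand` and `PlaceholderAllocator` and the method name `allocate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Operand:
    dtype: str
    placeholders: Tuple[Tuple[int, int], ...]  # half-open intervals [start, stop)


class PlaceholderAllocator:
    """Hands out fresh, non-overlapping placeholders from a running counter."""

    def __init__(self) -> None:
        self.counter = 0

    def allocate(self, dtype: str, size: int) -> Operand:
        if size <= 0:
            raise ValueError("size must be a positive number of unit data blocks")
        placeholder = (self.counter, self.counter + size)
        self.counter += size  # counter = counter + size
        return Operand(dtype, (placeholder,))


alloc = PlaceholderAllocator()
x = alloc.allocate("ciphertext", 4096)
y = alloc.allocate("ciphertext", 4096)
```

Because the counter only grows, every placeholder returned is disjoint from all placeholders returned earlier, which is the property the later dependency analysis relies on.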
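2) Operator assembly module: in some possible implementations, this module is implemented by operator assembly functions. Given several input operands X0, X1, …, Xn and an operator type T, the module outputs a brand-new operator OP. In some possible implementations, the number of input operands may be zero; that is, given only an operator type, the module outputs a brand-new operator. It should be noted that operator assembly functions can be defined by users of the algebra library, that is, cryptographers.
Further, the operator assembly module defines the set of candidate operators used to describe a computation task.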
在一些可能的实现方式中,OP的算子类型为T,输入操作数为X0,X1,…,Xn,输出操作数为全新的操作数Y。其中,操作数Y的占位符是调用操作数/占位符分配模块新分配的占位符,该占位符和已经存在的所有占位符没有交集,即输出占位符之间不交叉。
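A minimal sketch of operator assembly follows. It assumes, for illustration only, that the caller passes the output type and size explicitly; the application says the output placeholder is determined from the input operands and the operator type but does not give the rule, so `out_dtype`, `out_size`, and the class names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Placeholder = Tuple[int, int]  # half-open interval [start, stop)


@dataclass(frozen=True)
class Operand:
    dtype: str
    placeholders: Tuple[Placeholder, ...]


@dataclass(frozen=True)
class Operator:
    op_type: str
    inputs: Tuple[Operand, ...]
    output: Operand


class Assembler:
    """Hypothetical assembly module with a built-in counter-based allocator."""

    def __init__(self) -> None:
        self.counter = 0

    def new_operand(self, dtype: str, size: int) -> Operand:
        placeholder = (self.counter, self.counter + size)
        self.counter += size
        return Operand(dtype, (placeholder,))

    def assemble(self, op_type: str, inputs: Tuple[Operand, ...],
                 out_dtype: str, out_size: int) -> Operator:
        # The output operand is always freshly allocated, so it cannot
        # overlap any placeholder that was handed out earlier.
        return Operator(op_type, tuple(inputs), self.new_operand(out_dtype, out_size))


asm = Assembler()
x = asm.new_operand("ciphertext", 4096)
y = asm.new_operand("ciphertext", 4096)
add = asm.assemble("Add", (x, y), "ciphertext", 4096)
zero = asm.assemble("Zero", (), "ciphertext", 4096)  # zero input operands is allowed
```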
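3) Operator expansion module: expands a target operator into a set of lower-order operators, where the set represents the same computation task as the original target operator. In some possible implementations, this module is implemented by operator expansion functions. It should be noted that operator expansion functions can be defined by users of the algebra library, that is, cryptographers. In some possible implementations, a higher-order operator is expanded level by level: the operator expansion module first expands the higher-order operator into several lower-order operators. Specifically, the module uses the higher-order operator set of the operator assembly module: given a single higher-order operator A, it outputs an equivalent operator set {A_i}, that is, it expands A into several lower-order operators A_i, where i=0,1,2,…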
在一些可能的实现中,可以在算子展开函数中引入随机的保护机制,如无用代码、冗余计算分支、算子拆分、等价变换等,本申请实施例对此不作限定。
需要说明的是,基于算子展开函数将上述候选算子集合中的所有高阶算子都展开,直至候选算子集合中只剩下不需要继续展开的基础算子为止。由于基础算子不需继续展开,因此需要提供基础算子的具体实现代码。在一些可能的实现方式中,基础算子为支持格代数的算子。在一些可能的实现方式中,用户可以自定义哪些算子为基础算子。进一步地,通过自定义基础算子可以调控高阶算子展开的粒度。
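It should also be noted that, to keep the computation task described by the graph consistent, the operator set after expansion must be equivalent to the operator before expansion. Moreover, so that expansion does not break the dependencies with other operators, the overall input placeholder set of the expanded operator set must match the input placeholders of the operator before expansion, and the overall output placeholders of the expanded operator set must match the output placeholders of the operator before expansion.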
在一些可能的实现方式中,在算子展开函数执行过程中会产生新的输出操作数,这个新的输出操作数与原算子的输出操作数的占位符不同。为了保持计算图前后依赖关系不变,本申请实施例提供一个“吸收操作”,以使算子展开后,保证展开的算子集合的新的输出操作数和原算子的输出操作数的占位符一致。进一步地,由于新的输出操作数和原算子的输出操作数还可以被其他场合引用,因此,为了保证全局一致性,还需刷新所有引用到新的输出操作数。
以下结合图6至图9说明“吸收操作”的具体实现方式。
图6所示为进行“吸收操作”前后局部和全局信息发生的变化。从局部看:算子D展开为算子D1、D2和D3,算子展开后生成新的操作数a,操作数a的占位符为占位符a,不同于算子D展开前的操作数b的占位符b,这将导致算子展开前后的计算图依赖关系发生改变。进行“吸收操作”后,操作数a放弃自己的占位符a,转而指向操作数b的占位符b。同时,操作数b仍然是算子D的输出操作数,操作数a仍然是算子D3的输出操作数。另外,从全局来看,进行“吸收操作”后,所有原本指向操作数b的引用,现在需要指向操作数a。
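More specifically, so that every reference that originally pointed to operand b points to operand a, all pointers or references to operands must go through a pointer synchronizer, PtrSyncer. Besides storing a pointer to an operand, the pointer synchronizer also stores a Hash value. In addition, a Hash field is added to every operand object; this Hash value uniquely identifies the operand object. Furthermore, a mapping table ObjMap from all Hash values to all operand objects must be defined to represent the mapping between Hash values and operand objects.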
在一些可能的实现方式中,可以通过图7所示的方法600完成上述的“吸收”及同步操作。具体地,方法600包括:
S601,使第一操作数指向第二操作数的第二占位符。
示例性地,该第一操作数可以为上述操作数a,该第二操作数可以为上述操作数b,该第二占位符可以为上述占位符b。应理解,该第二占位符可以为一个占位符,或者,也可以为多个占位符组成的集合,本申请实施例对此不作具体限定。
S602,更新第二操作数的Hash值为第一操作数的Hash值。
S603,在下次访问时,将指向第二操作数的指针刷新为指向第一操作数的指针。
具体地,第二操作数的指针同步器在下一次访问指针时,将原来指向第二操作数的指针刷新为指向第一操作数的指针。在一些可能的实现方式中,更新第二操作数的指针同步器中的Hash值为第一操作数的Hash值。
在一些可能的实现方式中,通过指针同步器访问指针的方法可参见图8中所示的方法700,该方法700包括:
S701,调用指针同步器的指针访问接口。
在一些可能的实现方式中,调用指针同步器PtrSyncer的指针访问接口PtrSyncer.Ptr()。
S702,判断指针同步器的Hash值是否等于目标Hash值,如果等于,则执行S704;否则继续执行S703。
示例性地,该指针同步器的Hash值为上述第二操作数的Hash值,该目标Hash值为上述第一操作数的Hash值。在一些可能的实现方式中,指针同步器的Hash值可以表示为PtrSyncer.Hash,目标Hash值可以表示为PtrSyncer.ptr.Obj.Hash。
当PtrSyncer.Hash!=PtrSyncer.ptr.Obj.Hash时,继续执行S703。
S703,将指针同步器的指针更新为目标指针,将指针同步器的Hash值更新为目标Hash值。
示例性地,该目标指针为上述第一操作数的指针。
在一些可能的实现方式中,根据映射表ObjMap完成目标指针的更新。具体地,令P=ObjMap[PtrSyncer.ptr.Obj.Hash],令PtrSyncer.ptr=P;更进一步地,令PtrSyncer.Hash=P.Obj.Hash。至此,完成指针同步器的指针更新。
S704,返回指针同步器的指针。
应理解,返回的指针同步器的指针为目标指针,示例性地,为上述第一操作数的指针。在一些可能的实现方式中,返回值中还包含目标指针对应的Hash值。在一些可能的实现方式中,该返回值可以为PtrSyncer.ptr。
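The absorption steps S601 to S602 and the lazy pointer refresh S701 to S704 can be sketched together as follows. This is an illustrative model only: Python object references stand in for pointers, an integer id stands in for the Hash value, and the names `OBJ_MAP`, `absorb`, and `get` are hypothetical stand-ins for ObjMap, the absorption operation, and PtrSyncer.Ptr().

```python
from itertools import count

_ids = count(1)
OBJ_MAP = {}  # Hash value -> operand object (stand-in for ObjMap)


class Operand:
    def __init__(self, name, placeholder):
        self.name = name
        self.placeholder = placeholder
        self.hash = next(_ids)  # unique identifier of this operand object
        OBJ_MAP[self.hash] = self


class PtrSyncer:
    """Every reference to an operand goes through one of these."""

    def __init__(self, operand):
        self.ptr = operand
        self.hash = operand.hash

    def get(self):
        # S702: compare the cached Hash with the Hash of the pointed-to object.
        if self.hash != self.ptr.hash:
            # S703: look up the absorbing operand and refresh pointer and Hash.
            target = OBJ_MAP[self.ptr.hash]
            self.ptr = target
            self.hash = target.hash
        # S704: return the (possibly refreshed) pointer.
        return self.ptr


def absorb(new, old):
    new.placeholder = old.placeholder  # S601: new operand adopts the old placeholder
    old.hash = new.hash                # S602: old operand takes the new operand's Hash


a = Operand("a", (100, 200))  # new output operand created during expansion
b = Operand("b", (0, 100))    # output operand of the original operator
ref_to_b = PtrSyncer(b)       # an existing reference somewhere else in the graph
absorb(a, b)
resolved = ref_to_b.get()
```

No reference is updated eagerly; each PtrSyncer repairs itself the first time it is used after the absorption, which is why the number of outstanding references never needs to be known.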
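It should be understood that before the absorption operation, that is, when the operator expansion module is in the initial state shown in FIG. 9, the pointers Ptr_b and Ptr_a point to different operands and placeholders, and operand a and operand b each have several different references. From Ptr_b or Ptr_a alone it is impossible to know how many other references exist besides the pointer itself, let alone how many references exist globally. After the absorption in S601 and S602, operand a points to placeholder b, and the Hash value of operand b becomes Hash_a, the Hash value of operand a. Further, during pointer synchronization, once the pointer is accessed through the pointer synchronizer PtrSyncer_b, the original pointer Ptr_b to operand b is refreshed to Ptr_a, completing the global pointer synchronization of S703.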
本申请实施例提供的数据处理的方法,通过指针延迟同步,保证算子展开过程中生成的新的输出操作数的占位符与原算子的输出操作数的占位符一致。更具体地,在局部将新的输出操作数指针指向原算子的输出操作数的占位符,同时更新原算子的输出操作数的Hash值为新的输出操作数的Hash值,并在原算子的输出操作数下一次访问指针时,将其指针刷新为新的输出操作数的指针,从而保证全局引用的一致性。
4)算子依赖关系判断模块:给定一个基础算子集合S,确定算子之间的先后依赖关系,形成计算图。
在一些可能的实现方式中,对算子B,其依赖集合和通知集合分别定义如下:
算子B的依赖集合:表示所有B依赖的算子集合。集合中的所有算子的输出占位符和算子B的输入占位符交集非空,同时不存在其他算子为B的依赖。也就是说,算子B必须等到其依赖集合中的所有算子都执行完毕,才能开始执行。
算子B的通知集合:表示所有依赖B的算子集合。集合中的所有算子的输入占位符和算子B的输出占位符交集非空,同时不存在其他算子依赖算子B。也就是说,算子B执行完以后,可以通知其通知集合中的算子开始检查自己是否满足执行条件(也就是依赖集合是否为空)。
假设每一个算子有唯一标识符,用整数表示。则具体如图10所示,有算子7的箭头指向算子5,代表算子7依赖算子5。图10所示的算子的依赖集合和通知集合定义如表1:
Table 1: Example dependency sets and notification sets
Operator    Dependency set    Notification set
1           {}                {5}
2           {}                {5}
3           {}                {5}
4           {}                {6}
5           {1,2,3,6}         {7}
6           {4}               {5}
7           {5}               {}
需要说明的是,对于任一占位符a都可以找到对应的生成算子OP(a),使得a是OP(a)的输出占位符之一。应理解,根据所有算子的输出占位符不交叉的设定,OP(a)可以有唯一的定义。
在上述前提下,算子依赖关系判断模块的工作流程具体如下:
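① Combine the output placeholders of all operators in the operator set S into one set S_out={ph_i};
② For the input placeholder ph_A of each operator A in S, find the set S_A of placeholders in S_out that intersect ph_A. It should be understood that S_A is a subset of S_out. Then obtain the set OP(S_A) of generating operators for all placeholders in S_A. Operator A is determined to depend on every operator in OP(S_A).
③ Once the dependencies of all operators in the set have been determined, the set S together with these dependencies fully represents a computation graph.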
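The three steps can be sketched as follows, using a simple quadratic scan for clarity. The placeholder values are invented so that the result matches the dependency and notification sets listed for FIG. 10; the function name `build_dependencies` and the dictionary layout are assumptions.

```python
def overlaps(p, q):
    """Half-open intervals [p0, p1) and [q0, q1) intersect."""
    return p[0] < q[1] and q[0] < p[1]


def build_dependencies(ops):
    """ops maps operator id -> (input placeholders, output placeholders)."""
    # Step 1: collect every output placeholder together with its generating operator.
    producer = {ph: op_id for op_id, (_, outs) in ops.items() for ph in outs}
    depends = {op_id: set() for op_id in ops}
    notifies = {op_id: set() for op_id in ops}
    # Step 2: an operator depends on the producers of all outputs its inputs touch.
    for op_id, (ins, _) in ops.items():
        for ph_in in ins:
            for ph_out, src in producer.items():
                if overlaps(ph_in, ph_out):
                    depends[op_id].add(src)
                    notifies[src].add(op_id)
    # Step 3: the operator set plus these relations describes the graph.
    return depends, notifies


# Hypothetical placeholders: operators 1-4 read external inputs, 5 reads the
# outputs of 1, 2, 3 and 6, 6 reads the output of 4, and 7 reads the output of 5.
ops = {
    1: ([(0, 10)], [(100, 110)]),
    2: ([(10, 20)], [(110, 120)]),
    3: ([(20, 30)], [(120, 130)]),
    4: ([(30, 40)], [(130, 140)]),
    5: ([(100, 130), (140, 150)], [(150, 160)]),
    6: ([(130, 140)], [(140, 150)]),
    7: ([(150, 160)], [(160, 170)]),
}
depends, notifies = build_dependencies(ops)
```

Note that operator 5 reads the single interval [100,130), which spans the outputs of three different operators; the overlap test still finds all three producers.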
需要说明的是,从计算图中还可以分析出整个计算图的输入操作数集合和输出操作数集合,因此计算图中可以显式保存整体输入和输出操作数集合,也可以不保存。应理解,上述输入操作数集合为不依赖其它操作数的操作数,是整个计算图的输入数据;上述输出操作数集合为不被其它任何算子引用为输入的操作数集合,是整个计算图的输出数据。
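As a rough illustration of how the overall inputs and outputs could be derived from the graph, the sketch below treats an input placeholder as a graph input when no operator output overlaps it, and an output placeholder as a graph output when no operator input overlaps it. This whole-placeholder simplification and the name `graph_io` are assumptions made for the example.

```python
def overlaps(p, q):
    """Half-open intervals [p0, p1) and [q0, q1) intersect."""
    return p[0] < q[1] and q[0] < p[1]


def graph_io(ops):
    """ops maps operator id -> (input placeholders, output placeholders)."""
    all_in = [ph for ins, _ in ops.values() for ph in ins]
    all_out = [ph for _, outs in ops.values() for ph in outs]
    # Graph inputs: read by some operator but produced by none.
    graph_inputs = [p for p in all_in if not any(overlaps(p, q) for q in all_out)]
    # Graph outputs: produced by some operator but read by none.
    graph_outputs = [q for q in all_out if not any(overlaps(q, p) for p in all_in)]
    return graph_inputs, graph_outputs


ops = {
    "A": ([(0, 4)], [(8, 12)]),
    "B": ([(4, 8), (8, 12)], [(12, 16)]),
}
graph_inputs, graph_outputs = graph_io(ops)
```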
下面结合图11和图12说明确定占位符集合S A的方法。
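In some possible implementations, the input placeholder of operator A is defined as the target placeholder ph=[ph.x,ph.y), and the comparison placeholder set is S_out={ph_i}, where ph_i=[ph_i.x,ph_i.y), i=0,…,n-1, and n is the number of operators in the operator set S. The placeholders in {ph_i} are sorted in ascending order of ph_i.x; in addition, by the assumption that the output placeholders of all operators do not intersect, the placeholders ph_i in S_out are pairwise disjoint. Specifically, the comparison placeholder set {ph_i} satisfies two conditions: 1) for any i, ph_i.x<ph_i.y; 2) for any i, j with 0≤i<j<n, ph_i.y<=ph_j.x.
Under these conditions, the dependency set of operator A can be determined with the method 1000 shown in FIG. 11. Specifically, the method 1000 includes:
S1010: Input the target placeholder ph=[ph.x,ph.y) and the comparison placeholder set S_out={ph_i}, where ph_i=[ph_i.x,ph_i.y), i=0,…,n-1.
S1020: Determine the smallest index L that satisfies ph_L.y>ph.x.
S1030: Determine the smallest index R that satisfies ph_R.x≥ph.y.
S1040: Determine the first output placeholder set {ph_j, j∈[L,R)}.
It should be understood that the first output placeholder set {ph_j, j∈[L,R)} contains every placeholder in {ph_i} whose intersection with ph is non-empty; in other words, it is the set S_A of output placeholders that intersect the input placeholder ph=[ph.x,ph.y) of operator A. Further, the operator set OP(S_A) is the dependency set of operator A. It should be understood that when L≥R, S_A is empty, that is, the dependency set of operator A is empty.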
在一些可能的实现方式中,S1020和S1030可以通过“二分查找”算法实现,复杂度为O(log(n))。
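The two binary searches of S1020 and S1030 can be sketched with Python's standard `bisect` module. The class name `SortedOutputs` is hypothetical; the sketch assumes the output placeholders are pairwise disjoint half-open intervals, so both their start points and their end points are in ascending order once sorted.

```python
from bisect import bisect_left, bisect_right


class SortedOutputs:
    """Pairwise disjoint half-open output placeholders, sorted by start."""

    def __init__(self, placeholders):
        self.phs = sorted(placeholders)
        self.starts = [p[0] for p in self.phs]
        self.ends = [p[1] for p in self.phs]

    def overlapping(self, ph):
        x, y = ph
        # S1020: smallest L with ph_L.y > ph.x (n if there is none).
        L = bisect_right(self.ends, x)
        # S1030: smallest R with ph_R.x >= ph.y (n if there is none).
        R = bisect_left(self.starts, y)
        # S1040: every placeholder with index in [L, R) intersects ph.
        return self.phs[L:R]


s_out = SortedOutputs([(0, 10), (10, 20), (20, 30), (40, 50)])
hits = s_out.overlapping((5, 25))
```

The sorted lists are built once, so each later query costs two O(log(n)) searches plus the size of the answer.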
在一些可能的实现方式中,还可以参照图12所示的方法1100确定算子A的依赖集合。具体地,方法1100包括:
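S1110: Input the target placeholder ph and the comparison placeholder set S_out={ph_i}, where i=0,…,n-1.
S1120: For j=0,…,n-1, determine the ph_j whose intersection with ph is non-empty.
In some possible implementations, define the set ret as the empty set; for j=0,…,n-1, if the intersection of ph and ph_j is non-empty, add ph_j to ret. It should be understood that the complexity of this search is O(n).
S1130: Determine the first output placeholder set {ph_j}.
It should be understood that the set ret determined in S1120 is the first output placeholder set, that is, the set S_A of output placeholders that intersect the input placeholder of operator A. When ret is empty, S_A is empty, that is, the dependency set of operator A is empty, and in that case the input placeholder of operator A stores the actual input data of the computation task.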
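The linear scan of method 1100 needs no ordering, so it also works when placeholders are modelled as sets of integers, the alternative representation mentioned earlier. A minimal sketch, with the hypothetical function name `overlapping_linear`:

```python
def overlapping_linear(ph, candidates):
    """Return every candidate placeholder that shares a unit data block with ph."""
    ret = []
    for ph_j in candidates:
        if ph & ph_j:  # non-empty set intersection
            ret.append(ph_j)
    return ret


# Placeholders as sets of abstract unit data blocks; the order does not matter.
s_out = [frozenset(range(0, 10)), frozenset(range(40, 50)), frozenset(range(10, 20))]
hits = overlapping_linear(frozenset(range(5, 12)), s_out)
```

This trades the O(log(n)) lookup of the sorted-interval approach for simplicity and for freedom in how placeholders are represented.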
本申请实施例提供的数据处理的方法,通过确定基础算子集合以及它们之间的依赖关系,构造整体计算任务,使得计算引擎能够执行只包含基础算子的整体计算任务,从而实现计算任务构建和计算任务执行的解耦。进一步地,不同硬件平台的计算引擎均可以基于本申请实施例确定的同一份计算任务进行具体实现,因此可以实现异构计算。
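5) Graph optimization module: optimizes the computation graph. Specifically, it can apply equivalence transformations to the graph to reduce repeated computation, improve execution performance, and so on. In some possible implementations, graph optimization algorithms such as pruning unused computation branches, operator deduplication, automatic precomputation, and automatic SIMD optimization can be used; this embodiment does not specifically limit this.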
6)计算引擎:用于具体执行计算图描述的计算任务,把整体计算任务的输入数据处理为输出数据。
在一些可能的实现方式中,计算引擎在实际执行的流程如下:
①为所有数据占位符分配实际存储空间;
②接收整体计算任务的输入数据,存入输入占位符对应的实际存储空间;
③根据计算图依赖关系,从没有依赖的算子开始执行计算,将算子计算的中间结果保存在对应算子的输出占位符的实际存储空间中。直到所有算子都计算完成为止。
④从计算图的整体输出占位符对应的实际存储空间中取出数据,作为整体计算任务的输出数据。
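The four-step flow above can be sketched as a small dependency-driven executor. This is an illustration under simplifying assumptions: placeholders are plain string keys into a dictionary that stands in for real storage, each operator has exactly one output, and the names `run_graph`, `ops`, and `depends` are hypothetical.

```python
import operator
from collections import deque


def run_graph(ops, depends, inputs, outputs):
    """ops maps op id -> (function, input keys, output key);
    depends maps op id -> set of op ids that must run first."""
    storage = dict(inputs)  # steps 1-2: storage for placeholders, loaded with inputs
    remaining = {op: set(deps) for op, deps in depends.items()}
    notifies = {op: set() for op in ops}
    for op, deps in depends.items():
        for dep in deps:
            notifies[dep].add(op)

    ready = deque(op for op, deps in remaining.items() if not deps)
    executed = 0
    while ready:  # step 3: start from operators that have no dependencies
        op = ready.popleft()
        fn, in_keys, out_key = ops[op]
        storage[out_key] = fn(*(storage[k] for k in in_keys))
        executed += 1
        for nxt in notifies[op]:  # notify operators waiting on this one
            remaining[nxt].discard(op)
            if not remaining[nxt]:
                ready.append(nxt)
    if executed != len(ops):
        raise ValueError("dependency cycle: not every operator could run")
    return {k: storage[k] for k in outputs}  # step 4: read the overall outputs


# y = (a + b) * c
ops = {1: (operator.add, ["a", "b"], "t"), 2: (operator.mul, ["t", "c"], "y")}
depends = {1: set(), 2: {1}}
result = run_graph(ops, depends, {"a": 2, "b": 3, "c": 4}, ["y"])
```

Operators that become ready at the same time are independent of each other, so a real engine could batch or dispatch them to different hardware platforms instead of running them one by one.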
需要说明的是,计算引擎的工作流程仅为示例性说明,本申请实施例对其具体实施不作具体限定。
图13示出了本申请实施例提供的一种数据处理的方法1300的示意性流程图,该方法1300可以应用于图4所示的HLG架构中,也可以应用于图5所示的HLG框架中,本申请实施例对此不作限定。在一些可能的实现方式中,该方法1300可以由图5所示的HLG框架执行,具体地,方法1300包括:
S1310,定义计算任务所有输入数据的操作数。
示例性地,定义输入数据的操作数可以由上述实施例中的“操作数/占位符分配模块”执行。应理解,后续的所有算子的全部或部分输入操作数均来自本步骤定义的操作数。
S1320,确定构建计算任务的高阶算子集合。
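For example, the higher-order operators in the set that builds the computation task can be determined by the operator assembly module of the foregoing embodiment. Specifically, the operator assembly module uses an operator assembly function that takes operands selected from S1310 as the operator's input operands and combines them with an operator type to generate the corresponding higher-order operator. In some possible implementations, higher-order operator A is a dependency operator of higher-order operator B; in that case, after the operator assembly module is called to generate A, the output operand of A is used as one of the input operands when assembling B. It should be understood that the output operand of a higher-order operator is newly allocated by the operand/placeholder allocation module; its placeholder intersects neither the placeholders of any input-data operand defined in S1310 nor the output placeholders already allocated to other higher-order operators.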
S1330,根据算子展开函数将高阶算子集合展开为基础算子集合。
示例性地,将高阶算子集合展开为基础算子集合可以由上述实施例中的“算子展开模块”执行。具体地,算子展开模块使用算子展开函数将每个高阶算子展开为对应的基础算子集合。现以将高阶算子集合中的第一高阶算子展开为第一基础算子集合为例详细说明展开过程涉及的模块调用:在一些可能的实现方式中,先将第一高阶算子展开为第一次阶算子集合,进而将第一次阶算子集合中的每个次阶算子展开,直至所有次阶算子均展开为基础算子为止。需要说明的是,展开过程中,调用算子组装函数进行次阶算子和基础算子的组装,即,确定次阶算子和基础算子的输入操作数和输出操作数。根据上述定义,算子的输出操作数的占位符与已存在的操作数的占位符均不相交,即每个算子的输出操作数为新的输出操作数。
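It should be noted that when an operator is expanded into next-level operators, the dependencies among the next-level operators are already defined in the operator expansion function. For example, suppose operator A is expanded into operators a1, a2, a3, and a4. During expansion, the operator expansion function determines that a3 depends on a1 and a2, so the operator assembly function is first called to generate a1 and a2 separately, and the operator assembly module is then called to generate a3 from the output operands of a1 and a2 and the input-data operands defined in S1310. For example, the operator assembly module may also merge the output operands of a1 and a2 into one new operand that serves as an input operand of a3. In some possible implementations, while A is expanded into a1 to a4, operator a1 is a dependency operator not only of a3 but also of a4; when the operator assembly module generates a3 and a4, it may split the output operand of a1 into output operand 1 and output operand 2, then use output operand 1 as an input operand of a3 and output operand 2 as an input operand of a4. It should be understood that, as described in the foregoing embodiments, splitting and merging operands can be understood as splitting and merging the operands' placeholder sets. It should also be noted that, in some possible implementations, the dependencies among a1, a2, and a3 may be broken as a1, a2, and a3 are expanded further, so the dependency information of the finally expanded basic operator set must be determined in S1340.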
S1340,确定基础算子集合的依赖关系信息。
具体地,基础算子集合的依赖关系信息是指基础算子集合中各基础算子的依赖关系。示例性地,确定基础算子集合的依赖关系信息由上述实施例中的“算子依赖关系判断模块”执行。
在一些可能的实现方式中,某个计算任务是由N个高阶算子表示的,则基础算子集合是由该N个高阶算子展开确定的。具体地,N个高阶算子展开为对应的N个基础算子集合,为保证高阶算子展开为基础算子集合后,基础算子集合表示的计算任务不变,需使高阶算子的输出占位符与其展开的基础算子集合的输出占位符一致。以第一高阶算子展开为第一基础算子集合为例,使用上述实施例中的方法1000或方法1100确定第一基础算子集合的依赖关系信息,进而确定第一基础算子集合的输出占位符。当第一基础算子集合的输出占位符与第一高阶算子的输出占位符不一致时,通过上述实施例中的“吸收”操作将第一基础算子集合的输出占位符更新为第一高阶算子的输出占位符。进一步地,根据N个基础算子集合和N个基础算子集合的依赖关系信息确定基础算子集合的依赖关系信息。
进一步地,计算引擎在对数据进行计算时,只需执行上述基础算子集合构成的计算图即可获得计算结果,而无需了解代数库中高阶算子与基础算子之间的调用关系。
本申请实施例提供一种数据处理的方法,使用基础算子及其之间的依赖关系表示计算任务,使得计算引擎在进行计算时无需依赖上层高阶算子表示的计算任务,可以直接执行基础算子完成计算,因而能够实现基于高阶算子的计算任务的构建和计算任务的执行的解耦。进而,软硬件工程师在对基础算子进行优化时,无需考虑高阶算子与基础算子之间的调用关系。此外,由于计算图表示的是基础算子之间的依赖关系,与具体实现的硬件平台无关,因此可以实现算法架构的跨硬件平台迁移,进而能够支持多硬件平台异构加速,进一步提高数据处理的计算速度。
图14示出了本申请实施例提供的一种数据处理的方法1400的示意性流程图,该方法1400可以应用于图4所示的HLG架构中,也可以应用于图5所示的HLG框架中,本申请实施例对此不作限定。在一些可能的实现方式中,该方法1400可以由图5所示的HLG框架中的计算引擎执行,具体地,方法1400包括:
S1410,计算平台获取数据和计算任务描述,计算任务描述包括需要对数据执行的计算。
具体地,上述数据可以为密文数据,或者也可以为明文数据,或者也可以为明文与密文混杂的数据,本申请对此不作具体限定。
S1420,计算平台根据计算任务描述选择第一算子集合,第一算子集合包含多个基础算子。
示例性地,该第一算子集合可以为根据高阶算子展开获得的,高阶算子展开得到第一算子集合的具体流程可以参考上述实施例中的描述,在此不再赘述。
应理解,第一算子集合表示的计算任务与其展开前的高阶算子表示的计算任务一致。
S1430,计算平台根据第一算子集合中的各个基础算子的依赖关系,执行第一算子集合对数据进行计算,得到该数据的计算结果,其中存在依赖关系的两个基础算子中的一个基础算子的输入操作数包括另一个基础算子的输出操作数。
示例性地,计算引擎根据第一算子集合中多个基础算子及其之间的依赖关系对数据进行相关计算,应理解,第一算子集合中多个基础算子之间的依赖关系等价于计算任务描述。
需要说明的是,本申请中操作数的占位符可以为一个占位符,也可以为多个占位符组成的集合,本申请对此不作具体限定。
应理解,算子a与算子b存在依赖关系可以指算子a执行完毕才能执行算子b,即算子a的输出占位符与算子b的输入占位符相交。需要说明的是,算子a也可以称为算子b的依赖算子,或称算子b为算子a的通知算子。
进一步地,计算引擎从第一算子集合中没有依赖算子的基础算子开始执行计算,直至第一算子集合中所有基础算子执行完毕,获得上述数据的计算结果。
更具体地,第一算子集合中各个基础算子的依赖关系的确定流程可以参考上述实施例中的描述,在此不再赘述。
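In the data processing method provided in this embodiment, when the computing engine executes a computation task, it can learn the overall task from the computation graph formed by the first operator set. Further, the computing engine can start from the basic operators that have no dependency operators and proceed through the graph until every basic operator has been executed, at which point the computation is complete, without relying on the upper-layer higher-order operators. The construction of a computation task from higher-order operators is therefore decoupled from its execution. As a result, software and hardware engineers do not need to consider the call relationships between higher-order and basic operators when optimizing basic operators. In addition, because the computation graph represents the dependencies between the basic operators of the computation task and is independent of the hardware platform on which it is implemented, the algorithm architecture can be migrated across hardware platforms, which enables heterogeneous acceleration on multiple hardware platforms and further increases the computation speed of ciphertext data processing.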
需要说明的是,本申请实施例提供的基于HLG框架数据处理的方法和计算平台,可以用于全同态加密算法,也可以用于其他密码算法,或者也可以用于其他场景,本申请实施例对此不作具体限定。只要计算任务能够支持数据任意的拆分组合、支持从高阶算子展开为低阶算子、支持自定义算子展开的粒度,并且不需要考虑控制逻辑,即可使用本申请实施例的数据处理的方法和计算平台。在一些可能的实现方式中,在使用本申请实施例的数据处理的方法和计算平台时,需要验证密码算法实现的正确性,即需要验证:原始计算图的逻辑正确,以及具体硬件平台上的单个算子的实现是正确的。在一些可能的实现方式中,本申请实施例提供的HLG框架还可以作为随机的白盒密码、代码混淆方案的生成框架,或者用于描述AES等对称密码算法的计算图,本申请实施例对此不作限定。
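FIG. 15 is a schematic block diagram of a computing platform provided in an embodiment of this application. The computing platform 2000 includes a transceiver unit 2010, a first processing unit 2020, and a second processing unit 2030. The transceiver unit 2010 can implement the corresponding communication functions, and the first processing unit 2020 and the second processing unit 2030 are used for data processing.
Optionally, the computing platform 2000 may further include a storage unit that can store instructions and/or data; the first processing unit 2020 can read the instructions and/or data in the storage unit so that the apparatus implements the foregoing method embodiments.
The computing platform 2000 may include units for performing the method in FIG. 14. The units of the computing platform 2000 and the other operations and/or functions described above are respectively intended to implement the corresponding procedures of the method embodiment in FIG. 14.
When the computing platform 2000 is used to perform the method 1400 in FIG. 14, the transceiver unit 2010 can be used to perform S1410 of the method 1400, the first processing unit 2020 can be used to perform S1420 of the method 1400, and the second processing unit 2030 can be used to perform S1430 of the method 1400.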
具体地,该装置包括:收发单元2010,用于获取数据和计算任务描述,计算任务描述包括需要对该数据执行的计算;第一处理单元2020,用于根据计算任务描述选择第一算子集合,该第一算子集合中包含多个基础算子;第二处理单元2030,用于根据该第一算子集合中的各个基础算子的依赖关系,执行该第一算子集合对该数据进行计算,得到该数据的计算结果,其中存在依赖关系的两个基础算子中的一个基础算子的输入操作数的占位符与另一个基础算子的输出操作数的占位符相交。
在一些可能的实现方式中,该第二处理单元2030还用于:根据该第一算子集合中各个基础算子的依赖关系,确定该第一算子集合中的第一基础算子,该第一基础算子为输入占位符与该第一算子集合中的任一个基础算子的输出占位符集合的交集为空的基础算子;将数据存储在第一基础算子的输入占位符集合对应的存储空间;对数据执行该第一基础算子并根据该第一算子集合中各个基础算子的依赖关系执行该第一算子集合中除该第一基础算子以外的其他基础算子。
在一些可能的实现方式中,该第一算子集合和该第一算子集合中的各个基础算子的依赖关系是根据N个基础算子集合确定的,该N个基础算子集合与N个高阶算子一一对应,对该数据执行计算能够通过该N个高阶算子实现,该第一基础算子集合是通过对第一高阶算子展开得到的,该第一高阶算子是该N个高阶算子中的任一个高阶算子,该第一基础算子集合是该N个基础算子集合中对应于该第一高阶算子的基础算子集合;该第一基础算子集合的输出占位符与该第一高阶算子的输出占位符相同。
在一些可能的实现方式中,该N个基础算子集合与N个依赖关系信息一一对应,该N个依赖关系信息用于指示对应的基础算子集合中的各个基础算子的依赖关系,该第一算子集合和该第一算子集合中的各个基础算子的依赖关系是通过以下方式得到的:根据该N个依赖关系信息,确定目标依赖关系信息,该目标依赖关系信息用于指示多个基础算子中各个基础算子的依赖关系;根据该目标依赖关系信息和该N个基础算子集合包括的基础算子,确定该第一算子集合,目标依赖关系信息为该第一算子集合中的各个基础算子的依赖关系。
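In some possible implementations, the output placeholder of the first basic operator set is made the same as the output placeholder of the first higher-order operator by an absorption operation. The absorption operation includes: determining the output placeholder of the first basic operator set according to first dependency information, where the first dependency information indicates the dependencies of the basic operators in the first basic operator set; determining whether the output placeholder of the first basic operator set is the same as the output placeholder of the first higher-order operator; and, if they are not the same, pointing the pointer of the output operand of the first basic operator set to the output placeholder of the first higher-order operator and updating second identification information to first identification information, where the first identification information is the identification information of the output operand of the first basic operator set and the second identification information is the identification information of the output operand of the first higher-order operator.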
在一些可能的实现方式中,该吸收操作,还包括:在更新该第二标识信息为该第一标识信息后,在该第一高阶算子的输出操作数的指针被指针同步器访问时,判断该指针同步器的标识信息是否等于该第二标识信息,当该指针同步器的标识信息不等于该第二标识信息时,将该指针同步器的指针更新为该第一基础算子集合的输出操作数的指针,将该指针同步器的标识信息更新为该第二标识信息。
在一些可能的实现方式中,该第一基础算子集合中的各个基础算子的依赖关系是通过以下方式确定的:确定第二基础算子的输入占位符是否与第一输出占位符集合中的至少一个输出占位符相交,该第一输出占位符集合包括该第一基础算子集合中所有基础算子的输出占位符,其中该第二基础算子为该第一基础算子集合中的任一个基础算子;若该第二基础算子的输入占位符与该第一输出占位符集合中的至少一个输出占位符相交,则确定该至少一个输出占位符对应的基础算子与该第二基础算子存在依赖关系。
在一些可能的实现方式中,第一算子的输出占位符是根据该第一算子的输入操作数和该第一算子的类型确定的,该第一算子是该第一高阶算子展开过程中得到的任一个算子,该第一算子的输出占位符与已分配的输出占位符的交集为空。
图15中的收发单元2010、第一处理单元2020和第二处理单元2030可以由一个计算机设备实现,也可以由多个计算机设备实现。更具体地,处理单元可以由至少一个处理器或处理器相关电路实现,收发单元可以由收发器或收发器相关电路实现,存储单元可以通过至少一个存储器实现。
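FIG. 16 is a schematic block diagram of a data processing apparatus according to an embodiment of this application. The data processing apparatus 2100 shown in FIG. 16 may include a processor 2110, a transceiver 2120, and a memory 2130. The processor 2110, the transceiver 2120, and the memory 2130 are connected through an internal connection path. The memory 2130 is configured to store instructions, and the processor 2110 is configured to execute the instructions stored in the memory 2130 so that the transceiver 2120 receives/sends some parameters. Optionally, the memory 2130 may be coupled to the processor 2110 through an interface or integrated with the processor 2110.
It should be noted that the transceiver 2120 may include, but is not limited to, a transceiver apparatus such as an input/output interface, to implement communication between the apparatus 2100 and other devices or communication networks.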
在实现过程中,上述方法的各步骤可以通过处理器2110中的硬件的集成逻辑电路或者软件形式的指令完成。结合本申请实施例所公开的方法可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器2130,处理器2110读取存储器2130中的信息,结合其硬件完成上述方法的步骤。为避免重复,这里不再详细描述。
还应理解,本申请实施例中,该存储器可以包括只读存储器和随机存取存储器,并向处理器提供指令和数据。处理器的一部分还可以包括非易失性随机存取存储器。例如,处理器还可以存储设备类型的信息。
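An embodiment of this application further provides a computer-readable storage medium. The computer-readable storage medium stores program code that, when run on a computer, causes the computer to perform the method in FIG. 14.
An embodiment of this application further provides a chip, including at least one processor and a memory, where the at least one processor is coupled to the memory and is configured to read and execute instructions in the memory to perform the method in FIG. 14.
This application presents various aspects, embodiments, or features in terms of systems that include multiple devices, components, modules, and the like. It should be understood and appreciated that each system may include additional devices, components, modules, and the like, and/or may not include all of the devices, components, modules, and the like discussed with reference to the drawings. Combinations of these approaches may also be used.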
另外,在本申请实施例中,“示例的”、“例如”等词用于表示作例子、例证或说明。本申请中被描述为“示例”的任何实施例或设计方案不应被解释为比其它实施例或设计方案更优选或更具优势。确切而言,使用示例的一词旨在以具体方式呈现概念。
本申请实施例中,“相应的(corresponding,relevant)”和“对应的(corresponding)”有时可以混用,应当指出的是,在不强调其区别时,其所要表达的含义是一致的。
本申请实施例描述的网络架构以及业务场景是为了更加清楚地说明本申请实施例的技术方案,并不构成对于本申请实施例提供的技术方案的限定,本领域普通技术人员可知,随着网络架构的演变和新业务场景的出现,本申请实施例提供的技术方案对于类似的技术问题,同样适用。
在本说明书中描述的参考“一个实施例”或“一些实施例”等意味着在本申请的一个或多个实施例中包括结合该实施例描述的特定特征、结构或特点。由此,在本说明书中的不同之处出现的语句“在一个实施例中”、“在一些实施例中”、“在其他一些实施例中”、“在另外一些实施例中”等不是必然都参考相同的实施例,而是意味着“一个或多个但不是所有的实施例”,除非是以其他方式另外特别强调。术语“包括”、“包含”、“具有”及它们的变形都意味着“包括但不限于”,除非是以其他方式另外特别强调。
本申请中,“至少一个”是指一个或者多个,“多个”是指两个或两个以上。“和/或”,描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:包括单独存在A,同时存在A和B,以及单独存在B的情况,其中A,B可以是单数或者复数。字符“/”一般表示前后关联对象是一种“或”的关系。“以下至少一项(个)”或其类似表达,是指的这些项中的任意组合,包括单项(个)或复数项(个)的任意组合。例如,a,b,或c中的至少一项(个),可以表示:a,b,c,a-b,a-c,b-c,或a-b-c,其中a,b,c可以是单个,也可以是多个。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
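A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is merely a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.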
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(Read-Only Memory,ROM)、随机存取存储器(Random Access Memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (17)

  1. 一种数据处理的方法,其特征在于,包括:
    计算平台获取数据和计算任务描述,所述计算任务描述包括需要对所述数据执行的计算;
    所述计算平台根据所述计算任务描述选择第一算子集合,所述第一算子集合中包含多个基础算子;
    所述计算平台根据所述第一算子集合中的各个基础算子的依赖关系,执行所述第一算子集合对所述数据进行计算,得到所述数据的计算结果,其中存在依赖关系的两个基础算子中的一个基础算子的输入操作数的占位符与另一个基础算子的输出操作数的占位符相交。
  2. 根据权利要求1所述的方法,其特征在于,所述计算平台根据所述第一算子集合中各个基础算子的依赖关系,执行所述第一算子集合对所述数据进行计算,包括:
    所述计算平台根据所述第一算子集合中各个基础算子的依赖关系,确定所述第一算子集合中的第一基础算子,所述第一基础算子为输入占位符与所述第一算子集合中的任一个基础算子的输出占位符的交集为空的基础算子;
    所述计算平台将所述数据存储在所述第一基础算子的输入占位符集合对应的存储空间;
    所述计算平台对所述数据执行所述第一基础算子并根据所述第一算子集合中各个基础算子的依赖关系执行所述第一算子集合中除所述第一基础算子以外的其他基础算子。
  3. 根据权利要求1或2所述的方法,其特征在于,所述第一算子集合和所述第一算子集合中的各个基础算子的依赖关系是根据N个基础算子集合确定的,所述N个基础算子集合与N个高阶算子一一对应,对所述数据执行所述计算能够通过所述N个高阶算子实现,
    第一基础算子集合是通过对第一高阶算子展开得到的,所述第一高阶算子是所述N个高阶算子中的任一个高阶算子,所述第一基础算子集合是所述N个基础算子集合中对应于所述第一高阶算子的基础算子集合;
    所述第一基础算子集合的输出占位符与所述第一高阶算子的输出占位符相同。
  4. 根据权利要求3所述的方法,其特征在于,所述N个基础算子集合与N个依赖关系信息一一对应,N个依赖关系信息用于指示所述对应的基础算子集合中的各个基础算子的依赖关系,
    所述第一算子集合和所述第一算子集合中的各个基础算子的依赖关系是通过以下方式得到的:
    根据所述N个依赖关系信息,确定目标依赖关系信息,所述目标依赖关系信息用于指示多个基础算子中各个基础算子的依赖关系;
    根据所述目标依赖关系信息和所述N个基础算子集合包括的基础算子,确定所述第一算子集合,所述目标依赖关系信息为所述第一算子集合中的各个基础算子的依赖关系。
  5. 根据权利要求3或4所述的方法,其特征在于,所述第一基础算子集合的输出占位符与所述第一高阶算子的输出占位符相同是通过吸收操作实现的,所述吸收操作,包括:
    根据第一依赖关系信息,确定所述第一基础算子集合的输出占位符,所述第一依赖关系信息用于指示所述第一基础算子集合中的各个基础算子的依赖关系;
    确定所述第一基础算子集合的输出占位符与所述第一高阶算子的输出占位符是否相同;
    若所述第一基础算子集合的输出占位符与所述第一高阶算子的输出占位符不相同,将所述第一基础算子集合的输出操作数的指针指向所述第一高阶算子的输出占位符,并将第二标识信息更新为第一标识信息,所述第一标识信息为所述第一基础算子集合的输出操作数的标识信息,所述第二标识信息为所述第一高阶算子的输出操作数的标识信息。
  6. 根据权利要求5所述的方法,其特征在于,所述吸收操作,还包括:
    在所述更新第二标识信息为第一标识信息后,在所述第一高阶算子的输出操作数的指针被指针同步器访问时,判断所述指针同步器的标识信息是否等于所述第二标识信息,当所述指针同步器的所述标识信息不等于所述第二标识信息时,将所述指针同步器的指针更新为所述第一基础算子集合的输出操作数的指针,将所述指针同步器的标识信息更新为所述第二标识信息。
  7. 根据权利要求3至6中任一项所述的方法,其特征在于,所述第一基础算子集合中的各个基础算子的依赖关系是通过以下方式确定的:
    确定第二基础算子的输入占位符是否与第一输出占位符集合中的至少一个输出占位符相交,所述第一输出占位符集合包括所述第一基础算子集合中所有基础算子的输出占位符,其中所述第二基础算子为所述第一基础算子集合中的任一个基础算子;
    若所述第二基础算子的输入占位符与所述第一输出占位符集合中的至少一个输出占位符相交,则确定所述至少一个输出占位符对应的基础算子与所述第二基础算子存在依赖关系。
  8. 根据权利要求3至7中任一项所述的方法,其特征在于,第一算子的输出占位符是根据所述第一算子的输入操作数和所述第一算子的类型确定的,所述第一算子是所述第一高阶算子展开过程中得到的任一个算子,所述第一算子的输出占位符与已分配的输出占位符的交集为空。
  9. 一种计算平台,其特征在于,包括:
    收发单元,用于获取数据和计算任务描述,所述计算任务描述包括需要对所述数据执行的计算;
    第一处理单元,用于根据所述计算任务描述选择第一算子集合,所述第一算子集合中包含多个基础算子;
    第二处理单元,用于根据所述第一算子集合中的各个基础算子的依赖关系,执行所述第一算子集合对所述数据进行计算,得到所述数据的计算结果,其中存在依赖关系的两个基础算子中的一个基础算子的输入操作数的占位符与另一个基础算子的输出操作数的占位符相交。
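  10. The computing platform according to claim 9, wherein the second processing unit executing the first operator set to compute on the data according to the dependencies of the basic operators in the first operator set specifically comprises:
    determining a first basic operator in the first operator set according to the dependencies of the basic operators in the first operator set, wherein the first basic operator is a basic operator whose input placeholder has an empty intersection with the output placeholder of any basic operator in the first operator set;
    storing the data in the storage space corresponding to the input placeholder set of the first basic operator; and
    executing the first basic operator on the data and, according to the dependencies of the basic operators in the first operator set, executing the basic operators in the first operator set other than the first basic operator.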
  11. 根据权利要求9或10所述的计算平台,其特征在于,所述第一算子集合和所述第一算子集合中的各个基础算子的依赖关系是根据N个基础算子集合确定的,所述N个基础算子集合与N个高阶算子一一对应,对所述数据执行所述计算能够通过所述N个高阶算子实现,
    第一基础算子集合是通过对第一高阶算子展开得到的,所述第一高阶算子是所述N个高阶算子中的任一个高阶算子,所述第一基础算子集合是所述N个基础算子集合中对应于所述第一高阶算子的基础算子集合;
    所述第一基础算子集合的输出占位符与所述第一高阶算子的输出占位符相同。
  12. 根据权利要求11所述的计算平台,其特征在于,所述N个基础算子集合与N个依赖关系信息一一对应,N个依赖关系信息用于指示所述对应的基础算子集合中的各个基础算子的依赖关系,
    所述第一算子集合和所述第一算子集合中的各个基础算子的依赖关系是通过以下方式得到的:
    根据所述N个依赖关系信息,确定目标依赖关系信息,所述目标依赖关系信息用于指示多个基础算子中各个基础算子的依赖关系;
    根据所述目标依赖关系信息和所述N个基础算子集合包括的基础算子,确定所述第一算子集合,所述目标依赖关系信息为所述第一算子集合中的各个基础算子的依赖关系。
  13. 根据权利要求11或12所述的计算平台,其特征在于,所述第一基础算子集合的输出占位符与所述第一高阶算子的输出占位符相同是通过吸收操作实现的,所述吸收操作,包括:
    根据第一依赖关系信息,确定所述第一基础算子集合的输出占位符,所述第一依赖关系信息用于指示所述第一基础算子集合中的各个基础算子的依赖关系;
    确定所述第一基础算子集合的输出占位符与所述第一高阶算子的输出占位符是否相同;
    若所述第一基础算子集合的输出占位符与所述第一高阶算子的输出占位符不相同,将所述第一基础算子集合的输出操作数的指针指向所述第一高阶算子的输出占位符,并将第二标识信息更新为第一标识信息,所述第一标识信息为所述第一基础算子集合的输出操作数的标识信息,所述第二标识信息为所述第一高阶算子的输出操作数的标识信息。
  14. 根据权利要求13所述的计算平台,其特征在于,所述吸收操作,还包括:
    在所述更新第二标识信息为第一标识信息后,在所述第一高阶算子的输出操作数的指针被指针同步器访问时,判断所述指针同步器的标识信息是否等于所述第二标识信息,当所述指针同步器的所述标识信息不等于所述第二标识信息时,将所述指针同步器的指针更新为所述第一基础算子集合的输出操作数的指针,将所述指针同步器的标识信息更新为所述第二标识信息。
  15. 根据权利要求11至14中任一项所述的计算平台,其特征在于,所述第一基础算子集合中的各个基础算子的依赖关系是通过以下方式确定的:
    确定第二基础算子的输入占位符是否与第一输出占位符集合中的至少一个输出占位符相交,所述第一输出占位符集合包括所述第一基础算子集合中所有基础算子的输出占位符,其中所述第二基础算子为所述第一基础算子集合中的任一个基础算子;
    若所述第二基础算子的输入占位符与所述第一输出占位符集合中的至少一个输出占位符相交,则确定所述至少一个输出占位符对应的基础算子与所述第二基础算子存在依赖关系。
  16. 根据权利要求11至15中任一项所述的计算平台,其特征在于,第一算子的输出占位符是根据所述第一算子的输入操作数和所述第一算子的类型确定的,所述第一算子是所述第一高阶算子展开过程中得到的任一个算子,所述第一算子的输出占位符与已分配的输出占位符的交集为空。
  17. 一种计算机可读存储介质,其特征在于,其上存储有计算机程序,所述计算机程序被计算机执行时,以使得实现如权利要求1至8中任一项所述的方法。
PCT/CN2022/134250 2021-12-30 2022-11-25 数据处理的方法和计算平台 WO2023124677A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111681763.3 2021-12-30
CN202111681763.3A CN116415271A (zh) 2021-12-30 2021-12-30 数据处理的方法和计算平台

Publications (1)

Publication Number Publication Date
WO2023124677A1 true WO2023124677A1 (zh) 2023-07-06

Family

ID=86997539

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/134250 WO2023124677A1 (zh) 2021-12-30 2022-11-25 数据处理的方法和计算平台

Country Status (2)

Country Link
CN (1) CN116415271A (zh)
WO (1) WO2023124677A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117349868A (zh) * 2023-12-04 2024-01-05 粤港澳大湾区数字经济研究院(福田) 基于gpu的全同态加解密方法、电子设备和存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110888720A (zh) * 2019-10-08 2020-03-17 北京百度网讯科技有限公司 任务处理方法、装置、计算机设备及存储介质
CN112381211A (zh) * 2020-11-20 2021-02-19 西安电子科技大学 基于异构平台执行深度神经网络的系统及方法
CN112947933A (zh) * 2021-02-24 2021-06-11 上海商汤智能科技有限公司 一种算子的执行方法、装置、计算机设备及存储介质

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117349868A (zh) * 2023-12-04 2024-01-05 粤港澳大湾区数字经济研究院(福田) 基于gpu的全同态加解密方法、电子设备和存储介质
CN117349868B (zh) * 2023-12-04 2024-04-12 粤港澳大湾区数字经济研究院(福田) 基于gpu的全同态加解密方法、电子设备和存储介质

Also Published As

Publication number Publication date
CN116415271A (zh) 2023-07-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22913935

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2022913935

Country of ref document: EP

ENP Entry into the national phase

Ref document number: 2022913935

Country of ref document: EP

Effective date: 20240613