CN112307499A - Mining method for frequent item set of encrypted data in cloud computing - Google Patents


Info

Publication number
CN112307499A
Authority
CN
China
Prior art keywords
data
mining
encrypted
ciphertext
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011193510.7A
Other languages
Chinese (zh)
Other versions
CN112307499B (en
Inventor
程梓岩
郑培嘉
陈梓阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
Original Assignee
Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sun Yat Sen University
Priority to CN202011193510.7A (patent CN112307499B)
Publication of CN112307499A
Application granted
Publication of CN112307499B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 Query processing support for facilitating data mining operations in structured databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a method for mining frequent item sets of encrypted data in cloud computing. It addresses the problem that existing methods for mining frequent item sets of encrypted data cannot simultaneously ensure the correctness of the mining result, privacy security, and efficiency. A user generates a fully homomorphic encryption private key and a bootstrap key; the user retains the private key and also transmits it to a data mining party through a secure channel, and sends the bootstrap key to a cloud server. The data are encrypted with the private key and uploaded to the cloud server. The data mining party submits a query requirement to the cloud server; after receiving it, the cloud server mines the data by homomorphic operations and transmits the encrypted mining result to the data mining party. The data mining party decrypts the result with the fully homomorphic encryption private key and then determines the frequent item set, so that both the correctness of the mining result and privacy security are ensured.

Description

Mining method for frequent item set of encrypted data in cloud computing
Technical Field
The invention relates to the technical field of encrypted domain data processing, in particular to a mining method of an encrypted data frequent item set in cloud computing.
Background
With the development of cloud computing, many cloud service providers may provide cloud storage and computing resources that are easily accessible. According to the requirements of users, cloud service providers can provide specified services for the users, wherein a series of data mining algorithms are included, and the cloud service application programs are widely used in practice. However, the data submitted by the user may contain extremely sensitive information (e.g., personal location, medical information, or business data, etc.) that the user does not want to disclose. Therefore, mining these private data inevitably brings about a significant privacy disclosure problem. Under such circumstances, in recent years, privacy preserving data mining has attracted a wide attention with the aim of mining databases without accessing the original content of the data.
In practical application, association rule mining is a data mining method for discovering potential relationships among variables in a large-scale data set, frequent item set mining is a sub-process of association rule mining, and all frequent item sets must be found through some frequent item set mining algorithms before association rules are generated.
On 31 August 2018, a Chinese patent (publication No. CN108475292A) disclosed a frequent item set mining method for large-scale data sets. The method first estimates a sample capacity and collects a sample data set of that capacity from the large-scale data set. It then mines the closed frequent item sets in the sample data set and calculates the maximum length constraint corresponding to the large-scale data set, in order to generate a reduced data set. Noise is constructed based on the reduced data set, a candidate set is selected using the noise and a noise threshold, privacy protection is applied to the candidate set using the noise, and finally a preset number of frequent item sets is selected from the candidate set. This method reduces the computational intensity of frequent item set mining on large-scale data sets and protects the privacy of the mined data. However, because it protects the candidate set with noise in order to avoid leaking private data, it cannot guarantee the correctness of the mining result, and it is not efficient.
Disclosure of Invention
To solve the problem that existing methods for mining frequent item sets of encrypted data cannot simultaneously ensure the correctness of the mining result, privacy security, and efficiency, the invention provides a method for mining frequent item sets of encrypted data in cloud computing. The method completes the mining of frequent item sets on encrypted data in the cloud and returns a correct encrypted mining result without revealing private data.
To achieve this technical effect, the technical scheme of the invention is as follows:
A method for mining frequent item sets of encrypted data in cloud computing comprises at least the following steps:
S1. A user generates a fully homomorphic encryption private key and a bootstrap key; the user retains the private key and also transmits it to a data mining party through a secure channel, and sends the bootstrap key to a cloud server;
S2. The data are encrypted with the private key, and the encrypted data are uploaded to the cloud server;
S3. The data mining party submits a query requirement to the cloud server according to its service requirement; after receiving the query requirement, the cloud server mines the data by homomorphic operations and transmits the encrypted mining result to the data mining party;
S4. The data mining party decrypts the result with the fully homomorphic encryption private key to obtain the mining result and then determines the frequent item set.
In this technical scheme, the whole mining method is built on a three-party model consisting of the user, the cloud server, and the data mining party. The user and the data mining party hold the encryption private key, and the cloud server holds the bootstrap key needed during mining. The user encrypts the data with the private key and uploads them to the cloud server, which performs the mining computation after receiving a query requirement from the data mining party. Because no interactive computation with other servers is needed and all mining is completed on the cloud server alone, efficiency is improved. Because the cloud server only performs homomorphic operations on encrypted data and never decrypts, privacy security is improved. In addition, the scheme avoids the traditional approach of protecting privacy by means such as added noise or a third party, so the accuracy of the mining result is guaranteed.
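As a point of reference, the quantity that the cloud server evaluates under encryption can be written in a few lines of plaintext Python. This is only a sketch of the underlying frequent item set test, not the encrypted protocol. The function names and the sample transactions are hypothetical, and the sketch assumes the usual convention that an item set is frequent when its support count is at least the minimum support threshold.

```python
from typing import Iterable, Set


def support_count(transactions: Iterable[Set[str]], query: Set[str]) -> int:
    """Count the transactions that contain every item of the query."""
    return sum(1 for t in transactions if query <= t)


def is_frequent(transactions: Iterable[Set[str]], query: Set[str], minsup: int) -> bool:
    """Assumed convention: frequent means support count >= minsup."""
    return support_count(transactions, query) >= minsup


# Hypothetical sample data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "beer"},
    {"bread", "milk", "beer"},
    {"milk"},
]
print(support_count(transactions, {"bread", "milk"}))
print(is_frequent(transactions, {"bread", "milk"}, minsup=2))
```

In the method described here, the same subset test, count, and threshold comparison are carried out gate by gate on ciphertexts, so the cloud server never sees the transactions, the query, or the threshold.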
Preferably, the process of generating the fully homomorphic encryption private key and the bootstrap key in step S1 is as follows:
Given a security parameter $\lambda$, the encryption private key SK and the bootstrapping key BK are generated as:
$\{SK, BK\} \leftarrow \mathrm{TFHE.KeyGen}(1^{\lambda})$
where TFHE.KeyGen denotes the procedure that generates the encryption private key SK and the bootstrapping key BK.
The private key SK is sent to all users and data mining parties through a secure channel and is used for encrypting data; the bootstrap key BK is sent to the cloud server for homomorphic evaluation of ciphertexts in the encrypted domain.
Preferably, the data in step S2 include transaction data and mining parameter data. The transaction data are encoded into a Boolean matrix and then encrypted with the fully homomorphic encryption private key, as follows:
Each user holds transaction data, and the i-th user holds a set of transactions:
$T_i = \{T_{i1}, \dots, T_{im_i}\}$
where $m_i$ is the number of transactions. The i-th user encrypts the transaction set bit by bit, i.e.
$C_{ij}(k) \leftarrow \mathrm{TFHE.Enc}(SK, T_{ij}(k))$,
where $T_{ij}(k)$ denotes the k-th element of transaction $T_{ij}$, $C_{ij}(k)$ denotes the ciphertext of that element under the encryption private key SK, and j is a positive integer not exceeding $m_i$. The i-th user sends the encrypted transaction set
$C_i = \{C_{i1}, \dots, C_{im_i}\}$
to the cloud server, where the encrypted transaction set of all users is denoted $C = \{C_1, \dots, C_m\}$;
The mining parameter is the minimum support threshold, an unsigned integer. It is encoded into a binary vector, encrypted into minsupCtxt with the fully homomorphic encryption private key, and transmitted to the cloud server. minsupCtxt is transmitted only once, before the data mining party submits a query requirement, and because it is encrypted before being uploaded to the cloud server, privacy is preserved during the mining process.
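The encoding step that precedes encryption can be sketched as follows. This is an illustrative plaintext sketch only: the helper names, the item ordering, and the sample data are hypothetical. The most-significant-bit-first order of the threshold vector is an assumption, chosen so that index k is the least significant bit, which matches a counting loop that runs from index k down to 1.

```python
from typing import List, Sequence, Set


def encode_transactions(transactions: Sequence[Set[str]], items: Sequence[str]) -> List[List[int]]:
    """Encode each transaction as one Boolean row: bit k is 1 if item k is present."""
    return [[1 if item in t else 0 for item in items] for t in transactions]


def encode_unsigned(value: int, width: int) -> List[int]:
    """Encode an unsigned integer as a fixed-width bit vector, MSB first (assumed order)."""
    if value < 0 or value >= (1 << width):
        raise ValueError("value does not fit in the requested width")
    return [(value >> (width - 1 - i)) & 1 for i in range(width)]


# Hypothetical item universe and transactions.
items = ["bread", "milk", "beer"]
matrix = encode_transactions([{"bread", "milk"}, {"beer"}], items)
minsup_bits = encode_unsigned(5, 4)
print(matrix)
print(minsup_bits)
```

Each bit of `matrix` and `minsup_bits` would then be encrypted separately with the private key before upload; a query is encoded in the same way as a single transaction row.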
Preferably, after step S2 and before step S3, the method further comprises encrypting the query requirement of the data mining party bit by bit, as follows:
The query requirement is written as a Boolean vector q of length n and is encrypted as:
$\mathrm{queryCtxt} \leftarrow (\mathrm{TFHE.Enc}(SK, q_1), \dots, \mathrm{TFHE.Enc}(SK, q_n))$
where $q_i$ denotes the i-th bit of the Boolean vector q; the encrypted query ciphertext vector queryCtxt is transmitted to the cloud server.
Preferably, after the cloud server receives the query requirement in step S3, the process of mining the data comprises:
S31. Initialization: the cloud server initializes an encrypted counter, a Boolean vector accCtxt whose elements are all ciphertexts of 0 and whose length is
Figure BDA0002753402200000035
S32. The encrypted transaction set $C = \{C_1, \dots, C_m\}$, the query ciphertext vector queryCtxt, and the ciphertext vector minsupCtxt of the unsigned integer minimum support threshold are taken as the input of the cloud server; for i from 1 to m, the following calculations are performed;
S33. The secure subset determination algorithm is used to obtain the ciphertext
Figure BDA0002753402200000031
which indicates whether transaction $C_i$ contains the encrypted query vector queryCtxt, where SecSubDet denotes the secure subset determination algorithm;
S34. The secure counting algorithm SecAccum is used to add the ciphertext
Figure BDA0002753402200000032
to the counter:
Figure BDA0002753402200000033
S35. Determine whether the number of updates has reached m; if not, return to step S33; if so, use the secure comparison algorithm SecCmp to compare the counter accCtxt with the ciphertext vector minsupCtxt of the unsigned integer minimum support threshold, according to:
Figure BDA0002753402200000041
where
Figure BDA0002753402200000042
is the ciphertext of the comparison result, an encrypted Boolean value that indicates whether the query represented by queryCtxt is a frequent item set; it is the mining result.
While mining the data, the cloud server applies, in order, the secure subset determination, secure counting, and secure comparison algorithms to the encrypted data (the encrypted query vector and the transaction data) in the encrypted domain. These algorithms are built on fully homomorphic encryption based on the torus extension of the learning with errors (LWE) hard problem, which provides a high level of security.
Preferably, the process in step S33 of obtaining the ciphertext
Figure BDA0002753402200000043
with the secure subset determination algorithm is as follows:
S331. Let the query ciphertext vector be $\mathrm{queryCtxt} = (c_{x,1}, \dots, c_{x,n})$ and the encrypted transaction be $C_i = (c_{y,1}, \dots, c_{y,n})$, corresponding respectively to the plaintext query vector $q = (q_1, \dots, q_n)$ and the transaction $T_i = (T_1, \dots, T_n)$;
S332. Initialize the ciphertext
Figure BDA0002753402200000044
as an encryption of 1 under the fully homomorphic encryption scheme based on the torus extension of learning with errors; for i from 1 to n, the following calculations are performed;
S333. Calculate $c = \mathrm{HomORNY}(c_{x,i}, c_{y,i})$, then use c and the HomAND operation to update the ciphertext
Figure BDA0002753402200000045
The update formula is:
Figure BDA0002753402200000046
S334. Determine whether the number of updates has reached n; if so, output the updated ciphertext
Figure BDA0002753402200000047
Otherwise, return to step S333;
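The logic of steps S331 to S334 can be sketched on plaintext bits, with ordinary Boolean functions standing in for the homomorphic gates. This is a hedged illustration rather than the patented implementation: the Python names are hypothetical, and HomORNY is assumed to mean (NOT a) OR b, following the gate naming used by the open-source TFHE library. In a real deployment each gate call would act on ciphertexts and use the bootstrap key.

```python
from typing import Sequence


def hom_orny(a: int, b: int) -> int:
    """Plaintext stand-in for HomORNY, assumed to compute (NOT a) OR b."""
    return (1 - a) | b


def hom_and(a: int, b: int) -> int:
    """Plaintext stand-in for HomAND."""
    return a & b


def sec_sub_det(query_bits: Sequence[int], transaction_bits: Sequence[int]) -> int:
    """Return 1 if every item set in the query is also set in the transaction."""
    result = 1  # S332: start from (an encryption of) 1
    for q, t in zip(query_bits, transaction_bits):
        c = hom_orny(q, t)            # S333: q_i implies T_i
        result = hom_and(result, c)   # fold into the running conjunction
    return result


print(sec_sub_det([1, 0, 1], [1, 1, 1]))  # query {item 1, item 3} is contained
print(sec_sub_det([1, 0, 1], [1, 1, 0]))  # item 3 is missing
```

The loop uses two gates per item, so the cost of one subset test grows linearly with the number of items n.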
Preferably, the process in step S34 by which the secure counting algorithm SecAccum adds the ciphertext
Figure BDA0002753402200000048
to the counter is as follows:
S341. Let the ciphertext vector $\mathrm{accCtxt} = (\mathrm{accCtxt}_1, \dots, \mathrm{accCtxt}_k)$ and the ciphertext
Figure BDA0002753402200000049
represent, respectively, the binary representation of the integer accumulator and the result of the secure subset determination algorithm;
S342. Initialize the ciphertext carry as
Figure BDA00027534022000000410
under the fully homomorphic encryption scheme based on the torus extension of learning with errors; for i decreasing from k to 1, the following calculation is performed;
S342. Calculate
Figure BDA00027534022000000411
and then update carry using the HomAND operation, according to:
$\mathrm{carry} = \mathrm{HomAND}(\mathrm{carry}, \mathrm{accCtxt}_i)$;
S343. Determine whether the number of updates has reached k; if so, output the ciphertext vector
Figure BDA0002753402200000051
Otherwise, returning to step S342;
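A plaintext sketch of the counting step follows. The formula for the updated counter bit appears only as an equation image in the source, so the sketch assumes the standard incrementer: the new bit is HomXOR of the old bit and the carry, and the carry is updated with the old bit, as in the HomAND formula above. The function names are hypothetical, and the bits are stored most significant first so that the loop runs from index k down to 1.

```python
from typing import List, Sequence


def hom_xor(a: int, b: int) -> int:
    """Plaintext stand-in for a homomorphic XOR gate."""
    return a ^ b


def hom_and(a: int, b: int) -> int:
    """Plaintext stand-in for HomAND."""
    return a & b


def sec_accum(acc_bits: Sequence[int], bit: int) -> List[int]:
    """Add a single bit to a fixed-width counter stored MSB first."""
    carry = bit               # the carry starts as the subset-test result
    out = list(acc_bits)
    for i in range(len(acc_bits) - 1, -1, -1):   # from the last index (LSB) down to the first
        out[i] = hom_xor(acc_bits[i], carry)      # assumed form of the bit update
        carry = hom_and(carry, acc_bits[i])       # carry = HomAND(carry, accCtxt_i), old bit
    return out


print(sec_accum([0, 1, 1], 1))  # 3 + 1 = 4
print(sec_accum([0, 1, 1], 0))  # unchanged
```

Because the counter has fixed width, an increment past the largest value wraps around to zero in this sketch, so the width must be large enough for the number of transactions m.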
Preferably, the process in step S35 of comparing the counter accCtxt with the ciphertext vector minsupCtxt of the unsigned integer minimum support threshold using the secure comparison algorithm SecCmp is as follows:
S351. Let the counter be $\mathrm{accCtxt} = (c_{x,1}, \dots, c_{x,k})$ and the ciphertext vector of the unsigned integer minimum support threshold be $\mathrm{minsupCtxt} = (c_{y,1}, \dots, c_{y,k})$, with corresponding plaintext vectors $x = (x_1, \dots, x_k)$ and $y = (y_1, \dots, y_k)$;
S352. Initialize
Figure BDA0002753402200000059
as $\mathrm{HomANDNY}(c_{x,1}, c_{y,1})$; for i from 2 to k, the following calculations are performed;
S353. Initialize the current state
Figure BDA00027534022000000510
Then, for j from 1 to i-1, the following calculations are performed;
S354. Calculate $c = \mathrm{HomXNOR}(c_{x,j}, c_{y,j})$, then update
Figure BDA0002753402200000052
to the value
Figure BDA0002753402200000053
S355. Determine whether the current value of j has reached i-1; if not, return to step S354; if so, update
Figure BDA0002753402200000054
S356. Determine whether the current value of i has reached k; if not, return to step S353; if so, output
Figure BDA0002753402200000055
the ciphertext of the comparison result b, where b = 1 indicates x < y and b = 0 indicates x ≥ y.
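The comparison in steps S351 to S356 can be sketched on plaintext bits as below. Several intermediate formulas are equation images in the source, so the details are assumptions: the current state for position i is taken to start as HomANDNY of the two bits at position i, each HomXNOR result is folded in with HomAND, and the per-position states are combined with a homomorphic OR. HomANDNY is assumed to mean (NOT a) AND b. All Python names are hypothetical, and the bit vectors are most significant bit first.

```python
from typing import Sequence


def hom_andny(a: int, b: int) -> int:
    """Plaintext stand-in for HomANDNY, assumed to compute (NOT a) AND b."""
    return (1 - a) & b


def hom_xnor(a: int, b: int) -> int:
    """Plaintext stand-in for HomXNOR: 1 when the two bits are equal."""
    return 1 - (a ^ b)


def hom_and(a: int, b: int) -> int:
    return a & b


def hom_or(a: int, b: int) -> int:
    return a | b


def sec_cmp(x_bits: Sequence[int], y_bits: Sequence[int]) -> int:
    """Return 1 if x < y and 0 if x >= y, for equal-length MSB-first bit vectors."""
    result = hom_andny(x_bits[0], y_bits[0])          # S352: decided at the top bit
    for i in range(1, len(x_bits)):
        state = hom_andny(x_bits[i], y_bits[i])       # S353 (assumed): x_i = 0 and y_i = 1
        for j in range(i):                            # S354: all higher bits must be equal
            state = hom_and(state, hom_xnor(x_bits[j], y_bits[j]))
        result = hom_or(result, state)                # S355 (assumed): combine with OR
    return result


print(sec_cmp([0, 1, 0], [0, 1, 1]))  # 2 < 3
print(sec_cmp([1, 0, 0], [0, 1, 1]))  # 4 >= 3
```

The nested loop evaluates on the order of k squared gates for k-bit operands, which is acceptable here because the comparison runs only once per query, after all m transactions have been counted.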
Preferably, in step S4 the data mining party decrypts with the fully homomorphic encryption private key and determines the frequent item set after obtaining the mining result, as follows:
The data mining party decrypts the mining result with the encryption private key SK, according to:
Figure BDA0002753402200000056
where
Figure BDA0002753402200000057
denotes the decryption result, a Boolean value; if
Figure BDA0002753402200000058
equals 1, the query submitted by the data mining party is a frequent item set; otherwise, it is not a frequent item set.
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:
The invention provides a method for mining frequent item sets of encrypted data in cloud computing. The user and the data mining party hold the encryption private key, and the cloud server holds the bootstrap key needed during mining. The user encrypts the data with the private key and uploads them to the cloud server, which performs the mining computation after receiving a query requirement from the data mining party. Because no interactive computation with other servers is needed and all mining is completed on the cloud server alone, efficiency is improved. Because the cloud server only performs homomorphic operations on encrypted data and never decrypts, privacy security is improved. In addition, the scheme avoids the traditional approach of protecting privacy by means such as added noise or a third party, so the accuracy of the mining result is ensured.
Drawings
Fig. 1 is a schematic flow chart of the method for mining frequent item sets of encrypted data in cloud computing according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the running time of the method in an embodiment of the present invention when the number n of items is fixed and the number m of transactions varies.
Detailed Description
The drawings are for illustrative purposes only and are not to be construed as limiting the patent;
for better illustration of the present embodiment, certain parts of the drawings may be omitted, enlarged or reduced, and do not represent actual dimensions;
it will be understood by those skilled in the art that certain well-known descriptions of the figures may be omitted.
The technical solution of the present invention is further described below with reference to the accompanying drawings and examples.
Example 1
Fig. 1 is a schematic flow diagram of the method for mining frequent item sets of encrypted data in cloud computing. Referring to Fig. 1, the method comprises the following steps:
S1. A user generates a fully homomorphic encryption private key and a bootstrap key; the user retains the private key and also transmits it to a data mining party through a secure channel, and sends the bootstrap key to a cloud server;
S2. The data are encrypted with the private key, and the encrypted data are uploaded to the cloud server;
S3. The data mining party submits a query requirement to the cloud server according to its service requirement; after receiving the query requirement, the cloud server mines the data by homomorphic operations and transmits the encrypted mining result to the data mining party;
S4. The data mining party decrypts the result with the fully homomorphic encryption private key to obtain the mining result and then determines the frequent item set.
In this embodiment, the process of generating the fully homomorphic encryption private key and the bootstrap key in step S1 is as follows:
Given a security parameter $\lambda$, the encryption private key SK and the bootstrapping key BK are generated as:
$\{SK, BK\} \leftarrow \mathrm{TFHE.KeyGen}(1^{\lambda})$
where TFHE.KeyGen denotes the procedure that generates the encryption private key SK and the bootstrapping key BK. The private key SK is sent to all users and data mining parties through a secure channel and is used for encrypting data; the bootstrap key BK is sent to the cloud server for homomorphic evaluation of ciphertexts in the encrypted domain.
The cryptosystem used is fully homomorphic encryption based on the torus extension of the learning with errors (LWE) hard problem. The details of the underlying LWE scheme are as follows:
(1) Key generation: the private key $s \in \mathbb{B}^n$ of the LWE encryption scheme is a uniformly random binary vector.
(2) Encryption: the plaintext to be encrypted is $\mu \in \mathbb{T}$, and the ciphertext is $c = (a, b)$, where $a \in \mathbb{T}^n$ is a uniformly randomly sampled vector and $b = a \cdot s + e + \mu$, with e being noise.
(3) Decryption: given the ciphertext $c = (a, b)$ and the key s, compute $\mu + e = b - a \cdot s$. If the growth of the noise e is kept within a certain range, the plaintext $\mu$ can be decrypted correctly.
(4) Bootstrapping: unlike a partially homomorphic cryptosystem, in the TFHE fully homomorphic cryptosystem, for a given LWE ciphertext $c = (a, b)$, the bootstrapping algorithm can construct a ciphertext of the same plaintext under the same key s whose noise level is fixed. In other words, bootstrapping refreshes the noise in the ciphertext. The bootstrapping of a ciphertext c is denoted Bootstrap(c).
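Items (1) to (3) can be illustrated with a toy LWE scheme in Python. This is a teaching sketch under loud assumptions, not TFHE and not secure: the torus is approximated by integers modulo Q, the dimension N, the modulus Q, and the noise bound are arbitrary small values, and a bit is encoded as 0 or Q/2 rather than with the encoding a real TFHE implementation uses. Bootstrapping is not modelled.

```python
import random
from typing import List, Tuple

Q = 1 << 16   # toy modulus standing in for the torus (assumption)
N = 32        # toy key dimension (assumption, far too small for security)

Ciphertext = Tuple[List[int], int]


def keygen(rng: random.Random) -> List[int]:
    """(1) Key generation: a uniformly random binary vector s."""
    return [rng.randrange(2) for _ in range(N)]


def encrypt(s: List[int], bit: int, rng: random.Random, noise: int = 16) -> Ciphertext:
    """(2) Encryption: b = a.s + e + mu, with mu = bit * Q/2 and small noise e."""
    a = [rng.randrange(Q) for _ in range(N)]
    e = rng.randint(-noise, noise)
    mu = bit * (Q // 2)
    b = (sum(ai * si for ai, si in zip(a, s)) + e + mu) % Q
    return a, b


def decrypt(s: List[int], ct: Ciphertext) -> int:
    """(3) Decryption: recover mu + e = b - a.s, then round away the noise."""
    a, b = ct
    phase = (b - sum(ai * si for ai, si in zip(a, s))) % Q
    return 1 if Q // 4 <= phase < 3 * Q // 4 else 0


rng = random.Random(2020)
sk = keygen(rng)
print(decrypt(sk, encrypt(sk, 1, rng)), decrypt(sk, encrypt(sk, 0, rng)))
```

Decryption succeeds here because the noise stays far below Q/4. Each homomorphic operation on real ciphertexts increases the noise, which is why item (4), bootstrapping, is needed to reset it before it grows past the rounding margin.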
In this embodiment, the data in step S2 include transaction data and mining parameter data. The transaction data are encoded into a Boolean matrix and then encrypted with the fully homomorphic encryption private key, as follows:
Each user holds transaction data, and the i-th user holds a set of transactions:
$T_i = \{T_{i1}, \dots, T_{im_i}\}$
where $m_i$ is the number of transactions. The i-th user encrypts the transaction set bit by bit, i.e.
$C_{ij}(k) \leftarrow \mathrm{TFHE.Enc}(SK, T_{ij}(k))$,
where $T_{ij}(k)$ denotes the k-th element of transaction $T_{ij}$, $C_{ij}(k)$ denotes the ciphertext of that element under the encryption private key SK, and j is a positive integer not exceeding $m_i$. The i-th user sends the encrypted transaction set
$C_i = \{C_{i1}, \dots, C_{im_i}\}$
to the cloud server, where the encrypted transaction set of all users is denoted $C = \{C_1, \dots, C_m\}$;
The mining parameter is the minimum support threshold, an unsigned integer. It is encoded into a binary vector, encrypted into minsupCtxt with the fully homomorphic encryption private key, and transmitted to the cloud server. minsupCtxt is transmitted only once, before the data mining party submits a query requirement, and because it is encrypted before being uploaded to the cloud server, privacy is preserved during the mining process.
In this embodiment, after step S2 and before step S3, the method further comprises encrypting the query requirement of the data mining party bit by bit, as follows:
The query requirement is written as a Boolean vector q of length n and is encrypted as:
$\mathrm{queryCtxt} \leftarrow (\mathrm{TFHE.Enc}(SK, q_1), \dots, \mathrm{TFHE.Enc}(SK, q_n))$
where $q_i$ denotes the i-th bit of the Boolean vector q; the encrypted query ciphertext vector queryCtxt is transmitted to the cloud server.
In this embodiment, after the cloud server receives the query requirement in step S3, the process of mining the data comprises:
S31. Initialization: the cloud server initializes an encrypted counter, a Boolean vector accCtxt whose elements are all ciphertexts of 0 and whose length is
Figure BDA00027534022000000810
S32. The encrypted transaction set $C = \{C_1, \dots, C_m\}$, the query ciphertext vector queryCtxt, and the ciphertext vector minsupCtxt of the unsigned integer minimum support threshold are taken as the input of the cloud server; for i from 1 to m, the following calculations are performed;
S33. The secure subset determination algorithm is used to obtain the ciphertext
Figure BDA0002753402200000081
which indicates whether transaction $C_i$ contains the encrypted query vector queryCtxt, where SecSubDet denotes the secure subset determination algorithm;
S34. The secure counting algorithm SecAccum is used to add the ciphertext
Figure BDA0002753402200000082
to the counter:
Figure BDA0002753402200000083
S35. Determine whether the number of updates has reached m; if not, return to step S33; if so, use the secure comparison algorithm SecCmp to compare the counter accCtxt with the ciphertext vector minsupCtxt of the unsigned integer minimum support threshold, according to:
Figure BDA0002753402200000084
where
Figure BDA0002753402200000085
is the ciphertext of the comparison result, an encrypted Boolean value that indicates whether the query represented by queryCtxt is a frequent item set; it is the mining result;
The process in step S33 of obtaining the ciphertext
Figure BDA0002753402200000086
with the secure subset determination algorithm is as follows:
S331. Let the query ciphertext vector be $\mathrm{queryCtxt} = (c_{x,1}, \dots, c_{x,n})$ and the encrypted transaction be $C_i = (c_{y,1}, \dots, c_{y,n})$, corresponding respectively to the plaintext query vector $q = (q_1, \dots, q_n)$ and the transaction $T_i = (T_1, \dots, T_n)$;
S332. Initialize the ciphertext
Figure BDA0002753402200000087
as an encryption of 1 under the fully homomorphic encryption scheme based on the torus extension of learning with errors; for i from 1 to n, the following calculations are performed;
S333. Calculate $c = \mathrm{HomORNY}(c_{x,i}, c_{y,i})$, then use c and the HomAND operation to update the ciphertext
Figure BDA0002753402200000088
The update formula is:
Figure BDA0002753402200000089
S334. Determine whether the number of updates has reached n; if so, output the updated ciphertext
Figure BDA0002753402200000091
Otherwise, return to step S333;
The process in step S34 by which the secure counting algorithm SecAccum adds the ciphertext
Figure BDA0002753402200000092
to the counter is as follows:
S341. Let the ciphertext vector $\mathrm{accCtxt} = (\mathrm{accCtxt}_1, \dots, \mathrm{accCtxt}_k)$ and the ciphertext
Figure BDA0002753402200000093
represent, respectively, the binary representation of the integer accumulator and the result of the secure subset determination algorithm;
S342. Initialize the ciphertext carry as
Figure BDA0002753402200000094
under the fully homomorphic encryption scheme based on the torus extension of learning with errors; for i decreasing from k to 1, the following calculation is performed;
S342. Calculate
Figure BDA0002753402200000095
and then update carry using the HomAND operation, according to:
$\mathrm{carry} = \mathrm{HomAND}(\mathrm{carry}, \mathrm{accCtxt}_i)$;
S343. Determine whether the number of updates has reached k; if so, output the ciphertext vector
Figure BDA0002753402200000096
Otherwise, returning to step S342;
The process in step S35 of comparing the counter accCtxt with the ciphertext vector minsupCtxt of the unsigned integer minimum support threshold using the secure comparison algorithm SecCmp is as follows:
S351. Let the counter be $\mathrm{accCtxt} = (c_{x,1}, \dots, c_{x,k})$ and the ciphertext vector of the unsigned integer minimum support threshold be $\mathrm{minsupCtxt} = (c_{y,1}, \dots, c_{y,k})$, with corresponding plaintext vectors $x = (x_1, \dots, x_k)$ and $y = (y_1, \dots, y_k)$;
S352. Initialize
Figure BDA0002753402200000097
as $\mathrm{HomANDNY}(c_{x,1}, c_{y,1})$; for i from 2 to k, the following calculations are performed;
S353. Initialize the current state
Figure BDA00027534022000000913
Then, for j from 1 to i-1, the following calculations are performed;
S354. Calculate $c = \mathrm{HomXNOR}(c_{x,j}, c_{y,j})$, then update
Figure BDA0002753402200000098
to the value
Figure BDA0002753402200000099
S355. Determine whether the current value of j has reached i-1; if not, return to step S354; if so, update
Figure BDA00027534022000000910
S356. Determine whether the current value of i has reached k; if not, return to step S353; if so, output
Figure BDA00027534022000000911
the ciphertext of the comparison result b, where b = 1 indicates x < y and b = 0 indicates x ≥ y.
The mining process relies on the homomorphic gates provided by the fully homomorphic encryption scheme; the homomorphic gates used are implemented as follows:
Figure BDA00027534022000000912
Figure BDA0002753402200000101
Figure BDA0002753402200000102
Figure BDA0002753402200000103
Figure BDA0002753402200000104
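Since the gate formulas themselves are only available as equation images, the following Python sketch illustrates one way such gates can be built: a fixed linear combination of the input phases followed by a sign-extracting bootstrap. It works on noiseless torus phases rather than ciphertexts, encodes true as 1/8 and false as -1/8, and uses offset constants recalled from the open-source TFHE library's gate constructions. All of these are assumptions for illustration and are not taken from the patent text.

```python
from fractions import Fraction
from itertools import product

EIGHTH = Fraction(1, 8)
QUARTER = Fraction(1, 4)
HALF = Fraction(1, 2)


def encode(bit: int) -> Fraction:
    """Encode a bit as a torus phase: +1/8 for true, -1/8 for false (assumed encoding)."""
    return EIGHTH if bit else -EIGHTH


def decode(phase: Fraction) -> int:
    return 1 if phase > 0 else 0


def bootstrap(phase: Fraction) -> Fraction:
    """Noise-free stand-in for Bootstrap: reduce to [-1/2, 1/2) and keep only the sign."""
    centered = (phase + HALF) % 1 - HALF
    return EIGHTH if centered > 0 else -EIGHTH


# Each gate is a linear combination of the inputs plus a constant, then a bootstrap.
def hom_and(a, b):   return bootstrap(-EIGHTH + a + b)
def hom_or(a, b):    return bootstrap(EIGHTH + a + b)
def hom_xor(a, b):   return bootstrap(QUARTER + 2 * (a + b))
def hom_xnor(a, b):  return bootstrap(-QUARTER - 2 * (a + b))
def hom_andny(a, b): return bootstrap(-EIGHTH - a + b)   # (NOT a) AND b
def hom_orny(a, b):  return bootstrap(EIGHTH - a + b)    # (NOT a) OR b


for x, y in product((0, 1), repeat=2):
    print(x, y, decode(hom_and(encode(x), encode(y))), decode(hom_orny(encode(x), encode(y))))
```

Because every gate ends in a bootstrap, the output always returns to the fixed encoding of plus or minus 1/8, which is what allows gates to be chained to arbitrary depth in the subset, counting, and comparison algorithms.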
In this embodiment, in step S4 the data mining party decrypts with the fully homomorphic encryption private key and determines the frequent item set after obtaining the mining result, as follows:
The data mining party decrypts the mining result with the encryption private key SK, according to:
Figure BDA0002753402200000105
where
Figure BDA0002753402200000106
denotes the decryption result, a Boolean value; if
Figure BDA0002753402200000107
equals 1, the query submitted by the data mining party is a frequent item set; otherwise, it is not a frequent item set.
The positional relationships depicted in the drawings are for illustrative purposes only and are not to be construed as limiting the present patent. It should be understood that the above-described embodiments of the present invention are merely examples given to illustrate the invention clearly and are not intended to limit its embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments exhaustively here. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the claims of the present invention.

Claims (7)

1. A method for mining frequent item sets of encrypted data in cloud computing, characterized by comprising at least the following steps:
S1. A user generates a fully homomorphic encryption private key and a bootstrap key; the user retains the private key and also transmits it to a data mining party through a secure channel, and sends the bootstrap key to a cloud server;
S2. The data are encrypted with the private key, and the encrypted data are uploaded to the cloud server;
S3. The data mining party submits a query requirement to the cloud server according to its service requirement; after receiving the query requirement, the cloud server mines the data by homomorphic operations and transmits the encrypted mining result to the data mining party;
S4. The data mining party decrypts the result with the fully homomorphic encryption private key to obtain the mining result and then determines the frequent item set.
2. The method for mining the frequent item set of the encrypted data in the cloud computing as claimed in claim 1, wherein the process of generating the fully homomorphic encryption private key and the bootstrapping key in step S1 is as follows:
given a security parameter λ, the encryption private key SK and the bootstrapping key BK are generated as:
$\{SK, BK\} \leftarrow \mathrm{TFHE.KeyGen}(1^{\lambda})$
where TFHE.KeyGen denotes the procedure that generates the encryption private key SK and the bootstrapping key BK.
3. The mining method for the frequent item set of the encrypted data in the cloud computing according to claim 2, wherein the data in the step S2 includes transaction data and mining parameter data, the transaction data is encoded into a boolean matrix and then encrypted by using a fully homomorphic encryption private key, and the process is as follows:
each user holds transaction data, and the ith user holds a set of transaction data:
Figure FDA0002753402190000012
where $m_i$ is the number of transactions; the $i$-th user encrypts the transaction data set bit by bit, namely:
$C_{ij}(k) \leftarrow \mathrm{TFHE.Enc}(SK, T_{ij}(k))$,
where $T_{ij}(k)$ denotes the $k$-th element of transaction $T_{ij}$, $C_{ij}(k)$ denotes the ciphertext of $T_{ij}(k)$ encrypted with the encryption private key SK, and $j$ is a positive integer not exceeding $m_i$; the $i$-th user sends the encrypted transaction set
Figure FDA0002753402190000011
to the cloud server, where the encrypted transaction sets of all users are denoted $C = \{C_1, \ldots, C_m\}$;
the mining parameter data is a minimum support threshold in the form of an unsigned integer; after being encoded as a binary vector, it is encrypted with the fully homomorphic private key into minsupCtxt and transmitted to the cloud server, and minsupCtxt needs to be transmitted only once, before the data mining party submits a query requirement.
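By way of non-limiting illustration, the encoding that precedes encryption in step S2 might look as follows. The item universe, the function names, and the most-significant-bit-first ordering of the threshold bits are assumptions of this sketch and are not specified by the claim; each resulting bit would then be encrypted individually with TFHE.Enc(SK, bit), which is omitted here.

```python
def encode_transaction(items, universe):
    """Encode one transaction as a Boolean row over a fixed, ordered item universe."""
    present = set(items)
    return [1 if item in present else 0 for item in universe]

def encode_uint(value, width):
    """Encode an unsigned integer as a list of bits, most significant bit first (assumed order)."""
    if value < 0 or value >= (1 << width):
        raise ValueError("value does not fit in the given width")
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]

if __name__ == "__main__":
    # Hypothetical item universe and transactions, for illustration only.
    universe = ["bread", "milk", "eggs", "beer"]
    transactions = [["bread", "milk"], ["milk", "eggs", "beer"]]

    matrix = [encode_transaction(t, universe) for t in transactions]
    minsup_bits = encode_uint(3, 4)

    print(matrix)       # Boolean matrix, one row per transaction
    print(minsup_bits)  # binary vector of the minimum support threshold
```

A query requirement can be encoded with the same `encode_transaction` helper, since it is also a Boolean vector over the item universe.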
4. The mining method for the encrypted data frequent item set in the cloud computing as claimed in claim 3, wherein after step S2 and before step S3, the mining method further comprises encrypting the query requirement of the data mining party bit by bit, as follows:
the query requirement is written as a Boolean vector q of length n, and the encryption satisfies:
$queryCtxt \leftarrow (\mathrm{TFHE.Enc}(SK, q_1), \ldots, \mathrm{TFHE.Enc}(SK, q_n))$
where $q_i$ denotes the $i$-th bit of the Boolean vector q; the encrypted query requirement ciphertext vector queryCtxt is transmitted to the cloud server.
5. The method for mining the frequent item set of the encrypted data in the cloud computing according to claim 4, wherein the process by which the cloud server computes and mines the data after receiving the query requirement in step S3 includes:
S31. initialization: the cloud server initializes an encrypted counter Boolean vector accCtxt whose elements are all ciphertexts of 0, the length of the vector being
Figure FDA0002753402190000027
S32. the encrypted transaction set $C = \{C_1, \ldots, C_m\}$, the query requirement ciphertext vector queryCtxt, and the ciphertext vector minsupCtxt of the unsigned-integer minimum support threshold are taken as input to the cloud server; for i from 1 to m, the following computation is performed:
S33. obtain, using the secure subset judgment algorithm, the ciphertext
Figure FDA0002753402190000021
which is used to determine whether transaction $C_i$ contains the encrypted query requirement ciphertext vector queryCtxt, where SecSubDet denotes the secure subset judgment algorithm;
S34. use the secure counting algorithm SecAccum to add the ciphertext
Figure FDA0002753402190000022
to the counter:
Figure FDA0002753402190000023
S35. determine whether the number of updates has reached m; if not, return to step S33; if so, use the secure comparison algorithm SecCmp to compare the counter accCtxt with the ciphertext vector minsupCtxt of the unsigned-integer minimum support threshold, according to the formula:
Figure FDA0002753402190000024
where
Figure FDA0002753402190000025
denotes the ciphertext of the comparison result, an encrypted Boolean value that indicates whether the query requirement ciphertext vector queryCtxt is a frequent item set and that constitutes the computed mining result.
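By way of non-limiting illustration, the control flow of steps S31 to S35 can be sketched on plaintext data, with ordinary integer arithmetic standing in for SecSubDet, SecAccum and SecCmp. The function names are hypothetical. The comparison formula of step S35 is not reproduced in the text, so this sketch simply assumes that "frequent" means the support count is at least the minimum support threshold.

```python
def is_subset(query, transaction):
    """Plaintext stand-in for SecSubDet: 1 if every queried item is in the transaction."""
    return int(all(t >= q for q, t in zip(query, transaction)))

def mine(transactions, query, minsup):
    """Plaintext model of the cloud-side loop; returns 1 if the query is judged frequent."""
    count = 0                                   # S31: counter starts at zero
    for transaction in transactions:            # S32: iterate over all m transactions
        flag = is_subset(query, transaction)    # S33: subset test
        count += flag                           # S34: accumulate the flag
    return int(count >= minsup)                 # S35: compare (direction is an assumption)

if __name__ == "__main__":
    # Hypothetical Boolean transaction matrix over four items.
    transactions = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 1, 1, 1]]
    print(mine(transactions, [1, 1, 0, 0], 2))  # supported by two transactions
    print(mine(transactions, [0, 0, 1, 1], 2))  # supported by one transaction
```

In the claimed method each of these three operations is instead evaluated gate by gate on ciphertexts, so the cloud server learns neither the flags, the count, nor the final result.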
6. The method for mining the frequent item set of the encrypted data in the cloud computing as claimed in claim 5, wherein the process in step S33 of obtaining, using the secure subset judgment algorithm, the ciphertext
Figure FDA0002753402190000026
comprises the following steps:
S331. given the query requirement ciphertext vector $queryCtxt = (c_{x,1}, \ldots, c_{x,n})$ and the encrypted transaction $C_i = (c_{y,1}, \ldots, c_{y,n})$, whose corresponding plaintexts are the query requirement vector $q = (q_1, \ldots, q_n)$ and the transaction $T_i = (T_1, \ldots, T_n)$, respectively;
S332. initialize the ciphertext
Figure FDA0002753402190000031
as an encryption of 1 under the fully homomorphic scheme based on the extension of learning with errors over the torus (TFHE); for i from 1 to n, the following computation is performed:
S333. compute $c = \mathrm{HomORNY}(c_{x,i}, c_{y,i})$, then use $c$ and the HomAND operation to update the ciphertext
Figure FDA0002753402190000032
according to the update formula:
Figure FDA0002753402190000033
S334. determine whether the number of updates has reached n; if so, output the updated ciphertext
Figure FDA0002753402190000034
otherwise, return to step S333;
in step S34, the process by which the secure counting algorithm SecAccum adds the ciphertext
Figure FDA0002753402190000035
to the counter is:
S341. given the ciphertext vector $accCtxt = (accCtxt_1, \ldots, accCtxt_k)$ and the ciphertext
Figure FDA0002753402190000036
which respectively represent the binary representation of the integer accumulator and the result of the secure subset judgment algorithm;
S342. initialize the ciphertext carry as
Figure FDA0002753402190000037
under the fully homomorphic scheme based on the extension of learning with errors over the torus; for i decreasing from k to 1, the following computation is performed:
S342. compute
Figure FDA0002753402190000038
and then update carry using the HomAND operation, according to the update formula:
$carry = \mathrm{HomAND}(carry, accCtxt_i)$;
S343. determine whether the number of updates has reached k; if so, output the ciphertext vector
Figure FDA0002753402190000039
otherwise, return to step S342;
in step S35, the process of comparing the counter accCtxt with the ciphertext vector minsupCtxt of the unsigned-integer minimum support threshold using the secure comparison algorithm SecCmp is:
S351. given the counter $accCtxt = (c_{x,1}, \ldots, c_{x,k})$ and the ciphertext vector $minsupCtxt = (c_{y,1}, \ldots, c_{y,k})$ of the unsigned-integer minimum support threshold, whose corresponding plaintext vectors are $x = (x_1, \ldots, x_k)$ and $y = (y_1, \ldots, y_k)$;
S352. initialize
Figure FDA00027534021900000310
as $\mathrm{HomANDNY}(c_{x,1}, c_{y,1})$; for i from 2 to k, the following computation is performed:
S353. initialize the current state
Figure FDA00027534021900000311
then, for j from 1 to i-1, perform the following computation:
S354. compute $c = \mathrm{HomXNOR}(c_{x,j}, c_{y,j})$, and then update
Figure FDA00027534021900000312
to the value
Figure FDA00027534021900000313
S355. determine whether the current value of j has reached i-1; if not, return to step S354; if so, update
Figure FDA0002753402190000041
S356. determine whether the current value of i has reached k; if not, return to step S353; if so, output
Figure FDA0002753402190000042
which is the ciphertext of the comparison result b: b = 1 indicates x < y, and b = 0 indicates x ≥ y.
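By way of non-limiting illustration, the three gate-level procedures of this claim can be modelled on plaintext bits. Several formulas in the claim are not reproduced in the text, so the following points are assumptions of this sketch rather than the claimed formulas: the S333 update is taken to be res = HomAND(res, c); the carry in S342 is taken to start as the SecSubDet result and each counter bit is taken to be updated with a HomXOR gate; the current state in S353 is taken to start as HomANDNY(c_{x,i}, c_{y,i}); and the S355 update is taken to be a HomOR of the running result with the current state. Index 1 is assumed to be the most significant bit, and the gate conventions (ORNY = (NOT a) OR b, ANDNY = (NOT a) AND b) are likewise assumed.

```python
# Plaintext stand-ins for the homomorphic gates (bits are 0 or 1).
def hom_and(a, b): return a & b
def hom_or(a, b): return a | b
def hom_xor(a, b): return a ^ b
def hom_xnor(a, b): return 1 - (a ^ b)
def hom_orny(a, b): return (1 - a) | b    # assumed: (NOT a) OR b
def hom_andny(a, b): return (1 - a) & b   # assumed: (NOT a) AND b

def sec_sub_det(query, transaction):
    """Return 1 if every item set in `query` is also set in `transaction`."""
    res = 1                                    # S332: start from 1
    for q, t in zip(query, transaction):       # i = 1 .. n
        c = hom_orny(q, t)                     # S333: q_i implies T_i
        res = hom_and(res, c)                  # assumed update formula
    return res

def sec_accum(acc, flag):
    """Add the bit `flag` to the MSB-first binary counter `acc`; returns a new list."""
    carry = flag                               # S342: assumed initial carry
    out = list(acc)
    for i in range(len(acc) - 1, -1, -1):      # i = k down to 1 (last index = LSB)
        out[i] = hom_xor(acc[i], carry)        # assumed sum-bit formula
        carry = hom_and(carry, acc[i])         # carry update from the claim
    return out

def sec_cmp(x, y):
    """Return 1 if x < y and 0 otherwise, for MSB-first bit lists of equal length."""
    res = hom_andny(x[0], y[0])                # S352
    for i in range(1, len(x)):                 # i = 2 .. k
        cur = hom_andny(x[i], y[i])            # S353: assumed initial state
        for j in range(i):                     # j = 1 .. i-1
            c = hom_xnor(x[j], y[j])           # S354: higher bits must be equal
            cur = hom_and(cur, c)              # assumed update formula
        res = hom_or(res, cur)                 # S355: assumed update formula
    return res

if __name__ == "__main__":
    print(sec_sub_det([1, 1, 0, 0], [1, 1, 1, 0]))
    print(sec_accum([0, 1, 1], 1))
    print(sec_cmp([0, 1, 0], [0, 1, 1]))
```

Under these assumptions the comparison uses a number of gates quadratic in k, because the equality of the higher-order bits is recomputed for every bit position; this mirrors the nested loop of steps S353 to S356.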
7. The mining method for the encrypted data frequent item set in the cloud computing according to claim 5 or 6, wherein in step S4 the data mining party decrypts with the fully homomorphic encryption private key and, after obtaining the mining result, confirms the frequent item set as follows:
the data mining party uses the encryption private key SK to decrypt the mining result according to the formula:
Figure FDA0002753402190000043
where
Figure FDA0002753402190000044
denotes the decryption result, which is a Boolean value; if
Figure FDA0002753402190000045
equals 1, the query requirement submitted by the data mining party is a frequent item set; otherwise, it is not.
CN202011193510.7A 2020-10-30 2020-10-30 Mining method for encrypted data frequent item set in cloud computing Active CN112307499B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011193510.7A CN112307499B (en) 2020-10-30 2020-10-30 Mining method for encrypted data frequent item set in cloud computing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011193510.7A CN112307499B (en) 2020-10-30 2020-10-30 Mining method for encrypted data frequent item set in cloud computing

Publications (2)

Publication Number Publication Date
CN112307499A true CN112307499A (en) 2021-02-02
CN112307499B CN112307499B (en) 2024-04-12

Family

ID=74332977

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011193510.7A Active CN112307499B (en) 2020-10-30 2020-10-30 Mining method for encrypted data frequent item set in cloud computing

Country Status (1)

Country Link
CN (1) CN112307499B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002044715A1 (en) * 2000-11-28 2002-06-06 Surromed, Inc. Methods for efficiently mining broad data sets for biological markers
CN103401871A (en) * 2013-08-05 2013-11-20 苏州大学 Method and system for sequencing ciphertexts orienting to homomorphic encryption
CN108183791A (en) * 2017-12-11 2018-06-19 北京航空航天大学 Applied to the Intelligent terminal data safe processing method and system under cloud environment
CN110120873A (en) * 2019-05-08 2019-08-13 西安电子科技大学 Mining Frequent Itemsets based on cloud outsourcing transaction data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
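侯伟; 杨炳儒; 吴晨生; 周谆: "Research on a frequent item set mining algorithm for data streams based on periodic sampling", 高技术通讯 (Chinese High Technology Letters), no. 08 *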

Also Published As

Publication number Publication date
CN112307499B (en) 2024-04-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant