CN113111090B - Multidimensional data query method based on order-preserving encryption - Google Patents

Multidimensional data query method based on order-preserving encryption

Info

Publication number
CN113111090B
Authority
CN
China
Prior art keywords
data
query
dimension
user
interval
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110403024.1A
Other languages
Chinese (zh)
Other versions
CN113111090A (en)
Inventor
王保仓
沈丹峰
段普
张本宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University
Priority to CN202110403024.1A
Publication of CN113111090A
Application granted
Publication of CN113111090B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2457 Query processing with adaptation to user needs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/06 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols the encryption apparatus using shift registers or memories for block-wise or stream coding, e.g. DES systems or RC4; Hash functions; Pseudorandom sequence generators
    • H04L9/0618 Block ciphers, i.e. encrypting groups of characters of a plain text message using fixed encryption transformation
    • H04L9/0631 Substitution permutation network [SPN], i.e. cipher composed of a number of stages or rounds each involving linear and nonlinear transformations, e.g. AES algorithms

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a multidimensional data query method based on order-preserving encryption. It mainly addresses the problems that existing order-preserving encryption techniques support only single-dimensional range queries, are difficult to apply in real production environments, and offer low security and low query efficiency. The scheme is implemented as follows: the data owner generates a secret key and shares it with the user, preprocesses the data components using a mesh data structure, B+ trees, and prefix coding, then encrypts the data and the preprocessed data components and uploads them to the cloud server; the cloud server securely and quickly determines the lower and upper limits of the ciphertext interval from the query interval and query parameters provided by the user and sends the ciphertext result to the user; and the user decrypts the ciphertext result with the secret key to obtain the queried data. The invention supports fine-grained multidimensional data queries in the ciphertext state, improves computational efficiency, reduces storage space, and can be used to protect the data of users and data owners during data queries.

Description

Multidimensional data query method based on order-preserving encryption
Technical Field
The invention belongs to the technical field of information security and further relates to an order-preserving encryption method that can be used to protect the data of users and data owners during data queries.
Background
Order-preserving encryption is a class of cryptographic algorithms. As data outsourcing has brought great advantages, the privacy and security problems it raises have drawn more and more attention; from daily life to enterprise operations, there is strong demand for improving the efficiency of data queries while protecting privacy. With growing awareness of the need to protect private data and the spread of 5G technology, order-preserving encryption, as a means of supporting efficient ciphertext retrieval, has good application prospects. Querying massive data inevitably raises privacy and security problems: once data are outsourced to a third-party cloud server, the server can access the information of data owners and users, and a malicious third-party cloud server can easily obtain the participants' private data, causing privacy leakage and serious harm.
Existing order-preserving encryption mainly targets range queries over single-dimensional data. Unlike other encryption schemes that support range queries, order-preserving encryption does not need to traverse the whole ciphertext data set to filter data; instead, it builds an order-preserving index that keeps the ciphertext data in the same order as the plaintext data, so that the ciphertext query interval can be located efficiently and the data extracted. However, order-preserving encryption of single-dimensional data is of limited use in practice, because real-world big data is often massive and multidimensional. A common example is the SQL query Select * From Students Where 18 ≤ age ≤ 25 and 170 ≤ height ≤ 180 and 70 ≤ weight ≤ 80, that is, searching for students whose age, height, and weight all meet the requirements. If single-dimensional order-preserving encryption is simply applied to such multidimensional data, huge communication and computation overhead results.
The paper "Efficient and Secure Top-k Queries with Top Order-Preserving Encryption" published by Quan Hanyu proposes an order-preserving encryption scheme for top-k queries, which balances privacy protection and query efficiency mainly through heap sorting.
In the published paper hOPE, Peng Yanguo proposes two order-preserving encryption schemes, AhOPE and PhOPE, which are based on homomorphic encryption and support homomorphic operations. AhOPE provides additive homomorphism and PhOPE provides multiplicative homomorphism; both schemes also support comparison of ciphertexts while retaining the homomorphic property. The data are encrypted with homomorphic encryption, and a comparison index is then generated through bilinear pairings and a coding-tree structure.
Disclosure of Invention
Aiming at the deficiencies of the prior art, the invention provides an order-preserving encryption method that supports multidimensional range queries, so as to reduce computational complexity, improve query efficiency, and enable queries over multidimensional data in the ciphertext state.
The technical idea of the invention is as follows: prefix coding and a Bloom filter convert the question of whether a data item lies within a range into the question of whether two sets intersect; a mesh data structure is used to build an order-preserving encryption network; the data are encrypted with symmetric encryption; and the data are uploaded in ciphertext form to a third-party cloud server for users to query. After generating a query token, the user only needs to upload it to the third-party cloud server, which can compute the data required by the user very efficiently by means of the order-preserving encryption network, while the privacy of both the data owner's and the user's data is preserved.
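As a small worked illustration of the range-to-intersection idea (the 3-bit width and the numbers are chosen only for this example, and padding and hashing are left out), a value is described by the family of all its prefixes and a range by the smallest set of prefixes that covers it. The value lies in the range exactly when the two sets share an element:

```latex
\begin{aligned}
P(6) &= \{\mathtt{110},\ \mathtt{11{*}},\ \mathtt{1{*}{*}},\ \mathtt{{*}{*}{*}}\}, \qquad
P([4,7]) = \{\mathtt{1{*}{*}}\}, \\
P(6) \cap P([4,7]) &= \{\mathtt{1{*}{*}}\} \neq \emptyset
  \;\Longrightarrow\; 6 \in [4,7], \\
P(3) &= \{\mathtt{011},\ \mathtt{01{*}},\ \mathtt{0{*}{*}},\ \mathtt{{*}{*}{*}}\}, \\
P(3) \cap P([4,7]) &= \emptyset
  \;\Longrightarrow\; 3 \notin [4,7].
\end{aligned}
```

Because only set membership is tested, the prefixes can be replaced by their keyed hash values without changing the outcome, which is what allows the test to be run on the server without revealing the plaintext prefixes.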
The technical method adopted by the invention comprises the following steps:
(1) The data owner generates a two-part key $sk_1, sk_2$ and shares both parts with the user;
(2) The data owner splits the data by dimension and pads each component to obtain the padded result $T(d_{a,b})$; then, using the property of the mesh data structure, takes every padded data component as a parent node of its data item, sorts the padded data components that have become parent nodes with a B+ tree, and applies prefix coding to the sorted components to obtain the prefix coding set $P(d_{a,b})$, where $d_{a,b}$ denotes the $b$-th dimension component of the $a$-th data item; the split node data generated during B+ tree sorting are prefix-coded as well;
(3) The data owner encrypts all data with the Advanced Encryption Standard (AES) algorithm and, through a Bloom filter, generates an array from the prefix coding set of every data component and of every split node datum generated during B+ tree sorting;
(4) The user defines a query interval and query parameters, applies padding, prefix coding, and hashing to the query interval in turn, and uploads the hashed query interval $Q^*$ and the query parameter $U$ to the cloud server;
(5) The cloud server determines the lower and upper limits of the ciphertext interval in each dimension from the query interval and query parameter sent by the user, stores the ciphertext data corresponding to all data components between the lower and upper limits in a candidate result set, and increments the access count of that ciphertext data by one;
(6) The cloud server traverses the candidate result set; if the access count of a ciphertext datum equals the number of dimensions queried by the user, the datum is returned to the user as the ciphertext result $I^*$, otherwise the traversal continues with the next datum; the user decrypts the ciphertext result returned by the cloud server to obtain the query result.
Compared with the prior art, the invention has the following advantages:
First, the invention uses the mesh data structure to generate the order-preserving index, so that the cloud server, working in the ciphertext state, can use the query interval and query parameters to select exactly which dimensions are queried. This overcomes the prior-art problem that order-preserving encryption supports only single-dimensional range queries and is difficult to apply in real production environments, so the invention supports fine-grained multidimensional data queries in the ciphertext state;
Second, the invention uses a Bloom filter and prefix coding to convert the comparison of data in the ciphertext state into checking whether the corresponding positions of the array generated by the Bloom filter are 1. This overcomes the low security and low query efficiency of prior-art order-preserving encryption, improves computational efficiency, and reduces storage space.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention.
Detailed Description
Embodiments of the present invention are described in further detail below with reference to FIG. 1.
Referring to FIG. 1, the implementation steps of this example are as follows:
Step 1, the data owner generates a two-part key and shares it with the user.
1.1) The data owner selects $k$ pseudo-random hash functions:
$\mathrm{hash}(\{0,1\}^*, h)$,
where $\{0,1\}^*$ denotes a bit string of finite length and $h$ denotes the seed of a pseudo-random hash function;
1.2) The data owner takes the seeds of the pseudo-random hash functions as the first partial key $sk_1$:
$sk_1 = (h_1, h_2, \dots, h_i, \dots, h_k),\ i \in [1,k]$,
where $h_i$ denotes the seed of the $i$-th pseudo-random hash function and $k$ denotes the number of seeds of the pseudo-random hash functions;
1.3) The data owner inputs the security parameter $\lambda$ and generates the second partial key $sk_2$ using the key generation algorithm of the AES encryption scheme:
$sk_2 = \mathrm{AES.Gen}(1^\lambda)$,
where $\mathrm{AES.Gen}$ denotes the key generation algorithm of the AES encryption scheme and $1^\lambda$ indicates the key length.
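A minimal Python sketch of Step 1, under stated assumptions: the description does not name the pseudo-random hash function, so HMAC-SHA256 reduced to the range [1, u] stands in for hash(·, h); the 16-byte seed length and the names `keygen` and `keyed_hash` are illustrative; and AES.Gen is modelled as drawing uniformly random bytes of a valid AES key length.

```python
import hashlib
import hmac
import secrets

def keygen(k: int, security_bits: int = 128):
    """Return (sk1, sk2): k hash seeds and a random key of AES length."""
    if security_bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192, or 256 bits long")
    sk1 = tuple(secrets.token_bytes(16) for _ in range(k))  # seeds h_1..h_k (length assumed)
    sk2 = secrets.token_bytes(security_bits // 8)            # stands in for AES.Gen(1^lambda)
    return sk1, sk2

def keyed_hash(message: bytes, seed: bytes, u: int) -> int:
    """hash(message, h): map a bit string to a position in [1, u].

    HMAC-SHA256 is an assumption; any keyed pseudo-random function would do.
    """
    digest = hmac.new(seed, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % u + 1

sk1, sk2 = keygen(k=3)
pos = keyed_hash(b"1011*", sk1[0], u=1024)  # position of one prefix under seed h_1
```

Both parts of the key are shared with the user, so the user can later compute the same positions for the prefixes of a query interval.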
Step 2, the data owner preprocesses every dimension component of all data:
2.1) For the $n$-dimensional data set $D = \{d_1, d_2, \dots, d_a, \dots, d_m\},\ a \in [1,m]$, the data owner extracts every dimension of every data item, where $d_a$ denotes the $a$-th data item and $m$ denotes the number of data items in $D$; the data item $d_a$ is expressed as:
$d_a = (d_{a,1}, d_{a,2}, \dots, d_{a,b}, \dots, d_{a,n}),\ b \in [1,n]$,
where $d_{a,b}$ denotes the $b$-th dimension component of the $a$-th data item and $n$ denotes the number of dimensions of the data in $D$;
2.2) The data owner pads each component $d_{a,b}$ of each data item $d_a$ in $D$ to obtain the padded result $T(d_{a,b})$, expressed as:
$T(d_{a,b}) = d_{a,b} \,\|\, 01 \,\|\, r$,
where $\|$ denotes concatenation, 01 is a comparison parameter 2 bits long, and $r$ is a random number of the same length as a data component, with every $r$ different; every dimension component of every data item in $D$ is represented with $l$ bits;
2.3) The data owner converts the padded data component $T(d_{a,b})$ into a binary representation $2l+2$ bits long:
$T(d_{a,b}) = c_1 c_2 \dots c_\delta \dots c_{2l+1} c_{2l+2},\ \delta \in [1, 2l+2]$,
where $c_\delta$ denotes the $\delta$-th bit of $T(d_{a,b})$, every dimension component of every data item in $D$ is represented with $l$ bits, and $2l+2$ is the length of the data component $d_{a,b}$ after padding;
2.4) The data owner replaces the bits of the binary representation of $T(d_{a,b})$ with the wildcard $*$ one at a time, starting from the last bit, to obtain the prefix coding set $P(d_{a,b})$:
$P(d_{a,b}) = \{c_1 c_2 \dots c_\delta \dots c_{2l+1} c_{2l+2},\ c_1 c_2 \dots c_\delta \dots c_{2l+1}{*},\ c_1 c_2 \dots c_\delta \dots {*}{*},\ \dots,\ c_1 {*} \dots {*} \dots {*}{*},\ {*}{*} \dots {*} \dots {*}{*}\}$.
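The padding of 2.2 and the prefix coding of 2.4 can be sketched in Python as follows. The function names `pad` and `prefix_family` are hypothetical, bit strings are held as Python `str` for readability, and the random suffix can be passed in explicitly so that the example is reproducible.

```python
import secrets
from typing import List, Optional

def pad(value: int, l: int, tag: str = "01", r: Optional[str] = None) -> str:
    """T(d) = d || tag || r as a bit string of length 2l + 2."""
    if not 0 <= value < 2 ** l:
        raise ValueError("value does not fit in l bits")
    if r is None:
        r = format(secrets.randbits(l), f"0{l}b")  # fresh l-bit random suffix
    return format(value, f"0{l}b") + tag + r

def prefix_family(bits: str) -> List[str]:
    """P(d): replace trailing bits by '*' one at a time, from none to all."""
    n = len(bits)
    return [bits[: n - i] + "*" * i for i in range(n + 1)]

t = pad(5, l=3, r="110")     # 101 || 01 || 110
family = prefix_family(t)    # 2l + 3 = 9 prefixes, from t itself to all wildcards
```

With l = 3 the padded value is 8 bits long, and the family runs from `10101110` through `1010111*` down to `********`.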
Step 3, the data owner encrypts all data and every preprocessed dimension component.
3.1) The data owner takes the data $d_1, d_2, \dots, d_a, \dots, d_m,\ a \in [1,m]$, and the second key $sk_2$ as input to the encryption algorithm of the AES encryption scheme and obtains the ciphertext $C(d_a)$:
$C(d_a) = \mathrm{AES.Enc}(d_a, sk_2)$,
where $d_a$ denotes the $a$-th data item in the data set $D$, $m$ denotes the number of data items in $D$, and $\mathrm{AES.Enc}$ denotes the encryption algorithm of the AES encryption scheme;
3.2) The data owner hashes the prefix coding set $P(d_{a,b})$ of each data component to obtain a series of hash values, which can be expressed as:
$H(P(d_{a,b})) = \{\mathrm{hash}(f_e, h_i) \mid f_e \in P(d_{a,b}),\ i \in [1,k]\}$,
where $f_e$ denotes the $e$-th element of the prefix coding set $P(d_{a,b})$ of the data component, $h_i$ denotes the seed of the $i$-th pseudo-random hash function, $k$ denotes the number of seeds of the pseudo-random hash functions, and $\mathrm{hash}(f_e, h_i)$ denotes the hash value obtained by hashing the prefix code $f_e$ with the $i$-th pseudo-random hash function seed; the hash function maps a finite bit string $\{0,1\}^*$ to a value in the interval $[1,u]$, where $u$ denotes the length of the array generated by the Bloom filter;
3.3) In a bit array of length $u$, the data owner sets to 1 every position corresponding to a hash value in $H(P(d_{a,b}))$, obtaining the Bloom filter array $B(P(d_{a,b}))$.
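Steps 3.2 and 3.3 can be sketched as below. This is an illustration rather than the patented implementation: HMAC-SHA256 is an assumed stand-in for the unspecified pseudo-random hash function, the seeds and the 4-bit prefix set are made-up sample values, and the AES encryption of 3.1 is omitted because it needs a third-party library.

```python
import hashlib
import hmac

def keyed_hash(message: str, seed: bytes, u: int) -> int:
    """Map a prefix to a position in [1, u] (HMAC-SHA256 is an assumption)."""
    digest = hmac.new(seed, message.encode(), hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % u + 1

def bloom_array(prefixes, seeds, u):
    """B(P(d)): u-bit array with the k hash positions of every prefix set to 1."""
    bits = [0] * u
    for f in prefixes:
        for h in seeds:
            bits[keyed_hash(f, h, u) - 1] = 1   # positions are 1-based
    return bits

def maybe_contains(bits, prefix, seeds):
    """True if all k positions of the prefix are 1 (false positives are possible)."""
    u = len(bits)
    return all(bits[keyed_hash(prefix, h, u) - 1] for h in seeds)

seeds = [b"seed-1", b"seed-2", b"seed-3"]            # illustrative sk1 with k = 3
prefixes = ["1010", "101*", "10**", "1***", "****"]  # sample prefix coding set
B = bloom_array(prefixes, seeds, u=256)
```

Every inserted prefix is always reported as present; a prefix that was never inserted is reported as absent except for the usual Bloom filter false-positive probability, which shrinks as u grows.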
Step 4, the user generates the query interval and the query parameter.
4.1) The user pads the query interval $Q = ([p_1,q_1], [p_2,q_2], \dots, [p_\tau,q_\tau], \dots, [p_x,q_x]),\ \tau \in [1,x]$, to obtain the padded query interval $T(Q)$:
$T(Q) = ([p_1\|00\|r,\ q_1\|11\|r],\ \dots,\ [p_\tau\|00\|r,\ q_\tau\|11\|r],\ \dots,\ [p_x\|00\|r,\ q_x\|11\|r])$,
where $p_\tau$ denotes the lower limit of the $\tau$-th dimension the user wants to query, $q_\tau$ denotes the upper limit of the $\tau$-th dimension the user wants to query, $x$ denotes the number of dimensions the user wants to query, $\|$ denotes concatenation, 00 and 11 are comparison parameters 2 bits long, and $r$ is a random number of the same length as the upper and lower limits of each dimension, with every $r$ different; the upper and lower limits of each dimension in $Q$ are represented with $l$ bits;
If a data component lies between the lower and upper limits of the query interval before padding, that is, $p_\tau < d_{a,b} < q_\tau$, then, because $00 < 01 < 11$, the padded data component also lies between the padded limits, that is, $p_\tau\|00\|r < d_{a,b}\|01\|r < q_\tau\|11\|r$;
4.2) The user prefix-codes the padded query interval $T(Q)$:
4.2.1) The binary representations, $2l+2$ bits long, of all elements in the interval $[p_\tau\|00\|r,\ q_\tau\|11\|r]$ form the unprocessed prefix coding set $P'([p_\tau,q_\tau])$;
4.2.2) The set $P'([p_\tau,q_\tau])$ is traversed starting from the first element: if the first $v$ bits of two consecutive elements are the same and their $(v+1)$-th bits are 0 and 1 respectively, the two elements are merged into a new element formed by concatenating the first $v$ bits with $2l+2-v$ wildcard bits $*$; otherwise no merge is performed and the next element is traversed. This is repeated until no elements can be merged, which finally yields the prefix coding set of each dimension of the query interval:
$P([p_\tau,q_\tau]) = \{f_1, f_2, \dots, f_{e_\tau}, \dots, f_{y_\tau}\}$,
where $f_{e_\tau}$ denotes the $e_\tau$-th element of the $\tau$-th prefix coding set $P([p_\tau,q_\tau])$ and $y_\tau$ denotes the number of elements in $P([p_\tau,q_\tau])$;
4.3) The user hashes the prefix codes of the interval to obtain a series of hash values $H(P([p_\tau,q_\tau]))$:
$H(P([p_\tau,q_\tau])) = \{\mathrm{hash}(f_{e_\tau}, h_i)\},\ e_\tau \in [1,y_\tau],\ i \in [1,k]$,
where $h_i$ denotes the seed of the $i$-th pseudo-random hash function, $k$ denotes the number of seeds of the pseudo-random hash functions, and $\mathrm{hash}(f_{e_\tau}, h_i)$ denotes the hash value obtained by hashing the prefix code $f_{e_\tau}$ with the $i$-th pseudo-random hash function seed;
4.5) The user combines the hash values of every dimension to obtain the processed query interval $Q^*$:
$Q^* = (H(P([p_1,q_1])),\ H(P([p_2,q_2])),\ \dots,\ H(P([p_\tau,q_\tau])),\ \dots,\ H(P([p_x,q_x])))$;
4.6) The user initializes a bit string $n$ bits long and sets its bits according to the query interval: if the $b$-th dimension is to be queried, the $b$-th bit of the bit string is set to 1. The bit string is then taken as the query parameter $U$, where $b \in [1,n]$ and $n$ denotes the number of dimensions of the data in the data set.
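The query-side preprocessing of Step 4 can be sketched in Python as follows. This is an illustration under assumptions rather than the patented implementation: the function names are hypothetical, the random suffixes r are fixed strings so that the run is reproducible, l = 3 keeps the enumeration small, and the hashing of 4.3 is left out. The merge loop follows 4.2.2 with one added check, namely that the bits after position v+1 are identical in both elements, which the description leaves implicit.

```python
def pad_bound(value: int, l: int, tag: str, r: str) -> str:
    """p || 00 || r for a lower limit, q || 11 || r for an upper limit."""
    return format(value, f"0{l}b") + tag + r

def range_prefixes(lo: str, hi: str) -> list:
    """Prefix coding set of [lo, hi], built by the pairwise merge of step 4.2.2."""
    n = len(lo)
    items = [format(x, f"0{n}b") for x in range(int(lo, 2), int(hi, 2) + 1)]
    merged = True
    while merged:                                   # repeat until nothing merges
        merged, out, i = False, [], 0
        while i < len(items):
            a = items[i]
            b = items[i + 1] if i + 1 < len(items) else None
            v = len(a.rstrip("*")) - 1              # v leading bits; a[v] is bit v+1
            if (b is not None and v >= 0 and a[:v] == b[:v]
                    and a[v] == "0" and b[v] == "1" and a[v + 1:] == b[v + 1:]):
                out.append(a[:v] + "*" * (n - v))   # shared v bits, then wildcards
                i, merged = i + 2, True
            else:
                out.append(a)
                i += 1
        items = out
    return items

def query_parameter(n: int, dims) -> str:
    """U: n-bit string whose b-th bit (1-based) is 1 for each queried dimension b."""
    return "".join("1" if b in dims else "0" for b in range(1, n + 1))

l = 3
lo = pad_bound(2, l, "00", "101")   # lower limit p = 2
hi = pad_bound(5, l, "11", "010")   # upper limit q = 5
cover = range_prefixes(lo, hi)      # prefix coding set of the padded interval
U = query_parameter(4, {1, 3})      # query dimensions 1 and 3 out of 4
```

Enumerating every element of the interval grows exponentially with l, so a practical implementation would more likely derive the same cover directly from the two endpoints; the enumeration is kept here only because it mirrors the text. Note also that the 00/01/11 tags make the limits inclusive: a data component equal to p or q still falls inside the padded interval whatever the random suffixes are.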
Step 5, the cloud server queries the data according to the query interval and the query parameter and stores the results in the candidate result set:
5.1) The cloud server receives the query interval $Q^*$ and the query parameter $U$ uploaded by the user and traverses $U$ bit by bit. If the $b$-th bit of $U$ is 1, the server sends one element of $Q^*$ to the $b$-th B+ tree for querying; the elements of $Q^*$ are sent in order, so that each time a further bit of $U$ equal to 1 is reached, the next element of $Q^*$ is sent. Otherwise, the server continues traversing $U$ bit by bit;
5.2) When querying the $b$-th B+ tree, the cloud server checks whether the hashed query interval $H(P([p_\tau,q_\tau]))$ corresponding to the $b$-th dimension in $Q^*$ contains a group of hash values
$\{\mathrm{hash}(f_{e_\tau}, h_1),\ \mathrm{hash}(f_{e_\tau}, h_2),\ \dots,\ \mathrm{hash}(f_{e_\tau}, h_k)\}$
whose positions in the Bloom filter array $B(P(s_{b,\varepsilon}))$ of a split node datum are all 1. If so, the child node corresponding to that split node datum is accessed; otherwise the traversal continues until all $g$ split node data have been traversed, where $g$ is the number of Bloom filter arrays stored in each root node or internal node:
$B(P(s_{b,1})),\ B(P(s_{b,2})),\ \dots,\ B(P(s_{b,\varepsilon})),\ \dots,\ B(P(s_{b,g})),\ \varepsilon \in [1,g]$,
where $s_{b,\varepsilon}$ denotes the $\varepsilon$-th split node datum of a root node or internal node in the $b$-th B+ tree and $B(P(s_{b,\varepsilon}))$ denotes the Bloom filter array corresponding to $s_{b,\varepsilon}$;
5.3) In the $b$-th B+ tree, the cloud server traverses the Bloom filter arrays in the leftmost visited leaf node one by one, from small to large, until the hashed query interval $H(P([p_\tau,q_\tau]))$ corresponding to the $b$-th dimension contains a group of hash values $\{\mathrm{hash}(f_{e_\tau}, h_1), \dots, \mathrm{hash}(f_{e_\tau}, h_k)\}$ whose positions in the Bloom filter array $B(P(d_{a,b}))$ of some data component are all 1; that array is taken as the lower query limit of the current dimension;
5.4) In the $b$-th B+ tree, the cloud server traverses the Bloom filter arrays in the rightmost visited leaf node one by one, from large to small, until the hashed query interval $H(P([p_\tau,q_\tau]))$ corresponding to the $b$-th dimension contains a group of hash values $\{\mathrm{hash}(f_{e_\tau}, h_1), \dots, \mathrm{hash}(f_{e_\tau}, h_k)\}$ whose positions in the Bloom filter array $B(P(d_{a,b}))$ of some data component are all 1; that array is taken as the upper query limit of the current dimension;
5.5) The cloud server stores the ciphertext data corresponding to all data components between the lower limit and the upper limit in the candidate result set and increments the access count of that ciphertext data by one.
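The search in a single dimension (5.3 to 5.5) can be sketched as below, with several simplifications that are assumptions of this example and not part of the description: the leaf level of the B+ tree is modelled as one sorted Python list, the descent through internal nodes in 5.2 is skipped, padding is left out, a set of 1-positions stands in for each Bloom filter bit array, and HMAC-SHA256 plays the role of the keyed hash.

```python
import hashlib
import hmac

def positions(prefix, seeds, u):
    """The k Bloom filter positions of one prefix, each in [1, u]."""
    return frozenset(
        int.from_bytes(hmac.new(h, prefix.encode(), hashlib.sha256).digest(), "big") % u + 1
        for h in seeds
    )

def family(bits):
    """All prefixes of a bit string, from the string itself to all wildcards."""
    return [bits[: len(bits) - i] + "*" * i for i in range(len(bits) + 1)]

def bloom(bits, seeds, u):
    """Set of 1-positions standing in for the array B(P(d))."""
    return frozenset().union(*(positions(f, seeds, u) for f in family(bits)))

def matches(token, array):
    """True if some hashed prefix of the token has all its positions set in the array."""
    return any(group <= array for group in token)

def scan_dimension(leaves, token):
    """leaves: (array, record_id) pairs in ascending plaintext order."""
    lower = next((i for i, (arr, _) in enumerate(leaves) if matches(token, arr)), None)
    if lower is None:
        return []                                    # nothing in this dimension matches
    upper = next(i for i in range(len(leaves) - 1, -1, -1) if matches(token, leaves[i][0]))
    return [rid for _, rid in leaves[lower:upper + 1]]

seeds, u = [b"s1", b"s2", b"s3"], 1 << 16
values = [1, 3, 4, 6, 7]                             # sorted plaintexts, record ids 0..4
leaves = [(bloom(format(v, "03b"), seeds, u), rid) for rid, v in enumerate(values)]
token = [positions(p, seeds, u) for p in ["011", "10*"]]   # prefix cover of [3, 5]
hits = scan_dimension(leaves, token)                 # record ids holding 3 and 4
```

The server sees only position sets and record identifiers. Because the leaves are kept in plaintext order, finding the first match from the left and the first match from the right is enough to delimit the whole result range.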
Step 6, the cloud server traverses the candidate result set and returns the ciphertext result to the user, and the user decrypts the ciphertext result to obtain the query result.
6.1) The cloud server traverses the candidate result set; if the access count of a ciphertext datum equals the number of dimensions queried by the user, the datum is returned to the user as the ciphertext result $I^*$;
6.2) The user takes the second key $sk_2$ and the ciphertext result $I^*$ as input to the decryption algorithm of the AES encryption scheme and obtains the query result $I$:
$I = \mathrm{AES.Dec}(I^*, sk_2)$,
where $\mathrm{AES.Dec}$ denotes the decryption algorithm of the AES encryption scheme.
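The access-count filter of 5.5 and 6.1 amounts to intersecting the per-dimension hits by counting. A short sketch, in which the record identifiers, the placeholder ciphertext bytes, and the name `select_results` are all illustrative, and the AES decryption of 6.2 is omitted because it needs a third-party library:

```python
from collections import Counter

def select_results(hits_per_dimension, ciphertexts):
    """Return the ciphertexts that were hit once in every queried dimension."""
    access = Counter()
    for ids in hits_per_dimension:
        access.update(set(ids))              # step 5.5: one increment per matching dimension
    wanted = len(hits_per_dimension)         # number of dimensions queried by the user
    return [ciphertexts[i] for i in sorted(access) if access[i] == wanted]

hits = [[0, 1, 3], [1, 3, 4], [1, 2, 3]]     # record ids matched in three dimensions
ciphertexts = {i: f"C(d_{i})".encode() for i in range(5)}   # placeholder ciphertexts
result = select_results(hits, ciphertexts)   # only records 1 and 3 match everywhere
```

Counting avoids materializing set intersections on the server: each dimension is processed independently, and a record qualifies only when its counter reaches the number of queried dimensions.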
The foregoing description is only an example of the present invention and is not intended to limit it; it will be apparent to those skilled in the art that various changes and modifications in form and detail may be made without departing from the spirit and scope of the invention.

Claims (8)

1. A multidimensional data query method based on order-preserving encryption, characterized by comprising the following steps:
(1) The data owner generates a two-part key $sk_1, sk_2$ and shares both parts with the user;
(2) The data owner splits the data by dimension and pads each component to obtain the padded result $T(d_{a,b})$; then, using the property of the mesh data structure, takes every padded data component as a parent node of its data item, sorts the padded data components that have become parent nodes with a B+ tree, and applies prefix coding to the sorted components to obtain the prefix coding set $P(d_{a,b})$, where $d_{a,b}$ denotes the $b$-th dimension component of the $a$-th data item; the split node data generated during B+ tree sorting are prefix-coded as well;
(3) The data owner encrypts all data with the Advanced Encryption Standard (AES) algorithm and, through a Bloom filter, generates an array from the prefix coding set of every data component and of every split node datum generated during B+ tree sorting;
(4) The user defines a query interval and query parameters, applies padding, prefix coding, and hashing to the query interval in turn, and uploads the hashed query interval $Q^*$ and the query parameter $U$ to the cloud server;
(5) The cloud server determines the lower and upper limits of the ciphertext interval in each dimension from the query interval and query parameter sent by the user, stores the ciphertext data corresponding to all data components between the lower and upper limits in a candidate result set, and increments the access count of that ciphertext data by one;
(6) The cloud server traverses the candidate result set; if the access count of a ciphertext datum equals the number of dimensions queried by the user, the datum is returned to the user as the ciphertext result $I^*$, otherwise the traversal continues with the next datum; the user decrypts the ciphertext result returned by the cloud server to obtain the query result.
2. The method of claim 1, wherein the two-part key in (1) is generated as follows:
(1a) The data owner selects $k$ pseudo-random hash functions:
$\mathrm{hash}(\{0,1\}^*, h)$,
where $\{0,1\}^*$ denotes a bit string of finite length and $h$ denotes the seed of a pseudo-random hash function;
(1b) The data owner takes the seeds of the pseudo-random hash functions as the first key $sk_1$:
$sk_1 = (h_1, h_2, \dots, h_i, \dots, h_k),\ i \in [1,k]$,
where $h_i$ denotes the seed of the $i$-th pseudo-random hash function and $k$ denotes the number of seeds of the pseudo-random hash functions;
(1c) The data owner inputs the security parameter $\lambda$ and generates the second key $sk_2$ using the key generation algorithm of the AES encryption scheme:
$sk_2 = \mathrm{AES.Gen}(1^\lambda)$,
where $\mathrm{AES.Gen}$ denotes the key generation algorithm of the AES encryption scheme and $1^\lambda$ indicates the key length.
3. The method of claim 1, wherein the data in (2) are split by dimension and padded as follows:
(2a) For the $n$-dimensional data set $D = \{d_1, d_2, \dots, d_a, \dots, d_m\},\ a \in [1,m]$, the data owner extracts every dimension of every data item, where $d_a$ denotes the $a$-th data item and $m$ denotes the number of data items in $D$; the data item $d_a$ is expressed as:
$d_a = (d_{a,1}, d_{a,2}, \dots, d_{a,b}, \dots, d_{a,n}),\ b \in [1,n]$,
where $d_{a,b}$ denotes the $b$-th dimension component of the $a$-th data item and $n$ denotes the number of dimensions of the data in $D$;
(2b) The data owner pads each component $d_{a,b}$ of each data item $d_a$ in $D$ to obtain the padded result $T(d_{a,b})$, expressed as:
$T(d_{a,b}) = d_{a,b} \,\|\, 01 \,\|\, r$,
where $\|$ denotes concatenation, 01 is a comparison parameter 2 bits long, and $r$ is a random number of the same length as a data component, with every $r$ different; every dimension component of every data item in $D$ is represented with $l$ bits.
4. The method of claim 1, wherein in (2) the padded data components that have become parent nodes and been sorted by the B+ tree are processed by prefix coding to obtain the prefix coding set as follows:
(2c) The data owner converts the padded data component $T(d_{a,b})$ into a binary representation $2l+2$ bits long:
$T(d_{a,b}) = c_1 c_2 \dots c_\delta \dots c_{2l+1} c_{2l+2},\ \delta \in [1, 2l+2]$,
where $c_\delta$ denotes the $\delta$-th bit of $T(d_{a,b})$, every dimension component of every data item in $D$ is represented with $l$ bits, and $2l+2$ is the length of the data component $d_{a,b}$ after padding;
(2d) The data owner replaces the bits of the binary representation of $T(d_{a,b})$ with the wildcard $*$ one at a time, starting from the last bit, to obtain the prefix coding set $P(d_{a,b})$:
$P(d_{a,b}) = \{c_1 c_2 \dots c_\delta \dots c_{2l+1} c_{2l+2},\ c_1 c_2 \dots c_\delta \dots c_{2l+1}{*},\ c_1 c_2 \dots c_\delta \dots {*}{*},\ \dots,\ c_1 {*} \dots {*} \dots {*}{*},\ {*}{*} \dots {*} \dots {*}{*}\}$.
5. The method of claim 1, wherein in (3) an array is generated through a Bloom filter from the prefix coding sets of all data components and of the split node data generated during B+ tree sorting, as follows:
(3a) The data owner hashes the prefix coding set $P(d_{a,b})$ of each data component to obtain a series of hash values, which can be expressed as:
$H(P(d_{a,b})) = \{\mathrm{hash}(f_e, h_i) \mid f_e \in P(d_{a,b}),\ i \in [1,k]\}$,
where $f_e$ denotes the $e$-th element of the prefix coding set $P(d_{a,b})$ of the data component, $h_i$ denotes the seed of the $i$-th pseudo-random hash function, $k$ denotes the number of seeds of the pseudo-random hash functions, and $\mathrm{hash}(f_e, h_i)$ denotes the hash value obtained by hashing the prefix code $f_e$ with the $i$-th pseudo-random hash function seed; the hash function maps a finite bit string $\{0,1\}^*$ to a value in the interval $[1,u]$, where $u$ denotes the length of the array generated by the Bloom filter;
(3b) In a bit array of length $u$, the data owner sets to 1 every position corresponding to a hash value in $H(P(d_{a,b}))$, obtaining the Bloom filter array $B(P(d_{a,b}))$.
6. The method of claim 1, wherein (4) the user defines a query interval and query parameters, and sequentially performs filling, prefix coding and hashing on the query interval, as follows:
(4a) User pair query interval Q = ([ p) = 1 ,q 1 ],[p 2 ,q 2 ],…,[p τ ,q τ ],…,[p x ,q x ]),τ∈[1,x]And (3) filling to obtain a filled query interval T (Q):
Figure FDA0003021124740000032
where p_τ denotes the lower limit of the τ-th dimension of data that the user wants to query, q_τ denotes the upper limit of the τ-th dimension of data that the user wants to query, x denotes the number of dimensions that the user wants to query, || denotes concatenation, 00 and 11 denote comparison parameters of length 2 bits, r denotes a random number of the same length as the upper and lower limits of each dimension, each r being different, and the upper and lower limits of each dimension in Q are each represented by l bits;
(4b) The user performs prefix encoding on the filled query interval T(Q):
(4b1) The (2l+2)-bit binary representations of all elements in the interval [p_τ||00||r, q_τ||11||r] constitute the unprocessed prefix encoding set P'([p_τ, q_τ]);
(4b2) Traverse the set P'([p_τ, q_τ]) starting from the first element; if the first v bits of two consecutive elements are the same and their (v+1)-th bits are 0 and 1 respectively, merge the two elements into a new element formed by concatenating the first v bits with 2l+2-v * symbols; otherwise do not merge, and move on to the next element;
(4b3) Repeat (4b2) until no more elements can be merged, obtaining the prefix encoding set of each dimension of the query interval:
Figure FDA0003021124740000041
where
Figure FDA0003021124740000042
denotes the e_τ-th element of the τ-th prefix encoding set P([p_τ, q_τ]), and y_τ denotes the number of elements in the prefix encoding set P([p_τ, q_τ]);
(4c) The user hashes the prefix encodings of the interval to obtain a series of hash values H(P([p_τ, q_τ])):
Figure FDA0003021124740000043
where
Figure FDA0003021124740000044
h_i denotes the seed of the i-th pseudorandom hash function, k denotes the number of pseudorandom hash function seeds, and
Figure FDA0003021124740000045
denotes the result of using the i-th pseudorandom hash function seed on the prefix encoding
Figure FDA0003021124740000046
to compute its hash value;
(4d) The user combines the series of hash values of each dimension to obtain the processed query interval Q^*:
Q^* = (H(P([p_1, q_1])), H(P([p_2, q_2])), …, H(P([p_τ, q_τ])), …, H(P([p_x, q_x])));
(4e) The user initializes a bit string of length n bits and sets the corresponding bits to 1 according to the query interval: if the b-th dimension of data is to be queried, the b-th bit of the bit string is set to 1; the bit string is then recorded as the query parameter U, where b ∈ [1, n] and n denotes the number of dimensions of the data in the data set.
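The user-side steps (4a), (4b) and (4e) can be sketched as below for a single dimension. All function names are hypothetical. The sketch follows (4b1) to (4b3) literally, so it enumerates every value in the interval and is only practical for small l; the random numbers r for the two limits are passed in explicitly, which is an assumption about how r is chosen.

```python
from typing import List


def pad_bound(value: int, flag: int, r: int, l: int) -> int:
    """Return value || flag || r as a (2l+2)-bit integer (flag: 2 bits, r: l bits)."""
    if not (0 <= value < 1 << l and 0 <= r < 1 << l and 0 <= flag < 4):
        raise ValueError("value and r must fit in l bits, flag in 2 bits")
    return (value << (l + 2)) | (flag << l) | r


def range_to_prefixes(lo: int, hi: int, width: int) -> List[str]:
    """Enumerate [lo, hi] as bit strings, then merge sibling prefixes."""
    items = [format(v, f"0{width}b") for v in range(lo, hi + 1)]  # (4b1)
    merged = True
    while merged:  # (4b3): repeat until a pass merges nothing
        merged, out, i = False, [], 0
        while i < len(items):  # (4b2): one left-to-right pass
            a = items[i]
            b = items[i + 1] if i + 1 < len(items) else None
            v = len(a.rstrip("*")) - 1  # shared leading bits if a, b are siblings
            if (b is not None and v >= 0 and len(b.rstrip("*")) == v + 1
                    and a[:v] == b[:v] and a[v] == "0" and b[v] == "1"):
                out.append(a[:v] + "*" * (width - v))
                i, merged = i + 2, True
            else:
                out.append(a)
                i += 1
        items = out
    return items


def query_parameter(dims: List[int], n: int) -> str:
    """(4e): n-bit string whose b-th bit is 1 for each queried dimension b."""
    bits = ["0"] * n
    for b in dims:
        if not 1 <= b <= n:
            raise ValueError("dimension index out of range")
        bits[b - 1] = "1"
    return "".join(bits)


if __name__ == "__main__":
    l = 2  # toy bit length per limit
    lo = pad_bound(1, 0b00, 0b10, l)  # p || 00 || r
    hi = pad_bound(2, 0b11, 0b01, l)  # q || 11 || r
    print(range_to_prefixes(lo, hi, 2 * l + 2))
    print(query_parameter([1, 3], 4))
```

Each resulting prefix would then be hashed with the k seeds as in (4c), one group of k hash values per prefix.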
7. The method according to claim 1, wherein in the step (5), according to the query interval and the query parameter sent by the user, the lower limit and the upper limit of the ciphertext interval of each dimension are determined as follows:
(5a) The cloud server receives the query interval Q^* and the query parameter U uploaded by the user and traverses the query parameter U bit by bit; if the b-th bit of U is 1, it sends one element of Q^* to the b-th B+ tree for querying; each time an element of Q^* has been sent to a B+ tree, the next element of Q^* in order is sent when the next 1 bit of U is reached; if the bit is not 1, it continues traversing the query parameter U bit by bit;
(5b) When querying the B+ tree, the cloud server determines whether the hashed query interval H(P([p_τ, q_τ])) corresponding to the b-th dimension data in the query interval Q^* contains a group of hash values
Figure FDA0003021124740000047
whose positions in the bloom filter array B(P(s_{b,ε})) are all 1; if so, it accesses the child node corresponding to that split node data; otherwise it continues traversing until all g split node data have been traversed, where g denotes the number of bloom filter arrays stored in each root node or internal node:
B(P(s_{b,1})), B(P(s_{b,2})), …, B(P(s_{b,ε})), …, B(P(s_{b,g})), ε ∈ [1, g],
where s_{b,ε} denotes the ε-th split node data of a root node or internal node in the b-th B+ tree, and B(P(s_{b,ε})) denotes the bloom filter array corresponding to s_{b,ε};
(5c) The cloud server determines whether the hashed query interval H(P([p_τ, q_τ])) corresponding to the b-th dimension data in the query interval Q^* contains a group of hash values
Figure FDA0003021124740000051
whose positions in the bloom filter array B(P(s_{b,ε})) are all 1; if so, it accesses the child node corresponding to that split node data; otherwise it continues traversing until all g split node data have been traversed;
(5d) In the B+ tree, the cloud server traverses the bloom filter arrays in the leftmost visited leaf node one by one in ascending order until the hashed query interval H(P([p_τ, q_τ])) corresponding to the b-th dimension data contains a group of hash values
Figure FDA0003021124740000052
whose positions in the bloom filter array B(P(d_{a,b})) of some data component are all 1, and takes that array as the query lower limit of the current dimension;
(5e) In the B+ tree, the cloud server traverses the bloom filter arrays in the rightmost visited leaf node one by one in descending order until the hashed query interval H(P([p_τ, q_τ])) corresponding to the b-th dimension data contains a group of hash values
Figure FDA0003021124740000053
whose positions in the bloom filter array B(P(d_{a,b})) of some data component are all 1, and takes that array as the query upper limit of the current dimension.
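The matching test used in (5b) to (5e) can be sketched as follows. This is a simplification under stated assumptions: the leaf bloom filter arrays of one dimension are given as a flat list already sorted in ascending order, so the B+ tree descent of (5b) and (5c) is not modelled, and the hashed query interval is represented as a list of groups, each holding the k positions (in [1, u]) of one prefix. The names `matches` and `find_bounds` are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

HashGroup = Sequence[int]  # the k positions in [1, u] computed for one prefix


def matches(bloom: Sequence[int], hashed_interval: Sequence[HashGroup]) -> bool:
    """True if some group has all of its positions set to 1 in the bloom array."""
    return any(
        all(bloom[pos - 1] == 1 for pos in group)  # positions are 1-based
        for group in hashed_interval
    )


def find_bounds(
    leaf_filters: List[Sequence[int]],
    hashed_interval: Sequence[HashGroup],
) -> Tuple[Optional[int], Optional[int]]:
    """Return indices of the first match scanning upward and downward.

    (5d): ascending scan gives the query lower limit.
    (5e): descending scan gives the query upper limit.
    Returns (None, None) when no array matches.
    """
    lower = next(
        (i for i, bf in enumerate(leaf_filters) if matches(bf, hashed_interval)),
        None,
    )
    upper = next(
        (i for i in reversed(range(len(leaf_filters)))
         if matches(leaf_filters[i], hashed_interval)),
        None,
    )
    return lower, upper


if __name__ == "__main__":
    leaves = [
        [0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 0, 0, 0, 0, 0],
        [0, 0, 0, 0, 1, 1, 0, 0],
        [1, 0, 0, 0, 0, 0, 0, 0],
    ]
    query = [[1, 2], [5, 6]]  # two prefixes, k = 2 positions each
    print(find_bounds(leaves, query))
```

Because bloom filters admit false positives, a match in this sketch indicates only that the data component probably lies in the query interval.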
8. The method of claim 1, wherein in (6) the user decrypts the ciphertext result returned from the cloud server by taking the second key sk_2 and the ciphertext result I^* as the inputs of the decryption algorithm of the AES encryption scheme, obtaining the query result I:
I = AES.Dec(I^*, sk_2),
where AES.Dec denotes the decryption algorithm of the AES encryption scheme.
CN202110403024.1A 2021-04-15 2021-04-15 Multidimensional data query method based on order-preserving encryption Active CN113111090B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110403024.1A CN113111090B (en) 2021-04-15 2021-04-15 Multidimensional data query method based on order-preserving encryption

Publications (2)

Publication Number Publication Date
CN113111090A CN113111090A (en) 2021-07-13
CN113111090B true CN113111090B (en) 2023-01-06

Family

ID=76716942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110403024.1A Active CN113111090B (en) 2021-04-15 2021-04-15 Multidimensional data query method based on order-preserving encryption

Country Status (1)

Country Link
CN (1) CN113111090B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230394021A1 (en) * 2022-06-07 2023-12-07 Oracle International Corporation Computing similarity of tree data structures using metric functions defined on sets
CN115935429B (en) * 2022-12-30 2023-08-22 上海零数众合信息科技有限公司 Data processing method, device, medium and electronic equipment
CN116150795B (en) * 2023-04-17 2023-07-14 粤港澳大湾区数字经济研究院(福田) Homomorphic encryption-based data processing method, system and related equipment
CN116933299B (en) * 2023-09-18 2023-12-05 国网智能电网研究院有限公司 Tax electric data safety fusion method, tax electric node, equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105721485A (en) * 2016-03-04 2016-06-29 安徽大学 Secure nearest neighbor query method for multiple data owners in outsourcing cloud environment
CN106407447A (en) * 2016-09-30 2017-02-15 福州大学 Simhash-based fuzzy sequencing searching method for encrypted cloud data
CN108337085A (en) * 2018-01-03 2018-07-27 西安电子科技大学 A kind of newer approximate adjacent retrieval construction method of support dynamic
CN108985094A (en) * 2018-06-28 2018-12-11 电子科技大学 The access control and range query method of cryptogram space data are realized under cloud environment

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9531679B2 (en) * 2014-02-06 2016-12-27 Palo Alto Research Center Incorporated Content-based transport security for distributed producers

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
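A Privacy-Preserving Online Medical Prediagnosis Scheme for Cloud Environment; Wei Guo, et al.; IEEE Access; 20180823; Vol. 6; pp. 48946-48957 *
Research on a secure ciphertext retrieval method supporting result ranking (支持结果排序的安全密文检索方法研究); Yao Hanbing (姚寒冰), et al.; Computer Science (《计算机科学》); 20180531; Vol. 45, No. 5; pp. 123-130 *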

Also Published As

Publication number Publication date
CN113111090A (en) 2021-07-13

Similar Documents

Publication Publication Date Title
CN113111090B (en) Multidimensional data query method based on order-preserving encryption
CN106815350B (en) Dynamic ciphertext multi-keyword fuzzy search method in cloud environment
CN112800088B (en) Database ciphertext retrieval system and method based on bidirectional security index
CN104780161B (en) Support multi-user's to can search for encryption method in a kind of cloud storage
CN109543061B (en) Encrypted image retrieval method supporting multiple keys
CN112270006A (en) Searchable encryption method for hiding search mode and access mode in e-commerce platform
CN105681280A (en) Searchable encryption method based on Chinese in cloud environment
CN109740362B (en) Ciphertext index generation and retrieval method and system based on entropy coding
CN111026788B (en) Homomorphic encryption-based multi-keyword ciphertext ordering and retrieving method in hybrid cloud
CN109145079B (en) Cloud searchable encryption method based on personal interest user model
KR20110068542A (en) Method for searchable symmetric encryption
CN114826703B (en) Block chain-based data search fine granularity access control method and system
CN102024054A (en) Ciphertext cloud-storage oriented document retrieval method and system
JP6381128B2 (en) SEARCH SYSTEM, CLIENT, SERVER, SEARCH PROGRAM, AND SEARCH METHOD
CN106934301B (en) Relational database secure outsourcing data processing method supporting ciphertext data operation
CN113221155B (en) Multi-level and multi-level encrypted cloud storage system
CN108768639B (en) Public key order-preserving encryption method
CN110166466A (en) It is a kind of efficiently the multi-user of renewal authority to can search for encryption method and system
CN106874516A (en) Efficient cipher text retrieval method based on KCB trees and Bloom filter in a kind of cloud storage
WO2018070932A1 (en) System and method for querying an encrypted database for documents satisfying an expressive keyword access structure
CN108197491B (en) Subgraph retrieval method based on ciphertext
CN110727951B (en) Lightweight outsourcing file multi-keyword retrieval method and system with privacy protection function
CN113434739B (en) Forward-safe multi-user dynamic symmetric encryption retrieval method in cloud environment
JP2003186725A (en) Relational database, index table generation method in the relational database, and range search method and rank search method for its range search in the relational database
CN104794243B (en) Third party's cipher text retrieval method based on filename

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant