CN109492410B - Data searchable encryption and keyword search method, system, terminal and equipment - Google Patents

Publication number
CN109492410B
Authority
CN
China
Prior art keywords
keyword
file
data
character
dictionary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811170800.2A
Other languages
Chinese (zh)
Other versions
CN109492410A (en
Inventor
李西明
粟晨
郭玉彬
陶汝裕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Nengxi Information Technology Co ltd
Original Assignee
South China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Agricultural University filed Critical South China Agricultural University
Priority to CN201811170800.2A priority Critical patent/CN109492410B/en
Publication of CN109492410A publication Critical patent/CN109492410A/en
Application granted granted Critical
Publication of CN109492410B publication Critical patent/CN109492410B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/602Providing cryptographic facilities or services
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/62Protecting access to data via a platform, e.g. using keys or access control rules

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses a method, a system, a terminal and equipment for data searchable encryption and keyword search, wherein in the data searchable encryption process: acquiring a data file uploaded by a data owner; extracting key words of each data file; extracting the abstract of each data file to obtain an abstract file; according to the corresponding relation between each keyword and each data file, generating a dictionary gamma after data processing is carried out through an encryption algorithm, and encrypting each data file to obtain an encrypted data file; and encrypting each abstract file to obtain the encrypted abstract file. In the keyword search process: acquiring the dictionary gamma, the encrypted data files and the abstract files, searching whether the data files comprise the keywords or not through the dictionary gamma when keyword search is received, searching the keywords in the abstract files if the data files comprise the keywords, and adding label index pairs of the keywords into the dictionary gamma under the condition that the search is successful; the invention greatly improves the keyword search efficiency.

Description

Data searchable encryption and keyword search method, system, terminal and equipment
Technical Field
The invention belongs to the technical field of searchable encryption, and particularly relates to a method, a system, a terminal and equipment for data searchable encryption and keyword search.
Background
With the development of Internet applications, more and more users search by entering keywords on a search page and triggering a search operation. Specifically, after the search page receives the entered keyword and the triggered search operation, it lists associated words for that keyword; the user clicks one of the associated words to obtain the related search results, and then clicks a desired result to browse its details. This process is inefficient, because the user must perform several operations before obtaining the desired information.
In past searchable symmetric encryption (SSE) research, the keyword set was obtained by extracting keywords from the data files. The content and number of the keywords are limited by the keyword extraction algorithm and are fixed. When the keyword set is modified, a new keyword set must be submitted or rebuilt, which lengthens the construction of the data set and makes operation cumbersome. The existing keyword search method is roughly as follows:
Step 1, encryption process: the user encrypts the plaintext file locally with a key and uploads it to the server.
Step 2, trapdoor generation process: a user with retrieval capability uses the key to generate a trapdoor for the keyword to be queried; the trapdoor must not reveal any information about the keyword.
Step 3, search process: the server runs the retrieval algorithm with the keyword trapdoor as input and returns all ciphertext files containing the keyword corresponding to the trapdoor; the server must learn nothing beyond whether a ciphertext file contains that specific keyword.
Step 4, decryption process: the user decrypts the ciphertext files returned by the server with the key to obtain the query result.
In addition, David Cash et al. proposed a secure and efficient data processing scheme for databases of different sizes, especially larger ones, which can efficiently and privately search server-held encrypted databases with hundreds of billions of record-keyword pairs. Their basic construction supports single-keyword search and provides an asymptotically optimal server index size, fully parallel searching and minimal leakage.
In the above methods, the keywords are obtained by a keyword extraction algorithm, so their content and number are limited by that algorithm and are fixed. The keywords that can be queried are therefore limited, and whenever the keyword set is modified a new keyword set must be submitted or rebuilt, which lengthens the construction of the data set and makes operation cumbersome.
Disclosure of Invention
The first purpose of the present invention is to overcome the disadvantages and shortcomings of the prior art, and to provide a data searchable encryption method, which encrypts data files and summaries thereof uploaded by a data owner respectively, and generates a dictionary corresponding to a keyword at the same time.
It is a second object of the present invention to provide a data searchable encryption system.
A third object of the present invention is to provide a terminal.
The fourth purpose of the present invention is to provide a keyword searching method based on the above data searchable encryption method, which can automatically update the keywords during the searching process, thereby greatly improving the keyword searching efficiency.
It is a fifth object of the invention to provide a computing device.
The first purpose of the invention is realized by the following technical scheme: a data searchable encryption method comprises the following steps:
acquiring a data file uploaded by a data owner;
extracting key words of each data file;
extracting the abstract of each data file to obtain an abstract file;
generating a dictionary γ by processing the correspondence between each keyword and each data file with an encryption algorithm, wherein the dictionary γ stores, for each keyword, the label of the keyword for each corresponding data file and the index information of that data file, and, for each keyword, the labels and the index information are paired one to one in the dictionary γ;
encrypting each data file to obtain an encrypted data file;
and encrypting each abstract file to obtain the encrypted abstract file.
Preferably, the specific process of generating the dictionary γ after performing data processing by an encryption algorithm according to the correspondence between each keyword and each data file is as follows:
S11, establishing an empty table L, and selecting a master key K for the table L;
S12, for each keyword, obtaining each data file that includes the keyword, and deriving a pair of sub-keys $K_1$, $K_2$ for the keyword from the master key K:
$K_1 \leftarrow F(K, 1 \| \omega)$;
$K_2 \leftarrow F(K, 2 \| \omega)$;
where ω is the keyword;
for each keyword, numbering each data file that includes the keyword to obtain the file number of each data file, and sorting the file numbers to obtain the sequence number of each file number;
for each keyword, using the key $K_1$ to generate a label for each file number of the keyword in sequence, and using the key $K_2$ to encrypt each file number of the keyword in sequence, the encrypted result serving as the index information of the corresponding data file, to obtain the label-index pairs $(L_i, d_i)$:
$L_i \leftarrow F(K_1, i)$;
$d_i \leftarrow \mathrm{Enc}(K_2, id_i)$;
$i = 0, 1, \dots, N-1$;
where $L_i$ is the label generated with the key $K_1$ for the i-th file number $id_i$ of the keyword ω; $d_i$ is the result of encrypting the i-th file number $id_i$ of the keyword ω with the key $K_2$, and serves as the index information of the data file whose file number is $id_i$; N is the total number of data files that include the keyword ω;
S13, inserting the label-index pairs $(L_i, d_i)$ obtained for each keyword into the table L in lexicographic (dictionary) order, and attaching a timestamp $time_i$ to each pair to obtain entries $(L_i, d_i, time_i)$; the dictionary γ is created from the table L; the initial value of $time_i$ is the time at which encryption of the index information $d_i$ in the pair $(L_i, d_i)$ was completed.
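The dictionary construction in steps S11 to S13 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: HMAC-SHA256 stands in for the pseudorandom function F, `enc`/`dec` are a toy nonce-plus-XOR scheme standing in for Enc/Dec, and the names `prf`, `enc`, `dec` and `build_dictionary` are hypothetical.

```python
import hashlib
import hmac
import os
import time


def prf(key: bytes, msg: bytes) -> bytes:
    # F: HMAC-SHA256 stands in for the pseudorandom function (assumption).
    return hmac.new(key, msg, hashlib.sha256).digest()


def enc(key: bytes, file_id: int) -> bytes:
    # Toy Enc (assumption): a random nonce followed by the 4-byte file number
    # XORed with a PRF-derived pad. Illustrative only, not a vetted cipher.
    nonce = os.urandom(16)
    pad = prf(key, nonce)[:4]
    body = bytes(a ^ b for a, b in zip(file_id.to_bytes(4, "big"), pad))
    return nonce + body


def dec(key: bytes, ct: bytes) -> int:
    # Inverse of the toy Enc above.
    nonce, body = ct[:16], ct[16:]
    pad = prf(key, nonce)[:4]
    return int.from_bytes(bytes(a ^ b for a, b in zip(body, pad)), "big")


def build_dictionary(master_key: bytes, keyword_files: dict) -> dict:
    table = []  # the table L of (label, index info, timestamp) entries
    for keyword, file_ids in keyword_files.items():
        w = keyword.encode()
        k1 = prf(master_key, b"1" + w)  # K1 <- F(K, 1 || w)
        k2 = prf(master_key, b"2" + w)  # K2 <- F(K, 2 || w)
        for i, file_id in enumerate(sorted(file_ids)):
            label = prf(k1, i.to_bytes(4, "big"))  # L_i <- F(K1, i)
            d = enc(k2, file_id)                   # d_i <- Enc(K2, id_i)
            table.append((label, d, time.time()))  # timestamp set after Enc
    table.sort(key=lambda entry: entry[0])         # lexicographic by label
    return {label: (d, ts) for label, d, ts in table}
```

For example, `build_dictionary(key, {"cloud": [7, 3]})` yields two entries whose labels are derived from sequence numbers 0 and 1 and whose index information decrypts to file numbers 3 and 7.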
Preferably, the process of acquiring each summary file is as follows:
first, extracting an abstract from each data file with a document abstract extraction algorithm; then, taking the file number corresponding to the abstract as an index, storing the abstract at the corresponding position and padding the remaining positions with filler characters to form an abstract file;
performing substring-searchable encryption on each abstract file using the Burrows-Wheeler transform and the FM-index technique, specifically:
creating a linked list for each distinct character in the abstract file; in each character's linked list, every node stores a tuple <nptr, addr>, where nptr is a pointer to the next node of that linked list and addr is the position in the FM index of one occurrence of the character in the abstract file; the addr values of the different nodes of a linked list are the FM-index positions of the different occurrences of that character in the abstract file;
for each distinct character in the abstract file, encrypting the stored tuple of the first node (the head) of its linked list to obtain:
[equation image GDA0002583190940000041, not recoverable from the extracted text]
where $\langle nptr_1, addr_1 \rangle$ is the stored tuple of the first node (the head) of each character's linked list; $c_m$ is the m-th of the distinct characters in the abstract file, and Y is the total number of distinct characters in the abstract file; K′ is the secondary key, and $F_{K'}(c_m)$ denotes encrypting the character $c_m$ with the secondary key K′;
encrypting each distinct character in the abstract file and using the encrypted values as linked-list indexes to obtain a linked-list index set, and mapping the linked-list index of each distinct character to the head of that character's linked list, thereby obtaining the mapping between linked-list indexes and linked-list heads; the result of encrypting each distinct character in the abstract file is:
[equation image GDA0002583190940000042, not recoverable from the extracted text]
where K is the master key and $F_K(c_m)$ denotes encrypting the character $c_m$ with the master key K; $F_{K'}(c_m)$ denotes encrypting the character $c_m$ with the secondary key K′.
The second purpose of the invention is realized by the following technical scheme: a data searchable encryption system comprising:
the data file acquisition unit is used for acquiring a data file uploaded by a data owner;
the keyword extraction unit is used for extracting the keywords of each data file;
the abstract extraction unit is used for extracting an abstract of each data file to obtain an abstract file;
the dictionary generating unit is used for generating a dictionary gamma after data processing is carried out through an encryption algorithm according to the corresponding relation between each keyword and each data file, wherein the dictionary gamma stores the label corresponding to each data file by each keyword and the index information corresponding to each data file by each keyword, and the label corresponding to each data file by each keyword and the index information corresponding to each data file by each keyword are in one-to-one pairing relation aiming at each keyword;
the data file encryption unit is used for encrypting each data file to obtain an encrypted data file;
and the abstract file encryption unit is used for performing substring-searchable encryption on each abstract file to obtain encrypted abstract files.
The third purpose of the invention is realized by the following technical scheme: a terminal comprising a processor and a memory for storing a program executable by the processor, wherein the processor executes the program stored in the memory to implement the data searchable encryption method according to the first object of the present invention.
The fourth purpose of the invention is realized by the following technical scheme: a keyword search method comprises the following steps:
step X1, first acquiring the dictionary γ, the encrypted data files and the encrypted abstract files produced by the data searchable encryption method of the first object of the invention;
on receiving each keyword that a user wants to search for, first determining whether an encrypted data file includes the keyword by searching the dictionary γ; if so, returning the corresponding encrypted data file to the user as the query result for decryption;
if not, going to step X2;
step X2, performing a substring search for each keyword to be searched in the set of encrypted abstract files;
if the substring search finds the keyword in an abstract file, returning the encrypted data file corresponding to that abstract file to the user as the query result; once the user confirms that the returned data file is correct, the data file is confirmed to include the keyword, and the label of the keyword for that data file and the index information of that data file are computed and added to the dictionary γ, thereby updating the dictionary γ;
if the substring search fails, returning a search failure result to the user.
Preferably, in the step X1, for each keyword that needs to be searched, a specific process of determining whether the encrypted data file includes the keyword through the search dictionary γ is as follows:
step X11, for each keyword that the user wants to search for, deriving a pair of sub-keys $K'_1$, $K'_2$ for the keyword from the master key K sent by the user:
$K'_1 \leftarrow F(K, 1 \| \omega')$;
$K'_2 \leftarrow F(K, 2 \| \omega')$;
where ω′ is the keyword that the user wants to search for;
step X12, for each keyword to be searched, traversing the file-number sequence numbers of the data files and using the sub-key $K'_1$ to generate the label of the keyword for the data file at each sequence number:
$L_{i'} \leftarrow F(K'_1, i')$; $i' = 0, 1, 2, \dots, I$;
where i′ is the file-number sequence number of the traversed data file, and I is the maximum file-number sequence number of the traversed data files; $L_{i'}$ is the label of the keyword ω′ for the data file with file number i′;
step X13, for each keyword to be searched, checking whether the dictionary γ contains any of the labels $L_{i'}$ generated in step X12;
if not, going to step X2;
if so, obtaining from the dictionary γ the index information paired with the label, decrypting the index information with the sub-key $K'_2$ of the keyword, retrieving the corresponding encrypted data file through the decrypted index information, and returning the encrypted data file to the user as the query result for decryption; at the same time, the timestamp of the label-index pair stored in the dictionary γ is updated to the time at which decryption of the index information with the sub-key $K'_2$ was completed;
the index information paired with the label is obtained from the dictionary γ as:
$d_{i'} \leftarrow \mathrm{Get}(\gamma, L_{i'})$;
where $d_{i'}$ is the index information paired with the label $L_{i'}$ in the dictionary γ;
the decrypted index information obtained with the sub-key $K'_2$ of the keyword is:
$d_i \leftarrow \mathrm{Dec}(K'_2, d_{i'})$;
where $d_i$ is the result of decrypting $d_{i'}$ with the sub-key $K'_2$ of the keyword ω′; the decrypted index information $d_i$ is the file number of a data file that includes the keyword ω′.
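Steps X11 to X13 can be sketched as follows. The sketch assumes HMAC-SHA256 for F and a toy nonce-plus-XOR scheme for Enc/Dec; `prf`, `enc`, `dec`, `add_entry` and `search_dictionary` are hypothetical names, and `max_seq` plays the role of the maximum sequence number I. An empty result corresponds to the case in which the method falls through to step X2.

```python
import hashlib
import hmac
import os
import time


def prf(key: bytes, msg: bytes) -> bytes:
    # F: HMAC-SHA256 stands in for the pseudorandom function (assumption).
    return hmac.new(key, msg, hashlib.sha256).digest()


def enc(key: bytes, file_id: int) -> bytes:
    # Toy Enc (assumption): nonce plus XOR pad; illustrative only.
    nonce = os.urandom(16)
    pad = prf(key, nonce)[:4]
    return nonce + bytes(a ^ b for a, b in zip(file_id.to_bytes(4, "big"), pad))


def dec(key: bytes, ct: bytes) -> int:
    # Inverse of the toy Enc above.
    pad = prf(key, ct[:16])[:4]
    return int.from_bytes(bytes(a ^ b for a, b in zip(ct[16:], pad)), "big")


def add_entry(gamma: dict, master_key: bytes, keyword: str, seq: int, file_id: int) -> None:
    # Store one label-index pair with a timestamp, as in dictionary creation.
    w = keyword.encode()
    k1 = prf(master_key, b"1" + w)
    k2 = prf(master_key, b"2" + w)
    gamma[prf(k1, seq.to_bytes(4, "big"))] = (enc(k2, file_id), time.time())


def search_dictionary(gamma: dict, master_key: bytes, keyword: str, max_seq: int) -> list:
    w = keyword.encode()
    k1 = prf(master_key, b"1" + w)  # K'1 <- F(K, 1 || w')
    k2 = prf(master_key, b"2" + w)  # K'2 <- F(K, 2 || w')
    file_numbers = []
    for i in range(max_seq + 1):
        label = prf(k1, i.to_bytes(4, "big"))  # L_i' <- F(K'1, i')
        entry = gamma.get(label)               # d_i' <- Get(gamma, L_i')
        if entry is None:
            continue
        d, _ = entry
        file_numbers.append(dec(k2, d))        # d_i <- Dec(K'2, d_i')
        gamma[label] = (d, time.time())        # refresh timestamp after Dec
    return file_numbers  # empty list: fall back to the substring search
```

Refreshing the timestamp on every hit is what later lets a fixed-length dictionary identify its least recently used pairs.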
Preferably, in step X2, the substring search in the set of encrypted abstract files combines the Burrows-Wheeler transform and the FM-index technique, specifically:
step X21, for the keyword ω′ to be searched, generating the keyword query token $tk_{T,S}$:
$tk_{T,S} = F(K, \omega'[1 \dots M]) = F(K, \omega'[1]), F(K, \omega'[2]), \dots, F(K, \omega'[M]), F(K', \omega'[M])$;
where ω′[1], ω′[2], …, ω′[M] are the characters of the keyword ω′ to be searched, and M is the total number of characters of the keyword ω′; K′ is the secondary key, $K' = F(K, 2)$, and K is the master key;
step X22, for each character ω′[m], m = 1, 2, 3, …, M, of the keyword ω′ to be searched, first encrypting the character to obtain:
[equation image GDA0002583190940000071, not recoverable from the extracted text]
then looking up the ciphertext
[equation image GDA0002583190940000072, not recoverable from the extracted text]
in the linked-list index set, and obtaining the linked list of the character ω′[m] through the mapping between linked-list indexes and linked-list heads;
step X23, for the last character ω′[M] of the keyword ω′ to be searched, each node in the linked list of the character ω′[M] maps to an encrypted FM tuple:
[equation image GDA0002583190940000073, not recoverable from the extracted text]
where [equation image GDA0002583190940000074] is the data in column F of FM;
[equation image GDA0002583190940000075] is the data in column L of FM;
$E(pos_j)$ is the data in row j of column SA of FM, $pos_j$ is the ciphertext of the position in the abstract file of the character corresponding to the data in row j of column SA, and n is the total number of rows of FM;
[equation image GDA0002583190940000076] is the character corresponding to the data in row j of column F of FM, and [equation image GDA0002583190940000077] is the position number of that character [equation image GDA0002583190940000078];
[equation image GDA0002583190940000079] is the character corresponding to the data in row j of column L of FM, and [equation image GDA0002583190940000081] is the position number of that character [equation image GDA0002583190940000082];
for each encrypted FM tuple to which a node in the linked list of ω′[M] maps:
first, $F_K(\omega'[m])$ is XORed with the data in column F of FM, namely [equation image GDA0002583190940000083], to decrypt it and obtain [equation image GDA0002583190940000084];
then [equation image GDA0002583190940000085] is used as a key to decrypt the element of the first part of the data in column L of FM, [equation image GDA0002583190940000086], to obtain [equation image GDA0002583190940000087];
[equation image GDA0002583190940000088] is then XORed with the element of the second part of the data in column L of FM, [equation image GDA0002583190940000089], to obtain an XOR result, and the method proceeds to step X24;
step X24, for each XOR result obtained in the previous step, searching column F of FM for the row whose data equals that XOR result, obtaining the FM tuple of that row, and finding the linked list that has a node mapped to that FM tuple, thereby obtaining the character $c_x$ corresponding to that linked list as the currently found character; where x is the number of searches made so far in column F of FM; then going to step X25;
step X25, determining whether any character $c_x$ obtained in step X24 is identical to the character ω′[M−x];
if so, determining whether the number x of searches in column F of FM equals M−1; if it does, the substring search ends successfully and the corresponding abstract file includes the keyword ω′ to be searched; if it does not, going to step X26;
if not, ending the substring search and returning a substring-search failure result, that is, the corresponding abstract file does not contain the keyword ω′;
step X26, for each character $c_x$ obtained in step X24 that is identical to the character ω′[M−x], taking each FM tuple from which the character $c_x$ was obtained in step X24, and for each such FM tuple:
first, $F_K(c_x)$ is XORed with the data in column F of FM, namely [equation image GDA00025831909400000810], to decrypt it and obtain [equation image GDA00025831909400000811];
then [equation image GDA00025831909400000812] is used as a key to decrypt the element of the first part of the data in column L of FM, [equation image GDA00025831909400000813], to obtain [equation image GDA00025831909400000814];
[equation image GDA00025831909400000815] is then XORed with the element of the second part of the data in column L of FM, [equation image GDA00025831909400000816], to obtain an XOR result; the method then returns to step X24.
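Steps X21 to X26 match the keyword from its last character backward through the F and L columns. The sketch below shows the standard plaintext FM-index backward search that follows the same right-to-left pattern; it deliberately omits all encryption, linked lists and tokens, so it is an analogy rather than the encrypted procedure itself. The function names are hypothetical, and the text is assumed to contain only characters that sort after the terminator `$`.

```python
def fm_columns(text: str):
    # Build the F column, L column and suffix array SA of text + "$".
    s = text + "$"
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    first = [s[i] for i in sa]      # F: first character of each sorted suffix
    last = [s[i - 1] for i in sa]   # L: character preceding each suffix (BWT)
    return first, last, sa


def find_substring(text: str, pattern: str) -> list:
    # Return the sorted start positions of pattern in text (empty if absent).
    first, last, sa = fm_columns(text)
    start = {}                      # first row in F for each character
    for row, c in enumerate(first):
        start.setdefault(c, row)

    def occ(c: str, k: int) -> int:
        # Number of occurrences of c in L[0:k].
        return sum(1 for x in last[:k] if x == c)

    lo, hi = 0, len(first)          # current row range, half-open
    for c in reversed(pattern):     # last character first, as in step X23
        if c not in start:
            return []
        lo = start[c] + occ(c, lo)
        hi = start[c] + occ(c, hi)
        if lo >= hi:
            return []               # mismatch: substring search fails
    return sorted(sa[lo:hi])
```

Here `occ` scans the L column linearly for clarity; practical FM indexes precompute rank tables so that each step takes constant time.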
Preferably, the dictionary γ is a fixed-length dictionary, and in step X2 the dictionary γ is updated as follows:
when the label of a new keyword for a data file and the index information of that data file need to be added to the dictionary γ, that is, when a new label-index pair needs to be added to the dictionary γ, and the dictionary is already full of label-index pairs, the new pair replaces the label-index pair with the smallest timestamp in the dictionary γ; when there are several new label-index pairs, they replace the same number of label-index pairs with the smallest timestamps in the dictionary γ.
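The replacement rule above can be sketched as a small fixed-capacity store. This is a minimal illustration: the class name `FixedDictionary` and its methods are hypothetical, and a logical counter stands in for the wall-clock timestamps so that the ordering is deterministic.

```python
import itertools


class FixedDictionary:
    """Fixed-length label -> index-information store with timestamp eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = {}                # label -> (index_info, timestamp)
        self._clock = itertools.count()  # logical clock (assumption)

    def get(self, label):
        # Return the paired index information and refresh its timestamp.
        entry = self.entries.get(label)
        if entry is None:
            return None
        self.entries[label] = (entry[0], next(self._clock))
        return entry[0]

    def add(self, label, index_info) -> None:
        # When full, evict the pair with the smallest timestamp first.
        if label not in self.entries and len(self.entries) >= self.capacity:
            oldest = min(self.entries, key=lambda lab: self.entries[lab][1])
            del self.entries[oldest]
        self.entries[label] = (index_info, next(self._clock))
```

Adding several new pairs simply calls `add` once per pair, which evicts the same number of smallest-timestamp pairs once the dictionary is full.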
The fifth purpose of the invention is realized by the following technical scheme: a computing device comprising a processor and a memory for storing a program executable by the processor, wherein the processor, when executing the program stored in the memory, implements the keyword search method according to the fourth aspect of the present invention.
Compared with the prior art, the invention has the following advantages and effects:
(1) In the data searchable encryption method, data files uploaded by a data owner are first acquired; keywords are extracted from each data file, and an abstract is extracted from each data file to obtain abstract files; a dictionary γ is generated from the correspondence between keywords and data files, and each data file is encrypted to obtain encrypted data files; meanwhile, each abstract file undergoes substring-searchable encryption to obtain encrypted abstract files. When the dictionary γ, the encrypted data files and the encrypted abstract files obtained by the method are uploaded for searching, a keyword can be searched not only through the dictionary but also through the abstract files, which makes keyword search more efficient.
(2) In the keyword search method, the dictionary γ, the encrypted data files and the encrypted abstract files produced by the data searchable encryption method are first acquired. On receiving each keyword that a user wants to search for, the method first determines, by searching the dictionary γ, whether an encrypted data file includes the keyword; if so, the corresponding encrypted data file is returned to the user as the query result for decryption; if not, a substring search for the keyword is performed in the set of encrypted abstract files. If the substring search finds the keyword in an abstract file, the encrypted data file corresponding to that abstract file is returned to the user as the query result, and the label of the keyword for that data file and the index information of that data file are computed and added to the dictionary γ. When a keyword is not found through the dictionary γ, its label is not in the dictionary, that is, the keyword was not in the keyword set from which the dictionary was initially generated. In that case the invention searches for the keyword in the abstract files by substring search; when the keyword is found there, the corresponding label and the index information of the data file are added to the dictionary γ, so that the next search for that keyword succeeds through the dictionary γ. Updating the keyword dictionary during searching, in combination with substring search, makes the dictionary content more accurate and flexible, frees it from the limits of the keyword extraction algorithm, and greatly improves keyword search efficiency.
(3) In the keyword search method, the dictionary γ is a fixed-length dictionary. When the label of a new keyword for a data file and the index information of that data file need to be added to the dictionary γ, that is, when a new label-index pair needs to be added, and the dictionary is already full of label-index pairs, the new pair replaces the label-index pair with the smallest timestamp in the dictionary γ; when there are several new label-index pairs, they replace the same number of label-index pairs with the smallest timestamps. The updatable keyword dictionary uses a feedback mechanism similar to a fast table, which prevents incorrect query records from making the dictionary grow without bound and consuming memory, and lets a dictionary of fixed size overwrite and refresh keywords that are rarely used.
Drawings
FIG. 1 is a flow chart of a data searchable encryption method of the present invention.
FIG. 2 is a flow chart of a keyword search method of the present invention.
FIG. 3 is a linked list index, linked list and FM map of the present invention.
FIG. 4 is a general block diagram of the data searchable encryption and keyword search methodology of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but the present invention is not limited thereto.
Example 1
The embodiment discloses a data searchable encryption method, as shown in fig. 1, the steps are as follows:
step S1, acquiring the data file uploaded by the data owner;
step S2, extracting keywords of each data file; simultaneously, abstracting each data file to obtain abstract files; in this embodiment, the process of acquiring each summary file specifically includes:
firstly, extracting a summary file from a data file through a document summary extraction algorithm; and then, taking the file number corresponding to the abstract as an index, storing the abstract at a corresponding position, and performing character filling on the rest positions to form an abstract file.
Step S3, according to the corresponding relation between each keyword and each data file, generating a dictionary gamma after data processing is carried out through an encryption algorithm, wherein the dictionary gamma stores the label corresponding to each data file by each keyword and the index information corresponding to each data file by each keyword, and for each keyword, the label corresponding to each data file by the keyword and the index information corresponding to each data file by the keyword in the dictionary gamma are in one-to-one pairing relation; the specific process of generating the dictionary γ after performing data processing by an encryption algorithm according to the correspondence between each keyword and each data file is as follows:
s31, establishing an empty table L, and selecting a master key K for the table L;
s32, aiming at each keyword, obtaining the keyword comprisesEach data file of the key word, and a pair of subkeys K is generated for the key word by the main key K1,K2
K1←F(K,1||ω);
K2←F(K,2||ω);
where ω is the keyword;
Step S33, first, for each keyword, number each data file that includes the keyword to obtain the file number of each data file, and sort the file numbers to obtain the sequence number of each file number; then, for each keyword, use the key K1 to generate a label for each file number of the keyword in sequence, and use the key K2 to encrypt each file number of the keyword in sequence, taking the encrypted result as the index information of the corresponding data file, to obtain the label-index pairs (L_i, d_i):
L_i ← F(K1, i);
d_i ← Enc(K2, id_i);
i = 0, 1, …, N-1;
where L_i is the label generated with the key K1 for the i-th file number id_i of the keyword ω; d_i is the result of encrypting the i-th file number id_i of the keyword ω with the key K2, and serves as the index information of the data file whose file number is id_i for that keyword; N is the total number of data files that include the keyword ω;
Step S34, insert the label-index pairs (L_i, d_i) obtained for each keyword into the table L in dictionary order; add a timestamp time_i to each label-index pair (L_i, d_i) to obtain entries (L_i, d_i, time_i), and create the dictionary γ from the table L; the initial value of time_i is the time at which encryption of the index information d_i of the label-index pair (L_i, d_i) was completed;
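The construction in steps S31 to S34 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: F is instantiated with HMAC-SHA256 (the text only says that F is a hash function), Enc and Dec are a toy XOR pad used purely for illustration, the table L is skipped and the pairs are written straight into a Python dict, and a logical counter stands in for the timestamp. The names F, enc, dec and build_dictionary are hypothetical.

```python
import hashlib
import hmac
import os

def F(key: bytes, msg: bytes) -> bytes:
    """Keyed hash; HMAC-SHA256 is an assumed instantiation of F."""
    return hmac.new(key, msg, hashlib.sha256).digest()

def enc(key: bytes, file_id: int) -> bytes:
    """Toy Enc: 8-byte file number XORed with an HMAC-derived pad (illustration only)."""
    nonce = os.urandom(16)
    pad = F(key, nonce)[:8]
    body = bytes(a ^ b for a, b in zip(file_id.to_bytes(8, "big"), pad))
    return nonce + body

def dec(key: bytes, blob: bytes) -> int:
    """Inverse of the toy Enc above."""
    nonce, body = blob[:16], blob[16:]
    pad = F(key, nonce)[:8]
    return int.from_bytes(bytes(a ^ b for a, b in zip(body, pad)), "big")

def build_dictionary(master_key: bytes, keyword_to_ids: dict) -> dict:
    """Return gamma: label L_i -> (d_i, time_i) for every keyword."""
    gamma = {}
    clock = 0  # logical timestamp; an assumption, a wall clock could be used instead
    for word, ids in keyword_to_ids.items():
        w = word.encode()
        k1 = F(master_key, b"1" + w)  # K1 <- F(K, 1 || w)
        k2 = F(master_key, b"2" + w)  # K2 <- F(K, 2 || w)
        for i, file_id in enumerate(sorted(ids)):
            label = F(k1, str(i).encode())            # L_i <- F(K1, i)
            gamma[label] = (enc(k2, file_id), clock)  # d_i <- Enc(K2, id_i)
            clock += 1
    return gamma

K = os.urandom(32)
gamma = build_dictionary(K, {"cloud": [7, 3], "index": [3]})
```

Because the labels depend only on K1 and the sequence number i, someone who holds only the dictionary sees opaque labels, while a holder of the master key can recompute them for any keyword.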
step S4, encrypting each data file to obtain an encrypted data file; and encrypting each abstract file to obtain the encrypted abstract file.
In this embodiment, substring-search encryption is applied to each digest file using the Burrows-Wheeler transform (BWT) algorithm and the FM index technique. The BWT algorithm transforms a data stream according to the entropy of each character. In short, the data stream S is transformed into an encoding W on which a compression algorithm achieves a high compression rate. The transformation proceeds roughly as follows: first, the algorithm appends a termination token $ to the input string S and builds a matrix W by cyclically shifting the resulting sequence; the shifted sequence of each iteration is appended to the matrix W as a new row. Finally, the rows of W are sorted in ascending dictionary order.
The data obtained from the BWT is mapped into FM by the LF mapping technique. LF mapping takes the first column F and the last column L of the BWT matrix and reconstructs the original string S iteratively. Starting from the first element of columns F and L, L is used as an index into column F. In each iteration the current element of column L is pushed onto a last-in-first-out stack, and the value at the current position of column L is used as the index into column F in the next iteration. In the first iteration the pointer points to the first position of both F and L. From this the last position F[7] = s is found, and the character at the current position of L (i.e., s) is pushed onto the stack D. In the next iteration, the current character of L gives the next index into column F, and the character i is pushed. The process ends when $ is reached in L. The algorithm then pops all elements from the stack D to obtain the initial string S.
In the FM index technique, FM consists of three columns. The first is column F and the second is column L of the LF mapping, which corresponds to BWT(S). The last is the suffix array SA: for each row i, SA holds the position in the original string S of the substring in the i-th row of the matrix W obtained after the BWT.
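The transform, the suffix array and the stack-based LF reconstruction described above can be illustrated on plaintext. The following is a minimal sketch of the textbook BWT, without any encryption; it assumes that the terminator $ does not occur in the input and sorts before every input character, and the function names are hypothetical.

```python
def bwt(s: str, end: str = "$") -> str:
    """Last column L of the sorted rotation matrix W of s + end."""
    t = s + end
    rotations = sorted(t[i:] + t[:i] for i in range(len(t)))
    return "".join(row[-1] for row in rotations)

def suffix_array(s: str, end: str = "$") -> list:
    """SA[i] = start position in s + end of the suffix that opens row i of W."""
    t = s + end
    return sorted(range(len(t)), key=lambda i: t[i:])

def inverse_bwt(last: str, end: str = "$") -> str:
    """Rebuild s from L by LF mapping, pushing characters onto a stack."""
    first = sorted(last)                      # column F
    first_pos = {}
    for idx, c in enumerate(first):
        first_pos.setdefault(c, idx)          # first row of F holding c
    seen, rank = {}, []
    for c in last:
        rank.append(seen.get(c, 0))           # occurrences of c above this row in L
        seen[c] = seen.get(c, 0) + 1
    stack, row = [], 0                        # row 0 is the rotation starting with end
    while last[row] != end:
        stack.append(last[row])
        row = first_pos[last[row]] + rank[row]  # LF step: L[row] -> same char in F
    return "".join(reversed(stack))           # popping the stack yields s

transformed = bwt("banana")
restored = inverse_bwt(transformed)
```

The characters are pushed from the end of S towards its start, which is why the stack has to be popped in reverse to recover S.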
The specific process of substring search encryption in this embodiment is as follows:
Step S41, create a linked list for each distinct character in the abstract file. In each character's linked list, every node stores a tuple <nptr, addr>, where nptr is a pointer to the next node of that linked list and addr is the position in the FM index of one occurrence of the character in the summary file; the addr values stored in the different nodes of a linked list are the FM-index positions of the character's occurrences at different positions in the summary file. For example, if the character T occurs at 10 positions of a summary file, then the addr values stored in the 1st to 10th nodes of the linked list of T are the FM-index positions of T at those 10 positions.
For each distinct character in the abstract file, the tuple stored in the first node of its linked list, i.e. the linked list head, is encrypted to obtain:
Figure GDA0002583190940000121
where <nptr1, addr1> is the tuple stored in the first node of each character's linked list, i.e. the linked list head; c_m is the m-th of the distinct characters of the abstract file, and Y is the total number of distinct characters in the abstract file; K′ is a secondary key, and F_K′(c_m) denotes encrypting the character c_m with the secondary key K′.
Step S42, each distinct character in the abstract file is first encrypted, and the encrypted data of each distinct character is taken as a linked list index, giving a linked list index set; the linked list index of each distinct character is then mapped to the head of that character's linked list, giving the mapping relation between the linked list indexes and the linked list heads of the distinct characters. The result of encrypting each distinct character in the summary file is:
Figure GDA0002583190940000131
where K is the master key, F_K(c_m) denotes encrypting the character c_m with the master key K, and F_K′(c_m) denotes encrypting the character c_m with the secondary key K′.
FIG. 3 shows the linked list index set LLSET, the linked lists LL and the FM table obtained by the above steps in this embodiment, together with their mapping: each linked list index is mapped to the head of one linked list, so the corresponding linked list can be obtained from the linked list index; each node of each linked list is mapped to one FM tuple, i.e. one row of the FM table, so the corresponding linked list can also be obtained from an FM tuple through the mapping between the linked lists and FM. In FIG. 3,
Figure GDA0002583190940000132
denotes the tuple stored in the i-th node of the linked list corresponding to the character c_m, i = 1, 2, 3, ….
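The relation between the linked list index set, the per-character lists and the FM rows can be sketched on plaintext. This is only an illustration under several assumptions: a Python list replaces the <nptr, addr> linked list, addr is taken to be the FM row whose suffix starts at the occurrence of the character, F is instantiated with HMAC-SHA256, and the encryption of the list head (given only as a formula image above) is omitted. The name build_char_lists is hypothetical.

```python
import hashlib
import hmac

def F(key: bytes, msg: bytes) -> bytes:
    """Keyed hash; HMAC-SHA256 is an assumed instantiation of F."""
    return hmac.new(key, msg, hashlib.sha256).digest()

def build_char_lists(text: str, key: bytes, end: str = "$") -> dict:
    """Map the index F(K, c) to the FM row addresses of every occurrence of c."""
    t = text + end
    sa = sorted(range(len(t)), key=lambda i: t[i:])    # suffix array, one entry per FM row
    row_of = {pos: row for row, pos in enumerate(sa)}   # text position -> FM row
    lists = {}
    for pos, ch in enumerate(text):
        index = F(key, ch.encode())                     # linked list index of character ch
        lists.setdefault(index, []).append(row_of[pos]) # one addr per node
    return lists

K = b"\x03" * 32
ll = build_char_lists("banana", K)
```

For the text "banana" this yields three lists, one per distinct character, and the list for the character a has three entries because a occurs three times.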
The embodiment also discloses a data searchable encryption system, which includes:
a data file acquisition unit, used for acquiring the data files uploaded by a data owner;
a keyword extraction unit, used for extracting the keywords of each data file;
an abstract extraction unit, used for extracting an abstract of each data file to obtain an abstract file;
a dictionary generating unit, used for generating a dictionary γ after data processing by an encryption algorithm according to the correspondence between each keyword and each data file, wherein the dictionary γ stores, for each keyword, the label of each data file corresponding to the keyword and the index information of each data file corresponding to the keyword, and for each keyword the labels and the index information are in one-to-one pairing;
a data file encryption unit, used for encrypting each data file to obtain an encrypted data file;
and a digest file encryption unit, used for performing substring-search encryption on each digest file to obtain an encrypted digest file.
The embodiment also discloses a terminal, which comprises a processor and a memory for storing a program executable by the processor; when the processor executes the program stored in the memory, the data searchable encryption method of this embodiment is implemented. In this embodiment, as shown in fig. 4, the terminal may be a computer serving as a client: the data owner uploads data files to the computer, the computer executes the above data searchable encryption method to obtain the dictionary γ, the encrypted data files and the encrypted digest files, and the computer may then upload them to a server, so that an authorized user (a user holding the master key issued by the system) can search for the corresponding data files by keyword.
Example 2
The embodiment discloses a keyword search method, as shown in fig. 2, including the following steps:
step X1, first obtaining a dictionary γ, an encrypted data file, and an encrypted digest file obtained by the data searchable encryption method of the embodiment;
when each keyword to be searched is received from a user, first determine, by searching the dictionary γ, whether any encrypted data file includes the keyword; if so, return the corresponding encrypted data file to the user as the query result for decryption; if not, go to step X2;
in this embodiment, a specific process of determining whether there is any encrypted data file including the keyword by searching the dictionary γ for each keyword that needs to be searched is as follows:
Step X11, for each keyword that the user needs to search, generate a pair of sub-keys K′1, K′2 for the keyword from the master key K sent by the user:
K′1←F(K,1||ω′);
K′2←F(K,2||ω′);
where ω′ is the keyword that the user needs to search, and the function F() in this embodiment denotes a hash function. The master key K and each keyword to be searched are sent by the user together;
Step X12, for each keyword to be searched, traverse the file number sequence numbers of the data files and generate, with the sub-key K′1, the label of the data file corresponding to each file number for the keyword:
L_{i′} ← F(K′1, i′); i′ = 0, 1, 2, …, I;
where i′ is the file number sequence number of the traversed data file, I is the maximum file number sequence number traversed, and I+1 is the preset total number of data files that include the keyword to be searched; L_{i′} is the label of the data file with file number i′ for the keyword ω′;
Step X13, for each keyword to be searched, look up in the dictionary γ whether there is a label L_{i′}, generated in step X12, of the data file corresponding to each file number for the keyword:
If not, the data file corresponding to the keyword required to be searched cannot be searched through the dictionary γ, and the process proceeds to step X2.
If yes, obtain from the dictionary γ the index information paired with the label, then decrypt the index information with the sub-key K′2 of the keyword, obtain the corresponding encrypted data file through the decrypted index information, and return the encrypted data file to the user as the query result for decryption; at the same time, update the timestamp of that label-index pair stored in the dictionary γ to the time at which decryption of the index information with the sub-key K′2 was completed;
the index information paired with the label is obtained from the dictionary γ as:
d_{i′} ← Get(γ, L_{i′});
where d_{i′} is the index information paired with the label L_{i′} in the dictionary γ;
the index information decrypted with the sub-key K′2 of the keyword is:
d_i ← Dec(K′2, d_{i′});
where d_i is the index information obtained by decrypting d_{i′} with the sub-key K′2 of the keyword ω′; the decrypted index information d_i is the file number of a data file containing the keyword ω′;
step X2, performing substring search on each keyword to be searched in the encrypted abstract file set;
If the substring search finds the keyword in an abstract file, the encrypted data file corresponding to that abstract file is returned to the user as the query result; if the user confirms that the data file is correct, it is determined that the data file returned as the query result includes the keyword, the label of the data file for the keyword and the index information of the data file are calculated and added to the dictionary γ, and the dictionary γ is thereby updated. In this embodiment the dictionary γ has a fixed length. When a label-index pair of a new keyword has to be added and the dictionary γ is already full, the new label-index pair replaces the label-index pair with the smallest timestamp in the dictionary γ; when the new keyword has several label-index pairs, they replace the same number of label-index pairs with the smallest timestamps in the dictionary γ.
If the substring search fails, a search-failure result is returned to the user.
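The fixed-length dictionary with replacement of the smallest-timestamp pair behaves like a least-recently-used cache. The following is a minimal sketch, assuming a logical counter in place of real timestamps; the class FixedDictionary and its methods are hypothetical names, not part of the patent text.

```python
class FixedDictionary:
    """Hypothetical fixed-capacity gamma: label -> (index_info, timestamp)."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = {}
        self.clock = 0  # logical time; an assumed stand-in for real timestamps

    def _tick(self) -> int:
        self.clock += 1
        return self.clock

    def get(self, label):
        """Return the index info and refresh its timestamp, or None if absent."""
        if label not in self.entries:
            return None
        info, _ = self.entries[label]
        self.entries[label] = (info, self._tick())
        return info

    def add(self, label, info):
        """Insert a pair, replacing the smallest-timestamp pair when full."""
        if label not in self.entries and len(self.entries) >= self.capacity:
            oldest = min(self.entries, key=lambda k: self.entries[k][1])
            del self.entries[oldest]
        self.entries[label] = (info, self._tick())

gamma = FixedDictionary(capacity=2)
gamma.add("L_a", "d_a")
gamma.add("L_b", "d_b")
gamma.get("L_a")         # L_a now has a newer timestamp than L_b
gamma.add("L_c", "d_c")  # the dictionary is full, so L_b is replaced
```

Since a successful lookup refreshes the timestamp, pairs of frequently searched keywords stay in the dictionary while rarely used pairs are the first to be replaced.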
In the above step X2, the encrypted digest file set is correspondingly sub-string searched by using a Burrows-Wheeler conversion algorithm and an FM indexing technique, and the specific process is as follows:
Step X21, for the keyword ω′ to be searched, generate a keyword query token tk_{T,S}:
tk_{T,S} = F(K, ω′[1…M]) = F(K, ω′[1]), F(K, ω′[2]), …, F(K, ω′[M]), F(K′, ω′[M]);
where ω′[1], ω′[2], …, ω′[M] are the characters of the keyword ω′ to be searched, and M is the total number of characters of the keyword ω′; K′ is a secondary key, K′ = F(K, 2), and K is the master key;
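The token of step X21 can be sketched as a list of per-character keyed hashes followed by one extra element for the last character under the secondary key. This is a minimal sketch assuming that F is HMAC-SHA256 and that the keyword is non-empty; query_token is a hypothetical name.

```python
import hashlib
import hmac

def F(key: bytes, msg: bytes) -> bytes:
    """Keyed hash; HMAC-SHA256 is an assumed instantiation of F."""
    return hmac.new(key, msg, hashlib.sha256).digest()

def query_token(master_key: bytes, keyword: str) -> list:
    """tk = F(K, w[1]), ..., F(K, w[M]), F(K', w[M]) with K' = F(K, 2)."""
    secondary_key = F(master_key, b"2")        # K' = F(K, 2)
    chars = [c.encode() for c in keyword]
    token = [F(master_key, c) for c in chars]  # one element per character
    token.append(F(secondary_key, chars[-1]))  # last character under K'
    return token

K = b"\x02" * 32
tk = query_token(K, "data")
```

Because F is deterministic, equal characters of the keyword produce equal token elements, which is what allows each element to be matched against the linked list index set.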
Step X22, for each character ω′[m] of the keyword ω′ to be searched, m = 1, 2, 3, …, M, first encrypt the character to obtain:
Figure GDA0002583190940000161
then search the linked list index set for the ciphertext
Figure GDA0002583190940000162
and obtain the linked list of each character ω′[m] through the mapping relation between the linked list index of the character and the linked list head;
Step X23, for the last character ω′[M] of the keyword ω′ to be searched, map each node in the linked list of the character ω′[M] to the encrypted FM tuple:
Figure GDA0002583190940000163
where
Figure GDA0002583190940000164
corresponds to the data in column F of FM;
where
Figure GDA0002583190940000165
corresponds to the data in column L of FM;
where E(pos_j) corresponds to the data in row j of column SA of FM, i.e. the ciphertext of the position pos_j, in the summary file, of the character corresponding to the data in row j of column SA, and n is the total number of rows of FM;
where
Figure GDA0002583190940000166
is the character corresponding to the data in row j of column F of FM, and
Figure GDA0002583190940000167
is the position number of the character
Figure GDA0002583190940000168
corresponding to the data in row j of column F of FM;
Figure GDA0002583190940000169
is the character corresponding to the data in row j of column L of FM, and
Figure GDA00025831909400001610
is the position number of the character
Figure GDA00025831909400001611
corresponding to the data in row j of column L of FM;
for each encrypted FM tuple to which a node in the linked list of ω′[M] maps:
first, use F_K(ω′[m]) to perform an XOR operation on the data in column F of FM, namely
Figure GDA0002583190940000171
thereby decrypting it to obtain
Figure GDA0002583190940000172
then use
Figure GDA0002583190940000173
as a key to decrypt the first-part element of the data in column L of FM,
Figure GDA0002583190940000174
to obtain
Figure GDA0002583190940000175
then XOR
Figure GDA0002583190940000176
with the second-part element of the data in column L of FM,
Figure GDA0002583190940000177
to obtain an XOR result, and then go to step X24;
Step X24, for each XOR result obtained in the previous step, search column F of FM for the row whose data equals the XOR result and obtain the FM tuple of that row; according to the mapping between the nodes of the linked lists and the FM tuples of the FM table shown in fig. 3, find the linked list that has a node mapped to that FM tuple, and take the character c_x corresponding to that linked list as the currently found character, where x is the number of times column F of FM has been searched so far; go to step X25;
Step X25, determine whether, among the characters c_x obtained in step X24, there is a character identical to the character ω′[M-x];
if yes, judging whether the number x of data searching in the F column of the FM is equal to M-1 or not; if yes, ending substring search, successfully searching substrings, and enabling the corresponding abstract files to comprise keywords omega' needing to be searched; if not, go to step X26;
if not, ending substring search, and returning a result of substring search failure, namely, the corresponding abstract file does not contain the keyword omega';
Step X26, for each character c_x obtained in step X24 that is identical to the character ω′[M-x], take each FM tuple obtained together with the character c_x in step X24 and perform the following operations on each FM tuple:
first, use F_K(c_x) to perform an XOR operation on the data in column F of FM, namely
Figure GDA0002583190940000178
thereby decrypting it to obtain
Figure GDA0002583190940000179
then use
Figure GDA00025831909400001710
as a key to decrypt the first-part element of the data in column L of FM,
Figure GDA00025831909400001711
to obtain
Figure GDA00025831909400001712
then XOR
Figure GDA00025831909400001713
with the second-part element of the data in column L of FM,
Figure GDA00025831909400001714
to obtain an XOR result; then return to step X24.
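The loop of steps X23 to X26 walks the keyword from its last character towards its first while following column L back into column F. For comparison, the standard unencrypted FM-index backward search has the same shape. The sketch below is that plaintext algorithm only: it omits the encrypted tuples, the XOR steps and the linked lists described above, and fm_contains is a hypothetical name.

```python
def fm_contains(text: str, pattern: str, end: str = "$") -> bool:
    """Plaintext FM-index backward search: is pattern a substring of text?"""
    t = text + end
    sa = sorted(range(len(t)), key=lambda i: t[i:])  # suffix array (column SA)
    last = [t[i - 1] for i in sa]                    # column L; t[-1] is the terminator
    first = sorted(t)                                # column F
    start = {}
    for idx, c in enumerate(first):
        start.setdefault(c, idx)                     # first row of F holding c

    def occ(c: str, k: int) -> int:
        """Number of occurrences of c in the first k rows of column L."""
        return last[:k].count(c)

    lo, hi = 0, len(t)                               # current row range [lo, hi)
    for c in reversed(pattern):                      # last character first
        if c not in start:
            return False
        lo = start[c] + occ(c, lo)
        hi = start[c] + occ(c, hi)
        if lo >= hi:                                 # no row matches any more
            return False
    return True

hit = fm_contains("banana", "nan")
miss = fm_contains("banana", "nab")
```

As in step X25, the search stops with a failure as soon as one character of the keyword cannot be matched, and succeeds once the first character of the keyword has been processed.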
The embodiment also discloses a keyword search system, which includes:
a data file acquisition module, configured to acquire a dictionary γ, an encrypted data file, and an encrypted digest file that are obtained by the data searchable encryption method according to the embodiment;
the keyword receiving module is used for receiving each keyword which is sent by a user and needs to be searched;
the first keyword searching module is used for determining, for each keyword to be searched, whether any encrypted data file includes the keyword by searching the dictionary γ;
the keyword second searching module is used for searching substrings in the encrypted summary file set under the condition that the data file set cannot be searched aiming at each keyword which needs to be searched;
the query result returning unit is used for returning the query results of the first keyword searching module and the second keyword searching module to the user;
the dictionary gamma updating unit is used for updating the dictionary gamma according to the query result of the second keyword searching module, and specifically comprises the following steps: in case that the user confirms that the query result of the second search module for the keyword is correct, it is determined that the corresponding data file as the query result includes the keyword, a tag of the data file corresponding to the keyword and index information of the corresponding data file are calculated and added to the dictionary γ.
The embodiment also discloses a computing device, which comprises a processor and a memory for storing the executable program of the processor, wherein when the processor executes the program stored in the memory, the keyword search method of the embodiment is realized.
In this embodiment, as shown in fig. 4, the computing device includes a client and a server. The client is a computer or other intelligent terminal and is user-facing: the user inputs the keyword to be searched and the master key through the client. After receiving the keyword and the master key, the client executes step X11 of the keyword search method of this embodiment to generate the pair of sub-keys of the keyword and sends the sub-keys to the server. After receiving the sub-keys, the server executes steps X12 and X13 of the keyword search method to determine, by searching the dictionary γ, whether an encrypted data file containing the keyword ω′ can be found. The code with which the server performs steps X12 and X13 is as follows:
for (i′ = 0; i′ != ⊥; i′++) {
    L_{i′} ← F(K′1, i′);
    d_{i′} ← Get(γ, L_{i′});   // once L_{i′} is calculated, the label L_{i′} is looked up in the dictionary γ to obtain the corresponding index information d_{i′}
    d_i ← Dec(K′2, d_{i′});    // decrypt d_{i′} to obtain the file number of a data file that includes the keyword ω′
    refresh(time);             // update the timestamp of the label-index pair of this label in the dictionary γ for the keyword ω′
}
When no data file including the keyword ω′ can be found by searching the dictionary γ, the server performs steps X21 to X26 of the keyword search method, that is, it searches the digest file set for a digest file including the keyword ω′; if the search succeeds, the server returns the encrypted data file found to the client and adds the label-index pair corresponding to the keyword ω′ to the dictionary γ, thereby updating the dictionary γ.
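The server-side loop of steps X12 and X13 can be sketched in Python as follows. The sketch assumes that F is HMAC-SHA256, that Enc and Dec are a toy XOR pad keyed by K′2 and the label (for illustration only), that the dictionary γ is a Python dict, and that a missing label plays the role of ⊥. The names subkeys, xor_pad and server_search are hypothetical.

```python
import hashlib
import hmac

def F(key: bytes, msg: bytes) -> bytes:
    """Keyed hash; HMAC-SHA256 is an assumed instantiation of F."""
    return hmac.new(key, msg, hashlib.sha256).digest()

def xor_pad(key: bytes, label: bytes, data: bytes) -> bytes:
    """Toy Enc/Dec: XOR with an HMAC-derived pad (illustration only)."""
    pad = F(key, label)[: len(data)]
    return bytes(a ^ b for a, b in zip(data, pad))

def subkeys(master_key: bytes, word: str):
    """Step X11 on the client: K'1 <- F(K, 1 || w), K'2 <- F(K, 2 || w)."""
    w = word.encode()
    return F(master_key, b"1" + w), F(master_key, b"2" + w)

def server_search(gamma: dict, k1: bytes, k2: bytes, tick: int) -> list:
    """Steps X12 and X13: walk i' = 0, 1, ... until a label is missing."""
    ids, i = [], 0
    while True:
        label = F(k1, str(i).encode())   # L_i' <- F(K'1, i')
        entry = gamma.get(label)         # d_i' <- Get(gamma, L_i')
        if entry is None:                # label not stored: stop
            break
        ids.append(int.from_bytes(xor_pad(k2, label, entry[0]), "big"))
        gamma[label] = (entry[0], tick)  # refresh the timestamp
        i += 1
    return ids

# Tiny demo dictionary holding the keyword "cloud" for files 3 and 7.
K = b"\x01" * 32
gamma = {}
k1, k2 = subkeys(K, "cloud")
for i, file_id in enumerate([3, 7]):
    label = F(k1, str(i).encode())
    gamma[label] = (xor_pad(k2, label, file_id.to_bytes(8, "big")), 0)

found = server_search(gamma, *subkeys(K, "cloud"), tick=1)
missing = server_search(gamma, *subkeys(K, "absent"), tick=2)
```

An empty result corresponds to the case in which the server falls back to the substring search of steps X21 to X26.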
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (6)

1. A data searchable encryption method is characterized by comprising the following steps:
acquiring a data file uploaded by a data owner;
extracting key words of each data file;
extracting the abstract of each data file to obtain an abstract file;
generating a dictionary gamma after data processing is carried out through an encryption algorithm according to the corresponding relation between each keyword and each data file, wherein labels corresponding to each data file by each keyword and index information corresponding to each data file by each keyword are stored in the dictionary gamma, and the labels corresponding to each data file by the keyword and the index information corresponding to each data file by the keyword are in one-to-one pairing relation in the dictionary gamma for each keyword;
encrypting each data file to obtain an encrypted data file;
encrypting each abstract file to obtain an encrypted abstract file;
according to the corresponding relation between each keyword and each data file, the specific process of generating the dictionary gamma after data processing is carried out through an encryption algorithm is as follows:
s11, establishing an empty table L, and selecting a master key K for the table L;
s12, for each keyword, obtaining each data file including the keyword, and generating a pair of sub-keys K1, K2 for the keyword from the master key K:
K1←F(K,1||ω);
K2←F(K,2||ω);
Wherein omega is a keyword;
for each keyword, numbering each data file comprising the keyword to obtain a file number corresponding to each data file, and sequencing each file number to obtain a sequence number of each file number;
for each keyword, using the key K1 to generate a label for each file number of the keyword in sequence, and using the key K2 to encrypt each file number of the keyword in sequence, the encrypted result being taken as the index information of the corresponding data file, to obtain the label-index pairs (L_i, d_i):
L_i ← F(K1, i);
d_i ← Enc(K2, id_i);
i = 0, 1, …, N-1;
wherein L_i is the label generated with the key K1 for the i-th file number id_i of the keyword ω; d_i is the result of encrypting the i-th file number id_i of the keyword ω with the key K2, and serves as the index information of the data file whose file number is id_i for that keyword; N is the total number of data files including the keyword ω;
s13, inserting the label-index pairs (L_i, d_i) obtained for each keyword into the table L in dictionary order; adding a timestamp time_i to each label-index pair (L_i, d_i) to obtain entries (L_i, d_i, time_i), and creating the dictionary γ from the table L; wherein the initial value of time_i is the time at which encryption of the index information d_i of the label-index pair (L_i, d_i) was completed;
the process of acquiring each summary file is as follows:
firstly, extracting a summary file from a data file through a document summary extraction algorithm; then, taking the file number corresponding to the abstract as an index, storing the abstract at a corresponding position, and performing character filling on the rest positions to form an abstract file;
substring search encryption is carried out on each summary file by adopting a Burrows-Wheeler conversion algorithm and an FM indexing technology, and the specific process is as follows:
respectively creating a linked list aiming at each different character in the abstract file; for each character linked list, each node storage tuple is < nptr, addr >, nptr is a pointer pointing to the next node of the character linked list, addr is the position of the character at a certain position in the summary file in the FM index, and addr in different node storage tuples in the character linked list are the positions of the characters at different positions in the summary file in the FM index respectively;
aiming at each different character in the abstract file, the first node of each character linked list, namely the storage tuple of the linked list head, is encrypted to obtain:
Figure FDA0002583190930000021
wherein <nptr1, addr1> is the tuple stored in the first node of each character's linked list, i.e. the linked list head; c_m is the m-th of the distinct characters of the abstract file, and Y is the total number of distinct characters in the abstract file; K′ is a secondary key, and F_K′(c_m) denotes encrypting the character c_m with the secondary key K′;
for each distinct character in the abstract file, first performing encryption, taking the encrypted data of each distinct character as a linked list index to obtain a linked list index set, and mapping the linked list index of each distinct character to the head of that character's linked list to obtain the mapping relation between the linked list indexes and the linked list heads of the distinct characters; the result of encrypting each distinct character in the summary file being:
Figure FDA0002583190930000031
wherein K is the master key, F_K(c_m) denotes encrypting the character c_m with the master key K, and F_K′(c_m) denotes encrypting the character c_m with the secondary key K′.
2. A data searchable encryption system, comprising:
the data file acquisition unit is used for acquiring a data file uploaded by a data owner;
a keyword extraction unit for extracting keywords of each data file;
the abstract extraction unit is used for extracting an abstract of each data file to obtain an abstract file;
the dictionary generating unit is used for generating a dictionary gamma after data processing is carried out through an encryption algorithm according to the corresponding relation between each keyword and each data file, wherein the dictionary gamma stores the label corresponding to each data file by each keyword and the index information corresponding to each data file by each keyword, and the label corresponding to each data file by each keyword and the index information corresponding to each data file by each keyword are in one-to-one pairing relation aiming at each keyword;
the data file encryption unit is used for encrypting each data file to obtain an encrypted data file;
the digest file encryption unit is used for searching and encrypting substrings aiming at the digest files to obtain encrypted digest files;
according to the corresponding relation between each keyword and each data file, the specific process of generating the dictionary gamma after data processing is carried out through an encryption algorithm is as follows:
s11, establishing an empty table L, and selecting a master key K for the table L;
s12, for each keyword, obtaining each data file including the keyword, and generating a pair of sub-keys K1, K2 for the keyword from the master key K:
K1←F(K,1||ω);
K2←F(K,2||ω);
Wherein omega is a keyword;
for each keyword, numbering each data file comprising the keyword to obtain a file number corresponding to each data file, and sequencing each file number to obtain a sequence number of each file number;
for each keyword, using the key K1 to generate a label for each file number of the keyword in sequence, and using the key K2 to encrypt each file number of the keyword in sequence, the encrypted result being taken as the index information of the corresponding data file, to obtain the label-index pairs (L_i, d_i):
L_i ← F(K1, i);
d_i ← Enc(K2, id_i);
i = 0, 1, …, N-1;
wherein L_i is the label generated with the key K1 for the i-th file number id_i of the keyword ω; d_i is the result of encrypting the i-th file number id_i of the keyword ω with the key K2, and serves as the index information of the data file whose file number is id_i for that keyword; N is the total number of data files including the keyword ω;
s13, inserting the label-index pairs (L_i, d_i) obtained for each keyword into the table L in dictionary order; adding a timestamp time_i to each label-index pair (L_i, d_i) to obtain entries (L_i, d_i, time_i), and creating the dictionary γ from the table L; wherein the initial value of time_i is the time at which encryption of the index information d_i of the label-index pair (L_i, d_i) was completed;
the process of acquiring each summary file is as follows:
firstly, extracting a summary file from a data file through a document summary extraction algorithm; then, taking the file number corresponding to the abstract as an index, storing the abstract at a corresponding position, and performing character filling on the rest positions to form an abstract file;
substring search encryption is carried out on each summary file by adopting a Burrows-Wheeler conversion algorithm and an FM indexing technology, and the specific process is as follows:
respectively creating a linked list aiming at each different character in the abstract file; for each character linked list, each node storage tuple is < nptr, addr >, nptr is a pointer pointing to the next node of the character linked list, addr is the position of the character at a certain position in the summary file in the FM index, and addr in different node storage tuples in the character linked list are the positions of the characters at different positions in the summary file in the FM index respectively;
aiming at each different character in the abstract file, the first node of each character linked list, namely the storage tuple of the linked list head, is encrypted to obtain:
Figure FDA0002583190930000051
wherein<nptr1,addr1>For the first node of each character chain table, i.e. the memory tuple of the head of the chain table, cmThe number of the m characters in different characters of the abstract file is Y, and the Y is the total number of the different characters in the abstract file; k' is a secondary key, FK′(cm) Indicating that the character c is pointed to by the secondary key KmCarrying out encryption;
aiming at different characters in the abstract file, firstly, encryption processing is carried out, data after encryption processing of the different characters are used as linked list indexes to obtain a linked list index set, and the linked list indexes corresponding to the different characters are respectively mapped to the linked list heads of the linked lists of the different characters to obtain the mapping relation between the linked list indexes and the linked list heads of the different characters; after encryption processing of different characters in the summary file, the method comprises the following steps:
Figure FDA0002583190930000052
wherein K is the master key, F_K(c_m) denotes encrypting the character c_m with the master key K, and F_K′(c_m) denotes encrypting the character c_m with the secondary key K′.
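The per-character linked lists over the FM index described in claim 2 can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the claimed implementation: HMAC-SHA256 stands in for the encryption function F, a Python list stands in for each linked list, the sentinel '$' terminates the summary text, and the encryption of the list head (given in the claim only as a figure) is omitted. The names prf, build_fm_rows and build_char_lists are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict


def prf(key: bytes, msg: bytes) -> bytes:
    """Stand-in for F (assumption): HMAC-SHA256."""
    return hmac.new(key, msg, hashlib.sha256).digest()


def build_fm_rows(text: str):
    """Return the F column, L column and suffix array of text + '$'."""
    s = text + "$"
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    f_col = [s[i] for i in sa]        # first character of each sorted suffix
    l_col = [s[i - 1] for i in sa]    # preceding character (wraps to '$')
    return f_col, l_col, sa


def build_char_lists(text: str, master_key: bytes):
    """Map F_K(c) to the FM-index rows (addr values) of character c."""
    f_col, _, _ = build_fm_rows(text)
    rows = defaultdict(list)
    for addr, c in enumerate(f_col):
        rows[c].append(addr)          # one node per occurrence of c
    # The encrypted character acts as the linked-list index.
    return {prf(master_key, c.encode()): addrs for c, addrs in rows.items()}
```

In this sketch the server can locate the list for a character only when it is given F_K(c), which mirrors the role of the linked-list index set in the claim.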
3. A terminal comprising a processor and a memory for storing processor-executable programs, the terminal characterized by: the processor, when executing a program stored in the memory, implements the data searchable encryption method of claim 1.
4. A keyword search method is characterized by comprising the following steps:
step X1, first acquiring the dictionary γ, the encrypted data files and the encrypted summary files obtained by the data searchable encryption method according to claim 1;
on receiving each keyword that a user wishes to search for, first determining, by searching the dictionary γ, whether any encrypted data file contains the keyword; if so, returning the corresponding encrypted data file to the user as a query result for decryption;
if not, going to step X2;
step X2, performing a substring search for each keyword to be searched in the set of encrypted summary files;
if the substring search finds the keyword in a summary file, the encrypted data file corresponding to that summary file is returned to the user as a query result; once the user confirms that the returned data file is correct, i.e. that it contains the keyword, the label of the data file for the keyword and the index information of the data file are computed and added to the dictionary γ, thereby updating the dictionary γ;
if the substring search does not find the keyword, a search-failure result is returned to the user;
in step X1, for each keyword to be searched, the specific process of determining, by searching the dictionary γ, whether any encrypted data file contains the keyword is as follows:
step X11, for each keyword that the user wishes to search for, generating a pair of sub-keys K′_1, K′_2 for the keyword from the master key K sent by the user:
K′_1 ← F(K, 1||ω′);
K′_2 ← F(K, 2||ω′);
wherein ω′ is the keyword that the user wishes to search for;
step X12, for each keyword to be searched, traversing the file numbers of the data files and using the sub-key K′_1 to generate, for each file number, the label of the keyword for the data file with that file number:
L_i′ ← F(K′_1, i′); i′ = 0, 1, 2, …, I;
wherein i′ is the file number of the data file being traversed, and I is the maximum file number; L_i′ is the label of the keyword ω′ for the data file with file number i′;
step X13, for each keyword to be searched, searching the dictionary γ for the labels L_i′ generated in step X12;
if no such label is found, going to step X2;
if such a label is found, the index information paired with the label is obtained from the dictionary γ and decrypted with the sub-key K′_2 of the keyword; the corresponding encrypted data file is then retrieved using the decrypted index information and returned to the user as a query result for decryption; at the same time, the timestamp of the label-index pair stored in the dictionary γ is updated to the time at which decryption of the index information with the sub-key K′_2 was completed;
the index information paired with the label is obtained from the dictionary γ as:
d_i′ ← Get(γ, L_i′);
wherein d_i′ is the index information paired with the label L_i′ in the dictionary γ;
the index information decrypted with the sub-key K′_2 of the keyword is:
d_i ← Dec(K′_2, d_i′);
wherein d_i is the index information obtained by decrypting d_i′ with the sub-key K′_2 of the keyword ω′, and the decrypted index information d_i is the file number of a data file containing the keyword ω′;
in step X2,
the substring search over the set of encrypted summary files uses the Burrows-Wheeler transform and the FM-index technique, specifically as follows:
step X21, for the keyword ω′ to be searched, generating a keyword query token tk_T,S:
tk_T,S = F(K, ω′[1…M]) = F(K, ω′[1]), F(K, ω′[2]), …, F(K, ω′[M]), F(K′, ω′[M]);
wherein ω′[1], ω′[2], …, ω′[M] are the characters of the keyword ω′ to be searched, and M is the total number of characters of the keyword ω′; K′ is the secondary key, K′ = F(K, 2), and K is the master key;
step X22, for each character ω′[m] of the keyword ω′ to be searched, m = 1, 2, 3, …, M, first encrypting the character to obtain:
Figure FDA0002583190930000071
then searching the linked-list index set for the ciphertext
Figure FDA0002583190930000072
and obtaining the linked list of each character ω′[m] through the mapping relation between the linked-list index and the list head of each character ω′[m];
step X23, for the last character ω′[M] of the keyword ω′ to be searched, mapping each node in the linked list of the character ω′[M] to an encrypted FM tuple:
Figure FDA0002583190930000073
wherein
Figure FDA0002583190930000074
is the data in column F of the FM index;
wherein
Figure FDA0002583190930000075
is the data in column L of the FM index;
wherein E(pos_j) is the data in row j of the SA column of the FM index, namely the ciphertext of pos_j, the position in the summary file of the character corresponding to that row, and n is the total number of rows of the FM index;
wherein
Figure FDA0002583190930000076
is the character corresponding to the data in row j of column F of the FM index, and
Figure FDA0002583190930000077
is the position number of the character
Figure FDA0002583190930000081
corresponding to the data in row j of column F of the FM index;
Figure FDA0002583190930000082
is the character corresponding to the data in row j of column L of the FM index, and
Figure FDA0002583190930000083
is the position number of the character
Figure FDA0002583190930000084
corresponding to the data in row j of column L of the FM index;
for each encrypted FM tuple to which a node in the linked list of ω′[M] maps:
first, F_K(ω′[M]) is XORed with the data in column F of the FM index, namely
Figure FDA0002583190930000085
to decrypt it and obtain
Figure FDA0002583190930000086
then
Figure FDA0002583190930000087
is used as a key to decrypt the first-part element of the data in column L of the FM index,
Figure FDA0002583190930000088
to obtain
Figure FDA0002583190930000089
and
Figure FDA00025831909300000810
is XORed with the second-part element of the data in column L of the FM index,
Figure FDA00025831909300000811
to obtain an XOR result, after which step X24 is entered;
step X24, for each XOR result obtained in the preceding step, searching column F of the FM index for the row whose data equals the XOR result, obtaining the FM tuple of that row, and finding the linked list that has a node mapped to that FM tuple, thereby obtaining the character c_x corresponding to that linked list as the currently found character; wherein x is the number of times column F of the FM index has been searched so far; then going to step X25;
step X25, determining whether the characters c_x obtained in step X24 include a character identical to the character ω′[M−x];
if so, determining whether the number x of searches of column F of the FM index equals M−1; if it does, the substring search ends successfully and the corresponding summary file contains the keyword ω′ to be searched; if it does not, going to step X26;
if not, the substring search ends and a substring-search-failure result is returned, i.e. the corresponding summary file does not contain the keyword ω′;
step X26, for each character c_x obtained in step X24 that is identical to the character ω′[M−x], obtaining each FM tuple that was found in step X24 when that character c_x was obtained, and for each such FM tuple:
first, F_K(c_x) is XORed with the data in column F of the FM index, namely
Figure FDA00025831909300000812
to decrypt it and obtain
Figure FDA00025831909300000813
then
Figure FDA00025831909300000814
is used as a key to decrypt the first-part element of the data in column L of the FM index,
Figure FDA00025831909300000815
to obtain
Figure FDA00025831909300000816
and
Figure FDA00025831909300000817
is XORed with the second-part element of the data in column L of the FM index,
Figure FDA00025831909300000818
to obtain an XOR result; then returning to step X24.
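The character-by-character walk of steps X23 to X26 follows the shape of a backward search over an FM index. The sketch below shows that walk on an unencrypted index only: the XOR decryption of the F and L columns is left out because the corresponding formulas appear in the claim as figures. The '$' sentinel and the names build_fm, lf_map and substring_positions are assumptions made for illustration.

```python
def build_fm(text: str):
    """Return the F column, L column and suffix array of text + '$'."""
    s = text + "$"
    sa = sorted(range(len(s)), key=lambda i: s[i:])
    return [s[i] for i in sa], [s[i - 1] for i in sa], sa


def lf_map(f_col, l_col):
    """lf[j] is the F-column row holding the same character as L[j]."""
    first = {}
    for row, c in enumerate(f_col):
        first.setdefault(c, row)       # first row of each character in F
    seen, lf = {}, []
    for c in l_col:
        lf.append(first[c] + seen.get(c, 0))
        seen[c] = seen.get(c, 0) + 1
    return lf


def substring_positions(text: str, pattern: str):
    """Start from rows of the last pattern character and walk backwards."""
    if not pattern:
        return []
    f_col, l_col, sa = build_fm(text)
    lf = lf_map(f_col, l_col)
    m = len(pattern)
    hits = []
    for row, c in enumerate(f_col):
        if c != pattern[-1]:           # analogue of the list of w'[M]
            continue
        cur, ok = row, True
        for x in range(1, m):          # x = number of F-column lookups
            cur = lf[cur]              # row of the preceding character
            if f_col[cur] != pattern[m - 1 - x]:
                ok = False             # mismatch with w'[M-x]
                break
        if ok:                         # x reached M-1 without mismatch
            hits.append(sa[cur])
    return sorted(hits)
```

For example, substring_positions("banana", "ana") follows two candidate rows to a match and abandons the third as soon as the preceding character differs, which mirrors the success and failure branches of step X25.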
5. The keyword search method according to claim 4, wherein the dictionary γ is set to a fixed-length dictionary, and in the step X2, the updating process of the dictionary γ is implemented as follows:
when the label of a data file for a new keyword and the index information of that data file need to be added to the dictionary γ, that is, when a new label-index pair needs to be added to the dictionary γ, if the dictionary γ is already full of label-index pairs, the new label-index pair replaces the label-index pair with the smallest timestamp in the dictionary γ; when several new label-index pairs are to be stored, they replace the same number of label-index pairs having the smallest timestamps in the dictionary γ.
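The replacement rule of claim 5 can be sketched as a fixed-capacity map that evicts the entry with the smallest timestamp. This is a minimal illustration under assumptions: a monotonically increasing counter stands in for the timestamp, the capacity is at least 1, and the class name FixedDictionary and its methods are hypothetical.

```python
import itertools


class FixedDictionary:
    """Fixed-length label -> index-info store with oldest-timestamp eviction."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.entries = {}                 # label -> (index_info, timestamp)
        self._clock = itertools.count()   # logical clock (assumption)

    def get(self, label):
        """Return the index info and refresh its timestamp, as in step X13."""
        item = self.entries.get(label)
        if item is None:
            return None
        self.entries[label] = (item[0], next(self._clock))
        return item[0]

    def put(self, label, index_info):
        """Add a pair; when full, replace the pair with the smallest timestamp."""
        if label not in self.entries and len(self.entries) >= self.capacity:
            oldest = min(self.entries, key=lambda k: self.entries[k][1])
            del self.entries[oldest]
        self.entries[label] = (index_info, next(self._clock))
```

Because a successful lookup refreshes the timestamp, frequently searched keywords tend to stay in the dictionary while rarely searched ones are replaced first.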
6. A computing device comprising a processor and a memory for storing processor-executable programs, wherein: the processor implements the keyword search method according to any one of claims 4 to 5 when executing a program stored in the memory.
CN201811170800.2A 2018-10-09 2018-10-09 Data searchable encryption and keyword search method, system, terminal and equipment Active CN109492410B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811170800.2A CN109492410B (en) 2018-10-09 2018-10-09 Data searchable encryption and keyword search method, system, terminal and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811170800.2A CN109492410B (en) 2018-10-09 2018-10-09 Data searchable encryption and keyword search method, system, terminal and equipment

Publications (2)

Publication Number Publication Date
CN109492410A CN109492410A (en) 2019-03-19
CN109492410B true CN109492410B (en) 2020-09-01

Family

ID=65690127

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811170800.2A Active CN109492410B (en) 2018-10-09 2018-10-09 Data searchable encryption and keyword search method, system, terminal and equipment

Country Status (1)

Country Link
CN (1) CN109492410B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3894275A4 (en) 2018-12-11 2022-10-19 Ess-Help, Inc. Enhanced operation of vehicle hazard and lighting communication systems
CN112948386B (en) * 2021-03-04 2023-09-22 电信科学技术第五研究所有限公司 Simple indexing and encrypting disk-dropping mechanism for ETL abnormal data
CN113076319B (en) * 2021-04-13 2022-05-06 河北大学 Dynamic database filling method based on outlier detection technology and bitmap index
CN115688149B (en) * 2023-01-03 2023-05-16 大熊集团有限公司 Encrypted data access method and system

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103593476A (en) * 2013-11-28 2014-02-19 中国科学院信息工程研究所 Multi-keyword plaintext and ciphertext retrieving method and device oriented to cloud storage
CN104038349A (en) * 2014-07-03 2014-09-10 西安电子科技大学 Effective and verifiable public key searching encryption method based on KP-ABE
CN104394155A (en) * 2014-11-27 2015-03-04 暨南大学 Multi-user cloud encryption keyboard searching method capable of verifying integrity and completeness
CN104765848A (en) * 2015-04-17 2015-07-08 中国人民解放军空军航空大学 Symmetrical searchable encryption method for supporting result high-efficiency sequencing in hybrid cloud storage
CN104780161A (en) * 2015-03-23 2015-07-15 南京邮电大学 Searchable encryption method supporting multiple users in cloud storage
CN104899517A (en) * 2015-05-15 2015-09-09 陕西师范大学 Phrase-based searchable symmetric encryption method
WO2016108468A1 (en) * 2014-12-29 2016-07-07 Samsung Electronics Co., Ltd. User terminal, service providing apparatus, driving method of user terminal, driving method of service providing apparatus, and encryption indexing-based search system
CN105825427A (en) * 2016-03-23 2016-08-03 华南农业大学 Encrypted keyword search-based bidirectional anonymity trusted network debit and credit system and method
CN106874516A (en) * 2017-03-15 2017-06-20 电子科技大学 Efficient cipher text retrieval method based on KCB trees and Bloom filter in a kind of cloud storage
CN106997384A (en) * 2017-03-24 2017-08-01 福州大学 A kind of semantic ambiguity that can verify that sorts can search for encryption method
CN107491497A (en) * 2017-07-25 2017-12-19 福州大学 Multi-user's multi-key word sequence of any language inquiry is supported to can search for encryption system
CN107622212A (en) * 2017-10-13 2018-01-23 上海海事大学 A kind of mixing cipher text retrieval method based on double trapdoors
CN108055122A (en) * 2017-11-17 2018-05-18 西安电子科技大学 The anti-RAM leakage dynamic that can verify that can search for encryption method, Cloud Server
CN108388807A (en) * 2018-02-28 2018-08-10 华南理工大学 It is a kind of that the multiple key sequence that efficiently can verify that of preference search and Boolean Search is supported to can search for encryption method
CN108416037A (en) * 2018-03-14 2018-08-17 安徽大学 Centric keyword cipher text searching method based on two-stage index in cloud environment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Storage Efficient Substring Searchable Symmetric Encryption";Iraklis Leontiadis等;《Proceedings of the 6th International Workshop on Security in Cloud Computing》;20180531;第3-13页 *
"一种灵活的精度可控的可搜索对称加密方案";李西明等;《计算机研究与发展》;20200101;第3-16页 *

Also Published As

Publication number Publication date
CN109492410A (en) 2019-03-19

Similar Documents

Publication Publication Date Title
CN106815350B (en) Dynamic ciphertext multi-keyword fuzzy search method in cloud environment
CN109492410B (en) Data searchable encryption and keyword search method, system, terminal and equipment
US11537626B2 (en) Full-text fuzzy search method for similar-form Chinese characters in ciphertext domain
CN108712366B (en) Searchable encryption method and system supporting word form and word meaning fuzzy retrieval in cloud environment
CN111199053B (en) System and method for multi-character wildcard search of encrypted data
CN108959567B (en) Safe retrieval method suitable for large-scale images in cloud environment
CN105681280A (en) Searchable encryption method based on Chinese in cloud environment
CN111026788B (en) Homomorphic encryption-based multi-keyword ciphertext ordering and retrieving method in hybrid cloud
CN109992995B (en) Searchable encryption method supporting location protection and privacy inquiry
CN112800088A (en) Database ciphertext retrieval system and method based on bidirectional security index
CN106407447A (en) Simhash-based fuzzy sequencing searching method for encrypted cloud data
CN111801665A (en) Hierarchical Locality Sensitive Hash (LSH) partition indexing for big data applications
Rane et al. Multi-user multi-keyword privacy preserving ranked based search over encrypted cloud data
CN110908959A (en) Dynamic searchable encryption method supporting multi-keyword and result sorting
CN103607420A (en) Safe electronic medical system for cloud storage
CN112328606A (en) Keyword searchable encryption method based on block chain
CN114884650A (en) Searchable encryption method based on safe inverted index
CN114531220A (en) Efficient fault-tolerant dynamic phrase searching method based on forward privacy and backward privacy
CN115438230A (en) Safe and efficient dynamic encrypted cloud data multidimensional range query method
CN113626645B (en) Hierarchical optimization efficient ciphertext fuzzy retrieval method and related equipment
CN113076562A (en) Database encryption field fuzzy retrieval method based on GCM encryption mode
US11461551B1 (en) Secure word search
CN108710698B (en) Multi-keyword fuzzy query method based on ciphertext under cloud environment
US20220277098A1 (en) Method and system for securely storing and programmatically searching data
CN114528370A (en) Dynamic multi-keyword fuzzy ordering searching method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230713

Address after: 511400 2107, No. 68, Hanxingzhi Street, Zhongcun Street, Panyu District, Guangzhou, Guangdong

Patentee after: Guangzhou Nengxi Information Technology Co.,Ltd.

Address before: 510642 No. five, 483 mountain road, Guangzhou, Guangdong, Tianhe District

Patentee before: SOUTH CHINA AGRICULTURAL University

TR01 Transfer of patent right

Effective date of registration: 20230728

Address after: 511400 2107, No. 68, Hanxingzhi Street, Zhongcun Street, Panyu District, Guangzhou, Guangdong

Patentee after: Guangzhou Nengxi Information Technology Co.,Ltd.

Address before: 510642 No. five, 483 mountain road, Guangzhou, Guangdong, Tianhe District

Patentee before: SOUTH CHINA AGRICULTURAL University
