GB2562767A - Right to erasure compliant back-up - Google Patents

Right to erasure compliant back-up

Info

Publication number
GB2562767A
Authority
GB
United Kingdom
Prior art keywords
data
encryption key
individual
subsystem
repository
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB1708336.1A
Other versions
GB201708336D0 (en
Inventor
William James Parton
David Edward Nunn
John William Hern
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Trust Hub Ltd
Original Assignee
Trust Hub Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Trust Hub Ltd
Priority to GB1708336.1A
Publication of GB201708336D0
Publication of GB2562767A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/602 Providing cryptographic facilities or services
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6245 Protecting personal data, e.g. for financial or medical purposes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Medical Informatics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Storage Device Security (AREA)

Abstract

A method and associated system for providing right to erasure compliant back-up, the method comprising: storing, by a back-up subsystem 4, data which relates to an individual on at least one storage module 2 coupled to the back-up subsystem; encrypting, by the back-up subsystem, the data using a unique encryption key; storing, by the back-up subsystem, the encrypted data on one or more back-up storage media 7; and storing the encryption key in the master table 5 of an encryption key repository as at least part of an entry pertaining to the individual, wherein the encryption key repository is accessible only by the back-up subsystem 4. Data relating to a user may be quickly deleted by erasing the corresponding encryption key from the key database 5, thereby rendering the archived data inaccessible. The back-up media, which may be hard disk drives, may be kept offline. Further redundant tables and data may be backed up periodically.

Description

(71) Applicant(s):
Trust-Hub Limited
Ballards Lane, London, N3 1XW, United Kingdom
(72) Inventor(s):
William James Parton
David Edward Nunn
John William Hern
(56) Documents Cited:
JP 2014170412 A
US 20140372393 A1
US 20070022290 A1
US 20060143443 A1
US 6134660 A
US 20110055559 A1
US 20060210085 A1
(58) Field of Search:
INT CL G06F
Other: EPODOC & WPI
(74) Agent and/or Address for Service:
Beck Greener
Fulwood House, 12 Fulwood Place, LONDON, WC1V 6HR, United Kingdom
(54) Title of the Invention: Right to erasure compliant back-up
Abstract Title: Right to erasure compliant back-up by management of encryption keys
[Drawings: sheets 1/3 to 3/3, FIG 1 to FIG 6. Recoverable labels: "Unencrypted data for individual" (sheet 1/3); key repository table rows (Ent, N/A, N/A, Kent), (A, Na, Ta, Ka), (B, Nb, Tb, Kb), (C, Nc, Tc, Kc) (sheets 2/3 and 3/3).]
Right to Erasure Compliant Back-Up
The present invention relates to a method for providing right to erasure compliant back-up and to a right to erasure compliant back-up system.
EU and UK law each contain provisions giving an individual the right to request that a company in possession of their personal data deletes all copies of this data both from live databases and any offline back-up media. This is known generally as the “right to erasure” or “right to be forgotten”. The DPA (Data Protection Act) in the UK requires companies to delete data on request if substantial distress is being caused by processing of the data. The GDPR (General Data Protection Regulation; Regulation (EU) 2016/679) was adopted in 2016 and will apply in member states, after a transition period, from 2018. This regulation, which will therefore apply Europe-wide, does not include the threshold of substantial distress. This means that any person will be able to request that a company deletes all copies of their personal data at any time by submitting a right to erasure request and companies must be in a position to be able to comply.
Given the importance of providing reliable storage for an individual’s data, including providing redundancy in offline back-up storage, complying with such a request is likely to be difficult. This is particularly true given that copies of the data may be stored offline at different, potentially geographically remote, locations making them very difficult to access and therefore also to delete. Offline back-ups provide advantages over online back-ups in terms of resistance to tampering, however it has been suggested that after the introduction of the GDPR offline back-ups might necessarily become less widely used. This will become an increasing problem over the coming years, and will result in a growing need for dependable back-up storage where a particular individual’s data can nevertheless be quickly and easily deleted in response to a right to erasure request.
US-A-2014/0164247 describes a data storage system comprising several online storage media, each storing different encrypted data. All of the storage media are coupled to an “encryption switch” which, together with a key management system and a key repository, forms an Ethernet key management local area network. A key identifier is used to determine which key to use for access, and this is stored along with the data on the storage media. In some cases a portion of the data is overwritten with the key. US-A-2006/0136732 discloses a data storage system wherein encryption keys used to encrypt and store the data are allocated based on the time period within which the data is stored. US-A-2017/0099136 describes a system wherein decryption information is stored on behalf of a user in a remote server. Authentication is required in order to retrieve this information.
According to a first aspect of the present invention, there is provided a method for providing right to erasure compliant back-up, the method comprising: storing, by a back-up subsystem, data which relates to an individual on at least one storage module coupled to the back-up subsystem; encrypting, by the back-up subsystem, the data using a unique encryption key; storing, by the back-up subsystem, the encrypted data on one or more back-up storage media; and storing, by the back-up subsystem, the encryption key in the master table of an encryption key repository as at least part of an entry pertaining to the individual, wherein the encryption key repository is accessible only by the back-up subsystem.
By using encryption to produce back-up copies of an individual’s data in combination with a secure encryption key repository to which access is limited, a company can be sure that it is complying with the relevant right to erasure provisions simply by deleting the encryption key from the key repository.
In an embodiment, the method comprises receiving a right to erasure request from the individual and, in response to the request, deleting the encryption key from the key repository and the data from the storage module. All data relating to a particular individual may be stored in back-up using the same unique key or keys which means that the data is made unavailable as soon as the individual’s entry in the key repository is deleted. If one key is allocated to each individual, this can be achieved by the deletion of a single table entry in the encryption key repository. Erasure of an individual’s data will generally be requested by the individual themselves, however the request need not be received directly from the individual and can be received from another person, who may or may not be acting on their behalf.
In an embodiment, the key repository is accessible only via a secure application programming interface. This ensures that access to the key is limited.
In an embodiment, deletion of the encryption key from the encryption key repository occurs after a pre-determined time period has elapsed since deletion of the data from the storage module. This allows data to be recovered from the back-up media for a short time after an individual’s data is deleted from the storage module. The time period may be 1 minute, 1 hour, or 1 day, for example.
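By way of illustration only, the delayed deletion of the key may be sketched as follows. The class ErasureScheduler, its method names, and the use of plain dictionaries for the storage module and the key repository are assumptions made for this sketch and do not appear in the specification.

```python
from dataclasses import dataclass, field


@dataclass
class ErasureScheduler:
    """Hypothetical helper: deletes live data at once, the key after a delay."""
    grace_seconds: float
    pending: dict = field(default_factory=dict)  # user_id -> earliest key deletion time

    def request_erasure(self, user_id, storage_module, now):
        # Data is removed from the storage module immediately.
        storage_module.pop(user_id, None)
        # The key stays in the repository until the grace period has elapsed.
        self.pending[user_id] = now + self.grace_seconds

    def purge_due_keys(self, key_repository, now):
        # Delete every key whose grace period has elapsed; return the user IDs.
        due = [uid for uid, when in self.pending.items() if when <= now]
        for uid in due:
            key_repository.pop(uid, None)
            del self.pending[uid]
        return due


storage_module = {"A": "live data for A", "B": "live data for B"}
key_repository = {"A": b"Ka", "B": b"Kb"}
scheduler = ErasureScheduler(grace_seconds=3600)  # 1 hour, one of the example periods
scheduler.request_erasure("A", storage_module, now=0)
```

Until purge_due_keys is called with a time at or after the deadline, the key for individual A remains in the repository, so the back-up copies can still be decrypted during the grace period.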
In an embodiment, the encryption key repository comprises a secondary table in which a second copy of the entry is stored. The encryption key repository may also include an additional secondary table in which a third copy of the entry is stored. The master table and both secondary tables may be connected to the back-up system permanently, or for most of the time, by wired or wireless connection. Any number of additional secondary tables with copies of the data may be included in the key repository to provide the desired degree of redundancy. This redundancy ensures that the encryption key information for an individual is not lost due to accident or malicious action, which is important given that back-up copies of the data are inaccessible without the key, and ensuring that an individual’s data is not lost can be crucial to a company.
In an embodiment, the method comprises periodically overwriting the secondary table with the data in the master table. In a case where more than one secondary table is included in the key repository, data in the secondary tables may be overwritten at the same time or the overwriting may be staggered. Updating the secondary tables at different times means that there will always be a certain time period within which data can be recovered, even if all other tables are overwritten with the wrong data or with corrupted data.
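A minimal sketch of staggered overwriting is given below for illustration. The functions due_secondaries and run_tick, the integer tick counter, and the even spacing of the offsets are assumptions of the sketch; the specification does not prescribe any particular schedule.

```python
import copy


def due_secondaries(tick, period, count):
    """Indices of the secondary tables due for overwriting at this tick."""
    # Offsets are spread evenly over the period so the tables refresh at different times.
    offsets = [i * period // count for i in range(count)]
    return [i for i, offset in enumerate(offsets) if tick % period == offset]


def run_tick(master, secondaries, tick, period):
    """Overwrite each due secondary table with a copy of the master table."""
    for i in due_secondaries(tick, period, len(secondaries)):
        secondaries[i] = copy.deepcopy(master)


master = {"A": "Ka", "B": "Kb"}
secondaries = [{}, {}]
run_tick(master, secondaries, tick=0, period=4)  # secondary 0 is refreshed
master["A"] = "corrupted"                        # a fault then affects the master table
run_tick(master, secondaries, tick=2, period=4)  # secondary 1 is refreshed
```

In this example the first secondary table still holds the uncorrupted entry for individual A after the second has copied the fault, which is the recovery window the staggering is intended to provide.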
In an embodiment, the key repository comprises at least one back-up table which is periodically overwritten with the data in the master table, and the time interval between instances of overwriting is longer than the time interval between instances of overwriting data in the secondary table. The secondary tables are kept online and are updated regularly. In this way, a redundant copy of the data in the master table is always available should data in the master table be lost. Back-up tables can of course be updated as regularly as desired; however, back-up tables will generally be kept offline and thus updated less regularly because connection to the system is required each time an update occurs.
In an embodiment, the at least one back-up table is offline except during instances of overwriting. Offline refers to the fact that back-up tables are generally not in communication with or coupled to the rest of the system, aside from when it is desired to copy data from the master table to the back-up tables. A bug affecting the system will therefore not be able to corrupt data in the back-up tables. As long as a sufficient time is left between updates there will be the possibility of recovering data in the master and secondary online tables using data in the back-up tables once the bug is fixed. Updates can also, or alternatively, be carried out manually so that it can be ensured that data in the back-up tables is not overwritten automatically (because a certain time period has elapsed) before the bug has been dealt with. Of course, any number of back-up tables may be provided and the hardware on which they are stored may be physically separate in order to avoid simultaneous destruction of the data contained thereon. Data in back-up tables may be overwritten one table at a time. For example, after a first time interval a connection between the back-up system and a first back-up table may be made in order to overwrite with data from the master table. After a second time interval (which may be the same as the first) a connection may be made with a second back-up table in order to overwrite with data from the master table, and so on. The interval between successive back-up events may be at least 1 hour, at least 1 day, or at least 1 week (e.g. a first back-up table may be overwritten after 1 hour, a second after 2 hours, etc).
In an embodiment, the master table is stored on hardware which is physically separate from the hardware on which the secondary table is stored. This ensures that an event which leads to the destruction of the hardware on which the master table is stored will not also lead to the loss of data in the secondary table. The encryption key and associated information will still be available for a time and can be used to recover the data in the master table.
In an embodiment, the storage module comprises a database. In an embodiment, the database is a graph database. Graph databases comprise a data structure including nodes and connecting edges. These databases are straightforward to decompose into a series of subgraphs. Data relating to a particular individual can therefore be easily extracted from the larger database by simply locating the correct node and associated branches. This makes graph databases particularly suitable for carrying out the method of the present invention because data needs to be decomposed into parts relating to each individual in order to encrypt a particular individual’s data using one or more keys unique to that individual.
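The extraction of an individual’s subgraph may be illustrated by the following sketch. It assumes, purely for the purposes of the example, that the graph is held as an adjacency mapping from each node to its child nodes; the function extract_subgraph and the node names are hypothetical.

```python
from collections import deque


def extract_subgraph(edges, user_node):
    """Collect user_node and every node reachable from it along outgoing edges."""
    seen = {user_node}
    queue = deque([user_node])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Root node "Ent" with one user node per individual, as in the described example.
edges = {
    "Ent": ["A", "B"],
    "A": ["A.email", "A.address"],
    "A.address": ["A.postcode"],
    "B": ["B.email"],
}
subgraph_a = extract_subgraph(edges, "A")
```

The resulting set of nodes is the portion of the database that would be encrypted with the key unique to individual A.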
In an embodiment, the method comprises, prior to encrypting the data, exchanging at least a portion of the data with a token and storing the mapping between the token and the data as part of the individual’s entry. This technique (known as pseudonymisation) provides an additional level of obfuscation and makes it more difficult for the original data to be recovered. It is possible only to treat sensitive parts of the data in this way (for example data which allows the individual to be identified or contacted) or, alternatively, all of the data may be pseudonymised.
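One possible form of the token exchange is sketched below. The functions pseudonymise and restore_fields, the flat dictionary record, and the use of random hexadecimal tokens are assumptions made for illustration; the specification does not state how tokens are generated.

```python
import secrets


def pseudonymise(record, sensitive_fields):
    """Replace sensitive values with random tokens; return the masked record and mapping."""
    masked = {}
    token_map = {}  # token -> original value, stored as part of the individual's entry
    for name, value in record.items():
        if name in sensitive_fields:
            token = secrets.token_hex(8)
            token_map[token] = value
            masked[name] = token
        else:
            masked[name] = value
    return masked, token_map


def restore_fields(masked, token_map):
    """Reverse the exchange using the mapping held in the key repository."""
    return {
        name: token_map[value] if isinstance(value, str) and value in token_map else value
        for name, value in masked.items()
    }


record = {"name": "Alice", "email": "alice@example.com", "plan": "basic"}
masked, token_map = pseudonymise(record, {"name", "email"})
```

Only the masked record would then be passed on for encryption, while the mapping is kept with the key in the individual’s entry, so that deleting the entry removes both.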
In an embodiment, the encrypted data is written to two physically separate back-up storage media. The copy of the data on the first back-up medium and second back-up medium will generally be identical as in most cases both will have been written at the time of encrypting the data by the back-up subsystem. In a similar manner to the secondary tables and back-up tables in the key repository, storing back-up media in different geographical locations prevents an event which destroys the hardware on which some of the data is stored from completely destroying all back-up copies. It is extremely unlikely that the same event will be able to destroy two back-up copies stored on hardware which is separated by a large enough distance. Copies may be stored more than 1 mile, or more than 10 miles apart, for example.
In an embodiment, the back-up storage medium is offline. In this way, the back-up data is harder to access and cannot be corrupted by the same bug which affects the rest of the online system. The back-up storage may be a write-once medium, such as a CD-R. In this case, reliable deletion of an individual’s data or record from the medium would be extremely labour intensive as it would involve generating an entirely new CD-R that did not contain the individual’s data and then physically destroying the original disc.
In an embodiment, all of the data stored on the storage module relating to an individual is encrypted using the same unique encryption key. This makes it easier to respond to a right to erasure request because all of an individual’s back-up data can be deleted by deleting a single encryption key or a single entry in the key repository relating to that individual.
In an embodiment, asymmetric cryptography is used to encrypt the data and the public key is stored with the data on the back-up storage medium and with the private encryption key in the key repository. The public key can therefore be used as an identifier to locate the encryption key which is suitable for decrypting particular data on the back-up media. The data on the back-up media may need to be accessed and decrypted in order to recover data on the online storage module which has been lost or corrupted.
According to a second aspect of the present invention, there is provided a system for providing right to erasure compliant back-up, the system comprising: at least one storage module configured to store data relating to an individual; a back-up subsystem coupled to the storage module; a back-up storage medium; and an encryption key repository comprising a master table, the encryption key repository being accessible only by the back-up subsystem, wherein the back-up subsystem is configured to encrypt the data using a unique encryption key, store the encrypted data on the back-up storage medium, and store the encryption key in the master table of the encryption key repository as at least part of an entry pertaining to the individual.
In an embodiment, the back-up subsystem is configured to delete the key from the encryption key repository and the data from the storage module in response to a right to erasure request received from the individual.
Embodiments of the present invention will now be described in detail with reference to the accompanying drawings, in which:
Figure 1 shows an overview of a storage system including a secure key repository;
Figure 2 illustrates the interaction between different parts of the system when generating back-up copies of an individual’s data and restoring the individual’s data from the back-up copies;
Figure 3 shows an example of a key repository system;
Figure 4 is a flowchart showing the process of backing-up data;
Figure 5 is a flowchart showing the process of restoring the data from the back-up medium; and
Figure 6 is a flowchart showing the process followed after a right to erasure request is received.
Figure 1 gives an overview of the structure of a right to erasure compliant back-up system 1. The system 1 includes a storage module 2 containing structured data. This storage module may be an online database containing entries relating to a plurality of customers or individuals (individuals A, B, C etc). The database may be a relational database comprising one or more tuples corresponding to each individual or may be a graph database. In a graph database, nodes in the hierarchically structured data may represent separate individuals. The data relating to each individual will then branch off from this node, connected to the node via edges. Graph databases will generally be more easily partitioned in order to extract all of the data relating to an individual and therefore are particularly advantageous in combination with the other elements of the system, which are described below.
The storage module 2 is connected to a back-up subsystem 4 by wired or wireless connection 3, which may be continuous or intermittent but which should preferably be as close to a continuous connection as possible. The back-up subsystem may comprise a processor or processors by which encryption and/or decryption of the data of one or more individuals may be carried out. The system also includes a secure encryption key repository 5 which can be accessed by the back-up subsystem in order to store and retrieve the information necessary to decrypt data. The only way to access the encryption key repository is by the back-up subsystem via a secure application programming interface 6.
Alternatively, the processing relating to encryption and decryption can take place partially or completely within the secure encryption key repository itself. In such a case, the processor is contained within the closed system shown in Figure 3. Decrypted data must then be transferred to the back-up subsystem or directly to the online database through the secure application programming interface. Applications not contained within the secure repository have no access to the keys once they have been written to the repository, reducing the risk of the information being disseminated.
The system 1 also includes one or more back-up storage media 7 (two are shown in the figure, however any number may be present) which are generally kept offline. In this context offline refers to the fact that they are not connected to the online storage module and back-up system, or are connected only in order to carry out certain tasks such as writing data to the back-up storage media. The back-up data storage media may comprise one or more hard disk drives or any other type of storage media including combinations of different types of storage media. For improved redundancy, several back-up storage media may be included in the system (the system may, for example, include three back-up storage media). Each storage medium may carry an encrypted copy of the data stored in the online database, which may include one or more identifiers for associating the data with individuals. These separate storage media may be located in different geographical locations in order to avoid a situation in which all back-up copies of the relevant data are destroyed by a local event which results in physical destruction of hardware. The back-up subsystem 4 comprises a back-up generator module 8 and a back-up restore module 9. The back-up generator module and the back-up restore module may be implemented on the same or separate physical devices.
Figure 2 illustrates the interaction between different parts of the system when generating back-up copies of an individual’s data and restoring the individual’s data from the back-up copies. Data relating to several individuals (users n-1 to n+2) is shown in the centre. Three copies of the data for each individual are shown, indicating that in this case there are three back-up media 7, each with a copy of the data for each individual.
When it is desired to back-up an individual’s data, the back-up generator module 8 receives unencrypted data from the online database (or another source), generates an encryption key, encrypts the data with the key, and then stores the encrypted data on the back-up medium and stores the encryption key in a master table within the encryption key repository 5.
The back-up restore module 9 is configured to receive a request to restore data, retrieve encrypted data from the back-up storage medium, decrypt this data using the correct encryption key retrieved from the encryption key repository 5 (which is accessed via the secure application programming interface), and forward this data to the requester. In the case that the online database requires a restore, the decrypted data is stored in the online database (overwriting newer or corrupted data, for example).
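The operation of the back-up generator module 8 and the back-up restore module 9 may be sketched as follows. This is an illustrative outline only: toy_cipher is a deliberately simple stand-in built from a hash function and is not secure, and the function names and the use of dictionaries for the master table and the back-up media are assumptions of the sketch.

```python
import hashlib
import secrets


def toy_cipher(key, data):
    """XOR data with a SHA-256 counter keystream.

    Toy stand-in for a real cipher, NOT secure. Applying it twice with the
    same key returns the original bytes.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hashlib.sha256(key + counter).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def generate_back_up(user_id, plaintext, key_repository, media):
    """Generator module: create a key, encrypt, write to every back-up medium."""
    key = secrets.token_bytes(32)
    ciphertext = toy_cipher(key, plaintext)
    for medium in media:
        medium[user_id] = ciphertext
    key_repository[user_id] = key  # entry in the master table


def restore_back_up(user_id, key_repository, media):
    """Restore module: return the decrypted data, or None if key or data is missing."""
    key = key_repository.get(user_id)
    if key is None:
        return None
    for medium in media:
        if user_id in medium:
            return toy_cipher(key, medium[user_id])
    return None


master_table = {}
back_up_media = [{}, {}, {}]  # three media, as in the example of Figure 2
generate_back_up("A", b"name=Alice;city=London", master_table, back_up_media)
restored = restore_back_up("A", master_table, back_up_media)
```

If the entry for individual A is later removed from master_table, restore_back_up returns None even though the ciphertext is still present on all three media.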
In order to be able to identify the correct key within the encryption key repository for decrypting a particular individual’s data, some type of identifier may be used to link the encrypted data stored on the back-up storage medium with the associated key in the encryption key repository. One way to provide a suitable identifier is to use asymmetric cryptography to encrypt the data, which results in the generation of separate private and public keys. The public key can be stored and associated with both the encrypted data within the back-up storage medium and the private key in the key repository. The private key is stored solely in the key repository system to which only very limited access is available. With this type of encryption, it is possible to use either key to encrypt the data; generally, however, the public key is used to encrypt the data, and use of the secret, private key, which can only be accessed by the back-up subsystem through the secure API, is the only possible way to decrypt the data. In this embodiment, the public key stored on the back-up medium is matched with the public key stored in the key repository.
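The use of the public key as an identifier may be illustrated with the following sketch. It uses textbook RSA with very small primes, applied byte by byte, solely so that the example is self-contained; such parameters offer no security and are not suggested by the specification. The function names and the tuple representation of the keys are likewise assumptions.

```python
def make_toy_keypair(p, q, e=17):
    """Textbook RSA key pair from two small primes (illustration only, NOT secure)."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e, available from Python 3.8
    return (n, e), (n, d)


def encrypt(public_key, data):
    n, e = public_key
    return [pow(byte, e, n) for byte in data]


def decrypt(private_key, blocks):
    n, d = private_key
    return bytes(pow(block, d, n) for block in blocks)


public_a, private_a = make_toy_keypair(61, 53)
public_b, private_b = make_toy_keypair(67, 71)

# The repository maps each public key to the private key held only in the repository.
key_repository = {public_a: private_a, public_b: private_b}

# Each record on the back-up medium carries the public key alongside the ciphertext.
medium = [
    (public_a, encrypt(public_a, b"data for A")),
    (public_b, encrypt(public_b, b"data for B")),
]


def restore_all(medium, key_repository):
    """Decrypt every record whose public key still has an entry in the repository."""
    restored = []
    for public_key, blocks in medium:
        private_key = key_repository.get(public_key)
        if private_key is not None:
            restored.append(decrypt(private_key, blocks))
    return restored
```

Removing the entry indexed by public_a from key_repository leaves the first record on the medium without a matching private key, so only the data for individual B can be restored.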
The encryption key generated may comprise at least 2048 bits or at least 4096 bits. This size of key does not use up much storage space but is all but impossible to determine by trial and error, rendering the encrypted data completely indecipherable without access to the corresponding key. Once the encryption key relating to particular data has been deleted from the repository, the associated data on the back-up media cannot be decrypted. Although the encrypted data may still be present on the back-up storage medium or media, deletion of the encryption key renders the data irretrievable and will therefore have the same effect as deletion of the data itself.
After a request for deletion of the data has been received and the corresponding encryption key has been deleted from the key repository system, the data on the back-up medium may be overwritten with new data requiring back-up. A log may be kept by the back-up subsystem of data for which an encryption key is no longer available within the repository, and this log may be used to determine which data within the back-up data storage is no longer accessible and can thus be overwritten or deleted. The log may be kept on a per-storage medium basis, with a separate log available for each storage medium to which encrypted data has been written as back-up.
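A per-medium log of this kind might be derived as in the sketch below. The function build_overwrite_logs, the record identifiers, and the medium names are hypothetical; each medium is assumed, for the example, to keep an index from record identifier to the identifier of the key used.

```python
def build_overwrite_logs(media, key_repository):
    """For each medium, list the records whose key is no longer in the repository."""
    logs = {}
    for medium_name, records in media.items():
        logs[medium_name] = sorted(
            record_id
            for record_id, key_id in records.items()
            if key_id not in key_repository
        )
    return logs


# Index of each medium: record identifier -> identifier of the key used to encrypt it.
media = {
    "disk-1": {"rec-001": "A", "rec-002": "B"},
    "disk-2": {"rec-003": "A", "rec-004": "C"},
}
key_repository = {"B": b"Kb", "C": b"Kc"}  # the entry for A has already been deleted
logs = build_overwrite_logs(media, key_repository)
```

The records listed for each medium are those which can no longer be decrypted and may therefore be overwritten when that medium is next connected.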
The key repository system may comprise a master table as well as one or more secondary tables which provide back-up copies of the encryption keys and any related data stored in the master table. The secondary tables must be contained within the closed system such that they too can be accessed only through the secure application programming interface.
An example of a key repository system 5 is shown in Figure 3. The system is a closed system including three tables x, y, and z, each of which holds a copy of the encryption key K for each individual (having user IDs A, B, and C). One of these tables, in this case table z, is the master table and the other two are secondary tables. It is not necessary for table z to remain as the master table. Which table is designated as the master table could, for example, rotate between tables x, y, and z over time or could be adapted to depend on which data is the least likely to be corrupted or lost. It is the master table with which the back-up subsystem usually communicates (e.g. when it is desired to decrypt some of the data stored on the back-up media). In this case a mapping between tokens T and data N is also contained in all three of the tables. Both the key and the mapping are associated with a particular user ID.
As mentioned above, a public key may also be held within the database in place of, or as well as, the user ID. The public key can be used to match an entry within the key repository to related data stored on a back-up medium or for encrypting additional data relating to the same individual. Using the same key to encrypt all of the available data linked to a particular individual makes it more straightforward to delete all of this data simultaneously when a right to erasure request is received.
The user ID may also serve the purpose of identifying data belonging to a particular individual. The identical copies of the encryption data in the master and secondary tables may be located in different geographical locations in order to provide redundancy for the data and to protect against physical destruction of the hardware on which the data within one or more of the tables is stored (in a similar manner to the encrypted data on the back-up media). Including three tables is particularly advantageous since enough back-up is provided to allow for corruption or destruction of two instances of the data while not incurring excessive processing or monetary cost. In this case, unencrypted data is sourced from a graph database so that information about the root node to which user nodes relating to each individual are connected (the root node is given ID “Ent” in the figure) is also contained within tables x, y, and z. When it is desired to reconstruct the graph database, the information relating to the root node is also decrypted to be used for this purpose.
The tables may be configured to communicate with one another in order to ensure that the data stored in the secondary tables is consistent with the data stored in the master table. Synchronisation of the data in each of the secondary tables with the data in the master table can be carried out after a certain period of time has elapsed since the tables were previously synchronised. For example, secondary tables may be overwritten periodically with a copy of the data contained in the master table. Either the whole of the data contained in the table can be overwritten or just the portion found to be inconsistent with the data in the master table. The period between instances of synchronisation may be extremely small (1 second or 1 millisecond for example) in order to provide near continuous synchronisation. This ensures that a copy of the latest encryption data is always available from one of the two secondary tables should the hardware on which the master table is stored be destroyed.
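A minimal sketch of this synchronisation step is given below, assuming each table can be modelled as a dictionary from user ID to key material. The function name `synchronise` and the `full_overwrite` flag are hypothetical. Removing entries that no longer exist in the master table is an assumption that follows from treating the master as authoritative.

```python
def synchronise(master, secondary, full_overwrite=False):
    """Make `secondary` consistent with `master`; return the changed user IDs."""
    if full_overwrite:
        # Overwrite the whole secondary table with a copy of the master.
        changed = set(master) | set(secondary)
        secondary.clear()
        secondary.update(master)
        return changed
    changed = set()
    # Overwrite only the portion found to be inconsistent with the master.
    for user_id, entry in master.items():
        if user_id not in secondary or secondary[user_id] != entry:
            secondary[user_id] = entry
            changed.add(user_id)
    for user_id in set(secondary) - set(master):
        del secondary[user_id]  # entry no longer present in the master table
        changed.add(user_id)
    return changed


master = {"A": b"key-a", "B": b"key-b", "C": b"key-c"}
secondary = {"A": b"key-a", "B": b"stale", "D": b"key-d"}
changed = synchronise(master, secondary)
```

Calling such a function at a short fixed interval would approximate the near continuous synchronisation described above.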
In addition to the master table and one or more secondary tables, there may be additional offline back-up copies 10 of the tables containing encryption data which are periodically connected to the main system and overwritten with data from the master or secondary tables. These may be backed up less often than the secondary tables (the interval between updates may be as much as 1 day or 1 week, for example). If more than one back-up copy of the encryption key table is included, the back-ups may be updated one at a time (i.e. a first after one day and a second after two days in a rotating fashion). The addition of offline back-up storage for the encryption key tables ensures that a clean and recent version of the data is available in the event that corrupted data is copied to the online tables.
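The rotating update of offline copies might be sketched as follows. The class `OfflineBackups` and its method names are illustrative assumptions, and the daily schedule is represented only by the order of the calls.

```python
import copy


class OfflineBackups:
    """Offline copies of the key table, refreshed one at a time in rotation."""

    def __init__(self, count):
        self.copies = [{} for _ in range(count)]
        self.next_index = 0

    def refresh(self, source_table):
        """Overwrite the least recently updated copy; return its index."""
        index = self.next_index
        self.copies[index] = copy.deepcopy(source_table)
        self.next_index = (index + 1) % len(self.copies)
        return index


backups = OfflineBackups(count=2)
master = {"A": b"key-a"}
first = backups.refresh(master)    # day 1: copy 0 receives clean data
master["A"] = b"corrupted"         # corruption reaches the online tables
second = backups.refresh(master)   # day 2: copy 1 receives the corrupted data
clean = backups.copies[first]["A"]  # copy 0 still holds the clean key
```

Because only one copy is overwritten per cycle, the copy refreshed on the previous cycle still holds the data as it was before the corruption.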
Prior to encryption, data pertaining to a particular individual may be extracted from the online database. Partitioning of the data is simpler in the case of a graph database than for a relational database or another type of structured data. Since partitioning of the data may need to be carried out fairly often as data is retrieved or updated, it is advantageous for data in the online database to be in the form of a graph database. The extracted data may represent all of the data in the database that relates to a particular individual or it may represent only a portion of their personal data that it is desired to back up, such as a particular tuple, sub-graph, or portion of a sub-graph. This data may then be encrypted using an encryption key before writing the encrypted data to the back-up storage medium. A single unique encryption key may be used for each individual and this key may be used only for encrypting or decrypting that individual’s data which is to be written to back-up storage. Alternatively, a particular individual’s data may be divided into slices and each slice encrypted using a different key. This allows an individual’s data to be sub-divided according to some additional criterion such as the length of time the data has been available in the online database. This means that data can be deleted after a certain time by deleting the relevant key within the encryption key repository system after the desired period has elapsed (along with the corresponding data in the online database).
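The slice-based alternative can be illustrated with the sketch below, in which each slice of an individual's data has its own key and old slices become unreadable once their keys are deleted. The class `SliceKeys`, and the choice of an integer year as the slicing criterion, are assumptions made for illustration.

```python
import secrets


class SliceKeys:
    """Keys indexed by (user ID, period); `period` is a hypothetical integer year."""

    def __init__(self):
        self._keys = {}

    def key_for(self, user_id, period):
        # Create the key for this slice on first use, then reuse it.
        slot = (user_id, period)
        if slot not in self._keys:
            self._keys[slot] = secrets.token_bytes(32)
        return self._keys[slot]

    def has_key(self, user_id, period):
        return (user_id, period) in self._keys

    def expire_before(self, cutoff):
        """Delete every key whose period is older than `cutoff`."""
        expired = sorted(slot for slot in self._keys if slot[1] < cutoff)
        for slot in expired:
            del self._keys[slot]
        return expired


keys = SliceKeys()
for year in (2015, 2016, 2017):
    keys.key_for("A", year)
expired = keys.expire_before(2017)  # slices from 2015 and 2016 become unreadable
```

Deleting a slice key leaves the other slices for the same individual decryptable, which is what allows time-limited retention without rewriting the back-up media.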
An individual may refer to a user, customer, or to a group of individuals such as an organisation, and references herein to the individual will then refer to a user, customer, or to such a grouping. This way, an organisation can request that all of their data stored in the online database be deleted, even though this data may relate to a number of individuals.
Figure 4 is a flowchart illustrating the process of generating a back-up copy of data in the online database. The figure illustrates the steps used to encrypt data from a graph database; however the method is applicable to any type of structured data relating to one or more individuals or groups of individuals. Data within the database is first divided into sub-graphs at step 11, with each grouping relating to a particular individual or organisation. For a relational database this involves creating a set of tuples or rows relating to a particular individual and for a graph database the graph is decomposed into a series of sub-graphs relating to each individual as shown in the figure for users A, B, and C. Data relating to each individual is denoted “N”. The figure also shows one grouping for the root node (Ent) with user nodes A, B, and C branching off. Encryption information for this sub-graph is also recorded in the key repository, which allows the original graph to be reconstructed (or at least traced back to the root node for a particular individual) using information therein.
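Step 11 can be sketched for a small graph as shown below. The edge-list representation, the function name `partition`, and the node names N1 to N5 are assumptions for illustration. The sketch also assumes that no data node is shared between two individuals.

```python
from collections import defaultdict


def partition(edges, root="Ent"):
    """Split an undirected graph into one sub-graph per user node below `root`."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    user_nodes = sorted(adjacency[root])
    # Root grouping: the root node together with the user nodes branching off it.
    subgraphs = {root: {root, *user_nodes}}
    for user in user_nodes:
        seen, stack = {user}, [user]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                # Never walk back through the root into another user's data.
                if neighbour != root and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        subgraphs[user] = seen
    return subgraphs


edges = [("Ent", "A"), ("Ent", "B"), ("Ent", "C"),
         ("A", "N1"), ("A", "N2"), ("N2", "N3"), ("B", "N4"), ("C", "N5")]
subgraphs = partition(edges)
```

Each resulting node set can then be serialised and encrypted independently, with the root grouping retained so that the sub-graphs can later be reattached.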
Data can be pseudonymised at step 12 in order to make extracting the original data from the encrypted data more difficult. Pseudonymisation works by replacing the data with a token according to a particular mapping. Data is encrypted using encryption key “K” at step 13, and at 14 the mapping between the data and token is stored along with the encryption key for each individual in the secure encryption key repository 5. The individual’s original data may also be written to the key repository as part of the entry if desired. The encrypted data is written to the back-up storage medium (or media) 7 at step 15.
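Steps 12 to 15 might be sketched as follows. The function names, the record layout, and the dictionary stand-ins for the key repository 5 and back-up medium 7 are hypothetical. The `toy_cipher` function is a SHAKE-256 keystream XOR used only to keep the sketch free of external dependencies; it is not a vetted cipher, and an authenticated cipher such as AES-GCM from an established library would be expected in practice.

```python
import hashlib
import json
import secrets


def toy_cipher(key, nonce, data):
    """XOR with a SHAKE-256 keystream. Illustration only, NOT a vetted cipher."""
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def pseudonymise(record, sensitive_fields):
    """Replace the named fields with random tokens; return record and mapping."""
    mapping, tokenised = {}, dict(record)
    for name in sensitive_fields:
        token = "T-" + secrets.token_hex(8)
        mapping[token] = record[name]
        tokenised[name] = token
    return tokenised, mapping


def back_up(user_id, record, sensitive_fields, repository, media):
    tokenised, mapping = pseudonymise(record, sensitive_fields)  # step 12
    key, nonce = secrets.token_bytes(32), secrets.token_bytes(16)
    plaintext = json.dumps(tokenised, sort_keys=True).encode()
    ciphertext = toy_cipher(key, nonce, plaintext)               # step 13
    repository[user_id] = {"key": key, "mapping": mapping}       # step 14
    media[user_id] = nonce + ciphertext                          # step 15


repository, media = {}, {}
back_up("A", {"name": "Alice", "plan": "basic"}, ["name"], repository, media)
```

In this sketch the back-up medium holds only the nonce and ciphertext, so both the key and the mapping must be read from the repository to recover the original record.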
Pseudonymisation may be carried out before the data is encrypted and may be carried out on all or part of the data. If pseudonymisation is carried out on only a part of the data, this may be any data which allows an individual to be identified or may be the most sensitive data. Each of the sub-graphs or tuple sets is encrypted using an encryption key generated by the back-up subsystem and written to the back-up medium. Several identical copies may be stored on a number of back-up media which are housed at different geographical locations. The encryption key relating to the individual or portion of the individual’s data is stored in the secure key repository. Only a single copy of the key is stored anywhere and this is kept in the repository which is accessible only through the secure API. This ensures that once copies of the key within the secure repository are deleted, data can be guaranteed to be completely inaccessible.
Figure 5 illustrates the steps of restoring the data in the online database from back-up. This may be necessary, for example, if data in the online database is corrupted or if it is desired to revert to an earlier version of the data. The process of restoring the data is similar to the process of backing-up the data but in reverse. The individual encrypted elements are read from the back-up medium 7 at step 16. Elements are decrypted by the back-up subsystem which accesses the encryption key corresponding to the data through the secure API (shown as step 17 in the figure). If the data has been pseudonymised, the relevant mappings are also accessed through the API at step 17. The data is then decrypted using the key and the tokens are replaced with the original data using the mappings retrieved from the table. The resulting tuple sets may then be loaded back into the online database or the subgraphs assembled into a graph in the online database (step 18).
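The restore path of steps 16 to 18 can be sketched in the same style. The function names, the 16-byte nonce prefix, and the fixed example key are assumptions for illustration, and `toy_cipher` is again a non-secure stand-in for a real cipher.

```python
import hashlib
import json


def toy_cipher(key, nonce, data):
    """XOR with a SHAKE-256 keystream. Illustration only, NOT a vetted cipher."""
    stream = hashlib.shake_256(key + nonce).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


def restore(user_id, repository, media):
    blob = media[user_id]                       # step 16: read from back-up
    entry = repository[user_id]                 # step 17: KeyError if erased
    nonce, ciphertext = blob[:16], blob[16:]
    record = json.loads(toy_cipher(entry["key"], nonce, ciphertext))
    mapping = entry["mapping"]
    # Replace each token with the original data it stands for.
    return {name: mapping[value] if isinstance(value, str) and value in mapping
            else value
            for name, value in record.items()}  # ready for step 18


# Fixed example values so the sketch is reproducible.
key, nonce = b"k" * 32, b"n" * 16
tokenised = json.dumps({"name": "T-1", "plan": "basic"}).encode()
media = {"A": nonce + toy_cipher(key, nonce, tokenised)}
repository = {"A": {"key": key, "mapping": {"T-1": "Alice"}}}
restored = restore("A", repository, media)
```

If the entry for the individual has been removed from the repository, the lookup at step 17 fails and the data on the back-up medium cannot be decrypted.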
When a right to erasure request is received from an individual, it is necessary to delete all of their personal data from the relevant back-up media as well as from the online database. This may be done by tracking an identifier (an individual’s user ID for example) and deleting all tuples or subgraphs in the online database relating to this identifier. The entry or entries for the individual in the back-up encryption key repository are also deleted. This will include the encryption key used to encrypt the data written to back-up media, any pseudonymisation mappings stored in the repository along with the key, and any additional data associated with the individual. Figure 6 shows at step 19 the process of deleting an individual’s entry in the key repository as well as their information in the online database when a right to erasure request is received from user A.
The step of deleting the individual’s entry from the encryption key repository can optionally be carried out after a predefined time period has passed since the data in the online database was deleted (after 1 hour or after 1 day for example). This allows for the possibility of recovering data when data is deleted by mistake or when the individual changes their mind after deletion of the online data. While the encryption key is still available, data can be recovered from the back-up storage media. Once the individual-specific encryption key or keys and, if applicable, the mapping information are deleted from the key repository, the individual’s data is irretrievable and thus has effectively been deleted from the back-up storage. This provides a straightforward and reliable way of ensuring that rules regarding right to erasure can be complied with effectively.
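The delayed deletion of the repository entry might be sketched as below. The class `ErasureScheduler`, its method names, and the use of explicit `now` timestamps in seconds are illustrative assumptions; the online database and key repository are modelled as dictionaries.

```python
class ErasureScheduler:
    """Deletes online data at once and the key entry after a grace period."""

    def __init__(self, online_db, key_repository, grace_seconds):
        self.online_db = online_db
        self.key_repository = key_repository
        self.grace_seconds = grace_seconds
        self.pending = {}  # user ID -> time after which the key is deleted

    def request_erasure(self, user_id, now):
        self.online_db.pop(user_id, None)
        self.pending[user_id] = now + self.grace_seconds

    def cancel(self, user_id):
        """Withdraw a pending erasure; the back-up remains recoverable."""
        return self.pending.pop(user_id, None) is not None

    def purge(self, now):
        """Delete key entries whose grace period has elapsed."""
        due = sorted(u for u, t in self.pending.items() if t <= now)
        for user_id in due:
            self.key_repository.pop(user_id, None)
            del self.pending[user_id]
        return due


online_db = {"A": {"name": "Alice"}, "B": {"name": "Bob"}}
key_repository = {"A": b"key-a", "B": b"key-b"}
scheduler = ErasureScheduler(online_db, key_repository, grace_seconds=3600)
scheduler.request_erasure("A", now=0)
early = scheduler.purge(now=1800)  # grace period not yet elapsed
late = scheduler.purge(now=3600)   # key for user A is now deleted
```

Until `purge` removes the key, the individual's data can still be restored from back-up; afterwards the ciphertext on the back-up media is unreadable.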
Embodiments of the present invention have been described with particular reference to the examples illustrated. However, it will be appreciated that variations and modifications may be made to the examples described within the scope of the present invention.

Claims (32)

1. A method for providing right to erasure compliant back-up, the method comprising:
storing, by a back-up subsystem, data which relates to an individual on at least one storage module coupled to the back-up subsystem;
encrypting, by the back-up subsystem, the data using a unique encryption key;
storing, by the back-up subsystem, the encrypted data on one or more back-up storage media; and
storing, by the back-up subsystem, the encryption key in the master table of an encryption key repository as at least part of an entry pertaining to the individual, wherein the encryption key repository is accessible only by the back-up subsystem.
2. The method of claim 1, comprising receiving a right to erasure request from the individual and, in response to the request, deleting the encryption key from the key repository and the data from the storage module.
3. The method of any of claims 1 and 2, wherein the key repository is accessible only via a secure application programming interface.
4. The method of claim 2, wherein deletion of the encryption key from the key repository occurs after a pre-determined time period has elapsed since deletion of the data from the storage module.
5. The method of any of claims 1 to 4, wherein the encryption key repository comprises a secondary table in which a second copy of the entry is stored.
6. The method of claim 5, comprising periodically overwriting the secondary table with the data in the master table.
7. The method of claim 6, wherein the key repository comprises at least one back-up table which is periodically overwritten with the data in the master table, and wherein the time interval between instances of overwriting is longer than the time interval between instances of overwriting data in the secondary table.
8. The method of claim 7, wherein the at least one back-up table is offline except during instances of overwriting.
9. The method of any of claims 5 to 8, wherein the master table is stored on hardware which is physically separate from the hardware on which the secondary table is stored.
10. The method of any of claims 1 to 9, wherein the storage module comprises a database.
11. The method of claim 10, wherein the database is a graph database.
12. The method of any of claims 1 to 11, comprising, prior to encrypting the data, exchanging at least a portion of the data with a token and storing the mapping between the token and the data as part of the individual’s entry.
13. The method of any of claims 1 to 12, wherein the encrypted data is written to two physically separate back-up storage media.
14. The method of any of claims 1 to 13, wherein the back-up storage medium is offline.
15. The method of any of claims 1 to 14, wherein all of the data stored on the storage module relating to an individual is encrypted using the same unique encryption key.
16. The method of any of claims 1 to 15, wherein asymmetric cryptography is used to encrypt the data and the public key is stored with the data on the back-up storage medium and with the private encryption key in the key repository.
17. A system for providing right to erasure compliant back-up, the system comprising:
at least one storage module configured to store data relating to an individual;
a back-up subsystem coupled to the storage module;
a back-up storage medium; and
an encryption key repository comprising a master table, the encryption key repository being accessible only by the back-up subsystem, wherein the back-up subsystem is configured to encrypt the data using a unique encryption key, store the encrypted data on the back-up storage medium, and store the encryption key in the master table of the encryption key repository as at least part of an entry pertaining to the individual.
18. The system of claim 17, wherein the back-up subsystem is configured to delete the key from the encryption key repository and the data from the storage module in response to a right to erasure request received from the individual.
19. The system of any of claims 17 and 18, wherein the key repository is accessible only via a secure application programming interface.
20. The system of claim 18, wherein deletion of the encryption key from the key repository occurs after a pre-determined time period has elapsed since deletion of the data from the storage module.
21. The system of any of claims 17 to 20, wherein the encryption key repository comprises a secondary table in which a second copy of the entry is stored.
22. The system of claim 21, comprising periodically overwriting the secondary table with the data in the master table.
23. The system of claim 22, wherein the key repository comprises at least one back-up table which is periodically overwritten with the data in the master table, and wherein the time interval between instances of overwriting is longer than the time interval between instances of overwriting data in the secondary table.
24. The system of claim 23, wherein the at least one back-up table is offline except during instances of overwriting.
25. The system of any of claims 21 to 24, wherein the master table is stored on hardware which is physically separate from the hardware on which the secondary table is stored.
26. The system of any of claims 17 to 25, wherein the storage module comprises a database.
27. The system of claim 26, wherein the database is a graph database.
28. The system of any of claims 17 to 27, comprising, prior to encrypting the data, exchanging at least a portion of the data with a token and storing the mapping between the token and the data as part of the individual’s entry.
29. The system of any of claims 17 to 28, wherein the encrypted data is written to two physically separate back-up storage media.
30. The system of any of claims 17 to 29, wherein the back-up storage medium is offline.
31. The system of any of claims 17 to 30, wherein all of the data stored on the storage module relating to an individual is encrypted using the same unique encryption key.
32. The system of any of claims 17 to 31, wherein asymmetric cryptography is used to encrypt the data and the public key is stored with the data on the back-up storage medium and with the private encryption key in the key repository.
Intellectual Property Office
Application No: GB1708336.1 Examiner: Mr Robert Hunt
GB1708336.1A 2017-05-24 2017-05-24 Right to erasure compliant back-up Withdrawn GB2562767A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
GB1708336.1A GB2562767A (en) 2017-05-24 2017-05-24 Right to erasure compliant back-up

Publications (2)

Publication Number Publication Date
GB201708336D0 GB201708336D0 (en) 2017-07-05
GB2562767A GB2562767A (en) 2018-11-28

Family

ID=59220512

Family Applications (1)

Application Number Title Priority Date Filing Date
GB1708336.1A Withdrawn GB2562767A (en) 2017-05-24 2017-05-24 Right to erasure compliant back-up

Country Status (1)

Country Link
GB (1) GB2562767A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024064176A1 (en) * 2022-09-20 2024-03-28 Thales DIS CPL USA, Inc System and method for data privacy compliance cross-reference to related applications

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111198784B (en) * 2018-11-16 2024-04-23 杭州海康威视***技术有限公司 Data storage method and device
CN112631576B (en) * 2020-12-31 2022-09-27 杭州天宽科技有限公司 Java universal code generation optimization method and system

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6134660A (en) * 1997-06-30 2000-10-17 Telcordia Technologies, Inc. Method for revoking computer backup files using cryptographic techniques
US20060143443A1 (en) * 2004-02-04 2006-06-29 Alacritus, Inc. Method and apparatus for deleting data upon expiration
US20060210085A1 (en) * 2005-03-17 2006-09-21 Min-Hank Ho Method and apparatus for expiring encrypted data
US20070022290A1 (en) * 2005-07-25 2007-01-25 Canon Kabushiki Kaisha Information processing apparatus, control method thereof, and computer program
US20110055559A1 (en) * 2009-08-27 2011-03-03 Jun Li Data retention management
JP2014170412A (en) * 2013-03-04 2014-09-18 Ricoh Co Ltd Information processing device and program
US20140372393A1 (en) * 2005-08-09 2014-12-18 Imation Corp. Data archiving system

Similar Documents

Publication Publication Date Title
EP1927060B1 (en) Data archiving method and system
US8589697B2 (en) Discarding sensitive data from persistent point-in-time image
US6754827B1 (en) Secure File Archive through encryption key management
US6134660A (en) Method for revoking computer backup files using cryptographic techniques
US7770213B2 (en) Method and apparatus for securely forgetting secrets
US20140237232A1 (en) Selective shredding in a deduplication system
US20080208929A1 (en) System And Method For Backing Up Computer Data
GB2562767A (en) Right to erasure compliant back-up
WO2009056570A1 (en) Method and apparatus for restoring encrypted files to an encrypting file system based on deprecated keystores
JP2006301849A (en) Electronic information storage system
US9575977B1 (en) Data management system
US20180225179A1 (en) Encrypted data chunks
US11288382B2 (en) Removing information from data
JP2020505835A (en) Data filing method and system
CN117910931B (en) Express access platform
WO2023112272A1 (en) Management method, information processing device, and management program
US20240054217A1 (en) Method and apparatus for detecting disablement of data backup processes
JP2022122266A (en) Device and method for safe storage of media including personal data and erasing of stored personal data
US20110197076A1 (en) Total computer security
Tagarev System recovery management basics
CN108885576A (en) Information is removed from data
Rao et al. Analysis of Deduplication in Secure Cloud Storage

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)