CN109408290A - InnoDB-based fragment file recovery method, apparatus and storage medium

InnoDB-based fragment file recovery method, apparatus and storage medium

Info

Publication number
CN109408290A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811225169.1A
Other languages
Chinese (zh)
Other versions
CN109408290B (en)
Inventor
梁德荣
田庆宜
黄建邦
沈长达
吴少华
张学君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Meiya Pico Information Co Ltd
Original Assignee
Xiamen Meiya Pico Information Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Meiya Pico Information Co Ltd
Priority to CN201811225169.1A
Publication of CN109408290A
Application granted
Publication of CN109408290B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00: Error detection; Error correction; Monitoring
    • G06F11/07: Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14: Error detection or correction of the data by redundancy in operation
    • G06F11/1402: Saving, restoring, recovering or retrying
    • G06F11/1446: Point-in-time backing up or restoration of persistent data
    • G06F11/1448: Management of the data involved in backup or backup restore

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

The present invention provides an InnoDB-based fragment file recovery method, apparatus and storage medium. The method comprises: starting from an initial position, reading n bytes of data as one data page of an InnoDB data file; reading the first 4 bytes of the data page as the check value CheckSum1, calculating the check value CheckSum2 of the data page, and judging whether CheckSum1 equals CheckSum2; if not, setting Offset = Offset + m and reading data again; if so, performing recovery; reading the page number PageNo of the data page and the file identifier FileId of the file to which the data page belongs, merging data pages according to FileId, and sorting them within their file in ascending order of PageNo. Because the invention relies on the page structure of InnoDB data files, it can recover data from an entire disk or image without depending on file system file records. If a file is partially damaged, its undamaged part can be extracted; if the fragments belong to multiple data files, they can be traced back to their source files, regrouped and sorted.

Description

InnoDB-based fragment file recovery method, apparatus and storage medium
Technical field
The present invention relates to the technical field of data processing, and in particular to an InnoDB-based fragment file recovery method, apparatus and storage medium.
Background art
InnoDB, the default storage engine of MySQL, is widely used. In the database recovery and electronic forensics industries, the need to recover MySQL databases is particularly urgent. When MySQL data files are lost because of manual deletion, virus damage, bad disk tracks and the like, recovering the file data accurately and completely is an important problem that urgently needs to be solved.
Many tools for recovering deleted files are currently on the market. They all rely either on file system file records or on file signatures. Recovery based on file system file records has the following shortcomings: 1. a file record cannot be recovered once it has been overwritten by a new file record; 2. file records cannot be recovered after a quick format of the disk has cleared them; 3. file records cannot be read, and so cannot be recovered, when bad tracks fall within them. Recovery based on file signatures has the following shortcomings: 1. a file whose data is not contiguous on disk cannot be recovered; 2. a file cannot be recovered once its file header and signature have been overwritten.
Summary of the invention
To address the above defects in the prior art, the present invention proposes the following technical solution.
An InnoDB-based fragment file recovery method, comprising:
A reading step: starting from an initial position Offset = 0, reading n bytes of data as one data page of an InnoDB data file;
A matching step: reading the first 4 bytes of the data page as the check value CheckSum1, calculating the check value CheckSum2 of the data page using the folding checksum algorithm for data pages, and judging whether CheckSum1 equals CheckSum2; if not, setting Offset = Offset + m and re-executing the reading step; if so, executing the recovery step;
A recovery step: reading the page number PageNo of the data page and the file identifier FileId of the file to which the data page belongs, merging data pages according to FileId, sorting them within their file in ascending order of PageNo, then setting Offset = Offset + n and re-executing the reading step; where m is a data offset unit and n is the size of one data page.
Further, the fragment file is an ibdata and/or ibd fragment file.
Further, the operation of calculating the check value CheckSum2 of the data page using the folding checksum algorithm is: taking a 22-byte segment starting at byte 4 of the data page and applying the fold-XOR calculation to it to obtain the check value sum1; taking a segment of n-46 bytes starting at byte 38 of the data page and applying the fold-XOR calculation to it to obtain the check value sum2; the check value of the data page is then CheckSum2 = sum1 + sum2.
Further, an XOR value algorithm on two integers is defined, with its operator written as **. For two 4-byte integers a and b, the XOR value algorithm is:
a ** b = ((((a ^ b ^ RANDOM_MASK) << 8) + a) ^ RANDOM_MASK2) + b;
That is, the value of a XOR b XOR RANDOM_MASK is shifted left by 8 bits, a is added, the result is XORed with RANDOM_MASK2, and b is added.
The fold-XOR calculation is: set the fold value fold to an initial value of 0 and traverse the data segment in byte order; let the traversal result be the data set N {N1, N2, N3, ..., Nm}; apply the integer XOR value algorithm to fold and each element in turn, writing the return value back to fold, i.e. fold = fold ** Ni, where 1 <= i <= m, RANDOM_MASK = 1653893711 and RANDOM_MASK2 = 1463735687.
Further, the operation of traversing the data segment in byte order to form the data set N {N1, N2, N3, ..., Nm} is: starting from the beginning of the segment, every four bytes are read to form one 4-byte integer; if fewer than 4 bytes remain at the end, the remaining bytes form one integer, which is Nm.
The invention also provides an InnoDB-based fragment file recovery apparatus, comprising:
A reading unit, configured to read, starting from an initial position Offset = 0, n bytes of data as one data page of an InnoDB data file;
A matching unit, configured to read the first 4 bytes of the data page as the check value CheckSum1, calculate the check value CheckSum2 of the data page using the folding checksum algorithm for data pages, and judge whether CheckSum1 equals CheckSum2; if not, set Offset = Offset + m and re-execute the operation of the reading unit; if so, execute the operation of the recovery unit;
A recovery unit, configured to read the page number PageNo of the data page and the file identifier FileId of the file to which the data page belongs, merge data pages according to FileId, sort them within their file in ascending order of PageNo, then set Offset = Offset + n and re-execute the operation of the reading unit; where m is a data offset unit and n is the size of one data page.
Further, the fragment file is an ibdata and/or ibd fragment file.
Further, the operation of calculating the check value CheckSum2 of the data page using the folding checksum algorithm is: taking a 22-byte segment starting at byte 4 of the data page and applying the fold-XOR calculation to it to obtain the check value sum1; taking a segment of n-46 bytes starting at byte 38 of the data page and applying the fold-XOR calculation to it to obtain the check value sum2; the check value of the data page is then CheckSum2 = sum1 + sum2.
Further, an XOR value algorithm on two integers is defined, with its operator written as **. For two 4-byte integers a and b, the XOR value algorithm is:
a ** b = ((((a ^ b ^ RANDOM_MASK) << 8) + a) ^ RANDOM_MASK2) + b;
That is, the value of a XOR b XOR RANDOM_MASK is shifted left by 8 bits, a is added, the result is XORed with RANDOM_MASK2, and b is added.
The fold-XOR calculation is: set the fold value fold to an initial value of 0 and traverse the data segment in byte order; let the traversal result be the data set N {N1, N2, N3, ..., Nm}; apply the integer XOR value algorithm to fold and each element in turn, writing the return value back to fold, i.e. fold = fold ** Ni, where 1 <= i <= m, RANDOM_MASK = 1653893711 and RANDOM_MASK2 = 1463735687.
Further, the operation of traversing the data segment in byte order to form the data set N {N1, N2, N3, ..., Nm} is: starting from the beginning of the segment, every four bytes are read to form one 4-byte integer; if fewer than 4 bytes remain at the end, the remaining bytes form one integer, which is Nm.
The invention also provides a computer-readable storage medium storing computer program code which, when executed by a computer, performs any of the methods described above.
The technical effects of the invention are as follows. Because the invention relies on the page structure of InnoDB data files, data can be recovered in units of data pages. Data files can be recovered from storage media such as an entire disk or an image, without depending on file system file records. If a file is partially damaged (for example encrypted by a virus or partially overwritten), its undamaged part can be extracted. If the storage medium contains fragments of multiple data files, the fragments can be traced back to their source files and regrouped according to the file identifier FileId. Even if the data fragments are non-contiguous and scattered out of order on the disk, they can be sorted and reassembled according to the page number PageNo.
Brief description of the drawings
Other features, objects and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of the InnoDB data file structure according to an embodiment of the invention.
Fig. 2 is a schematic diagram of an InnoDB data page according to an embodiment of the invention.
Fig. 3 is a flow chart of an InnoDB-based fragment file recovery method according to an embodiment of the invention.
Fig. 4 is a structural diagram of an InnoDB-based fragment file recovery apparatus according to an embodiment of the invention.
Detailed description of embodiments
The application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in them may be combined with one another. The application is described in detail below with reference to the drawings and in conjunction with the embodiments.
InnoDB is designed for maximum performance when processing large data volumes. It is fully integrated with the MySQL server and maintains its own buffer pool for caching data and indexes in main memory. InnoDB stores its tables and indexes in a tablespace, which may consist of several files (or raw disk partitions). Technically, InnoDB is a complete database system placed behind MySQL; it sets up its own dedicated buffer pool in main memory for caching data and indexes.
As shown in Fig. 1, an InnoDB data file consists of a series of data pages, arranged in ascending order of page number starting from 0.
As shown in Fig. 2, a data page of the InnoDB storage engine is 16384 bytes in size, where bytes 0 to 3 store the check value, bytes 4 to 7 store the page number PageNo, and bytes 34 to 37 store the file ID, i.e. the file identifier FileId.
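The field offsets just listed can be read directly from a raw page buffer. The following is a minimal sketch, not part of the patent text: the names `PageHeader` and `parse_header` are hypothetical, and big-endian byte order is an assumption, since the description does not state how the 4-byte fields are encoded.

```python
import struct
from typing import NamedTuple

PAGE_SIZE = 16384  # data page size given in the description


class PageHeader(NamedTuple):
    checksum: int  # bytes 0 to 3: stored check value (CheckSum1)
    page_no: int   # bytes 4 to 7: page number PageNo
    file_id: int   # bytes 34 to 37: file identifier FileId


def parse_header(page: bytes) -> PageHeader:
    """Extract the three header fields the recovery method relies on."""
    if len(page) < 38:
        raise ValueError("need at least 38 bytes to read the header fields")
    # ">I" is a big-endian unsigned 32-bit integer (byte order is an assumption).
    checksum, page_no = struct.unpack_from(">II", page, 0)
    (file_id,) = struct.unpack_from(">I", page, 34)
    return PageHeader(checksum, page_no, file_id)
```

Only the first 38 bytes of a candidate page are needed to obtain these fields; the rest of the page matters only for the check value calculation.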
The above introduction to the file structure and data page structure of the InnoDB storage engine is the basis for implementing data recovery.
The principle of data recovery in the invention is as follows: 1) the check value of a data block is calculated using the folding checksum algorithm; if the calculated check value equals the check value stored in the page header, the data block is a page of an InnoDB data file; 2) within one database instance, each data file has a unique file ID (the file identifier), and every data page records the file ID of the file it belongs to; this property is used to merge data pages into files; 3) every data page records its own page number, and pages are arranged in ascending page-number order within their file; this property is used to order the data pages within a file.
Fig. 3 shows an InnoDB-based fragment file recovery method of the invention. The method comprises:
A reading step S101: starting from an initial position Offset = 0, reading n bytes of data as one data page of an InnoDB data file.
A matching step S102: reading the first 4 bytes of the data page as the check value CheckSum1, calculating the check value CheckSum2 of the data page using the folding checksum algorithm for data pages, and judging whether CheckSum1 equals CheckSum2; if not, setting Offset = Offset + m and re-executing the reading step S101; if so, executing the recovery step S103. Checking whether the calculated check value of a data page agrees with the check value read from it is the key to recovering files and an important inventive point of the invention.
A recovery step S103: reading the page number PageNo of the data page and the file identifier FileId of the file to which the data page belongs, merging data pages according to FileId, sorting them within their file in ascending order of PageNo, then setting Offset = Offset + n and re-executing the reading step S101; where m is a data offset unit and n is the size of one data page. Depending on the reading strategy, m may be 1 byte, one sector, one cluster and so on; n is generally 16384 bytes, although other values are possible. The recovery operation continues until all data has been read.
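The offset handling in steps S101 to S103 can be sketched as a simple scan loop. This is an illustrative sketch under stated assumptions rather than the patented implementation: `scan_pages` and `is_valid_page` are hypothetical names, the page check is passed in as a callable so the loop stays independent of the checksum details, the whole image is held in memory for brevity, and the default m = 512 (one sector) is only one of the offset units the text mentions.

```python
from typing import Callable, Iterator, Tuple


def scan_pages(
    image: bytes,
    is_valid_page: Callable[[bytes], bool],
    n: int = 16384,  # size of one data page
    m: int = 512,    # data offset unit (assumed here to be one sector)
) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, page) for every n-byte block that passes the check."""
    offset = 0
    while offset + n <= len(image):
        page = image[offset:offset + n]  # S101: read n bytes at Offset
        if is_valid_page(page):          # S102: CheckSum1 == CheckSum2 ?
            yield offset, page           # S103: hand the page to recovery
            offset += n                  # matched: skip past the whole page
        else:
            offset += m                  # no match: advance by one offset unit
```

A smaller m finds pages at more alignments at the cost of more check value calculations; a real tool would also read from the device incrementally instead of slicing one large buffer.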
In the matching step S102, the first 4 bytes of the data page can be read as the check value CheckSum1 because the specific structure of the data page is known; see Fig. 2 and the corresponding description above.
In InnoDB, data files are in ibdata or ibd format; the fragment file types recovered by the invention are therefore ibdata or ibd fragment files, and both can of course be recovered together.
Another important inventive point of the invention, and an important step in implementing it, is the calculation of the check value of a data page. Specifically, the operation of calculating the check value CheckSum2 of the data page using the folding checksum algorithm is: taking a 22-byte segment starting at byte 4 of the data page and applying the fold-XOR calculation to it to obtain the check value sum1; taking a segment of n-46 bytes starting at byte 38 of the data page and applying the fold-XOR calculation to it to obtain the check value sum2; the check value of the data page is then CheckSum2 = sum1 + sum2.
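As a worked instance of the two byte ranges, take n = 16384 and count byte offsets from 0 as in the page layout of Fig. 2. Reading "the 4th byte" and "the 38th byte" as byte offsets 4 and 38 is an assumption, and fold is shorthand introduced here for the fold-XOR calculation:

```latex
\begin{aligned}
\text{sum1} &= \operatorname{fold}\bigl(\text{page}[4 \ldots 25]\bigr) && \text{(22 bytes)}\\
\text{sum2} &= \operatorname{fold}\bigl(\text{page}[38 \ldots 16375]\bigr) && (n - 46 = 16338 \text{ bytes})\\
\text{CheckSum2} &= \text{sum1} + \text{sum2}
\end{aligned}
```

Under this reading the calculation skips bytes 0 to 3 (the stored check value), bytes 26 to 37, and the final 8 bytes of the page, which accounts for the whole page: 4 + 22 + 12 + 16338 + 8 = 16384.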
To calculate the check value of a data page, the invention also defines an XOR value algorithm on two integers, which is likewise an important inventive point. Specifically, the operator of the XOR value algorithm is written as **. For two 4-byte integers a and b, the XOR value algorithm is:
a ** b = ((((a ^ b ^ RANDOM_MASK) << 8) + a) ^ RANDOM_MASK2) + b;
That is, the value of a XOR b XOR RANDOM_MASK is shifted left by 8 bits, a is added, the result is XORed with RANDOM_MASK2, and b is added.
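The two-integer operation can be written out as a small function. This is a sketch under one stated assumption: the text treats a and b as 4-byte integers, so the result is reduced to 32 bits here, although the description does not say explicitly how overflow is handled. The name `fold_pair` is hypothetical; the two constants are the values given in the description.

```python
RANDOM_MASK = 1653893711   # value given in the description
RANDOM_MASK2 = 1463735687  # value given in the description
U32 = 0xFFFFFFFF           # assumption: results wrap around at 32 bits


def fold_pair(a: int, b: int) -> int:
    """The a ** b operation from the text, reduced modulo 2**32."""
    # XOR a, b and RANDOM_MASK, shift left by 8 bits, add a,
    # XOR with RANDOM_MASK2, then add b.
    return (((((a ^ b ^ RANDOM_MASK) << 8) + a) ^ RANDOM_MASK2) + b) & U32
```

Masking once at the end gives the same low 32 bits as masking after every step, because XOR, left shift and addition never let higher bits influence lower ones.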
Based on the above XOR value algorithm, the invention proposes the fold-XOR calculation, which is also one of its important inventive points and a key to implementing it. The operation is: set the fold value fold to an initial value of 0 and traverse the data segment in byte order; let the traversal result be the data set N {N1, N2, N3, ..., Nm}; apply the integer XOR value algorithm to fold and each element in turn, writing the return value back to fold, i.e. fold = fold ** Ni, where 1 <= i <= m, i and m are integers, RANDOM_MASK = 1653893711 and RANDOM_MASK2 = 1463735687.
In one embodiment, the operation of traversing the data segment in byte order to form the data set N {N1, N2, N3, ..., Nm} is: starting from the beginning of the segment, every four bytes are read to form one 4-byte integer; if fewer than 4 bytes remain at the end, the remaining bytes form one integer, which is Nm.
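Putting the pieces together, the fold-XOR calculation and the page check value might be sketched as follows. This follows the procedure as the description words it and is not verified against any database implementation. The function names are hypothetical, and three points are assumptions: 32-bit wraparound, big-endian assembly of each 4-byte integer, and byte offsets counted from 0.

```python
from typing import List

RANDOM_MASK = 1653893711
RANDOM_MASK2 = 1463735687
U32 = 0xFFFFFFFF  # assumption: 32-bit wraparound


def fold_pair(a: int, b: int) -> int:
    """The a ** b operation from the text, reduced modulo 2**32."""
    return (((((a ^ b ^ RANDOM_MASK) << 8) + a) ^ RANDOM_MASK2) + b) & U32


def to_words(segment: bytes) -> List[int]:
    """Split a segment into N1..Nm: 4 bytes per integer, a short tail forms Nm."""
    # Big-endian assembly is an assumption; the text gives no byte order.
    return [int.from_bytes(segment[i:i + 4], "big")
            for i in range(0, len(segment), 4)]


def fold_xor(segment: bytes) -> int:
    """fold starts at 0 and is updated as fold = fold ** Ni for each Ni."""
    fold = 0
    for word in to_words(segment):
        fold = fold_pair(fold, word)
    return fold


def page_checksum(page: bytes) -> int:
    """CheckSum2 = sum1 + sum2 over the two byte ranges named in the text."""
    n = len(page)
    sum1 = fold_xor(page[4:26])      # 22 bytes starting at byte 4
    sum2 = fold_xor(page[38:n - 8])  # n - 46 bytes starting at byte 38
    return (sum1 + sum2) & U32
```

A candidate block would then be accepted when `page_checksum(page)` equals the value stored in its first 4 bytes. Note that the 22-byte segment splits into five full integers plus a 2-byte tail, so the short-tail rule applies on every page.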
In one embodiment, the data pages are merged according to FileId as follows. Let f denote the information of all fragment pages in a single ibdata/ibd file and PageCount denote the number of pages, with p_i = {PageCheckSum_i, PageNo_i, FileId_i, Offset_i}, where i is an integer greater than 0 and less than or equal to n, PageCheckSum_i denotes the check value of the data page, PageNo_i denotes the page number of the data page, FileId_i denotes the ID of the file to which the data page belongs, and Offset_i denotes the position of the data page on the disk. The recovered data pages are merged according to FileId_i and then sorted according to PageNo_i, yielding the recovered file.
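The merge and sort just described amounts to grouping page records by file identifier and ordering each group by page number. A minimal sketch follows, with the hypothetical names `PageRecord` and `merge_and_sort`; the record fields mirror the tuple p_i = {PageCheckSum, PageNo, FileId, Offset} from the text.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class PageRecord(NamedTuple):
    checksum: int  # PageCheckSum: check value of the data page
    page_no: int   # PageNo: page number within its file
    file_id: int   # FileId: identifier of the owning file
    offset: int    # Offset: position of the page on the disk


def merge_and_sort(records: List[PageRecord]) -> Dict[int, List[PageRecord]]:
    """Group recovered pages by FileId, then sort each group by PageNo."""
    files: Dict[int, List[PageRecord]] = defaultdict(list)
    for rec in records:
        files[rec.file_id].append(rec)
    for pages in files.values():
        pages.sort(key=lambda rec: rec.page_no)
    return dict(files)
```

Each resulting list gives, in file order, the disk offsets from which the pages of one recovered file can be copied out.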
Fig. 4 shows an InnoDB-based fragment file recovery apparatus of the invention. The apparatus comprises:
A reading unit 401, configured to read, starting from an initial position Offset = 0, n bytes of data as one data page of an InnoDB data file.
A matching unit 402, configured to read the first 4 bytes of the data page as the check value CheckSum1, calculate the check value CheckSum2 of the data page using the folding checksum algorithm for data pages, and judge whether CheckSum1 equals CheckSum2; if not, set Offset = Offset + m and re-execute the operation of the reading unit 401; if so, execute the operation of the recovery unit 403. Checking whether the calculated check value of a data page agrees with the check value read from it is the key to recovering files and an important inventive point of the invention.
A recovery unit 403, configured to read the page number PageNo of the data page and the file identifier FileId of the file to which the data page belongs, merge data pages according to FileId, sort them within their file in ascending order of PageNo, then set Offset = Offset + n and re-execute the operation of the reading unit 401; where m is a data offset unit and n is the size of one data page. Depending on the reading strategy, m may be 1 byte, one sector, one cluster and so on; n is generally 16384 bytes, although other values are possible. The recovery operation continues until all data has been read.
In the operation of the matching unit 402, the first 4 bytes of the data page can be read as the check value CheckSum1 because the specific structure of the data page is known; see Fig. 2 and the corresponding description above.
In InnoDB, data files are in ibdata or ibd format; the fragment file types recovered by the invention are therefore ibdata or ibd fragment files, and both can of course be recovered together.
Another important inventive point of the invention, and an important step in implementing it, is the calculation of the check value of a data page. Specifically, the operation of calculating the check value CheckSum2 of the data page using the folding checksum algorithm is: taking a 22-byte segment starting at byte 4 of the data page and applying the fold-XOR calculation to it to obtain the check value sum1; taking a segment of n-46 bytes starting at byte 38 of the data page and applying the fold-XOR calculation to it to obtain the check value sum2; the check value of the data page is then CheckSum2 = sum1 + sum2.
To calculate the check value of a data page, the invention also defines an XOR value algorithm on two integers, which is likewise an important inventive point. Specifically, the operator of the XOR value algorithm is written as **. For two 4-byte integers a and b, the XOR value algorithm is:
a ** b = ((((a ^ b ^ RANDOM_MASK) << 8) + a) ^ RANDOM_MASK2) + b;
That is, the value of a XOR b XOR RANDOM_MASK is shifted left by 8 bits, a is added, the result is XORed with RANDOM_MASK2, and b is added.
Based on the above XOR value algorithm, the invention proposes the fold-XOR calculation, which is also one of its important inventive points and a key to implementing it. The operation is: set the fold value fold to an initial value of 0 and traverse the data segment in byte order; let the traversal result be the data set N {N1, N2, N3, ..., Nm}; apply the integer XOR value algorithm to fold and each element in turn, writing the return value back to fold, i.e. fold = fold ** Ni, where 1 <= i <= m, i and m are integers, RANDOM_MASK = 1653893711 and RANDOM_MASK2 = 1463735687.
In one embodiment, the operation of traversing the data segment in byte order to form the data set N {N1, N2, N3, ..., Nm} is: starting from the beginning of the segment, every four bytes are read to form one 4-byte integer; if fewer than 4 bytes remain at the end, the remaining bytes form one integer, which is Nm.
In one embodiment, the data pages are merged according to FileId as follows. Let f denote the information of all fragment pages in a single ibdata/ibd file and PageCount denote the number of pages, with p_i = {PageCheckSum_i, PageNo_i, FileId_i, Offset_i}, where i is an integer greater than 0 and less than or equal to n, PageCheckSum_i denotes the check value of the data page, PageNo_i denotes the page number of the data page, FileId_i denotes the ID of the file to which the data page belongs, and Offset_i denotes the position of the data page on the disk. The recovered data pages are merged according to FileId_i and then sorted according to PageNo_i, yielding the recovered file.
The method of the invention was also verified, as follows:
(1) A 3 GB vhd image was created with the disk management tool of the Windows system, then mounted and formatted.
(2) An 8.65 MB InnoDB data file, TEST.ibd, was copied to the mounted partition (the copy was paused partway through and other data was written to the disk so that the file was not contiguous on disk).
Recovering the ibd file from the disk by file signature restored only part of the data, and recovery based on file records could not restore the file at all, whereas applying the method of the invention to the image recovered the ibd file that had been written. This demonstrates the technical effect of the invention.
The technical effects of the invention are as follows. Because the invention relies on the page structure of InnoDB data files, data can be recovered in units of data pages. Data files can be recovered from storage media such as an entire disk or an image, without depending on file system file records. If a file is partially damaged (for example encrypted by a virus or partially overwritten), its undamaged part can be extracted. If the storage medium contains fragments of multiple data files, the fragments can be traced back to their source files and regrouped according to the file identifier FileId. Even if the data fragments are non-contiguous and scattered out of order on the disk, they can be sorted and reassembled according to the page number PageNo.
The method of the invention is particularly suitable for mobile terminal devices, which may be a smartphone, tablet computer, laptop, desktop computer, PDA or the like; the mobile terminal device may of course also be any other portable electronic device with data processing capability.
For convenience of description, the above apparatus is described as divided into units by function. When implementing the application, the functions of the units may of course be realized in one or more pieces of software and/or hardware.
From the above description of the embodiments, those skilled in the art will clearly understand that the application can be implemented by software together with a necessary general-purpose hardware platform. On this understanding, the technical solution of the application in essence, or the part of it that contributes over the prior art, can be embodied as a software product. The computer software product can be stored in a storage medium such as ROM/RAM, a magnetic disk or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments of the application or in certain parts of them.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the invention and do not limit it. Although the invention has been described in detail with reference to the above embodiments, those skilled in the art should understand that the invention may still be modified or equivalently replaced, and that any modification or partial replacement that does not depart from the spirit and scope of the invention shall fall within the scope of its claims.

Claims (11)

1. An InnoDB-based fragment file recovery method, characterized in that the method comprises:
A reading step: starting from an initial position Offset = 0, reading n bytes of data as one data page of an InnoDB data file;
A matching step: reading the first 4 bytes of the data page as the check value CheckSum1, calculating the check value CheckSum2 of the data page using the folding checksum algorithm for data pages, and judging whether CheckSum1 equals CheckSum2; if not, setting Offset = Offset + m and re-executing the reading step; if so, executing the recovery step;
A recovery step: reading the page number PageNo of the data page and the file identifier FileId of the file to which the data page belongs, merging data pages according to FileId, sorting them within their file in ascending order of PageNo, then setting Offset = Offset + n and re-executing the reading step;
Wherein m is a data offset unit and n is the size of one data page.
2. The method according to claim 1, characterized in that the fragment file is an ibdata and/or ibd fragment file.
3. The method according to claim 2, characterized in that the operation of calculating the check value CheckSum2 of the data page using the folding checksum algorithm for data pages is: taking a 22-byte segment starting at byte 4 of the data page and applying the fold-XOR calculation to it to obtain the check value sum1; taking a segment of n-46 bytes starting at byte 38 of the data page and applying the fold-XOR calculation to it to obtain the check value sum2; the check value of the data page is then CheckSum2 = sum1 + sum2.
4. The method according to claim 3, characterized in that:
An XOR value algorithm on two integers is defined, with its operator written as **; for two 4-byte integers a and b, the XOR value algorithm is:
a ** b = ((((a ^ b ^ RANDOM_MASK) << 8) + a) ^ RANDOM_MASK2) + b;
That is, the value of a XOR b XOR RANDOM_MASK is shifted left by 8 bits, a is added, the result is XORed with RANDOM_MASK2, and b is added;
The fold-XOR calculation is: set the fold value fold to an initial value of 0 and traverse the data segment in byte order; let the traversal result be the data set N {N1, N2, N3, ..., Nm}; apply the integer XOR value algorithm to fold and each element in turn, writing the return value back to fold, i.e. fold = fold ** Ni, where 1 <= i <= m, RANDOM_MASK = 1653893711 and RANDOM_MASK2 = 1463735687.
5. The method according to claim 4, characterized in that the operation of traversing the data segment in byte order to form the data set N {N1, N2, N3, ..., Nm} is: starting from the beginning of the segment, every four bytes are read to form one 4-byte integer; if fewer than 4 bytes remain at the end, the remaining bytes form one integer, which is Nm.
6. An InnoDB-based fragment file recovery device, characterized in that the device comprises:
A reading unit, configured to read n bytes of data, starting from the InnoDB-based initial position Offset=0, as one data page of an InnoDB data file;
A matching unit, configured to read the first 4 bytes of the data page as the check value CheckSum1, calculate the check value CheckSum2 of the data page using the data page fold-and-check algorithm, and judge whether CheckSum1 equals CheckSum2; if not, set Offset=Offset+m and re-execute the operation of the reading unit; if so, execute the operation of the recovery unit;
A recovery unit, configured to read the page number PageNo of the data page and the file identifier FileId of the file to which the data page belongs, merge data pages according to FileId, sort them within their file by page number PageNo in ascending order, then set Offset=Offset+n and re-execute the operation of the reading unit;
Wherein m is a data offset unit and n is the size of one data page.
7. The device according to claim 6, characterized in that the fragment file is an ibdata and/or ibd fragment file.
8. The device according to claim 7, characterized in that the operation of calculating the check value CheckSum2 of the data page using the data page fold-and-check algorithm is: take a segment of data 22 bytes long starting from the 4th byte of the data page and perform a fold-XOR calculation on it to obtain the check value sum1; take a segment of data n-46 bytes long starting from the 38th byte of the data page and perform a fold-XOR calculation on it to obtain the check value sum2; the check value of the data page is then CheckSum2=sum1+sum2.
9. The device according to claim 8, characterized in that:
An XOR-value algorithm for two integers is defined, with its operator denoted **: given two 4-byte integers a and b, the XOR-value algorithm is:
a**b = (((((a ^ b ^ RANDOM_MASK) << 8) + a) ^ RANDOM_MASK2) + b);
That is, the value of a XOR b XOR RANDOM_MASK is shifted left by 8 bits and a is added; the result is XORed with RANDOM_MASK2 and b is added;
The operation of the fold-XOR calculation is: set the fold value fold to an initial value of 0 and traverse the segment of data in byte order; letting the traversal result be the data set N{N1, N2, N3, ..., Nm}, apply the integer XOR-value algorithm to each element in turn and update fold with the returned value, i.e. fold=fold**Ni, where 1 <= i <= m, RANDOM_MASK=1653893711 and RANDOM_MASK2=1463735687.
10. The device according to claim 9, characterized in that the operation of traversing the segment of data in byte order to form the data set N{N1, N2, N3, ..., Nm} is: starting from the beginning of the segment of data, read every four bytes to generate a 4-byte integer; if the last remaining data is shorter than 4 bytes, the remaining bytes form one integer, which is taken as Nm.
11. A computer-readable storage medium, characterized in that computer program code is stored on the storage medium, and when the computer program code is executed by a computer, the method of any one of claims 1-5 is performed.
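The checksum of claims 3 to 5 and the scan loop of claim 6 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: the claims do not fix the byte order, the integer width of the running value, the offsets of PageNo and FileId, or the values of n and m. The sketch assumes big-endian integers, 32-bit wraparound, zero-based byte offsets, PageNo at bytes 4-7 and FileId at bytes 34-37 (the usual InnoDB page header layout), n = 16384 and m = 512. The function names `fold_pair`, `fold_xor`, `page_checksum` and `recover` are hypothetical.

```python
from collections import defaultdict

RANDOM_MASK = 1653893711
RANDOM_MASK2 = 1463735687
U32 = 0xFFFFFFFF  # assumption: results wrap at 32 bits


def fold_pair(a: int, b: int) -> int:
    """The a**b operator defined in claims 4 and 9."""
    return (((((a ^ b ^ RANDOM_MASK) << 8) + a) ^ RANDOM_MASK2) + b) & U32


def fold_xor(segment: bytes) -> int:
    """Fold a segment 4 bytes at a time; a shorter tail forms the last integer."""
    fold = 0
    for i in range(0, len(segment), 4):
        # assumption: each group of bytes is read as a big-endian integer
        fold = fold_pair(fold, int.from_bytes(segment[i:i + 4], "big"))
    return fold


def page_checksum(page: bytes) -> int:
    """CheckSum2 = sum1 + sum2 as in claims 3 and 8 (zero-based offsets assumed)."""
    n = len(page)
    sum1 = fold_xor(page[4:26])      # 22 bytes starting at byte 4
    sum2 = fold_xor(page[38:n - 8])  # n-46 bytes starting at byte 38
    return (sum1 + sum2) & U32


def recover(image: bytes, n: int = 16384, m: int = 512, checksum=page_checksum):
    """Scan loop of claim 6; returns {FileId: [pages sorted by PageNo]}."""
    files = defaultdict(dict)
    offset = 0
    while offset + n <= len(image):
        page = image[offset:offset + n]
        checksum1 = int.from_bytes(page[:4], "big")
        if checksum1 != checksum(page):
            offset += m  # not a valid page here: slide forward by m bytes
            continue
        page_no = int.from_bytes(page[4:8], "big")    # assumed PageNo position
        file_id = int.from_bytes(page[34:38], "big")  # assumed FileId position
        # simplification: a later page with the same PageNo replaces an earlier one
        files[file_id][page_no] = page
        offset += n  # valid page: jump to the next page boundary
    return {fid: [pages[no] for no in sorted(pages)] for fid, pages in files.items()}
```

Passing the checksum function as a parameter keeps the scan loop independent of the checksum variant, so a different page checksum could be substituted without touching the loop.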
CN201811225169.1A 2018-10-19 2018-10-19 Fragmented file recovery method and device based on InnoDB and storage medium Active CN109408290B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811225169.1A CN109408290B (en) 2018-10-19 2018-10-19 Fragmented file recovery method and device based on InnoDB and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811225169.1A CN109408290B (en) 2018-10-19 2018-10-19 Fragmented file recovery method and device based on InnoDB and storage medium

Publications (2)

Publication Number Publication Date
CN109408290A (en) 2019-03-01
CN109408290B (en) 2021-02-26

Family

ID=65468048

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811225169.1A Active CN109408290B (en) 2018-10-19 2018-10-19 Fragmented file recovery method and device based on InnoDB and storage medium

Country Status (1)

Country Link
CN (1) CN109408290B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110058969A (en) * 2019-04-18 2019-07-26 腾讯科技(深圳)有限公司 A kind of data reconstruction method and device

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120030172A1 (en) * 2010-07-27 2012-02-02 Oracle International Corporation Mysql database heterogeneous log based replication
WO2014108083A1 (en) * 2013-01-11 2014-07-17 Tencent Technology (Shenzhen) Company Limited Method and device for verifying consistency of data of master device and slave device
CN104881418A (en) * 2014-02-28 2015-09-02 阿里巴巴集团控股有限公司 Method and device for quickly reclaiming rollback space in MySQL
US9824132B2 (en) * 2013-01-08 2017-11-21 Facebook, Inc. Data recovery in multi-leader distributed systems
CN108062358A (en) * 2017-11-28 2018-05-22 厦门市美亚柏科信息股份有限公司 The offline restoration methods of innodb engine deletion records, storage medium
CN108319862A (en) * 2017-01-16 2018-07-24 阿里巴巴集团控股有限公司 A kind of method and apparatus of data documents disposal
CN108563535A (en) * 2018-04-27 2018-09-21 四川巧夺天工信息安全智能设备有限公司 A kind of restoration methods to the full library of MySQL database

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120030172A1 (en) * 2010-07-27 2012-02-02 Oracle International Corporation Mysql database heterogeneous log based replication
US20130318044A1 (en) * 2010-07-27 2013-11-28 Oracle International Corporation Mysql database heterogeneous log based replication
US9824132B2 (en) * 2013-01-08 2017-11-21 Facebook, Inc. Data recovery in multi-leader distributed systems
WO2014108083A1 (en) * 2013-01-11 2014-07-17 Tencent Technology (Shenzhen) Company Limited Method and device for verifying consistency of data of master device and slave device
CN104881418A (en) * 2014-02-28 2015-09-02 阿里巴巴集团控股有限公司 Method and device for quickly reclaiming rollback space in MySQL
CN108319862A (en) * 2017-01-16 2018-07-24 阿里巴巴集团控股有限公司 A kind of method and apparatus of data documents disposal
CN108062358A (en) * 2017-11-28 2018-05-22 厦门市美亚柏科信息股份有限公司 The offline restoration methods of innodb engine deletion records, storage medium
CN108563535A (en) * 2018-04-27 2018-09-21 四川巧夺天工信息安全智能设备有限公司 A kind of restoration methods to the full library of MySQL database

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
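孙偏偏 (Sun Pianpian): "Research on Data Recovery Techniques for InnoDB Databases", China Master's Theses Full-text Database, Information Science and Technology series *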

Also Published As

Publication number Publication date
CN109408290B (en) 2021-02-26

Similar Documents

Publication Publication Date Title
JP6884128B2 (en) Data deduplication device, data deduplication method, and data deduplication program
JP5735654B2 (en) Deduplication method for stored data, deduplication apparatus for stored data, and deduplication program
CN103870514B (en) Data de-duplication method and device
CN105718548B (en) Based on the system and method in de-duplication storage system for expansible reference management
EP2646915B1 (en) Synthetic backups within deduplication storage system
US9514138B1 (en) Using read signature command in file system to backup data
US8578112B2 (en) Data management system and data management method
CN104932841A (en) Saving type duplicated data deleting method in cloud storage system
CN103914522A (en) Data block merging method applied to deleting duplicated data in cloud storage
US10120595B2 (en) Optimizing backup of whitelisted files
JP6805816B2 (en) Information processing equipment, information processing system, information processing method and program
CN104077380A (en) Method and device for deleting duplicated data and system
CN102033924A (en) Data storage method and system
US8914325B2 (en) Change tracking for multiphase deduplication
CN109492049A (en) Data processing, block generation and synchronous method for block chain network
CN107368545B (en) A kind of De-weight method and device based on Merkle Tree deformation algorithm
US20140156607A1 (en) Index for deduplication
CN105493080A (en) Method and apparatus for context aware based data de-duplication
CN109408290A (en) A kind of fragment file access pattern method, apparatus and storage medium based on InnoDB
CN106528703A (en) Deduplication mode switching method and apparatus
WO2024082525A1 (en) File snapshot method and system, electronic device, and storage medium
CN104484402A (en) Method and device for deleting repeating data
CN110019039A (en) The Container Format of separated from meta-data
US20170293531A1 (en) Snapshot backup
US11620056B2 (en) Snapshots for any point in time replication

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant