CN111125137A - Batch real-time protection data verification method - Google Patents

Batch real-time protection data verification method

Info

Publication number
CN111125137A
Authority
CN
China
Prior art keywords
data
time
real
block
verification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911362809.8A
Other languages
Chinese (zh)
Other versions
CN111125137B (en)
Inventor
李正祥
谢亮
张有成
姚崎
李海鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aerospace One System Jiangsu Information Technology Co ltd
Original Assignee
Aerospace One System Nanjing Data Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aerospace One System Nanjing Data Technology Co ltd
Priority to CN201911362809.8A
Publication of CN111125137A
Application granted
Publication of CN111125137B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/2365 Ensuring data consistency and integrity

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Storage Device Security (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a batch real-time protection data verification method comprising the following steps: generating a table of the data block numbers to be checked in a single check according to the size of the user source data, the data change heat (how frequently the data changes), and the check period; reading the source data content of each listed block number from the information system according to the table, and generating a source data digest; and searching the real-time protection data for the corresponding backup data, working backward from the latest data time point, generating a real-time data digest for the block data found at each time point, and comparing it with the source data digest for consistency. The search for a block ends when block data at a consistent time point is found or when all time points have been searched. The invention supports verification of dynamic data. Verification is performed in batches, which shortens each verification window and reduces the impact on the information system. Data with high change heat is checked multiple times, so errors can be discovered as early as possible after they occur.

Description

Batch real-time protection data verification method
Technical Field
The invention relates to a batch real-time protection data verification method, and belongs to the technical field of data verification.
Background
Important information systems have RPO (recovery point objective) requirements for safeguarding user data, so the data must be protected in real time. To ensure that the protected data is correct, it needs to be verified. In scheduled (timed) data protection, the common data verification methods are direct comparison, message digest comparison, and application verification.
Direct comparison: the source data is compared with the protection data one by one.
Message digest comparison: message digests (e.g., MD5, SHA-1) are computed for the source data and the protection data separately and then compared.
Application verification: the protection data is passed to the application system, which performs the verification of the protection data.
Direct comparison requires reading both the source data and the protection data once and comparing them. Message digest comparison also reads both the source data and the protection data, generates a message digest for each, and compares the digests.
Both direct comparison and message digest comparison must read the source data at least once, which takes a long time when the source data is large.
Application verification depends on functions exposed by the application, so it is not generally applicable.
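As a minimal sketch of message digest comparison for static data, the following Python example hashes a source block and a protected copy and compares the results. The function names and sample byte strings are illustrative assumptions, not part of the patent; SHA-1 is used only because the text mentions it.

```python
import hashlib


def digest(data: bytes, algorithm: str = "sha1") -> str:
    """Return the hex message digest of a block of bytes."""
    return hashlib.new(algorithm, data).hexdigest()


def digests_match(source: bytes, protected: bytes, algorithm: str = "sha1") -> bool:
    """Compare two blocks by their digests instead of byte by byte."""
    return digest(source, algorithm) == digest(protected, algorithm)


# Hypothetical sample data: one intact copy and one copy that differs.
source_block = b"payroll record v1"
good_copy = b"payroll record v1"
bad_copy = b"payroll record v2"

same = digests_match(source_block, good_copy)
different = digests_match(source_block, bad_copy)
```

Both inputs still have to be read in full to compute the digests, which is the cost the text attributes to this approach for large source data.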
These three verification methods suit the scheduled data protection mode: the source data does not change during a single verification, so the source data and the protection data read at any time are consistent; that is, they are static data. With real-time protection, however, the source data changes constantly and so does the stored real-time protection data. Source data and real-time protection data read from the same position at different times may differ; that is, they are dynamic data. A new verification scheme is therefore needed to verify the correctness of dynamic data.
The degree of informatization in enterprises and institutions is increasing, and more information systems exist in each institution. For important information systems, data needs to be protected in real time, and data loss is reduced as much as possible. After the information system fails, the information system can be rebuilt by recovering the real-time protection data.
The real-time protection data may be incomplete or erroneous because of program faults, network fluctuations, hardware damage, and the like. If the real-time protection data is incomplete or erroneous, the information system cannot be recovered after a failure, or part of the data is unusable after recovery. The real-time protection data therefore needs to be verified to ensure that it is recoverable and consistent with the information system.
Disclosure of Invention
To address the shortcomings of the prior art, the invention aims to provide a batch real-time protection data verification method for the case in which data changes continuously while an information system is in use and the changed data is captured and stored.
To achieve this aim, the invention adopts the following technical scheme.
The batch real-time protection data verification method of the invention comprises the following steps:
calculating the number of block samples for a single check according to the size of the user source data, the check period, and the user bandwidth, and performing stratified sampling according to the data change heat to generate a table of the data block numbers to be checked in that check;
reading the source data content of each listed block number from the source information system according to the data block number information table, and generating a source data digest;
searching the real-time protection data for the corresponding backup data, working backward from the latest data time point, generating a real-time data digest for the block data found at each time point, and comparing it with the source data digest for consistency; the search for a block ends when block data at a consistent time point is found or when all time points have been searched.
When all data blocks in the data block number information table have been verified, one data verification is complete.
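The patent does not give a formula for the number of block samples per check, so the following Python sketch shows only one plausible way to derive it from the source size, the check period, and the bandwidth. The function name, the fixed block size, the check window length, and the rule itself (a coverage lower bound checked against a bandwidth budget) are all assumptions made for illustration.

```python
def blocks_per_check(
    source_bytes: int,
    block_bytes: int,
    checks_per_period: int,
    bandwidth_bytes_per_s: int,
    window_s: int,
) -> int:
    """Return an assumed per-check sample count.

    needed:     fewest blocks per check that still covers every block
                once within the verification period.
    affordable: most blocks that can be read within one check window
                at the given bandwidth.
    """
    total_blocks = -(-source_bytes // block_bytes)      # integer ceiling
    needed = -(-total_blocks // checks_per_period)      # integer ceiling
    affordable = (bandwidth_bytes_per_s * window_s) // block_bytes
    if affordable < needed:
        raise ValueError("bandwidth too low to cover all blocks in one period")
    # Any budget above `needed` can be spent re-checking hot blocks.
    return min(affordable, total_blocks)


# Hypothetical figures: 1 TiB volume, 4 MiB blocks, 30 checks per period,
# 100 MiB/s of usable bandwidth, 30-minute window.
per_check = blocks_per_check(2**40, 4 * 2**20, 30, 100 * 2**20, 1800)
```

With these figures the budget exceeds the coverage minimum, which leaves room for the repeated checks of high change heat blocks that the method calls for.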
The data blocks in the data block number information table are verified one block at a time. Suppose source data block A has changed during real-time protection, producing data blocks A1, A2, A3, A4, and A5.
When source data block A is verified, the block content is read from the source data to form a source data digest; the content of block A is read from the real-time protection data to form a data digest, which is compared with the source data digest.
The latest time point of block A in the real-time protection data (A5) is read first. If the digest of A5 is inconsistent with the source data digest, the contents of A4, A3, A2, and A1 are searched in turn, and the digest of each is calculated and compared. If any one of these blocks matches, the check passes; otherwise the check fails, indicating that the real-time protection is abnormal.
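The backward search over the time points of one block can be sketched in Python as follows. This is an illustrative model only: the stored versions of block A are represented as an in-memory list ordered oldest first, and the names `find_matching_version` and `history_a` are hypothetical.

```python
import hashlib
from typing import Optional, Sequence


def sha1(data: bytes) -> str:
    """Hex SHA-1 digest of a block."""
    return hashlib.sha1(data).hexdigest()


def find_matching_version(source_block: bytes, versions: Sequence[bytes]) -> Optional[int]:
    """Search stored versions from newest to oldest.

    `versions` is ordered oldest first (A1 ... A5). Returns the index of
    the newest version whose digest equals the source digest, or None if
    no time point matches (the check fails).
    """
    source_digest = sha1(source_block)
    for index in range(len(versions) - 1, -1, -1):
        if sha1(versions[index]) == source_digest:
            return index
    return None


# Hypothetical history of block A in the real-time protection data.
history_a = [b"A1", b"A2", b"A3", b"A4", b"A5"]

# The source was read just before the A5 write was captured, so it matches A4.
hit = find_matching_version(b"A4", history_a)
# Content that never existed in the protection data fails the check.
miss = find_matching_version(b"corrupt", history_a)
```

Starting from the newest time point keeps the common case cheap, since the source usually matches the latest or second-latest stored version.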
Batch real-time protection data verification consists of multiple batches of data verification, which together form a verification period.
Within a verification period, all source data is verified at least once and data with high change heat is verified multiple times, which ensures data accuracy and allows data errors to be found promptly.
For example, when a verification period comprises two checks, the first check verifies data blocks A, D, E, and F and finds the corresponding data blocks A4, D1, E2, and F1 in the real-time protection data;
the second check verifies data blocks A, B, C, and E and finds the corresponding data blocks A5, B1, C1, and E3 in the real-time protection data.
After the two checks, every data block A through F has been verified at least once, and blocks A and E, which have high change heat, have been verified twice, so the correctness of the real-time protection data is effectively verified.
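The coverage claim in this example can be checked mechanically. The short Python sketch below counts how often each block appears across the two batches; the variable names are illustrative.

```python
from collections import Counter

# All blocks of the source data and the two batches from the example.
all_blocks = set("ABCDEF")
batches = [list("ADEF"), list("ABCE")]

# Count how many times each block is verified within the period.
check_counts = Counter(block for batch in batches for block in batch)

# Blocks never verified, and blocks verified in both batches.
unchecked = all_blocks - set(check_counts)
checked_twice = sorted(block for block, n in check_counts.items() if n == 2)
```

Here `unchecked` is empty and `checked_twice` contains A and E, matching the description of the high change heat blocks.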
The invention supports verification of dynamic data. Verification is performed in batches, which shortens each verification window and reduces the impact on the information system. Data with high change heat is checked multiple times, so errors can be discovered as early as possible after they occur.
Drawings
FIG. 1 is a flowchart illustrating a method for batch real-time protection data verification according to the present invention;
FIG. 2 is a flowchart of a work process for verifying specified data;
FIG. 3 is a flowchart of the operation of a verification cycle.
Detailed Description
To make the technical means, inventive features, objectives, and effects of the invention easy to understand, the invention is further described below with reference to specific embodiments.
Referring to FIG. 1, the batch real-time protection data verification method of the invention includes the following steps:
generating a table of the data block numbers to be checked in a single check according to the size of the user source data, the data change heat, and the check period;
reading the source data content of each listed block number from the information system according to the data block number information table, and generating a source data digest;
searching the real-time protection data for the backup data of the corresponding block, working backward from the latest data time point, generating a real-time data digest for the block data found at each time point, and comparing it with the source data digest for consistency; the search for a block ends when block data at a consistent time point is found or when all time points have been searched.
When all data blocks in the data block number information table have been verified, one data verification is complete.
Batch real-time protection data verification consists of multiple batches of data verification, which together form a verification period.
Within a verification period, all source data is verified at least once and data with high change heat is verified multiple times, which ensures data accuracy and allows data errors to be found promptly.
Both the source data to be verified and the real-time protection data change constantly; that is, both are dynamic data.
Referring to FIG. 2: source data block A has changed during real-time protection, producing data blocks A1, A2, A3, A4, and A5.
When block A is verified, the block content is read from the source data to form a source data digest. The content of block A is then read from the real-time protection data to form a data digest, which is compared with the source data digest.
Because the source data and the real-time protection data are read at different times, the latest time point of block A in the real-time protection data (A5 in the figure) may not correspond to the content that was read from source block A. Therefore, when the digest of A5 is inconsistent with the source data digest, the contents of A4, A3, A2, and A1 are searched in turn, and the digest of each is calculated and compared. If any one of these blocks matches, the check passes; otherwise the check fails, indicating that the real-time protection is abnormal.
Referring to FIG. 3: take two checks in one verification period as an example.
The first check verifies data blocks A, D, E, and F and finds the corresponding data blocks A4, D1, E2, and F1 in the real-time protection data;
the second check verifies data blocks A, B, C, and E and finds the corresponding data blocks A5, B1, C1, and E3 in the real-time protection data.
After the two checks, every data block A through F has been verified at least once, and blocks A and E, which have high change heat, have been verified twice, so the correctness of the real-time protection data is effectively verified.
The principle of the scheme is that data read from source data at a certain point in time must be found in real-time protected data. Therefore, data can be read from the source data, and then the data block corresponding to the time point is searched in the real-time protection data, and data comparison is performed to complete verification.
The volume real-time backup divides a source volume into blocks, monitors and captures IO changes of each data block, and stores IO streams. After the source data volume is abnormal, the information system can be reconstructed by using the stored IO stream. To ensure data correctness, the data needs to be checked.
In this embodiment, the verification method of the invention is used with a verification period of one month, and verification runs in the early morning from 02:00 to 02:30 every day. The verification module starts at 02:00 each day. It preferentially selects 60% of the data blocks from the blocks that were hot on the current day, 30% from the blocks that were hot in the current month, and the remaining 10% from unchanged blocks, forming the table of data block numbers to be checked in this run. The contents of the corresponding data blocks are read from the source volume in the order given by the table, and a message digest is generated for each. The content of the corresponding data block is then searched for in the stored IO stream, and each time data for a time point is found, a message digest is generated and compared. If the digest of the block content at some time point in the IO stream is consistent with that of the source volume, the block passes verification; if no consistent time point is found, verification fails and the user is alerted. The run ends when all data blocks in the table have been verified. Over the one-month verification period, 30 checks are started in total, ensuring that every data block in the source volume is checked at least once and that the real-time protection data of the volume is correct.
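The 60% / 30% / 10% selection in this embodiment can be sketched as stratified sampling in Python. Everything beyond the three percentages is an assumption: the function name `build_check_table`, the representation of blocks as integer block numbers, and the use of a seeded random generator. The sketch also omits the bookkeeping needed to guarantee that every block is checked at least once within the month.

```python
import random
from typing import Iterable, List


def build_check_table(
    hot_today: Iterable[int],
    hot_this_month: Iterable[int],
    unchanged: Iterable[int],
    sample_size: int,
    rng: random.Random,
) -> List[int]:
    """Pick block numbers for one run: 60% daily-hot, 30% monthly-hot, 10% unchanged."""
    strata = [(hot_today, 60), (hot_this_month, 30), (unchanged, 10)]
    table: List[int] = []
    chosen = set()
    for pool, percent in strata:
        # Skip blocks already picked from a hotter stratum.
        candidates = [block for block in pool if block not in chosen]
        take = min(sample_size * percent // 100, len(candidates))
        picked = rng.sample(candidates, take)
        table.extend(picked)
        chosen.update(picked)
    return table


# Hypothetical block numbers for the three strata.
table = build_check_table(
    hot_today=range(0, 20),
    hot_this_month=range(20, 60),
    unchanged=range(60, 200),
    sample_size=10,
    rng=random.Random(0),
)
```

Sampling the hottest stratum first means a block that is hot both today and this month is counted against the daily quota, which is one reasonable reading of "preferentially selected".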
The foregoing shows and describes the general principles and broad features of the present invention and advantages thereof. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims (6)

1. A batch real-time protection data verification method, characterized by comprising the following steps:
calculating the number of block samples for a single check according to the size of the user source data, the check period, and the user bandwidth, and performing stratified sampling according to the data change heat to generate a table of the data block numbers to be checked in that check;
reading the source data content of each listed block number from the source information system according to the data block number information table, and generating a source data digest;
searching the real-time protection data for the corresponding backup data, working backward from the latest data time point, generating a real-time data digest for the block data found at each time point, and comparing it with the source data digest for consistency; the search for a block ends when block data at a consistent time point is found or when all time points have been searched.
2. The batch real-time protection data verification method according to claim 1, wherein when all data blocks in the data block number information table are verified, one data verification is finished.
3. The batch real-time protection data verification method according to claim 2, wherein the data blocks in the data block number information table are verified one block at a time, and source data block A has changed during real-time protection, producing data blocks A1, A2, A3, A4, and A5;
when source data block A is verified, the block content is read from the source data to form a source data digest; the content of block A is read from the real-time protection data to form a data digest, which is compared with the source data digest;
the latest time point of block A in the real-time protection data is read, and when the digest of A5 is inconsistent with the source data digest, the contents of A4, A3, A2, and A1 are searched in turn, and the digest of each is calculated and compared; if any one of these blocks matches, the check passes; otherwise the check fails, indicating that the real-time protection is abnormal.
4. The method for batch real-time protected data verification according to claim 1, wherein the batch real-time protected data verification comprises a plurality of batch data verifications, forming a verification period.
5. The batch real-time protection data verification method according to claim 4, wherein in a verification period, it is ensured that the source data is verified at least once, and the data with high change heat is verified for multiple times, so as to ensure data accuracy and find data errors in time.
6. The batch real-time protection data verification method of claim 5, wherein when a verification period comprises two checks, the first check verifies data blocks A, D, E, and F and finds the corresponding data blocks A4, D1, E2, and F1 in the real-time protection data;
the second check verifies data blocks A, B, C, and E and finds the corresponding data blocks A5, B1, C1, and E3 in the real-time protection data;
after the two checks, every data block A through F has been verified at least once, and blocks A and E, which have high change heat, have been verified twice, so the correctness of the real-time protection data is effectively verified.
CN201911362809.8A 2019-12-26 2019-12-26 Batch real-time protection data verification method Active CN111125137B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911362809.8A CN111125137B (en) 2019-12-26 2019-12-26 Batch real-time protection data verification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911362809.8A CN111125137B (en) 2019-12-26 2019-12-26 Batch real-time protection data verification method

Publications (2)

Publication Number Publication Date
CN111125137A (en) 2020-05-08
CN111125137B (en) 2022-03-15

Family

ID=70502704

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911362809.8A Active CN111125137B (en) 2019-12-26 2019-12-26 Batch real-time protection data verification method

Country Status (1)

Country Link
CN (1) CN111125137B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070186127A1 (en) * 2006-02-03 2007-08-09 Emc Corporation Verification of computer backup data
CN104618112A (en) * 2015-01-19 2015-05-13 北京海泰方圆科技有限公司 Method for verifying dynamic password of dynamic token
CN108595290A (en) * 2018-03-23 2018-09-28 上海爱数信息技术股份有限公司 A kind of method and data back up method ensureing Backup Data reliability
CN109033127A (en) * 2018-05-31 2018-12-18 阿里巴巴集团控股有限公司 A kind of synchrodata method of calibration, device and equipment
CN109871296A (en) * 2018-12-24 2019-06-11 航天信息股份有限公司 A kind of data back up method and system, data reconstruction method and system and mobile terminal

Also Published As

Publication number Publication date
CN111125137B (en) 2022-03-15

Similar Documents

Publication Publication Date Title
CN108805570B (en) Data processing method, device and storage medium
US9268648B1 (en) System and method for consistency verification of replicated data in a recovery system
US20200117550A1 (en) Method, device and computer program product for backing up data
CN109522363B (en) Cloud platform synchronization method, system, equipment and storage medium based on block chain
US20140136893A1 (en) System file repair method and apparatus
US9996434B2 (en) Data mirror volume verification
CN110704428A (en) Data indexing method and device for block chain, computer equipment and storage medium
US10606712B2 (en) Metadata recovery for de-duplicated data
US11481284B2 (en) Systems and methods for generating self-notarized backups
CN113065169A (en) File storage method, device and equipment
US11467558B2 (en) Interruption recovery method for machine tool machining file and machine tool applying same
CN105045721A (en) Method and device for checking data consistency
CN113360322A (en) Method and equipment for recovering data based on backup system
CN105868127A (en) Data storage method and device and data reading method and device
CN107704342A (en) A kind of snap copy method, system, device and readable storage medium storing program for executing
CN108573172B (en) Data checking and storing method and device
CN113220777B (en) Service data processing method, device, computer equipment and storage medium
CN111125137B (en) Batch real-time protection data verification method
CN111198920B (en) Method and device for determining comparison table snapshot based on database synchronization
CN104615948A (en) Method for automatically recognizing file completeness and restoring
CN112306753A (en) Data restoration method, device and system
CN115827691A (en) Batch processing result verification method and device, computer equipment and storage medium
CN105630625A (en) Method and device for detecting consistency between data copies
JP4754007B2 (en) Information processing apparatus, information processing method, program, and recording medium
CN115080311B (en) Informatization remote control method and device for big data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220428

Address after: 210001 floor 3, building B, building C, building 5, Baixia high tech Industrial Park, No. 5, Yongzhi Road, Qinhuai District, Nanjing, Jiangsu Province

Patentee after: NANJING UNARY INFORMATION TECHNOLOGY Co.,Ltd.

Address before: 210032 floors 9-10, building 1, Changfeng building, No. 14 Xinghuo Road, Jiangbei new area, Nanjing, Jiangsu

Patentee before: Aerospace one system (Nanjing) data Technology Co.,Ltd.

CP03 Change of name, title or address

Address after: Building 1, 6th Floor, Changfeng Building, No.14 Xinghuo Road, Research and Innovation Park, Jiangbei New District, Nanjing City, Jiangsu Province, 210000

Patentee after: Aerospace One System (Jiangsu) Information Technology Co.,Ltd.

Address before: 210001 floor 3, building B, building C, building 5, Baixia high tech Industrial Park, No. 5, Yongzhi Road, Qinhuai District, Nanjing, Jiangsu Province

Patentee before: NANJING UNARY INFORMATION TECHNOLOGY Co.,Ltd.